doc/charm++/msa.tex

   1 \section{Multiphase Shared Arrays}
   2 \label{msa}
   3
   4 Charm++ includes \emph{multiphase shared arrays} (MSA), a distributed
   5 shared array library designed to safely enable common patterns of
   6 shared-memory scientific applications while retaining the benefits of
   7 an asynchronous adaptive runtime system. The general pattern of usage
   8 that MSA supports is one in which the application does one sort of
   9 access on an array for a while, and then changes to another kind of
  10 access at some distinguished point in the execution. For instance, one
  11 part of the program might accumulate a physical quantity onto a grid
  12 representing the simulation space, and another part might read from
  13 that grid once it is fully calculated.
  14
  15 MSA formalizes this pattern in its access modes and phases. This
  16 formality allows the runtime to optimize data movement and exclude
  17 race conditions and other hard-to-debug errors that can be common in
  18 shared-array programming.
  19
  20 Development on MSA is ongoing. This means that its API may not be
  21 completely stable, but it also means that its developers are actively
  22 seeking suggestions of new use cases and features. At present, it is
  23 generic over element type, and supports 1, 2, and 3 dimensional
  24 rectangular arrays with row- and column-major element orderings.
  25
  26 \subsection{Access Modes}
  27
  28 At any given time, each MSA is in a particular \emph{access mode} that
  29 allows one kind of operation on the array. At present, MSA defines the
  30 following access modes:
  31
  32 \begin{description}
  33 \item[Read-Only Mode]
  34 As its name suggests, read-only mode makes the array immutable,
  35 permitting reads of any element but writes to none.
  36 E.g. {\tt x = r(i, j, k);}
  37
  38 \item[Write-Once Mode]
  39 MSA's write mode allows only direct assignment of values to elements
  40 of the array. E.g. {\tt w(i, j) = y;}
  41
  42 When assigning values to elements of a shared array, the primary
  43 safety concern is write-after-write conflicts, in which some
  44 particular entry is assigned twice, possibly by different parallel
  45 entities. MSA checks for conflicting assignments at
  46 runtime\footnote{When {\tt CMK\_ERROR\_CHECKING} is defined}, and will
  47 raise an error on detection.
  48
  49 \item[Accumulate Mode]
  50
  51 Accumulation is the application of a commutative associative operator,
  52 such as addition or multiplication, to some entry in an
  53 array. E.g. {\tt a(i) += z;}. MSA allows arbitrary user-defined
  54 accumulation operations, including things like set union.
  55
  56 Frequently, applications do a set of read-modify-write accesses across
  57 all or parts of a shared array. When the modification is an
  58 accumulation, the code that has the value to be contributed needn't
  59 actually read the current value, as long as all contributions to each
  60 element are accounted for before the next read from that element. As
  61 we will see, MSA's structure ensures that this is the case, without
  62 requiring atomic remote operations or locking.
  63
  64 \end{description}
  65
  66 To provide the guarantees described above, MSA requires that clients
  67 of an array collectively synchronize when they wish to change the mode
  68 in which the array will be accessed. Each period between
  69 synchronizations is called a \emph{phase}. At phase boundaries, writes
  70 are flushed and accumulations are totaled up.
  71
  72 MSA enforces most of the restrictions of its access modes discipline
  73 at compile time through the static types of \emph{handle}
  74 objects. During a phase, each client of an array will hold an
  75 appropriately typed handle to that array, presenting only the allowed
  76 operations in its interface. At synchronization, clients ``trade in''
  77 the old handle to be invalidated, and receive a new one representing
  78 the array's mode for the upcoming phase.
  79
  80 \subsection{Declaration and construction}
  81
  82 The MSA package is defined as a set of C++ templates that make it
  83 generic over the type of its elements and the accumulation operation.
  84 The templates are named {\tt MSAKD<>}, where K is the number of
  85 dimensions in the array, and the template arguments are the contained
  86 type and a class defining the desired accumulation operator:
  87
  88 \begin{verbatim}
  89 template<class ENTRY, class ENTRY_OPS_CLASS>
  90 class MSAKD { ... };
  91 \end{verbatim}
  92
  93 The second argument must be a class (or class template) meeting the
  94 following interface for the corresponding type {\tt ENTRY}:
  95 \begin{verbatim}
  96 struct entry_ops {
  97   /// Accumulate b into element a
  98   static void accumulate(ENTRY &a, const ENTRY &b);
  99   /// An identity value for the accumulate operation
 100   static ENTRY getIdentity();
 101   /// Should the MSA PUP its elements when shipping them around?
 102   static bool pupEveryElement();
 103 };
 104 \end{verbatim}
 105 The library provides templated implementations of this interface for
 106 addition, multiplication, and maximum. See the MSA headers for details
 107 on these classes.
 108
 109 When constructing an MSA, the application must specify the extent of
 110 the array in each dimension and the number of clients that will access
 111 the array (so that the library can tell when all clients have
 112 synchronized). The array creation constructors are as follows:
 113 \begin{verbatim}
 114 /// Indices range over [0, N]
 115 MSA1D(unsigned int num_entries, unsigned int num_workers);
 116 MSA2D(unsigned int rows, unsigned int cols, unsigned int num_workers);
 117 MSA3D(unsigned int x, unsigned int y, unsigned int z, unsigned int num_workers);
 118 /// Indices range over [da, db]
 119 MSA3D(int xa, int xb, int ya, int yb, int za, int zb, unsigned int num_wrkrs);
 120 \end{verbatim}
 121 At construction, the array is filled with its identity element.
 122
 123 After construction, each client must call the {\tt enroll()} method on
 124 a copy of the MSA object. These clients must run all code operating on
 125 the MSA in a user-level thread, via entry methods with the {\tt
 126   [threaded]} attribute, AMPI ranks, or explicitly created {\sc
 127   TCharm} threads.
 128
 129 Once enrollment is complete, each of the clients can get a handle to
 130 enter data into the array:
 131 \begin{verbatim}
 132 MYMSA::Write wMymsa = mymsa.getInitialWrite();
 133 // or
 134 MYMSA::Accum aMymsa = mymsa.getInitialAccum();
 135 \end{verbatim}
 136
 137 When a client is done with its work on an array in a phase, it must
 138 call a synchronization method on its handle. If it will operate on the
 139 array in the next phase, it asks for a new handle, and if it won't, it
 140 simply indicates that it's done:
 141 \begin{verbatim}
 142 MYMSA::Read rMymsa = aMymsa.syncToRead();
 143 MYMSA::Write wMymsa = rMymsa.syncToWrite();
 144 MYMSA::Accum aMymsa = wMymsa.syncToAccum();
 145 aMymsa.syncDone();
 146 \end{verbatim}
 147 All clients must synchronize to the same mode in each phase. If the
 148 array sees clients synchronizing into different modes, it will raise
 149 an error at runtime.
 150
 151 \subsection{MSA Usage Example}
 152
 153 The constructs described so far are demonstrated in this example
 154 implementation of a parallel distributed $k$-means clustering
 155 algorithm:
 156
 157 \lstset{language=c++, commentstyle=\textit, emphstyle=\textbf,
 158   numbers=left, keywordstyle=, label=lst:kmeans, captionpos=b,
 159   basicstyle=\small\tt,
 160 %
 161 caption={Parallel $k$-Means Clustering implemented using an MSA named
 162   \textbf{clusters}. This function is run in a thread on every
 163   processor. First, processors selected as initial `seeds' write their
 164   locations into the array (call on line 5). Then, all the processors
 165   iterate finding the closest seed (lines 14--20) and moving
 166   themselves into it (22--33). They all test for convergence by
 167   checking an entry indicating whether any processor moved (37). }
 168 %
 169 }
 170 \lstinputlisting{kmeans.cpp}
 171