Bug #757: disable unnecassary network polling in multicore build
[charm.git] / doc / charm++ / sdag.tex
blobedaf52e6b0ac5fecf0004dc8289afc8d1e88c307
1 \charmpp\ is based on the message-driven parallel programming paradigm. In
2 contrast to many other approaches, \charmpp\ programmers encode entry points to
3 their parallel objects, but do not explicitly wait (i.e. block) on the runtime
4 to indicate completion of posted `receive' requests. Thus, a \charmpp\ object's
5 overall flow of control can end up fragmented across a number of separate
6 methods, obscuring the sequence in which code is expected to
7 execute. Furthermore, there are often constraints on when different pieces of
8 code should execute relative to one another, related to data and
9 synchronization dependencies.
11 Consider one way of expressing these constraints using flags, buffers, and
12 counters, as in the following example:
14 %\begin{figure}[ht]
15 \begin{center}
16 \begin{alltt}
17 // in .ci file
18 chare ComputeObject \{
19 entry void ComputeObject();
20 entry void startStep();
21 entry void firstInput(Input i);
22 entry void secondInput(Input j);
23 \};
25 // in C++ file
26 class ComputeObject : public CBase_ComputeObject \{
27 int expectedMessageCount;
28 Input first, second;
30 public:
31 ComputeObject() \{
32 startStep();
34 void startStep() \{
35 expectedMessageCount = 2;
38 void firstInput(Input i) \{
39 first = i;
40 if (\verb|--expectedMessageCount == 0|)
41 computeInteractions(first, second);
43 void recv_second(Input j) \{
44 second = j;
45 if (\verb|--expectedMessageCount == 0|)
46 computeInteractions(first, second);
49 void computeInteractions(Input a, Input b) \{
50 // do computations using a and b
51 . . .
52 // send off results
53 . . .
54 // reset for next step
55 startStep();
57 \};
58 \end{alltt}
59 \end{center}
60 %\caption{Compute Object in a Molecular Dynamics Application}
61 %\label{figchareexample}
62 %\end{figure}
63 In each step, this object expects pairs of messages, and waits to process the
64 incoming data until it has both of them. This sequencing is encoded across 4
65 different functions, which in real code could be much larger and more numerous,
66 resulting in a spaghetti-code mess.
68 Instead, it would be preferable to express this flow of control using
69 structured constructs, such as loops. \charmpp\ provides such constructs for
70 structured control flow across an object's entry methods in a notation called
71 Structured Dagger. The basic constructs of Structured Dagger (SDAG) provide for
72 \emph{program-order execution} of the entry methods and code blocks that they
73 define. These definitions appear in the {\tt .ci} file definition of the
74 enclosing chare class as a `body' of an entry method following its signature.
76 The most basic construct in SDAG is the {\tt serial} (aka the {\tt atomic}) block.
77 Serial blocks
78 contain sequential \CC code. They're also called atomic because the code within
79 them executes without returning control to the \charmpp\ runtime scheduler, and
80 thus avoiding interruption from incoming messages. The keywords atomic and serial
81 are synonymous, and you can find example programs that use atomic. However, we
82 recommend the use of serial and are considering the deprecation of the atomic keyword.
83 Typically serial blocks hold
84 the code that actually deals with incoming messages in a {\tt when} statement,
85 or to do local operations before a message is sent or after it's received. The
86 earlier example can be adapted to use serial blocks as follows:
87 \begin{center}
88 \begin{alltt}
89 // in .ci file
90 chare ComputeObject \{
91 entry void ComputeObject();
92 entry void startStep();
93 entry void firstInput(Input i) \{
94 serial \{
95 first = i;
96 if (\verb|--expectedMessageCount == 0|)
97 computeInteractions(first, second);
99 \};
100 entry void secondInput(Input j) \{
101 serial \{
102 second = j;
103 if (\verb|--expectedMessageCount == 0|)
104 computeInteractions(first, second);
109 // in C++ file
110 class ComputeObject : public CBase\_ComputeObject \{
111 ComputeObject\_SDAG\_CODE
112 int expectedMessageCount;
113 Input first, second;
115 public:
116 ComputeObject() \{
117 startStep();
119 void startStep() \{
120 expectedMessageCount = 2;
123 void computeInteractions(Input a, Input b) \{
124 // do computations using a and b
125 . . .
126 // send off results
127 . . .
128 // reset for next step
129 startStep();
132 \end{alltt}
133 \end{center}
134 Note that chare classes containing SDAG code must include a few additional declarations
135 in addition to inheriting from their {\tt CBase\_Foo} class, by incorporating the
136 {\tt Foo\_SDAG\_CODE} generated-code macro in the class.
138 Serial blocks can also specify a textual `label' that will appear in traces, as
139 follows:
140 \begin{center}
141 \begin{alltt}
142 entry void firstInput(Input i) \{
143 serial "process first" \{
144 first = i;
145 if (\verb|--expectedMessageCount == 0|)
146 computeInteractions(first, second);
149 \end{alltt}
150 \end{center}
152 In order to control the sequence in which entry methods are processed, SDAG
153 provides the {\tt when} construct. These statements, also called triggers,
154 indicate that we expect an incoming message of a particular type, and provide
155 code to handle that message when it arrives. From the perspective of a chare
156 object reaching a {\tt when} statement, it is effectively a `blocking
157 receive.'
159 Entry methods defined by a {\tt when} are
160 not executed immediately when a message targeting them is delivered, but
161 instead are held until control flow in the chare reaches a corresponding {\tt
162 when} clause. Conversely, when control flow reaches a {\tt when} clause, the
163 generated code checks whether a corresponding message has arrived: if one has
164 arrived, it is processed; otherwise, control is returned to the
165 \charmpp\ scheduler.
167 The use of {\tt when} substantially simplifies the example from above:
168 \begin{center}
169 \begin{alltt}
170 // in .ci file
171 chare ComputeObject \{
172 entry void ComputeObject();
173 entry void startStep() \{
174 when firstInput(Input first)
175 when secondInput(Input second)
176 serial \{
177 computeInteractions(first, second);
180 entry void firstInput(Input i);
181 entry void secondInput(Input j);
184 // in C++ file
185 class ComputeObject : public CBase_ComputeObject \{
186 ComputeObject_SDAG_CODE
188 public:
189 ComputeObject() \{
190 startStep();
193 void computeInteractions(Input a, Input b) \{
194 // do computations using a and b
195 . . .
196 // send off results
197 . . .
198 // reset for next step
199 startStep();
202 \end{alltt}
203 \end{center}
204 Like an {\tt if} or {\tt while} in C code, each {\tt when} clause has a body
205 made up of the statement or block following it. The variables declared as
206 arguments to the entry method triggering the when are available in the scope of
207 the body. By using the sequenced execution of SDAG code and the availability of
208 parameters to when-defined entry methods in their bodies, the counter {\tt
209 expectedMessageCount} and the intermediate copies of the received input are
210 eliminated. Note that the entry methods {\tt firstInput} and {\tt secondInput}
211 are still declared in the {\tt .ci} file, but their definition is in the SDAG
212 code. The interface translator generates code to handle buffering and
213 triggering them appropriately.
215 For simplicity, {\tt when} constructs can also specify multiple expected entry
216 methods that all feed into a single body, by separating their prototypes with
217 commas:
218 \begin{center}
219 \begin{alltt}
220 entry void startStep() \{
221 when firstInput(Input first),
222 secondInput(Input second)
223 serial \{
224 computeInteractions(first, second);
227 \end{alltt}
228 \end{center}
230 A single entry method is allowed to appear in more than one {\tt when} statement.
231 If only one of those {\tt when} statements has been triggered when the runtime
232 delivers a message to that entry method, that {\tt when} statement is guaranteed
233 to process it. If there is no trigger waiting for that entry method, then the
234 next corresponding {\tt when} to be reached will receive that message. If there is
235 more than one {\tt when} waiting on that method, which one will receive it is not
236 specified, and should not be relied upon. For an example of multiple {\tt when}
237 statements handling the same entry method without reaching the unspecified case,
238 see the CharmLU benchmark.
240 To more finely control the correspondence between incoming messages and {\tt when}
241 clauses associated with the target entry method, SDAG supports \emph{matching} on
242 reference numbers. Matching is typically used to denote an iteration of a program
243 that executes asynchronously (without any sort of barrier or other synchronization
244 between steps) or a particular piece of the problem being solved.
245 Matching is requested by placing an expression denoting the desired reference
246 number in square brackets between the entry method name and its parameter list.
247 For parameter marshalled entry methods, the reference number expression will be
248 compared for equality with the entry method's first argument. For entry methods
249 that accept an explicit message (\S~\ref{messages}), the reference number on
250 the message can be set by calling the function
251 \verb|CkSetRefNum(void *msg, CMK_REFNUM_TYPE ref)|.
252 Matching is used in the loop example below, and in
253 \examplereffile{jacobi2d-sdag/jacobi2d.ci}. Multiple {\tt when} triggers for
254 an entry method with different matching reference numbers will not conflict - each
255 will receive only corresponding messages.
257 SDAG supports the {\tt for} and {\tt while} loop constructs mostly as if they
258 appeared in plain C or C++ code. In the running example, {\tt
259 computeInteractions()} calls {\tt startStep()} when it is finished to start
260 the next step. Instead of this arrangement, the loop structure can be made
261 explicit:
262 \begin{center}
263 \begin{alltt}
264 // in .ci file
265 chare ComputeObject \{
266 entry void ComputeObject();
267 entry void runForever() \{
268 while(true) \{
269 when firstInput(Input first),
270 secondInput(Input second) serial \{
271 computeInteractions(first, second);
275 entry void firstInput(Input i);
276 entry void secondInput(Input j);
279 // in C++ file
280 class ComputeObject : public CBase_ComputeObject \{
281 ComputeObject_SDAG_CODE
283 public:
284 ComputeObject() \{
285 runForever();
288 void computeInteractions(Input a, Input b) \{
289 // do computations using a and b
290 . . .
291 // send off results
292 . . .
295 \end{alltt}
296 \end{center}
297 If this code should instead run for a fixed number of iterations, we can
298 instead use a for loop:
299 \begin{center}
300 \begin{alltt}
301 // in .ci file
302 chare ComputeObject \{
303 entry void ComputeObject();
304 entry void runForever() \{
305 for(iter = 0; iter < n; ++iter) \{
306 // Match to only accept inputs for the current iteration
307 when firstInput[iter](int a, Input first),
308 secondInput[iter](int b, Input second) serial \{
309 computeInteractions(first, second);
313 entry void firstInput(int a, Input i);
314 entry void secondInput(int b, Input j);
317 // in C++ file
318 class ComputeObject : public CBase_ComputeObject \{
319 ComputeObject_SDAG_CODE
320 int n, iter;
322 public:
323 ComputeObject() \{
324 n = 10;
325 runForever();
328 void computeInteractions(Input a, Input b) \{
329 // do computations using a and b
330 . . .
331 // send off results
332 . . .
335 \end{alltt}
336 \end{center}
337 Note that {\tt int iter;} is declared in the chare's class definition and not
338 in the {\tt .ci} file. This is necessary because the \charmpp\ interface
339 translator does not fully parse the declarations in the {\tt for} loop header,
340 because of the inherent complexities of C++.
342 SDAG also supports conditional execution of statements and blocks with {\tt if}
343 statements. The syntax of SDAG {\tt if} statements matches that of C and
344 C++. However, if one encounters a syntax error on correct-looking code in a
345 loop or conditional statement, try assigning the condition expression to a
346 boolean variable in a serial block preceding the statement and then testing that
347 boolean's value. This can be necessary because of the complexity of parsing C++
348 code.
350 In cases where multiple tasks must be processed before execution continues, but
351 with no dependencies or interactions among them, SDAG provides the {\tt
352 overlap} construct. Overlap blocks contain a series of SDAG statements within
353 them which can occur in any order. Commonly these blocks are used to hold a
354 series of {\tt when} triggers which can be received and processed in any
355 order. Flow of control doesn't leave the overlap block until all the statements
356 within it have been processed.
358 In the running example, suppose each input needs to be preprocessed independently
359 before the call to {\tt computeInteractions}. Since we don't care which order
360 they get processed in, and want it to happen as soon as possible, we can apply
361 {\tt overlap}:
362 \begin{center}
363 \begin{alltt}
364 // in .ci file
365 chare ComputeObject \{
366 entry void ComputeObject();
367 entry void startStep() \{
368 overlap \{
369 when firstInput(Input i)
370 serial \{ first = preprocess(i); \}
371 when secondInput(Input j)
372 serial \{ second = preprocess(j); \}
374 serial \{
375 computeInteractions(first, second);
378 entry void firstInput(Input i);
379 entry void secondInput(Input j);
382 // in C++ file
383 class ComputeObject : public CBase_ComputeObject \{
384 ComputeObject_SDAG_CODE
386 public:
387 ComputeObject() \{
388 startStep();
391 void computeInteractions(Input a, Input b) \{
392 // do computations using a and b
393 . . .
394 // send off results
395 . . .
396 // reset for next step
397 startStep();
400 \end{alltt}
401 \end{center}
403 Another construct offered by SDAG is the {\tt forall} loop. These loops are
404 used when the iterations of a loop can be performed independently and in any
405 order. This is in contrast to a regular {\tt for} loop, in which each iteration
406 is executed sequentially. The {\tt forall} loop can be seen as an overlap with
407 an indexed set of otherwise identical statements in the body. Its syntax is
408 \begin{center}
409 \begin{alltt}
410 forall [IDENT] (MIN:MAX,STRIDE) BODY
411 \end{alltt}
412 \end{center}
413 The range from MIN to MAX is inclusive. Its use is demonstrated through
414 distributed parallel matrix-matrix multiply shown in
415 \examplereffile{matmul/matmul.ci}
417 \subsection{The \texttt{case} Statement}
419 The \texttt{case} statement in SDAG expresses a disjunction over a set of
420 \texttt{when} clauses. In other words, if it is known that one dependency out
421 of a set will be satisfied, but which one is not known, this statement allows
422 the set to be specified and will execute the corresponding block based on which
423 dependency ends up being fulfilled.
425 The following is a basic example of the \texttt{case} statement. Note that the
426 trigger \texttt{b(), d()} will only be fulfilled if both \texttt{b()} and
427 \texttt{d()} arrive. If only one arrives, then it will partially match, and the
428 runtime will not ``commit'' to this branch until the second arrives. If another
429 dependency fully matches, the partial match will be ignored and can be used to
430 trigger another \texttt{when} later in the execution.
432 \begin{verbatim}
433 case {
434 when a() { }
435 when b(), d() { }
436 when c() { }
438 \end{verbatim}
440 A full example of the \texttt{case} statement can be found
441 \testreffile{sdag/case/caseTest.ci}.
443 \section{Usage Notes}
445 If you've added \sdag\ code to your class, you must link in the code
446 by adding ``{\it className}\_SDAG\_CODE'' inside the class declaration
447 in the .h file. This macro defines the entry points and support code
448 used by \sdag{}. Forgetting this results in a compile error (undefined
449 SDAG entry methods referenced from the .def.h file).
451 For example, an array named ``Foo'' that uses sdag code might contain:
452 \begin{center}
453 \begin{alltt}
454 class Foo : public CBase_Foo \{
455 public:
456 Foo_SDAG_CODE
457 Foo(...) \{
460 Foo(CkMigrateMessage *m) \{ \}
462 void pup(PUP::er &p) \{
465 . . .
467 \end{alltt}
468 \end{center}
470 \zap{
471 \section{Relationship to Threads}
473 Threads are typically used to perform the abovementioned sequencing.
474 Lets us code our previous example using threads.
476 %\begin{figure}[ht]
477 \begin{center}
478 \begin{alltt}
479 void compute_thread(int first_index, int second_index)
481 getPatch(first_index);
482 getPatch(second_index);
483 threadId[0] = createThread(recvFirst);
484 threadId[1] = createThread(recvSecond);
485 threadJoin(2, threadId);
486 computeInteractions(first, second);
488 void recvFirst(void)
490 recv(first, sizeof(Patch), ANY_PE, FIRST_TAG);
491 filter(first);
493 void recvSecond(void)
495 recv(second, sizeof(Patch), ANY_PE, SECOND_TAG);
496 filter(second);
498 \end{alltt}
499 \end{center}
500 %\caption{Compute Thread in a Molecular Dynamics Application}
501 %\label{figthreadexample}
502 %\end{figure}
504 Contrast the compute chare-object example in figure~\ref{figchareexample} with
505 a thread-based implementation of the same scheme in
506 figure~\ref{figthreadexample}. Functions \uw{getFirst}, and \uw{getSecond} send
507 messages asynchronously to the PatchManager, requesting that the specified
508 patches be sent to them, and return immediately. Since these messages with
509 patches could arrive in any order, two threads, \uw{recvFirst} and
510 \uw{recvSecond}, are created. These threads block, waiting for messages to
511 arrive. After each message arrives, each thread performs the filtering
512 operation. The main thread waits for these two threads to complete, and then
513 computes the pairwise interactions. Though the programming complexity of
514 buffering the messages and maintaining the counters has been eliminated in this
515 implementation, considerable overhead in the form of thread creation, and
516 synchronization in the form of {\em join} has been added. Let us now code the
517 same example in \sdag. It reduces the parallel programming complexity without
518 adding any significant overhead.
520 %\begin{figure}[ht]
521 \begin{center}
522 \begin{alltt}
523 array[1D] compute_object \{
524 entry void recv_first(Patch *first);
525 entry void recv_second(Patch *first);
526 entry void compute_object(MSG *msg)\{
527 serial \{
528 PatchManager->Get(msg->first_index,\dots);
529 PatchManager->Get(msg->second_index,\dots);
531 overlap \{
532 when recv_first(Patch *first) serial \{ filter(first); \}
533 when recv_second(Patch *second) serial \{ filter(second); \}
535 serial \{ computeInteractions(first, second); \}
538 \end{alltt}
539 \end{center}
540 %\caption{\sdag\ Implementation of the Compute Object}
541 %\label{figsdagexample}
542 %\end{figure}
546 \zap{
547 \section{Grammar}
549 \paragraph{Tokens}
551 \begin{alltt}
552 <ident> = Valid \CC{} identifier
553 <int-expr> = Valid \CC{} integer expression
554 <\CC{}-code> = Valid \CC{} code
555 \end{alltt}
557 \paragraph{Grammar in EBNF Form}
559 \begin{alltt}
560 <sdag> := <class-decl> <sdagentry>+
562 <class-decl> := "class" <ident>
564 <sdagentry> := "sdagentry" <ident> "(" <ident> "*" <ident> ")" <body>
566 <body> := <stmt>
567 | "\{" <stmt>+ "\}"
569 <stmt> := <overlap-stmt>
570 | <when-stmt>
571 | <atomic-stmt>
572 | <if-stmt>
573 | <while-stmt>
574 | <for-stmt>
575 | <forall-stmt>
577 <overlap-stmt> := "overlap" <body>
579 <atomic-stmt> := "serial" "\{" <\CC-code> "\}"
581 <if-stmt> := "if" "(" <int-expr> ")" <body> [<else-stmt>]
583 <else-stmt> := "else" <body>
585 <while-stmt> := "while" "(" <int-expr> ")" <body>
587 <for-stmt> := "for" "(" <c++-code> ";" <int-expr> ";" <c++-code> ")" <body>
589 <forall-stmt> := "forall" "[" <ident> "]" "(" <range-stride> ")" <body>
591 <range-stride> := <int-expr> ":" <int-expr> "," <int-expr>
593 <when-stmt> := "when" <entry-list> <body>
595 <entry-list> := <entry>
596 | <entry> [ "," <entry-list> ]
598 <entry> := <ident> [ "[" <int-expr> "]" ] "(" <ident> "*" <ident> ")"
600 \end{alltt}
603 \zap{
604 \sdag\ is a coordination language built on top of \charmpp\ that supports the
605 sequencing mentioned above, while overcoming limitations of thread-based
606 languages, and facilitating a clear expression of flow of control within the
607 object without losing the performance benefits of adaptive message-driven
608 execution. In other words, \sdag\ is a structured notation for specifying
609 intra-process control dependences in message-driven programs. It combines the
610 efficiency of message-driven execution with the explicitness of control
611 specification. \sdag\ allows easy expression of dependences among messages and
612 computations and also among computations within the same object using
613 when-blocks and various structured constructs. \sdag\ is adequate for
614 expressing control-dependencies that form a series-parallel control-flow graph.
615 \sdag\ has been developed on top of \charmpp\. \sdag\ allows \charmpp\ entry
616 methods (in chares, groups or arrays) to specify code (a when-block body) to be
617 executed upon occurrence of certain events. These events (or guards of a
618 when-block) are entry methods of the object that can be invoked remotely. While
619 writing a \sdag\ program, one has to declare these entries in \charmpp\
620 interface file. The implementation of the entry methods that contain the
621 when-block is written using the \sdag\ language. Grammar of \sdag\ is given in
622 the EBNF form below.
624 \subsubsection{Usage}
626 You can use SDAG to implement entry methods for any chare, chare array, group,
627 or nodegroup. Any entry method implemented using SDAG must be implemented in the
628 interface (.ci) file for its class. An SDAG entry method consists of a series of
629 SDAG constructs of the following kinds:
631 \begin{itemize}
632 \item {\tt forall} loops:
633 \item {\tt if}, {\tt for}, and {\tt while} statements: these statements have
634 the same meaning as the normal {\tt if}, {\tt for}, and {\tt while}
635 loops in sequential \CC programs. This allows the programmer to use
636 common control flow constructs outside the context of serial blocks.
637 \end{itemize}
639 \sdag{} code can be inserted into the .ci file for any array, group, or chare's entry methods.
641 For more details regarding \sdag{}, look at \examplerefdir{hello/sdag}