\section{\charmpp{} Programming}

\subsubsection{What's the basic programming model for Charm++?}

Parallel objects using ``Asynchronous Remote Method Invocation'':

\begin{description}
\item[Asynchronous] in that you {\em do not block} until the method returns; the
caller continues immediately.

\item[Remote] in that the two objects may be separated by a network.

\item[Method Invocation] in that it's just C++ classes calling each other's
methods.
\end{description}

\subsubsection{What is an ``entry method''?}

Entry methods are the methods of a chare to which other chares can send messages.
They are declared in the .ci files, and they must be defined as public methods
of the C++ object representing the chare.

\subsubsection{When I invoke a remote method, do I block until that method returns?}

No! This is one of the biggest differences between Charm++ and most
other ``remote procedure call'' systems, such as Java RMI or RPC.
``Invoke an asynchronous method'' and ``send a message'' have exactly the same
semantics and implementation.
Since the invoking method does not wait for the remote method to terminate, it
normally cannot receive any return value. (See below for ways to return values.)

\subsubsection{Why does Charm++ use asynchronous methods?}

Asynchronous method invocation is more efficient because it can be
implemented as a single message send. Unlike with synchronous methods,
thread blocking and unblocking and a return message are not needed.

Another big advantage of asynchronous methods is that it's easy to make
things run in parallel. If I execute:
\begin{alltt}
  a->foo();
  b->bar();
\end{alltt}
then {\tt foo} and {\tt bar} can run at the same time; there's no reason {\tt bar} has
to wait for {\tt foo}.

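The ``single message send'' idea can be illustrated with a toy model in plain C++. This is only a sketch and does not use the Charm++ API: {\tt ToyScheduler} and {\tt asyncDemo} are hypothetical names, and the model is single-threaded, so it shows only that the caller continues before either method runs, not real parallel execution.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a runtime scheduler (hypothetical, not Charm++ code):
// "invoking" a method only enqueues a message; the method body runs later,
// when the scheduler delivers that message.
class ToyScheduler {
public:
    // Asynchronous "send": returns immediately without running the method.
    void send(std::function<void()> method) { pending_.push(std::move(method)); }

    // Deliver queued messages in order, including any sent during delivery.
    void run() {
        while (!pending_.empty()) {
            std::function<void()> method = std::move(pending_.front());
            pending_.pop();
            method();
        }
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::queue<std::function<void()>> pending_;
};

// Records the order of events so the non-blocking behaviour is visible.
std::vector<std::string> asyncDemo() {
    ToyScheduler sched;
    std::vector<std::string> log;
    sched.send([&log] { log.push_back("foo ran"); });  // plays the role of a->foo();
    sched.send([&log] { log.push_back("bar ran"); });  // plays the role of b->bar();
    assert(sched.pendingCount() == 2);   // neither method has executed yet
    log.push_back("caller continued");   // the caller was never blocked
    sched.run();
    return log;
}
```

In this toy the first log entry is always the caller's, because both sends return before any method body executes.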
\subsubsection{Can I make a method synchronous? Can I then return a value?}

Yes. If you want a synchronous method, so that the caller blocks, use the {\tt [sync]}
keyword before the method in the .ci file. This requires the sender to be a threaded
entry method, as it will be suspended until the callee finishes.
Sync entry methods are allowed to return values to the caller.

\subsubsection{What is a threaded entry method? How does one make an entry method threaded?}

A threaded entry method is an entry method for a chare that executes
in a separate user-level thread. It is useful when the entry method wants
to suspend itself (for example, to wait for more data). Note that
threaded entry methods have nothing to do with kernel-level threads or
pthreads; they run in user-level threads that are scheduled by Charm++
itself.

To make an entry method threaded, add the keyword
{\em threaded} within square brackets after the {\em entry} keyword in the
interface file:
\begin{alltt}
module M \{
  chare X \{
    entry [threaded] E1(void);
  \};
\};
\end{alltt}

\subsubsection{If I don't want to use threads, how can an asynchronous method return a value?}

The usual way to get data back to
your caller is via another invocation in the opposite direction:
\begin{alltt}
void A::start(void) \{
  b->giveMeSomeData();
\}
void B::giveMeSomeData(void) \{
  a->hereIsTheData(data);
\}
void A::hereIsTheData(myclass_t data) \{
  ...use data somehow...
\}
\end{alltt}
This is contorted, but it exactly matches what the machine has to do.
The difficulty of accessing remote data encourages programmers to use local
data, bundle outgoing requests, and develop higher-level abstractions,
which leads to good performance and good code.

\subsubsection{Isn't there a better way to send data back to whoever called me?}

The above example is very non-modular, because {\em b} has to know
that {\em a} called it, and which method of {\em a} to call back. For
this kind of request/response code, you can abstract away the ``where to
return the data'' with a {\em CkCallback} object:
\begin{alltt}
void A::start(void) \{
  b->giveMeSomeData(CkCallback(CkIndex_A::hereIsTheData,thisProxy));
\}
void B::giveMeSomeData(CkCallback returnDataHere) \{
  returnDataHere.send(data);
\}
void A::hereIsTheData(myclass_t data) \{
  ...use data somehow...
\}
\end{alltt}
Now {\em b} can be called from several different places in {\em a},
or from several different modules.

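The decoupling that a callback provides can be sketched in plain C++, with {\tt std::function} standing in for {\em CkCallback}. This is an assumption-laden toy, not Charm++ code: the alias {\tt ReplyCallback}, the {\tt int} payload, and the immediate (synchronous) delivery are all illustrative choices.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for CkCallback: it only captures "where to send the
// reply". Real callbacks are delivered as messages; here the call is direct.
using ReplyCallback = std::function<void(int)>;

class B {
public:
    // B has no knowledge of its caller; it just fires the callback it was given.
    void giveMeSomeData(const ReplyCallback& returnDataHere) const {
        returnDataHere(data_);
    }

private:
    int data_ = 42;  // illustrative payload
};

class A {
public:
    // A decides where the reply should go and passes that choice along.
    void start(B& b) {
        b.giveMeSomeData([this](int data) { hereIsTheData(data); });
    }

    void hereIsTheData(int data) { received_ = data; }

    int received() const { return received_; }

private:
    int received_ = 0;
};
```

Because {\tt B} depends only on the callback type, any other class could request the same data without {\tt B} changing.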
\subsubsection{Why should I prefer the callback way to return data rather than using {\tt [sync]} entry methods?}

There are a few reasons for that:

\begin{itemize}

\item
The caller needs to be threaded, which implies some overhead in creating the
thread. Moreover, the threaded entry method will suspend while waiting for the data,
preventing any code after the remote method invocation from proceeding in parallel.

\item
Threaded entry methods are still methods of an object. While they are suspended,
other entry methods for the same object (or even the same threaded entry method)
can be called. This can cause problems if the suspended method
leaves the object in an inconsistent state.

\item
Finally, and probably most important, {\tt [sync]} entry methods can only be
used to return a value that can be computed by a single chare. When more
flexibility is needed, such as when the resulting value needs the
contribution of multiple objects, the callback methodology is the only one
available. The caller could, for example, send a broadcast to a chare array, which
will use a reduction to collect the results after they have been computed.

\end{itemize}

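The broadcast-then-reduce pattern from the last item can be sketched in plain C++. This is a single-process toy under stated assumptions, not the Charm++ reduction API: {\tt SumReduction} and {\tt sumOfSquares} are hypothetical names, and a loop stands in for the broadcast.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Toy sum-reduction (hypothetical): fires the callback exactly once, when
// every expected element has contributed its partial result.
class SumReduction {
public:
    SumReduction(std::size_t expected, std::function<void(double)> done)
        : expected_(expected), done_(std::move(done)) {}

    // Called once per array element with that element's partial result.
    void contribute(double value) {
        assert(count_ < expected_);  // more contributions than elements is a bug
        sum_ += value;
        if (++count_ == expected_) done_(sum_);
    }

private:
    std::size_t expected_;
    std::size_t count_ = 0;
    double sum_ = 0.0;
    std::function<void(double)> done_;
};

// "Broadcast" a request to n elements; element i contributes i*i.
// Returns the reduced value, or -1.0 if the callback never fired (n == 0).
double sumOfSquares(std::size_t n) {
    double result = -1.0;
    SumReduction reduction(n, [&result](double total) { result = total; });
    for (std::size_t i = 0; i < n; ++i) {
        reduction.contribute(static_cast<double>(i * i));
    }
    return result;
}
```

The caller never waits on any single element; it only learns the combined value when the callback runs, which is something a {\tt [sync]} call to one chare cannot express.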
\subsubsection{How does initialization in Charm++ work?}

Each processor executes the following operations strictly in order:
\begin{enumerate}
\item All methods registered as {\em initnode} are executed;
\item All methods registered as {\em initproc} are executed;
\item On processor zero, the constructors of all {\em mainchares} are invoked (the ones taking a {\tt CkArgMsg*});
\item The read-onlies are propagated from processor zero to all other processors;
\item The nodegroups are created;
\item The groups are created. During this phase, for all chare arrays that were created with a block allocation, the corresponding array elements are instantiated;
\item Initialization is complete and all messages are available for processing, including the messages responsible for instantiating manually inserted array elements.
\end{enumerate}

This implies that you can assume that each step has completely finished
before the next one starts, and that any side effects of the previous steps are
committed (and can therefore be used).

Inside a single step there is no order guarantee. This implies that, for example,
two groups allocated from the mainchare can be instantiated in any order. The only
exception to this is processor zero, where chare objects are instantiated
immediately when allocated in the mainchare, i.e., if two groups are allocated,
their order is fixed by the allocation order in the mainchare constructing them.
Again, this is only valid for processor zero; this assumption should not be made
on any other processor.

Note that if array elements are allocated in block (by specifying the
number of elements at the end of the {\tt ckNew} function), they are all
instantiated before normal execution is resumed; if manual insertion is used,
each element can be constructed at any time on its home processor, and not
necessarily before other regular communication messages have been delivered to
other chares (including other elements of the same array).

\subsubsection{Does Charm++ support C and Fortran?}

C and Fortran routines can be called from Charm++ using the usual API conventions for accessing them from C++. AMPI supports Fortran directly, but direct use
of Charm++ semantics from Fortran is at an immature stage. Contact us at \htmladdnormallink{charm AT cs.uiuc.edu}{mailto:charm AT cs.uiuc.edu} if you are interested in pursuing this further.

\subsubsection{What is a proxy?}

A proxy is a local C++ class that represents a remote C++ class. When
you invoke a method on a proxy, it sends the request across the network
to the real object it represents. In Charm++, all communication is
done using proxies.

A proxy class for each of your classes is generated based on the methods
you list in the .ci file.

\subsubsection{What are the different ways one can create proxies?}

Proxies can be:
\begin{itemize}
\item
Created using ckNew. This is the only method that actually creates a new
parallel object. ``CProxy\_A::ckNew(...)'' returns a proxy, as described in
the \htmladdnormallink{manual}{http://charm.cs.uiuc.edu/manuals/html/charm++/}.

\item
Copied from an existing proxy. This happens when you assign one proxy to another
or send a proxy in a message.

\item
Created from a ``handle''. This happens when you say ``CProxy\_A p=thishandle;''.

\item
Created uninitialized. This is the default when you say ``CProxy\_A p;''.
You'll get a runtime error ``proxy has not been initialized'' if you try
to use an uninitialized proxy.
\end{itemize}

\subsubsection{What is wrong if I do {\tt A *ap = new CProxy\_A(handle)}?}

This will not compile, because a {\em CProxy\_A} is not an {\em A}.
What you want is {\em CProxy\_A *ap = new CProxy\_A(handle)}.

%<br>&nbsp;
%<li>
%<b>When sending messages by invoking a method, can we be just in the middle
%of executing another method? I tried to invoke one entry method in one
%object while that target object was in the middle of execution of another
%method, and could not finish until he'd receive the message. Is there something
%wrong with this kind of thinking and can we execute only one method at
%a time? How can I then make two-way communication between methods of two
%objects?</b></li>

%<br>Only one method can execute on a processor at any time. Message sends
%do not interrupt an ongoing execution. Note the lack of <b>blocking receives</b>
%in Charm++.
%<p>The way you implement two-way communication in Charm++ between two objects
%is as follows:
%<p>Object A calls method M on object B. The argument to the method M is
%a message Msg, which contains a field that contains object A's handle (or
%ChareID). Object B's method gets invoked. It constructs a proxy to A using
%A's handle from the message, and invokes a method on A using that proxy.
%<br>&nbsp;

\subsubsection{Why is the {\em def.h} usually included at the end? Is it
necessary or can I just include it at the beginning?}

You can include the {\em def.h} file once you've actually declared
everything it will reference: all your chares and readonly variables.
If your chares and readonlies are in your own header files, it is legal
to include the {\em def.h} right away.

However, if the class declaration for a chare isn't visible when you
include the {\em def.h} file, you'll get a confusing compiler error.
This is why we recommend including the {\em def.h} file at the end.

\subsubsection{How can I use a global variable across different processors?}

Make the global variable ``readonly'' by declaring it in the .ci file.
Remember also that read-onlies can be safely set only in the mainchare
constructor. Any change after the mainchare constructor has finished will be
local to the processor that made the change. To change a global variable later
in the program, every processor must modify it accordingly (e.g., by using a chare
group; note that chare arrays are not guaranteed to cover all processors).

\subsubsection{Can I have a class static read-only variable?}

One can have class-static variables as read-onlies. Inside a chare,
group or array declaration in the {\em .ci} file, one can have a readonly
variable declaration. Thus:
\begin{alltt}
chare someChare \{
  readonly CkGroupID someGroup;
\};
\end{alltt}
is fine. In the {\em .h} declaration for {\em class someChare},
you will have to put {\em someGroup} as a public static variable,
and you are done.

You then refer to the variable in your program as {\em someChare::someGroup}.

\subsubsection{How do I measure the time taken by a program or operation?}

You can use {\tt CkWallTimer()} to determine the time on some particular
processor. To time some parallel computation, you need to call {\tt CkWallTimer}
on some processor, do the parallel computation, then call {\tt CkWallTimer} again
on the same processor and subtract.

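The call-work-call-subtract pattern can be sketched in standard C++. Here {\tt wallTimer} is a hypothetical stand-in built on {\tt std::chrono} so that the example runs without Charm++; in a Charm++ program you would call {\tt CkWallTimer()} at the same two points. The helper {\tt timeIt} is likewise an illustrative name.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for CkWallTimer(): wall-clock seconds as a double,
// measured from the first call. Only differences between calls are meaningful.
double wallTimer() {
    using Clock = std::chrono::steady_clock;
    static const Clock::time_point start = Clock::now();
    return std::chrono::duration<double>(Clock::now() - start).count();
}

// Times a callable: read the timer, do the work, read the timer again on the
// same "processor", and subtract.
template <typename Work>
double timeIt(Work work) {
    const double before = wallTimer();
    work();
    const double after = wallTimer();
    return after - before;
}
```

In a real parallel run the work finishes asynchronously, so the second timer read would presumably go in whichever entry method learns that the computation is done (for example, a reduction target) rather than immediately after the sends.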
\subsubsection{What do {\tt CmiAssert} and
{\tt CkAssert} do?}

These are just like the standard C++ {\em assert} calls in {\em <assert.h>}:
they call abort if the condition passed to them is false.

We use our own version rather than the standard version because we have
to call {\em CkAbort}, and because we can turn our asserts off when
{\em CMK\_OPTIMIZE} is defined, as it is when {\em --with-production} is used on the build line.

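The two reasons given above, a custom abort routine and a compile-time off switch, can be illustrated with a small macro in plain C++. This is a sketch of the general technique, not the Charm++ definition: {\tt TOY\_ASSERT}, {\tt TOY\_OPTIMIZE}, {\tt toyAbort}, and {\tt checkedDivide} are all hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical abort hook playing the role of CkAbort: report, then terminate.
[[noreturn]] inline void toyAbort(const char* message) {
    std::fprintf(stderr, "abort: %s\n", message);
    std::abort();
}

// When TOY_OPTIMIZE is defined (analogous to a production build), the check
// compiles away; otherwise a false condition calls the custom abort hook.
#ifdef TOY_OPTIMIZE
#  define TOY_ASSERT(cond) ((void)0)
#else
#  define TOY_ASSERT(cond) \
      ((cond) ? (void)0 : toyAbort("assertion failed: " #cond))
#endif

// Example use: guard a precondition that should never be violated.
int checkedDivide(int numerator, int denominator) {
    TOY_ASSERT(denominator != 0);
    return numerator / denominator;
}
```

Routing failures through one abort hook lets a runtime shut down the whole parallel job cleanly, which a bare {\tt abort()} on a single process would not do.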
\subsubsection{Can I know how many messages are being sent to a chare?}

There is no nice library to solve this problem, as some messages might be queued
on the receiving processor, some on the sender, and some on the network. You can
still:
\begin{itemize}
\item Send a return receipt message to the sender, and wait until all the
receipts for the messages sent have arrived, then go to a barrier;
\item Do all the sends, then wait for quiescence.
\end{itemize}

\subsubsection{What is ``quiescence''? How does it work?}

Quiescence is when nothing is happening anywhere on the parallel machine.

A low-level background task counts sent and received messages.
When, across the machine, all the messages that have been sent have been
received, and nothing is being processed, quiescence is triggered.

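The counting condition can be written down as a small predicate in plain C++. This sketch assumes a consistent snapshot of every processor's counters is already in hand; a real detector has to cope with counters changing while they are being collected, which is ignored here. {\tt ProcCounters} and {\tt isQuiescent} are hypothetical names, not Charm++ types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-processor bookkeeping kept by a background task.
struct ProcCounters {
    unsigned long sent = 0;      // messages this processor has sent
    unsigned long received = 0;  // messages this processor has received
    bool busy = false;           // true while a message is being processed
};

// Quiescence holds when no processor is busy and, summed over the machine,
// every message that was sent has also been received.
bool isQuiescent(const std::vector<ProcCounters>& procs) {
    unsigned long totalSent = 0;
    unsigned long totalReceived = 0;
    for (const ProcCounters& p : procs) {
        if (p.busy) return false;
        totalSent += p.sent;
        totalReceived += p.received;
    }
    return totalSent == totalReceived;
}
```

Only the machine-wide totals have to match: a single processor may well have sent more messages than it received.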
\subsubsection{Should I use quiescence detection?}

Probably not.

See the \htmladdnormallink{Completion Detection}{http://charm.cs.illinois.edu/manuals/html/charm++/12.html\#SECTION02340000000000000000} section of the manual for instructions on a more local inactivity detection scheme.

In some ways, quiescence is a very strong property (it guarantees {\em nothing}
is happening {\em anywhere}), so if some other library is doing something,
you won't reach quiescence. In other ways, quiescence is a very weak property,
since, unlike a reduction, it doesn't guarantee anything about the state of your
application, only that nothing is happening. Because quiescence
detection is on the one hand so strong that it breaks modularity, and on the
other hand too weak to guarantee anything useful, it's often better
to use something else.

Often global properties can be replaced by much easier-to-compute local
properties. For example, my object could wait until all {\em its} neighbors
have sent it messages (a local property my object can easily detect by
counting message arrivals), rather than waiting until {\em all} neighbor
messages across the whole machine have been sent (a global property that's
difficult to determine). Sometimes a simple reduction is needed instead
of quiescence; it has the benefits of being activated explicitly (each
element of a chare array or chare group has to call {\tt contribute}) and of
allowing some data to be collected at the same time. A reduction is also a few
times faster than quiescence detection. Finally, there are a few situations,
such as some tree-search problems, where quiescence detection is actually the
most sensible, efficient solution.

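The local alternative described above, counting message arrivals from one's own neighbors, fits in a few lines of plain C++. This is an illustrative sketch only: the class name {\tt Cell}, the method {\tt receiveFromNeighbor}, and the choice to reset the counter for a next iteration are assumptions, not part of Charm++.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical object that proceeds once all of its own neighbors have sent
// it a message. It needs no global knowledge, only a local arrival count.
class Cell {
public:
    // numNeighbors is assumed to be at least 1.
    explicit Cell(std::size_t numNeighbors) : expected_(numNeighbors) {}

    // Call once per arriving neighbor message. Returns true exactly when the
    // last expected message of the current round has arrived.
    bool receiveFromNeighbor() {
        assert(arrived_ < expected_);
        ++arrived_;
        if (arrived_ < expected_) return false;
        arrived_ = 0;  // ready to count the next round of messages
        return true;
    }

private:
    std::size_t expected_;
    std::size_t arrived_ = 0;
};
```

A cell with three neighbors would see {\tt false}, {\tt false}, {\tt true} on successive arrivals and could start its next step on the third, without waiting for the rest of the machine.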
%<li>
%<b>Can a chare be deleted by using </b><tt>delete this</tt><b>?</b></li>

%<br>You can delete a chare using <tt>delete this;</tt> as long as you do
%not refer to any of its instance variables, or don't send it a message
%after that. <tt>delete this</tt>, by now, is a valid programming construct
%after much debate. The ANSI C++ specification specifically mentions it.
%To delete array elements, use <tt>ckDestroy()</tt> instead of <tt>delete
%this;</tt>.
%<br>&nbsp;
%<li>
%<b>Is there any way to put inheritance in a
%</b><tt>.ci</tt><b> file?</b></li>

%<br>Yes!
%<p>The syntax is exactly like C++, but there's no "public" keyword:
%<pre>array [1D] subArray : parentArray {
%&nbsp; ...the usual...
%};</pre>
%Virtual methods work right away, and entry methods which are declared virtual
%in the .h file are still virtual, even across processors. Multiple inheritance
%works, too. See
%<tt>charm/pgms/charm++/ megatest/inherit.[ihC]</tt> for
%an exhaustive example.
%<br>&nbsp;
%<li>
%<b>Are accumulators supported in Charm++?</b></li>

%<br>No, they are no longer supported. You can get almost exactly the same
%behavior by using a reduction or defining your own group.
%<br>&nbsp;
%<li>
%<b>Can I find out if there are any pending messages for a chare?</b></li>

%<br>No. On a parallel machine, messages destined for a particular chare
%might be queued on the sender, on the network, or queued on the local machine.&nbsp;
%Since the first two are never going to be accessible to you, we didn't
%make the last accessible either.
%<br>&nbsp;</ol>