\section{Installation and Usage}

\subsubsection{How do I get Charm++?}

See our \htmladdnormallink{download}{http://charm.cs.uiuc.edu/download/} page.

\subsubsection{Should I use the GIT version of Charm++?}

The developers of Charm++ routinely use the latest GIT versions, and most of the
time this is the best choice. Occasionally something breaks, but the GIT version
will likely contain bug fixes not found in the releases.
\subsubsection{How do I compile Charm++?}

Run the interactive build script {\tt ./build} with no extra arguments. If this fails,
email \htmladdnormallink{charm AT cs.uiuc.edu}{mailto:charm AT cs.uiuc.edu} with the
problem. Include the build line used (this is saved automatically in
{\tt smart-build.log}).

If you have a very unusual machine configuration, you will have to run
{\tt ./build\ --help} to list all possible build options. You will then choose the closest
architecture, and you may have to modify the associated conv-mach.sh and
conv-mach.h files in src/arch to point to your desired compilers and options. If
you develop support for a significantly different platform, send the modified files to
\htmladdnormallink{charm AT cs.uiuc.edu}{mailto:charm AT cs.uiuc.edu} so we can include them
in the distribution.
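
If the interactive script does not fit your setup, you can also run {\tt ./build}
non-interactively by naming a build target, an architecture version, and extra
options. The architecture name below is only an example; the full list for your
machine is printed by {\tt ./build --help}:
\begin{alltt}
./build charm++ net-linux -O2
\end{alltt}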

\subsubsection{How do I compile AMPI?}

Run the interactive build script {\tt ./build} and choose the option for building
``Charm++, AMPI, ParFUM, FEM and other libraries''.
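
If you prefer a single non-interactive command, a build line of the following
form builds Charm++ together with AMPI (the architecture name {\tt net-linux}
is only an example; pick the one that matches your machine):
\begin{alltt}
./build AMPI net-linux -O2
\end{alltt}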

\subsubsection{Can I remove part of the charm tree after compilation to free disk space?}

Yes. Keep {\tt src}, {\tt bin}, {\tt lib}, {\tt lib\_so}, {\tt include}, and {\tt tmp}.
Once you have verified that your build is functional, you will not need
{\tt tests}, {\tt examples}, {\tt doc}, or {\tt contrib} for normal usage.
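
For example, once a test program links and runs, one possible cleanup from the
top of the charm tree is:
\begin{alltt}
rm -rf tests examples doc contrib
\end{alltt}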

\subsubsection{If the interactive script fails, how do I compile Charm++?}

See Appendix V of the Charm manual for \htmladdnormallink{Installation and Usage}{http://charm.cs.illinois.edu/manuals/html/charm++/V.html}.

\subsubsection{How do I specify the processors I want to use?}

On machines where MPI has already been wired into the job system, use the
{\tt --mpiexec} flag and {\tt -np} arguments.

For the net versions, you need to write a nodelist file which lists
all the machine hostnames available for parallel runs.
\begin{alltt}
group main
host foo1
host foo2 ++cpus 4
host foo3.bar.edu
\end{alltt}
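
A run using this nodelist might then look like the following (the program name and
processor count are placeholders, and {\tt ++nodelist} may be omitted if the file
is in a default location charmrun searches):
\begin{alltt}
./charmrun ./pgm +p4 ++nodelist ./nodelist
\end{alltt}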

For the MPI version, you need to set up an MPI configuration for available
machines as for normal MPI applications.
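
Once that is done, the MPI build is launched like any other MPI program; the
launcher name and its flags depend on your MPI installation, for example:
\begin{alltt}
mpiexec -np 8 ./pgm
\end{alltt}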

\subsubsection{How do I use {\em ssh} instead of the deprecated {\em rsh}?}

You need to set up your {\tt .ssh/authorized\_keys} file
correctly. Set up password-less logins by generating a key pair with
{\tt ssh-keygen} and adding your public key to the
{\tt .ssh/authorized\_keys} file on each remote host.
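
A minimal sketch of that setup, assuming the default key file names and that
{\tt ssh-copy-id} is available (otherwise append the public key to
{\tt authorized\_keys} by hand):
\begin{alltt}
ssh-keygen -t rsa
ssh-copy-id foo1
ssh foo1 echo Hello
\end{alltt}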

Finally, in the {\tt .nodelist} file,
you specify the shell to use for remote execution of a program using
the keyword {\em ++shell}.
\begin{alltt}
group main ++shell ssh
host foo1
host foo2
host foo3
\end{alltt}

\subsubsection{Can I use the serial library X from a Charm program?}

Yes. Some of the known working serial libraries include:
\begin{itemize}
\item The Tcl/Tk interpreter (in NAMD)
\item The Python interpreter (in the Cosmo prototype)
\item OpenGL graphics (in graphics demos)
\item Metis mesh partitioning (included with Charm)
\item ATLAS, BLAS, LAPACK, ESSL, FFTW, MASSV, ACML, MKL, BOOST
\end{itemize}

In general, any serial library should work fine with Charm++.

\subsubsection{How do I get the command-line switches available for a specific
program?}

Try \begin{alltt}./charmrun ./pgm --help\end{alltt} to see a list of parameters
at the command line. The charmrun arguments are documented in the
\htmladdnormallink{Installation and Usage Manual}{http://charm.cs.uiuc.edu/manuals/html/install/manual.html};
the arguments for the installed libraries are listed in the library manuals.

\subsubsection{What should I do if my program hangs while gathering
CPU topology information at startup?}

This is an indication that your cluster's DNS server is not responding
properly. Ideally, the DNS resolver configured to serve your cluster
nodes should be able to rapidly map their hostnames to their IP
addresses. As an immediate workaround, you can run your program with
the {\tt +skip\_cpu\_topology} flag, at the possible cost of reduced
performance. Another workaround is installing and running {\tt nscd},
the ``name service caching daemon'', on your cluster nodes; this may
add some noise on your systems and hence reduce performance. A third
workaround is adding the addresses and names of all cluster nodes to
each node's {\tt /etc/hosts} file; this poses maintainability problems
for ongoing system administration.
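
For example, the flag is passed on the command line like any other runtime
option (the program name and processor count here are placeholders):
\begin{alltt}
./charmrun ./pgm +p8 +skip_cpu_topology
\end{alltt}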