\section{Installation and Usage}

\subsubsection{How do I get Charm++?}

See our \htmladdnormallink{download}{http://charm.cs.uiuc.edu/download/} page.

\subsubsection{Should I use the GIT version of Charm++?}

The developers of Charm++ routinely use the latest GIT versions, and most of the
time this is the best choice. Occasionally something breaks, but the GIT version
will likely contain bug fixes not found in the releases.
\subsubsection{How do I compile Charm++?}

Run the interactive build script {\tt ./build} with no extra arguments. If this fails,
email \htmladdnormallink{charm AT cs.uiuc.edu}{mailto:charm AT cs.uiuc.edu} with the
problem. Include the build line used (this is saved automatically in
{\tt smart-build.log}).

If you have a very unusual machine configuration, you will have to run
{\tt ./build\ --help} to list all possible build options. You will then choose the closest
architecture, and you may have to modify the associated conv-mach.sh and
conv-mach.h files in src/arch to point to your desired compilers and options. If
you develop support for a significantly different platform, send the modified files to
\htmladdnormallink{charm AT cs.uiuc.edu}{mailto:charm AT cs.uiuc.edu} so we can include them
in the distribution.
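
If the interactive script does not fit your setup, you can also run {\tt ./build}
non-interactively by naming a build target, an architecture version, and extra
options. The architecture name below is only an example; the full list for your
machine is printed by {\tt ./build --help}:
\begin{alltt}
./build charm++ net-linux -O2
\end{alltt}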

\subsubsection{How do I compile AMPI?}

Run the interactive build script {\tt ./build} and choose the option for building
``Charm++, AMPI, ParFUM, FEM and other libraries''.
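
If you prefer a single non-interactive command, a build line of the following
form builds Charm++ together with AMPI (the architecture name {\tt net-linux}
is only an example; pick the one that matches your machine):
\begin{alltt}
./build AMPI net-linux -O2
\end{alltt}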

\subsubsection{Can I remove part of the charm tree after compilation to free disk space?}

Yes. Keep {\tt src}, {\tt bin}, {\tt lib}, {\tt lib\_so}, {\tt include}, and {\tt tmp}.
Once you have verified that your build is functional, you will not need
{\tt tests}, {\tt examples}, {\tt doc}, or {\tt contrib} for normal usage.
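
For example, once a test program links and runs, one possible cleanup from the
top of the charm tree is:
\begin{alltt}
rm -rf tests examples doc contrib
\end{alltt}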

\subsubsection{If the interactive script fails, how do I compile Charm++?}

See Appendix V of the Charm manual for \htmladdnormallink{Installation and Usage}{http://charm.cs.illinois.edu/manuals/html/charm++/V.html}.

\subsubsection{How do I specify the processors I want to use?}

On machines where MPI has already been wired into the job system, use the
{\tt --mpiexec} flag and {\tt -np} arguments.

For the net versions, you need to write a nodelist file which lists
all the machine hostnames available for parallel runs.
\begin{alltt}
group main
host foo1
host foo2 ++cpus 4
host foo3.bar.edu
\end{alltt}
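
A run using this nodelist might then look like the following (the program name and
processor count are placeholders, and {\tt ++nodelist} may be omitted if the file
is in a default location charmrun searches):
\begin{alltt}
./charmrun ./pgm +p4 ++nodelist ./nodelist
\end{alltt}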

For the MPI version, you need to set up an MPI configuration for available
machines as for normal MPI applications.
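
Once that is done, the MPI build is launched like any other MPI program; the
launcher name and its flags depend on your MPI installation, for example:
\begin{alltt}
mpiexec -np 8 ./pgm
\end{alltt}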

\subsubsection{How do I use {\em ssh} instead of the deprecated {\em rsh}?}

You need to set up your {\tt .ssh/authorized\_keys} file
correctly. Set up password-less logins by generating a key pair with
{\tt ssh-keygen} and adding your public key to the
{\tt .ssh/authorized\_keys} file on each remote host.
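
A minimal sketch of that setup, assuming the default key file names and that
{\tt ssh-copy-id} is available (otherwise append the public key to
{\tt authorized\_keys} by hand):
\begin{alltt}
ssh-keygen -t rsa
ssh-copy-id foo1
ssh foo1 echo Hello
\end{alltt}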

Finally, in the {\tt .nodelist} file,
you specify the shell to use for remote execution of a program using
the keyword {\em ++shell}.
\begin{alltt}
group main ++shell ssh
host foo1
host foo2
host foo3
\end{alltt}

\subsubsection{Can I use the serial library X from a Charm program?}

Yes. Some of the known working serial libraries include:
\begin{itemize}
\item The Tcl/Tk interpreter (in NAMD)
\item The Python interpreter (in the Cosmo prototype)
\item OpenGL graphics (in graphics demos)
\item Metis mesh partitioning (included with Charm)
\item ATLAS, BLAS, LAPACK, ESSL, FFTW, MASSV, ACML, MKL, BOOST
\end{itemize}

In general, any serial library should work fine with Charm++.

\subsubsection{How do I get the command-line switches available for a specific
program?}

Try \begin{alltt}./charmrun ./pgm --help\end{alltt} to see a list of parameters
at the command line. The charmrun arguments are documented in the
\htmladdnormallink{Installation and Usage Manual}{http://charm.cs.uiuc.edu/manuals/html/install/manual.html};
the arguments for the installed libraries are listed in the library manuals.

\subsubsection{What should I do if my program hangs while gathering
CPU topology information at startup?}

This is an indication that your cluster's DNS server is not responding
properly. Ideally, the DNS resolver configured to serve your cluster
nodes should be able to rapidly map their hostnames to their IP
addresses. As an immediate workaround, you can run your program with
the {\tt +skip\_cpu\_topology} flag, at the possible cost of reduced
performance. Another workaround is installing and running {\tt nscd},
the ``name service caching daemon'', on your cluster nodes; this may
add some noise on your systems and hence reduce performance. A third
workaround is adding the addresses and names of all cluster nodes to
each node's {\tt /etc/hosts} file; this poses maintainability problems
for ongoing system administration.
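
For example, the flag is passed on the command line like any other runtime
option (the program name and processor count here are placeholders):
\begin{alltt}
./charmrun ./pgm +p8 +skip_cpu_topology
\end{alltt}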