README.ampi

Adaptive MPI (AMPI)
-------------------
AMPI is an implementation of the MPI standard written on top of Charm++, meant
to give MPI applications access to high-level, application-independent features
such as overdecomposition (processor virtualization), dynamic load balancing,
automatic fault tolerance, and overlap of computation and communication. For
more information on all topics related to AMPI, consult the AMPI manual here:

    http://charm.cs.illinois.edu/manuals/html/ampi/manual.html


Building AMPI
-------------
AMPI has its own target in the build system. You can run the top-level
build script interactively using "./build", or you can specify your
architecture, operating system, compilers, and other options directly.
For example:

    ./build AMPI netlrts-linux-x86_64 gfortran gcc --with-production


Compiling and Linking AMPI Programs
-----------------------------------
AMPI source files can be compiled and linked with the wrappers found
in bin/, such as ampicc, ampicxx, ampif77, and ampif90, or with
"charmc -language ampi". For example:

    ampif90 pgm.f90 -o pgm

To enable transparent migration of user heap data, link with
"-memory isomalloc". To perform dynamic load balancing, link in a Charm++
load balancer (suite) using "-module <LB>". For example:

    ampicc pgm.c -o pgm -memory isomalloc -module CommonLBs

Note that you need to specify a Fortran compiler when building Charm++/AMPI
for Fortran compilation to work.


Running AMPI Programs
---------------------
AMPI programs can be run with charmrun like any other Charm++ program. In
addition to the number of processes, specified with "+p n", AMPI programs
also take the total number of virtual processors (VPs), that is, MPI ranks,
as "+vp n". For example, to run an AMPI program 'pgm' on 4 processors using
32 ranks, do:

    ./charmrun +p 4 ./pgm +vp 32

To run with dynamic load balancing, add "+balancer <LB>":

    ./charmrun +p 4 ./pgm +vp 32 +balancer RefineLB


Porting to AMPI
---------------
Global and static variables are unsafe for use in virtualized AMPI programs.
This is because globals are defined at the process level, and AMPI ranks are
implemented as user-level threads, which may share a process with other ranks.
Therefore, to run with more than 1 VP per processor, all globals and statics
that are non-readonly and whose value depends on rank must be modified to use
local storage. Consult the AMPI manual for more information on global
variable privatization and automated approaches to privatization.
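As a rough illustration of moving rank-dependent globals into local storage,
the C sketch below replaces two mutable file-scope variables with a struct
that each rank owns and passes explicitly. The names RankState,
rank_state_init, accumulate, and SCALE are hypothetical and not part of AMPI.
The read-only constant stays global because every rank sees the same value.
This is one manual approach under those assumptions, not the method the AMPI
manual prescribes.

```c
#include <stddef.h>

/* Unsafe under virtualization: one copy per process, shared by every
 * rank (user-level thread) that lives in that process.
 *
 *   static int my_rank;
 *   static double partial_sum;
 */

/* Hypothetical per-rank state that replaces the globals above. */
typedef struct {
    int rank;           /* id of the rank that owns this state */
    double partial_sum; /* accumulator whose value differs per rank */
} RankState;

/* Read-only data can stay global: its value does not depend on rank. */
static const double SCALE = 0.5;

/* Initialize the state owned by one rank. */
static void rank_state_init(RankState *s, int rank) {
    s->rank = rank;
    s->partial_sum = 0.0;
}

/* Add scaled values to this rank's own accumulator only. */
static void accumulate(RankState *s, const double *values, size_t n) {
    for (size_t i = 0; i < n; i++) {
        s->partial_sum += SCALE * values[i];
    }
}
```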

AMPI programs must have the following main function signatures, so that AMPI
can bootstrap before invoking the user's main function:
    * C/C++ programs should use "int main(int argc, char **argv)"
    * Fortran programs should use "Subroutine MPI_Main" instead of
      "Program Main"


Incompatibilities and Extensions
--------------------------------
AMPI has some known flaws and incompatibilities with other MPI implementations:
    * MPI_Cancel does not actually cancel pending communication.
    * MPI_Sendrecv_replace gives incorrect results.
    * Persistent sends with Irsend don't work.
    * Isend/Irecv do not work when using MPI_LONG_DOUBLE.
    * MPI_Get_elements returns the expected number of elements instead of the
      actual number received.
    * MPI_Unpack gives incorrect results.
    * Data alignment in user-defined types does not match the MPI standard.
    * Scatter/gather using noncontiguous types gives incorrect results.
    * Datatypes are not reused, freed, or reference counted.
    * The PMPI profiling interface is not implemented in AMPI.

AMPI also has extensions to the MPI standard to enable use of the high-level
features provided by the Charm++ adaptive runtime system:
    * MPI_Migrate checks for load imbalance and rebalances the load using
      the strategy linked in and specified at job launch.
    * MPI_Checkpoint performs a checkpoint to disk.
    * MPI_MemCheckpoint performs a double in-memory checkpoint.
    * MPI_Register is used to register PUP routines and user data.
    * MPI_Get_userdata returns a pointer to user data managed by the runtime.
    * MPI_Register_main is used to register multiple AMPI modules.
    * MPI_Set_load sets the calling rank's load to the given user value.
    * MPI_Start_measure starts load balance information collection.
    * MPI_Stop_measure stops load balance information collection.
    * MPI_MigrateTo migrates the calling rank to the given PE.
    * MPI_Setmigratable sets the migratability of the given communicator.
    * MPI_Num_nodes returns the total number of nodes.
    * MPI_Num_pes returns the total number of PEs.
    * MPI_My_node returns the local node number.
    * MPI_My_pe returns the local PE number.
    * MPI_Command_argument_count returns the number of command line arguments
      given to a Fortran AMPI program excluding charmrun and AMPI parameters.
    * MPI_Get_command_argument returns an argument from the command line
      to a Fortran AMPI program.

Note that AMPI defines a preprocessor symbol "AMPI" so that user codes can
check for AMPI's presence at compile time using "#ifdef AMPI".
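One possible use of the "AMPI" symbol is to guard calls to the extensions
listed above, so that the same source still builds with other MPI
implementations. The sketch below wraps MPI_Migrate in a macro that compiles
to nothing when "AMPI" is not defined. REBALANCE_HOOK and run_steps are
hypothetical names, and the zero-argument call to MPI_Migrate is an assumption
about its signature; check the AMPI manual for the version you build against.
Under AMPI, run_steps would need to be called between MPI_Init and
MPI_Finalize.

```c
#ifdef AMPI
#include <mpi.h>
/* Assumption: MPI_Migrate takes no arguments in this AMPI version. */
#define REBALANCE_HOOK() MPI_Migrate()
#else
/* Without AMPI the hook compiles to a no-op. */
#define REBALANCE_HOOK() ((void)0)
#endif

/* Hypothetical driver: runs `steps` iterations and reaches the rebalance
 * hook every `period` steps. Returns how many times the hook was reached.
 * A non-positive period disables the hook. */
static int run_steps(int steps, int period) {
    int hooks = 0;
    for (int step = 1; step <= steps; step++) {
        /* ... application work for this step would go here ... */
        if (period > 0 && step % period == 0) {
            REBALANCE_HOOK();
            hooks++;
        }
    }
    return hooks;
}
```

With run_steps(10, 5), the hook is reached at steps 5 and 10.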