Non-bonded cut-off schemes
==========================

The default cut-off scheme in |Gromacs| |version| is based on classical
buffered Verlet lists. These are implemented extremely efficiently
on modern CPUs and accelerators, and support nearly all of the
algorithms used in |Gromacs|.

Before version 4.6, |Gromacs| always used pair-lists based on groups of
particles. These groups of particles were originally charge groups, which were
necessary with plain cut-off electrostatics. With the use of PME (or
reaction-field with a buffer), charge groups are no longer necessary
(and are ignored in the Verlet scheme). In |Gromacs| 4.6 and later, the
group-based cut-off scheme is still available, but is **deprecated since
5.0**. It is retained mainly for backwards
compatibility, to support the algorithms that have not yet been
converted, and for the few cases where it may allow faster simulations
with bio-molecular systems dominated by water.

Without PME, the group cut-off scheme should generally be combined
with a buffered pair-list to help avoid artifacts. However, the
group-scheme kernels that can implement this are much slower than
either the unbuffered group-scheme kernels, or the buffered
Verlet-scheme kernels. Use of the Verlet scheme is strongly encouraged
for all kinds of simulations, because it is easier and faster to run
correctly. In particular, GPU acceleration is available only with the
Verlet scheme.

The Verlet scheme uses properly buffered lists with exact cut-offs.
The size of the buffer is chosen with :mdp:`verlet-buffer-tolerance`
to permit a certain level of energy drift. Both the LJ and Coulomb
potentials are shifted to zero by subtracting the value at the cut-off. This
ensures that the energy is the integral of the force. Still, it is
advisable to have small forces at the cut-off, and hence to use PME or
reaction-field with infinite epsilon.

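As a sketch of why the shift keeps energy and force consistent, write the
unshifted pair potential as :math:`V(r)` and the cut-off as :math:`r_c`
(these symbols, and :math:`V_s` and :math:`F_s` for the shifted potential
and its force, are notation introduced here for illustration only).
Subtracting the constant :math:`V(r_c)` leaves the force inside the cut-off
unchanged and makes the potential continuous at :math:`r_c`:

.. math::

   V_s(r) =
   \begin{cases}
     V(r) - V(r_c), & r < r_c \\
     0,             & r \ge r_c
   \end{cases}
   \qquad
   F_s(r) = -\frac{dV_s}{dr} = -\frac{dV}{dr} \quad (r < r_c)

.. math::

   \int_r^{r_c} F_s(r')\,dr' = V_s(r) - V_s(r_c) = V_s(r)

The force itself still jumps from :math:`F_s(r_c)` to zero at the cut-off,
which is why a setup with small forces at :math:`r_c` remains preferable.
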
Non-bonded scheme feature comparison
------------------------------------

All |Gromacs| |version| features not directly related to non-bonded
interactions are supported in both schemes. Eventually, all non-bonded
features will be supported in the Verlet scheme. A table describing
the compatibility of just non-bonded features with the two schemes is
given below.

Table: Support levels within the group and Verlet cut-off schemes
for features related to non-bonded interactions

====================================  ============  ==============
Feature                               group         Verlet
====================================  ============  ==============
unbuffered cut-off scheme             default       not by default
exact cut-off                         shift/switch  always
potential-shift interactions          yes           yes
potential-switch interactions         yes           yes
force-switch interactions             yes           yes
switched potential                    yes           yes
switched forces                       yes           yes
non-periodic systems                  yes           Z + walls
implicit solvent                      yes           no
free energy perturbed non-bondeds     yes           yes
energy group contributions            yes           only on CPU
energy group exclusions               yes           no
OpenMP multi-threading                only PME      all
native GPU support                    no            yes
Coulomb PME                           yes           yes
Lennard-Jones PME                     yes           yes
virtual sites                         yes           yes
User-supplied tabulated interactions  yes           no
Buckingham VdW interactions           yes           no
rcoulomb != rvdw                      yes           yes
twin-range                            no            no
====================================  ============  ==============

Performance
-----------

The performance of the group cut-off scheme depends very much on the
composition of the system and the use of buffering. There are
optimized kernels for interactions with water, so anything with a lot
of water runs very fast. But if you want properly buffered
interactions, you need to add a buffer that takes into account both
charge-group size and diffusion, and check each interaction against
the cut-off length each time step. This makes simulations much
slower. The performance of the Verlet scheme with the new non-bonded
kernels is independent of system composition, and the scheme is intended
to always run with a buffered pair-list. Typically, the buffer size is 0 to
10% of the cut-off, so you could gain a bit of performance by reducing or
removing the buffer, but this might not be a good trade-off against
simulation quality.

The table below shows a performance comparison of most of the relevant
setups. Any atomistic model will have performance comparable to tips3p
(which has LJ on the hydrogens), unless a united-atom force field is
used. The performance of a protein in water will be between the tip3p
and tips3p performance. The group scheme is optimized for water
interactions, which means a single charge group containing one particle
with LJ, and 2 or 3 particles without LJ. Such kernels for water are
roughly twice as fast as the kernels for a comparable system with LJ
and/or without charge groups. The implementation of the Verlet cut-off
scheme has no interaction-specific optimizations, except for only
calculating half of the LJ interactions if fewer than half of the particles
have LJ. For molecules solvated in water, the scaling of the Verlet scheme
to higher numbers of cores is better than that of the group scheme, because
the load is more balanced. On the most recent Intel CPUs, the absolute
performance of the Verlet scheme exceeds that of the group scheme,
even for water-only systems.

Table: Performance in ns/day of various water systems under different
non-bonded setups in |Gromacs| using either 8 thread-MPI ranks (group
scheme), or 8 OpenMP threads (Verlet scheme). 3000 particles, 1.0 nm
cut-off, PME with 0.11 nm grid, dt=2 fs, Intel Core i7 2600 (AVX), 3.4
GHz + Nvidia GTX660Ti

========================  =================  ===============  ================  =====================
system                    group, unbuffered  group, buffered  Verlet, buffered  Verlet, buffered, GPU
========================  =================  ===============  ================  =====================
tip3p, charge groups      208                116              170               450
tips3p, charge groups     129                63               162               450
tips3p, no charge groups  104                75               162               450
========================  =================  ===============  ================  =====================

How to use the Verlet scheme
----------------------------

The Verlet scheme is enabled by default through the :mdp:`cutoff-scheme`
option. The value of the .mdp option :mdp:`verlet-buffer-tolerance` adds a
pair-list buffer whose size is tuned for the given energy drift (in
kJ/mol/ns per particle). The effective drift is usually much lower, as
:ref:`gmx grompp` assumes constant particle velocities. (Note that in single
precision for normal atomistic simulations constraints cause a drift
somewhere around 0.0001 kJ/mol/ns per particle, so it doesn't make sense
to go much lower.) Details on how the buffer size is chosen can be
found in the reference below and in the `reference manual`_.

.. _reference manual: gmx-manual-parent-dir_

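As a minimal sketch, the corresponding lines in an .mdp file might look as
follows. The tolerance value of 0.005 is used purely as an illustrative
number, not as a recommendation; since the Verlet scheme is the default,
both lines can normally be left out.

::

   ; illustrative values only
   cutoff-scheme           = Verlet
   verlet-buffer-tolerance = 0.005
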
For constant-energy (NVE) simulations, the buffer size will be
inferred from the temperature that corresponds to the velocities
(either those generated, if applicable, or those found in the input
configuration). Alternatively, :mdp:`verlet-buffer-tolerance` can be set
to -1 and a buffer set manually by specifying :mdp:`rlist` greater than
the larger of :mdp:`rcoulomb` and :mdp:`rvdw`. The simplest way to get a
reasonable buffer size is to use an NVT .mdp file with the target
temperature set to what you expect in your NVE simulation, and
transfer the buffer size printed by :ref:`gmx grompp` to your NVE .mdp file.

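A manually buffered NVE setup along these lines could be sketched as below.
The 1.0 nm cut-offs and the 0.1 nm buffer are assumed values chosen only for
illustration; in practice, :mdp:`rlist` would be taken from the buffer size
that :ref:`gmx grompp` prints for the matching NVT run.

::

   ; assumed example values; replace rlist with the size reported by gmx grompp
   cutoff-scheme           = Verlet
   verlet-buffer-tolerance = -1     ; disable the automatic buffer
   rcoulomb                = 1.0
   rvdw                    = 1.0
   rlist                   = 1.1    ; larger than max(rcoulomb, rvdw)
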
When a GPU is used, :mdp:`nstlist` is automatically increased by
:ref:`gmx mdrun`, usually to 20 or more; :mdp:`rlist` is increased along
with it to stay below the target energy drift. Further information on
running :ref:`gmx mdrun` with GPUs :ref:`is available<gmx-mdrun-on-gpu>`.

Further information
-------------------

For further information on algorithmic and implementation details of
the Verlet cut-off scheme and the MxN kernels, as well as detailed
performance analysis, please consult the following article:

`Páll, S. and Hess, B. A flexible algorithm for calculating pair
interactions on SIMD architectures. Comput. Phys. Commun. 184,
2641–2650 (2013). <http://dx.doi.org/10.1016/j.cpc.2013.06.003>`__