added Verlet scheme and NxN non-bonded functionality
commit 7b6508e8318d1d045826616317f8edbaf4658e3b
author Szilard Pall <pszilard@cbr.su.se>
Tue, 2 Oct 2012 10:27:00 +0000 (12:27 +0200)
committer Szilard Pall <pszilard@cbr.su.se>
Tue, 2 Oct 2012 10:57:27 +0000 (12:57 +0200)
tree 1c53587d3a1f5f5ce69e103d1dea22ba57e8dcf5
parent 597c9d5a798e0141d5eeaf602a2069b21e335d1a

This commit implements a new "Verlet" cutoff scheme which uses
exact cut-offs and standard Verlet lists with an automatically
calculated buffer.
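
As a rough illustration of the idea (not the code in nbnxn_search.c), a
buffered Verlet list can be sketched as below. The list is built with
radius rc + rbuf so that it stays valid for several steps, while the
exact cut-off rc is applied every time the list is used. The names
pair_t, build_verlet_list and count_within_cutoff are hypothetical, the
buffer is passed in rather than calculated automatically, and periodic
boundaries are ignored.

```c
#include <stddef.h>

/* Hypothetical pair-list entry; not a GROMACS type. */
typedef struct { int i, j; } pair_t;

/* Squared distance between atoms a and b; x is laid out as x[3*atom + dim]. */
static double dist2(const double *x, int a, int b)
{
    double d2 = 0.0;
    for (int d = 0; d < 3; d++) {
        double dx = x[3 * a + d] - x[3 * b + d];
        d2 += dx * dx;
    }
    return d2;
}

/* Build the list of all pairs closer than rlist = rc + rbuf.
 * Brute-force O(n^2) search, no periodic boundary conditions.
 * Returns the number of pairs stored (at most max_pairs). */
size_t build_verlet_list(const double *x, int n, double rc, double rbuf,
                         pair_t *pairs, size_t max_pairs)
{
    double rlist2 = (rc + rbuf) * (rc + rbuf);
    size_t np = 0;
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            if (dist2(x, i, j) < rlist2 && np < max_pairs) {
                pairs[np].i = i;
                pairs[np].j = j;
                np++;
            }
        }
    }
    return np;
}

/* Apply the exact cut-off rc to a previously built list and count the
 * pairs that would actually interact at the current positions. */
size_t count_within_cutoff(const double *x, const pair_t *pairs, size_t np,
                           double rc)
{
    size_t count = 0;
    for (size_t k = 0; k < np; k++) {
        if (dist2(x, pairs[k].i, pairs[k].j) < rc * rc) {
            count++;
        }
    }
    return count;
}
```

A larger rbuf allows the list to be reused for more steps at the price
of more pairs that fail the exact cut-off check.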

The Verlet code-path supports full multi-level heterogeneous
parallelization using MPI/thread-MPI, OpenMP multi-threading,
and GPU acceleration for the non-bonded calculations.

The non-bonded calculations with the Verlet scheme support highly
optimized CPU SIMD acceleration using SSE/AVX and GPU acceleration using
NVIDIA CUDA. The CPU kernels have been tested on and optimized for most
x86 architectures including recent ones like Intel Sandy/Ivy-Bridge and
AMD Bulldozer. The CUDA GPU kernels support hardware of compute
capability 2.0 and above and are optimized for both Fermi and Kepler
architectures.

The new search code has been added in nbnxn_search.c, and the new
non-bonded kernels in nbnxn_kernels and nbnxn_cuda:
- plain-C kernels: reference CPU implementation and reference GPU
  (emulation)
- x86 128- and 256-bit SIMD kernels (SSE2, SSE4.1, AVX_128, AVX_256
  intrinsics)
- CUDA (two versions: for recent and legacy toolkit/drivers)
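
To give a feel for what an NxN kernel does, here is a minimal plain-C
sketch, assuming that "NxN" means evaluating all interactions between
two clusters of N atoms each. It is not taken from nbnxn_kernel_ref.c:
the function name cluster_pair_lj, the cluster size of 4, and the use of
a single Lennard-Jones parameter pair for all atoms are assumptions made
for illustration only.

```c
#define CLUSTER_SIZE 4 /* assumed cluster size, for illustration */

/* Lennard-Jones energy summed over all CLUSTER_SIZE x CLUSTER_SIZE atom
 * pairs between cluster i and cluster j, with an exact cut-off rc.
 * xi and xj hold the cluster coordinates as x[3*atom + dim];
 * c6 and c12 are shared by all atoms (a simplification). */
double cluster_pair_lj(const double *xi, const double *xj,
                       double c6, double c12, double rc)
{
    double rc2 = rc * rc;
    double energy = 0.0;

    for (int i = 0; i < CLUSTER_SIZE; i++) {
        for (int j = 0; j < CLUSTER_SIZE; j++) {
            double d2 = 0.0;
            for (int d = 0; d < 3; d++) {
                double dx = xi[3 * i + d] - xj[3 * j + d];
                d2 += dx * dx;
            }
            /* Pairs beyond the cut-off (or coinciding atoms) contribute nothing. */
            if (d2 < rc2 && d2 > 0.0) {
                double rinv2 = 1.0 / d2;
                double rinv6 = rinv2 * rinv2 * rinv2;
                energy += c12 * rinv6 * rinv6 - c6 * rinv6;
            }
        }
    }
    return energy;
}
```

The fixed-size inner loops with a cut-off mask are the part that maps
naturally onto SIMD lanes or GPU threads.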

This commit also implements some additional optimizations targeting
performance:
- SSE acceleration for dihedrals;
- automated PP/PME load balancing called "PME tuning" which optimizes
  the electrostatics cut-off to improve load balance between CPU and GPU
  or separate PP and PME processes;
- hardware detection and automated run-configuration selection.
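
The load-balancing idea behind "PME tuning" can be sketched as follows.
This is not the logic in pme_switch.c: it assumes a toy cost model in
which scaling the cut-off and the PME grid spacing by the same factor s
multiplies the particle-particle time by s^3 and divides the PME time by
s^3, and it simply picks the candidate factor with the smallest
max(t_pp, t_pme). The type pme_setup_t and the function pick_pme_setup
are hypothetical names.

```c
/* Hypothetical result type: chosen scale factor and the resulting setup. */
typedef struct {
    double scale;   /* factor applied to cut-off and grid spacing */
    double rc;      /* scaled electrostatics cut-off */
    double spacing; /* scaled PME grid spacing */
} pme_setup_t;

/* Pick the scale factor that best balances PP and PME work under the
 * toy model t_pp = t_pp0 * s^3, t_pme = t_pme0 / s^3 (an assumption).
 * The unscaled setup is kept unless a candidate is strictly faster. */
pme_setup_t pick_pme_setup(double rc0, double spacing0,
                           double t_pp0, double t_pme0,
                           const double *scales, int nscales)
{
    pme_setup_t best = { 1.0, rc0, spacing0 };
    double best_t = (t_pp0 > t_pme0) ? t_pp0 : t_pme0;

    for (int k = 0; k < nscales; k++) {
        double s = scales[k];
        double s3 = s * s * s;
        double t_pp = t_pp0 * s3;
        double t_pme = t_pme0 / s3;
        /* The step time is set by whichever side finishes last. */
        double t = (t_pp > t_pme) ? t_pp : t_pme;
        if (t < best_t) {
            best_t = t;
            best.scale = s;
            best.rc = rc0 * s;
            best.spacing = spacing0 * s;
        }
    }
    return best;
}
```

In a real run the timings would come from measurements of each
candidate setup rather than from a closed-form model.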

Change-Id: I3e1a15331c174265ec086565b978ffd079df2aaa
206 files changed:
CMakeLists.txt
cmake/ThreadMPI.cmake
cmake/gmxDetectAcceleration.cmake
cmake/gmxGCC44O3BugWorkaround.cmake [new file with mode: 0644]
cmake/gmxGetCompilerVersion.cmake [new file with mode: 0644]
cmake/gmxManageNvccConfig.cmake [new file with mode: 0644]
cmake/gmxSetBuildInformation.cmake
include/bondf.h
include/constr.h
include/coulomb.h
include/domdec.h
include/domdec_network.h
include/force.h
include/futil.h
include/genborn.h
include/gmx_avx_double.h [new file with mode: 0644]
include/gmx_avx_single.h [new file with mode: 0644]
include/gmx_cpuid.h [new file with mode: 0644]
include/gmx_detect_hardware.h [new file with mode: 0644]
include/gmx_detectcpu.h [deleted file]
include/gmx_fatal.h
include/gmx_fatal_collective.h [copied from src/kernel/membed.h with 61% similarity]
include/gmx_hash.h [new file with mode: 0644]
include/gmx_math_x86_avx_128_fma_double.h
include/gmx_math_x86_avx_128_fma_single.h
include/gmx_math_x86_avx_256_double.h
include/gmx_math_x86_avx_256_single.h
include/gmx_math_x86_sse2_double.h
include/gmx_math_x86_sse4_1_double.h
include/gmx_math_x86_sse4_1_single.h
include/gmx_omp.h
include/gmx_omp_nthreads.h [copied from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 51% similarity]
include/gmx_wallcycle.h
include/gmx_x86_avx_128_fma.h
include/gmx_x86_simd_double.h [new file with mode: 0644]
include/gmx_x86_simd_macros.h [new file with mode: 0644]
include/gmx_x86_simd_single.h [new file with mode: 0644]
include/gpu_utils.h [new file with mode: 0644]
include/main.h
include/maths.h
include/md_logging.h [copied from include/types/graph.h with 61% similarity]
include/md_support.h [new file with mode: 0644]
include/mdebin.h
include/mdrun.h
include/mtop_util.h
include/mvdata.h
include/names.h
include/nbnxn_cuda_data_mgmt.h [new file with mode: 0644]
include/nbnxn_search.h [new file with mode: 0644]
include/network.h
include/nrnb.h
include/nsgrid.h
include/pbc.h
include/physics.h
include/pmalloc_cuda.h [copied from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 71% similarity]
include/pme.h
include/sim_util.h [new file with mode: 0644]
include/smalloc.h
include/tables.h [copied from src/kernel/membed.h with 61% similarity]
include/thread_mpi/atomic/gcc_intrinsics.h
include/thread_mpi/mpi_bindings.h
include/typedefs.h
include/types/commrec.h
include/types/enums.h
include/types/fcdata.h
include/types/force_flags.h [copied from src/kernel/membed.h with 53% similarity]
include/types/forcerec.h
include/types/graph.h
include/types/group.h
include/types/hw_info.h [new file with mode: 0644]
include/types/idef.h
include/types/ifunc.h
include/types/inputrec.h
include/types/interaction_const.h [copied from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 50% similarity]
include/types/nb_verlet.h [new file with mode: 0644]
include/types/nbnxn_cuda_types_ext.h [new file with mode: 0644]
include/types/nbnxn_pairlist.h [new file with mode: 0644]
include/types/nrnb.h
include/types/simple.h
include/update.h
include/vsite.h
share/html/online/mdp_opt.html
src/config.h.cmakein
src/gmxlib/CMakeLists.txt
src/gmxlib/bondfree.c
src/gmxlib/calcgrid.c
src/gmxlib/checkpoint.c
src/gmxlib/cuda_tools/CMakeLists.txt [new file with mode: 0644]
src/gmxlib/cuda_tools/cudautils.cu [new file with mode: 0644]
src/gmxlib/cuda_tools/cudautils.cuh [new file with mode: 0644]
src/gmxlib/cuda_tools/pmalloc_cuda.cu [new file with mode: 0644]
src/gmxlib/cuda_tools/vectype_ops.cuh [new file with mode: 0644]
src/gmxlib/disre.c
src/gmxlib/ewald_util.c
src/gmxlib/gmx_cpuid.c [new file with mode: 0644]
src/gmxlib/gmx_detect_hardware.c [new file with mode: 0644]
src/gmxlib/gmx_detectcpu.c [deleted file]
src/gmxlib/gmx_fatal.c
src/gmxlib/gmx_omp.c
src/gmxlib/gmx_omp_nthreads.c [new file with mode: 0644]
src/gmxlib/gpu_utils/CMakeLists.txt [moved from src/kernel/gmx_gpu_utils/CMakeLists.txt with 63% similarity]
src/gmxlib/gpu_utils/gpu_utils.cu [moved from src/kernel/gmx_gpu_utils/gmx_gpu_utils.cu with 54% similarity]
src/gmxlib/gpu_utils/memtestG80_core.cu [moved from src/kernel/gmx_gpu_utils/memtestG80_core.cu with 100% similarity]
src/gmxlib/gpu_utils/memtestG80_core.h [moved from src/kernel/gmx_gpu_utils/memtestG80_core.h with 100% similarity]
src/gmxlib/main.c
src/gmxlib/maths.c
src/gmxlib/md_logging.c [copied from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 57% similarity]
src/gmxlib/mtop_util.c
src/gmxlib/names.c
src/gmxlib/network.c
src/gmxlib/nonbonded/nonbonded.c
src/gmxlib/nrnb.c
src/gmxlib/pbc.c
src/gmxlib/smalloc.c
src/gmxlib/tpxio.c
src/gmxlib/txtdump.c
src/kernel/CMakeLists.txt
src/kernel/calc_verletbuf.c [new file with mode: 0644]
src/kernel/calc_verletbuf.h [copied from src/kernel/repl_ex.h with 52% similarity]
src/kernel/grompp.c
src/kernel/md.c
src/kernel/md_openmm.c
src/kernel/mdrun.c
src/kernel/membed.c
src/kernel/membed.h
src/kernel/openmm_wrapper.cpp
src/kernel/pme_switch.c [new file with mode: 0644]
src/kernel/pme_switch.h [copied from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 56% similarity]
src/kernel/readir.c
src/kernel/readir.h
src/kernel/repl_ex.h
src/kernel/runner.c
src/kernel/tpbcmp.c
src/mdlib/CMakeLists.txt
src/mdlib/calcmu.c
src/mdlib/clincs.c
src/mdlib/constr.c
src/mdlib/csettle.c
src/mdlib/domdec.c
src/mdlib/domdec_con.c
src/mdlib/domdec_top.c
src/mdlib/edsam.c
src/mdlib/fft5d.c
src/mdlib/force.c
src/mdlib/forcerec.c
src/mdlib/gmx_wallcycle.c
src/mdlib/groupcoord.h
src/mdlib/iteratedconstraints.c
src/mdlib/md_support.c
src/mdlib/mdatom.c
src/mdlib/mdebin.c
src/mdlib/minimize.c
src/mdlib/nbnxn_consts.h [new file with mode: 0644]
src/mdlib/nbnxn_cuda/CMakeLists.txt [new file with mode: 0644]
src/mdlib/nbnxn_cuda/nbnxn_cuda.cu [new file with mode: 0644]
src/mdlib/nbnxn_cuda/nbnxn_cuda.h [new file with mode: 0644]
src/mdlib/nbnxn_cuda/nbnxn_cuda_data_mgmt.cu [new file with mode: 0644]
src/mdlib/nbnxn_cuda/nbnxn_cuda_kernel.cuh [new file with mode: 0644]
src/mdlib/nbnxn_cuda/nbnxn_cuda_kernel_legacy.cuh [new file with mode: 0644]
src/mdlib/nbnxn_cuda/nbnxn_cuda_kernel_utils.cuh [new file with mode: 0644]
src/mdlib/nbnxn_cuda/nbnxn_cuda_kernels.cuh [copied from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 50% similarity]
src/mdlib/nbnxn_cuda/nbnxn_cuda_types.h [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_common.c [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_common.h [copied from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 74% similarity]
src/mdlib/nbnxn_kernels/nbnxn_kernel_gpu_ref.c [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_gpu_ref.h [copied from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 61% similarity]
src/mdlib/nbnxn_kernels/nbnxn_kernel_ref.c [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_ref.h [copied from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 64% similarity]
src/mdlib/nbnxn_kernels/nbnxn_kernel_ref_inner.h [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_ref_outer.h [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_x86_simd128.c [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_x86_simd128.h [copied from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 63% similarity]
src/mdlib/nbnxn_kernels/nbnxn_kernel_x86_simd256.c [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_x86_simd256.h [moved from src/kernel/gmx_gpu_utils/gmx_gpu_utils.h with 63% similarity]
src/mdlib/nbnxn_kernels/nbnxn_kernel_x86_simd_includes.h [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_x86_simd_inner.h [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_x86_simd_outer.h [new file with mode: 0644]
src/mdlib/nbnxn_kernels/nbnxn_kernel_x86_simd_utils.h [new file with mode: 0644]
src/mdlib/nbnxn_search.c [new file with mode: 0644]
src/mdlib/nbnxn_search_x86_simd.h [new file with mode: 0644]
src/mdlib/nlistheuristics.c
src/mdlib/ns.c
src/mdlib/nsgrid.c
src/mdlib/partdec.c
src/mdlib/perf_est.c
src/mdlib/pme.c
src/mdlib/pme_pp.c
src/mdlib/pme_sse_single.h
src/mdlib/pull.c
src/mdlib/pull_rotation.c
src/mdlib/qmmm.c
src/mdlib/shakef.c
src/mdlib/shellfc.c
src/mdlib/sim_util.c
src/mdlib/stat.c
src/mdlib/tables.c
src/mdlib/tgroup.c
src/mdlib/tpi.c
src/mdlib/update.c
src/tools/addconf.c
src/tools/calcpot.c
src/tools/gmx_clustsize.c
src/tools/gmx_disre.c
src/tools/gmx_pme_error.c
src/tools/gmx_trjconv.c
src/tools/gmx_tune_pme.c