charm.git
Sam White [Wed, 10 Oct 2018 03:23:06 +0000 (9 22:23 -0500)]
AMPI: remove redundant worldStruct and lift blockingReq into ampiParent

Change-Id: Idc42678eb811f9fbd1e06a7274d1b5151cb75677

Sam White [Tue, 9 Oct 2018 01:50:47 +0000 (8 20:50 -0500)]
AMPI: set blockingReq before contributing to a reduction

Set ampiParent::resumeOnColl and ampi::blockingReq before contributing
in case completion of the collective happens inline.

Change-Id: I7d52582940d3945bfed73f63abcf24f9119a9fa8

Sam White [Tue, 9 Oct 2018 00:09:29 +0000 (8 19:09 -0500)]
AMPI: use concat rather than set reducer for gather(v)-like reductions

Use concat for gather, gatherv, and noncommutative reductions. The set
reducer adds unnecessary memory overhead compared to concat, since each
contribution carries an int size along with it. For gathers and
noncommutative reductions this is straightforward because the
contribution sizes are all the same, while for gathervs we add a
separate tuple reduction to keep track of contribution sizes.
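
A rough sketch of the overhead argument above, under the assumption that a
set-style result tags each contribution with one int size while a
concat-style result stores only payload bytes. The function names are
hypothetical and this does not reproduce the runtime's actual reducer layout:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: every contribution is stored with an int size tag.
std::size_t setStyleBytes(const std::vector<std::size_t>& contribSizes) {
  std::size_t total = 0;
  for (std::size_t s : contribSizes) total += sizeof(int) + s;
  return total;
}

// Hypothetical model: payloads are stored back to back with no size tags.
std::size_t concatStyleBytes(const std::vector<std::size_t>& contribSizes) {
  std::size_t total = 0;
  for (std::size_t s : contribSizes) total += s;
  return total;
}
```

In this model the difference grows by sizeof(int) per contributor, which is
why equal-sized contributions need no size bookkeeping at all.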

Change-Id: Ied154413696c140e4a25325bb7787cab1febaed5

Sam White [Tue, 9 Oct 2018 00:11:04 +0000 (8 19:11 -0500)]
AMPI: don't forget about MPI_MINLOC in the predefined ops array

Change-Id: I178c9b3550caf829818fa15f73ba529199d7a57e

Sam White [Tue, 9 Oct 2018 13:22:30 +0000 (9 08:22 -0500)]
Unbreak builds with tracing enabled

Change-Id: I0f649de7b2bde46296797df0ece0ca5055fa3ac7

Juan Galvez [Mon, 8 Oct 2018 18:26:52 +0000 (8 13:26 -0500)]
New thread CmiNodeBarrierCount implementation for win-smp

The new barrier implementation uses 2 separate barriers: one for
CmiNodeBarrier, and a different one for CmiNodeAllBarrier.

Trying to reuse the same barrier with different thread counts leads
to incorrect behavior under some race conditions. On Windows, this
is observed in some cases during charm initialization, where the
commthread in the last CmiNodeAllBarrier gets stuck because of 3
CmiNodeBarriers that are called in initDone.
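
A minimal sketch of the idea, assuming a generic counting barrier built on
the C++ standard library rather than the win-smp primitives. CountingBarrier
and NodeBarriers are hypothetical names; the point is only that each barrier
owns its own participant count and state:

```cpp
#include <condition_variable>
#include <mutex>

// Reusable barrier for a fixed number of participants.
class CountingBarrier {
 public:
  explicit CountingBarrier(int participants) : expected_(participants) {}

  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    const unsigned long seen = generation_;
    if (++arrived_ == expected_) {
      // Last arrival: reset the count and release everyone waiting.
      arrived_ = 0;
      ++generation_;
      released_.notify_all();
    } else {
      released_.wait(lock, [this, seen] { return generation_ != seen; });
    }
  }

  unsigned long generation() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return generation_;
  }

 private:
  mutable std::mutex mutex_;
  std::condition_variable released_;
  const int expected_;
  int arrived_ = 0;
  unsigned long generation_ = 0;
};

// One barrier per participant set, so the two never share state.
struct NodeBarriers {
  CountingBarrier workersOnly;     // worker threads only
  CountingBarrier workersAndComm;  // worker threads plus the comm thread
  explicit NodeBarriers(int workers)
      : workersOnly(workers), workersAndComm(workers + 1) {}
};
```

Because the counts are fixed per barrier, a thread arriving at one barrier
can never be mistaken for an arrival at the other.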

Change-Id: Idcfd7a147b60b7b9bdba548efafd45ff5ef746f4

Nitin Bhat [Fri, 5 Oct 2018 02:40:54 +0000 (4 22:40 -0400)]
ZC Direct API: Add an example with an immediate entry method callback

Change-Id: I54a01e466c311c756122fd81bb84fb13ff7f8160

Nitin Bhat [Thu, 4 Oct 2018 23:28:55 +0000 (4 19:28 -0400)]
ZC Direct API: Invoke callbacks in SMP mode through the first worker thread

There is no relationship between a CkNcpyBuffer's PE and its callback.
Previously, in the SMP mode, where the ackhandler function was being
executed on the comm thread, the comm thread would invoke the
_callback_group on the srcPe/destPe to finally invoke the callback.
This was done as a workaround for the absence of a location manager on
the comm thread. However, it was unnecessary to invoke the _callback_group
on the srcPe/destPe. With this commit, the _callback_group is invoked on the
first pe of that node i.e. CmiNodeFirst(CkMyNode()) rather than the
srcPe/destPe. This prefers local communication over the more expensive remote
communication.

Change-Id: I621f3eb008eeb2768d5137b88de4f41d802c937c

Evan Ramos [Mon, 8 Oct 2018 18:32:11 +0000 (8 13:32 -0500)]
Bug #1985: Fix and document error when defining a constructor with SDAG

Change-Id: Ib5d0c33e709cbbd3e17a5f67ed5386e4ac9fdf21

Sam White [Mon, 8 Oct 2018 01:47:10 +0000 (7 20:47 -0500)]
Expose addReducer()'s name argument for real

The previous commit that added the names for built-in types failed
to actually expose the optional name argument via addReducer() to users.

Also add some names to custom reducers in the runtime.

Change-Id: I9c9d25b865a439f76c020ebfa4d8c4d2d1d8bf49

Sam White [Fri, 5 Oct 2018 20:18:02 +0000 (5 15:18 -0500)]
AMPI: do not free user-defined keyvals when freeing a comm

This works around a bug in AMPI's comm keyval handling that causes
a double-free in ROMIO/HDF5 file open.

Change-Id: I624b4d329dba042e84c79c35d2a4d86649229d56

Ronak Buch [Fri, 5 Oct 2018 20:00:44 +0000 (5 15:00 -0500)]
Document trace{Begin/End}UserBracketEvent

Change-Id: I0b5966ed707b52486c755a106e3a6fe99c209672

Ronak Buch [Fri, 5 Oct 2018 17:43:26 +0000 (5 12:43 -0500)]
Remove -g from optimization flags for pami-linux-ppc64le

Change-Id: I95ca43d548e455fee53348203e69e328e39b998f

Ronak Buch [Thu, 4 Oct 2018 22:02:23 +0000 (4 17:02 -0500)]
Pass correct flag for gcc on pami-linux-ppc64le instead of using xlC's

Change-Id: I81cc0ae80bf0e2fbf598cc551e269c7e9eadc28e

Evan Ramos [Thu, 4 Oct 2018 21:15:15 +0000 (4 16:15 -0500)]
Prefix memory libs using ptmalloc3 with "gnu-" and add default handling of unprefixed names

This changes the default allocator for verbose, paranoid, leak, and
isomalloc from ptmalloc3 to the operating system's malloc.

Change-Id: Id3699fe25c4e3733ee2d5086066f661151ab83cd

Sam White [Thu, 4 Oct 2018 21:35:12 +0000 (4 16:35 -0500)]
CkIO: cleanup source of compiler warning

Change-Id: Ifd299114b3d88968d51049947b9eb50335036ebf

Matthias Diener [Thu, 4 Oct 2018 21:00:17 +0000 (4 14:00 -0700)]
ampirun: stop argument parsing when finding an executable

Prevents errors such as 'ampirun -n 4 ./foo -n bar'
getting parsed as "+p bar"
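
The intended behavior can be sketched as follows. This is an illustration in
C++ with hypothetical names (Launch, splitLaunchArgs) and a deliberately tiny
option set; it is not the ampirun script itself:

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Launch {
  std::vector<std::string> launcherArgs;    // options translated for the launcher
  std::vector<std::string> programAndArgs;  // executable and its own arguments
};

// Translate leading "-n N" / "-np N" options, then stop at the first
// argument that is not a recognized option and treat it as the executable.
Launch splitLaunchArgs(const std::vector<std::string>& args) {
  Launch out;
  std::size_t i = 0;
  for (; i < args.size(); ++i) {
    if ((args[i] == "-n" || args[i] == "-np") && i + 1 < args.size()) {
      out.launcherArgs.push_back("+p" + args[i + 1]);
      ++i;  // skip the option's value
    } else {
      break;  // executable found: everything after it is passed through
    }
  }
  out.programAndArgs.assign(args.begin() + static_cast<std::ptrdiff_t>(i),
                            args.end());
  return out;
}
```

With the arguments from the example above, only the first "-n 4" is
translated, and "-n bar" stays with ./foo.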

Change-Id: I20bc011bec05f05b80325b64bac31577d07332a0

Jaemin Choi [Tue, 2 Oct 2018 21:32:48 +0000 (2 17:32 -0400)]
Add NVTX to matmul and vecadd CUDA examples, guard setTraceName() with HAPI_TRACE

Change-Id: I99dea6a1d69f5dba39ef6c3dce95bee7711d570f

Matthias Diener [Sat, 15 Sep 2018 00:03:06 +0000 (14 19:03 -0500)]
doc: change spack package name to charmpp

The name has already been changed in Spack.

Change-Id: Ie953410dc978cc4e3c1f7469a1ce3e2987d8b459

Michael Robson [Wed, 22 Aug 2018 19:38:02 +0000 (22 14:38 -0500)]
Feature #693: Add functions to improve CcdCallback frequency

Change-Id: I43f0cbdb06894e719671e15df732fcf062132a70

Sam White [Tue, 2 Oct 2018 21:29:06 +0000 (2 16:29 -0500)]
Conditionalize CkDataMsg checks for msg corruption on CMK_ERROR_CHECKING

Change-Id: I6f1fcfef2fc7b4555ee03aa49534d6cd20b4c857

Nitin Bhat [Tue, 11 Sep 2018 18:07:59 +0000 (11 13:07 -0500)]
Zerocopy Direct API: Support for QD

Change-Id: Idf647b4bcc69836ed071abf8d3886ccfd0ba0e9e

Sam White [Thu, 27 Sep 2018 03:51:26 +0000 (26 22:51 -0500)]
AMPI: avoid storing all groups that are std::iota

- Clean up Group interface, and avoid ever storing Groups that can be
  lazily and transiently constructed via std::iota() as needed.
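
A small sketch of the lazy construction described above. GroupSketch and
identityRanks are hypothetical names, not AMPI's Group interface; the
assumption is that an identity group (ranks 0..size-1) needs only its size
stored:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Build the rank list 0, 1, ..., size-1 on demand.
std::vector<int> identityRanks(int size) {
  std::vector<int> ranks(static_cast<std::size_t>(size));
  std::iota(ranks.begin(), ranks.end(), 0);
  return ranks;
}

struct GroupSketch {
  int size;
  std::vector<int> explicitRanks;  // left empty for identity groups

  // Return the rank list, materializing identity groups transiently.
  std::vector<int> ranks() const {
    if (explicitRanks.empty()) return identityRanks(size);
    return explicitRanks;
  }
};
```

Only groups with a non-trivial rank mapping pay for stored ranks in this
sketch.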

Change-Id: I27ca4100c7ca063221d2ce10f339d84df698edac

Sam White [Mon, 24 Sep 2018 19:20:32 +0000 (24 14:20 -0500)]
AMPI: store predefined ops and types per-process rather than per-rank

- Split storage of predefined ops and datatypes from user-defined ones,
  storing the predefined ones in static const arrays, in order to
  reduce memory overhead.

- Rename CkDDT_MAX_PRIMITIVE_TYPE and CkDDT_MAX_BASIC_TYPE to AMPI_*,
  and fix definition of BASIC type to include MPI_UB and MPI_LB.

Change-Id: I490a4c4d50e4fc1a1fb2c39314baa4b08640aa96

Evan Ramos [Tue, 2 Oct 2018 17:18:53 +0000 (2 12:18 -0500)]
Disable PU count mismatch warning on BG/Q

Change-Id: If3f9a8c93b05e0d1724aeefbb5ecea47d086d452

Sam White [Mon, 1 Oct 2018 21:19:56 +0000 (1 16:19 -0500)]
AMPI: fix split creation of comms with dist_graph topology

Change-Id: I3326ff4b1db7eb0a0ed1a4368c074ba787eea48c

Sam White [Wed, 28 Mar 2018 20:31:05 +0000 (28 15:31 -0500)]
LBComm: reduce initial size of LBCommData from 10K elements to 500

- This reduces the memory footprint per PE of LBCommData from
  ~800KB to ~40KB.
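
The figures above are consistent with roughly 80 bytes per element. That
per-element size is inferred from the approximate numbers in this message,
not taken from the source, and the helper names are hypothetical:

```cpp
#include <cstddef>

// Approximate bytes per entry implied by a total footprint and entry count.
constexpr std::size_t approxBytesPerEntry(std::size_t totalBytes,
                                          std::size_t entries) {
  return totalBytes / entries;
}

// Footprint of a table with the given entry count and entry size.
constexpr std::size_t footprintBytes(std::size_t entries,
                                     std::size_t bytesPerEntry) {
  return entries * bytesPerEntry;
}
```

Taking ~800KB as 800,000 bytes over 10,000 entries gives 80 bytes per entry,
and 500 entries at that size come to 40,000 bytes.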

Change-Id: I93a050e826ac88a37fec3b9d700406134b667ad7

Sam White [Fri, 28 Sep 2018 18:17:59 +0000 (28 13:17 -0500)]
conv-conds: expose maximum CcdCallback number to users as CcdUSERMAX

- Also redefine the predefined Ccd values to be more compact and leave
  more room for user-defined ones.

Change-Id: If6bafe102982fbb79b41abc8d7ea148ad4630a58

Sam White [Thu, 27 Sep 2018 21:22:57 +0000 (27 16:22 -0500)]
conv-conds: decrease maximum conds to lessen the memory footprint

- The heap dynamically resizes on demand, and MAXNUMCONDS=128 seems
  large enough. Added assertions about that too.

- Reorder the member variables of ccd_cblist for better packing.

- This reduces the memory footprint per PE of conv-conds data
  structures from ~340 KB to ~85 KB.
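
The effect of reordering member variables can be illustrated with two
hypothetical structs. These are not the actual members of ccd_cblist; they
only show how placing the widest member first removes padding:

```cpp
#include <cstddef>

// Narrow members on both sides of a wide one force padding around it.
struct NaiveEntry {
  char flagA;
  double deadline;
  char flagB;
};

// Same members, widest first: the two chars share the trailing slot.
struct ReorderedEntry {
  double deadline;
  char flagA;
  char flagB;
};

constexpr std::size_t bytesSaved() {
  return sizeof(NaiveEntry) - sizeof(ReorderedEntry);
}
```

On a typical 64-bit ABI the first layout occupies 24 bytes and the second 16,
though exact sizes depend on the platform.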

Change-Id: I53a5cd0511b56a14f1cd2dec237c28b5cc5438b7

Sam White [Sat, 29 Sep 2018 15:47:25 +0000 (29 10:47 -0500)]
AMPI: enable changing the default errhandler at build-time

Change-Id: I427f31be68d2623f2d63bd84e415778ec7897747

Sam White [Sat, 29 Sep 2018 15:40:06 +0000 (29 10:40 -0500)]
Cleanup AMPI error checking routine

Change-Id: I132573af6d1b572b165241b5d5abdc271d0b8d66

Sam White [Fri, 28 Sep 2018 14:00:28 +0000 (28 09:00 -0500)]
CkArray: change localElems to unordered_map and fix eraseEltFromArrMgr()

- std::unordered_map lookup is generally more efficient than std::map

- eraseEltFromArrMgr() was previously not erasing from localElemVec,
  meaning that the vector would continuously increase in size
  over time given migrations.

- Work around known issue with calling migrateMe() from an SDAG
  entry method by changing tests/charm++/sdag/migration/ to instead
  call migrateMe() from a non-SDAG entry method. See redmine #480
  for more details on this. The change to eraseEltFromArrMgr()
  in this patch exposed this issue again.
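
The first two points above can be sketched with a hypothetical container pair
(LocalElements is an illustrative name, and the int payload stands in for an
element pointer): lookups go through an unordered_map, and erasing must
remove the entry from the companion vector as well, or the vector grows with
every migration.

```cpp
#include <algorithm>
#include <unordered_map>
#include <vector>

struct LocalElements {
  std::unordered_map<int, int> byIndex;  // element index -> payload
  std::vector<int> order;                // indices, kept for iteration

  void insert(int index, int payload) {
    if (byIndex.emplace(index, payload).second) order.push_back(index);
  }

  // Erase from both containers so neither accumulates stale entries.
  void erase(int index) {
    if (byIndex.erase(index) == 0) return;
    order.erase(std::remove(order.begin(), order.end(), index), order.end());
  }
};
```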

Change-Id: I102d5975e383a746b0e84f116c45d40c9ebb4c24

Evan Ramos [Fri, 28 Sep 2018 22:32:13 +0000 (28 17:32 -0500)]
PUP: Fix handling of std::{set,multiset,forward_list}

Change-Id: I5f49f7347503bc9d884c44c6819e1def6a713f46

Sam White [Thu, 27 Sep 2018 23:09:22 +0000 (27 18:09 -0500)]
Cleanup CkMulticast STL usage

Change-Id: I62f6bb0a82a0414f0df2a1a69f07ba70ddae5efd

Sam White [Thu, 27 Sep 2018 20:51:57 +0000 (27 15:51 -0500)]
Reduce memory.C's life-raft allocation from 32 to 16 KB

Change-Id: Ic98487b7e0c553bf2d73d04d5d8411de849bb11a

Evan Ramos [Thu, 27 Sep 2018 22:40:28 +0000 (27 17:40 -0500)]
Bug #1981: Fix crash when creating a migratable thread with isomalloc disabled

This can be seen in tests/converse/cthtest when run with "+noisomalloc"
or on a machine that lacks isomalloc.

Change-Id: I4a8a3b2e2af2f831e713eb06a9e899c72a3a63a5

Evan Ramos [Thu, 27 Sep 2018 20:07:55 +0000 (27 15:07 -0500)]
Explicitly call the default constructor when allocating Cpv variables in C++ code

This fixes the crash in uth-linux-x86_64/examples/ampi/Cjacobi3d/jacobi.iso,
caused by isomalloc_blocklist not being properly nulled for all PEs upon
initialization.

Change-Id: I072ef998edfd0f733e417fd4628acafaa8cc6411

Karthik Senthil [Fri, 15 Jun 2018 21:08:52 +0000 (15 16:08 -0500)]
CkIO : Add support to query Lustre FS API to set default stripe size

- CkIO now sets the default writeStripe to the stripe size queried from
  Lustre FS API (otherwise 4 * 1024 * 1024 for non-Lustre systems)
- This patch also adds support for Charm++ to detect Lustre FS during
  build, and set the CMK_HAS_LUSTREFS and CMK_LUSTREAPI flags

Change-Id: I14e5d2bafec3f033bc3c8c00714de1e5c07e44e0

Sam White [Thu, 27 Sep 2018 04:10:38 +0000 (26 23:10 -0500)]
Cleanup AMPI GPUReq and pooled SsendReq

Change-Id: I81747da0d9fdd8cf0b534cc3ed2908946f193ad8

Evan Ramos [Wed, 26 Sep 2018 19:42:35 +0000 (26 14:42 -0500)]
charmc: When checking GNU ld's version, account for '-' in the minor field

For example, "GNU ld version 2.27-28.base.el7_5.1".
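
A sketch of version parsing that tolerates such a suffix. charmc is a shell
script, so this C++ function (parseLdVersion, a hypothetical name) only
illustrates the parsing rule: take the first "digits.digits" pair and stop
the minor field at the first non-digit.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Returns {major, minor}, or {-1, -1} if no "digits.digits" pair is found.
std::pair<int, int> parseLdVersion(const std::string& banner) {
  auto isDigit = [](char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
  };
  for (std::size_t i = 0; i < banner.size(); ++i) {
    if (!isDigit(banner[i])) continue;
    std::size_t j = i;
    while (j < banner.size() && isDigit(banner[j])) ++j;  // end of major
    if (j + 1 < banner.size() && banner[j] == '.' && isDigit(banner[j + 1])) {
      std::size_t k = j + 1;
      while (k < banner.size() && isDigit(banner[k])) ++k;  // end of minor
      return {std::stoi(banner.substr(i, j - i)),
              std::stoi(banner.substr(j + 1, k - j - 1))};
    }
    i = j;  // skip this digit run and keep scanning
  }
  return {-1, -1};
}
```

For the banner quoted above, the minor field ends at the '-', so the result
is major 2 and minor 27.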

Change-Id: Id8267bd47d25e2b2c9876d479aa5393a01dc9348

Evan Ramos [Tue, 14 Aug 2018 22:20:25 +0000 (14 17:20 -0500)]
Isomalloc: Disable use of the mempool

In testing, my isomalloc stress test completes more quickly this way.

Change-Id: I4588fe0762c86354e5d5f1dc4bcc406ce1c6aceb

Phil Miller [Thu, 31 Dec 2015 23:05:22 +0000 (31 17:05 -0600)]
Isomalloc #934: Move usage to an explicit BlockList interface, to avoid implicit thread-based context

This enables code to use Isomalloc-managed migratable heaps without associating
them with a corresponding migratable thread.

Co-authored-by: Evan Ramos <evan@hpccharm.com>
Change-Id: I5dd6d61f285807e8e7e344e341b50a3a5a5368c9

Jaemin Choi [Tue, 25 Sep 2018 17:31:29 +0000 (25 13:31 -0400)]
Fix AMPI compilation issues with newly merged GPUManager patches

Change-Id: I9e9274a17466b3891163d1386425fd80f1e27935

Nitin Bhat [Fri, 21 Sep 2018 18:37:55 +0000 (21 14:37 -0400)]
ZC OFI API: Replace fi_write with fi_writemsg with FI_DELIVERY_COMPLETE

Previously, fi_write would complete only when the source could
reuse its buffer. With this change, an fi_writemsg completes only when
the destination buffer has received the data. This change is required
to solve a rare race condition which occurs in the UNREG mode of operation,
where a Put operation is performed instead of a Get operation. The race
condition causes the source to send a message to the destination to
potentially de-register the destination buffer when the completion of the
write operation on the destination is uncertain i.e. the data could still
be in-flight. This patch fixes that case as completion on the source only
occurs after the destination has received the data through the RDMA write
operation.

Change-Id: I808f1d5bc9dda3d92859e9775531d0b5c47a1c8e

Jaemin Choi [Tue, 25 Sep 2018 05:05:45 +0000 (25 01:05 -0400)]
Update references to hapi_src to hapi_impl and revert hapiRegisterCallbacks

Change-Id: Ic825715c76058c21e73558a1cf8dc154c218df7e

Michael Robson [Thu, 26 Jul 2018 22:21:43 +0000 (26 17:21 -0500)]
cuda: Add hapi prefix to HAPI structs and update AMPI interface

Change-Id: I577cf1ee8d4067186a9c8d3e04f68b71a7250a9d

Michael Robson [Tue, 24 Apr 2018 22:06:59 +0000 (24 17:06 -0500)]
cuda: Various fixes to enable new GPU Manager

* Change memcpyAsync to memcpy (synchronous)
* Ensure free is called for both wr and hapi APIs
* Fix CUDA_ and HAPI_ define flag inconsistencies
* Change GPUManager::createStreams() to always return number of streams
* Re-enable HAPI_MEMPOOL by default
* Update user_data API to either set or copy according to user preference

Change-Id: Id6fa947d5cea5b49af04bbd287ba97a2faa240a0

Jaemin Choi [Mon, 27 Nov 2017 21:53:18 +0000 (27 15:53 -0600)]
Cleanup #1491: Update documentation of GPUManager

Change-Id: I9d1fc90f8556c14b868015fa2fc0ea127c439c3d

Jaemin Choi [Mon, 27 Nov 2017 20:32:27 +0000 (27 15:32 -0500)]
Cleanup #1489: Delete GPU dummy mempool

Change-Id: Ibfc4b14a6e5ce90bf0cddd951cb23708d65513f5

Jaemin Choi [Thu, 14 Sep 2017 17:41:43 +0000 (14 12:41 -0500)]
Support #1450: Clean up and add CUDA example programs

Change-Id: I89d668f736d69f11373f13055859135f64dd1e07

Jaemin Choi [Wed, 6 Dec 2017 22:19:31 +0000 (6 16:19 -0600)]
Feature #1393: Redesign of Hybrid API to support concurrent kernels

- Core algorithmic changes using CUDA streams, events and callbacks
- Add NVTX support (with Tim Haines' modifications)
- Remove workRequest queue related files
- Retain support for workRequest with small syntax changes
- Add README

Change-Id: Icbef6a4c3408acdf23ee7506f35326b6fc949b34

Jaemin Choi [Wed, 6 Dec 2017 22:17:07 +0000 (6 16:17 -0600)]
Rename files for Feature #1393: Redesign of Hybrid API

Change-Id: Ia882ee7676919538b77b298b3fe43b65e1973d7a

Sam White [Thu, 20 Sep 2018 18:24:50 +0000 (20 13:24 -0500)]
AMPI: fix intercomm NBC tests and remove testing of broken intercomm MPI_Ibarrier

Change-Id: I998353dbd505b89791f4ce7f569280b10227fff0

Nitin Bhat [Tue, 18 Sep 2018 22:59:05 +0000 (18 17:59 -0500)]
Verbs ZC API: Add lock while sending small message in UNREG mode

This patch also includes minor cleanup around the locking macros.
Previously, a macro called CMK_SMP_NOT_RELAX_LOCK was used, but was
never defined. That is removed in this patch and locking is limited
to only the SMP mode.

Change-Id: I41a0d34a58fd2eba7b5597714282289eea2c8b0a

Sam White [Tue, 18 Sep 2018 15:55:33 +0000 (18 10:55 -0500)]
Cleanup: shorten time taken to run longest-running tests/examples

- Decrease iterations and/or sizes of inputs.

- Decrease +p to avoid oversubscription in SMP mode on machines with
  only 8 PUs.

Change-Id: I38d76812a686ee38dac0602ed7da29eb2536f15e

Evan Ramos [Mon, 17 Sep 2018 21:59:07 +0000 (17 16:59 -0500)]
PUP: Add reconstruct support for std::{unordered_,}{multi,}{map,set}

Change-Id: I9672ebd3b2105f996c53a9b3620653bb2711aa38

Evan Ramos [Tue, 18 Sep 2018 17:36:39 +0000 (18 12:36 -0500)]
AMPI: Remove noexcept from thread start functions that wrap user code

Otherwise, debuggers will unwind to the functions marked noexcept
instead of the source of the exception.

Change-Id: I5718c8142c6eac0d1c04f2109d7e36a602e585bb

Evan Ramos [Mon, 17 Sep 2018 21:58:02 +0000 (17 16:58 -0500)]
PUP: Fix detection of class constructors taking PUP::reconstruct

Previously, this code would check if PUP::reconstruct could be
constructed using the class type as a parameter. This patch swaps the
types to the correct relationship.
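
The swapped relationship can be illustrated with std::is_constructible. This
assumes the detection is trait-based; the tag type sketch::reconstruct and
the alias names are hypothetical stand-ins, not the PUP headers' actual code:

```cpp
#include <type_traits>

namespace sketch {
struct reconstruct {};  // stand-in for the PUP::reconstruct tag type
}

struct WithTagCtor {
  explicit WithTagCtor(sketch::reconstruct) {}
};

struct WithoutTagCtor {
  WithoutTagCtor() {}
};

// Wrong direction: asks whether the tag can be built from the class.
template <class T>
using TagFromClass = std::is_constructible<sketch::reconstruct, T>;

// Intended direction: asks whether the class can be built from the tag.
template <class T>
using ClassFromTag = std::is_constructible<T, sketch::reconstruct>;

static_assert(ClassFromTag<WithTagCtor>::value, "tag constructor detected");
static_assert(!ClassFromTag<WithoutTagCtor>::value, "no tag constructor");
static_assert(!TagFromClass<WithTagCtor>::value, "swapped check misses it");
```

The swapped form reports false even for a class that does provide the tag
constructor, which matches the symptom described above.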

Change-Id: I014a5fd879d85aaccb7194c00ae571dc9c71c269

raghavendrak [Wed, 12 Sep 2018 20:08:54 +0000 (12 15:08 -0500)]
An example for 2D array sections, and its usage

Change-Id: I616cd280d7c6bf2a35b0cdfb84ccd2e4f50bbd03

Sam White [Wed, 12 Sep 2018 17:19:11 +0000 (12 12:19 -0500)]
Partially fix CkCallbacks to section multicasts

This fixes CkCallbacks to section multicasts, which were broken by
commit ede3f6d854de03f1836c887530b3626cdc139640, and may have been
broken before then.

CkCallbacks to sections will still break when migrated, since they
contain bare pointers to the section info. That issue is redmine #235.

Change-Id: If1830106d0f947e105e90aa6cbbab8219a059d35

Sam White [Wed, 12 Sep 2018 13:17:08 +0000 (12 08:17 -0500)]
Cleanup #1978: array sections example

Change-Id: Ia2668948b069b7f5407112f439f029a7ec704b5e

Sam White [Sun, 16 Sep 2018 19:42:35 +0000 (16 14:42 -0500)]
Conditionalize CkMcastBaseMsg::checkMagic on CMK_ERROR_CHECKING

Change-Id: If8b4c82a568e98474cb6e464d84ff2f1885b1aa0

Sam White [Wed, 12 Sep 2018 02:12:35 +0000 (11 21:12 -0500)]
Cleanup: remove unnecessary delegation of cross-array section to CkMulticast

- Remove manual delegation from the xarraySection example code.

- Make the documentation about auto-delegation of cross-array sections
  more clear.

Change-Id: I2ed7721b21d482eacf519ff985ecb05c6f55bb6f

Sam White [Thu, 13 Sep 2018 14:56:02 +0000 (13 09:56 -0500)]
AMPI: avoid unnecessary casts of AmpiRequests to derived types

Change-Id: Ib2d93666c0396180b4352dfc96759a99cb7bec9d

Sam White [Sat, 5 May 2018 17:21:21 +0000 (5 10:21 -0700)]
AMPI: mark all of AMPI and TCharm as noexcept

Change-Id: Iaa66290a97547f59a95b5fa76195476d56db7259

Sam White [Tue, 11 Sep 2018 23:15:40 +0000 (11 18:15 -0500)]
AMPI: align the AmpiRequest pool based on the alignment of the types it may hold

Change-Id: I9da505d77448c9bdabd9573832cc7943d66f0941

Sam White [Tue, 11 Sep 2018 23:05:12 +0000 (11 18:05 -0500)]
AMPI: mark derived classes with C++11 final keyword where applicable

- Fix bug in ampi::irednResult() exposed by use of 'final'.

Change-Id: Ifa622e9ce93f10067c4d5f2e94dad822d113a2b1

Sam White [Tue, 11 Sep 2018 22:57:05 +0000 (11 17:57 -0500)]
Cleanup: simplify AMPI class member variables

Change-Id: Id220857bdd4e88e13bf0bac3c59cb32b13d71b4e

Evan Ramos [Thu, 13 Sep 2018 22:22:59 +0000 (13 17:22 -0500)]
Pass block pointers to free when Isomalloc is disabled

This fixes the crash in examples/armci/putTest.

Change-Id: Ib1bf6bdd684ea99ba4ad3e85459eb0e5a192a15a

Evan Ramos [Fri, 7 Sep 2018 17:35:05 +0000 (7 12:35 -0500)]
Add check to ignore BG/Q's reserved socket in provisioning counts

Change-Id: I9617a8481376e0942ac7ddb3848e9387ca38667b

Evan Ramos [Fri, 7 Sep 2018 17:25:51 +0000 (7 12:25 -0500)]
Remove CMK_CCS_AVAILABLE definitions from headers

This macro is handled by the configure script now.

Change-Id: Ic34511e08c7ad012bfa975499a32a2ef2bbaea5c

Evan Ramos [Thu, 6 Sep 2018 23:07:54 +0000 (6 18:07 -0500)]
Bug #1975: Add -lrca to link line on Cray systems

Change-Id: I368e969b3ccca3958dcddddd2bbfd869cf65b79b

Evan Ramos [Fri, 25 May 2018 20:25:12 +0000 (25 15:25 -0500)]
Add isomalloc test

This does not add it to the overall "make test".

Change-Id: I13a43fb361ac462aa56791c96ccccc1b3951fd5f

Evan Ramos [Tue, 4 Sep 2018 19:54:39 +0000 (4 14:54 -0500)]
Isomalloc: Fix handling of allocation length and alignment

Previously, the alignment value requested through posix_memalign etc
would not be recorded, and alignment would be ignored after a migration.

Additionally, the length field would sometimes include the size of
struct CmiIsomallocBlock, but it would be passed unmodified to pup_bytes
along with the user pointer, potentially resulting in OOB read/write at
slot borders.

Change-Id: I971013e536b0b67147094322ec742b928219dafd

Juan Galvez [Tue, 4 Sep 2018 19:33:33 +0000 (4 14:33 -0500)]
Disable abort when registering entry methods after init with CharmPy

Change-Id: I3c3f5a382d0182728481bf35c42aeafc9cfbd52b

Evan Ramos [Fri, 31 Aug 2018 20:59:29 +0000 (31 15:59 -0500)]
build: Avoid duplication of OPTSATBUILDTIME

Change-Id: I3fac15d2b964212d94a3fc4d49567909f2b5b023

Evan Ramos [Tue, 31 Jul 2018 23:00:23 +0000 (31 18:00 -0500)]
Partially fix compatibility when building mpi-win-x86_64 with GCC

A remaining issue is explained in a comment, and using Microsoft MPI
with GCC requires some manual installation modification anyway.

Change-Id: I654ff8e5774cbfc3b17dbffe981aaa19950b9bc3

Evan Ramos [Fri, 27 Jul 2018 22:53:53 +0000 (27 17:53 -0500)]
Allow mpi-win* to find the location of Microsoft MPI 9.0.1

Change-Id: Ie0ceb8e06878ed6f4004ed3a3ab381ba95320cd4

Evan Ramos [Mon, 27 Aug 2018 18:35:29 +0000 (27 13:35 -0500)]
Implement conv-mach-opt.mak

This file allows Makefiles to see variables determined at configure time
without invoking a shell script, which is slow.

Change-Id: I558f898db13b50306bb7f0bc349d61cfd6d365fc

Juan Galvez [Fri, 31 Aug 2018 16:21:25 +0000 (31 11:21 -0500)]
mpi-win: Don't let windows.h include winsock.h

winsock.h conflicts with winsock2.h included in sockRoutines

Change-Id: I915c21caab0c61afbca99ce2b91bef775fcec868

Sam White [Fri, 27 Jul 2018 21:37:02 +0000 (27 16:37 -0500)]
AMPI #1097: fix broken support for MPI keyval attributes

- Add support for callbacks associated with copying and deletion of
  keyvals.

- Add support for reference counting to keyvals.

- Have comms, datatypes, windows maintain vectors of keyval references.

- Don't forget to PUP the keyvals in ampiCommStruct.

- Improve error checking for various keyval operations.

- Fix MPI_Comm_compare to return MPI_UNEQUAL when the input comms have
  different sizes.

Change-Id: Ie94a3d0bc7e4f67c9276223696d779217c19aca7

Evan Ramos [Fri, 31 Aug 2018 03:12:24 +0000 (30 22:12 -0500)]
Require mmap when building all forms of charmdebug

This fixes Windows build failure upon trying to build libmemory-os-charmdebug.

Change-Id: I8ee2513d3b946e6b8b66b92adcbd0125d928b5d3

Juan Galvez [Wed, 29 Aug 2018 21:16:01 +0000 (29 16:16 -0500)]
Windows: netlrts TCP performance optimization using WSASend

This is the Windows equivalent of the sendmsg optimization for
Unix (commit 4af1d868a).

WSASend can send from multiple buffers, which improves the
performance of the netlrts TCP layer. With this patch,
performance in `tests/charm++/pingpong` improves by 50-60%,
and the TCP layer now performs the same as the UDP layer on Windows
in this benchmark.

WSASend requires winsock2.h. This patch replaces
`#include <winsock.h>` with winsock2.h. winsock.h is extremely old
and winsock2 has been around since Win98. The problem is that
including winsock2.h is not exactly trivial: windows.h includes
winsock.h, and trying to include winsock2.h after it leads to
compiler errors. There are different approaches to including
winsock2.h. The most appropriate one for charm seems to be to define
WIN32_LEAN_AND_MEAN before including windows.h, so that it doesn't
include winsock.h and other headers. Presumably this also reduces
compilation time.

Change-Id: I8b45589f1e275e90278745b872ba6c0165f8c0e3

Matthias Diener [Thu, 30 Aug 2018 18:23:57 +0000 (30 11:23 -0700)]
pami: remove always_inline from machine_send

Prevents errors of the type
"error: inlining failed in call to always_inline 'void
machine_send(pami_context_t, int, int, int, char*, int)': function body can be
overwritten at link time"

Change-Id: I388c3c265411108b24860b29b2792f6e574b3541

Ronak Buch [Thu, 23 Aug 2018 17:30:11 +0000 (23 12:30 -0500)]
Streamline C++11 support, add support for C++11 in charmxi code

Change-Id: Ida2ced2005947b64297479bcd3674c38837f67f5

Evan Ramos [Wed, 29 Aug 2018 19:44:24 +0000 (29 14:44 -0500)]
Disable CMK_CHARMDEBUG from configure if CCS is unavailable

This prevents the value in conv-autoconfig.h becoming out of sync with
the value in conv-mach-opt.sh, etc.

Change-Id: I0d43d047123f58d65dfbc5f46454193e7b5dfa0a

Evan Ramos [Wed, 29 Aug 2018 19:24:06 +0000 (29 14:24 -0500)]
verbs: Determine device speed entirely ourselves

We cannot depend on ibv_rate_to_mbps because it is not present in all
versions of libibverbs. Same for some identifiers in enum ibv_rate.

Change-Id: I23d21dc7f11ecf3009556447b0b79a8b332e5ae6

Sam White [Tue, 28 Aug 2018 17:09:15 +0000 (28 12:09 -0500)]
AMPI testing: check for MPI_COMM_NULL before calling MPI_Comm_free

Change-Id: Iecd4ebfda6180dfa916c90a82d70ecabf7bc8758

3 years ago Zerocopy Direct API: Do not change UNREG to REG mode after registration 32/4532/3
Nitin Bhat [Tue, 28 Aug 2018 22:17:33 +0000 (28 17:17 -0500)]
Zerocopy Direct API: Do not change UNREG to REG mode after registration

A new boolean variable inside CkNcpyBuffer, called isRegistered, lets
registration be managed without changing the user-specified mode. The RTS
internally uses the value of isRegistered to avoid unnecessary registration
and de-registration, and the user-specified mode remains unchanged.
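
The idea can be sketched as follows. BufferSketch, RegMode, and both
functions are hypothetical stand-ins, not the actual CkNcpyBuffer API; the
registrations counter exists only to make the behavior observable.

```cpp
// Hypothetical stand-in for the user-visible registration modes.
enum class RegMode { REG, UNREG, PREREG };

// Hypothetical stand-in for a buffer descriptor.
struct BufferSketch {
  RegMode mode;        // user-specified; the runtime never rewrites it
  bool isRegistered;   // runtime-owned registration state
  int registrations;   // illustration only: how often we really registered

  explicit BufferSketch(RegMode m)
      : mode(m), isRegistered(false), registrations(0) {}
};

// Register only if needed; the flag, not the mode, records the state.
void ensureRegistered(BufferSketch& b) {
  if (b.isRegistered) return;  // avoid redundant registration
  // A real runtime would register the memory with the network here.
  b.isRegistered = true;
  ++b.registrations;
}

// De-register only if needed; the user-specified mode is left untouched.
void deregister(BufferSketch& b) {
  if (!b.isRegistered) return;  // avoid redundant de-registration
  b.isRegistered = false;
}
```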

This fix also updates the example programs to test de-registration using the
CkNcpyBuffer received in the callback method. The examples are also updated
to not store the CkNcpyBuffer objects, wherever applicable, as data members
in the class as they are now received as a part of the CkDataMsg in the
CkCallback.

Change-Id: I896c7002284b7aab447ad5e3bd41cf709814d218

3 years ago Documentation for Zerocopy API: Minor change in mode specification 31/4531/1
Nitin Bhat [Tue, 28 Aug 2018 22:15:36 +0000 (28 17:15 -0500)]
Documentation for Zerocopy API: Minor change in mode specification

Change-Id: I25c1f54eef75130d2d730ac45b1578f20c4d3367

3 years ago Documentation about PREREG mode for the Zerocopy API 28/4528/1
Nitin Bhat [Tue, 28 Aug 2018 14:25:54 +0000 (28 09:25 -0500)]
Documentation about PREREG mode for the Zerocopy API

Change-Id: I403924abe3c125b53fd7f8c6593fd5d0b480818f

3 years ago Fix bug in MPI machine layer related to the Ncpy API 26/4526/3
Nitin Bhat [Mon, 27 Aug 2018 17:25:55 +0000 (27 13:25 -0400)]
Fix bug in MPI machine layer related to the Ncpy API

The bug is related to linked list state when messages are sent
in the callback function invoked inside ReleasePostedMessages.
Previously, the end_sent linked list was stale when a new message was
added to the linked list while iterating across the list inside
ReleasePostedMessages. With the fix, the linked list state is updated
before invoking the acknowledgement function, so that sends issued while
iterating across the list are handled correctly.
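
The pattern behind the fix can be sketched with a generic singly linked
list. Msg, SentList, and release are hypothetical names and do not mirror
the MPI machine layer code; the point is only that head and tail are
brought up to date before the callback runs, because the callback may
append.

```cpp
#include <functional>

// Hypothetical message node.
struct Msg {
  int id;
  bool done;
  Msg* next;
};

// Hypothetical list of posted messages with head and tail pointers.
struct SentList {
  Msg* head = nullptr;
  Msg* tail = nullptr;

  void append(Msg* m) {
    m->next = nullptr;
    if (tail) tail->next = m; else head = m;
    tail = m;
  }

  // Unlink every completed message and acknowledge it. The list is made
  // consistent BEFORE ack runs, so ack may safely call append().
  int release(const std::function<void(Msg*)>& ack) {
    int released = 0;
    Msg* prev = nullptr;
    Msg* cur = head;
    while (cur) {
      if (cur->done) {
        Msg* next = cur->next;
        if (prev) prev->next = next; else head = next;
        if (cur == tail) tail = prev;   // update tail before the callback
        cur->next = nullptr;
        ack(cur);                       // may append new messages
        ++released;
        cur = prev ? prev->next : head; // re-read: the list may have grown
      } else {
        prev = cur;
        cur = cur->next;
      }
    }
    return released;
  }
};
```

If the tail were updated only after the callback, an append made inside the
callback would link the new message to a node that is about to be removed.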

Change-Id: Ib06b3ab7375c13f328df9502976b5fc900fbc03f

3 years ago cleanup: remove vestiges of netlrts-ibverbs / net-ibverbs 25/4525/1
Matthias Diener [Mon, 27 Aug 2018 18:11:54 +0000 (27 13:11 -0500)]
cleanup: remove vestiges of netlrts-ibverbs / net-ibverbs

Change-Id: I2178e4d0e72544df5225ee48d9ac9fd1d507f5f7

3 years ago doc: fix defaults for spack package 24/4524/2
Matthias Diener [Mon, 27 Aug 2018 15:25:40 +0000 (27 10:25 -0500)]
doc: fix defaults for spack package

Change-Id: I8c67a8d308bc2e14a209bbe8e415fd409dc3df3f

3 years ago Declare charm_reducers as extern "C" to avoid C++ name mangling 23/4523/1
Juan Galvez [Mon, 27 Aug 2018 15:19:55 +0000 (27 10:19 -0500)]
Declare charm_reducers as extern "C" to avoid C++ name mangling

charm_reducers is data meant to be accessed from outside Charm++
(namely CharmPy). The extern "C" declaration simplifies access from
C-compiled code.
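
A minimal sketch of giving a data symbol C linkage. The table
example_reducers and its contents are hypothetical; the real layout of
charm_reducers is not shown in this log.

```cpp
// Sketch only: a data object with C language linkage. Its symbol name is
// not subject to C++ name mangling, so code compiled as C (or a foreign
// function interface) can refer to it by its plain name.
extern "C" {
// Hypothetical table standing in for the real data.
int example_reducers[3] = {10, 20, 30};
}

// Ordinary C++ code can still use the object as usual.
int example_reducer_at(int i) {
  return example_reducers[i];
}
```

A C translation unit could then declare `extern int example_reducers[3];`
and link against the object file that defines it.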

Change-Id: I728d425eaf5ed1a2ead443c00a85ec730c2b50c5

3 years ago cpuaffinity: use cmi_hwloc_ prefix consistently 21/4521/4
Matthias Diener [Fri, 24 Aug 2018 18:21:07 +0000 (24 13:21 -0500)]
cpuaffinity: use cmi_hwloc_ prefix consistently

Change-Id: Iae44cdcc4f06feca522043d81d9aa9adb391305e

3 years ago build: pass through CMK_LIBS in cc-gcc 19/4519/2
Matthias Diener [Fri, 24 Aug 2018 17:41:17 +0000 (24 12:41 -0500)]
build: pass through CMK_LIBS in cc-gcc

Fixes missing references to pthreads when
building with e.g.
./build charm++ multicore-linux-x86_64 gcc

Change-Id: I19b7f18b42f023c208b00032d0086604ede8c7f1

3 years ago Cleanup: add missing return to cpuaffinity 18/4518/1
Sam White [Fri, 24 Aug 2018 16:01:26 +0000 (24 11:01 -0500)]
Cleanup: add missing return to cpuaffinity

Change-Id: Ic47fc1a58fcdf8a30f86a14e87e262cddaae60fa

3 years ago Conditionalize use of C++11 emplace_back in charmxi code on 'charmc -host' support... 17/4517/1
Sam White [Fri, 24 Aug 2018 15:36:41 +0000 (24 10:36 -0500)]
Conditionalize use of C++11 emplace_back in charmxi code on 'charmc -host' support of C++11
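
A sketch of conditionalizing emplace_back on C++11 availability. The macro
EXAMPLE_HOST_HAS_CXX11 is hypothetical and is derived here from
__cplusplus; how the commit actually detects 'charmc -host' support is not
shown in this log.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical feature macro; a build system could predefine it instead.
#ifndef EXAMPLE_HOST_HAS_CXX11
#  if __cplusplus >= 201103L
#    define EXAMPLE_HOST_HAS_CXX11 1
#  else
#    define EXAMPLE_HOST_HAS_CXX11 0
#  endif
#endif

typedef std::pair<std::string, int> Entry;

// Appends an entry, constructing it in place when C++11 is available and
// falling back to push_back of a temporary otherwise.
void add_entry(std::vector<Entry>& entries, const std::string& name, int value) {
#if EXAMPLE_HOST_HAS_CXX11
  entries.emplace_back(name, value);
#else
  entries.push_back(Entry(name, value));
#endif
}
```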

Change-Id: Icd7544c2554b5dd8f5da04a2c59f29b7c94f6d9b

3 years ago Cleanup AMPI DDT 01/4501/16
Sam White [Wed, 22 Aug 2018 16:50:54 +0000 (22 11:50 -0500)]
Cleanup AMPI DDT

Make spacing consistent, use MPI_ macros wherever possible, update old
comments, use C++11 support for copy constructors with default args,
rename variables for consistency, and order the defs/decls
of CkDDT_ classes from least complex (contiguous) to most complex
(struct).

Change-Id: I362449e0397c10b4639e6d328840ae6e8201e7d1