Sam White [Wed, 10 Oct 2018 03:23:06 +0000 (9 22:23 -0500)]
AMPI: remove redundant worldStruct and lift blockingReq into ampiParent
Change-Id: Idc42678eb811f9fbd1e06a7274d1b5151cb75677
Sam White [Tue, 9 Oct 2018 01:50:47 +0000 (8 20:50 -0500)]
AMPI: set blockingReq before contributing to a reduction
Set ampiParent::resumeOnColl and ampi::blockingReq before contributing
in case completion of the collective happens inline.
Change-Id: I7d52582940d3945bfed73f63abcf24f9119a9fa8
Sam White [Tue, 9 Oct 2018 00:09:29 +0000 (8 19:09 -0500)]
AMPI: use concat rather than set reducer for gather(v)-like reductions
Use concat for gather, gatherv, and noncommutative reductions. The set
reducer adds unnecessary memory overhead compared to concat, since each
contribution has an int size along with it. For gathers and
noncommutative reductions this is straightforward because the
contribution sizes are all the same, while for gatherv we add a
separate tuple reduction to keep track of contribution sizes.
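
A minimal sketch of the idea, not AMPI's actual code: contributions are
concatenated with no per-contribution size header, and the receiver
recovers each piece either from a fixed size (gather-like) or from a
separately reduced list of sizes (gatherv-like). The helper names
concatContributions, sliceFixed, and sliceBySizes are hypothetical, and
the sketch assumes contributions are laid out in rank order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: place contributions back to back, with no size header
// in front of each one (the property that makes concat cheaper than set).
std::vector<char> concatContributions(const std::vector<std::vector<char>>& parts) {
  std::vector<char> out;
  for (const auto& p : parts) out.insert(out.end(), p.begin(), p.end());
  return out;
}

// gather-like case: all contributions have the same size, so offsets are implicit.
std::vector<char> sliceFixed(const std::vector<char>& buf, std::size_t eachSize,
                             std::size_t rank) {
  assert((rank + 1) * eachSize <= buf.size());
  return std::vector<char>(buf.begin() + rank * eachSize,
                           buf.begin() + (rank + 1) * eachSize);
}

// gatherv-like case: sizes are assumed to come from a separate reduction.
std::vector<char> sliceBySizes(const std::vector<char>& buf,
                               const std::vector<int>& sizes, std::size_t rank) {
  assert(rank < sizes.size());
  std::size_t offset = 0;
  for (std::size_t i = 0; i < rank; ++i) offset += static_cast<std::size_t>(sizes[i]);
  const std::size_t len = static_cast<std::size_t>(sizes[rank]);
  assert(offset + len <= buf.size());
  return std::vector<char>(buf.begin() + offset, buf.begin() + offset + len);
}
```

With N contributions, the sketch carries no per-contribution int in the
data buffer; the gatherv case pays for one int per rank in the separate
sizes list instead.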
Change-Id: Ied154413696c140e4a25325bb7787cab1febaed5
Sam White [Tue, 9 Oct 2018 00:11:04 +0000 (8 19:11 -0500)]
AMPI: don't forget about MPI_MINLOC in the predefined ops array
Change-Id: I178c9b3550caf829818fa15f73ba529199d7a57e
Sam White [Tue, 9 Oct 2018 13:22:30 +0000 (9 08:22 -0500)]
Unbreak builds with tracing enabled
Change-Id: I0f649de7b2bde46296797df0ece0ca5055fa3ac7
Juan Galvez [Mon, 8 Oct 2018 18:26:52 +0000 (8 13:26 -0500)]
New thread CmiNodeBarrierCount implementation for win-smp
The new barrier implementation uses 2 separate barriers: one for
CmiNodeBarrier, and a different one for CmiNodeAllBarrier.
Trying to reuse the same barrier with different thread counts leads
to incorrect behavior under some race conditions. On Windows, this
is observed in some cases during charm initialization, where the
commthread in the last CmiNodeAllBarrier gets stuck because of 3
CmiNodeBarriers that are called in initDone.
Change-Id: Idcfd7a147b60b7b9bdba548efafd45ff5ef746f4
Nitin Bhat [Fri, 5 Oct 2018 02:40:54 +0000 (4 22:40 -0400)]
ZC Direct API: Add an example with an immediate entry method callback
Change-Id: I54a01e466c311c756122fd81bb84fb13ff7f8160
Nitin Bhat [Thu, 4 Oct 2018 23:28:55 +0000 (4 19:28 -0400)]
ZC Direct API: Invoke callbacks in SMP mode through the first worker thread
There is no relationship between a CkNcpyBuffer's PE and its callback.
Previously, in the SMP mode, where the ackhandler function was being
executed on the comm thread, the comm thread would invoke the
_callback_group on the srcPe/destPe to finally invoke the callback.
This was done as a workaround for the absence of a location manager on
the comm thread. However, it was unnecessary to invoke the _callback_group
on the srcPe/destPe. With this commit, the _callback_group is invoked on the
first PE of that node, i.e. CmiNodeFirst(CkMyNode()), rather than the
srcPe/destPe. This prefers local communication over the more expensive remote
communication.
Change-Id: I621f3eb008eeb2768d5137b88de4f41d802c937c
Evan Ramos [Mon, 8 Oct 2018 18:32:11 +0000 (8 13:32 -0500)]
Bug #1985: Fix and document error when defining a constructor with SDAG
Change-Id: Ib5d0c33e709cbbd3e17a5f67ed5386e4ac9fdf21
Sam White [Mon, 8 Oct 2018 01:47:10 +0000 (7 20:47 -0500)]
Expose addReducer()'s name argument for real
The previous commit that added the names for built-in types failed
to actually expose the optional name argument via addReducer() to users.
Also add some names to custom reducers in the runtime.
Change-Id: I9c9d25b865a439f76c020ebfa4d8c4d2d1d8bf49
Sam White [Fri, 5 Oct 2018 20:18:02 +0000 (5 15:18 -0500)]
AMPI: do not free user-defined keyvals when freeing a comm
This works around a bug in AMPI's comm keyval handling that causes
a double-free in ROMIO/HDF5 file open.
Change-Id: I624b4d329dba042e84c79c35d2a4d86649229d56
Ronak Buch [Fri, 5 Oct 2018 20:00:44 +0000 (5 15:00 -0500)]
Document trace{Begin/End}UserBracketEvent
Change-Id: I0b5966ed707b52486c755a106e3a6fe99c209672
Ronak Buch [Fri, 5 Oct 2018 17:43:26 +0000 (5 12:43 -0500)]
Remove -g from optimization flags for pami-linux-ppc64le
Change-Id: I95ca43d548e455fee53348203e69e328e39b998f
Ronak Buch [Thu, 4 Oct 2018 22:02:23 +0000 (4 17:02 -0500)]
Pass correct flag for gcc on pami-linux-ppc64le instead of using xlC's
Change-Id: I81cc0ae80bf0e2fbf598cc551e269c7e9eadc28e
Evan Ramos [Thu, 4 Oct 2018 21:15:15 +0000 (4 16:15 -0500)]
Prefix memory libs using ptmalloc3 with "gnu-" and add default handling of unprefixed names
This changes the default allocator for verbose, paranoid, leak, and
isomalloc from ptmalloc3 to the operating system's malloc.
Change-Id: Id3699fe25c4e3733ee2d5086066f661151ab83cd
Sam White [Thu, 4 Oct 2018 21:35:12 +0000 (4 16:35 -0500)]
CkIO: cleanup source of compiler warning
Change-Id: Ifd299114b3d88968d51049947b9eb50335036ebf
Matthias Diener [Thu, 4 Oct 2018 21:00:17 +0000 (4 14:00 -0700)]
ampirun: stop argument parsing when finding an executable
Prevents errors such as 'ampirun -n 4 ./foo -n bar'
getting parsed as "+p bar"
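
The parsing rule can be sketched as follows. This is an illustration in
C++ of the logic only, not the ampirun script itself: the struct
ParsedArgs and the function parseLauncherArgs are hypothetical, only the
-n option is modeled, and "finding an executable" is approximated as
"the first argument that is not a launcher option".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ParsedArgs {
  std::string numRanks;              // value passed to the launcher's -n, if any
  std::vector<std::string> program;  // executable followed by its own arguments
};

// Consume launcher options only until the executable is reached; everything
// from the executable onward is passed through untouched.
ParsedArgs parseLauncherArgs(const std::vector<std::string>& args) {
  ParsedArgs out;
  std::size_t i = 0;
  while (i < args.size()) {
    if (args[i] == "-n" && i + 1 < args.size()) {
      out.numRanks = args[i + 1];
      i += 2;
    } else {
      break;  // assumed to be the executable: stop parsing launcher options
    }
  }
  assert(i <= args.size());
  out.program.assign(args.begin() + i, args.end());
  return out;
}
```

For the command line in the message above, the second -n stays in the
program's argument list instead of overriding the launcher's rank count.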
Change-Id: I20bc011bec05f05b80325b64bac31577d07332a0
Jaemin Choi [Tue, 2 Oct 2018 21:32:48 +0000 (2 17:32 -0400)]
Add NVTX to matmul and vecadd CUDA examples, guard setTraceName() with HAPI_TRACE
Change-Id: I99dea6a1d69f5dba39ef6c3dce95bee7711d570f
Matthias Diener [Sat, 15 Sep 2018 00:03:06 +0000 (14 19:03 -0500)]
doc: change spack package name to charmpp
The name has already been changed in Spack.
Change-Id: Ie953410dc978cc4e3c1f7469a1ce3e2987d8b459
Michael Robson [Wed, 22 Aug 2018 19:38:02 +0000 (22 14:38 -0500)]
Feature #693: Add functions to improve CcdCallback frequency
Change-Id: I43f0cbdb06894e719671e15df732fcf062132a70
Sam White [Tue, 2 Oct 2018 21:29:06 +0000 (2 16:29 -0500)]
Conditionalize CkDataMsg checks for msg corruption on CMK_ERROR_CHECKING
Change-Id: I6f1fcfef2fc7b4555ee03aa49534d6cd20b4c857
Nitin Bhat [Tue, 11 Sep 2018 18:07:59 +0000 (11 13:07 -0500)]
Zerocopy Direct API: Support for QD
Change-Id: Idf647b4bcc69836ed071abf8d3886ccfd0ba0e9e
Sam White [Thu, 27 Sep 2018 03:51:26 +0000 (26 22:51 -0500)]
AMPI: avoid storing all groups that are std::iota
- Clean up Group interface, and avoid ever storing Groups that can be
lazily and transiently constructed via std::iota() as needed.
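
A small sketch of what "lazily and transiently constructed via
std::iota()" can look like. The function names makeIotaRanks and isIota
are hypothetical and do not reflect AMPI's Group interface; the point is
only that an identity rank list 0..size-1 can be rebuilt from its size
whenever it is needed instead of being stored.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Build the identity rank list {0, 1, ..., size-1} on demand.
std::vector<int> makeIotaRanks(int size) {
  assert(size >= 0);
  std::vector<int> ranks(static_cast<std::size_t>(size));
  std::iota(ranks.begin(), ranks.end(), 0);
  return ranks;
}

// A rank list that equals the identity mapping does not need to be stored:
// its size alone is enough to reconstruct it later.
bool isIota(const std::vector<int>& ranks) {
  for (std::size_t i = 0; i < ranks.size(); ++i) {
    if (ranks[i] != static_cast<int>(i)) return false;
  }
  return true;
}
```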
Change-Id: I27ca4100c7ca063221d2ce10f339d84df698edac
Sam White [Mon, 24 Sep 2018 19:20:32 +0000 (24 14:20 -0500)]
AMPI: store predefined ops and types per-process rather than per-rank
- Split storage of predefined ops and datatypes from user-defined ones,
storing the predefined ones in static const arrays, in order to
reduce memory overhead.
- Rename CkDDT_MAX_PRIMITIVE_TYPE and CkDDT_MAX_BASIC_TYPE to AMPI_*,
and fix definition of BASIC type to include MPI_UB and MPI_LB.
Change-Id: I490a4c4d50e4fc1a1fb2c39314baa4b08640aa96
Evan Ramos [Tue, 2 Oct 2018 17:18:53 +0000 (2 12:18 -0500)]
Disable PU count mismatch warning on BG/Q
Change-Id: If3f9a8c93b05e0d1724aeefbb5ecea47d086d452
Sam White [Mon, 1 Oct 2018 21:19:56 +0000 (1 16:19 -0500)]
AMPI: fix split creation of comms with dist_graph topology
Change-Id: I3326ff4b1db7eb0a0ed1a4368c074ba787eea48c
Sam White [Wed, 28 Mar 2018 20:31:05 +0000 (28 15:31 -0500)]
LBComm: reduce initial size of LBCommData from 10K elements to 500
- This reduces the memory footprint per PE of LBCommData from
~800KB to ~40KB.
Change-Id: I93a050e826ac88a37fec3b9d700406134b667ad7
Sam White [Fri, 28 Sep 2018 18:17:59 +0000 (28 13:17 -0500)]
conv-conds: expose maximum CcdCallback number to users as CcdUSERMAX
- Also redefine the predefined Ccd values to be more compact and leave
more room for user-defined ones.
Change-Id: If6bafe102982fbb79b41abc8d7ea148ad4630a58
Sam White [Thu, 27 Sep 2018 21:22:57 +0000 (27 16:22 -0500)]
conv-conds: decrease maximum conds to lessen the memory footprint
- The heap dynamically resizes on demand, and MAXNUMCONDS=128 seems
large enough. Added assertions about that too.
- Reorder the member variables of ccd_cblist for better packing.
- This reduces the memory footprint per PE of conv-conds data
structures from ~340 KB to ~85 KB.
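
To illustrate the member-reordering point, here is a generic sketch of
how declaration order affects padding. The structs and their fields are
hypothetical and are not the members of ccd_cblist; exact sizes depend
on the platform ABI, so no specific byte counts are claimed.

```cpp
#include <cassert>
#include <cstddef>

// Small members on both sides of a wider one: on typical ABIs the compiler
// inserts padding after 'a' and again after 'c'.
struct Interleaved {
  char a;
  double b;
  char c;
};

// Same members, widest first: the two chars can share one padded tail.
struct WidestFirst {
  double b;
  char a;
  char c;
};

// Bytes actually needed by the three members, ignoring padding.
constexpr std::size_t kPayloadBytes = sizeof(char) + sizeof(double) + sizeof(char);

// Padding added by the compiler for a struct of the given total size.
constexpr std::size_t paddingBytes(std::size_t structSize) {
  return structSize - kPayloadBytes;
}
```

Comparing paddingBytes(sizeof(Interleaved)) with
paddingBytes(sizeof(WidestFirst)) on a given platform shows how much the
reordering saves per instance.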
Change-Id: I53a5cd0511b56a14f1cd2dec237c28b5cc5438b7
Sam White [Sat, 29 Sep 2018 15:47:25 +0000 (29 10:47 -0500)]
AMPI: enable changing the default errhandler at build-time
Change-Id: I427f31be68d2623f2d63bd84e415778ec7897747
Sam White [Sat, 29 Sep 2018 15:40:06 +0000 (29 10:40 -0500)]
Cleanup AMPI error checking routine
Change-Id: I132573af6d1b572b165241b5d5abdc271d0b8d66
Sam White [Fri, 28 Sep 2018 14:00:28 +0000 (28 09:00 -0500)]
CkArray: change localElems to unordered_map and fix eraseEltFromArrMgr()
- std::unordered_map lookup is generally more efficient than std::map
- eraseEltFromArrMgr() was previously not erasing from localElemVec,
meaning that the vector would continuously increase in size
over time given migrations.
- Work around known issue with calling migrateMe() from an SDAG
entry method by changing tests/charm++/sdag/migration/ to instead
call migrateMe() from a non-SDAG entry method. See redmine #480
for more details on this. The change to eraseEltFromArrMgr()
in this patch exposed this issue again.
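
The eraseEltFromArrMgr() fix amounts to keeping two containers in sync.
A minimal sketch under assumed names (Elem, LocalElems, and their
members are hypothetical, not CkArray's real types): every erase removes
the element from both the hash map and the vector, so the vector cannot
grow without bound as elements migrate away.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

struct Elem {
  int id;
};

struct LocalElems {
  std::unordered_map<int, Elem*> byId;  // average O(1) lookup by index
  std::vector<Elem*> asVector;          // dense list for iteration

  void insert(Elem* e) {
    assert(byId.count(e->id) == 0);
    byId[e->id] = e;
    asVector.push_back(e);
  }

  // Erase from BOTH containers; erasing only from the map would leave a stale
  // pointer in the vector for every element that leaves.
  bool erase(int id) {
    auto it = byId.find(id);
    if (it == byId.end()) return false;
    Elem* e = it->second;
    byId.erase(it);
    auto pos = std::find(asVector.begin(), asVector.end(), e);
    assert(pos != asVector.end());
    *pos = asVector.back();  // swap-and-pop: element order is not preserved
    asVector.pop_back();
    return true;
  }
};
```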
Change-Id: I102d5975e383a746b0e84f116c45d40c9ebb4c24
Evan Ramos [Fri, 28 Sep 2018 22:32:13 +0000 (28 17:32 -0500)]
PUP: Fix handling of std::{set,multiset,forward_list}
Change-Id: I5f49f7347503bc9d884c44c6819e1def6a713f46
Sam White [Thu, 27 Sep 2018 23:09:22 +0000 (27 18:09 -0500)]
Cleanup CkMulticast STL usage
Change-Id: I62f6bb0a82a0414f0df2a1a69f07ba70ddae5efd
Sam White [Thu, 27 Sep 2018 20:51:57 +0000 (27 15:51 -0500)]
Reduce memory.C's life-raft allocation from 32 to 16 KB
Change-Id: Ic98487b7e0c553bf2d73d04d5d8411de849bb11a
Evan Ramos [Thu, 27 Sep 2018 22:40:28 +0000 (27 17:40 -0500)]
Bug #1981: Fix crash when creating a migratable thread with isomalloc disabled
This can be seen in tests/converse/cthtest when run with "+noisomalloc"
or on a machine that lacks isomalloc.
Change-Id: I4a8a3b2e2af2f831e713eb06a9e899c72a3a63a5
Evan Ramos [Thu, 27 Sep 2018 20:07:55 +0000 (27 15:07 -0500)]
Explicitly call the default constructor when allocating Cpv variables in C++ code
This fixes the crash in uth-linux-x86_64/examples/ampi/Cjacobi3d/jacobi.iso,
caused by isomalloc_blocklist not being properly nulled for all PEs upon
initialization.
Change-Id: I072ef998edfd0f733e417fd4628acafaa8cc6411
Karthik Senthil [Fri, 15 Jun 2018 21:08:52 +0000 (15 16:08 -0500)]
CkIO : Add support to query Lustre FS API to set default stripe size
- CkIO now sets the default writeStripe to the stripe size queried from
Lustre FS API (otherwise 4 * 1024 * 1024 for non-Lustre systems)
- This patch also adds support for Charm++ to detect Lustre FS during
build, and set the CMK_HAS_LUSTREFS and CMK_LUSTREAPI flags
Change-Id: I14e5d2bafec3f033bc3c8c00714de1e5c07e44e0
Sam White [Thu, 27 Sep 2018 04:10:38 +0000 (26 23:10 -0500)]
Cleanup AMPI GPUReq and pooled SsendReq
Change-Id: I81747da0d9fdd8cf0b534cc3ed2908946f193ad8
Evan Ramos [Wed, 26 Sep 2018 19:42:35 +0000 (26 14:42 -0500)]
charmc: When checking GNU ld's version, account for '-' in the minor field
For example, "GNU ld version 2.27-28.base.el7_5.1".
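
A sketch of version parsing that tolerates such a suffix. charmc is a
shell script, so this C++ is only an illustration of the rule "read the
minor number as digits and stop at the first non-digit"; LdVersion,
readNumber, and parseLdVersion are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

struct LdVersion {
  int majorNum = -1;  // -1 means "not found"
  int minorNum = -1;
};

// Read a run of decimal digits starting at pos and advance pos past it.
// Returns -1 if there is no digit at pos.
static int readNumber(const std::string& s, std::size_t& pos) {
  assert(pos <= s.size());
  if (pos >= s.size() || !std::isdigit(static_cast<unsigned char>(s[pos]))) return -1;
  int value = 0;
  while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
    value = value * 10 + (s[pos] - '0');
    ++pos;
  }
  return value;
}

LdVersion parseLdVersion(const std::string& line) {
  LdVersion v;
  std::size_t pos = 0;
  // Skip the textual prefix up to the first digit.
  while (pos < line.size() && !std::isdigit(static_cast<unsigned char>(line[pos]))) ++pos;
  v.majorNum = readNumber(line, pos);
  if (v.majorNum < 0 || pos >= line.size() || line[pos] != '.') return v;
  ++pos;
  // The minor field ends at the first non-digit, so "27-28.base..." yields 27.
  v.minorNum = readNumber(line, pos);
  return v;
}
```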
Change-Id: Id8267bd47d25e2b2c9876d479aa5393a01dc9348
Evan Ramos [Tue, 14 Aug 2018 22:20:25 +0000 (14 17:20 -0500)]
Isomalloc: Disable use of the mempool
In testing, my isomalloc stress test completes more quickly this way.
Change-Id: I4588fe0762c86354e5d5f1dc4bcc406ce1c6aceb
Phil Miller [Thu, 31 Dec 2015 23:05:22 +0000 (31 17:05 -0600)]
Isomalloc #934: Move usage to an explicit BlockList interface, to avoid implicit thread-based context
This enables code to use Isomalloc-managed migratable heaps without associating
them with a corresponding migratable thread.
Co-authored-by: Evan Ramos <evan@hpccharm.com>
Change-Id: I5dd6d61f285807e8e7e344e341b50a3a5a5368c9
Jaemin Choi [Tue, 25 Sep 2018 17:31:29 +0000 (25 13:31 -0400)]
Fix AMPI compilation issues with newly merged GPUManager patches
Change-Id: I9e9274a17466b3891163d1386425fd80f1e27935
Nitin Bhat [Fri, 21 Sep 2018 18:37:55 +0000 (21 14:37 -0400)]
ZC OFI API: Replace fi_write with fi_writemsg with FI_DELIVERY_COMPLETE
Previously, fi_write would complete only when the source could
reuse its buffer. With this change, an fi_writemsg completes only when
the destination buffer has received the data. This change is required
to solve a rare race condition which occurs in the UNREG mode of operation,
where a Put operation is performed instead of a Get operation. The race
condition causes the source to send a message to the destination to
potentially de-register the destination buffer when the completion of the
write operation on the destination is uncertain i.e. the data could still
be in-flight. This patch fixes that case as completion on the source only
occurs after the destination has received the data through the RDMA write
operation.
Change-Id: I808f1d5bc9dda3d92859e9775531d0b5c47a1c8e
Jaemin Choi [Tue, 25 Sep 2018 05:05:45 +0000 (25 01:05 -0400)]
Update references to hapi_src to hapi_impl and revert hapiRegisterCallbacks
Change-Id: Ic825715c76058c21e73558a1cf8dc154c218df7e
Michael Robson [Thu, 26 Jul 2018 22:21:43 +0000 (26 17:21 -0500)]
cuda: Add hapi prefix to HAPI structs and update AMPI interface
Change-Id: I577cf1ee8d4067186a9c8d3e04f68b71a7250a9d
Michael Robson [Tue, 24 Apr 2018 22:06:59 +0000 (24 17:06 -0500)]
cuda: Various fixes to enable new GPU Manager
* Change memcpyAsync to memcpy (synchronous)
* Ensure free is called for both wr and hapi APIs
* Fix CUDA_ and HAPI_ define flag inconsistencies
* Change GPUManager::createStreams() to always return number of streams
* Re-enable HAPI_MEMPOOL by default
* Update user_data API to either set or copy according to user preference
Change-Id: Id6fa947d5cea5b49af04bbd287ba97a2faa240a0
Jaemin Choi [Mon, 27 Nov 2017 21:53:18 +0000 (27 15:53 -0600)]
Cleanup #1491: Update documentation of GPUManager
Change-Id: I9d1fc90f8556c14b868015fa2fc0ea127c439c3d
Jaemin Choi [Mon, 27 Nov 2017 20:32:27 +0000 (27 15:32 -0500)]
Cleanup #1489: Delete GPU dummy mempool
Change-Id: Ibfc4b14a6e5ce90bf0cddd951cb23708d65513f5
Jaemin Choi [Thu, 14 Sep 2017 17:41:43 +0000 (14 12:41 -0500)]
Support #1450: Clean up and add CUDA example programs
Change-Id: I89d668f736d69f11373f13055859135f64dd1e07
Jaemin Choi [Wed, 6 Dec 2017 22:19:31 +0000 (6 16:19 -0600)]
Feature #1393: Redesign of Hybrid API to support concurrent kernels
- Core algorithmic changes using CUDA streams, events and callbacks
- Add NVTX support (with Tim Haines' modifications)
- Remove workRequest queue related files
- Retain support for workRequest with small syntax changes
- Add README
Change-Id: Icbef6a4c3408acdf23ee7506f35326b6fc949b34
Jaemin Choi [Wed, 6 Dec 2017 22:17:07 +0000 (6 16:17 -0600)]
Rename files for Feature #1393: Redesign of Hybrid API
Change-Id: Ia882ee7676919538b77b298b3fe43b65e1973d7a
Sam White [Thu, 20 Sep 2018 18:24:50 +0000 (20 13:24 -0500)]
AMPI: fix intercomm NBC tests and remove testing of broken intercomm MPI_Ibarrier
Change-Id: I998353dbd505b89791f4ce7f569280b10227fff0
Nitin Bhat [Tue, 18 Sep 2018 22:59:05 +0000 (18 17:59 -0500)]
Verbs ZC API: Add lock while sending small message in UNREG mode
This patch also includes minor cleanup around the locking macros.
Previously, a macro called CMK_SMP_NOT_RELAX_LOCK was used, but was
never defined. That is removed in this patch and locking is limited
to only the SMP mode.
Change-Id: I41a0d34a58fd2eba7b5597714282289eea2c8b0a
Sam White [Tue, 18 Sep 2018 15:55:33 +0000 (18 10:55 -0500)]
Cleanup: shorten time taken to run longest-running tests/examples
- Decrease iterations and/or sizes of inputs.
- Decrease +p to avoid oversubscription in SMP mode on machines with
only 8 PUs.
Change-Id: I38d76812a686ee38dac0602ed7da29eb2536f15e
Evan Ramos [Mon, 17 Sep 2018 21:59:07 +0000 (17 16:59 -0500)]
PUP: Add reconstruct support for std::{unordered_,}{multi,}{map,set}
Change-Id: I9672ebd3b2105f996c53a9b3620653bb2711aa38
Evan Ramos [Tue, 18 Sep 2018 17:36:39 +0000 (18 12:36 -0500)]
AMPI: Remove noexcept from thread start functions that wrap user code
Otherwise, debuggers will unwind to the functions marked noexcept
instead of the source of the exception.
Change-Id: I5718c8142c6eac0d1c04f2109d7e36a602e585bb
Evan Ramos [Mon, 17 Sep 2018 21:58:02 +0000 (17 16:58 -0500)]
PUP: Fix detection of class constructors taking PUP::reconstruct
Previously, this code would check if PUP::reconstruct could be
constructed using the class type as a parameter. This patch swaps the
types to the correct relationship.
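
The difference between the two checks can be sketched with
std::is_constructible. This assumes the detection is expressed with that
trait, which the message does not state; demo::reconstruct is a
hypothetical stand-in for PUP::reconstruct, and the other names are
illustrative.

```cpp
#include <cassert>
#include <type_traits>

namespace demo {

struct reconstruct {};  // hypothetical stand-in for the PUP::reconstruct tag

struct WithTagCtor {
  explicit WithTagCtor(reconstruct) {}
};

struct WithoutTagCtor {
  WithoutTagCtor() {}
};

// Intended question: can T be constructed from the tag?
template <class T>
constexpr bool hasReconstructCtor() {
  return std::is_constructible<T, reconstruct>::value;
}

// Swapped question: can the tag be constructed from a T? This is false for
// ordinary classes, so it reports nothing about T's constructors.
template <class T>
constexpr bool swappedCheck() {
  return std::is_constructible<reconstruct, T>::value;
}

}  // namespace demo
```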
Change-Id: I014a5fd879d85aaccb7194c00ae571dc9c71c269
raghavendrak [Wed, 12 Sep 2018 20:08:54 +0000 (12 15:08 -0500)]
An example for 2D array sections, and its usage
Change-Id: I616cd280d7c6bf2a35b0cdfb84ccd2e4f50bbd03
Sam White [Wed, 12 Sep 2018 17:19:11 +0000 (12 12:19 -0500)]
Partially fix CkCallbacks to section multicasts
This fixes CkCallbacks to section multicasts, which were broken by
commit ede3f6d854de03f1836c887530b3626cdc139640, and may have been
broken before then.
CkCallbacks to sections will still break when migrated, since they
contain bare pointers to the section info. That issue is redmine #235.
Change-Id: If1830106d0f947e105e90aa6cbbab8219a059d35
Sam White [Wed, 12 Sep 2018 13:17:08 +0000 (12 08:17 -0500)]
Cleanup #1978: array sections example
Change-Id: Ia2668948b069b7f5407112f439f029a7ec704b5e
Sam White [Sun, 16 Sep 2018 19:42:35 +0000 (16 14:42 -0500)]
Conditionalize CkMcastBaseMsg::checkMagic on CMK_ERROR_CHECKING
Change-Id: If8b4c82a568e98474cb6e464d84ff2f1885b1aa0
Sam White [Wed, 12 Sep 2018 02:12:35 +0000 (11 21:12 -0500)]
Cleanup: remove unnecessary delegation of cross-array section to CkMulticast
- Remove manual delegation from the xarraySection example code.
- Make the documentation about auto-delegation of cross-array sections
more clear.
Change-Id: I2ed7721b21d482eacf519ff985ecb05c6f55bb6f
Sam White [Thu, 13 Sep 2018 14:56:02 +0000 (13 09:56 -0500)]
AMPI: avoid unnecessary casts of AmpiRequests to derived types
Change-Id: Ib2d93666c0396180b4352dfc96759a99cb7bec9d
Sam White [Sat, 5 May 2018 17:21:21 +0000 (5 10:21 -0700)]
AMPI: mark all of AMPI and TCharm as noexcept
Change-Id: Iaa66290a97547f59a95b5fa76195476d56db7259
Sam White [Tue, 11 Sep 2018 23:15:40 +0000 (11 18:15 -0500)]
AMPI: align the AmpiRequest pool based on the alignment of the types it may hold
Change-Id: I9da505d77448c9bdabd9573832cc7943d66f0941
Sam White [Tue, 11 Sep 2018 23:05:12 +0000 (11 18:05 -0500)]
AMPI: mark derived classes with C++11 final keyword where applicable
- Fix bug in ampi::irednResult() exposed by use of 'final'.
Change-Id: Ifa622e9ce93f10067c4d5f2e94dad822d113a2b1
Sam White [Tue, 11 Sep 2018 22:57:05 +0000 (11 17:57 -0500)]
Cleanup: simplify AMPI class member variables
Change-Id: Id220857bdd4e88e13bf0bac3c59cb32b13d71b4e
Evan Ramos [Thu, 13 Sep 2018 22:22:59 +0000 (13 17:22 -0500)]
Pass block pointers to free when Isomalloc is disabled
This fixes the crash in examples/armci/putTest.
Change-Id: Ib1bf6bdd684ea99ba4ad3e85459eb0e5a192a15a
Evan Ramos [Fri, 7 Sep 2018 17:35:05 +0000 (7 12:35 -0500)]
Add check to ignore BG/Q's reserved socket in provisioning counts
Change-Id: I9617a8481376e0942ac7ddb3848e9387ca38667b
Evan Ramos [Fri, 7 Sep 2018 17:25:51 +0000 (7 12:25 -0500)]
Remove CMK_CCS_AVAILABLE definitions from headers
This macro is handled by the configure script now.
Change-Id: Ic34511e08c7ad012bfa975499a32a2ef2bbaea5c
Evan Ramos [Thu, 6 Sep 2018 23:07:54 +0000 (6 18:07 -0500)]
Bug #1975: Add -lrca to link line on Cray systems
Change-Id: I368e969b3ccca3958dcddddd2bbfd869cf65b79b
Evan Ramos [Fri, 25 May 2018 20:25:12 +0000 (25 15:25 -0500)]
Add isomalloc test
This does not add it to the overall "make test".
Change-Id: I13a43fb361ac462aa56791c96ccccc1b3951fd5f
Evan Ramos [Tue, 4 Sep 2018 19:54:39 +0000 (4 14:54 -0500)]
Isomalloc: Fix handling of allocation length and alignment
Previously, the alignment value requested through posix_memalign etc
would not be recorded, and alignment would be ignored after a migration.
Additionally, the length field would sometimes include the size of
struct CmiIsomallocBlock, but it would be passed unmodified to pup_bytes
along with the user pointer, potentially resulting in OOB read/write at
slot borders.
Change-Id: I971013e536b0b67147094322ec742b928219dafd
Juan Galvez [Tue, 4 Sep 2018 19:33:33 +0000 (4 14:33 -0500)]
Disable abort when registering entry methods after init with CharmPy
Change-Id: I3c3f5a382d0182728481bf35c42aeafc9cfbd52b
Evan Ramos [Fri, 31 Aug 2018 20:59:29 +0000 (31 15:59 -0500)]
build: Avoid duplication of OPTSATBUILDTIME
Change-Id: I3fac15d2b964212d94a3fc4d49567909f2b5b023
Evan Ramos [Tue, 31 Jul 2018 23:00:23 +0000 (31 18:00 -0500)]
Partially fix compatibility when building mpi-win-x86_64 with GCC
A remaining issue is explained in a comment, and using Microsoft MPI
with GCC requires some manual installation modification anyway.
Change-Id: I654ff8e5774cbfc3b17dbffe981aaa19950b9bc3
Evan Ramos [Fri, 27 Jul 2018 22:53:53 +0000 (27 17:53 -0500)]
Allow mpi-win* to find the location of Microsoft MPI 9.0.1
Change-Id: Ie0ceb8e06878ed6f4004ed3a3ab381ba95320cd4
Evan Ramos [Mon, 27 Aug 2018 18:35:29 +0000 (27 13:35 -0500)]
Implement conv-mach-opt.mak
This file allows Makefiles to see variables determined at configure time
without invoking a shell script, which is slow.
Change-Id: I558f898db13b50306bb7f0bc349d61cfd6d365fc
Juan Galvez [Fri, 31 Aug 2018 16:21:25 +0000 (31 11:21 -0500)]
mpi-win: Don't let windows.h include winsock.h
winsock.h conflicts with winsock2.h included in sockRoutines
Change-Id: I915c21caab0c61afbca99ce2b91bef775fcec868
Sam White [Fri, 27 Jul 2018 21:37:02 +0000 (27 16:37 -0500)]
AMPI #1097: fix broken support for MPI keyval attributes
- Add support for callbacks associated with copying and deletion of
keyvals.
- Add support for reference counting to keyvals.
- Have comms, datatypes, windows maintain vectors of keyval references.
- Don't forget to PUP the keyvals in ampiCommStruct.
- Improve error checking for various keyval operations.
- Fix MPI_Comm_compare to return MPI_UNEQUAL when the input comms have
different sizes.
Change-Id: Ie94a3d0bc7e4f67c9276223696d779217c19aca7
Evan Ramos [Fri, 31 Aug 2018 03:12:24 +0000 (30 22:12 -0500)]
Require mmap when building all forms of charmdebug
This fixes Windows build failure upon trying to build libmemory-os-charmdebug.
Change-Id: I8ee2513d3b946e6b8b66b92adcbd0125d928b5d3
Juan Galvez [Wed, 29 Aug 2018 21:16:01 +0000 (29 16:16 -0500)]
Windows: netlrts TCP performance optimization using WSASend
This is the Windows equivalent of the sendmsg optimization for
Unix (commit 4af1d868a).
WSASend can send from multiple buffers, which improves the
performance of the netlrts TCP layer. With this patch,
performance in `tests/charm++/pingpong` improves by 50-60%,
and the TCP layer now performs the same as the UDP layer on
Windows in this benchmark.
WSASend requires winsock2.h. This patch replaces
`#include <winsock.h>` with winsock2.h. winsock.h is extremely old
and winsock2 has been around since Win98. The problem is that
including winsock2.h is not exactly trivial. Due to weirdness of
Windows, windows.h includes winsock.h, and trying to include
winsock2.h after it leads to compiler errors. There are different
approaches to including winsock2.h. The most appropriate for Charm
seems to be to define WIN32_LEAN_AND_MEAN before including
windows.h, so that it doesn't include winsock.h and other headers.
Presumably this also reduces compilation time.
Change-Id: I8b45589f1e275e90278745b872ba6c0165f8c0e3
Matthias Diener [Thu, 30 Aug 2018 18:23:57 +0000 (30 11:23 -0700)]
pami: remove always_inline from machine_send
Prevents errors of the type
"error: inlining failed in call to always_inline 'void
machine_send(pami_context_t, int, int, int, char*, int)': function body can be
overwritten at link time"
Change-Id: I388c3c265411108b24860b29b2792f6e574b3541
Ronak Buch [Thu, 23 Aug 2018 17:30:11 +0000 (23 12:30 -0500)]
Streamline C++11 support, add support for C++11 in charmxi code
Change-Id: Ida2ced2005947b64297479bcd3674c38837f67f5
Evan Ramos [Wed, 29 Aug 2018 19:44:24 +0000 (29 14:44 -0500)]
Disable CMK_CHARMDEBUG from configure if CCS is unavailable
This prevents the value in conv-autoconfig.h becoming out of sync with
the value in conv-mach-opt.sh, etc.
Change-Id: I0d43d047123f58d65dfbc5f46454193e7b5dfa0a
Evan Ramos [Wed, 29 Aug 2018 19:24:06 +0000 (29 14:24 -0500)]
verbs: Determine device speed entirely ourselves
We cannot depend on ibv_rate_to_mbps because it is not present in all
versions of libibverbs. Same for some identifiers in enum ibv_rate.
Change-Id: I23d21dc7f11ecf3009556447b0b79a8b332e5ae6
Sam White [Tue, 28 Aug 2018 17:09:15 +0000 (28 12:09 -0500)]
AMPI testing: check for MPI_COMM_NULL before calling MPI_Comm_free
Change-Id: Iecd4ebfda6180dfa916c90a82d70ecabf7bc8758
Nitin Bhat [Tue, 28 Aug 2018 22:17:33 +0000 (28 17:17 -0500)]
Zerocopy Direct API: Do not change UNREG to REG mode after registration
With the introduction of a new boolean variable inside CkNcpyBuffer,
called isRegistered, the registration management can be controlled without
the change in the user specified mode. With this change, the RTS internally
uses the value of isRegistered to avoid unnecessary registration and
de-registration. The user specified mode remains unchanged.
This fix also updates the example programs to test de-registration using the
CkNcpyBuffer received in the callback method. The examples are also updated
to not store the CkNcpyBuffer objects, wherever applicable, as data members
in the class as they are now received as a part of the CkDataMsg in the
CkCallback.
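
A rough sketch of the separation described above, with hypothetical
names (Mode, Buffer, ensureRegistered, deregister) that do not mirror
CkNcpyBuffer's real layout: the user-specified mode is never modified,
and a separate flag tracks whether registration has already happened.

```cpp
#include <cassert>

enum class Mode { REG, UNREG, PREREG };  // illustrative mode names

struct Buffer {
  Mode mode;                  // what the user asked for; never modified below
  bool isRegistered = false;  // runtime bookkeeping, independent of mode
  int registrations = 0;      // counts stand-in registration calls for the demo

  explicit Buffer(Mode m) : mode(m) {}
};

// Register only if not already registered; the mode is left as specified.
void ensureRegistered(Buffer& b) {
  if (b.isRegistered) return;
  ++b.registrations;  // stand-in for the real memory registration
  b.isRegistered = true;
}

// De-register only if a registration is currently held.
void deregister(Buffer& b) {
  if (!b.isRegistered) return;
  b.isRegistered = false;
}
```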
Change-Id: I896c7002284b7aab447ad5e3bd41cf709814d218
Nitin Bhat [Tue, 28 Aug 2018 22:15:36 +0000 (28 17:15 -0500)]
Documentation for Zerocopy API: Minor change in mode specification
Change-Id: I25c1f54eef75130d2d730ac45b1578f20c4d3367
Nitin Bhat [Tue, 28 Aug 2018 14:25:54 +0000 (28 09:25 -0500)]
Documentation about PREREG mode for the Zerocopy API
Change-Id: I403924abe3c125b53fd7f8c6593fd5d0b480818f
Nitin Bhat [Mon, 27 Aug 2018 17:25:55 +0000 (27 13:25 -0400)]
Fix bug in MPI machine layer related to the Ncpy API
The bug is related to linked list states when messages are sent
in the callback function invoked inside ReleasePostedMessages.
Previously, the end_sent linked list was stale when a new message was
added to the linked list while iterating across the list inside
ReleasePostedMessages. With the fix, the linked list state is updated
before invoking the acknowledgement function to handle sends while
iterating across the list.
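
The ordering issue can be sketched with a generic singly linked list.
This is not the MPI machine layer's code: SentMsg, SentList, and
releaseCompleted are hypothetical, and the sketch assumes the
acknowledgement callback may append new messages to the same list.

```cpp
#include <cassert>
#include <functional>
#include <utility>

struct SentMsg {
  bool done = false;
  std::function<void()> ack;  // may enqueue new messages on the same list
  SentMsg* next = nullptr;
};

struct SentList {
  SentMsg* head = nullptr;
  SentMsg* tail = nullptr;  // plays the role of the "end" pointer

  void append(SentMsg* m) {
    m->next = nullptr;
    if (tail) tail->next = m; else head = m;
    tail = m;
  }

  // Unlink each completed message and fix head/tail BEFORE running its ack,
  // so an append() from inside the ack sees a consistent list.
  int releaseCompleted() {
    int released = 0;
    SentMsg* prev = nullptr;
    SentMsg* cur = head;
    while (cur) {
      if (!cur->done) { prev = cur; cur = cur->next; continue; }
      SentMsg* next = cur->next;
      if (prev) prev->next = next; else head = next;
      if (cur == tail) tail = prev;
      cur->next = nullptr;
      std::function<void()> ack = std::move(cur->ack);
      if (ack) ack();
      ++released;
      // Re-read the successor: the ack may have appended right after prev.
      cur = prev ? prev->next : head;
    }
    return released;
  }
};
```

If the tail pointer were updated only after the callback ran, an append
made by the callback would link the new message behind the node that is
about to be removed.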
Change-Id: Ib06b3ab7375c13f328df9502976b5fc900fbc03f
Matthias Diener [Mon, 27 Aug 2018 18:11:54 +0000 (27 13:11 -0500)]
cleanup: remove vestiges of netlrts-ibverbs / net-ibverbs
Change-Id: I2178e4d0e72544df5225ee48d9ac9fd1d507f5f7
Matthias Diener [Mon, 27 Aug 2018 15:25:40 +0000 (27 10:25 -0500)]
doc: fix defaults for spack package
Change-Id: I8c67a8d308bc2e14a209bbe8e415fd409dc3df3f
Juan Galvez [Mon, 27 Aug 2018 15:19:55 +0000 (27 10:19 -0500)]
Declare charm_reducers as extern "C" to avoid C++ name mangling
charm_reducers is data meant to be accessed from outside Charm++
(namely CharmPy). extern "C" declaration simplifies access from C
compiled code.
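
A minimal illustration of C linkage for data and an accessor. The
symbols demo_reducers and demo_reducer_at are hypothetical and unrelated
to the real charm_reducers table; the sketch only shows the declaration
form. Whether a given symbol would otherwise be mangled depends on the
platform ABI.

```cpp
#include <cassert>
#include <cstddef>

extern "C" {

// Hypothetical table exported under its plain, unmangled name so that a
// foreign-function interface can look it up by that name.
int demo_reducers[3] = {0, 1, 2};

// Accessor with C linkage; callable from C or through an FFI.
int demo_reducer_at(std::size_t index) {
  assert(index < 3);
  return demo_reducers[index];
}

}  // extern "C"
```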
Change-Id: I728d425eaf5ed1a2ead443c00a85ec730c2b50c5
Matthias Diener [Fri, 24 Aug 2018 18:21:07 +0000 (24 13:21 -0500)]
cpuaffinity: use cmi_hwloc_ prefix consistently
Change-Id: Iae44cdcc4f06feca522043d81d9aa9adb391305e
Matthias Diener [Fri, 24 Aug 2018 17:41:17 +0000 (24 12:41 -0500)]
build: pass through CMK_LIBS in cc-gcc
Fixes missing references to pthreads when
building with e.g.
./build charm++ multicore-linux-x86_64 gcc
Change-Id: I19b7f18b42f023c208b00032d0086604ede8c7f1
Sam White [Fri, 24 Aug 2018 16:01:26 +0000 (24 11:01 -0500)]
Cleanup: add missing return to cpuaffinity
Change-Id: Ic47fc1a58fcdf8a30f86a14e87e262cddaae60fa
Sam White [Fri, 24 Aug 2018 15:36:41 +0000 (24 10:36 -0500)]
Conditionalize use of C++11 emplace_back in charmxi code on 'charmc -host' support of C++11
Change-Id: Icd7544c2554b5dd8f5da04a2c59f29b7c94f6d9b
Sam White [Wed, 22 Aug 2018 16:50:54 +0000 (22 11:50 -0500)]
Cleanup AMPI DDT
Make spacing consistent, use MPI_ macros wherever possible, update old
comments, use C++11 support for copy constructors with default args,
rename variable names for more consistency, and order the defs/decls
of CkDDT_ classes from least complex (contiguous) to most complex
(struct).
Change-Id: I362449e0397c10b4639e6d328840ae6e8201e7d1