This file describes the most significant changes. For more detail, use
'git log' on a clone of the charm repository.

================================================================================
What's new in Charm++ 6.8.2
================================================================================

This is a minor release containing only the following changes on top of 6.8.1:

- Fix for a crash in memory deregistration on the OFI communication layer in SMP mode.

- Tuned eager/rendezvous messaging thresholds for the PAMI communication layer.

================================================================================
What's new in Charm++ 6.8.1
================================================================================

This is a backwards-compatible patch/bug-fix release. Roughly 100 bug
fixes, improvements, and cleanups have been applied across the entire
system. Notable changes are described below:
General System Improvements

- Enable network- and node-topology-aware trees for group and chare
  array reductions and broadcasts

- Add a message receive 'fast path' for quicker array element lookup

- Feature #1434: Optimize degenerate CkLoop cases

- Fix a rare race condition in Quiescence Detection that could allow
  it to fire prematurely (bug #1658)
  * Thanks to Nikhil Jain (LLNL) and Karthik Senthil for isolating
    this in the Quicksilver proxy application

- Load balancing fixes and improvements:
  * Fix RefineSwapLB to properly handle non-migratable objects
  * GreedyRefine: improvements for concurrent=false and HybridLB integration
  * Bug #1649: NullLB shouldn't wait for LB period

- Fix Projections tracing bug #1437: CkLoop work traces to the
  previous entry on the PE rather than to the caller

- Modify [aggregate] entry method (TRAM) support to only deliver
  PE-local messages inline for [inline]-annotated methods. This avoids
  the potential for excessively deep recursion that could overrun the stack.
- Fix various compilation warnings

Platform Support

- Improve experimental support for PAMI network layer on POWER8 Linux platforms
  * Thanks to Sameer Kumar of IBM for contributing these patches

- Add an experimental 'ofi' network layer to run on Intel Omni-Path
  hardware using libfabric
  * Thanks to Yohann Burette and Mikhail Shiryaev of Intel for
    contributing this new network layer

- The GNI network layer (used on Cray XC/XK/XE systems) now respects
  the ++quiet command line argument during startup
AMPI Improvements

- Support for MPI_IN_PLACE in all collectives and for persistent requests

- Improved Alltoall(v,w) implementations

- AMPI now passes all MPICH-3.2 tests for groups, virtual topologies, and infos

- Fixed Isomalloc to not leave behind mapped memory when migrating off a PE
================================================================================
What's new in Charm++ 6.8.0
================================================================================

Over 900 bug fixes, improvements, and cleanups have been applied
across the entire system. Major changes are described below:

Charm++ Features

- Calls to entry methods taking a single fixed-size parameter can now
  automatically be aggregated and routed through the TRAM library by
  marking them with the [aggregate] attribute.

- Calls to parameter-marshalled entry methods with large array
  arguments can ask for asynchronous zero-copy send behavior with an
  `nocopy' tag in the parameter's declaration.

- The runtime system now integrates an OpenMP runtime library so that
  code using OpenMP parallelism will dispatch work to idle worker
  threads within the Charm++ process.

- Applications can ask the runtime system to perform automatic
  high-level end-of-run performance analysis by linking with the
  `-tracemode perfReport' option.
- Added a new dynamic remapping/load-balancing strategy,
  GreedyRefineLB, that offers high result quality and well-bounded
  execution time.
- Improved and expanded topology-aware spanning tree generation
  strategies, including support for runs on a torus with holes, such
  as Blue Waters and other Cray XE/XK systems.

- Charm++ programs can now define their own main() function, rather
  than using a generated implementation from a mainmodule/mainchare
  combination. This extends the existing Charm++/MPI interoperation
  feature.
- Improvements to Sections:

  * Array sections API has been simplified, with array sections being
    automatically delegated to CkMulticastMgr (the most efficient implementation
    in Charm++). Changes are reflected in Chapter 14 of the manual.

  * Group sections can now be delegated to CkMulticastMgr (improved performance
    compared to the default implementation). Note that they have to be manually
    delegated. Documentation is in Chapter 14 of the Charm++ manual.

  * Group section reductions are now supported for delegated sections.

  * Improved performance of section creation in CkMulticastMgr.

  * CkMulticastMgr uses the improved spanning tree strategies. See above.
- GPU manager now creates one instance per OS process and scales the
  pre-allocated memory pool size according to the GPU memory size and
  number of GPU manager instances on a physical node.

- Several GPU Manager API changes including:

  * Replaced references to global variables in the GPU manager API with calls to
    functions.

  * The user is no longer required to specify a bufferID in the dataInfo struct.

  * Replaced calls to kernelSelect with direct invocation of functions passed
    via the work request object (allows CUDA to be built with all programs).
- Added support for malleable jobs that can dynamically shrink and
  expand the set of compute nodes hosting Charm++ processes.

- Greatly expanded and improved reduction operations:

  * Added built-in reductions for all logical and bitwise operations
    on integer and boolean input.

  * Reductions over groups and chare arrays that apply commutative,
    associative operations (e.g. MIN, MAX, SUM, AND, OR, XOR) are now
    processed in a streaming fashion. This reduces the memory footprint of
    reductions. User-defined reductions can opt into this mode as well.

  * Added a new `Tuple' reducer that allows combining multiple reductions
    of different input data and operations from a common set of source
    objects to a single target callback.

  * Added a new `Summary Statistics' reducer that provides count, mean,
    and standard deviation using a numerically-stable streaming algorithm.
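The streaming computation such a reducer performs can be illustrated with Welford's online algorithm, the standard numerically stable way to keep a running count, mean, and variance without storing all contributions. The struct below is an illustrative sketch of that algorithm only, not the Charm++ reducer API; the names are made up for the example.

```cpp
#include <cassert>
#include <cmath>

// Welford's online algorithm: update count/mean/variance one value at a
// time, avoiding the catastrophic cancellation of the naive sum-of-squares.
struct StreamingSummary {
  long count = 0;
  double mean = 0.0;
  double m2 = 0.0;   // running sum of squared deviations from the mean

  void add(double x) {
    ++count;
    double delta = x - mean;
    mean += delta / count;          // incremental mean update
    m2 += delta * (x - mean);       // uses both old and new mean
  }

  double stddev() const {           // sample standard deviation
    return count > 1 ? std::sqrt(m2 / (count - 1)) : 0.0;
  }
};
```

Because each update needs only the previous (count, mean, m2) triple, partial summaries from different objects can be merged as they stream in, which is what makes the approach suitable for a reduction.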
- Added a `++quiet' option to suppress charmrun and charm++ non-error
  messages at startup.

- Calls to chare array element entry methods with the [inline] tag now
  avoid copying their arguments when the called method takes its
  parameters by const&, offering a substantial reduction in overhead in
  that case.
- Synchronous entry methods that block until completion (marked with
  the [sync] attribute) can now return any type that defines a PUP
  method, rather than only message types.

- Static (non-generated) header files are now warning-free for
  gcc -Wall -Wextra -pedantic.

- Deprecated setReductionClient and CkSetReductionClient in favor of
  explicitly passing callbacks to contribute calls.

- On C++ standard library implementations with support for
  std::is_constructible (e.g. GCC libstdc++ >4.5), chare array
  elements only need to define a constructor taking CkMigrateMessage*
  if they will actually be migrated.
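The compile-time detection this relies on can be seen in a few lines of plain C++. The sketch below is illustrative, not the Charm++ source: `CkMigrateMessage` here is a stand-in declaration, and the two classes are hypothetical chare element types.

```cpp
#include <cassert>
#include <type_traits>

struct CkMigrateMessage;   // stand-in declaration for this sketch

// A class that provides the migration constructor...
struct Migratable {
  Migratable() {}
  Migratable(CkMigrateMessage*) {}
};

// ...and one that does not, and (per the release note) no longer has to.
struct NonMigratable {
  NonMigratable() {}
};

// std::is_constructible lets generated code ask, at compile time, whether
// the migration constructor exists, and only call it when it does.
static_assert(std::is_constructible<Migratable, CkMigrateMessage*>::value,
              "migration constructor detected");
static_assert(!std::is_constructible<NonMigratable, CkMigrateMessage*>::value,
              "absence detected, no requirement imposed");
```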
- The PUP serialization framework gained support for some C++11
  library classes, including unique_ptr and unordered_map, when the
  underlying types have PUP operators.
AMPI Improvements

- More efficient implementations of message matching infrastructure, multiple
  completion routines, and all varieties of reductions and gathers.

- Support for user-defined non-commutative reductions, MPI_BOTTOM, cancelling
  receive requests, MPI_THREAD_FUNNELED, PSCW synchronization for RMA, and more.

- Fixes to AMPI's extensions for load balancing and to Isomalloc on SMP builds.

- More robust derived datatype support, optimizations for truly contiguous types.

- ROMIO is now built on AMPI and linked in by ampicc by default.

- A version of HDF5 v1.10.1 that builds and runs on AMPI with virtualization
  is now available at https://charm.cs.illinois.edu/gerrit/#/admin/projects/hdf5-ampi

- Improved support for performance analysis and visualization with Projections.
Platforms and Portability

- The runtime system code now requires compiler support for C++11
  R-value references and move constructors. This is not expected to be
  incompatible with any currently supported compilers.

- The next feature release (anticipated to be 6.9.0 or 7.0) will require
  full C++11 support from the compiler and standard library.

- Added support for IBM POWER8 systems with the PAMI communication API,
  such as development/test platforms for the upcoming Sierra and Summit
  supercomputers at LLNL and ORNL. Contributed by Sameer Kumar of IBM.

- Mac OS (darwin) builds now default to the modern libc++ standard
  library instead of the older libstdc++.

- Blue Gene/Q build targets have been added for the `bgclang' compiler.

- Charm++ can now be built on Cray's CCE 8.5.4+.

- Charm++ will now build without custom configuration on Arch Linux.

- Charmrun can automatically detect rank and node count from
  Slurm/srun environment variables.

- Many obsolete architecture, network, and compiler support files have
  been removed. These include:

  * Sony/Toshiba/IBM Cell (including PlayStation 3)
  * Intel IA-64 (Itanium)
  * Intel x86-32 for Windows, Mac OS X (darwin), and Solaris
  * Older IBM AIX/POWER configurations
  * GCC 3 and KAI compilers
================================================================================
What's new in Charm++ 6.7.1
================================================================================

Changes in this release are primarily bug fixes for 6.7.0. The major exception
is AMPI, which has seen changes to its extension APIs and now complies with more
of the MPI standard. A brief list of changes follows:

Charm++

- Startup and exit sequences are more robust

- Error and warning messages are generally more informative

- CkMulticast's set and concat reducers work correctly

AMPI

- AMPI's extensions have been renamed to use the prefix AMPI_ instead of MPI_
  and to generally follow MPI's naming conventions

- AMPI_Migrate(MPI_Info) is now used for dynamic load balancing and all fault
  tolerance schemes (see the AMPI manual)

- AMPI officially supports MPI-2.2, and also implements the non-blocking
  collectives and neighborhood collectives from MPI-3.1

Platforms and Portability

- Cray regularpages build target has been fixed

- Clang compiler target for BlueGene/Q systems added

- Comm. thread tracing for SMP mode added

- AMPI's compiler wrappers are easier to use with autoconf and cmake
================================================================================
What's new in Charm++ 6.7.0
================================================================================

Over 120 bugs fixed, spanning areas across the entire system

Charm++ Features

- New API for efficient formula-based distributed sparse array creation

- CkLoop is now built by default

- CBase_Foo::pup need not be called from Foo::pup in user code anymore - runtime
  code handles this automatically

- Error reporting and recovery in .ci files is greatly improved, providing more
  precise line numbers and often column information

- Many data races occurring under shared-memory builds (smp, multicore) were
  fixed, facilitating use of tools like ThreadSanitizer and Helgrind

AMPI Features

- Further MPI standard compliance in AMPI allows users to build and run
  Hypre-2.10.1 on AMPI with virtualization, migration, etc.

- Improved AMPI Fortran2003 PUP interface 'apup', similar to C++'s STL PUP
Platforms and Portability

- Compiling Charm++ now requires support for C++11 variadic templates. In GCC,
  this became available with version 4.3, released in 2008

- New machine target for multicore Linux ARM7: multicore-linux-arm7

- Preliminary support for POWER8 processors, in preparation for the upcoming
  Summit and Sierra supercomputers

- The charmrun process launcher is now much more robust in the face of slow
  or rate-limited connections to compute nodes

- PXSHM now auto-detects the node size, so the '+nodesize' option is no longer needed

- Out-of-tree builds are now supported

Deprecations

- CommLib has been removed.

- CmiBool has been dropped in favor of C++'s bool
================================================================================
What's new in Charm++ 6.6.1
================================================================================

Changes in this release are primarily bug fixes for 6.6.0. A concise list of
affected components follows:

- Reductions with syncFT

- mpicxx based MPI builds

- Increased support for macros in CI file

- GNI + RDMA related communication

- MPI_STATUSES_IGNORE support for AMPIF

- Restart on different node count with chkpt

- Immediate msgs on multicore builds
================================================================================
What's new in Charm++ 6.6.0
================================================================================

- Machine target files for Cray XC systems ('gni-crayxc') have been added

- Interoperability with MPI code using native communication interfaces on Blue
  Gene Q (PAMI) and Cray XE/XK/XC (uGNI) systems, in addition to the universal
  MPI communication interface

- Support for partitioned jobs on all machine types, including TCP/IP and IB
  Verbs networks using 'netlrts' and 'verbs' machine layers

- A substantially improved version of our asynchronous library, CkIO, for
  parallel output of large files

- Narrowing the circumstances in which the runtime system will send
  overhead-inducing ReductionStarting messages

- A new fully distributed load balancing strategy, DistributedLB, that produces
  high quality results with very low latency

- An API for applications to feed custom per-object data to specialized load
  balancing strategies (e.g. physical simulation coordinates)

- SMP builds on LRTS-based machine layers (pamilrts, gni, mpi, netlrts, verbs)
  support tracing messages through communication threads

- Thread affinity mapping with +pemap now has improved support for Intel's
  Hyperthreading

- After restarting from a checkpoint, thread affinity will use new
  +pemap/+commap arguments

- Queue order randomization options were added to assist in debugging race
  conditions in application and runtime code

- The full runtime code and associated libraries can now compile under the C11
  and C++11/14 standards.

- Numerous bug fixes, performance enhancements, and smaller improvements in the
  provided runtime facilities

- Deprecations:
  * The long-unsupported FEM library has been deprecated in favor of ParFUM
  * The CmiBool typedefs have been deleted, as C++ bool has long been universal
  * Future versions of the runtime system and libraries will require some degree
    of support for C++11 features from compilers
================================================================================
What's new in Charm++ 6.5.0
================================================================================

- The Charm++ manual has been thoroughly revised to improve its organization,
  comprehensiveness, and clarity, with many additional example code snippets.
- The runtime system now includes the 'Metabalancer', which can provide
  substantial performance improvements for applications that exhibit dynamic
  load imbalance. It provides two primary benefits. First, it automatically
  optimizes the frequency of load balancer invocation, to avoid work stoppage
  when it will provide too little benefit. Second, calls to AtSync() are made
  less synchronous, to further reduce overhead when the load balancer doesn't
  need to run. To activate the Metabalancer, pass the option +MetaLB at
  runtime. To get the full benefits, calls to AtSync() should be made at every
  iteration, rather than at some arbitrary longer interval as was previously
  recommended.
- Many feature additions and usability improvements have been made in the
  interface translator that generates code from .ci files:
  * Charmxi now provides much better error reports, including more accurate
    line numbers and clearer reasons for failure, including some semantic
    problems that would otherwise appear when compiling the C++ code or even at
    runtime.
  * A new SDAG construct 'case' has been added that defines a disjunction over a
    set of 'when' clauses: only one 'when' out of a set will ever be triggered.
  * Entry method templates are now supported. An example program can be found
    in tests/charm++/method_templates/.
  * SDAG keyword "atomic" has been deprecated in favor of the newly supported
    keyword "serial". The two are synonymous, but "atomic" is now provided only
    for backward compatibility.
  * It is no longer necessary to call __sdag_init() in chares that contain SDAG
    code - the generated code does this automatically. The function is left as
    a no-op for compatibility, but may be removed in a future version.
  * Code generated from .ci files is now primarily in .def.h files, with only
    declarations in .decl.h. This improves debugging, speeds compilation,
    provides clearer compiler output, and enables more complete encapsulation,
    especially in SDAG code.
  * Mainchare constructors are expected to take CkArgMsg*, and always have
    been. However, charmxi would allow declarations with no argument, and
    assume the message. This is now deprecated, and generates a warning.
- Projections tracing has been extended and improved in various ways:
  * The trace module can generate a record of the network topology of the nodes in
    a run for certain platforms (including Cray), which Projections can
    display.
  * If the gzip library (libz) is available when Charm++ is compiled, traces
    are compressed by default.
  * If traces were flushed as a result of filled buffers during the run, a
    warning will be printed at exit to indicate that the user should be wary of
    interference that may have resulted.
  * In SMP builds, it is now possible to trace message progression through the
    communication threads. This is disabled by default to avoid overhead and
    potentially misleading interpretation.
- Array elements can be block-mapped at the SMP node level instead of at the
  per-PE level (option "+useNodeBlkMapping").

- AMPI can now privatize global and static variables using TLS. This is
  supported in C and C++ with __thread markings on the variable declarations
  and definitions, and in Fortran with a patched version of the gfortran
  compiler. To activate this feature, append '-tls' to the '-thread' option's
  argument when you link your AMPI program.
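The privatization idea is that a thread-local variable gives each thread, and hence each virtualized rank driven by its own thread, a private copy of what was a shared global. A minimal sketch of the mechanism, using C++11 `thread_local` (the release note describes the equivalent GCC `__thread` marking); the names here are invented for the example, not AMPI source:

```cpp
#include <cassert>
#include <thread>

// Was a plain global: every rank sharing a process would have clobbered it.
// Marked thread-local, each thread gets an independent copy.
thread_local int private_counter = 0;

// Stand-in for the work one virtualized rank performs.
void rank_work(int iterations, int* result) {
  for (int i = 0; i < iterations; ++i)
    ++private_counter;          // touches only this thread's copy
  *result = private_counter;    // reflects this thread's history alone
}
```

Running `rank_work` on two threads with different iteration counts yields independent results, which is exactly the isolation AMPI needs between ranks.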
- Charm can now be built to only support message priorities of a specific data
  type. This enables an optimized message queue within the runtime
  system. Typical applications with medium sized compute grains may not benefit
  noticeably when switching to the new scheduler. However, this may permit
  further optimizations in later releases.

  The new queue is enabled by specifying the data type of the message
  priorities while building charm using --with-prio-type=dtype. Here, dtype can
  be one of char, short, int, long, float, double and bitvec. Specifying bitvec
  will permit arbitrary-length bitvector priorities, and is the current default
  mode of operation. However, we may change this in a future release.
- Converse now provides a complete set of wrappers for
  fopen/fread/fwrite/fclose to handle EINTR, which is not uncommon on the
  increasingly-popular Lustre. They are named CmiF{open,read,write,close}, and
  are available from C and C++ code.
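The pattern such wrappers apply is to retry the underlying call whenever it fails with `errno == EINTR` (the call was interrupted by a signal before completing). The helpers below are an illustrative reimplementation of that pattern, not the Converse source; only the wrapper names CmiF{open,read,write,close} come from the release note.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Retry fopen while it fails due to an interrupting signal.
FILE* retrying_fopen(const char* path, const char* mode) {
  FILE* f;
  do {
    f = std::fopen(path, mode);
  } while (f == NULL && errno == EINTR);   // interrupted: just try again
  return f;
}

// Retry fread until the requested items arrive, a real error occurs,
// or end-of-file is reached.
size_t retrying_fread(void* buf, size_t size, size_t count, FILE* f) {
  size_t done = 0;
  while (done < count) {
    size_t got = std::fread((char*)buf + done * size, size, count - done, f);
    done += got;
    if (got == 0) {
      if (std::ferror(f) && errno == EINTR) { std::clearerr(f); continue; }
      break;   // genuine error or end of file
    }
  }
  return done;
}
```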
- The utility class 'CkEntryOptions' now permits method chaining for cleaner
  usage. This applies to all its set methods (setPriority, setQueueing,
  setGroupDepID). Example usage can be found in examples/charm++/prio/pgm.C.
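Method chaining works by having each setter return a reference to the object itself. A minimal sketch of the pattern on a hypothetical stand-in class (not CkEntryOptions itself):

```cpp
#include <cassert>
#include <string>

// Each setter returns *this, so configuration calls compose in one statement,
// the style CkEntryOptions' set methods now support.
class Options {
  int priority_ = 0;
  std::string queueing_ = "fifo";
public:
  Options& setPriority(int p)         { priority_ = p; return *this; }
  Options& setQueueing(std::string q) { queueing_ = std::move(q); return *this; }

  int priority() const                { return priority_; }
  const std::string& queueing() const { return queueing_; }
};
```

With this shape, a caller can write `opts.setPriority(-100).setQueueing("lifo");` instead of two separate statements.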
- When creating groups or chare arrays that depend on the previous construction
  of another such entity on the local PE, it is now possible to declare that
  dependence to the runtime. Creation messages whose dependence is not yet
  satisfied will be buffered until it is.
- For any given chare class Foo and entry method Bar, the supporting class's
  member CkIndex_Foo::Bar() is used to look up/specify the entry method
  index. This release adds a newer API for such members where the argument is a
  function pointer of the same signature as the entry method. Those new
  functions are used like CkIndex_Foo::idx_Bar(&Foo::Bar). This permits entry
  point index lookup without instantiating temporary variables just to feed the
  CkIndex_Foo::Bar() methods. In cases where Foo::Bar is overloaded, &Foo::Bar
  must be cast to the desired type to disambiguate it.
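The disambiguation rule is plain C++: taking the address of an overloaded member function is ambiguous until the context pins down one member-function type. The sketch below uses stand-in names (Foo, invoke41), not the generated CkIndex_ classes, to show why the cast is needed at a call site like CkIndex_Foo::idx_Bar(&Foo::Bar):

```cpp
#include <cassert>

struct Foo {
  int Bar(int x)    { return x + 1; }               // one overload
  int Bar(double x) { return static_cast<int>(x); } // another
};

// Deduces T from the member pointer alone, like an idx_-style lookup that
// takes a function pointer of the entry method's signature. Passing a bare
// &Foo::Bar here would not compile: both overloads match the pattern, so
// deduction is ambiguous until a static_cast selects one.
template <typename T>
int invoke41(int (Foo::*method)(T)) {
  Foo f;
  return (f.*method)(static_cast<T>(41));
}
```

Usage: `invoke41(static_cast<int (Foo::*)(int)>(&Foo::Bar))` selects the int overload; casting to `int (Foo::*)(double)` selects the other.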
- CkReduction::reducerType now has PUP methods defined, and can hence be
  passed as a parameter-marshalled argument to entry methods.

- The runtime option +stacksize for controlling the allocation of user-level
  threads' stacks now accepts shorthand annotations such as 1M.
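Expanding such a shorthand is a small suffix-parsing step. The helper below is a hypothetical sketch of the idea, not the Charm++ implementation of +stacksize parsing:

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>

// Parse a size string such as "1M", "64K", or "4096" into a byte count.
long parse_size(const char* s) {
  char* end = NULL;
  long value = std::strtol(s, &end, 10);        // leading decimal number
  switch (std::toupper((unsigned char)*end)) {  // optional unit suffix
    case 'K': return value << 10;
    case 'M': return value << 20;
    case 'G': return value << 30;
    default:  return value;                     // no suffix: plain bytes
  }
}
```

So `parse_size("1M")` yields 1048576, the stack size a user would otherwise spell out in full.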
- The -optimize flag to the charmc compiler wrapper now passes more aggressive
  options to the various underlying compilers than the previous '-O'.

- The charmc compiler wrapper now provides a flag -use-new-std to enable
  support for C11 and C++11 where available. To use this in application code,
  the runtime system must have been built with that flag as well.
- When using CmiMemoryUsage(), the runtime can be instructed not to use the
  underlying mallinfo() library call, which can be inaccurate in settings where
  usage exceeds INT_MAX. This is accomplished by setting the environment
  variable "MEMORYUSAGE_NO_MALLINFO".
- Experimental Features
  * Initial implementation of a fast message-logging protocol. Use option
    'mlogft' to build it.
  * Message compression support for persistent messages on the Gemini machine layer.
  * Node-level inter-PE loop/task parallelization is now supported through
    CkLoop.
  * New temperature/CPU frequency aware load balancer
  * Support interoperation of Charm++ and native MPI code through dynamically
    switching control between the two
  * API in centralized load balancers to get and set PE speed
  * A new scheme for optimization of double in-memory checkpoint/restart.
  * Message combining library for improved fine-grained communication

  * Support for partitioning of allocated nodes into subsets that run
    independent Charm++ instances but can interact with each other.
Platform-Specific Changes
-------------------------

- Cray XE/XK (Gemini):
  * The gemini_gni network layer has been heavily tuned and optimized,
    providing substantial improvements in performance and scalability.
  * The gemini_gni-crayxe machine layer supports a 'hugepages' option at build
    time, rather than requiring manual configuration file editing.
  * Persistent message optimizations can be used to reduce latency and
    overhead.
  * Experimental support for 'urgent' sends, which are sent ahead of any other
    outgoing messages queued for transmission.

- IBM Blue Gene Q: Experimental machine-layer support for the native PAMI
  interface and MPI, with and without SMP support. This supports many new
  systems, including LLNL's Sequoia, ALCF's Mira, and FZ Juelich's Juqueen.

  There are three network-layer implementations for these systems: 'mpi',
  'pami', and 'pamilrts'. The 'mpi' layer is stable, but its performance and
  scalability suffer from the additional overhead of using MPI rather than
  driving the interconnect directly. The 'pami' layer is well tested for NAMD,
  but has shown instability for other applications. It is likely to be replaced
  by the 'pamilrts' layer, which is more generally stable and seems to provide
  the same performance, in the next release.

  In addition to the common 'smp' option to build the runtime system with
  shared memory support, there is an 'async' option which sometimes provides
  better performance on SMP builds. This option passes tests on 'pamilrts', but
  is still experimental.

  Note: Applications that have a large number of messages may crash in the
  default setup due to overflow in the low-level FIFOs. The environment
  variables MUSPI_INJFIFOSIZE and PAMI_RGETINJFIFOSIZE can be set to avoid
  application failures due to a large number of small and large messages,
  respectively. The default value of these variables is 65536, which is
  sufficient for 1000 messages.
- Infiniband Verbs: Better support for more flavors of ibverbs libraries
  * Experimental rendezvous protocol for better performance above some MPI
  * Some tuning parameters ("+dynCapSend" and "+dynCapRecv") are now
    configurable at job launch, rather than at Charm++ compilation.

- PGI C++: Disable automatic 'using namespace std;'

- Charm++ now supports ARM, both non-smp and smp.

- Mac OS X: Compilation options to build and link correctly on newer versions
  of the OS.
596 What's new in Charm++ 6.4.0
597 ================================================================================
599 --------------------------------------------------------------------------------
601 --------------------------------------------------------------------------------
603 - Cray XE and XK systems using the Gemini network via either MPI
604 (mpi-crayxe) or the native uGNI (gemini_gni-crayxe)
606 - IBM Blue Gene Q, using MPI (mpi-bluegeneq) or PAMI (pami-bluegeneq)
608 - Clang, Cray, and Fujitsu compilers
610 - MPI-based machine layers can now run on >64k PEs
612 --------------------------------------------------------------------------------
614 --------------------------------------------------------------------------------
616 - Added a new [reductiontarget] attribute to enable
617 parameter-marshaled recipients of reduction messages
619 - Enabled pipelining of large messages in CkMulticast by default
621 - New load balancers added:
624 * Scotch graph partitioning based: ScotchLB and Refine and Topo variants
627 - Load balancing improvements:
629 * Allow reduced load database size using floats instead of doubles
630 * Improved hierarchical balancer
631 * Periodic balancing adapts its interval dynamically
632 * User code can request a callback when migration is complete
633 * More balancers properly consider object migratability and PE
634 availability and speed
635 * Instrumentation records multicasts
637 - Chare arrays support options that can enable some optimizations
639 - New 'completion detection' library for parallel process termination
640 detection, when the need for modularity excludes full quiescence
643 - New 'mesh streamer' library for fine-grain many-to-many collectives,
644 handling message bundling and network topology
646 - Memory pooling allocator performance and resource usage improved
649 - AMPI: More routines support MPI_IN_PLACE, and those that don't check
652 ================================================================================
653 What's new in Charm++ 6.2.1 (since 6.2.0)
654 ================================================================================
656 --------------------------------------------------------------------------------
657 New Supported Platforms:
658 --------------------------------------------------------------------------------
660 POWER7 with LAPI on Linux
662 Infiniband on PowerPC
664 --------------------------------------------------------------------------------
666 --------------------------------------------------------------------------------
668 - Better support for multicasts on groups
669 - Topology information gathering has been optimized
670 - Converse (seed) load balancers have many new optimizations applied
671 - CPU affinity can be set more easily using +pemap and +commap options
672 instead of the older +coremap
673 - HybridLB (hierarchical balancing for very large core-count systems)
674 has been substantially improved
675 - Load balancing infrastructure has further optimizations and bug fixes
676 - Object mappings can be read from a file, to allow offline
677 topology-aware placement
678 - Projections logs can be spread across multiple directories, speeding
679 up output when dealing with thousands of cores (+trace-subdirs N
680 will divide log files evenly among N subdirectories of the trace
681 root, named PROGNAME.projdir.K)
682 - AMPI now implements MPI_Issend
683 - AMPI's MPI_Alltoall uses a flooding algorithm more agressively,
684 versus pairwise exchange
685 - Virtualized ARMCI support has been extended to cover the functions
688 --------------------------------------------------------------------------------
689 Architecture-specific changes
690 --------------------------------------------------------------------------------
692 - LAPI SMP has many new optimizations applied
694 - Net builds support the use of clusters' mpiexec systems for job
695 launch, via the ++mpiexec option to charmrun
697 ================================================================================
698 What's new in Charm++ 6.2.0 (since 6.1)
699 ================================================================================
701 --------------------------------------------------------------------------------
702 New Supported Platforms:
703 --------------------------------------------------------------------------------
705 64-bit MIPS, such as SiCortex, using mpi-linux-mips64
707 Windows HPC cluster, using mpi-win32/mpi-win64
709 Mac OSX 10.6, Snow Leopard (32-bit and 64-bit).
711 --------------------------------------------------------------------------------
713 --------------------------------------------------------------------------------
716 - Smarter build/configure scripts
717 - A new interface for model-based load balancing
718 - new CPU topology API
719 - a general implementation of CmiMemoryUsage()
720 - Bug fix: Quiescence detection (QD) works with immediate messages
721 - New reduction functions implemented in Converse
722 - CCS (Converse Client-Server) can deliver message to more than one processor
723 - Added a memory-aware adaptive scheduler, which can be optionally
725 - Added preliminary support for automatic message prioritization
726 (disabled by default)
729 - Cross-array and cross-group sections
730 - Structured Dagger (SDAG): Support templated arguments properly
731 - Plain chares support checkpoint/restart (both in-memory and disk-based)
732 - Conditional packing of messages and parameters in SMP scenario
733 - Changes to the CkArrayIndex class hierarchy
734 -- sizeof() all CkArrayIndex* classes is now the same
735 -- Codes using custom array indices have to use placement-new to construct
736 their custom index. Refer example code: examples/charm++/hello/fancyarray/
737 -- *** Backward Incompatibility ***
738 CkArrayIndex[4D/5D/6D]::index are now of type int (instead of short)
739 However the data is stored as shorts. Access by casting
740 CkArrayIndexND::data() appropriately
741 -- *** Deprecated ***
742 The direct use of public data member
743 CkArrayIndexND::index (N=1..6) is deprecated. We reserve the right to
744 change/remove this variable in future releases of Charm++.
745 Instead, please access the indices via member function:
746 int CkArrayIndexND::data()
- Compilers renamed to avoid collision with host MPI (ampicc, ampiCC,
- Improved MPI standard conformance, and documentation of non-conformance
  * Bug fixes in: MPI_Ssend, MPI_Cart_shift, MPI_Get_count
  * Support MPI_IN_PLACE in MPI_(All)Reduce
  * Define various missing constants
- Return the received message's tag in response to a non-blocking
  wildcard receive, to support SuperLU
- Improved tracing for BigSim
Multiphase Shared Arrays (MSA)
- Typed handles to enforce phases
- Split-phase synchronization to enable message-driven execution
- Automatic tracing of API calls for simulation and analysis

- Wider support for architectures other than net- (in particular MPI layers)
- Improved support for large-scale debugging (better scalability)
- Enhanced record/replay stability to handle various events, and to
  signal unexpected messages
- New detailed record/replay: The full content of messages can be
  recorded, and a single processor can be re-executed outside of the

- Tracing of nested entry methods
Automatic Performance Tuning
- Created an automatic tuning framework [still for experimental use only]

- Network-topology / node aware spanning trees used internally for
  lower bytes on the network and improved performance in multicasts and
  reductions delegated to this library

- Improved OneTimeMulticastStrategy classes

- Out-of-core support, with prefetching capability
- Detailed tracing of MPI calls
- Detailed record/replay support at emulation time, capable of
  replaying any emulated processor after obtaining recorded logs.
--------------------------------------------------------------------------------
Architecture-specific changes
--------------------------------------------------------------------------------

- Can run jobs with more than 1024 PEs

- New charmrun option ++no-va-randomization to disable address space
  randomization (ASLR). This is most useful for running AMPI with

- Default to using ampicxx instead of mpiCC

- The +p option now has the same semantics as in other smp builds

- Support for VSX in the SIMD abstraction API

- Compilers and options have been updated to the latest ones

- Added routines for measuring performance counters on BG/P.
- Updated to support the latest DCMF driver version. On ANL's Intrepid, you may
  need to set BGP_INSTALL=/bgsys/drivers/V1R4M1_460_2009-091110P/ppc in your
  environment. This is the default on ANL's Surveyor.

- cputopology information is now available on XT3/4/5

- Bug fix: plug memory leaks that caused failures in long runs
- Optimized to reduce startup delays

- Support for SMP (experimental)
================================================================================
Note that changes from 5.9, 6.0, and 6.1 are not documented here. A partial list
can be found on the charm download page, or by reading through version control.
================================================================================

================================================================================
What's New since Charm++ 5.4 release 1
================================================================================
--------------------------------------------------------------------------------
New Supported Platforms:
--------------------------------------------------------------------------------

1. Charm++ ported to IA64 Itanium running Win2K and Linux; Charm++ also
   supports Intel C/C++ compilers.

2. Charm++ ported to Power Macintosh PowerPC running Darwin.

3. Charm++ ported to Myrinet networking with the GM API.

--------------------------------------------------------------------------------
Summary of New Features:
--------------------------------------------------------------------------------
Structured Dagger is a coordination language built on top of CHARM++.
Structured Dagger allows easy expression of dependences among messages and
computations, and also among computations within the same object, using
when-blocks and various structured constructs.

2. Entry functions support parameter marshalling
   Now you can declare and invoke remote entry functions using parameter
   marshalling instead of defining messages.
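A minimal sketch of this style (the chare and method names here are
hypothetical, not from this release): the entry method takes its arguments
directly in the interface file, and the runtime marshals them on invocation.

```
// Hypothetical .ci interface file -- no message type is declared
chare Greeter {
  entry Greeter();
  entry void greet(int n, double values[n]);
};

// Hypothetical C++ call site: arguments are marshalled automatically
double vals[3] = {0.5, 1.5, 2.5};
greeterProxy.greet(3, vals);
```

The translator generates the packing code that a hand-written message type
would otherwise have provided.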
3. Easier running - standalone mode
   For net-* versions running locally, you can now run Charm programs without
   charmrun. Running a node program directly from the command line is the
   same as "charmrun +p1 <program>"; for the SMP version, you can also specify
   multiple (local) processors, as in "program +p2".
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
1. "build" changed for compilation of Charm++
   To build Charm++ from scratch, the build script now takes additional command
   line options to compile with add-on features and to use compilers other than
   gcc. For example, to build for Linux IA64 with Myrinet support, type:
   ./build net-linux-ia64 gm

******* Old Change Histories *******
================================================================================
What's New in Charm++ 5.4 release 1 since 5.0
================================================================================

--------------------------------------------------------------------------------
New Supported Platforms:
--------------------------------------------------------------------------------

1. Win9x/2000/NT: with Visual C++ or Cygwin gcc/g++, you can compile and run
   Charm++ programs on all Win32 platforms.

2. Scyld Beowulf: Charm++ has been ported to the Linux-based Scyld Beowulf
   operating system. For more information on Scyld, see <http://www.scyld.com>

3. MPI with VMI: Charm++ has been ported to NCSA's Virtual Machine Interface,
   which is an efficient messaging library for heterogeneous cluster

--------------------------------------------------------------------------------
Summary of New Features:
--------------------------------------------------------------------------------
1. Dynamic Load Balancing:
   Chare migration is supported in the new release. A migration-based dynamic
   load balancing framework with a library of various load balancing strategies
   has been implemented.

   Charm++ arrays are supported. You can now create an array of Chare objects
   and use an array index to refer to Charm++ array elements. A reduction
   library on top of Chare arrays has been implemented and included.

   Projections, a Java application for Charm++ program performance analysis
   and visualization, has been included and distributed in the new release. Two
   trace modes are available: trace-projections and trace-summary. Trace-summary
   is a lightweight trace library compared to trace-projections.

   AMPI is a load-balancing-based library for porting legacy MPI applications
   to Charm++. With only a few changes to port the original MPI code to AMPI,
   a legacy MPI application on Charm++ gains Charm++'s adaptive
   load balancing ability.

   "Charmrun" is now available on all platforms, with a uniform command line
   syntax. You can forget the differences between net-* versions and MPI
   versions, and run a Charm++ application with the same charmrun command
   syntax. A ++local option has been added to charmrun for net-* versions; it
   provides simple local use of Charm and no longer requires the ability to
   "rsh localhost" or a nodelist file in order to run Charm only on the local
   machine. This is especially attractive when you run Charm++ on Windows.

   Many new libraries have been added in this release. They include:
   1) master-slave library: for writing manager-worker paradigm programs.
   2) receiver library: provides an asynchronous communication mode for
      chare arrays.
   3) f90charm: provides Fortran90 bindings for Charm++ Arrays.
   4) BlueGene: a Charm++/Converse emulator for IBM's proposed Blue Gene.
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
1. Message declaration syntax in .ci files:
   The message declaration syntax for packed/varsize messages has been changed.
   The packed/varsize keywords are eliminated, and you can specify the actual
   varsize arrays in the interface file and have the translator generate
   alloc, pack, and unpack.
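As an illustration of the new style (the message and field names here are
hypothetical): the varsize arrays are declared directly in the interface file,
and their sizes are passed to the generated allocator.

```
// Hypothetical .ci declaration -- the translator generates
// alloc, pack, and unpack for the varsize arrays
message ParticleMsg {
  int id[];
  double coord[];
};

// Hypothetical C++ allocation: one size per varsize array
int sizes[2] = {8, 24};                         // 8 ints, 24 doubles
ParticleMsg *msg = new (sizes, 0) ParticleMsg;  // 0 = priority bits
```

This replaces the old packed/varsize keywords with ordinary array
declarations in the message body.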
Here is the detailed list of changes:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------

10/06/1999 rbrunner   Added migration-based dynamic load balancing
11/15/1999 olawlor    Added reduction support for Charm++ arrays
02/06/2000 milind     Added AMPI, an implementation of MPI with
                      dynamic load balancing
02/18/2000 paranjpy   New platforms supported: net-win32, and net-win32-smp
04/04/2000 olawlor    Added arbitrarily indexed Charm++ arrays.
                      Also, added translator support for new arrays.
04/15/2000 olawlor    Added "puppers" for packing and unpacking
06/14/2000 milind     Added the threaded FEM framework.

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
10/09/1999 rbrunner   Added packlib, a library for C and C++ to
                      pack-unpack data to/from Charm++ messages.
10/13/1999 gzheng     New LB strategy: RefineLB
10/13/1999 paranjpy   New LB strategy: Heap
10/14/1999 milind     New LB strategy: Metis
10/19/1999 olawlor    New test program for testing LB strategies.
10/21/1999 gzheng     New trace mode: trace-summary
10/28/1999 milind     New supported platform: net-sol-x86
10/29/1999 milind     Added runtime checks for ChareID assignment.
11/10/1999 rbrunner   Added Neighborhood-based strategy for LB
11/15/1999 olawlor    conv-host now reads in a startup file
11/15/1999 olawlor    New test program for testing array reductions.
11/16/1999 rbrunner   Added processor-speed checking functions to
11/19/1999 milind     Mapped SIGUSR to a Ccd condition handler
11/22/1999 rbrunner   New LB strategy: WSLB
11/29/1999 ruiliu     Modified Metis LB strategy to deal with
                      different processor speeds
12/16/1999 rbrunner   New LB strategy: GreedyRef
12/16/1999 rbrunner   New LB strategy: RandRef
12/21/1999 skumar2    New LB strategy: CommLB
01/03/2000 rbrunner   New LB strategy: RecBisectBfLB
01/08/2000 skumar2    New LB strategy: Comm1LB, with varying processor
                      speeds
01/18/2000 milind     Modified SM library syntax, and added a test
01/19/2000 gzheng     Added irecv, a library to simplify conversion
                      of message-passing programs to Charm++
02/20/2000 olawlor    Added preliminary broadcast support to Charm++
02/23/2000 paranjpy   Added converse-level quiescence detection
03/02/2000 milind     Added ++server-port option to pre-specify
03/10/2000 wilmarth   Random seed-based load balancer now uses a
                      bit-vector for active PEs.
03/21/2000 gzheng     Added support for marking user-defined events
03/28/2000 wilmarth   Added CMK_TRUECRASH. Very helpful for
                      post-mortem debugging of Charm++ programs on
03/31/2000 jdesouza   Added Fortran90 support to the Charm++
                      interface translator.
03/09/2000 milind     Added support for -LANG and -rpath options
                      in charmc for Origin2000.
04/28/2000 milind     Added prioritized converse threads.
05/01/2000 milind     Added test programs for TeMPO, AMPI and irecv.
05/04/2000 milind     New supported platform: mpi-sp.
05/04/2000 gzheng     Added irecv pingpong program.
05/17/2000 olawlor    Each chare, group and array element now has to
                      have a migration constructor.
05/24/2000 milind     Added Jacobi3D programs for both irecv and AMPI.
05/24/2000 milind     Made migratable an optional attribute of
                      chares, groups, and nodegroups.
                      Arrays are migratable by default.
05/29/2000 paranjpy   Added pup methods to arrays, reductions, etc.
06/13/2000 milind     Made CtvInitialize idempotent. That is, it
                      can now be called by any number of threads;
                      only the first one will actually do
06/20/2000 milind     Added a simple test program for the FEM
07/06/2000 milind     Imported Metis 4.0 sources into the CVS tree.
                      Also added Makefile code to build the Metis
                      libraries and executables.
07/07/2000 milind     Added more meaningful error messages using
                      perror in addition to cryptic error codes in
07/10/2000 milind     fem and femf are now recognized as "languages"
07/10/2000 saboo      Added the derived datatypes library.
07/13/2000 milind     Added +idle_timeout functionality. It takes a
                      commandline parameter denoting milliseconds of
                      maximum consecutive idle time allowed per
07/14/2000 milind     Added group multicast. Added
                      CkSendMsgBranchMulti, CldEnqueueMulti, and
                      translator changes to support it.
07/14/2000 milind     SUPER_INSTALL now takes "-*" arguments prior
                      to the target, which are passed to make as
                      "makeflags". This makes it easy to suppress
                      make's output of commands etc. (with the -s
                      flag). As a result of this, several Makefiles
07/18/2000 milind     Added support for using "dbx" on suns as
07/19/2000 milind     Added the ability for tracemode projections to
                      produce binary trace files. Use the flag
                      +binary-trace on the command line.
07/26/2000 milind     Separated AMPI from TeMPO.
07/28/2000 milind     Added test programs to test the reduce, alltoall
                      and allreduce functionality of AMPI.
08/02/2000 milind     Added an option to let the user specify which
                      "xterm" to use. For example, on some systems
                      (CDE), only dtterm is installed. So, by
                      putting ++xterm dtterm on the conv-host
                      commandline, one can use dtterm when the
                      ++in-xterm option is specified on the conv-host
                      commandline.
08/14/2000 milind     FEM Framework: Added capabilities to handle
                      esoteric meshes to standalone offline programs.
                      The Makefile now produces gmap and fgmap
                      programs, which are used for this purpose. They
                      convert the mesh to a graph before partitioning it
08/24/2000 milind     Added the 2D crack propagation program as a
                      test program for the FEM framework.
08/25/2000 milind     Initial implementation of isomalloc-based
                      threads. This implementation uses a fixed
                      stack size for all threads (can be set at
08/26/2000 milind     Added a macro CtvAccessOther that lets you
                      get/set a Ctv variable of any thread. It
                      should be invoked as CtvAccessOther(thread,
                      varname). Added a CthGetData function to each
                      of the threads implementations. This function
                      is used in the CtvAccessOther macro.
08/27/2000 milind     FEM Framework: Separated the mesh-to-graph
                      conversion capability into a separate program.
                      This way, the generated graph can be partitioned
09/04/2000 milind     Added class static readonly variables to
09/05/2000 milind     FEM Framework: A very fast O(n) algorithm for
                      mesh2graph; it uses more memory, but the tradeoff
                      was worth it. Coded by Karthik Mahesh, minor
                      optimizations by Milind.
09/05/2000 milind     Added a barebones charm kernel scheduling
                      overhead measurement program.
09/15/2000 milind     Added pup support for AMPI and the FEM framework.
09/20/2000 olawlor    Added the capability to have an array of base
                      type where individual elements could be of derived
10/03/2000 gzheng     New supported platform: net-linux-axp
10/05/2000 skumar2    Added program littleMD to the test suite.
10/07/2000 skumar2    New job scheduler (Faucets project).
10/15/2000 milind     Improved support for Fortran90 in charmc.
11/04/2000 jdesouza   Made the Faucets scheduler multi-threaded.
11/05/2000 olawlor    FEM Framework: supports multiple element types,
                      mesh re-assembly, etc.
11/15/2000 gzheng     New platform support: net-cygwin
11/18/2000 gzheng     conv-host no longer needs /bin/csh to start
                      CMK_CONV_HOST_CSH_UNAVAILABLE to 1 to use
11/25/2000 milind     Finished an experimental implementation of
                      Converse threads based on co-operative pthreads.
11/25/2000 milind     Added a benchmark suite of all pingpongs in
11/28/2000 milind     Removed deletion of _idx at the end of every
                      send or doneInserting call. It is now deleted
                      in the destructor of the proxy. This allows us
                      to cache proxies, when proxy creation becomes
11/28/2000 olawlor    Added "seek blocks" to puppers. This should
                      allow out-of-order pup'ing without the ugliness
                      of getBuf, and in a way that works with all
11/29/2000 olawlor    Simplified and regularized command-line-argument
11/29/2000 milind     AMPI: Added multiple-communicators capability.
12/05/2000 gzheng     /bin/sh is now the default shell used to fork
                      the node program on remote machines.
12/13/2000 olawlor    Added charmrun wrapper for poe on mpi-sp.
12/14/2000 milind     Added bluegene emulator sources and test
                      programs. Added "bluegene" as a language known
                      to charmc. The Makefile now has a target called
                      bluegene. Added preliminary bluegene
                      documentation. (copied from Arun's webpage.)
12/15/2000 gzheng     Added f90charm to the Makefile and charmc. Also
                      added fixed-size array support to f90charm. A
                      test program f90charm/hello is checked in.
12/17/2000 milind     Added rtest test program. Contributed by jim to
                      test Converse message transmission.
12/20/2000 olawlor    Added charmconfig script. Enables automatic
                      determination of C++ compiler properties,
                      replacing the verbose and error-prone
                      conv-mach.h entries for CMK_BOOL,
                      CMK_STL_USE_DOT_H, CMK_CPP_CAST_OK, ...
12/20/2000 olawlor    Charm++ Arrays optimizations: Key and object are
                      now variable-length fields, instead of pointers.
                      This extra flexibility lets us save many
                      dynamic allocations in the array framework.
12/20/2000 olawlor    Added PUP::able support-- dynamic type
                      identification, allocation, and deletion.
                      Allows you to write p(objPtr); and
                      objPointer will be properly identified,
                      allocated, packed, and deallocated (depending
                      on the PUP::er). Requires you to register any
                      such classes with DECLARE_PUPable and
12/20/2000 olawlor    Arrays optimizations: Made CkArrayIndex
                      fixed-size. This significantly improves
                      messaging speed (7 us instead of 10 us
                      roundtrip). Moved the spring cleaning check into
                      a CcdCallFnAfter, which gains more speed (down to
12/20/2000 olawlor    More optimizations: Minor speed tweaks--
                      conv-ccs.c uses a hashtable for handler lookup;
                      conv-conds skips the timer test until needed;
                      convcore.c scheduler loop optimizations (no
                      superfluous EndIdle calls); threads.c
                      CMK_OPTIMIZE -> no mprotect.
12/20/2000 olawlor    More optimizations: Minor speed tweaks-- ck.C
                      groups cldEnqueue skip; init.h defines
                      CkLocalBranch inline; and supporting changes.
12/22/2000 gzheng     IA64 support for Converse user-level threads.
01/02/2001 olawlor    CCS: Minor update-- enabled CcsProbe, cleaned up
                      superfluous debug messages in the server, added
                      a Java interface (originally written for
01/09/2001 gzheng     charmconfig converted to autoconf style; need
                      to change configure.in and conv-autoconfig.h.in,
                      and run autoconf to get configure and copy it to
                      charmconfig. Added a fortran subroutine name
                      test, and get libpthread.a
01/10/2001 milind     Added a telnet method of getting libpthread.a
                      from the charm webserver.
01/11/2001 olawlor    Moved projections files here from
                      CVSROOT/projections-java. Added fast Java
                      versions of the .log file input routines in
                      LogReader, LogLoader, LogAnalyzer, and
                      UsageCalc. Added "U.java" user interface
                      utility file, allowing times to be input in
                      seconds, milliseconds, or microseconds,
                      instead of just microseconds.
01/15/2001 gzheng     Added +trace-root to specify the directory to
                      put log files in. This is needed on Scyld
                      clusters, where there is no NFS mounting and no
                      i/o access to a shared home directory on the
                      nodes.
01/15/2001 milind     Made AMPI into an f90 module instead of an
                      'ampif.h' inclusion. AMPI f90 bindings are
                      now more inclusive. Fixed argc,argv handling
                      bugs in the ArgsInfo message. Fixed a bug in pup
                      that caused the thread not to be sized, but
                      packed nevertheless. Moved irecv to waitall
                      instead of in ampi_start. Made
                      AMPI_COMM_WORLD 0, because it clashed
                      with the wildcard (-1). AMPI_COMM_UNIVERSE is
                      now handled properly in the AMPI module.
                      C/C++ data members are NOT visible to
01/18/2001 gzheng     New supported platform: net-linux-scyld
01/20/2001 olawlor    Moved the array index field from CMessage_* to
                      the Ck envelope itself. This is the right thing
                      to do, because any message may be sent to/from
                      an array element. To reduce the wasted space
                      in a message, a union is used to overlay the
                      fields for the various possible message types.
01/29/2001 olawlor    Freed charmrun on net-* versions from using a
                      remote shell to fork off processes. One can now
                      use a daemon provided in the distribution.
02/07/2001 olawlor    Added debugging support to puppers.
02/13/2000 gzheng     Added ++local option to charmrun to start the
                      node program locally without any daemon; fixed
                      the hang if you type a wrong program name in the
                      scyld version, and redirected all output to
                      /dev/null, since otherwise every node program
                      would send its output to the console on scyld.
                      Also implemented ++local in the net-win32 version.
02/26/2000 milind     Changed the varsize syntax. Now one can specify
                      actual varsize arrays in the interface file
                      and have the translator generate alloc, pack
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------

10/29/1999 milind     Replaced jmemcpy by memcpy in net versions, as
                      it was causing a bit to flip (bug reported
10/29/1999 milind     Fixed multiline macros in all header files.
02/05/2000 milind     Fixed linking errors by getting the order of
                      libraries right on the charmc command line.
02/18/2000 paranjpy   Fixed Charm++ initialization bug on SMPs.
02/21/2000 milind     Fixed a context-switching bug in the mipspro version
02/25/2000 milind     The Charm++ interface translator was segfaulting
                      on interface file errors. Fixed that. Also
                      added line numbers to error messages.
03/02/2000 milind     Made CCS work on SMPs.
03/07/2000 milind     Made ConverseInit consistent with the manual on
04/18/2000 milind     Fixed a bug in CkWaitFuture, which was caching
                      a variable locally, while it was changed by
05/04/2000 paranjpy   Fixed argv deletion bug on net-win32-smp.
06/08/2000 milind     sp3 version: changed optimization flags, which
                      were power2 processor-specific.
06/20/2000 milind     mpi-* versions: Fixed ConverseExit, since it was
                      not obeying the following statement in the MPI
                      standard: The user must ensure that all pending
                      communications involving a process complete
                      before the process calls MPI_FINALIZE.
07/05/2000 milind     Fixed a nasty bug in charmc in the -cp option.
                      It used to append the name provided to the -o
                      flag to the directory provided to the -cp flag.
                      Thus, -o ../pgm -cp ../bin meant that
                      pgm would be copied to ../bin/.., which is
                      not the expected behavior. The fix correctly
                      copies pgm to ../bin.
07/07/2000 milind     Removed the variable arg_myhome, as it was not
                      being used anywhere; also, setting it
                      caused problems if the env var HOME was not set.
07/27/2000 milind     thishandle for the arrayelement was not being
                      set correctly. Bug was reported by Neelam.
08/26/2000 milind     Origin2000: Changed the page alignment to
                      reflect the mmap alignment. The mmap man page
                      specifically states that it is not the same as
09/02/2000 milind     Fixed a bug in code generated for threaded
                      (void) entry methods of array elements. The
                      dummy message that is passed to that method in
                      a thread has to be deleted before calling the
                      object method, because upon the object method's
                      return, the thread might have migrated.
09/03/2000 olawlor    Minor fixes: 1.) A change to the LBObjid hash
                      function would fail for >4-int object indices.
                      Replaced with a proper function, which also
                      preserves the 1-int case. 2.) Array element
                      sends must go via the message queue to prevent
                      stack build-up for deep single-processor call
                      chains. These might happen, e.g., in a driver
                      element calling itself for the main time loop.
                      Messages are now properly noted as sent, then
                      wait in the queue for delivery. This
                      entailed minor reorganization of the message
09/21/2000 olawlor    Tiny SMP thread fix-- registrations of a
                      thread-private variable now reserve space on
                      calls after the first. This wastes space for
                      multiple CthInitialize's-- it's a quick hack to
                      get threads working again on SMP versions.
10/16/2000 olawlor    A few CCS fixes: -Added split-phase reply
                      (delay reply indefinitely) -Cleaned up error
                      handling -Pass user data as "void *" instead of
11/03/2000 wilmarth   Removed 0-size array allocation in Charm++
                      quiescence detection.
11/20/2000 gzheng     Rewrote part of Fiber threads, including a bug
                      fix for a non-thread-safe function, and a
                      different fiber free strategy.
11/29/2000 gzheng     The LB init procedure tried to allocate
                      65536*160 bytes as the initial size of the
                      communication table, which is 10M of memory and
                      too big. Cut it down to roughly 1M, and it can
                      expand
12/05/2000 gzheng     In many cases, conv-host exits without printing
                      out the error message from the remote shell. Try
                      to fix it by calling sync to flush the pipe
12/10/2000 milind     net-linux: Made static linking the default
                      option, because the dynamic linking runtime
                      causes isomalloc threads to crash.
12/18/2000 milind     Increased the portability of isomalloc threads
                      by removing the dependence on alloca.
12/28/2000 milind     Fixed ctrl-getone abort bug on SMP.
12/28/2000 milind     Made _groupTable a pointer on which a
                      constructor is explicitly called. Since it
                      was a Cpv variable, its constructor was not
                      called by default in the SMP version.
12/29/2000 olawlor    Prevent infinite copy-constructor recursion on
01/10/2001 olawlor    Added the "explicit" keyword to remove ambiguity
                      for KCC, which was confused by the private
                      PUP::er(int) "cast" constructor and the operator
                      |(PUP::er &p, T &t) into rejecting all operator|
                      (int,int) as ambiguous.
01/17/2001 gzheng     Fixed the charmconfig bug on paragon-red: the
                      failed fortran test won't stop the
01/20/2001 olawlor    Arrays reduction: Fixed a bug-- a reduction may
                      end because all contributors migrate away.
01/29/2001 olawlor    Fixed a heap-corrupting bug-- call ->init() on
                      nodeGroupTable, which sets the "pending"
                      message queue to NULL. This prevents a nasty
                      delete-uninitialized-data bug later on. Also
                      delayed queue creation until messages actually
--------------------------------------------------------------------------------
Documentation Changes:
--------------------------------------------------------------------------------

01/31/2000 milind     Installation manual: Fixed bugs pointed out by
02/28/2000 wilmarth   Added a new-look Charm++ manual.
06/20/2000 milind     Added pdflatex support to generate PDF versions
                      of manuals from LaTeX sources.
12/05/2000 milind     Added Orion's FEM manual. Converted from HTML.
12/10/2000 milind     Added pplmanual.sty for all manuals.
12/17/2000 milind     Added master-slave library documentation to
12/21/2000 saboo      Added DDT documentation.
01/02/2001 olawlor    Updated for the new CCS version.
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------

10/24/1999 olawlor    charmc changed to a Bourne shell script
                      instead of csh. All conv-mach.csh files are
                      replaced by conv-mach.sh.
10/25/1999 olawlor    SUPER_INSTALL converted to use the Bourne shell.
10/28/1999 milind     All Makefiles now take OPTS commandline
01/16/2000 olawlor    Simplified the Charm++ interface translator.
02/23/2000 ruiliu     Changed rand() calls throughout the code
                      to the new Converse random number generator.
02/26/2000 milind     Simplified the converse scheduler loop by
                      combining the maxmsgs and poll modes.
08/31/2000 milind     Imported system documentation into the CVS tree.
                      Also added a super_install target for docs, with
                      the necessary Makefile modifications.
09/08/2000 olawlor    Made soft links use relative pathnames instead
                      of absolute ones. This lets you move a charm++
                      installation without having to recompile
09/11/2000 olawlor    Grouped commonly needed code in the new util
                      directory. Also added pup_c, a C wrapper for
09/11/2000 olawlor    Slightly reorganized the header structure. Now
                      no headers should need to be listed twice (once
                      in ALLHEADERS, again in CKHEADERS). Headers are
                      now soft-linked instead of copied. This makes
                      development much easier. Added support for the
                      new Common/util directory.
09/21/2000 olawlor    Major reorganization of the net-* code. Now all
                      the TCP socket routines are in separate files.
                      Also combined the Windows NT code with the unix
                      code.
09/21/2000 olawlor    Major rewrite of CCS-- the underlying protocol
                      is now binary (send/recv binary data
                      everywhere); conv-host forwards requests to
                      nodes; and the source has been significantly
                      re-arranged (especially if NODE_0_IS_CONVHOST).
11/22/2000 milind     Removed the IDL translator from the distribution.
12/01/2000 olawlor    Renamed conv-host charmrun; added a test for the
                      script conv-host. Also added charmrun for most
12/17/2000 milind     Moved list-related data structures into
                      cklists.h in util. Removed most of the redundant
                      list implementations.
12/20/2000 gzheng     SUPER_INSTALL: format the output of the list of
                      versions and make the help page fit on one
12/24/2000 milind     Added test-{charm,converse,ampi,fem} targets to
12/28/2000 milind     net-sol-smp now uses pthreads.
01/29/2001 olawlor    Merged the Windows NT and unix build procedures
                      by basing the Windows build on cygwin. Added
                      scripts to deal with unix and windows