1 This file describes the most significant changes. For more detail, use
2 'git log' on a clone of the charm repository.
4 ================================================================================
5 What's new in Charm++ 6.6.0
6 ================================================================================
8 - Machine target files for Cray XC systems ('gni-crayxc') have been added
10 - Interoperability with MPI code using native communication interfaces on Blue
11 Gene Q (PAMI) and Cray XE/XK/XC (uGNI) systems, in addition to the universal
12 MPI communication interface
14 - Support for partitioned jobs on all machine types, including TCP/IP and IB
15 Verbs networks using 'netlrts' and 'verbs' machine layers
17 - A substantially improved version of our asynchronous library, CkIO, for
18 parallel output of large files
20 - Narrowing the circumstances in which the runtime system will send
21 overhead-inducing ReductionStarting messages
23 - A new fully distributed load balancing strategy, DistributedLB, that produces
24 high quality results with very low latency
26 - An API for applications to feed custom per-object data to specialized load
27 balancing strategies (e.g. physical simulation coordinates)
29 - SMP builds on LRTS-based machine layers (pamilrts, gni, mpi, netlrts, verbs)
30 support tracing messages through communication threads
32 - Thread affinity mapping with +pemap now supports Intel's Hyperthreading more
35 - After restarting from a checkpoint, thread affinity will use new
36 +pemap/+commap arguments
38 - Queue order randomization options were added to assist in debugging race
39 conditions in application and runtime code
41 - The full runtime code and associated libraries can now compile under the C11
42 and C++11/14 standards.
44 - Numerous bug fixes, performance enhancements, and smaller improvements in the
45 provided runtime facilities
48 * The long-unsupported FEM library has been deprecated in favor of ParFUM
49 * The CmiBool typedefs have been deleted, as C++ bool has long been universal
50 * Future versions of the runtime system and libraries will require some degree
51 of support for C++11 features from compilers
53 ================================================================================
54 What's new in Charm++ 6.5.0
55 ================================================================================
57 - The Charm++ manual has been thoroughly revised to improve its organization,
58 comprehensiveness, and clarity, with many additional example code snippets
61 - The runtime system now includes the 'Metabalancer', which can provide
62 substantial performance improvements for applications that exhibit dynamic
63 load imbalance. It provides two primary benefits. First, it automatically
64 optimizes the frequency of load balancer invocation, to avoid work stoppage
65 when it will provide too little benefit. Second, calls to AtSync() are made
66 less synchronous, to further reduce overhead when the load balancer doesn't
67 need to run. To activate the Metabalancer, pass the option +MetaLB at
68 runtime. To get the full benefits, calls to AtSync() should be made at every
69 iteration, rather than at some arbitrary longer interval as was previously
72 - Many feature additions and usability improvements have been made in the
73 interface translator that generates code from .ci files:
74 * Charmxi now provides much better error reports, including more accurate
75 line numbers and clearer reasons for failure, including some semantic
76 problems that would otherwise appear when compiling the C++ code or even at
78 * A new SDAG construct 'case' has been added that defines a disjunction over a
79 set of 'when' clauses: only one 'when' out of a set will ever be triggered.
80 * Entry method templates are now supported. An example program can be found
81 in tests/charm++/method_templates/.
82 * SDAG keyword "atomic" has been deprecated in favor of the newly supported
83 keyword "serial". The two are synonymous, but "atomic" is now provided only
84 for backward compatibility.
85 * It is no longer necessary to call __sdag_init() in chares that contain SDAG
86 code - the generated code does this automatically. The function is left as
87 a no-op for compatibility, but may be removed in a future version.
88 * Code generated from .ci files is now primarily in .def.h files, with only
89 declarations in .decl.h. This improves debugging, speeds compilation,
90 provides clearer compiler output, and enables more complete encapsulation,
91 especially in SDAG code.
92 * Mainchare constructors are expected to take CkArgMsg*, and always have
93 been. However, charmxi would allow declarations with no argument, and
94 assume the message. This is now deprecated, and generates a warning.
96 - Projections tracing has been extended and improved in various ways
97 * The trace module can generate a record of network topology of the nodes in
98 a run for certain platforms (including Cray), which Projections can
100 * If the gzip library (libz) is available when Charm++ is compiled, traces
101 are compressed by default.
102 * If traces were flushed as a result of filled buffers during the run, a
103 warning will be printed at exit to indicate that the user should be wary of
104 interference that may have resulted.
105 * In SMP builds, it is now possible to trace message progression through the
106 communication threads. This is disabled by default to avoid overhead and
107 potential misleading interpretation.
109 - Array elements can be block-mapped at the SMP node level instead of at the
110 per-PE level (option "+useNodeBlkMapping").
112 - AMPI can now privatize global and static variables using TLS. This is
113 supported in C and C++ with __thread markings on the variable declarations
114 and definitions, and in Fortran with a patched version of the gfortran
115 compiler. To activate this feature, append '-tls' to the '-thread' option's
116 argument when you link your AMPI program.
118 - Charm can now be built to only support message priorities of a specific data
119 type. This enables an optimized message queue within the runtime
120 system. Typical applications with medium sized compute grains may not benefit
121 noticeably when switching to the new scheduler. However, this may permit
122 further optimizations in later releases.
124 The new queue is enabled by specifying the data type of the message
125 priorities while building charm using --with-prio-type=dtype. Here, dtype can
126 be one of char, short, int, long, float, double and bitvec. Specifying bitvec
127 will permit arbitrary-length bitvector priorities, and is the current default
128 mode of operation. However, we may change this in a future release.
130 - Converse now provides a complete set of wrappers for
131 fopen/fread/fwrite/fclose to handle EINTR, which is not uncommon on the
132 increasingly-popular Lustre. They are named CmiF{open,read,write,close}, and
133 are available from C and C++ code.
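
The retry-on-EINTR idea behind these wrappers can be sketched as follows. This is a minimal illustration, not the Converse implementation: the names retry_fopen, retry_fwrite and retry_fread are hypothetical, and the real CmiF* functions may handle errors differently. An fclose analogue is left out, because retrying fclose on a stream after it has failed is not safe in standard C.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: retry fopen while it fails with EINTR.
std::FILE* retry_fopen(const char* path, const char* mode) {
    std::FILE* fp = nullptr;
    do {
        errno = 0;
        fp = std::fopen(path, mode);
    } while (fp == nullptr && errno == EINTR);
    return fp;
}

// Hypothetical helper: keep writing until all items are written
// or an error other than EINTR occurs. Returns items written.
std::size_t retry_fwrite(const void* buf, std::size_t size,
                         std::size_t nmemb, std::FILE* fp) {
    assert(buf != nullptr && fp != nullptr);
    const char* p = static_cast<const char*>(buf);
    std::size_t done = 0;
    while (done < nmemb) {
        errno = 0;
        done += std::fwrite(p + done * size, size, nmemb - done, fp);
        if (done < nmemb) {
            if (errno == EINTR) { std::clearerr(fp); continue; }
            break;  // a real error: report the short count
        }
    }
    return done;
}

// Hypothetical helper: keep reading until all items are read,
// end of file is reached, or an error other than EINTR occurs.
std::size_t retry_fread(void* buf, std::size_t size,
                        std::size_t nmemb, std::FILE* fp) {
    assert(buf != nullptr && fp != nullptr);
    char* p = static_cast<char*>(buf);
    std::size_t done = 0;
    while (done < nmemb) {
        errno = 0;
        done += std::fread(p + done * size, size, nmemb - done, fp);
        if (done < nmemb) {
            if (std::ferror(fp) && errno == EINTR) { std::clearerr(fp); continue; }
            break;  // end of file or a real error
        }
    }
    return done;
}
```

Callers would use these exactly like the standard functions, checking the returned item count as usual.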
135 - The utility class 'CkEntryOptions' now permits method chaining for cleaner
136 usage. This applies to all its set methods (setPriority, setQueueing,
137 setGroupDepID). Example usage can be found in examples/charm++/prio/pgm.C.
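
For readers unfamiliar with the pattern, method chaining works by having each setter return a reference to the object itself. The sketch below uses a hypothetical stand-in class, EntryOptionsSketch, with plain int parameters; the real CkEntryOptions uses Charm++ types and is not reproduced here.

```cpp
#include <cassert>

// Hypothetical stand-in for an options object; not the real CkEntryOptions.
class EntryOptionsSketch {
public:
    // Each setter returns *this, so calls can be chained in one expression.
    EntryOptionsSketch& setPriority(int p)    { priority_ = p; return *this; }
    EntryOptionsSketch& setQueueing(int q)    { queueing_ = q; return *this; }
    EntryOptionsSketch& setGroupDepID(int id) { depId_ = id;   return *this; }

    int priority() const { return priority_; }
    int queueing() const { return queueing_; }
    int groupDepID() const { return depId_; }

private:
    int priority_ = 0;
    int queueing_ = 0;
    int depId_ = -1;
};

// Builds an options object in a single chained expression.
EntryOptionsSketch exampleOptions() {
    EntryOptionsSketch opts;
    opts.setPriority(-5).setQueueing(2).setGroupDepID(7);
    assert(opts.priority() == -5);
    return opts;
}
```

Without chaining, the same configuration takes three separate statements.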
139 - When creating groups or chare arrays that depend on the previous construction
140 of another such entity on the local PE, it is now possible to declare that
141 dependence to the runtime. Creation messages whose dependence is not yet
142 satisfied will be buffered until it is.
144 - For any given chare class Foo and entry method Bar, the supporting class's
145 member CkIndex_Foo::Bar() is used to lookup/specify the entry method
146 index. This release adds a newer API for such members where the argument is a
147 function pointer of the same signature as the entry method. Those new
148 functions are used like CkIndex_Foo::idx_Bar(&Foo::Bar). This permits entry
149 point index lookup without instantiating temporary variables just to feed the
150 CkIndex_Foo::Bar() methods. In cases where Foo::Bar is overloaded, &Foo::Bar
151 must be cast to the desired type to disambiguate it.
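
The need for a cast on overloaded entry methods follows from ordinary C++ rules for taking the address of an overloaded member function. The sketch below shows this with hypothetical stand-ins: Foo is a plain struct rather than a chare, IndexSketch_Foo imitates the shape of a generated index class, and the returned index values are made up.

```cpp
#include <cassert>

// Hypothetical class with an overloaded method Bar and a plain method Baz.
struct Foo {
    void Bar(int) {}
    void Bar(double) {}
    void Baz() {}
};

// Aliases for the two member-function-pointer types of Bar.
using BarInt    = void (Foo::*)(int);
using BarDouble = void (Foo::*)(double);

// Hypothetical stand-in for a generated index class: one idx_ overload
// per entry method signature, each returning a made-up index.
struct IndexSketch_Foo {
    static int idx_Bar(BarInt)          { return 0; }
    static int idx_Bar(BarDouble)       { return 1; }
    static int idx_Baz(void (Foo::*)()) { return 2; }
};

// Baz is not overloaded, so &Foo::Baz can be passed directly.
int bazIndex() { return IndexSketch_Foo::idx_Baz(&Foo::Baz); }

// Bar is overloaded, so &Foo::Bar alone would be ambiguous here;
// the cast selects which signature is meant.
int barIntIndex() {
    return IndexSketch_Foo::idx_Bar(static_cast<BarInt>(&Foo::Bar));
}
int barDoubleIndex() {
    return IndexSketch_Foo::idx_Bar(static_cast<BarDouble>(&Foo::Bar));
}
```

No Foo object or temporary argument values are created at any point; only the function pointer type is used for the lookup.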
153 - CkReduction::reducerType now has PUP methods defined, and can hence be
154 passed as parameter-marshalled arguments to entry methods.
156 - The runtime option +stacksize for controlling the allocation of user-level
157 threads' stacks now accepts shorthand notation such as 1M.
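
As an illustration of how a size argument such as 1M might be interpreted, here is a small parser sketch. It is not the runtime's parser: the function name parse_size_sketch is hypothetical, and the accepted suffixes (K, M, G, case-insensitive, as powers of 1024) are an assumption that the actual +stacksize handling may not share. Overflow of very large values is not checked.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical parser: converts text such as "4096", "64k" or "1M"
// to a byte count. Returns 0 for malformed input.
unsigned long long parse_size_sketch(const std::string& text) {
    if (text.empty() || !std::isdigit(static_cast<unsigned char>(text[0]))) {
        return 0;
    }
    char* end = nullptr;
    unsigned long long value = std::strtoull(text.c_str(), &end, 10);
    assert(end != nullptr);

    unsigned long long scale = 1;
    switch (std::toupper(static_cast<unsigned char>(*end))) {
        case '\0': return value;               // plain byte count
        case 'K':  scale = 1ULL << 10; break;  // assumed: 1024 bytes
        case 'M':  scale = 1ULL << 20; break;  // assumed: 1024^2 bytes
        case 'G':  scale = 1ULL << 30; break;  // assumed: 1024^3 bytes
        default:   return 0;                   // unknown suffix
    }
    if (end[1] != '\0') return 0;              // trailing characters
    return value * scale;
}
```

Under these assumptions, "1M" would be read as 1048576 bytes.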
159 - The -optimize flag to the charmc compiler wrapper now passes more aggressive
160 options to the various underlying compilers than the previous '-O'.
162 - The charmc compiler wrapper now provides a flag -use-new-std to enable
163 support for C11 and C++11 where available. To use this in application code,
164 the runtime system must have been built with that flag as well.
166 - When using CmiMemoryUsage(), the runtime can be instructed not to use the
167 underlying mallinfo() library call, which can be inaccurate in settings where
168 usage exceeds INT_MAX. This is accomplished by setting the environment
169 variable "MEMORYUSAGE_NO_MALLINFO".
171 - Experimental Features
172 * Initial implementation of a fast message-logging protocol. Use option
173 'mlogft' to build it.
174 * Message compression support for persistent messages on the Gemini machine layer.
175 * Node-level inter-PE loop/task parallelization is now supported through
177 * New temperature/CPU frequency aware load balancer
178 * Support interoperation of Charm++ and native MPI code through dynamically
179 switching control between the two
180 * API in centralized load balancers to get and set PE speed
181 * A new scheme for optimization of double in-memory checkpoint/restart.
182 * Message combining library for improved fine-grained communication
184 * Support for partitioning of allocated nodes into subsets that run
185 independent Charm++ instances but can interact with each other.
187 Platform-Specific Changes
188 -------------------------
191 * The gemini_gni network layer has been heavily tuned and optimized,
192 providing substantial improvements in performance, scalability, and
194 * The gemini_gni-crayxe machine layer supports a 'hugepages' option at build
195 time, rather than requiring manual configuration file editing.
196 * Persistent message optimizations can be used to reduce latency and
198 * Experimental support for 'urgent' sends, which are sent ahead of any other
199 outgoing messages queued for transmission.
201 - IBM Blue Gene Q: Experimental machine-layer support for the native PAMI
202 interface and MPI, with and without SMP support. This supports many new
203 systems, including LLNL's Sequoia, ALCF's Mira, and FZ Juelich's Juqueen.
205 There are three network-layer implementations for these systems: 'mpi',
206 'pami', and 'pamilrts'. The 'mpi' layer is stable, but its performance and
207 scalability suffer from the additional overhead of using MPI rather than
208 driving the interconnect directly. The 'pami' layer is well tested for NAMD,
209 but has shown instability for other applications. It is likely to be replaced
210 by the 'pamilrts' layer, which is more generally stable and seems to provide
211 the same performance, in the next release.
213 In addition to the common 'smp' option to build the runtime system with
214 shared memory support, there is an 'async' option which sometimes provides
215 better performance on SMP builds. This option passes tests on 'pamilrts', but
216 is still experimental.
218 Note: Applications that send a large number of messages may crash in the
219 default setup due to overflow in the low-level FIFOs. Environment variables
220 MUSPI_INJFIFOSIZE and PAMI_RGETINJFIFOSIZE can be set to avoid application
221 failures due to a large number of small and large messages respectively. The
222 default value of these variables is 65536, which is sufficient for 1000
225 - Infiniband Verbs: Better support for more flavors of ibverbs libraries
228 * Experimental rendezvous protocol for better performance above some MPI
230 * Some tuning parameters ("+dynCapSend" and "+dynCapRecv") are now
231 configurable at job launch, rather than Charm++ compilation.
233 - PGI C++: Disable automatic 'using namespace std;'
235 - Charm++ now supports ARM, both non-smp and smp.
237 - Mac OS X: Compilation options to build and link correctly on newer versions
240 ================================================================================
241 What's new in Charm++ 6.4.0
242 ================================================================================
244 --------------------------------------------------------------------------------
246 --------------------------------------------------------------------------------
248 - Cray XE and XK systems using the Gemini network via either MPI
249 (mpi-crayxe) or the native uGNI (gemini_gni-crayxe)
251 - IBM Blue Gene Q, using MPI (mpi-bluegeneq) or PAMI (pami-bluegeneq)
253 - Clang, Cray, and Fujitsu compilers
255 - MPI-based machine layers can now run on >64k PEs
257 --------------------------------------------------------------------------------
259 --------------------------------------------------------------------------------
261 - Added a new [reductiontarget] attribute to enable
262 parameter-marshaled recipients of reduction messages
264 - Enabled pipelining of large messages in CkMulticast by default
266 - New load balancers added:
269 * Scotch graph partitioning based: ScotchLB and Refine and Topo variants
272 - Load balancing improvements:
274 * Allow reduced load database size using floats instead of doubles
275 * Improved hierarchical balancer
276 * Periodic balancing adapts its interval dynamically
277 * User code can request a callback when migration is complete
278 * More balancers properly consider object migratability and PE
279 availability and speed
280 * Instrumentation records multicasts
282 - Chare arrays support options that can enable some optimizations
284 - New 'completion detection' library for parallel process termination
285 detection, when the need for modularity excludes full quiescence
288 - New 'mesh streamer' library for fine-grain many-to-many collectives,
289 handling message bundling and network topology
291 - Memory pooling allocator performance and resource usage improved
294 - AMPI: More routines support MPI_IN_PLACE, and those that don't check
297 ================================================================================
298 What's new in Charm++ 6.2.1 (since 6.2.0)
299 ================================================================================
301 --------------------------------------------------------------------------------
302 New Supported Platforms:
303 --------------------------------------------------------------------------------
305 POWER7 with LAPI on Linux
307 Infiniband on PowerPC
309 --------------------------------------------------------------------------------
311 --------------------------------------------------------------------------------
313 - Better support for multicasts on groups
314 - Topology information gathering has been optimized
315 - Converse (seed) load balancers have many new optimizations applied
316 - CPU affinity can be set more easily using +pemap and +commap options
317 instead of the older +coremap
318 - HybridLB (hierarchical balancing for very large core-count systems)
319 has been substantially improved
320 - Load balancing infrastructure has further optimizations and bug fixes
321 - Object mappings can be read from a file, to allow offline
322 topology-aware placement
323 - Projections logs can be spread across multiple directories, speeding
324 up output when dealing with thousands of cores (+trace-subdirs N
325 will divide log files evenly among N subdirectories of the trace
326 root, named PROGNAME.projdir.K)
327 - AMPI now implements MPI_Issend
328 - AMPI's MPI_Alltoall uses a flooding algorithm more aggressively,
329 versus pairwise exchange
330 - Virtualized ARMCI support has been extended to cover the functions
333 --------------------------------------------------------------------------------
334 Architecture-specific changes
335 --------------------------------------------------------------------------------
337 - LAPI SMP has many new optimizations applied
339 - Net builds support the use of clusters' mpiexec systems for job
340 launch, via the ++mpiexec option to charmrun
342 ================================================================================
343 What's new in Charm++ 6.2.0 (since 6.1)
344 ================================================================================
346 --------------------------------------------------------------------------------
347 New Supported Platforms:
348 --------------------------------------------------------------------------------
350 64-bit MIPS, such as SiCortex, using mpi-linux-mips64
352 Windows HPC cluster, using mpi-win32/mpi-win64
354 Mac OSX 10.6, Snow Leopard (32-bit and 64-bit).
356 --------------------------------------------------------------------------------
358 --------------------------------------------------------------------------------
361 - Smarter build/configure scripts
362 - A new interface for model-based load balancing
363 - new CPU topology API
364 - a general implementation of CmiMemoryUsage()
365 - Bug fix: Quiescence detection (QD) works with immediate messages
366 - New reduction functions implemented in Converse
367 - CCS (Converse Client-Server) can deliver message to more than one processor
368 - Added a memory-aware adaptive scheduler, which can be optionally
370 - Added preliminary support for automatic message prioritization
371 (disabled by default)
374 - Cross-array and cross-group sections
375 - Structured Dagger (SDAG): Support templated arguments properly
376 - Plain chares support checkpoint/restart (both in-memory and disk-based)
377 - Conditional packing of messages and parameters in SMP scenario
378 - Changes to the CkArrayIndex class hierarchy
379 -- sizeof() all CkArrayIndex* classes is now the same
380 -- Codes using custom array indices have to use placement-new to construct
381 their custom index. Refer to the example code: examples/charm++/hello/fancyarray/
382 -- *** Backward Incompatibility ***
383 CkArrayIndex[4D/5D/6D]::index are now of type int (instead of short)
384 However the data is stored as shorts. Access by casting
385 CkArrayIndexND::data() appropriately
386 -- *** Deprecated ***
387 The direct use of public data member
388 CkArrayIndexND::index (N=1..6) is deprecated. We reserve the right to
389 change/remove this variable in future releases of Charm++.
390 Instead, please access the indices via member function:
391 int CkArrayIndexND::data()
394 - Compilers renamed to avoid collision with host MPI (ampicc, ampiCC,
396 - Improved MPI standard conformance, and documentation of non-conformance
397 * Bug fixes in: MPI_Ssend, MPI_Cart_shift, MPI_Get_count
398 * Support MPI_IN_PLACE in MPI_(All)Reduce
399 * Define various missing constants
400 - Return the received message's tag in response to a non-blocking
401 wildcard receive, to support SuperLU
402 - Improved tracing for BigSim
404 Multiphase Shared Arrays (MSA)
405 - Typed handles to enforce phases
406 - Split-phase synchronization to enable message-driven execution
410 - Automatic tracing of API calls for simulation and analysis
413 - Wider support for architectures other than net- (in particular MPI layers)
414 - Improved support for large scale debugging (better scalability)
415 - Enhanced record/replay stability to handle various events, and to
416 signal unexpected messages
417 - New detailed record/replay: The full content of messages can be
418 recorded, and a single processor can be re-executed outside of the
422 - Tracing of nested entry methods
424 Automatic Performance Tuning
425 - Created an automatic tuning framework [still for experimental use only]
428 - Network-topology / node aware spanning trees are used internally, for
429 fewer bytes on the network and improved performance in multicasts and
430 reductions delegated to this library
433 - Improved OneTimeMulticastStrategy classes
436 - Out-of-core support, with prefetching capability
437 - Detailed tracing of MPI calls
438 - Detailed record/replay support at emulation time, capable of
439 replaying any emulated processor after obtaining recorded logs.
441 --------------------------------------------------------------------------------
442 Architecture-specific changes
443 --------------------------------------------------------------------------------
446 - Can run jobs with more than 1024 PEs
449 - New charmrun option ++no-va-randomization to disable address space
450 randomization (ASLR). This is most useful for running AMPI with
454 - Default to using ampicxx instead of mpiCC
457 - The +p option now has the same semantics as in other smp builds
460 - Support for VSX in SIMD abstraction API
463 - Compilers and options have been updated to the latest ones
466 - Added routines for measuring performance counters on BG/P.
467 - Updated to support latest DCMF driver version. On ANL's Intrepid, you may
468 need to set BGP_INSTALL=/bgsys/drivers/V1R4M1_460_2009-091110P/ppc in your
469 environment. This is the default on ANL's Surveyor.
472 - cputopology information is now available on XT3/4/5
475 - Bug fix: plug memory leaks that caused failures in long runs
476 - Optimized to reduce startup delays
479 - Support for SMP (experimental)
482 ================================================================================
483 Note that changes from 5.9, 6.0, and 6.1 are not documented here. A partial list
484 can be found on the charm download page, or by reading through version control
487 ================================================================================
488 What's New since Charm++ 5.4 release 1
489 ================================================================================
491 --------------------------------------------------------------------------------
492 New Supported Platforms:
493 --------------------------------------------------------------------------------
494 1. Charm++ ported to IA64 Itanium running Win2K and Linux; Charm++ also supports
495 Intel C/C++ compilers;
497 2. Charm++ ported to Power Macintosh powerpc running Darwin;
499 3. Charm++ ported to Myrinet networking with GM API;
501 --------------------------------------------------------------------------------
502 Summary of New Features:
503 --------------------------------------------------------------------------------
505 Structured Dagger is a coordination language built on top of CHARM++.
506 Structured Dagger allows easy expression of dependences among messages and
507 computations and also among computations within the same object using
508 when-blocks and various structured constructs.
510 2. Entry functions support parameter marshalling
511 Now you can declare and invoke remote entry functions using parameter
512 marshalling instead of defining messages.
514 3. Easier running - standalone mode
515 For net-* version running locally, you can now run Charm programs without
516 charmrun. Running a node program directly from command line is now the
517 same as "charmrun +p1 <program>"; for SMP version, you can also specify
518 multiple (local) processors, as in "program +p2".
521 --------------------------------------------------------------------------------
523 --------------------------------------------------------------------------------
524 1. "build" changed for compilation of Charm++
525 To build Charm++ from scratch, we now take additional command line options
526 to compile with add-on features and to use compilers other than gcc.
527 For example, to build Linux IA64 with Myrinet support, type command:
528 ./build net-linux-ia64 gm
531 ******* Old Change histories *******
534 ================================================================================
535 What's New in Charm++ 5.4 release 1 since 5.0
536 ================================================================================
538 --------------------------------------------------------------------------------
539 New Supported Platforms:
540 --------------------------------------------------------------------------------
542 1. Win9x/2000/NT: with Visual C++ or Cygwin gcc/g++, you can compile and run
543 Charm++ programs on all Win32 platforms.
545 2. Scyld Beowulf: Charm++ has been ported to the Linux-based Scyld Beowulf
546 operating system. For more information on Scyld, see <http://www.scyld.com>
548 3. MPI with VMI: Charm++ has been ported to NCSA's Virtual Machine Interface,
549 which is an efficient messaging library for heterogeneous cluster
553 --------------------------------------------------------------------------------
554 Summary of New Features:
555 --------------------------------------------------------------------------------
556 1. Dynamic Load balancing:
557 Chare migration is supported in the new release. Migration-based dynamic
558 load balancing framework with various load balancing strategies library has
562 Charm++ array is supported. You can now create an array of Chare objects
563 and use an array index to refer to the Charm++ array elements. A reduction
564 library on top of Chare array has been implemented and included.
567 Projections, a Java application for Charm++ program performance analysis and
568 visualization, has been included and distributed in the new release. Two
569 trace modes are available: trace-projections and trace-summary. Trace-summary
570 is a light-weight trace library compared to trace-projections.
573 AMPI is a load-balancing based library for porting legacy MPI applications
574 to Charm++. With few changes in the original MPI code to AMPI, the new
575 legacy MPI application on Charm++ will gain from Charm++'s adaptive
576 load balancing ability.
579 "Charmrun" is now available on all platforms, with a uniform command line
580 syntax. You can forget the difference between net-* versions and MPI versions,
581 and run charm++ application with this same charmrun command syntax.
582 A ++local option has been added to charmrun for net-* versions; it provides
583 simple local use of Charm and no longer requires the ability to
584 "rsh localhost" or a nodelist file in order to run charm only on the local
585 machine. This is especially attractive when you run Charm++ on Windows.
588 Many new libraries have been added in this release. They include:
589 1) master-slave library: for writing manager-worker paradigm programs.
590 2) receiver library: provide asynchronous communication mode for chare array.
591 3) f90charm: provides Fortran90 bindings for Charm++ Array.
592 4) BlueGene: a Charm++/Converse emulator for IBM's proposed Blue Gene.
594 --------------------------------------------------------------------------------
596 --------------------------------------------------------------------------------
597 1. message declaration syntax in .ci file:
598 The message declaration syntax for packed/varsize messages has been changed.
599 The packed/varsize keywords are eliminated, and you can specify the actual
600 varsize arrays in the interface file and have the translator generate
601 alloc, pack and unpack.
604 Here is the detailed list of Changes:
606 --------------------------------------------------------------------------------
608 --------------------------------------------------------------------------------
610 10/06/1999 rbrunner Added migration-based dynamic load balancing
612 11/15/1999 olawlor Added reduction support for Charm++ arrays
613 02/06/2000 milind Added AMPI, an implementation of MPI with
614 dynamic load balancing
615 02/18/2000 paranjpy New platforms supported: net-win32, and net-win32-smp
616 04/04/2000 olawlor Added arbitrarily indexed Charm++ arrays.
617 Also, added translator support for new arrays.
618 04/15/2000 olawlor Added "puppers" for packing and unpacking
620 06/14/2000 milind Added the threaded FEM framework.
622 --------------------------------------------------------------------------------
624 --------------------------------------------------------------------------------
626 10/09/1999 rbrunner Added packlib, a library for C and C++ to
627 pack-unpack data to/from Charm++ messages.
628 10/13/1999 gzheng New LB strategy: RefineLB
629 10/13/1999 paranjpy New LB Strategy: Heap
630 10/14/1999 milind New LB Strategy: Metis
631 10/19/1999 olawlor New test program for testing LB strategies.
632 10/21/1999 gzheng New trace mode: trace-summary
633 10/28/1999 milind New supported platform: net-sol-x86
634 10/29/1999 milind Added runtime checks for ChareID assignment.
635 11/10/1999 rbrunner Added Neighborhood base strategy for LB
637 11/15/1999 olawlor conv-host now reads in a startup file
639 11/15/1999 olawlor New test program for testing array reductions.
640 11/16/1999 rbrunner Added processor-speed checking functions to
642 11/19/1999 milind Mapped SIGUSR to a Ccd condition handler
643 11/22/1999 rbrunner New LB strategy: WSLB
644 11/29/1999 ruiliu Modified Metis LB strategy to deal with
645 different processor speeds
646 12/16/1999 rbrunner New LB strategy: GreedyRef
647 12/16/1999 rbrunner New LB strategy: RandRef
648 12/21/1999 skumar2 New LB strategy: CommLB
649 01/03/2000 rbrunner New LB strategy: RecBisectBfLB
650 01/08/2000 skumar2 New LB strategy: Comm1LB, with varying processor
652 01/18/2000 milind Modified SM library syntax, and added a test
654 01/19/2000 gzheng Added irecv, a library to simplify conversion
655 of message-passing programs to Charm++
656 02/20/2000 olawlor Added preliminary broadcast support to Charm++
658 02/23/2000 paranjpy Added converse-level quiescence detection
659 03/02/2000 milind Added ++server-port option to pre-specify
661 03/10/2000 wilmarth Random seed-based load balancer now uses
662 bit-vector for active PEs.
663 03/21/2000 gzheng Added support for marking user-defined events
665 03/28/2000 wilmarth Added CMK_TRUECRASH. Very helpful for
666 post-mortem debugging of Charm++ programs on
668 03/31/2000 jdesouza Added Fortran90 support to the Charm++
669 interface translator.
670 03/09/2000 milind Added support for -LANG and -rpath options
671 in charmc for Origin2000.
672 04/28/2000 milind Added prioritized converse threads.
673 05/01/2000 milind Added test programs for TeMPO, AMPI and irecv.
674 05/04/2000 milind New supported platform: mpi-sp.
675 05/04/2000 gzheng Added irecv pingpong program.
676 05/17/2000 olawlor Each chare, group and array element now has to
677 have migration constructor.
678 05/24/2000 milind Added Jacobi3D programs for irecv and AMPI both.
679 05/24/2000 milind Made migratable an optional attribute of
680 chares, groups, and nodegroups.
681 Arrays are by default migratable.
682 05/29/2000 paranjpy Added pup methods to arrays, reductions etc
684 06/13/2000 milind Made CtvInitialize idempotent. That is, it
685 can be called by any number of threads now,
686 only the first one will actually do
688 06/20/2000 milind Added a simple test program for the FEM
690 07/06/2000 milind Imported Metis 4.0 sources in the CVS tree.
691 Also added code to make metis libraries and
692 executables to Makefile.
693 07/07/2000 milind Added more meaningful error messages using
694 perror in addition to cryptic error codes in
696 07/10/2000 milind fem and femf are now recognized as "languages"
698 07/10/2000 saboo Added the derived datatypes library.
699 07/13/2000 milind Added +idle_timeout functionality. It takes a
700 commandline parameter denoting milliseconds of
701 maximum consecutive idle time allowed per
703 07/14/2000 milind Added group multicast. Added
704 CkSendMsgBranchMulti, CldEnqueueMulti, and
705 translator changes to support it.
706 07/14/2000 milind SUPER_INSTALL now takes "-*" arguments prior
707 to the target, that will be passed to make as
708 "makeflags". This makes it easy to suppress
709 make's output of commands etc (with the -s
710 flag). As a result of this, several Makefiles
712 07/18/2000 milind Added support for using "dbx" on suns as
714 07/19/2000 milind Added ability for tracemode projections to
715 produce binary trace files. Use flag
716 +binary-trace on the command line.
717 07/26/2000 milind Separated AMPI from TeMPO.
718 07/28/2000 milind Added test programs to test reduce, alltoall
719 and allreduce functionality of AMPI.
720 08/02/2000 milind Added an option to let the user specify which
721 "xterm" to use. For example, on some systems
722 (CDE), only dtterm is installed. So, by
723 putting ++xterm dtterm on the conv-host
724 commandline, one can use dtterm when ++in-xterm
725 option is specified on conv-host commandline.
726 08/14/2000 milind FEM Framework: Added capabilities to handle
727 esoteric meshes to standalone offline programs.
728 Makefile now produces gmap and fgmap programs,
729 which are used for this purpose. They convert
730 the mesh to a graph before partitioning it
732 08/24/2000 milind Added the 2D crack propagation program as a
733 test program for FEM framework.
734 08/25/2000 milind Initial implementation of isomalloc-based
735 threads. This implementation uses a fixed
736 stack size for all threads (can be set at
738 08/26/2000 milind Added a macro CtvAccessOther that lets you
739 get/set a Ctv variable of any thread. It
740 should be invoked as CtvAccessOther(thread,
741 varname); Added CthGetData function to each of
742 the threads implementation. This function is
743 used in the CtvAccessOther macro.
744 08/27/2000 milind FEM Framework: Separated mesh to graph
745 conversion capability into a separate program.
746 This way, the generated graph can be partitioned
748 09/04/2000 milind Added the class static readonly variables to
750 09/05/2000 milind FEM Framework: A very fast O(n) algorithm for
751 mesh2graph, uses more memory, but the tradeoff
752 was worth it. Coded by Karthik Mahesh, minor
753 optimizations by Milind.
754 09/05/2000 milind Added a barebones charm kernel scheduling
755 overhead measurement program.
756 09/15/2000 milind Added pup support for AMPI and FEM framework.
757 09/20/2000 olawlor Added capability to have an array of base type
758 where individual elements could be of derived
760 10/03/2000 gzheng New supported platform: net-linux-axp
761 10/05/2000 skumar2 Added program littleMD to the test suite.
762 10/07/2000 skumar2 New job scheduler (Faucets projects).
763 10/15/2000 milind Improved support for Fortran90 in charmc.
764 11/04/2000 jdesouza Made the Faucets scheduler multi-threaded.
765 11/05/2000 olawlor FEM Framework: supports multiple element types,
766 mesh re-assembly, etc.
767 11/15/2000 gzheng New platform support: net-cygwin
768 11/18/2000 gzheng conv-host no longer needs /bin/csh to start
770 CMK_CONV_HOST_CSH_UNAVAILABLE to 1 to use
772 11/25/2000 milind Finished experimental implementation of
773 converse-threads based on co-operative pthreads.
774 11/25/2000 milind Added a benchmark suite of all pingpongs in
776 11/28/2000 milind Removed deletion of _idx at the end of every
777 send or doneInserting call. Instead now it is
778 in the destructor of the proxy. This allows us
779 to cache proxies, when proxy creation becomes
781 11/28/2000 olawlor Added "seek blocks" to puppers. This should
782 allow out-of-order pup'ing without the ugliness
783 of getBuf; and in a way that works with all
785 11/29/2000 olawlor Simplified and regularized command-line-argument
787 11/29/2000 milind AMPI: Added multiple-communicators capability.
788 12/05/2000 gzheng Now /bin/sh is default shell to fork node
789 program on remote machines.
790 12/13/2000 olawlor Added charmrun wrapper for poe on mpi-sp.
791 12/14/2000 milind Added bluegene emulator sources and test
792 programs. Added "bluegene" as a language known
793 to charmc. Makefile now has a target called
794 bluegene. Added preliminary bluegene
795 documentation. (copied from Arun's webpage.)
796 12/15/2000 gzheng f90charm addition to Makefile and charmc. Also,
797 added fixed size arrays support to f90charm. A
798 test program f90charm/hello is checked in.
799 12/17/2000 milind Added rtest test program. Contributed by jim to
800 test Converse message transmission.
801 12/20/2000 olawlor Added charmconfig script. Enables automatic
802 determination of C++ compiler properties,
803 replacing the verbose and error-prone
804 conv-mach.h entries for CMK_BOOL,
805 CMK_STL_USE_DOT_H, CMK_CPP_CAST_OK, ...
806 12/20/2000 olawlor Charm++ Arrays optimizations: Key and object
807 now variable-length fields, instead of pointers.
808 This extra flexibility lets us save many
809 dynamic allocations in the array framework.
810 12/20/2000 olawlor Added PUP::able support-- dynamic type
811 identification, allocation, and deletion.
812 Allows you to write: p(objPtr); and
813 objPtr will be properly identified,
814 allocated, packed, and deallocated (depending
815 on the PUP::er). Requires you to register any
816 such classes with DECLARE_PUPable and
818 12/20/2000 olawlor Arrays optimizations: Made CkArrayIndex
819 fixed-size. This significantly improves
820 messaging speed (7 us instead of 10 us
821 roundtrip). Move spring cleaning check into a
822 CcdCallFnAfter, which gains more speed (down to
824 12/20/2000 olawlor More optimizations: Minor speed tweaks--
825 conv-ccs.c uses hashtable for handler lookup;
826 conv-conds skips timer test until needed;
827 convcore.c scheduler loop optimizations (no
828 superfluous EndIdle calls); threads.c
829 CMK_OPTIMIZE-> no mprotect.
830 12/20/2000 olawlor More Optimizations: Minor speed tweaks-- ck.C
831 groups cldEnqueue skip; init.h defines
832 CkLocalBranch inline; and supporting changes.
833 12/22/2000 gzheng IA64 support for Converse user level threads.
834 01/02/2001 olawlor CCS: Minor update-- enabled CcsProbe, cleaned
835 up superfluous debug messages in server, added
836 Java interface (originally written for
838 01/09/2001 gzheng charmconfig converted to autoconf style; one needs
839 to change configure.in and conv-autoconfig.h.in,
840 run autoconf to get configure, and copy it to
841 charmconfig. Added fortran subroutine name
842 test and get libpthread.a
843 01/10/2001 milind Added telnet method of getting libpthread.a
844 from charm webserver.
845 01/11/2001 olawlor Moved projections files here from
846 CVSROOT/projections-java. Added fast Java
847 versions of the .log file input routines in
848 LogReader, LogLoader, LogAnalyzer, and
849 UsageCalc. Added "U.java" user interface
850 utility file, allowing times to be input in
851 seconds, milliseconds, or microseconds,
852 instead of just microseconds.
853 01/15/2001 gzheng Added +trace-root to specify the directory to
854 put log files in. This is needed on Scyld clusters,
855 where there is no NFS mounting and nodes have no
856 I/O access to a shared home directory.
857 01/15/2001 milind Made AMPI into a f90 module instead of
858 'ampif.h' inclusion. AMPI f90 bindings are
859 now more inclusive. Fixed argc,argv handling
860 bugs in ArgsInfo message. Fixed a bug in pup
861 that caused thread not to be sized, but was
862 packed nevertheless. Moved irecv to waitall
863 instead of at in ampi_start. Made
864 AMPI_COMM_WORLD to be 0, because it clashed
865 with wildcard(-1). AMPI_COMM_UNIVERSE is now
866 handled properly in the AMPI module.
867 C/C++ data members are NOT visible to
869 01/18/2001 gzheng New supported platform: net-linux-scyld
870 01/20/2001 olawlor Moved array index field from CMessage_* to the
871 Ck envelope itself. This is the right thing
872 to do, because any message may be sent to/from
873 an array element. To reduce the wasted space
874 in a message, a union is used to overlay the
875 fields for the various possible message types.
876 01/29/2001 olawlor Freed charmrun on net-* version from using
877 remote shell to fork off processes. One can now
878 use a daemon provided in the distribution.
879 02/07/2001 olawlor Added debugging support to puppers.
880 02/13/2000 gzheng Added ++local option to charmrun to start node
881 programs locally without any daemon; fixed the
882 hang when a wrong program name is given in the
883 scyld version; and redirected all output to
884 /dev/null, since otherwise every node program can
885 send its output to the console on scyld. Also implemented ++local in the net-win32 version.
886 02/26/2000 milind Changed the varsize syntax. Now one can specify
887 actual varsize arrays in the interface file
888 and have the translator generate alloc, pack
891 --------------------------------------------------------------------------------
893 --------------------------------------------------------------------------------
895 10/29/1999 milind Replaced jmemcpy by memcpy in net versions, as
896 it was causing a bit to flip (bug reported
898 10/29/1999 milind Fixed multiline macros in all header files.
899 02/05/2000 milind Fixed linking errors by getting the order of
900 libraries right from the charmc command-line.
901 02/18/2000 paranjpy Fixed Charm++ initialization bug on SMPs.
902 02/21/2000 milind Fixed a context-switching bug in mipspro version
904 02/25/2000 milind Charm++ interface translator was segfaulting
905 on interface file errors. Fixed that. Also,
906 added linenumbers to error messages.
907 03/02/2000 milind Made CCS work on SMPs.
908 03/07/2000 milind Made ConverseInit consistent with the manual on
910 04/18/2000 milind Fixed a bug in CkWaitFuture, which was caching
911 a variable locally, while it was changed by
913 05/04/2000 paranjpy Fixed argv deletion bug on net-win32-smp.
914 06/08/2000 milind sp3 version: changed optimization flags, which
915 were power2 processor-specific.
916 06/20/2000 milind mpi-* versions: Fixed ConverseExit since it was
917 not obeying the following statement in the MPI
918 standard: The user must ensure that all pending
919 communications involving a process completes
920 before the process calls MPI_FINALIZE.
921 07/05/2000 milind Fixed a nasty bug in charmc in the -cp option.
922 It used to append the name provided to -o flag
923 to the directory provided to the -cp flag.
924 Thus, -o ../pgm -cp ../bin options meant that
925 the pgm would be copied to ../bin/.., which is
926 not the expected behavior. This fix correctly
927 copies pgm to ../bin.
928 07/07/2000 milind Removed variable arg_myhome, as it was not
929 being used anywhere, and also, setting it was
930 causing problems if env var HOME was not set.
931 07/27/2000 milind thishandle for the arrayelement was not being
932 correctly set. Bug was reported by Neelam.
933 08/26/2000 milind Origin2000: Changed the page alignment to
934 reflect the mmap alignment. The mmap man page
935 specifically states that it is not the same as
937 09/02/2000 milind Fixed a bug in code generated for threaded
938 (void) entry methods of array elements. The
939 dummy message that is passed to that method in
940 a thread has to be deleted before calling the
941 object method, because upon object method's
942 return, the thread might have migrated.
943 09/03/2000 olawlor Minor fix-fixes: 1.) Change to LBObjid hash
944 function would fail for >4-int object indices.
945 Replaced with proper function, which also
946 preserves the 1-int case. 2.) Array element
947 sends must go via the message queue to prevent
948 stack build-up for deep single-processor call
949 chains. These might happen, e.g., in a driver
950 element calling itself for the main time loop.
951 Messages are now properly noted as sent, then
952 wait through the queue for delivery. This
953 entailed minor reorganization of the message
955 09/21/2000 olawlor Tiny SMP thread fix-- registrations of a
956 thread-private variable now reserve space on
957 calls after the first. This wastes space for
958 multiple CthInitialize's-- it's a quick hack to
959 get threads working again on SMP versions.
960 10/16/2000 olawlor A few CCS fixes: -Added split-phase reply
961 (delay reply indefinitely) -Cleaned up error
962 handling -Pass user data as "void *" instead of
964 11/03/2000 wilmarth Removed 0 size array allocation in Charm++
965 quiescence detection.
966 11/20/2000 gzheng Rewrote part of Fiber thread, including a bug
967 fix for a non-thread-safe function, and a
968 different fiber free strategy.
969 11/29/2000 gzheng The LB init procedure tried to allocate
970 65536*160 as initial size, which is 10M memory
971 for communication table, which is too big.
972 Cut it down to roughly 1M, and it can expand
974 12/05/2000 gzheng In many cases, conv-host exited without printing
975 the error message from the remote shell. Attempted
976 to fix it by calling sync to flush the pipe
978 12/10/2000 milind net-linux: Made static linking the default
979 option because dynamic linking runtime causes
980 isomalloc threads to crash.
981 12/18/2000 milind Increased portability of isomalloc threads by
982 removing dependence on alloca.
983 12/28/2000 milind Fixed ctrl-getone abort bug on SMP.
984 12/28/2000 milind Made _groupTable a pointer on which a
985 constructor is explicitly called. Since it
986 was a Cpv variable, its constructor was not
987 called by default in case of an SMP version.
988 12/29/2000 olawlor Prevent infinite copy constructor recursion on
990 01/10/2001 olawlor Added "explicit" keyword to remove ambiguity
991 for KCC, which was confused by the private
992 PUP::er(int) "cast" constructor and the operator
993 |(PUP::er &p,T &t) into rejecting all operator|
994 (int,int) as ambiguous.
995 01/17/2001 gzheng fix the charmconfig bug on paragon-red: the
996 failure testing of fortran won't stop the
998 01/20/2001 olawlor Arrays reduction: Fixed bug-- reduction may end
999 because all contributors migrate away.
1000 01/29/2001 olawlor Fix heap-corrupting bug-- call ->init() on
1001 nodeGroupTable, which sets the "pending"
1002 message queue to NULL. This prevents a nasty
1003 delete-uninitialized-data bug later on. Also
1004 delayed queue creation until messages actually
1007 --------------------------------------------------------------------------------
1008 Documentation Changes:
1009 --------------------------------------------------------------------------------
1011 01/31/2000 milind Installation manual: Fixed bugs pointed out by
1013 02/28/2000 wilmarth Added a new look Charm++ manual.
1014 06/20/2000 milind Added pdflatex support to generate PDF versions
1015 of manuals from LaTeX sources.
1016 12/05/2000 milind Added Orion's FEM manual. Converted from HTML.
1017 12/10/2000 milind Added pplmanual.sty for all manuals.
1018 12/17/2000 milind Added master-slave library documentation to
1020 12/21/2000 saboo Added DDT documentation.
1021 01/02/2001 olawlor Updated for new CCS version.
1023 --------------------------------------------------------------------------------
1025 --------------------------------------------------------------------------------
1027 10/24/1999 olawlor charmc is changed to Bourne shell script
1028 instead of csh. All conv-mach.csh are
1029 replaced by conv-mach.sh.
1030 10/25/1999 olawlor SUPER_INSTALL is converted to use bourne shell.
1031 10/28/1999 milind All Makefiles now take OPTS commandline
1033 01/16/2000 olawlor Simplified Charm++ interface translator.
1034 02/23/2000 ruiliu Changed rand() calls all over the code to use
1035 the new Converse random number generator.
1036 02/26/2000 milind Simplified the converse scheduler loop by
1037 combining the maxmsgs and poll modes.
1038 08/31/2000 milind Imported system documentation into the CVS tree.
1039 Also added super_install target for docs with
1040 necessary Makefile modifications.
1041 09/08/2000 olawlor Made soft links use relative pathnames instead
1042 of absolute. This lets you move a charm++
1043 installation without having to recompile
1045 09/11/2000 olawlor Grouped commonly needed code in the new util
1046 directory. Also, added pup_c a C wrapper for
1048 09/11/2000 olawlor Slightly reorganized header structure. Now no
1049 headers should need to be listed twice (once in
1050 ALLHEADERS, again in CKHEADERS). Now headers
1051 are soft-linked instead of copied. This makes
1052 development much easier. Added support for the
1053 new Common/util directory.
1054 09/21/2000 olawlor Major reorganization of net-* codes. Now all
1055 the TCP socket routines are in separate files.
1056 Also combined Windows NT code with unix codes.
1057 09/21/2000 olawlor Major rewrite of CCS-- underlying protocol is
1058 now binary (send/recv binary data everywhere);
1059 conv-host forwards requests to nodes; and
1060 source has been significantly re-arranged.
1061 (especially if NODE_0_IS_CONVHOST).
1062 11/22/2000 milind Removed IDL translator from distribution.
1063 12/01/2000 olawlor Renamed conv-host to charmrun; added test for
1064 script conv-host. Also added charmrun for most
1066 12/17/2000 milind Moved List related data structures into
1067 cklists.h in util. Removed most of the redundant
1068 list implementations.
1069 12/20/2000 gzheng SUPER_INSTALL: format the output of list of
1070 versions and make the help page fit into one
1072 12/24/2000 milind Added test-{charm,converse,ampi,fem} targets to
1074 12/28/2000 milind net-sol-smp now uses pthreads.
1075 01/29/2001 olawlor Merged windowsNT and unix build procedures by
1076 basing the Windows build on cygwin. Added
1077 scripts to deal with unix and windows