1 This file describes the most significant changes. For more detail, use
2 'git log' on a clone of the charm repository.
4 ================================================================================
5 What's new in Charm++ 6.7.0
6 ================================================================================
8 Over 120 bugs fixed, spanning areas across the entire system
12 - New API for efficient formula-based distributed sparse array creation
14 - CkLoop is now built by default
16 - CBase_Foo::pup need not be called from Foo::pup in user code anymore - runtime
17 code handles this automatically
19 - Error reporting and recovery in .ci files is greatly improved, providing more
20 precise line numbers and often column information
22 - Many data races occurring under shared-memory builds (smp, multicore) were
23 fixed, facilitating use of tools like ThreadSanitizer and Helgrind
27 - Further MPI standard compliance in AMPI allows users to build and run
28 Hypre-2.10.1 on AMPI with virtualization, migration, etc.
30 - Improved AMPI Fortran2003 PUP interface 'apup', similar to C++'s STL PUP
32 Platforms and Portability
34 - Compiling Charm++ now requires support for C++11 variadic templates. In GCC,
35 this became available with version 4.3, released in 2008
37 - New machine target for multicore Linux ARM7: multicore-linux-arm7
39 - Preliminary support for POWER8 processors, in preparation for the upcoming
40 Summit and Sierra supercomputers
42 - The charmrun process launcher is now much more robust in the face of slow
43 or rate-limited connections to compute nodes
45 - PXSHM now auto-detects the node size, so the '+nodesize' option is no longer needed
47 - Out-of-tree builds are now supported
51 - CommLib has been removed.
53 - CmiBool has been dropped in favor of C++'s bool
55 ================================================================================
56 What's new in Charm++ 6.6.0
57 ================================================================================
59 - Machine target files for Cray XC systems ('gni-crayxc') have been added
61 - Interoperability with MPI code using native communication interfaces on Blue
62 Gene Q (PAMI) and Cray XE/XK/XC (uGNI) systems, in addition to the universal
63 MPI communication interface
65 - Support for partitioned jobs on all machine types, including TCP/IP and IB
66 Verbs networks using 'netlrts' and 'verbs' machine layers
68 - A substantially improved version of our asynchronous library, CkIO, for
69 parallel output of large files
71 - Narrowing the circumstances in which the runtime system will send
72 overhead-inducing ReductionStarting messages
74 - A new fully distributed load balancing strategy, DistributedLB, that produces
75 high quality results with very low latency
77 - An API for applications to feed custom per-object data to specialized load
78 balancing strategies (e.g. physical simulation coordinates)
80 - SMP builds on LRTS-based machine layers (pamilrts, gni, mpi, netlrts, verbs)
81 support tracing messages through communication threads
83 - Thread affinity mapping with +pemap now supports Intel's Hyperthreading more
86 - After restarting from a checkpoint, thread affinity will use new
87 +pemap/+commap arguments
89 - Queue order randomization options were added to assist in debugging race
90 conditions in application and runtime code
92 - The full runtime code and associated libraries can now compile under the C11
93 and C++11/14 standards.
95 - Numerous bug fixes, performance enhancements, and smaller improvements in the
96 provided runtime facilities
99 * The long-unsupported FEM library has been deprecated in favor of ParFUM
100 * The CmiBool typedefs have been deleted, as C++ bool has long been universal
101 * Future versions of the runtime system and libraries will require some degree
102 of support for C++11 features from compilers
104 ================================================================================
105 What's new in Charm++ 6.5.0
106 ================================================================================
108 - The Charm++ manual has been thoroughly revised to improve its organization,
109 comprehensiveness, and clarity, with many additional example code snippets
112 - The runtime system now includes the 'Metabalancer', which can provide
113 substantial performance improvements for applications that exhibit dynamic
114 load imbalance. It provides two primary benefits. First, it automatically
115 optimizes the frequency of load balancer invocation, to avoid work stoppage
116 when it will provide too little benefit. Second, calls to AtSync() are made
117 less synchronous, to further reduce overhead when the load balancer doesn't
118 need to run. To activate the Metabalancer, pass the option +MetaLB at
119 runtime. To get the full benefits, calls to AtSync() should be made at every
120 iteration, rather than at some arbitrary longer interval as was previously
123 - Many feature additions and usability improvements have been made in the
124 interface translator that generates code from .ci files:
125 * Charmxi now provides much better error reports, including more accurate
126 line numbers and clearer reasons for failure, including some semantic
127 problems that would otherwise appear when compiling the C++ code or even at
129 * A new SDAG construct 'case' has been added that defines a disjunction over a
130 set of 'when' clauses: only one 'when' out of a set will ever be triggered.
131 * Entry method templates are now supported. An example program can be found
132 in tests/charm++/method_templates/.
133 * SDAG keyword "atomic" has been deprecated in favor of the newly supported
134 keyword "serial". The two are synonymous, but "atomic" is now provided only
135 for backward compatibility.
136 * It is no longer necessary to call __sdag_init() in chares that contain SDAG
137 code - the generated code does this automatically. The function is left as
138 a no-op for compatibility, but may be removed in a future version.
139 * Code generated from .ci files is now primarily in .def.h files, with only
140 declarations in .decl.h. This improves debugging, speeds compilation,
141 provides clearer compiler output, and enables more complete encapsulation,
142 especially in SDAG code.
143 * Mainchare constructors are expected to take CkArgMsg*, and always have
144 been. However, charmxi would allow declarations with no argument, and
145 assume the message. This is now deprecated, and generates a warning.
147 - Projections tracing has been extended and improved in various ways
148 * The trace module can generate a record of network topology of the nodes in
149 a run for certain platforms (including Cray), which Projections can
151 * If the gzip library (libz) is available when Charm++ is compiled, traces
152 are compressed by default.
153 * If traces were flushed as a result of filled buffers during the run, a
154 warning will be printed at exit to indicate that the user should be wary of
155 interference that may have resulted.
156 * In SMP builds, it is now possible to trace message progression through the
157 communication threads. This is disabled by default to avoid overhead and
158 potential misleading interpretation.
160 - Array elements can be block-mapped at the SMP node level instead of at the
161 per-PE level (option "+useNodeBlkMapping").
163 - AMPI can now privatize global and static variables using TLS. This is
164 supported in C and C++ with __thread markings on the variable declarations
165 and definitions, and in Fortran with a patched version of the gfortran
166 compiler. To activate this feature, append '-tls' to the '-thread' option's
167 argument when you link your AMPI program.
169 - Charm can now be built to only support message priorities of a specific data
170 type. This enables an optimized message queue within the runtime
171 system. Typical applications with medium sized compute grains may not benefit
172 noticeably when switching to the new scheduler. However, this may permit
173 further optimizations in later releases.
175 The new queue is enabled by specifying the data type of the message
176 priorities while building charm using --with-prio-type=dtype. Here, dtype can
177 be one of char, short, int, long, float, double and bitvec. Specifying bitvec
178 will permit arbitrary-length bitvector priorities, and is the current default
179 mode of operation. However, we may change this in a future release.
181 - Converse now provides a complete set of wrappers for
182 fopen/fread/fwrite/fclose to handle EINTR, which is not uncommon on the
183 increasingly-popular Lustre. They are named CmiF{open,read,write,close}, and
184 are available from C and C++ code.
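The retry idea behind such wrappers can be sketched in plain C++ as follows. This only illustrates retrying stdio calls on EINTR and is not the Converse implementation; the names fopenRetry, fwriteRetry, and freadRetry are hypothetical, and application code should call the CmiF* functions named above instead.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not part of Converse): retry fopen until it either
// succeeds or fails for a reason other than an interrupted system call.
inline std::FILE* fopenRetry(const char* path, const char* mode) {
    std::FILE* f = nullptr;
    do {
        errno = 0;
        f = std::fopen(path, mode);
    } while (f == nullptr && errno == EINTR);
    return f;
}

// Hypothetical helper: write `count` items of `size` bytes, resuming after
// a short write caused by EINTR. Returns the number of items written.
inline std::size_t fwriteRetry(const void* buf, std::size_t size,
                               std::size_t count, std::FILE* f) {
    assert(size > 0 && f != nullptr);
    const char* bytes = static_cast<const char*>(buf);
    std::size_t done = 0;
    while (done < count) {
        errno = 0;
        done += std::fwrite(bytes + done * size, size, count - done, f);
        if (done < count) {
            if (errno != EINTR) break;  // a real error: give up
            std::clearerr(f);           // interrupted: clear the flag, retry
        }
    }
    return done;
}

// Hypothetical helper: read up to `count` items, resuming after EINTR and
// stopping at end of file. Returns the number of items read.
inline std::size_t freadRetry(void* buf, std::size_t size,
                              std::size_t count, std::FILE* f) {
    assert(size > 0 && f != nullptr);
    char* bytes = static_cast<char*>(buf);
    std::size_t done = 0;
    while (done < count) {
        errno = 0;
        done += std::fread(bytes + done * size, size, count - done, f);
        if (done < count) {
            if (std::feof(f) || errno != EINTR) break;  // EOF or real error
            std::clearerr(f);
        }
    }
    return done;
}
```

The sketch deliberately leaves fclose alone, since retrying a close after an interruption is platform-dependent.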
186 - The utility class 'CkEntryOptions' now permits method chaining for cleaner
187 usage. This applies to all its set methods (setPriority, setQueueing,
188 setGroupDepID). Example usage can be found in examples/charm++/prio/pgm.C.
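The chaining pattern itself can be sketched in standalone C++ as follows. EntryOptionsSketch is a hypothetical stand-in written for this illustration, not the real CkEntryOptions class, and the integer field types are assumptions; the real usage is in the example program cited above.

```cpp
#include <cassert>

// Hypothetical stand-in for illustration only; this is NOT CkEntryOptions,
// and the plain-int fields are assumptions made to keep the sketch small.
class EntryOptionsSketch {
public:
    // Each setter returns a reference to *this, which is what allows
    // several set calls to be chained in a single expression.
    EntryOptionsSketch& setPriority(int p)   { priority_ = p;   return *this; }
    EntryOptionsSketch& setQueueing(int q)   { queueing_ = q;   return *this; }
    EntryOptionsSketch& setGroupDepID(int g) { groupDepID_ = g; return *this; }

    int priority() const   { return priority_; }
    int queueing() const   { return queueing_; }
    int groupDepID() const { return groupDepID_; }

private:
    int priority_ = 0;
    int queueing_ = 0;
    int groupDepID_ = 0;
};

// Builds a fully configured options object in one chained expression
// instead of three separate statements on a named temporary.
inline EntryOptionsSketch makeOptionsSketch() {
    return EntryOptionsSketch().setPriority(-5).setQueueing(2).setGroupDepID(7);
}
```

Returning a reference rather than void is the whole design choice: it costs nothing at runtime and lets the options be built inline at the call site.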
190 - When creating groups or chare arrays that depend on the previous construction
191 of another such entity on the local PE, it is now possible to declare that
192 dependence to the runtime. Creation messages whose dependence is not yet
193 satisfied will be buffered until it is.
195 - For any given chare class Foo and entry method Bar, the supporting class's
196 member CkIndex_Foo::Bar() is used to lookup/specify the entry method
197 index. This release adds a newer API for such members where the argument is a
198 function pointer of the same signature as the entry method. Those new
199 functions are used like CkIndex_Foo::idx_Bar(&Foo::Bar). This permits entry
200 point index lookup without instantiating temporary variables just to feed the
201 CkIndex_Foo::Bar() methods. In cases where Foo::Bar is overloaded, &Foo::Bar
202 must be cast to the desired type to disambiguate it.
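The cast needed for an overloaded Foo::Bar can be sketched in standalone C++ as follows. IndexSketch is a hypothetical stand-in for the generated CkIndex_Foo class, and the returned index values are arbitrary; only the static_cast on the member function pointer reflects the rule described above.

```cpp
#include <cassert>

// A class with an overloaded method, playing the role of a chare class
// whose entry method Bar has two signatures.
struct Foo {
    void Bar(int) {}
    void Bar(double) {}
};

// Hypothetical stand-in for generated index lookup functions: one overload
// per entry-method signature. The returned values are arbitrary.
struct IndexSketch {
    static int idx_Bar(void (Foo::*)(int))    { return 0; }
    static int idx_Bar(void (Foo::*)(double)) { return 1; }
};

// Because Foo::Bar is overloaded, a bare &Foo::Bar would be ambiguous here,
// so the pointer is cast to the exact member-function type wanted.
inline int indexOfIntBar() {
    return IndexSketch::idx_Bar(static_cast<void (Foo::*)(int)>(&Foo::Bar));
}

inline int indexOfDoubleBar() {
    return IndexSketch::idx_Bar(static_cast<void (Foo::*)(double)>(&Foo::Bar));
}
```

No Foo object and no dummy argument values are constructed at any point, which is the benefit the function-pointer form is meant to provide.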
204 - CkReduction::reducerType now has PUP methods defined, and can hence be
205 passed as a parameter-marshalled argument to entry methods.
207 - The runtime option +stacksize for controlling the allocation of user-level
208 threads' stacks now accepts shorthand notation such as 1M.
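How a size with a unit suffix might be expanded can be sketched as follows. This is not the runtime's parser: parseSizeSketch is a hypothetical name, and both the accepted suffixes (K, M, G, case-insensitive) and their interpretation as powers of 1024 are assumptions made for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical parser (not the Charm++ implementation). Assumes suffixes
// K, M, and G mean 2^10, 2^20, and 2^30 bytes. Returns -1 on bad input.
inline long long parseSizeSketch(const std::string& text) {
    char* end = nullptr;
    const long long value = std::strtoll(text.c_str(), &end, 10);
    if (end == text.c_str() || value < 0) return -1;  // no digits, or negative

    long long multiplier = 1;
    if (*end != '\0') {
        switch (std::toupper(static_cast<unsigned char>(*end))) {
            case 'K': multiplier = 1LL << 10; break;
            case 'M': multiplier = 1LL << 20; break;
            case 'G': multiplier = 1LL << 30; break;
            default:  return -1;                      // unknown suffix
        }
        if (end[1] != '\0') return -1;                // trailing characters
    }
    return value * multiplier;
}
```

Under these assumptions, "1M" expands to 1048576 bytes, while a plain number such as "65536" passes through unchanged.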
210 - The -optimize flag to the charmc compiler wrapper now passes more aggressive
211 options to the various underlying compilers than the previous '-O'.
213 - The charmc compiler wrapper now provides a flag -use-new-std to enable
214 support for C11 and C++11 where available. To use this in application code,
215 the runtime system must have been built with that flag as well.
217 - When using CmiMemoryUsage(), the runtime can be instructed not to use the
218 underlying mallinfo() library call, which can be inaccurate in settings where
219 usage exceeds INT_MAX. This is accomplished by setting the environment
220 variable "MEMORYUSAGE_NO_MALLINFO".
222 - Experimental Features
223 * Initial implementation of a fast message-logging protocol. Use option
224 'mlogft' to build it.
225 * Message compression support for persistent messages on the Gemini machine layer.
226 * Node-level inter-PE loop/task parallelization is now supported through
228 * New temperature/CPU frequency aware load balancer
229 * Support interoperation of Charm++ and native MPI code through dynamically
230 switching control between the two
231 * API in centralized load balancers to get and set PE speed
232 * A new scheme for optimization of double in-memory checkpoint/restart.
233 * Message combining library for improved fine-grained communication
235 * Support for partitioning of allocated nodes into subsets that run
236 independent Charm++ instances but can interact with each other.
238 Platform-Specific Changes
239 -------------------------
242 * The gemini_gni network layer has been heavily tuned and optimized,
243 providing substantial improvements in performance, scalability, and
245 * The gemini_gni-crayxe machine layer supports a 'hugepages' option at build
246 time, rather than requiring manual configuration file editing.
247 * Persistent message optimizations can be used to reduce latency and
249 * Experimental support for 'urgent' sends, which are sent ahead of any other
250 outgoing messages queued for transmission.
252 - IBM Blue Gene Q: Experimental machine-layer support for the native PAMI
253 interface and MPI, with and without SMP support. This supports many new
254 systems, including LLNL's Sequoia, ALCF's Mira, and FZ Juelich's Juqueen.
256 There are three network-layer implementations for these systems: 'mpi',
257 'pami', and 'pamilrts'. The 'mpi' layer is stable, but its performance and
258 scalability suffer from the additional overhead of using MPI rather than
259 driving the interconnect directly. The 'pami' layer is well tested for NAMD,
260 but has shown instability for other applications. It is likely to be replaced
261 by the 'pamilrts' layer, which is more generally stable and seems to provide
262 the same performance, in the next release.
264 In addition to the common 'smp' option to build the runtime system with
265 shared memory support, there is an 'async' option which sometimes provides
266 better performance on SMP builds. This option passes tests on 'pamilrts', but
267 is still experimental.
269 Note: Applications that have a large number of messages may crash in the default
270 setup due to overflow in the low-level FIFOs. Environment variables
271 MUSPI_INJFIFOSIZE and PAMI_RGETINJFIFOSIZE can be set to avoid application
272 failures due to a large number of small and large messages respectively. The
273 default value of these variables is 65536 which is sufficient for 1000
276 - Infiniband Verbs: Better support for more flavors of ibverbs libraries
279 * Experimental rendezvous protocol for better performance above some MPI
281 * Some tuning parameters ("+dynCapSend" and "+dynCapRecv") are now
282 configurable at job launch, rather than Charm++ compilation.
284 - PGI C++: Disable automatic 'using namespace std;'
286 - Charm++ now supports ARM, both non-smp and smp.
288 - Mac OS X: Compilation options to build and link correctly on newer versions
291 ================================================================================
292 What's new in Charm++ 6.4.0
293 ================================================================================
295 --------------------------------------------------------------------------------
297 --------------------------------------------------------------------------------
299 - Cray XE and XK systems using the Gemini network via either MPI
300 (mpi-crayxe) or the native uGNI (gemini_gni-crayxe)
302 - IBM Blue Gene Q, using MPI (mpi-bluegeneq) or PAMI (pami-bluegeneq)
304 - Clang, Cray, and Fujitsu compilers
306 - MPI-based machine layers can now run on >64k PEs
308 --------------------------------------------------------------------------------
310 --------------------------------------------------------------------------------
312 - Added a new [reductiontarget] attribute to enable
313 parameter-marshaled recipients of reduction messages
315 - Enabled pipelining of large messages in CkMulticast by default
317 - New load balancers added:
320 * Scotch graph partitioning based: ScotchLB and Refine and Topo variants
323 - Load balancing improvements:
325 * Allow reduced load database size using floats instead of doubles
326 * Improved hierarchical balancer
327 * Periodic balancing adapts its interval dynamically
328 * User code can request a callback when migration is complete
329 * More balancers properly consider object migratability and PE
330 availability and speed
331 * Instrumentation records multicasts
333 - Chare arrays support options that can enable some optimizations
335 - New 'completion detection' library for parallel process termination
336 detection, when the need for modularity excludes full quiescence
339 - New 'mesh streamer' library for fine-grain many-to-many collectives,
340 handling message bundling and network topology
342 - Memory pooling allocator performance and resource usage improved
345 - AMPI: More routines support MPI_IN_PLACE, and those that don't check
348 ================================================================================
349 What's new in Charm++ 6.2.1 (since 6.2.0)
350 ================================================================================
352 --------------------------------------------------------------------------------
353 New Supported Platforms:
354 --------------------------------------------------------------------------------
356 POWER7 with LAPI on Linux
358 Infiniband on PowerPC
360 --------------------------------------------------------------------------------
362 --------------------------------------------------------------------------------
364 - Better support for multicasts on groups
365 - Topology information gathering has been optimized
366 - Converse (seed) load balancers have many new optimizations applied
367 - CPU affinity can be set more easily using +pemap and +commap options
368 instead of the older +coremap
369 - HybridLB (hierarchical balancing for very large core-count systems)
370 has been substantially improved
371 - Load balancing infrastructure has further optimizations and bug fixes
372 - Object mappings can be read from a file, to allow offline
373 topology-aware placement
374 - Projections logs can be spread across multiple directories, speeding
375 up output when dealing with thousands of cores (+trace-subdirs N
376 will divide log files evenly among N subdirectories of the trace
377 root, named PROGNAME.projdir.K)
378 - AMPI now implements MPI_Issend
379 - AMPI's MPI_Alltoall uses a flooding algorithm more aggressively,
380 versus pairwise exchange
381 - Virtualized ARMCI support has been extended to cover the functions
384 --------------------------------------------------------------------------------
385 Architecture-specific changes
386 --------------------------------------------------------------------------------
388 - LAPI SMP has many new optimizations applied
390 - Net builds support the use of clusters' mpiexec systems for job
391 launch, via the ++mpiexec option to charmrun
393 ================================================================================
394 What's new in Charm++ 6.2.0 (since 6.1)
395 ================================================================================
397 --------------------------------------------------------------------------------
398 New Supported Platforms:
399 --------------------------------------------------------------------------------
401 64-bit MIPS, such as SiCortex, using mpi-linux-mips64
403 Windows HPC cluster, using mpi-win32/mpi-win64
405 Mac OSX 10.6, Snow Leopard (32-bit and 64-bit).
407 --------------------------------------------------------------------------------
409 --------------------------------------------------------------------------------
412 - Smarter build/configure scripts
413 - A new interface for model-based load balancing
414 - new CPU topology API
415 - a general implementation of CmiMemoryUsage()
416 - Bug fix: Quiescence detection (QD) works with immediate messages
417 - New reduction functions implemented in Converse
418 - CCS (Converse Client-Server) can deliver messages to more than one processor
419 - Added a memory-aware adaptive scheduler, which can be optionally
421 - Added preliminary support for automatic message prioritization
422 (disabled by default)
425 - Cross-array and cross-group sections
426 - Structured Dagger (SDAG): Support templated arguments properly
427 - Plain chares support checkpoint/restart (both in-memory and disk-based)
428 - Conditional packing of messages and parameters in SMP scenario
429 - Changes to the CkArrayIndex class hierarchy
430 -- sizeof() all CkArrayIndex* classes is now the same
431 -- Codes using custom array indices have to use placement-new to construct
432 their custom index. Refer to the example code: examples/charm++/hello/fancyarray/
433 -- *** Backward Incompatibility ***
434 CkArrayIndex[4D/5D/6D]::index are now of type int (instead of short)
435 However the data is stored as shorts. Access by casting
436 CkArrayIndexND::data() appropriately
437 -- *** Deprecated ***
438 The direct use of public data member
439 CkArrayIndexND::index (N=1..6) is deprecated. We reserve the right to
440 change/remove this variable in future releases of Charm++.
441 Instead, please access the indices via member function:
442 int CkArrayIndexND::data()
445 - Compilers renamed to avoid collision with host MPI (ampicc, ampiCC,
447 - Improved MPI standard conformance, and documentation of non-conformance
448 * Bug fixes in: MPI_Ssend, MPI_Cart_shift, MPI_Get_count
449 * Support MPI_IN_PLACE in MPI_(All)Reduce
450 * Define various missing constants
451 - Return the received message's tag in response to a non-blocking
452 wildcard receive, to support SuperLU
453 - Improved tracing for BigSim
455 Multiphase Shared Arrays (MSA)
456 - Typed handles to enforce phases
457 - Split-phase synchronization to enable message-driven execution
461 - Automatic tracing of API calls for simulation and analysis
464 - Wider support for architectures other than net- (in particular MPI layers)
465 - Improved support for large scale debugging (better scalability)
466 - Enhanced record/replay stability to handle various events, and to
467 signal unexpected messages
468 - New detailed record/replay: The full content of messages can be
469 recorded, and a single processor can be re-executed outside of the
473 - Tracing of nested entry methods
475 Automatic Performance Tuning
476 - Created an automatic tuning framework [still for experimental use only]
479 - Network-topology / node aware spanning trees used internally to put fewer
480 bytes on the network and improve performance in multicasts and
481 reductions delegated to this library
484 - Improved OneTimeMulticastStrategy classes
487 - Out-of-core support, with prefetching capability
488 - Detailed tracing of MPI calls
489 - Detailed record/replay support at emulation time, capable of
490 replaying any emulated processor after obtaining recorded logs.
492 --------------------------------------------------------------------------------
493 Architecture-specific changes
494 --------------------------------------------------------------------------------
497 - Can run jobs with more than 1024 PEs
500 - New charmrun option ++no-va-randomization to disable address space
501 randomization (ASLR). This is most useful for running AMPI with
505 - Default to using ampicxx instead of mpiCC
508 - The +p option now has the same semantics as in other smp builds
511 - Support for VSX in SIMD abstraction API
514 - Compilers and options have been updated to the latest ones
517 - Added routines for measuring performance counters on BG/P.
518 - Updated to support latest DCMF driver version. On ANL's Intrepid, you may
519 need to set BGP_INSTALL=/bgsys/drivers/V1R4M1_460_2009-091110P/ppc in your
520 environment. This is the default on ANL's Surveyor.
523 - cputopology information is now available on XT3/4/5
526 - Bug fix: plug memory leaks that caused failures in long runs
527 - Optimized to reduce startup delays
530 - Support for SMP (experimental)
533 ================================================================================
534 Note that changes from 5.9, 6.0, and 6.1 are not documented here. A partial list
535 can be found on the charm download page, or by reading through version control
538 ================================================================================
539 What's New since Charm++ 5.4 release 1
540 ================================================================================
542 --------------------------------------------------------------------------------
543 New Supported Platforms:
544 --------------------------------------------------------------------------------
545 1. Charm++ ported to IA64 Itanium running Win2K and Linux. Charm++ also supports
546 Intel C/C++ compilers;
548 2. Charm++ ported to Power Macintosh powerpc running Darwin;
550 3. Charm++ ported to Myrinet networking with GM API;
552 --------------------------------------------------------------------------------
553 Summary of New Features:
554 --------------------------------------------------------------------------------
556 Structured Dagger is a coordination language built on top of CHARM++.
557 Structured Dagger allows easy expression of dependences among messages and
558 computations and also among computations within the same object using
559 when-blocks and various structured constructs.
561 2. Entry functions support parameter marshalling
562 Now you can declare and invoke remote entry functions using parameter
563 marshalling instead of defining messages.
565 3. Easier running - standalone mode
566 For net-* version running locally, you can now run Charm programs without
567 charmrun. Running a node program directly from command line is now the
568 same as "charmrun +p1 <program>"; for SMP version, you can also specify
569 multiple (local) processors, as in "program +p2".
572 --------------------------------------------------------------------------------
574 --------------------------------------------------------------------------------
575 1. "build" changed for compilation of Charm++
576 To build Charm++ from scratch, the build script now takes additional command line
577 options to compile with add-on features and to use compilers other than gcc.
578 For example, to build Linux IA64 with Myrinet support, type the command:
579 ./build net-linux-ia64 gm
582 ******* Old Change histories *******
585 ================================================================================
586 What's New in Charm++ 5.4 release 1 since 5.0
587 ================================================================================
589 --------------------------------------------------------------------------------
590 New Supported Platforms:
591 --------------------------------------------------------------------------------
593 1. Win9x/2000/NT: with Visual C++ or Cygwin gcc/g++, you can compile and run
594 Charm++ programs on all Win32 platforms.
596 2. Scyld Beowulf: Charm++ has been ported to the Linux-based Scyld Beowulf
597 operating system. For more information on Scyld, see <http://www.scyld.com>
599 3. MPI with VMI: Charm++ has been ported to NCSA's Virtual Machine Interface,
600 which is an efficient messaging library for heterogeneous cluster
604 --------------------------------------------------------------------------------
605 Summary of New Features:
606 --------------------------------------------------------------------------------
607 1. Dynamic Load balancing:
608 Chare migration is supported in the new release. Migration-based dynamic
609 load balancing framework with various load balancing strategies library has
613 Charm++ array is supported. You can now create an array of Chare objects
614 and use an array index to refer to the Charm++ array elements. A reduction
615 library on top of Chare array has been implemented and included.
618 Projections, a Java application for Charm++ program performance analysis and
619 visualization, has been included and distributed in the new release. Two
620 trace modes are available: trace-projections and trace-summary. Trace-summary
621 is a light-weight trace library compared to trace-projections.
624 AMPI is a load-balancing based library for porting legacy MPI applications
625 to Charm++. With few changes to the original MPI code, a
626 legacy MPI application running on AMPI will gain from Charm++'s adaptive
627 load balancing ability.
630 "Charmrun" is now available on all platforms, with a uniform command line
631 syntax. You can forget the difference between net-* versions and MPI versions,
632 and run charm++ application with this same charmrun command syntax.
633 The ++local option has been added to charmrun for net-* versions. It provides
634 simple local use of Charm and no longer requires the ability to
635 "rsh localhost" or a nodelist file in order to run charm only on the local
636 machine. This is especially attractive when you run Charm++ on Windows.
639 Many new libraries have been added in this release. They include:
640 1) master-slave library: for writing manager-worker paradigm programs.
641 2) receiver library: provides asynchronous communication mode for chare array.
642 3) f90charm: provides Fortran90 bindings for Charm++ Array.
643 4) BlueGene: a Charm++/Converse emulator for IBM proposed Blue Gene.
645 --------------------------------------------------------------------------------
647 --------------------------------------------------------------------------------
648 1. message declaration syntax in .ci file:
649 The message declaration syntax for packed/varsize messages has been changed.
650 The packed/varsize keywords are eliminated, and you can specify the actual
651 actual varsize arrays in the interface file and have the translator generate
652 alloc, pack and unpack.
655 Here is the detailed list of Changes:
657 --------------------------------------------------------------------------------
659 --------------------------------------------------------------------------------
661 10/06/1999 rbrunner Added migration-based dynamic load balancing
663 11/15/1999 olawlor Added reduction support for Charm++ arrays
664 02/06/2000 milind Added AMPI, an implementation of MPI with
665 dynamic load balancing
666 02/18/2000 paranjpy New platforms supported: net-win32, and net-win32-smp
667 04/04/2000 olawlor Added arbitrarily indexed Charm++ arrays.
668 Also, added translator support for new arrays.
669 04/15/2000 olawlor Added "puppers" for packing and unpacking
671 06/14/2000 milind Added the threaded FEM framework.
673 --------------------------------------------------------------------------------
675 --------------------------------------------------------------------------------
677 10/09/1999 rbrunner Added packlib, a library for C and C++ to
678 pack-unpack data to/from Charm++ messages.
679 10/13/1999 gzheng New LB strategy: RefineLB
680 10/13/1999 paranjpy New LB Strategy: Heap
681 10/14/1999 milind New LB Strategy: Metis
682 10/19/1999 olawlor New test program for testing LB strategies.
683 10/21/1999 gzheng New trace mode: trace-summary
684 10/28/1999 milind New supported platform: net-sol-x86
685 10/29/1999 milind Added runtime checks for ChareID assignment.
686 11/10/1999 rbrunner Added Neighborhood base strategy for LB
688 11/15/1999 olawlor conv-host now reads in a startup file
690 11/15/1999 olawlor New test program for testing array reductions.
691 11/16/1999 rbrunner Added processor-speed checking functions to
693 11/19/1999 milind Mapped SIGUSR to a Ccd condition handler
694 11/22/1999 rbrunner New LB strategy: WSLB
695 11/29/1999 ruiliu Modified Metis LB strategy to deal with
696 different processor speeds
697 12/16/1999 rbrunner New LB strategy: GreedyRef
698 12/16/1999 rbrunner New LB strategy: RandRef
699 12/21/1999 skumar2 New LB strategy: CommLB
700 01/03/2000 rbrunner New LB strategy: RecBisectBfLB
701 01/08/2000 skumar2 New LB strategy: Comm1LB, with varying processor
703 01/18/2000 milind Modified SM library syntax, and added a test
705 01/19/2000 gzheng Added irecv, a library to simplify conversion
706 of message-passing programs to Charm++
707 02/20/2000 olawlor Added preliminary broadcast support to Charm++
709 02/23/2000 paranjpy Added converse-level quiescence detection
710 03/02/2000 milind Added ++server-port option to pre-specify
712 03/10/2000 wilmarth Random seed-based load balancer now uses
713 bit-vector for active PEs.
714 03/21/2000 gzheng Added support for marking user-defined events
716 03/28/2000 wilmarth Added CMK_TRUECRASH. Very helpful for
717 post-mortem debugging of Charm++ programs on
719 03/31/2000 jdesouza Added Fortran90 support to the Charm++
720 interface translator.
721 03/09/2000 milind Added support for -LANG and -rpath options
722 in charmc for Origin2000.
723 04/28/2000 milind Added prioritized converse threads.
724 05/01/2000 milind Added test programs for TeMPO, AMPI and irecv.
725 05/04/2000 milind New supported platform: mpi-sp.
726 05/04/2000 gzheng Added irecv pingpong program.
727 05/17/2000 olawlor Each chare, group and array element now has to
728 have a migration constructor.
729 05/24/2000 milind Added Jacobi3D programs for both irecv and AMPI.
730 05/24/2000 milind Made migratable an optional attribute of
731 chares, groups, and nodegroups.
732 Arrays are by default migratable.
733 05/29/2000 paranjpy Added pup methods to arrays, reductions etc
735 06/13/2000 milind Made CtvInitialize idempotent. That is, it
736 can be called by any number of threads now,
737 only the first one will actually do
739 06/20/2000 milind Added a simple test program for the FEM
741 07/06/2000 milind Imported Metis 4.0 sources in the CVS tree.
742 Also added code to make metis libraries and
743 executables to Makefile.
744 07/07/2000 milind Added more meaningful error messages using
745 perror in addition to the cryptic error codes in
747 07/10/2000 milind fem and femf are now recognized as "languages"
749 07/10/2000 saboo Added the derived datatypes library.
750 07/13/2000 milind Added +idle_timeout functionality. It takes a
751 commandline parameter denoting milliseconds of
752 maximum consecutive idle time allowed per
754 07/14/2000 milind Added group multicast. Added
755 CkSendMsgBranchMulti, CldEnqueueMulti, and
756 translator changes to support it.
757 07/14/2000 milind SUPER_INSTALL now takes "-*" arguments prior
758 to the target, that will be passed to make as
759 "makeflags". This makes it easy to suppress
760 make's output of commands etc (with the -s
761 flag). As a result of this, several Makefiles
763 07/18/2000 milind Added support for using "dbx" on suns as
765 07/19/2000 milind Added ability to tracemode projections to
766 produce binary trace files. Use flag
767 +binary-trace on the command line.
768 07/26/2000 milind Separated AMPI from TeMPO.
769 07/28/2000 milind Added test programs to test reduce, alltoall
770 and allreduce functionality of AMPI.
771 08/02/2000 milind Added an option to let the user specify which
772 "xterm" to use. For example, on some systems
773 (CDE), only dtterm is installed. So, by
774 putting ++xterm dtterm on the conv-host
775 commandline, one can use dtterm when ++in-xterm
776 option is specified on conv-host commandline.
777 08/14/2000 milind FEM Framework: Added capabilities to handle
778 esoteric meshes to standalone offline programs.
779 Makefile now produces gmap and fgmap programs,
780 which are used for this purpose. They convert
781 the mesh to a graph before partitioning it
783 08/24/2000 milind Added the 2D crack propagation program as a
784 test program for FEM framework.
785 08/25/2000 milind Initial implementation of isomalloc-based
786 threads. This implementation uses a fixed
787 stack size for all threads (can be set at
789 08/26/2000 milind Added a macro CtvAccessOther that lets you
790 get/set a Ctv variable of any thread. It
791 should be invoked as CtvAccessOther(thread,
792 varname); Added CthGetData function to each of
793 the threads implementation. This function is
794 used in the CtvAccessOther macro.
795 08/27/2000 milind FEM Framework: Separated mesh to graph
796 conversion capability into a separate program.
797 This way, the generated graph can be partitioned
799 09/04/2000 milind Added the class static readonly variables to
801 09/05/2000 milind FEM Framework: A very fast O(n) algorithm for
802 mesh2graph, uses more memory, but the tradeoff
803 was worth it. Coded by Karthik Mahesh, minor
804 optimizations by Milind.
805 09/05/2000 milind Added a barebones charm kernel scheduling
806 overhead measurement program.
807 09/15/2000 milind Added pup support for AMPI and FEM framework.
808 09/20/2000 olawlor Added capability to have an array of base type
809 where individual element could be of derived
811 10/03/2000 gzheng New supported platform: net-linux-axp
812 10/05/2000 skumar2 Added program littleMD to the test suite.
813 10/07/2000 skumar2 New job scheduler (Faucets projects).
814 10/15/2000 milind Improved support for Fortran90 in charmc.
815 11/04/2000 jdesouza Made the Faucets scheduler multi-threaded.
816 11/05/2000 olawlor FEM Framework: supports multiple element types,
817 mesh re-assembly, etc.
818 11/15/2000 gzheng New platform support: net-cygwin
819 11/18/2000 gzheng conv-host no longer needs /bin/csh to start
821 CMK_CONV_HOST_CSH_UNAVAILABLE to 1 to use
823 11/25/2000 milind Finished experimental implementation of
824 converse-threads based on co-operative pthreads.
825 11/25/2000 milind Added a benchmark suite of all pingpongs in
827 11/28/2000 milind Removed deletion of _idx at the end of every
828 send or doneInserting call. Instead now it is
829 in the destructor of the proxy. This allows us
830 to cache proxies, when proxy creation becomes
832 11/28/2000 olawlor Added "seek blocks" to puppers. This should
833 allow out-of-order pup'ing without the ugliness
834 of getBuf; and in a way that works with all
836 11/29/2000 olawlor Simplified and regularized command-line-argument
838 11/29/2000 milind AMPI: Added multiple-communicators capability.
839 12/05/2000 gzheng Now /bin/sh is default shell to fork node
840 program on remote machines.
841 12/13/2000 olawlor Added charmrun wrapper for poe on mpi-sp.
842 12/14/2000 milind Added bluegene emulator sources and test
843 programs. Added "bluegene" as a language known
844 to charmc. Makefile now has a target called
845 bluegene. Added preliminary bluegene
846 documentation. (copied from Arun's webpage.)
847 12/15/2000 gzheng f90charm addition to Makefile and charmc. Also,
848 added fixed size arrays support to f90charm. A
849 test program f90charm/hello is checked in.
850 12/17/2000 milind Added rtest test program. Contributed by jim to
851 test Converse message transmission.
852 12/20/2000 olawlor Added charmconfig script. Enables automatic
853 determination of C++ compiler properties,
854 replacing the verbose and error-prone
855 conv-mach.h entries for CMK_BOOL,
856 CMK_STL_USE_DOT_H, CMK_CPP_CAST_OK, ...
857 12/20/2000 olawlor Charm++ Arrays optimizations: Key and object
858 now variable-length fields, instead of pointers.
859 This extra flexibility lets us save many
860 dynamic allocations in the array framework.
861 12/20/2000 olawlor Added PUP::able support-- dynamic type
862 identification, allocation, and deletion.
863 Allows you to write: p(objPtr); and
864 objPtr will be properly identified,
865 allocated, packed, and deallocated (depending
866 on the PUP::er). Requires you to register any
867 such classes with DECLARE_PUPable and
869 12/20/2000 olawlor Arrays optimizations: Made CkArrayIndex
870 fixed-size. This significantly improves
871 messaging speed (7 us instead of 10 us
872 roundtrip). Move spring cleaning check into a
873 CcdCallFnAfter, which gains more speed (down to
875 12/20/2000 olawlor More optimizations: Minor speed tweaks--
876 conv-ccs.c uses hashtable for handler lookup;
877 conv-conds skips timer test until needed;
878 convcore.c scheduler loop optimizations (no
879 superfluous EndIdle calls); threads.c
880 CMK_OPTIMIZE-> no mprotect.
881 12/20/2000 olawlor More Optimizations: Minor speed tweaks-- ck.C
882 groups cldEnqueue skip; init.h defines
883 CkLocalBranch inline; and supporting changes.
884 12/22/2000 gzheng IA64 support for Converse user level threads.
885 01/02/2001 olawlor CCS: Minor update-- enabled CcsProbe, cleaned
886 up superfluous debug messages in server, added
887 Java interface (originally written for
889 01/09/2001 gzheng charmconfig converted to autoconf style, need
890 to change configure.in and conv-autoconfig.h.in,
891 and run autoconf to get configure and copy to
892 charmconfig. added fortran subroutine name
893 test and get libpthread.a
894 01/10/2001 milind Added telnet method of getting libpthread.a
895 from charm webserver.
896 01/11/2001 olawlor Moved projections files here from
897 CVSROOT/projections-java. Added fast Java
898 versions of the .log file input routines in
899 LogReader, LogLoader, LogAnalyzer, and
900 UsageCalc. Added "U.java" user interface
901 utility file, allowing times to be input in
902 seconds, milliseconds, or microseconds,
903 instead of just microseconds.
904 01/15/2001 gzheng Added +trace-root to specify the directory to
905 put log files in. This is needed on Scyld clusters,
906 where there is no NFS mounting and no I/O
907 access to home directory sharing on nodes.
908 01/15/2001 milind Made AMPI into a f90 module instead of
909 'ampif.h' inclusion. AMPI f90 bindings are
910 now more inclusive. Fixed argc,argv handling
911 bugs in ArgsInfo message. Fixed a bug in pup
912 that caused thread not to be sized, but was
913 packed nevertheless. Moved irecv to waitall
914 instead of at in ampi_start. Made
915 AMPI_COMM_WORLD to be 0, because it clashed
916 with wildcard(-1). AMPI_COMM_UNIVERSE is now
917 handled properly in the AMPI module.
918 C/C++ data members are NOT visible to
920 01/18/2001 gzheng New supported platform: net-linux-scyld
921 01/20/2001 olawlor Moved array index field from CMessage_* to the
922 Ck envelope itself. This is the right thing
923 to do, because any message may be sent to/from
924 an array element. To reduce the wasted space
925 in a message, a union is used to overlay the
926 fields for the various possible message types.
927 01/29/2001 olawlor Freed charmrun on net-* version from using
928 remote shell to fork off processes. One can now
929 use a daemon provided in the distribution.
930 02/07/2001 olawlor Added debugging support to puppers.
931 02/13/2000 gzheng Added ++local option to charmrun to start node
932 program locally without any daemon; fixed the
933 hang that occurred if you typed a wrong pgm name in the
934 scyld version, and redirected all output to
935 /dev/null, since otherwise all node programs can send
936 their output to the console in scyld. Also implemented ++local in net-win32 version.
937 02/26/2000 milind Changed the varsize syntax. Now one can specify
938 actual varsize arrays in the interface file
939 and have the translator generate alloc, pack
942 --------------------------------------------------------------------------------
944 --------------------------------------------------------------------------------
946 10/29/1999 milind Replaced jmemcpy by memcpy in net versions, as
947 it was causing a bit to flip (bug reported
949 10/29/1999 milind Fixed multiline macros in all header files.
950 02/05/2000 milind Fixed linking errors by getting the order of
951 libraries right from the charmc command-line.
952 02/18/2000 paranjpy Fixed Charm++ initialization bug on SMPs.
953 02/21/2000 milind Fixed a context-switching bug in mipspro version
955 02/25/2000 milind Charm++ interface translator was segfaulting
956 on interface file errors. Fixed that. Also,
957 added linenumbers to error messages.
958 03/02/2000 milind Made CCS work on SMPs.
959 03/07/2000 milind Made ConverseInit consistent with the manual on
961 04/18/2000 milind Fixed a bug in CkWaitFuture, which was caching
962 a variable locally, while it was changed by
964 05/04/2000 paranjpy Fixed argv deletion bug on net-win32-smp.
965 06/08/2000 milind sp3 version: changed optimization flags, which
966 were power2 processor-specific.
967 06/20/2000 milind mpi-* versions: Fixed ConverseExit since it was
968 not obeying the following statement in the MPI
969 standard: The user must ensure that all pending
970 communications involving a process completes
971 before the process calls MPI_FINALIZE.
972 07/05/2000 milind Fixed a nasty bug in charmc in the -cp option.
973 It used to append the name provided to -o flag
974 to the directory provided to the -cp flag.
975 Thus, -o ../pgm -cp ../bin options meant that
976 the pgm would be copied to ../bin/.., which is
977 not the expected behavior. This fix correctly
978 copies pgm to ../bin.
979 07/07/2000 milind Removed variable arg_myhome, as it was not
980 being used anywhere, and also, setting it was
981 causing problems if env var HOME was not set.
982 07/27/2000 milind thishandle for the arrayelement was not being
983 correctly set. Bug was reported by Neelam.
984 08/26/2000 milind Origin2000: Changed the page alignment to
985 reflect the mmap alignment. The mmap man page
986 specifically states that it is not the same as
988 09/02/2000 milind Fixed a bug in code generated for threaded
989 (void) entry methods of array elements. The
990 dummy message that is passed to that method in
991 a thread has to be deleted before calling the
992 object method, because upon object method's
993 return, the thread might have migrated.
994 09/03/2000 olawlor Minor fix-fixes: 1.) Change to LBObjid hash
995 function would fail for >4-int object indices.
996 Replaced with proper function, which also
997 preserves the 1-int case. 2.) Array element
998 sends must go via the message queue to prevent
999 stack build-up for deep single-processor call
1000 chains. These might happen, e.g., in a driver
1001 element calling itself for the main time loop.
1002 Messages are now properly noted as sent, then
1003 wait through the queue for delivery. This
1004 entailed minor reorganization of the message
1006 09/21/2000 olawlor Tiny SMP thread fix-- registrations of a
1007 thread-private variable now reserve space on
1008 calls after the first. This wastes space for
1009 multiple CthInitialize's-- it's a quick hack to
1010 get threads working again on SMP versions.
1011 10/16/2000 olawlor A few CCS fixes: -Added split-phase reply
1012 (delay reply indefinitely) -Cleaned up error
1013 handling -Pass user data as "void *" instead of
1015 11/03/2000 wilmarth Removed 0 size array allocation in Charm++
1016 quiescence detection.
1017 11/20/2000 gzheng Rewrote part of Fiber thread, including a bug
1018 fix for a non-thread-safe function, and a
1019 different fiber free strategy.
1020 11/29/2000 gzheng The LB init procedure tried to allocate
1021 65536*160 as initial size, which is 10M memory
1022 for communication table, which is too big.
1023 Cut it down to roughly 1M, and it can expand
1025 12/05/2000 gzheng In many cases, conv-host exited without printing
1026 the error message from the remote shell. Tried
1027 to fix it by calling sync to flush the pipe
1029 12/10/2000 milind net-linux: Made static linking the default
1030 option because dynamic linking runtime causes
1031 isomalloc threads to crash.
1032 12/18/2000 milind Increased portability of isomalloc threads by
1033 removing dependence on alloca.
1034 12/28/2000 milind Fixed ctrl-getone abort bug on SMP.
1035 12/28/2000 milind Made _groupTable a pointer on which a
1036 constructor is explicitly called. Since it
1037 was a Cpv variable, its constructor was not
1038 called by default in case of an SMP version.
1039 12/29/2000 olawlor Prevent infinite copy constructor recursion on
1041 01/10/2001 olawlor Added "explicit" keyword to remove ambiguity
1042 for KCC, which was confused by the private
1043 PUP::er(int) "cast" constructor and the operator
1044 |(PUP::er &p,T &t) into rejecting all operator|
1045 (int,int) as ambiguous.
1046 01/17/2001 gzheng Fixed the charmconfig bug on paragon-red: the
1047 failure testing of fortran won't stop the
1049 01/20/2001 olawlor Arrays reduction: Fixed bug-- reduction may end
1050 because all contributors migrate away.
1051 01/29/2001 olawlor Fix heap-corrupting bug-- call ->init() on
1052 nodeGroupTable, which sets the "pending"
1053 message queue to NULL. This prevents a nasty
1054 delete-uninitialized-data bug later on. Also
1055 delayed queue creation until messages actually
1058 --------------------------------------------------------------------------------
1059 Documentation Changes:
1060 --------------------------------------------------------------------------------
1062 01/31/2000 milind Installation manual: Fixed bugs pointed out by
1064 02/28/2000 wilmarth Added a new look Charm++ manual.
1065 06/20/2000 milind Added pdflatex support to generate PDF versions
1066 of manuals from LaTeX sources.
1067 12/05/2000 milind Added Orion's FEM manual. Converted from HTML.
1068 12/10/2000 milind Added pplmanual.sty for all manuals.
1069 12/17/2000 milind Added master-slave library documentation to
1071 12/21/2000 saboo Added DDT documentation.
1072 01/02/2001 olawlor Updated for new CCS version.
1074 --------------------------------------------------------------------------------
1076 --------------------------------------------------------------------------------
1078 10/24/1999 olawlor charmc is changed to Bourne shell script
1079 instead of csh. All conv-mach.csh are
1080 replaced by conv-mach.sh.
1081 10/25/1999 olawlor SUPER_INSTALL is converted to use Bourne shell.
1082 10/28/1999 milind All Makefiles now take OPTS commandline
1084 01/16/2000 olawlor Simplified Charm++ interface translator.
1085 02/23/2000 ruiliu Changed rand() calls from all over the codes
1086 to the new Converse random number generator.
1087 02/26/2000 milind Simplified the converse scheduler loop by
1088 combining the maxmsgs and poll modes.
1089 08/31/2000 milind Imported system documentation into the CVS tree.
1090 Also added super_install target for docs with
1091 necessary Makefile modifications.
1092 09/08/2000 olawlor Made soft links use relative pathnames instead
1093 of absolute. This lets you move a charm++
1094 installation without having to recompile
1096 09/11/2000 olawlor Grouped commonly needed code in the new util
1097 directory. Also, added pup_c, a C wrapper for
1099 09/11/2000 olawlor Slightly reorganized header structure. Now no
1100 headers should need to be listed twice (once in
1101 ALLHEADERS, again in CKHEADERS). Now headers
1102 are soft-linked instead of copied. This makes
1103 development much easier. Added support for the
1104 new Common/util directory.
1105 09/21/2000 olawlor Major reorganization of net-* codes. Now all
1106 the TCP socket routines are in separate files.
1107 Also combined Windows NT code with unix codes.
1108 09/21/2000 olawlor Major rewrite of CCS-- underlying protocol is
1109 now binary (send/recv binary data everywhere);
1110 conv-host forwards requests to nodes; and
1111 source has been significantly re-arranged.
1112 (especially if NODE_0_IS_CONVHOST).
1113 11/22/2000 milind Removed IDL translator from distribution.
1114 12/01/2000 olawlor Renamed conv-host to charmrun; added test for
1115 script conv-host. Also added charmrun for most
1117 12/17/2000 milind Moved List related data structures into
1118 cklists.h in util. Removed most of the redundant
1119 list implementations.
1120 12/20/2000 gzheng SUPER_INSTALL: format the output of list of
1121 versions and make the help page fit into one
1123 12/24/2000 milind Added test-{charm,converse,ampi,fem} targets to
1125 12/28/2000 milind net-sol-smp now uses pthreads.
1126 01/29/2001 olawlor Merged windowsNT and unix build procedures by
1127 basing the Windows build on cygwin. Added
1128 scripts to deal with unix and windows