1 This file describes the most significant changes. For more detail, use
2 'git log' on a clone of the charm repository.
4 ================================================================================
5 What's new in Charm++ 6.6.0
6 ================================================================================
11 * The long-unsupported FEM library has been deprecated in favor of ParFUM
12 * The CmiBool typedefs have been deleted, as C++ bool has long been universal
14 ================================================================================
15 What's new in Charm++ 6.5.0
16 ================================================================================
18 - The Charm++ manual has been thoroughly revised to improve its organization,
19 comprehensiveness, and clarity, with many additional example code snippets
22 - The runtime system now includes the 'Metabalancer', which can provide
23 substantial performance improvements for applications that exhibit dynamic
24 load imbalance. It provides two primary benefits. First, it automatically
25 optimizes the frequency of load balancer invocation, to avoid work stoppage
26 when it will provide too little benefit. Second, calls to AtSync() are made
27 less synchronous, to further reduce overhead when the load balancer doesn't
28 need to run. To activate the Metabalancer, pass the option +MetaLB at
29 runtime. To get the full benefits, calls to AtSync() should be made at every
30 iteration, rather than at some arbitrary longer interval as was previously
33 - Many feature additions and usability improvements have been made in the
34 interface translator that generates code from .ci files:
35 * Charmxi now provides much better error reports, including more accurate
36 line numbers and clearer reasons for failure, including some semantic
37 problems that would otherwise appear when compiling the C++ code or even at
39 * A new SDAG construct 'case' has been added that defines a disjunction over a
40 set of 'when' clauses: only one 'when' out of a set will ever be triggered.
41 * Entry method templates are now supported. An example program can be found
42 in tests/charm++/method_templates/.
43 * SDAG keyword "atomic" has been deprecated in favor of the newly supported
44 keyword "serial". The two are synonymous, but "atomic" is now provided only
45 for backward compatibility.
46 * It is no longer necessary to call __sdag_init() in chares that contain SDAG
47 code - the generated code does this automatically. The function is left as
48 a no-op for compatibility, but may be removed in a future version.
49 * Code generated from .ci files is now primarily in .def.h files, with only
50 declarations in .decl.h. This improves debugging, speeds compilation,
51 provides clearer compiler output, and enables more complete encapsulation,
52 especially in SDAG code.
53 * Mainchare constructors are expected to take CkArgMsg*, and always have
54 been. However, charmxi would allow declarations with no argument, and
55 assume the message. This is now deprecated, and generates a warning.
57 - Projections tracing has been extended and improved in various ways
58 * The trace module can generate a record of network topology of the nodes in
59 a run for certain platforms (including Cray), which Projections can
61 * If the gzip library (libz) is available when Charm++ is compiled, traces
62 are compressed by default.
63 * If traces were flushed as a result of filled buffers during the run, a
64 warning will be printed at exit to indicate that the user should be wary of
65 interference that may have resulted.
66 * In SMP builds, it is now possible to trace message progression through the
67 communication threads. This is disabled by default to avoid overhead and
68 potential misleading interpretation.
70 - Array elements can be block-mapped at the SMP node level instead of at the
71 per-PE level (option "+useNodeBlkMapping").
73 - AMPI can now privatize global and static variables using TLS. This is
74 supported in C and C++ with __thread markings on the variable declarations
75 and definitions, and in Fortran with a patched version of the gfortran
76 compiler. To activate this feature, append '-tls' to the '-thread' option's
77 argument when you link your AMPI program.
79 - Charm can now be built to only support message priorities of a specific data
80 type. This enables an optimized message queue within the runtime
81 system. Typical applications with medium sized compute grains may not benefit
82 noticeably when switching to the new scheduler. However, this may permit
83 further optimizations in later releases.
85 The new queue is enabled by specifying the data type of the message
86 priorities while building charm using --with-prio-type=dtype. Here, dtype can
87 be one of char, short, int, long, float, double and bitvec. Specifying bitvec
88 will permit arbitrary-length bitvector priorities, and is the current default
89 mode of operation. However, we may change this in a future release.
91 - Converse now provides a complete set of wrappers for
92 fopen/fread/fwrite/fclose to handle EINTR, which is not uncommon on the
93 increasingly-popular Lustre. They are named CmiF{open,read,write,close}, and
94 are available from C and C++ code.
96 - The utility class 'CkEntryOptions' now permits method chaining for cleaner
97 usage. This applies to all its set methods (setPriority, setQueueing,
98 setGroupDepID). Example usage can be found in examples/charm++/prio/pgm.C.
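As a rough illustration of the chaining style, here is a stand-in class whose setters return a reference to the object. EntryOptionsSketch and its plain int arguments are assumptions made for this sketch; the real CkEntryOptions setters take Charm++ types, and the example program named above shows actual usage.

```cpp
// Stand-in for illustration only; the real class is CkEntryOptions, and its
// setters take Charm++ types rather than the plain ints used here.
class EntryOptionsSketch {
 public:
  // Each setter returns *this so that calls can be chained.
  EntryOptionsSketch& setPriority(int p) { priority_ = p; return *this; }
  EntryOptionsSketch& setQueueing(int q) { queueing_ = q; return *this; }
  EntryOptionsSketch& setGroupDepID(int id) { groupDepID_ = id; return *this; }

  int priority() const { return priority_; }
  int queueing() const { return queueing_; }
  int groupDepID() const { return groupDepID_; }

 private:
  int priority_ = 0;
  int queueing_ = 0;
  int groupDepID_ = 0;
};
```

With this shape, a single statement such as opts.setPriority(-5).setQueueing(1).setGroupDepID(3) replaces three separate calls.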
100 - When creating groups or chare arrays that depend on the previous construction
101 of another such entity on the local PE, it is now possible to declare that
102 dependence to the runtime. Creation messages whose dependence is not yet
103 satisfied will be buffered until it is.
105 - For any given chare class Foo and entry method Bar, the supporting class's
106 member CkIndex_Foo::Bar() is used to lookup/specify the entry method
107 index. This release adds a newer API for such members where the argument is a
108 function pointer of the same signature as the entry method. Those new
109 functions are used like CkIndex_Foo::idx_Bar(&Foo::Bar). This permits entry
110 point index lookup without instantiating temporary variables just to feed the
111 CkIndex_Foo::Bar() methods. In cases where Foo::Bar is overloaded, &Foo::Bar
112 must be cast to the desired type to disambiguate it.
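The cast needed for overloaded entry methods is ordinary C++ and can be sketched without the runtime. The struct Foo and the entry_name overloads below are hypothetical stand-ins, assuming one lookup overload per entry-method signature; they are not generated CkIndex code.

```cpp
#include <string>

// Stand-in chare class with an overloaded method Bar and a plain method Baz.
struct Foo {
  void Bar(int) {}
  void Bar(double) {}
  void Baz() {}
};

// Hypothetical lookups keyed on the member-function-pointer type, standing in
// for functions of the form CkIndex_Foo::idx_Bar(&Foo::Bar).
inline std::string entry_name(void (Foo::*)(int)) { return "Bar(int)"; }
inline std::string entry_name(void (Foo::*)(double)) { return "Bar(double)"; }
inline std::string entry_name(void (Foo::*)()) { return "Baz()"; }

// Baz is not overloaded, so the bare address is enough.
inline std::string lookup_baz() { return entry_name(&Foo::Baz); }

// Bar is overloaded, so &Foo::Bar alone is ambiguous; the cast picks one.
inline std::string lookup_bar_int() {
  return entry_name(static_cast<void (Foo::*)(int)>(&Foo::Bar));
}
inline std::string lookup_bar_double() {
  return entry_name(static_cast<void (Foo::*)(double)>(&Foo::Bar));
}
```

No Foo object is constructed anywhere in the sketch, which is the point of the pointer-based lookup.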
114 - CkReduction::reducerType now has PUP methods defined, and can hence be
115 passed as parameter-marshalled arguments to entry methods.
117 - The runtime option +stacksize for controlling the allocation of user-level
118 threads' stacks now accepts shorthand notation such as 1M.
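A sketch of how a size argument with a shorthand suffix could be parsed. The function parse_size is hypothetical and not the runtime's parser; which suffixes +stacksize accepts, and whether they mean powers of 1024, are assumptions of this sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical parser (not Charm++ code). Accepts a decimal number with an
// optional single K, M, or G suffix, assumed here to mean powers of 1024.
// Returns 0 when the text cannot be parsed.
std::size_t parse_size(const std::string& text) {
  if (text.empty() || !std::isdigit(static_cast<unsigned char>(text[0]))) {
    return 0;  // must start with a digit
  }
  char* end = nullptr;
  unsigned long long value = std::strtoull(text.c_str(), &end, 10);
  std::size_t mult = 1;
  switch (*end) {
    case '\0': return static_cast<std::size_t>(value);  // plain byte count
    case 'k': case 'K': mult = 1024; break;
    case 'm': case 'M': mult = 1024 * 1024; break;
    case 'g': case 'G': mult = 1024 * 1024 * 1024; break;
    default: return 0;  // unknown suffix
  }
  if (end[1] != '\0') return 0;  // trailing characters after the suffix
  return static_cast<std::size_t>(value) * mult;
}
```

Under these assumptions, "1M" and "1048576" parse to the same value.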
120 - The -optimize flag to the charmc compiler wrapper now passes more aggressive
121 options to the various underlying compilers than the previous '-O'.
123 - The charmc compiler wrapper now provides a flag -use-new-std to enable
124 support for C11 and C++11 where available. To use this in application code,
125 the runtime system must have been built with that flag as well.
127 - When using CmiMemoryUsage(), the runtime can be instructed not to use the
128 underlying mallinfo() library call, which can be inaccurate in settings where
129 usage exceeds INT_MAX. This is accomplished by setting the environment
130 variable "MEMORYUSAGE_NO_MALLINFO".
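The reason for the inaccuracy can be sketched as follows: the classic mallinfo() interface reports sizes in int fields, so any usage above INT_MAX cannot be represented. Both helpers below are hypothetical, and treating the mere presence of the variable as the switch is an assumption of this sketch, not a statement about the runtime's check.

```cpp
#include <climits>
#include <cstdlib>

// Hypothetical check (not runtime code): assume the variable only needs to be
// present in the environment, whatever its value.
bool skip_mallinfo() {
  return std::getenv("MEMORYUSAGE_NO_MALLINFO") != nullptr;
}

// True when a byte count still fits in an int-sized counter such as the
// fields of struct mallinfo.
bool fits_in_int_counter(unsigned long long bytes) {
  return bytes <= static_cast<unsigned long long>(INT_MAX);
}
```

For example, with a 32-bit int a process using 3 GiB is already outside the representable range.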
132 - Experimental Features
133 * Initial implementation of a fast message-logging protocol. Use option
134 'mlogft' to build it.
135 * Message compression support for persistent messages on the Gemini machine layer.
136 * Node-level inter-PE loop/task parallelization is now supported through
138 * New temperature/CPU frequency aware load balancer
139 * Support interoperation of Charm++ and native MPI code through dynamically
140 switching control between the two
141 * API in centralized load balancers to get and set PE speed
142 * A new scheme for optimization of double in-memory checkpoint/restart.
143 * Message combining library for improved fine-grained communication
145 * Support for partitioning of allocated nodes into subsets that run
146 independent Charm++ instances but can interact with each other.
148 Platform-Specific Changes
149 -------------------------
152 * The gemini_gni network layer has been heavily tuned and optimized,
153 providing substantial improvements in performance, scalability, and
155 * The gemini_gni-crayxe machine layer supports a 'hugepages' option at build
156 time, rather than requiring manual configuration file editing.
157 * Persistent message optimizations can be used to reduce latency and
159 * Experimental support for 'urgent' sends, which are sent ahead of any other
160 outgoing messages queued for transmission.
162 - IBM Blue Gene Q: Experimental machine-layer support for the native PAMI
163 interface and MPI, with and without SMP support. This supports many new
164 systems, including LLNL's Sequoia, ALCF's Mira, and FZ Juelich's Juqueen.
166 There are three network-layer implementations for these systems: 'mpi',
167 'pami', and 'pamilrts'. The 'mpi' layer is stable, but its performance and
168 scalability suffer from the additional overhead of using MPI rather than
169 driving the interconnect directly. The 'pami' layer is well tested for NAMD,
170 but has shown instability for other applications. It is likely to be replaced
171 by the 'pamilrts' layer, which is more generally stable and seems to provide
172 the same performance, in the next release.
174 In addition to the common 'smp' option to build the runtime system with
175 shared memory support, there is an 'async' option which sometimes provides
176 better performance on SMP builds. This option passes tests on 'pamilrts', but
177 is still experimental.
179 Note: Applications that have a large number of messages may crash in the default
180 setup due to overflow in the low-level FIFOs. Environment variables
181 MUSPI_INJFIFOSIZE and PAMI_RGETINJFIFOSIZE can be set to avoid application
182 failures due to large numbers of small and large messages respectively. The
183 default value of these variables is 65536, which is sufficient for 1000
186 - Infiniband Verbs: Better support for more flavors of ibverbs libraries
189 * Experimental rendezvous protocol for better performance above some MPI
191 * Some tuning parameters ("+dynCapSend" and "+dynCapRecv") are now
192 configurable at job launch, rather than Charm++ compilation.
194 - PGI C++: Disable automatic 'using namespace std;'
196 - Charm++ now supports ARM, both non-smp and smp.
198 - Mac OS X: Compilation options to build and link correctly on newer versions
201 ================================================================================
202 What's new in Charm++ 6.4.0
203 ================================================================================
205 --------------------------------------------------------------------------------
207 --------------------------------------------------------------------------------
209 - Cray XE and XK systems using the Gemini network via either MPI
210 (mpi-crayxe) or the native uGNI (gemini_gni-crayxe)
212 - IBM Blue Gene Q, using MPI (mpi-bluegeneq) or PAMI (pami-bluegeneq)
214 - Clang, Cray, and Fujitsu compilers
216 - MPI-based machine layers can now run on >64k PEs
218 --------------------------------------------------------------------------------
220 --------------------------------------------------------------------------------
222 - Added a new [reductiontarget] attribute to enable
223 parameter-marshaled recipients of reduction messages
225 - Enabled pipelining of large messages in CkMulticast by default
227 - New load balancers added:
230 * Scotch graph partitioning based: ScotchLB and Refine and Topo variants
233 - Load balancing improvements:
235 * Allow reduced load database size using floats instead of doubles
236 * Improved hierarchical balancer
237 * Periodic balancing adapts its interval dynamically
238 * User code can request a callback when migration is complete
239 * More balancers properly consider object migratability and PE
240 availability and speed
241 * Instrumentation records multicasts
243 - Chare arrays support options that can enable some optimizations
245 - New 'completion detection' library for parallel process termination
246 detection, when the need for modularity excludes full quiescence
249 - New 'mesh streamer' library for fine-grain many-to-many collectives,
250 handling message bundling and network topology
252 - Memory pooling allocator performance and resource usage improved
255 - AMPI: More routines support MPI_IN_PLACE, and those that don't check
258 ================================================================================
259 What's new in Charm++ 6.2.1 (since 6.2.0)
260 ================================================================================
262 --------------------------------------------------------------------------------
263 New Supported Platforms:
264 --------------------------------------------------------------------------------
266 POWER7 with LAPI on Linux
268 Infiniband on PowerPC
270 --------------------------------------------------------------------------------
272 --------------------------------------------------------------------------------
274 - Better support for multicasts on groups
275 - Topology information gathering has been optimized
276 - Converse (seed) load balancers have many new optimizations applied
277 - CPU affinity can be set more easily using +pemap and +commap options
278 instead of the older +coremap
279 - HybridLB (hierarchical balancing for very large core-count systems)
280 has been substantially improved
281 - Load balancing infrastructure has further optimizations and bug fixes
282 - Object mappings can be read from a file, to allow offline
283 topology-aware placement
284 - Projections logs can be spread across multiple directories, speeding
285 up output when dealing with thousands of cores (+trace-subdirs N
286 will divide log files evenly among N subdirectories of the trace
287 root, named PROGNAME.projdir.K)
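One way to picture the resulting layout is a function from a PE number to its subdirectory name. The function trace_subdir is hypothetical; the round-robin rule (PE modulo N) and the zero-based K are assumptions of this sketch, since the text above only states that files are divided evenly.

```cpp
#include <string>

// Hypothetical mapping (not Projections code): spread PEs round-robin over n
// subdirectories named PROGNAME.projdir.K, with K assumed to start at 0.
std::string trace_subdir(const std::string& progname, int pe, int n) {
  if (n <= 0) n = 1;  // treat a non-positive count as a single directory
  return progname + ".projdir." + std::to_string(pe % n);
}
```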
288 - AMPI now implements MPI_Issend
289 - AMPI's MPI_Alltoall uses a flooding algorithm more aggressively,
290 versus pairwise exchange
291 - Virtualized ARMCI support has been extended to cover the functions
294 --------------------------------------------------------------------------------
295 Architecture-specific changes
296 --------------------------------------------------------------------------------
298 - LAPI SMP has many new optimizations applied
300 - Net builds support the use of clusters' mpiexec systems for job
301 launch, via the ++mpiexec option to charmrun
303 ================================================================================
304 What's new in Charm++ 6.2.0 (since 6.1)
305 ================================================================================
307 --------------------------------------------------------------------------------
308 New Supported Platforms:
309 --------------------------------------------------------------------------------
311 64-bit MIPS, such as SiCortex, using mpi-linux-mips64
313 Windows HPC cluster, using mpi-win32/mpi-win64
315 Mac OSX 10.6, Snow Leopard (32-bit and 64-bit).
317 --------------------------------------------------------------------------------
319 --------------------------------------------------------------------------------
322 - Smarter build/configure scripts
323 - A new interface for model-based load balancing
324 - new CPU topology API
325 - a general implementation of CmiMemoryUsage()
326 - Bug fix: Quiescence detection (QD) works with immediate messages
327 - New reduction functions implemented in Converse
328 - CCS (Converse Client-Server) can deliver a message to more than one processor
329 - Added a memory-aware adaptive scheduler, which can be optionally
331 - Added preliminary support for automatic message prioritization
332 (disabled by default)
335 - Cross-array and cross-group sections
336 - Structured Dagger (SDAG): Support templated arguments properly
337 - Plain chares support checkpoint/restart (both in-memory and disk-based)
338 - Conditional packing of messages and parameters in SMP scenario
339 - Changes to the CkArrayIndex class hierarchy
340 -- sizeof() all CkArrayIndex* classes is now the same
341 -- Codes using custom array indices have to use placement-new to construct
342 their custom index. Refer to the example code: examples/charm++/hello/fancyarray/
343 -- *** Backward Incompatibility ***
344 CkArrayIndex[4D/5D/6D]::index are now of type int (instead of short)
345 However the data is stored as shorts. Access by casting
346 CkArrayIndexND::data() appropriately
347 -- *** Deprecated ***
348 The direct use of public data member
349 CkArrayIndexND::index (N=1..6) is deprecated. We reserve the right to
350 change/remove this variable in future releases of Charm++.
351 Instead, please access the indices via member function:
352 int CkArrayIndexND::data()
355 - Compilers renamed to avoid collision with host MPI (ampicc, ampiCC,
357 - Improved MPI standard conformance, and documentation of non-conformance
358 * Bug fixes in: MPI_Ssend, MPI_Cart_shift, MPI_Get_count
359 * Support MPI_IN_PLACE in MPI_(All)Reduce
360 * Define various missing constants
361 - Return the received message's tag in response to a non-blocking
362 wildcard receive, to support SuperLU
363 - Improved tracing for BigSim
365 Multiphase Shared Arrays (MSA)
366 - Typed handles to enforce phases
367 - Split-phase synchronization to enable message-driven execution
371 - Automatic tracing of API calls for simulation and analysis
374 - Wider support for architectures other than net- (in particular MPI layers)
375 - Improved support for large scale debugging (better scalability)
376 - Enhanced record/replay stability to handle various events, and to
377 signal unexpected messages
378 - New detailed record/replay: The full content of messages can be
379 recorded, and a single processor can be re-executed outside of the
383 - Tracing of nested entry methods
385 Automatic Performance Tuning
386 - Created an automatic tuning framework [still for experimental use only]
389 - Network-topology / node aware spanning trees used internally to
390 lower bytes on the network and improve performance in multicasts and
391 reductions delegated to this library
394 - Improved OneTimeMulticastStrategy classes
397 - Out-of-core support, with prefetching capability
398 - Detailed tracing of MPI calls
399 - Detailed record/replay support at emulation time, capable of
400 replaying any emulated processor after obtaining recorded logs.
402 --------------------------------------------------------------------------------
403 Architecture-specific changes
404 --------------------------------------------------------------------------------
407 - Can run jobs with more than 1024 PEs
410 - New charmrun option ++no-va-randomization to disable address space
411 randomization (ASLR). This is most useful for running AMPI with
415 - Default to using ampicxx instead of mpiCC
418 - The +p option now has the same semantics as in other smp builds
421 - Support for VSX in SIMD abstraction API
424 - Compilers and options have been updated to the latest ones
427 - Added routines for measuring performance counters on BG/P.
428 - Updated to support latest DCMF driver version. On ANL's Intrepid, you may
429 need to set BGP_INSTALL=/bgsys/drivers/V1R4M1_460_2009-091110P/ppc in your
430 environment. This is the default on ANL's Surveyor.
433 - cputopology information is now available on XT3/4/5
436 - Bug fix: plug memory leaks that caused failures in long runs
437 - Optimized to reduce startup delays
440 - Support for SMP (experimental)
443 ================================================================================
444 Note that changes from 5.9, 6.0, and 6.1 are not documented here. A partial list
445 can be found on the charm download page, or by reading through version control
448 ================================================================================
449 What's New since Charm++ 5.4 release 1
450 ================================================================================
452 --------------------------------------------------------------------------------
453 New Supported Platforms:
454 --------------------------------------------------------------------------------
455 1. Charm++ ported to IA64 Itanium running Win2K and Linux. Charm++ also supports
456 Intel C/C++ compilers;
458 2. Charm++ ported to Power Macintosh powerpc running Darwin;
460 3. Charm++ ported to Myrinet networking with GM API;
462 --------------------------------------------------------------------------------
463 Summary of New Features:
464 --------------------------------------------------------------------------------
466 Structured Dagger is a coordination language built on top of CHARM++.
467 Structured Dagger allows easy expression of dependences among messages and
468 computations and also among computations within the same object using
469 when-blocks and various structured constructs.
471 2. Entry functions support parameter marshalling
472 Now you can declare and invoke remote entry functions using parameter
473 marshalling instead of defining messages.
475 3. Easier running - standalone mode
476 For net-* versions running locally, you can now run Charm programs without
477 charmrun. Running a node program directly from the command line is now the
478 same as "charmrun +p1 <program>"; for SMP version, you can also specify
479 multiple (local) processors, as in "program +p2".
482 --------------------------------------------------------------------------------
484 --------------------------------------------------------------------------------
485 1. "build" changed for compilation of Charm++
486 To build Charm++ from scratch, we now take additional command line options
487 to compile with add-on features and with compilers other than gcc.
488 For example, to build Linux IA64 with Myrinet support, type the command:
489 ./build net-linux-ia64 gm
492 ******* Old Change histories *******
495 ================================================================================
496 What's New in Charm++ 5.4 release 1 since 5.0
497 ================================================================================
499 --------------------------------------------------------------------------------
500 New Supported Platforms:
501 --------------------------------------------------------------------------------
503 1. Win9x/2000/NT: with Visual C++ or Cygwin gcc/g++, you can compile and run
504 Charm++ programs on all Win32 platforms.
506 2. Scyld Beowulf: Charm++ has been ported to the Linux-based Scyld Beowulf
507 operating system. For more information on Scyld, see <http://www.scyld.com>
509 3. MPI with VMI: Charm++ has been ported to NCSA's Virtual Machine Interface,
510 which is an efficient messaging library for heterogeneous cluster
514 --------------------------------------------------------------------------------
515 Summary of New Features:
516 --------------------------------------------------------------------------------
517 1. Dynamic Load balancing:
518 Chare migration is supported in the new release. Migration-based dynamic
519 load balancing framework with various load balancing strategies library has
523 Charm++ array is supported. You can now create an array of Chare objects
524 and use an array index to refer to the Charm++ array elements. A reduction
525 library on top of Chare array has been implemented and included.
528 Projections, a Java application for Charm++ program performance analysis and
529 visualization, has been included and distributed in the new release. Two
530 trace modes are available: trace-projections and trace-summary. Trace-summary
531 is a light-weight trace library compared to trace-projections.
534 AMPI is a load-balancing based library for porting legacy MPI applications
535 to Charm++. With few changes to the original MPI code, the
536 legacy MPI application running on AMPI will gain from Charm++'s adaptive
537 load balancing ability.
540 "Charmrun" is now available on all platforms, with a uniform command line
541 syntax. You can forget the difference between net-* versions and MPI versions,
542 and run Charm++ applications with this same charmrun command syntax.
543 The ++local option has been added to charmrun for the net-* versions. It provides
544 simple local use of Charm and no longer requires the ability to
545 "rsh localhost" or a nodelist file in order to run charm only on the local
546 machine. This is especially attractive when you run Charm++ on Windows.
549 Many new libraries have been added in this release. They include:
550 1) master-slave library: for writing manager-worker paradigm programs.
551 2) receiver library: provides an asynchronous communication mode for chare arrays.
552 3) f90charm: provides Fortran90 bindings for Charm++ Array.
553 4) BlueGene: a Charm++/Converse emulator for IBM's proposed Blue Gene.
555 --------------------------------------------------------------------------------
557 --------------------------------------------------------------------------------
558 1. message declaration syntax in .ci file:
559 The message declaration syntax for packed/varsize messages has been changed.
560 The packed/varsize keywords are eliminated, and you can specify the
561 actual varsize arrays in the interface file and have the translator generate
562 alloc, pack and unpack.
565 Here is the detailed list of Changes:
567 --------------------------------------------------------------------------------
569 --------------------------------------------------------------------------------
571 10/06/1999 rbrunner Added migration-based dynamic load balancing
573 11/15/1999 olawlor Added reduction support for Charm++ arrays
574 02/06/2000 milind Added AMPI, an implementation of MPI with
575 dynamic load balancing
576 02/18/2000 paranjpy New platforms supported: net-win32, and net-win32-smp
577 04/04/2000 olawlor Added arbitrarily indexed Charm++ arrays.
578 Also, added translator support for new arrays.
579 04/15/2000 olawlor Added "puppers" for packing and unpacking
581 06/14/2000 milind Added the threaded FEM framework.
583 --------------------------------------------------------------------------------
585 --------------------------------------------------------------------------------
587 10/09/1999 rbrunner Added packlib, a library for C and C++ to
588 pack-unpack data to/from Charm++ messages.
589 10/13/1999 gzheng New LB strategy: RefineLB
590 10/13/1999 paranjpy New LB Strategy: Heap
591 10/14/1999 milind New LB Strategy: Metis
592 10/19/1999 olawlor New test program for testing LB strategies.
593 10/21/1999 gzheng New trace mode: trace-summary
594 10/28/1999 milind New supported platform: net-sol-x86
595 10/29/1999 milind Added runtime checks for ChareID assignment.
596 11/10/1999 rbrunner Added Neighborhood base strategy for LB
598 11/15/1999 olawlor conv-host now reads in a startup file
600 11/15/1999 olawlor New test program for testing array reductions.
601 11/16/1999 rbrunner Added processor-speed checking functions to
603 11/19/1999 milind Mapped SIGUSR to a Ccd condition handler
604 11/22/1999 rbrunner New LB strategy: WSLB
605 11/29/1999 ruiliu Modified Metis LB strategy to deal with
606 different processor speeds
607 12/16/1999 rbrunner New LB strategy: GreedyRef
608 12/16/1999 rbrunner New LB strategy: RandRef
609 12/21/1999 skumar2 New LB strategy: CommLB
610 01/03/2000 rbrunner New LB strategy: RecBisectBfLB
611 01/08/2000 skumar2 New LB strategy: Comm1LB, with varying processor
613 01/18/2000 milind Modified SM library syntax, and added a test
615 01/19/2000 gzheng Added irecv, a library to simplify conversion
616 of message-passing programs to Charm++
617 02/20/2000 olawlor Added preliminary broadcast support to Charm++
619 02/23/2000 paranjpy Added converse-level quiescence detection
620 03/02/2000 milind Added ++server-port option to pre-specify
622 03/10/2000 wilmarth Random seed-based load balancer now uses
623 bit-vector for active PEs.
624 03/21/2000 gzheng Added support for marking user-defined events
626 03/28/2000 wilmarth Added CMK_TRUECRASH. Very helpful for
627 post-mortem debugging of Charm++ programs on
629 03/31/2000 jdesouza Added Fortran90 support to the Charm++
630 interface translator.
631 03/09/2000 milind Added support for -LANG and -rpath options
632 in charmc for Origin2000.
633 04/28/2000 milind Added prioritized converse threads.
634 05/01/2000 milind Added test programs for TeMPO, AMPI and irecv.
635 05/04/2000 milind New supported platform: mpi-sp.
636 05/04/2000 gzheng Added irecv pingpong program.
637 05/17/2000 olawlor Each chare, group and array element now has to
638 have a migration constructor.
639 05/24/2000 milind Added Jacobi3D programs for irecv and AMPI both.
640 05/24/2000 milind Made migratable an optional attribute of
641 chares, groups, and nodegroups.
642 Arrays are by default migratable.
643 05/29/2000 paranjpy Added pup methods to arrays, reductions etc
645 06/13/2000 milind Made CtvInitialize idempotent. That is, it
646 can be called by any number of threads now,
647 only the first one will actually do
649 06/20/2000 milind Added a simple test program for the FEM
651 07/06/2000 milind Imported Metis 4.0 sources in the CVS tree.
652 Also added code to make metis libraries and
653 executables to Makefile.
654 07/07/2000 milind Added more meaningful error messages using
655 perror in addition to the cryptic error codes in
657 07/10/2000 milind fem and femf are now recognized as "languages"
659 07/10/2000 saboo Added the derived datatypes library.
660 07/13/2000 milind Added +idle_timeout functionality. It takes a
661 commandline parameter denoting milliseconds of
662 maximum consecutive idle time allowed per
664 07/14/2000 milind Added group multicast. Added
665 CkSendMsgBranchMulti, CldEnqueueMulti, and
666 translator changes to support it.
667 07/14/2000 milind SUPER_INSTALL now takes "-*" arguments prior
668 to the target, that will be passed to make as
669 "makeflags". This makes it easy to suppress
670 make's output of commands etc (with the -s
671 flag). As a result of this, several Makefiles
673 07/18/2000 milind Added support for using "dbx" on suns as
675 07/19/2000 milind Added the ability for tracemode projections to
676 produce binary trace files. Use flag
677 +binary-trace on the command line.
678 07/26/2000 milind Separated AMPI from TeMPO.
679 07/28/2000 milind Added test programs to test reduce, alltoall
680 and allreduce functionality of AMPI.
681 08/02/2000 milind Added an option to let the user specify which
682 "xterm" to use. For example, on some systems
683 (CDE), only dtterm is installed. So, by
684 putting ++xterm dtterm on the conv-host
685 commandline, one can use dtterm when ++in-xterm
686 option is specified on conv-host commandline.
687 08/14/2000 milind FEM Framework: Added capabilities to handle
688 esoteric meshes to standalone offline programs.
689 Makefile now produces gmap and fgmap programs,
690 which are used for this purpose. They convert
691 the mesh to a graph before partitioning it
693 08/24/2000 milind Added the 2D crack propagation program as a
694 test program for FEM framework.
695 08/25/2000 milind Initial implementation of isomalloc-based
696 threads. This implementation uses a fixed
697 stack size for all threads (can be set at
699 08/26/2000 milind Added a macro CtvAccessOther that lets you
700 get/set a Ctv variable of any thread. It
701 should be invoked as CtvAccessOther(thread,
702 varname); Added CthGetData function to each of
703 the threads implementation. This function is
704 used in the CtvAccessOther macro.
705 08/27/2000 milind FEM Framework: Separated mesh to graph
706 conversion capability into a separate program.
707 This way, the generated graph can be partitioned
709 09/04/2000 milind Added the class static readonly variables to
711 09/05/2000 milind FEM Framework: A very fast O(n) algorithm for
712 mesh2graph, uses more memory, but the tradeoff
713 was worth it. Coded by Karthik Mahesh, minor
714 optimizations by Milind.
715 09/05/2000 milind Added a barebones charm kernel scheduling
716 overhead measurement program.
717 09/15/2000 milind Added pup support for AMPI and FEM framework.
718 09/20/2000 olawlor Added capability to have an array of base type
719 where individual element could be of derived
721 10/03/2000 gzheng New supported platform: net-linux-axp
722 10/05/2000 skumar2 Added program littleMD to the test suite.
723 10/07/2000 skumar2 New job scheduler (Faucets projects).
724 10/15/2000 milind Improved support for Fortran90 in charmc.
725 11/04/2000 jdesouza Made the Faucets scheduler multi-threaded.
726 11/05/2000 olawlor FEM Framework: supports multiple element types,
727 mesh re-assembly, etc.
728 11/15/2000 gzheng New platform support: net-cygwin
729 11/18/2000 gzheng conv-host no longer needs /bin/csh to start
731 CMK_CONV_HOST_CSH_UNAVAILABLE to 1 to use
733 11/25/2000 milind Finished experimental implementation of
734 converse-threads based on co-operative pthreads.
735 11/25/2000 milind Added a benchmark suite of all pingpongs in
737 11/28/2000 milind Removed deletion of _idx at the end of every
738 send or doneInserting call. Instead now it is
739 in the destructor of the proxy. This allows us
740 to cache proxies, when proxy creation becomes
742 11/28/2000 olawlor Added "seek blocks" to puppers. This should
743 allow out-of-order pup'ing without the ugliness
744 of getBuf; and in a way that works with all
746 11/29/2000 olawlor Simplified and regularized command-line-argument
748 11/29/2000 milind AMPI: Added multiple-communicators capability.
749 12/05/2000 gzheng Now /bin/sh is default shell to fork node
750 program on remote machines.
751 12/13/2000 olawlor Added charmrun wrapper for poe on mpi-sp.
752 12/14/2000 milind Added bluegene emulator sources and test
753 programs. Added "bluegene" as a language known
754 to charmc. Makefile now has a target called
755 bluegene. Added preliminary bluegene
756 documentation. (copied from Arun's webpage.)
757 12/15/2000 gzheng f90charm addition to Makefile and charmc. Also,
758 added fixed size arrays support to f90charm. A
759 test program f90charm/hello is checked in.
760 12/17/2000 milind Added rtest test program. Contributed by jim to
761 test Converse message transmission.
762 12/20/2000 olawlor Added charmconfig script. Enables automatic
763 determination of C++ compiler properties,
764 replacing the verbose and error-prone
765 conv-mach.h entries for CMK_BOOL,
766 CMK_STL_USE_DOT_H, CMK_CPP_CAST_OK, ...
767 12/20/2000 olawlor Charm++ Arrays optimizations: Key and object
768 now variable-length fields, instead of pointers.
769 This extra flexibility lets us save many
770 dynamic allocations in the array framework.
771 12/20/2000 olawlor Added PUP::able support-- dynamic type
772 identification, allocation, and deletion.
773 Allows you to write: p(objPtr); and
774 objPointer will be properly identified,
775 allocated, packed, and deallocated (depending
776 on the PUP::er). Requires you to register any
777 such classes with DECLARE_PUPable and
779 12/20/2000 olawlor Arrays optimizations: Made CkArrayIndex
780 fixed-size. This significantly improves
781 messaging speed (7 us instead of 10 us
782 roundtrip). Move spring cleaning check into a
783 CcdCallFnAfter, which gains more speed (down to
785 12/20/2000 olawlor More optimizations: Minor speed tweaks--
786 conv-ccs.c uses hashtable for handler lookup;
787 conv-conds skips timer test until needed;
788 convcore.c scheduler loop optimizations (no
789 superfluous EndIdle calls); threads.c
790 CMK_OPTIMIZE-> no mprotect.
791 12/20/2000 olawlor More Optimizations: Minor speed tweaks-- ck.C
792 groups cldEnqueue skip; init.h defines
793 CkLocalBranch inline; and supporting changes.
794 12/22/2000 gzheng IA64 support for Converse user level threads.
795 01/02/2001 olawlor CCS: Minor update-- enabled CcsProbe, cleaned
796 up superfluous debug messages in server, added
797 Java interface (originally written for
799 01/09/2001 gzheng Converted charmconfig to autoconf style. To
800 update it, change configure.in and conv-autoconfig.h.in,
801 run autoconf to get configure, and copy it to
802 charmconfig. Added a Fortran subroutine name
803 test and a step to get libpthread.a.
804 01/10/2001 milind Added telnet method of getting libpthread.a
805 from charm webserver.
806 01/11/2001 olawlor Moved projections files here from
807 CVSROOT/projections-java. Added fast Java
808 versions of the .log file input routines in
809 LogReader, LogLoader, LogAnalyzer, and
810 UsageCalc. Added "U.java" user interface
811 utility file, allowing times to be input in
812 seconds, milliseconds, or microseconds,
813 instead of just microseconds.
814 01/15/2001 gzheng Added +trace-root to specify the directory to
815 put log files in. This is needed on Scyld clusters,
816 where there is no NFS mounting and no I/O
817 access to a shared home directory on nodes.
818 01/15/2001 milind Made AMPI into a f90 module instead of
819 'ampif.h' inclusion. AMPI f90 bindings are
820 now more inclusive. Fixed argc,argv handling
821 bugs in ArgsInfo message. Fixed a bug in pup
822 that caused thread not to be sized, but was
823 packed nevertheless. Moved irecv to waitall
824 instead of at in ampi_start. Made
825 AMPI_COMM_WORLD to be 0, because it clashed
826 with wildcard(-1). AMPI_COMM_UNIVERSE is now
827 handled properly in the AMPI module.
828 C/C++ data members are NOT visible to
830 01/18/2001 gzheng New supported platform: net-linux-scyld
831 01/20/2001 olawlor Moved array index field from CMessage_* to the
832 Ck envelope itself. This is the right thing
833 to do, because any message may be sent to/from
834 an array element. To reduce the wasted space
835 in a message, a union is used to overlay the
836 fields for the various possible message types.
837 01/29/2001 olawlor Freed charmrun on net-* version from using
838 remote shell to fork off processes. One can now
839 use a daemon provided in the distribution.
840 02/07/2001 olawlor Added debugging support to puppers.
841 02/13/2000 gzheng Added ++local option to charmrun to start node
842 programs locally without any daemon. Fixed the
843 hang that occurred when a wrong pgm name was typed in
844 the scyld version, and redirected all output to
845 /dev/null, since otherwise every node program can send
846 its output to the console in scyld. Also implemented ++local in the net-win32 version.
847 02/26/2000 milind Changed the varsize syntax. Now one can specify
848 actual varsize arrays in the interface file
849 and have the translator generate alloc, pack
852 --------------------------------------------------------------------------------
854 --------------------------------------------------------------------------------
856 10/29/1999 milind Replaced jmemcpy by memcpy in net versions, as
857 it was causing a bit to flip (bug reported
859 10/29/1999 milind Fixed multiline macros in all header files.
860 02/05/2000 milind Fixed linking errors by getting the order of
861 libraries right from the charmc command-line.
862 02/18/2000 paranjpy Fixed Charm++ initialization bug on SMPs.
863 02/21/2000 milind Fixed a context-switching bug in mipspro version
865 02/25/2000 milind Charm++ interface translator was segfaulting
866 on interface file errors. Fixed that. Also,
867 added linenumbers to error messages.
868 03/02/2000 milind Made CCS work on SMPs.
869 03/07/2000 milind Made ConverseInit consistent with the manual on
871 04/18/2000 milind Fixed a bug in CkWaitFuture, which was caching
872 a variable locally, while it was changed by
874 05/04/2000 paranjpy Fixed argv deletion bug on net-win32-smp.
875 06/08/2000 milind sp3 version: changed optimization flags, which
876 were power2 processor-specific.
877 06/20/2000 milind mpi-* versions: Fixed ConverseExit since it was
878 not obeying the following statement in the MPI
879 standard: The user must ensure that all pending
880 communications involving a process completes
881 before the process calls MPI_FINALIZE.
882 07/05/2000 milind Fixed a nasty bug in charmc in the -cp option.
883 It used to append the name provided to -o flag
884 to the directory provided to the -cp flag.
885 Thus, -o ../pgm -cp ../bin options meant that
886 the pgm would be copied to ../bin/.., which is
887 not the expected behavior. This fix correctly
888 copies pgm to ../bin.
889 07/07/2000 milind Removed variable arg_myhome, as it was not
890 being used anywhere, and also, setting it was
891 causing problems if env var HOME was not set.
892 07/27/2000 milind thishandle for the arrayelement was not being
893 correctly set. Bug was reported by Neelam.
894 08/26/2000 milind Origin2000: Changed the page alignment to
895 reflect the mmap alignment. The mmap man page
896 specifically states that it is not the same as
898 09/02/2000 milind Fixed a bug in code generated for threaded
899 (void) entry methods of array elements. The
900 dummy message that is passed to that method in
901 a thread has to be deleted before calling the
902 object method, because upon object method's
903 return, the thread might have migrated.
904 09/03/2000 olawlor Minor fix-fixes: 1.) Change to LBObjid hash
905 function would fail for >4-int object indices.
906 Replaced with proper function, which also
907 preserves the 1-int case. 2.) Array element
908 sends must go via the message queue to prevent
909 stack build-up for deep single-processor call
910 chains. These might happen, e.g., in a driver
911 element calling itself for the main time loop.
912 Messages are now properly noted as sent, then
913 wait through the queue for delivery. This
914 entailed minor reorganization of the message
916 09/21/2000 olawlor Tiny SMP thread fix-- registrations of a
917 thread-private variable now reserve space on
918 calls after the first. This wastes space for
919 multiple CthInitialize's-- it's a quick hack to
920 get threads working again on SMP versions.
921 10/16/2000 olawlor A few CCS fixes: -Added split-phase reply
922 (delay reply indefinitely) -Cleaned up error
923 handling -Pass user data as "void *" instead of
925 11/03/2000 wilmarth Removed 0 size array allocation in Charm++
926 quiescence detection.
927 11/20/2000 gzheng Rewrote part of Fiber thread, including a bug
928 fix for a non-thread-safe function, and a
929 different fiber free strategy.
930 11/29/2000 gzheng The LB init procedure tried to allocate
931 65536*160 as initial size, which is 10M memory
932 for communication table, which is too big.
933 Cut it down to roughly 1M, and it can expand
935 12/05/2000 gzheng In many cases, conv-host exited without printing
936 out the error message from the remote shell. Tried
937 to fix it by calling sync to flush the pipe
939 12/10/2000 milind net-linux: Made static linking the default
940 option because dynamic linking runtime causes
941 isomalloc threads to crash.
942 12/18/2000 milind Increased portability of isomalloc threads by
943 removing dependence on alloca.
944 12/28/2000 milind Fixed ctrl-getone abort bug on SMP.
945 12/28/2000 milind Made _groupTable a pointer on which a
946 constructor is explicitly called. Since it
947 was a Cpv variable, its constructor was not
948 called by default in case of an SMP version.
949 12/29/2000 olawlor Prevent infinite copy constructor recursion on
951 01/10/2001 olawlor Added "explicit" keyword to remove ambiguity
952 for KCC, which was confused by the private
953 PUP::er(int) "cast" constructor and the operator
954 |(PUP::er &p,T &t) into rejecting all operator|
955 (int,int) as ambiguous.
956 01/17/2001 gzheng Fixed the charmconfig bug on paragon-red: the
957 failure testing of fortran won't stop the
959 01/20/2001 olawlor Arrays reduction: Fixed bug-- reduction may end
960 because all contributors migrate away.
961 01/29/2001 olawlor Fix heap-corrupting bug-- call ->init() on
962 nodeGroupTable, which sets the "pending"
963 message queue to NULL. This prevents a nasty
964 delete-uninitialized-data bug later on. Also
965 delayed queue creation until messages actually
968 --------------------------------------------------------------------------------
969 Documentation Changes:
970 --------------------------------------------------------------------------------
972 01/31/2000 milind Installation manual: Fixed bugs pointed out by
974 02/28/2000 wilmarth Added a new look Charm++ manual.
975 06/20/2000 milind Added pdflatex support to generate PDF versions
976 of manuals from LaTeX sources.
977 12/05/2000 milind Added Orion's FEM manual. Converted from HTML.
978 12/10/2000 milind Added pplmanual.sty for all manuals.
979 12/17/2000 milind Added master-slave library documentation to
981 12/21/2000 saboo Added DDT documentation.
982 01/02/2001 olawlor Updated for new CCS version.
984 --------------------------------------------------------------------------------
986 --------------------------------------------------------------------------------
988 10/24/1999 olawlor charmc is changed to Bourne shell script
989 instead of csh. All conv-mach.csh are
990 replaced by conv-mach.sh.
991 10/25/1999 olawlor SUPER_INSTALL is converted to use Bourne shell.
992 10/28/1999 milind All Makefiles now take OPTS commandline
994 01/16/2000 olawlor Simplified Charm++ interface translator.
995 02/23/2000 ruiliu Changed rand() calls from all over the codes
996 to the new Converse random number generator.
997 02/26/2000 milind Simplified the converse scheduler loop by
998 combining the maxmsgs and poll modes.
999 08/31/2000 milind Imported system documentation into the CVS tree.
1000 Also added super_install target for docs with
1001 necessary Makefile modifications.
1002 09/08/2000 olawlor Made soft links use relative pathnames instead
1003 of absolute. This lets you move a charm++
1004 installation without having to recompile
1006 09/11/2000 olawlor Grouped commonly needed code in the new util
1007 directory. Also, added pup_c, a C wrapper for
1009 09/11/2000 olawlor Slightly reorganized header structure. Now no
1010 headers should need to be listed twice (once in
1011 ALLHEADERS, again in CKHEADERS). Now headers
1012 are soft-linked instead of copied. This makes
1013 development much easier. Added support for the
1014 new Common/util directory.
1015 09/21/2000 olawlor Major reorganization of net-* codes. Now all
1016 the TCP socket routines are in separate files.
1017 Also combined Windows NT code with unix codes.
1018 09/21/2000 olawlor Major rewrite of CCS-- underlying protocol is
1019 now binary (send/recv binary data everywhere);
1020 conv-host forwards requests to nodes; and
1021 source has been significantly re-arranged.
1022 (especially if NODE_0_IS_CONVHOST).
1023 11/22/2000 milind Removed IDL translator from distribution.
1024 12/01/2000 olawlor Renamed conv-host to charmrun; added test for
1025 script conv-host. Also added charmrun for most
1027 12/17/2000 milind Moved List related data structures into
1028 cklists.h in util. Removed most of the redundant
1029 list implementations.
1030 12/20/2000 gzheng SUPER_INSTALL: format the output of list of
1031 versions and make the help page fit into one
1033 12/24/2000 milind Added test-{charm,converse,ampi,fem} targets to
1035 12/28/2000 milind net-sol-smp now uses pthreads.
1036 01/29/2001 olawlor Merged windowsNT and unix build procedures by
1037 basing the Windows build on cygwin. Added
1038 scripts to deal with unix and windows