Enable checkpointing for modular simulator
This introduces the checkpoint helper and checkpoint helper clients.
The `CheckpointHelper` is responsible for writing checkpoints. In the
longer term, it will also be responsible for reading checkpoints, but
this is not yet implemented.
Checkpoints are written just before neighbor-searching (NS) steps, or
before the last step. Checkpointing occurs periodically (by default,
every 15 minutes) and needs two NS steps to take effect: on the first
NS step, the checkpoint helper on the master rank signals to all other
ranks that checkpointing is about to occur, and at the next NS step the
checkpoint is written. On the last step, checkpointing happens
immediately before the step (no signalling). To be able to react to the
last step being signalled, the `CheckpointHelper` also implements the
`ISimulatorElement` interface, but only registers a function if the
last step has been signalled.
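The two-stage scheme above can be sketched as follows. This is only an
illustration, not the code in this change: `CheckpointSchedule`,
`requestCheckpoint` and `beforeStep` are hypothetical names, and the
communication between ranks is reduced to a single flag.

```cpp
// Hypothetical sketch of the two-stage checkpoint scheduling.
// Not the actual CheckpointHelper; cross-rank signalling is modelled
// by a plain boolean.
class CheckpointSchedule
{
public:
    // Called (conceptually on the master rank) once the checkpoint
    // period has elapsed.
    void requestCheckpoint() { requested_ = true; }

    // Called at the top of every step. Returns true if a checkpoint
    // is written before this step is executed.
    bool beforeStep(bool isNSStep, bool isLastStep)
    {
        if (isLastStep)
        {
            // Last step: write immediately, no signalling round.
            requested_ = false;
            signalled_ = false;
            return true;
        }
        if (!isNSStep)
        {
            return false;
        }
        if (signalled_)
        {
            // Second NS step: all ranks know, so write now.
            signalled_ = false;
            return true;
        }
        if (requested_)
        {
            // First NS step: announce the upcoming checkpoint.
            requested_ = false;
            signalled_ = true;
        }
        return false;
    }

private:
    bool requested_ = false;
    bool signalled_ = false;
};
```

In this sketch a request made between two NS steps is announced at the
next NS step and written at the one after it, which mirrors the "needs
two NS steps to take effect" behaviour described above.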
Checkpointing happens at the top of a simulation step, which gives a
straightforward re-entry point at the top of the simulator loop. Moving
the checkpointing position requires summing the kinetic energy (in
compute globals) a step earlier for md-vv.
In the current implementation, the clients of CheckpointHelper fill a
legacy t_state object (passed via pointer) with whatever data they need
to store. The CheckpointHelper then writes the t_state object to file.
This is an intermediate state of the code, as the long-term plan is for
modules to read and write from a checkpoint file directly, without the
need for a central object. The current implementation does, however,
make it possible to define clearly which modules take part in
checkpointing, while using the current infrastructure for reading and
writing checkpoints.
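The client pattern described above might look roughly like the sketch
below. All names here (`LegacyState`, `ICheckpointClient`,
`CheckpointCollector`, the two example clients) are hypothetical
stand-ins chosen for illustration; the real t_state object and client
interface carry far more data and are not reproduced here.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for the legacy t_state object.
struct LegacyState
{
    std::vector<double> positions;
    std::vector<double> lambda;
};

// Hypothetical client interface: every module that takes part in
// checkpointing implements it and fills in the data it owns.
class ICheckpointClient
{
public:
    virtual ~ICheckpointClient() = default;
    virtual void writeCheckpoint(LegacyState* state) = 0;
};

// Example client that owns positions.
class PositionsClient final : public ICheckpointClient
{
public:
    explicit PositionsClient(std::vector<double> x) : x_(std::move(x)) {}
    void writeCheckpoint(LegacyState* state) override { state->positions = x_; }

private:
    std::vector<double> x_;
};

// Example client that owns the lambda vector.
class LambdaClient final : public ICheckpointClient
{
public:
    explicit LambdaClient(std::vector<double> lambda) : lambda_(std::move(lambda)) {}
    void writeCheckpoint(LegacyState* state) override { state->lambda = lambda_; }

private:
    std::vector<double> lambda_;
};

// Collects data from all registered clients into one state object,
// which a real helper would then hand to the checkpoint writer.
class CheckpointCollector
{
public:
    void registerClient(ICheckpointClient* client) { clients_.push_back(client); }

    LegacyState collect() const
    {
        LegacyState state;
        for (ICheckpointClient* client : clients_)
        {
            client->writeCheckpoint(&state);
        }
        return state;
    }

private:
    std::vector<ICheckpointClient*> clients_;
};
```

The list of registered clients is what makes it explicit which modules
take part in checkpointing, even though the data still passes through
one central object.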
This change also adds a lambda vector to the force element, since in
some cases, the force routine accesses the lambda vector even if FEP is
off, which leads to segmentation faults. It also moves the ownership
of needToSumEkinhOld_ from the compute globals element to the energy
element. It reintroduces the observablesHistory, which is owned by
the energy element. It also allows the CheckpointHelper to access
the outf private variable of TrajectoryElement.
Finally, this change also fixes a bug in the NeighborSearchSignaller
which would fail to signal the first step of the simulation run as a
NS step if it didn't fulfill do_per_step(step, nstlist). This was
not a problem as long as only new simulations were allowed, but may
cause errors when restarting from checkpoints.
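The corrected condition can be illustrated with a minimal sketch.
`doPerStep` below is a simplified stand-in written for this example
(assumed to return true when step is a multiple of a non-zero
interval), and `isNSStep` and `initialStep` are hypothetical names, not
the signaller's actual members.

```cpp
#include <cstdint>

// Simplified stand-in for do_per_step: true if step is a multiple of
// a non-zero interval.
bool doPerStep(std::int64_t step, std::int64_t interval)
{
    return interval != 0 && step % interval == 0;
}

// Hypothetical NS-step test: the first step of the run always counts
// as an NS step, whether or not it is a multiple of nstlist.
bool isNSStep(std::int64_t step, std::int64_t initialStep, std::int64_t nstlist)
{
    return step == initialStep || doPerStep(step, nstlist);
}
```

For example, with nstlist = 10 and a run restarted at step 25, the
periodic test alone rejects step 25, while the combined test treats it
as an NS step and then continues with steps 30, 40, and so on.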
Change-Id: I15066fa66d653567f680a1c616a13ccfb7e3e955
16 files changed: