========================
Python packaging roadmap
========================

Describe functional goals, feature tracking, and associated tests.

General tool implementation sequence
------------------------------------

1. Tools in the ``operation`` submodule wrap Python code with a compliant interface
2. ``commandline_operation()`` provides a UI for (nearly) arbitrary CLI tools
3. The ``tool`` submodule provides trivially wrapped ``gmx`` tools with well-defined inputs and outputs
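
The first two steps above can be sketched in plain Python. This is a minimal illustration and not the gmxapi implementation: ``Operation``, ``wrap_function`` and ``toy_commandline_operation`` are hypothetical names, and the output keys (``data``, ``returncode``, ``stdout``) are assumptions made for the example.

```python
import subprocess
import sys


class Operation:
    """Hypothetical operation handle: inputs in, named outputs out."""

    def __init__(self, runner, **inputs):
        self._runner = runner
        self._inputs = inputs
        self._output = None

    def run(self):
        # Execute at most once and cache the named outputs.
        if self._output is None:
            self._output = self._runner(**self._inputs)
        return self

    @property
    def output(self):
        return self.run()._output


def wrap_function(func):
    """Step 1 (sketch): give importable Python code an operation interface."""
    def factory(**inputs):
        return Operation(lambda **kwargs: {"data": func(**kwargs)}, **inputs)
    return factory


def toy_commandline_operation(executable, arguments=()):
    """Step 2 (sketch): wrap a (nearly) arbitrary command line tool."""
    def runner(executable, arguments):
        done = subprocess.run([executable, *arguments],
                              capture_output=True, text=True)
        return {"returncode": done.returncode, "stdout": done.stdout}
    return Operation(runner, executable=executable, arguments=list(arguments))


@wrap_function
def add(a, b):
    return a + b


total = add(a=1, b=2)
# The Python interpreter stands in for an arbitrary CLI tool here.
echo = toy_commandline_operation(sys.executable, ["-c", "print('hello')"])
```

Both handles expose the same ``output`` mapping, which is the point of a "compliant interface": downstream code does not need to know whether the work was Python code or an external executable.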

Simulation tool implementation sequence
---------------------------------------

* ``result()`` methods force data locality for use by non-API calls
* output properties allow proxied access to (fault tolerant, portable) result futures
* ``logical_and()``, etc. allow manipulation of data / signals "in flight"
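
A rough sketch of the idea behind result futures follows. The ``Future`` class and this ``logical_and`` are hypothetical stand-ins written for illustration; they assume only that a future defers its work until ``result()`` is called and say nothing about fault tolerance or portability.

```python
class Future:
    """Hypothetical lazy result handle (not the gmxapi class)."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def result(self):
        # Force evaluation and make the value local to the caller.
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value


def logical_and(a, b):
    """Combine two futures into a new future without resolving either."""
    return Future(lambda: bool(a.result()) and bool(b.result()))


calls = []


def converged():
    calls.append("converged")
    return True


def within_budget():
    calls.append("within_budget")
    return False


condition = logical_and(Future(converged), Future(within_budget))
evaluated_before = list(calls)   # still empty: nothing has run "in flight"
value = condition.result()       # only now are both inputs evaluated
```

Building ``condition`` does not trigger any work; the inputs are evaluated only when a non-API caller asks for a local value through ``result()``.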

Data flow topology tools
------------------------

* implicit scatter / map
* ``gather()`` makes results local
* ``reduce()`` allows operation across an ensemble (implicit allReduce where appropriate)
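
The three tools above can be illustrated with a toy, single-process model. ``Ensemble``, ``apply`` and the signatures of ``gather`` and ``reduce`` below are assumptions made for this sketch; a real implementation would distribute members across ranks rather than hold them in a list.

```python
from functools import reduce as fold
import operator


class Ensemble:
    """Hypothetical ensemble-wide data: one value per ensemble member."""

    def __init__(self, members):
        self.members = list(members)


def apply(func, data):
    """Call func once for plain input, or once per member (implicit map)."""
    if isinstance(data, Ensemble):
        return Ensemble(func(x) for x in data.members)
    return func(data)


def gather(data):
    """Make all member results local as an ordinary list."""
    return list(data.members)


def reduce(func, data):
    """Combine member results into a single value."""
    return fold(func, data.members)


def square(x):
    return x * x


single = apply(square, 4)                      # no ensemble: one call
squares = apply(square, Ensemble([1, 2, 3]))   # ensemble: one call per member
local = gather(squares)
total = reduce(operator.add, squares)
```

The caller writes ``apply(square, data)`` the same way in both cases; the shape of the work is implied by the dimensionality of the input.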

* ``subgraph`` allows several operations to be bundled with a scope that allows
  user-declared data persistence (e.g. when used in a loop)
* ``while_loop`` allows repeated execution of a subgraph, subject to a declared
  condition of the subgraph state/output, managing accessibility of data handles
  internal and external to the graph
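
One way to picture these two constructs is shown below. ``Subgraph`` and this ``while_loop`` are hypothetical sketches, and the ``max_iterations`` guard is an assumption added for safety. The declared variables persist between iterations, while anything else the body produces stays internal to the graph.

```python
class Subgraph:
    """Hypothetical bundle of operations with declared persistent variables."""

    def __init__(self, variables, body):
        self.variables = dict(variables)   # state visible outside the graph
        self._body = body

    def run(self):
        # The body sees a copy of the state and returns its outputs.
        outputs = self._body(dict(self.variables))
        # Only declared variables persist; other outputs are discarded.
        for key in self.variables:
            if key in outputs:
                self.variables[key] = outputs[key]


def while_loop(subgraph, condition, max_iterations=1000):
    """Re-run the subgraph while the condition on its state holds."""
    iterations = 0
    while condition(subgraph.variables) and iterations < max_iterations:
        subgraph.run()
        iterations += 1
    return dict(subgraph.variables)


def body(state):
    scratch = state["error"] / 2           # temporary, internal to the graph
    return {"error": scratch, "steps": state["steps"] + 1, "scratch": scratch}


graph = Subgraph({"error": 1.0, "steps": 0}, body)
final = while_loop(graph, lambda state: state["error"] > 0.1)
```

After the loop, ``final`` holds only ``error`` and ``steps``; the ``scratch`` value never becomes accessible outside the graph.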

1. Top-level module function ``run()`` launches work to produce requested data
2. Infrastructure manages / negotiates dispatching or allocation of available resources

1. The Python package can be installed from the GROMACS source directory after GROMACS installation.
2. The Python package can be installed by a user from a hand-built GROMACS installation.
3. The Python package can be installed from a Linux binary GROMACS distribution using
   appropriate optimized binary libraries.

Installation procedures are testable with Dockerfiles in a Docker-based CI environment.
Ref: https://github.com/kassonlab/gmxapi/blob/master/ci_scripts/test_installers.sh

In addition to basic scalar types,
some structured data types are immediately necessary.

* gmxapi.NDArray supports an N-dimensional array of a single scalar gmxapi data type
  (with sufficient metadata for common array view APIs)
* gmxapi.Map supports an associative container with key strings and Variant values

NDArray is immediately necessary to disambiguate non-scalar parameters/inputs from
implicit data flow topological operations.
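
The ambiguity can be shown with a small sketch. The ``NDArray`` below is a hypothetical stand-in, not ``gmxapi.NDArray``, and ``ensemble_width`` assumes for illustration that a plain Python sequence is what triggers an implicit scatter.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NDArray:
    """Hypothetical array: flat values of one scalar type plus shape metadata."""
    values: Tuple[float, ...]
    shape: Tuple[int, ...]
    dtype: type = float

    def __post_init__(self):
        size = 1
        for extent in self.shape:
            size *= extent
        if size != len(self.values):
            raise ValueError("shape does not match the number of values")
        if not all(isinstance(v, self.dtype) for v in self.values):
            raise TypeError("all elements must share the declared scalar type")


def ensemble_width(data):
    """Number of operation instances implied by one input (sketch)."""
    if isinstance(data, NDArray):
        return 1              # a single array-valued input
    if isinstance(data, (list, tuple)):
        return len(data)      # assumed rule: scatter one element per instance
    return 1


box = NDArray((1.0, 0.0, 0.0, 0.0, 1.0, 0.0), shape=(2, 3))
width_array = ensemble_width(box)
width_pair = ensemble_width([box, box])
width_list = ensemble_width([1.0, 2.0, 3.0])
```

Without a dedicated array type, a three-element list could mean either "one vector parameter" or "three ensemble members"; wrapping the vector in an array type removes the guesswork.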

Data structures currently coupled to file formats should be decomposable into
Maps with well-specified schema, but in the absence of more complete API data
abstractions we just use filename Strings as the API-level content of
input/output data handles. (Suggest: no need for file objects at
the higher-level API if read_Xfiletype(/write_Yfiletype) operations
produce(/consume) named and typed data objects of specified API types.)

The API specification should be clear about policies for narrowing and widening

Features development sequence based on functional priorities and dependencies.

* fr1: wrap importable Python code
* fr2: output proxy establishes execution dependency (superseded by fr3)
* fr3: output proxy can be used as input
* fr4: dimensionality and typing of named data causes generation of correct work topologies
* fr5: explicit many-to-one or many-to-many data flow
* fr7: Python bindings for launching simulations
* fr8: gmx.mdrun understands ensemble work
* fr10: fused operations for use in looping constructs
* fr11: Python access to TPR file contents
* fr12: Simulation checkpoint handling
* fr13: ``run`` module function simplifies user experience
* fr14: Easy access to GROMACS run time parameters
* fr15: Simulation input modification
* fr16: Create simulation input from simulation output
* fr17: Prepare simulation input from multiple sources
* fr18: GROMACS CLI tools receive improved Python-level support over generic commandline_operations
* fr19: GROMACS CLI tools receive improved C++-level support over generic commandline_operations
* fr20: Python bindings use C++ API for expressing user interface
* fr21: User insulated from filesystem paths
* fr22: MPI-based ensemble management from Python
* fr23: Ensemble simulations can themselves use MPI

Expectations on Mark for Q1-Q2 2019 GROMACS master changes
==========================================================

* Broker and implement a build system amenable to multiple use
  cases. Need to be able to build and deploy the Python module from a single
  source repo that is usable (i.e. can run the acceptance tests).

  - Some kind of nested structure is likely appropriate, perhaps
    structured as nested CMake projects that in principle could stand
    alone. That's probably workable because nested projects can see
    the parent project's cache variables (TODO check this)
  - probably a top-level project coordinating a libgromacs build and a
    Python module build, with the former typically feeding the latter
  - the libgromacs build may be able to leverage independent efforts
    towards a multi-configuration build (so SIMD/MPI/GPU agnostic)
  - top-level project offers much the same UI as now, passing much of
    it through to the libgromacs project
  - top-level project offers the option to find a Python (or be told
    which to use), to find a libgromacs (or be told, or be told to
    build), to build any necessary wrapper binaries (i.e. classical gmx
    and mdrun), and to deploy all linked artefacts to
    CMAKE_INSTALL_PREFIX or the appropriate Python site-packages
  - the top-level project will be used by e.g. a setup.py wrapper
    from scikit-build/distutils
  - requires reform of compiler flags handling
  - probably requires some re-organization of external dependencies
  - follow online "Modern CMake" best practices as far as practicable
  - library should be available for static linking with position
    independent code to allow a single shared object to be built for

* Dissolve boundary between libgmxapi and libgromacs

  - no effort on form and stability of the C++ headers and library in
    2019, beyond what facilitates implementing the Python interface
  - existing libgromacs declarations of "public API" and installed

* libgromacs should be able to use an MPI communicator passed in,
  rather than hard-coding MPI_COMM_WORLD anywhere. It is likely that
  existing wrapper binaries can use the same mechanism to pass
  MPI_COMM_WORLD to libgromacs.

* UI helpers should express:

  - preferred name for datum as a string: ``nsteps``, ``tau-t``, etc.
  - setter (function object, pointer to a builder method, etc.)
  - typing and type discovery (could be deducible from setter, but something to allow user input checking, or determination
    of the suitability of a data source to provide the given input)
  - help text: can be recycled to provide auto-extracted documentation, command-line help, and annotation in Python docstrings.
  - for CLI: short name for flag. E.g. 'p' for "topology_file"
  - for compatibility: deprecated / alternate names. E.g. "nstlist" for "neighbor_list_rebuild_interval", or "orire" for
    "enable_orientation_restraints"
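
As an illustration only, the pieces of information listed above could be collected in a single descriptor. The sketch is in Python for brevity even though the helpers themselves would live in C++; ``ParameterHelper``, ``set_parameters`` and the help text are hypothetical, while the parameter name and its ``nstlist`` alias come from the examples above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class ParameterHelper:
    """Hypothetical descriptor for one user-facing input."""
    name: str                                       # preferred name
    setter: Callable[[Dict[str, Any], Any], None]   # how to store the value
    value_type: type                                # typing / input checking
    help_text: str                                  # docs, CLI help, docstrings
    short_flag: Optional[str] = None                # CLI short name
    aliases: Tuple[str, ...] = ()                   # deprecated / alternate names

    def matches(self, key):
        return key == self.name or key in self.aliases

    def apply(self, target, raw):
        # Conversion doubles as user input checking (raises on bad input).
        self.setter(target, self.value_type(raw))


def set_parameters(helpers, target, **user_input):
    """Route each user-supplied key to the helper that recognizes it."""
    for key, raw in user_input.items():
        for helper in helpers:
            if helper.matches(key):
                helper.apply(target, raw)
                break
        else:
            raise KeyError(key)


def _set_interval(target, value):
    target["neighbor_list_rebuild_interval"] = value


interval = ParameterHelper(
    name="neighbor_list_rebuild_interval",
    setter=_set_interval,
    value_type=int,
    help_text="Illustrative help text for the rebuild interval.",
    aliases=("nstlist",),
)

params = {}
set_parameters([interval], params, nstlist="10")
```

The user supplies the legacy name and a string; the helper resolves the alias, checks the type, and stores the value under the preferred name.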

Possible GROMACS source changes whose impact is currently unknown
=================================================================

* gmx::Any (which is a flavour of C++17 std::any) type could be
  helpful at API boundary. Also perhaps a flavour of C++17
  std::optional or std::variant.

Some project goals are integrations or optimizations that are explicitly hidden from the user
and not testable in a high level script, but should be reflected as milestones in a roadmap.

GROMACS source changes deferred to later in 2019
================================================

* Build system works also from tarball
* Build system can produce maximally static artefacts (for performance
  on HPC infrastructure)
* express grompp and mdrun options handling with gmx::Options to
  prepare for future dictionary-like handling in Python without
  serializing a .tpr file