pmdalinux: add maximum memory bandwidth per numa node metric
Add a new metric that reports the maximum memory bandwidth per NUMA
node. The metric is populated from a config file named bandwidth.conf,
which contains the per-node bandwidth information.
For example:
node0:40960
node1:40960
...
Each row names a NUMA node of the system, followed by the maximum
memory bandwidth (in MB/sec) it supports. The maximum memory bandwidth
can be determined with benchmarking tools that saturate the bandwidth
and measure it.
The pmdalinux agent parses the config file and checks whether each node
listed is present in the sysfs/devices/system/node/ directory. A node
name in the config file must match the name of one of the nodes in the
node/ directory. The bandwidth value is taken from the config file and
stored in the node_info struct for that node.
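
The parsing rule described above can be sketched as follows. This is an
illustrative Python sketch only, not the agent's actual C code: the
function names, the use of a dict in place of the node_info struct, and
the choice to silently skip malformed rows or unknown nodes are all
assumptions made for the example.

```python
import os
import re


def list_present_nodes(node_dir):
    """Return the set of nodeN entries found in node_dir.

    node_dir is a parameter (hypothetical helper); on a typical Linux
    system the node directory is /sys/devices/system/node.
    """
    return {e for e in os.listdir(node_dir) if re.fullmatch(r"node\d+", e)}


def parse_bandwidth_conf(text, present_nodes):
    """Parse "nodeN:bandwidth" rows into {node_name: MB/sec as float}.

    Rows whose node name is not in present_nodes are ignored, mirroring
    the rule that config names must match a node in the node/ directory.
    Skipping malformed rows is an assumption of this sketch.
    """
    bandwidth = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "name:value" row
        name = name.strip()
        if name not in present_nodes:
            continue  # node not present on this system
        try:
            bandwidth[name] = float(value)  # floating point values allowed
        except ValueError:
            continue  # value is not a number
    return bandwidth
```

For instance, given the two-row example config and a system with node0
and node1, parse_bandwidth_conf("node0:40960\nnode1:40960\n",
{"node0", "node1"}) yields one floating point entry per node.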
# pminfo | grep bandwidth
mem.numa.max_bandwidth
# pmval mem.numa.max_bandwidth
metric: mem.numa.max_bandwidth
host: <some_host>
semantics: instantaneous value
units: Mbyte / sec
samples: all
node0 node1
4.096E+04 4.096E+04
4.096E+04 4.096E+04
...
A few things to note:
- The user/client can run benchmarking tools to saturate the
bandwidth and then record the measured value in bandwidth.conf.
- The max bandwidth value can be given as a floating point.
- Each node name in bandwidth.conf must match one of the node
names found in the sysfs/devices/system/node/ directory.
- Right now, automatic update of max bandwidth is not supported due to
the lack of standard, arch-independent tools.
- Support for automatic updates for max bandwidth using some
benchmarking tools will be added later.
Purpose of this metric:
As of now, we have hardware counters for measuring the current memory
bandwidth (read and write), and those values can be aggregated per
node. The "perfevent" agent for PCP can be used for that. However, the
current bandwidth alone is not sufficient for making decisions about
placement/migration of workloads across nodes (or systems). We also
need the maximum bandwidth supported by each node to compute the
utilization. The maximum bandwidth metric can be used for this purpose.
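
As a rough illustration of that use, utilization per node is the current
bandwidth divided by the maximum bandwidth. The Python sketch below
assumes both sets of values have already been fetched into dicts keyed
by node name and expressed in MB/sec; the function name and the input
values are hypothetical, and no PCP API is used.

```python
def node_utilization(current_mb_per_sec, max_mb_per_sec):
    """Return {node: current / max} for nodes present in both inputs.

    Nodes with a non-positive maximum are skipped to avoid dividing by
    zero (an assumption of this sketch).
    """
    util = {}
    for node, max_bw in max_mb_per_sec.items():
        if max_bw <= 0 or node not in current_mb_per_sec:
            continue
        util[node] = current_mb_per_sec[node] / max_bw
    return util


# Made-up current readings against the 40960 MB/sec maximum from the
# example config above.
current = {"node0": 10240.0, "node1": 30720.0}
maximum = {"node0": 40960.0, "node1": 40960.0}
util = node_utilization(current, maximum)
print(util)                      # {'node0': 0.25, 'node1': 0.75}
print(min(util, key=util.get))   # node0 (least utilized node)
```

A placement policy could then prefer the node with the lowest ratio
rather than the node with the lowest absolute bandwidth.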
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>