Title: Introduce docker monitoring with Check_MK
With this change we prepare Check_MK for monitoring docker environments out
of the box. The new checks operate on different layers (node and container).

The docker monitoring is currently available through the Linux agent. To get
a docker node monitored, it should be enough to deploy the agent on the node
as usual. Check_MK will find all relevant checks automatically.

The agent on the node iterates over all containers and executes the
Check_MK agent in the context of each container. If an agent is already
installed in the container, the container's agent is used. Otherwise
the node executes its own agent in the context of the container.
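
The choice between the two agents can be pictured with the following minimal sketch. It is not the agent's actual code (the real agent is a shell script). The function name, the default agent path and the use of <tt>bash</tt> inside the container are assumptions made for illustration; only <tt>docker exec</tt> and its <tt>-i</tt> option are standard docker CLI.

```python
import shlex


def build_container_agent_command(container_id, container_has_agent,
                                  node_agent_path="/usr/bin/check_mk_agent"):
    """Return a shell command line that runs an agent in the container context.

    Hypothetical helper: the path and the shell used are assumptions.
    """
    cid = shlex.quote(container_id)
    if container_has_agent:
        # An agent is part of the container image: run that one.
        return "docker exec {} check_mk_agent".format(cid)
    # No agent in the container: feed the node's agent script to a shell
    # that is started inside the container.
    return "docker exec -i {} bash < {}".format(cid, shlex.quote(node_agent_path))
```

With this sketch, a container without its own agent yields a command that pipes the node's agent script into the container, so nothing has to be installed in the image.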
If you need specific agent plugins executed in the container, you can
add them to the container image together with the agent, just as you would
for regular hosts.

By default the docker container specific parts are transported via piggyback
from the node to the Check_MK server. This means that you have to create
hosts in Check_MK that use the short container ID as their name.
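
As an illustration of the piggyback transport, the sketch below wraps the agent output of one container in piggyback markers (<tt>&lt;&lt;&lt;&lt;hostname&gt;&gt;&gt;&gt;</tt> to start, <tt>&lt;&lt;&lt;&lt;&gt;&gt;&gt;&gt;</tt> to end), using the 12 character short container ID as the host name. The function name is hypothetical and this is a sketch, not the agent's implementation.

```python
def wrap_as_piggyback(container_id, agent_output):
    """Wrap container agent output in piggyback markers (illustrative only)."""
    # Docker's short container ID is the first 12 characters of the full ID.
    short_id = container_id[:12]
    body = agent_output if agent_output.endswith("\n") else agent_output + "\n"
    # "<<<<name>>>>" assigns the following sections to the host "name",
    # "<<<<>>>>" switches back to the host that delivered the output.
    return "<<<<{}>>>>\n{}<<<<>>>>\n".format(short_id, body)
```

The host created in Check_MK has to carry exactly the name that appears between the markers, otherwise the piggyback data is not assigned to it.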
For the docker container hosts please use the following configuration:

<li>Set the "Check_MK Agent" option to "No agent".</li>
<li>Set the "IP address family" to "No IP" so that only the piggyback data is processed.</li>
<li>Set the docker node as parent.</li>
<li>Enable HW/SW inventory for the node and the containers.</li>

The manual (or scripted) configuration of these hosts is necessary with
Check_MK 1.5. Check_MK 1.6 will solve this problem automatically in a more elegant way.
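
For a scripted setup, the settings listed above could first be collected per container, for example as in the following sketch. The dictionary keys are the descriptive option names used in this text, not real WATO or API attribute names, and the function is hypothetical. How the result is fed into Check_MK is deliberately left open.

```python
def container_host_settings(node_name, container_ids):
    """Map short container IDs to the host settings recommended above.

    Keys are descriptive labels only (an assumption for illustration),
    not attribute names of any Check_MK interface.
    """
    hosts = {}
    for container_id in container_ids:
        hosts[container_id[:12]] = {
            "Check_MK Agent": "No agent",
            "IP address family": "No IP",
            "Parents": [node_name],
            "HW/SW inventory": True,
        }
    return hosts
```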
There are other use cases. For example, if you do not have access to the node,
you can also install the agent (including optional config and plugins)
into the image and make the container open a dedicated network port for

We'll add a dedicated docker monitoring page to the documentation in the
near future to describe this in detail.

The following changes have been made for now:
<h3>New check plugins</h3>
<li>docker_node_info: Check the status of the docker daemon<br>
Checks whether or not the docker daemon is running and functional on the docker
<li>docker_node_info.containers: Count number of containers<br>
Counts the number of containers in the different states and creates metrics
from this information. Thresholds can be configured on the number of
containers in the different states.
<li>docker_node_disk_usage: Disk usage of docker files<br>
This check summarizes the disk usage of docker files (images, ...) on
the disks. It tells you whether or not you can save disk space by
<li>docker_container_cpu: Check the CPU utilization of a docker container<br>
This check reports the percentage CPU utilization of a docker container.
Unlike the Linux CPU utilization check (kernel.util), it only reports
user and system time. More detailed values, like iowait, are not available.
<li>docker_container_mem: Docker container specific memory checking<br>
Instead of the default Linux memory check (mem), Check_MK now uses
the container specific memory check for containers.
The main reason is that the memory information of the container is not
available through <tt>/proc/meminfo</tt> as usual. The memory data is available
through the kernel's cgroup interface, which is available in the container's
context below <tt>/sys/fs/cgroup/memory/memory.stat</tt>.
The features of both checks are exactly the same.
<li>docker_container_status: Checks running state of container<br>
The check docker_container_status checks whether or not a container is running.
<li>docker_container_status.health: Check healthcheck API of containers<br>
Checks the status of containers as reported by Docker's healthcheck API.
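
To illustrate the memory data source mentioned for docker_container_mem: the cgroup file <tt>memory.stat</tt> consists of lines with a key and a numeric value. The sketch below parses such content into a dictionary. It is a minimal example under that assumption, not the code of the check plugin, and the function name is hypothetical.

```python
def parse_memory_stat(text):
    """Parse the content of a cgroup memory.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            # Skip empty or unexpected lines instead of failing.
            continue
        key, value = parts
        try:
            stats[key] = int(value)
        except ValueError:
            continue
    return stats


def read_container_memory_stat(path="/sys/fs/cgroup/memory/memory.stat"):
    """Read and parse the file from within the container context."""
    with open(path) as stat_file:
        return parse_memory_stat(stat_file.read())
```

Keeping the parsing separate from the file access makes it possible to try the parser on sample text without a running container.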
<h3>New HW / SW inventory plugins</h3>
<li>docker_node_images: Inventorize docker node information<br>
Inventorizes information about repository, tag, ID, creation time, size,
labels and the number of docker images. It also collects information about
how many containers currently use each image.
<li>docker_node_info: Inventory plugin displaying docker version<br>
Adds the docker version and node labels to the inventory tree.
<li>docker_container_labels: Inventorize the labels of containers</li>
<li>docker_container_node_name: Inventorize node name of containers</li>
<h3>Preparing the Linux agent for docker monitoring</h3>
<li>The agent now detects whether or not it is being executed
in a docker container context.
<li>Find docker containers and execute agent in context<br>
If the agent is running on a docker node, it iterates over
all running containers and executes the Check_MK agent in
the context of the container to gather container specific
If a check_mk_agent is already installed in the
container, this agent is executed.
If there is no check_mk_agent installed, the agent
of the docker node is executed in the container.
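
This text does not say how the agent detects the container context. One common heuristic, assumed here purely for illustration, is to look for docker in the cgroup paths of process 1 (<tt>/proc/1/cgroup</tt>). The sketch below implements that heuristic on the file content. It may not match what the agent does and does not cover every cgroup setup.

```python
def runs_in_docker_container(proc_1_cgroup_text):
    """Heuristic: True if any cgroup path of PID 1 contains 'docker'.

    Each line of /proc/1/cgroup has the form
    hierarchy-ID:controller-list:cgroup-path
    """
    for line in proc_1_cgroup_text.splitlines():
        fields = line.split(":", 2)
        if len(fields) == 3 and "docker" in fields[2]:
            return True
    return False
```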
<h3>Changed checks</h3>
<li>lnx_if: Exclude veth* network interfaces on docker nodes<br>
The veth* network interfaces created for docker containers are now
excluded by the Linux agent in all cases. The interface names have no
direct match with the docker container name or ID. They appear to be
generated randomly.
These container specific interfaces do not need to be monitored
on the node. We are monitoring the docker network interfaces in the
<li>df: Exclude docker local storage mounts on docker nodes<br>
The df check now excludes all filesystems found below
<tt>/var/lib/docker</tt>, which is the default location for
the docker container local storage.
Depending on the storage engine used, docker creates overlay
filesystems and mounts below this hierarchy for the started
These filesystems are not relevant for monitoring the node. They
will be monitored from the container context.
<li>df mounts: Skip docker mounts for name resolution in container<br>
When docker containers are configured to perform name resolution, there are
mounts at <tt>/etc/resolv.conf</tt>, <tt>/etc/hostname</tt> and
<tt>/etc/hosts</tt> which do not need to be monitored. These checks are
<li>uptime: Is now reported correctly for docker containers<br>
In previous versions, the Linux agent reported the uptime of the
docker node when it was executed
in a docker container context.
<li>Checks disabled in docker container contexts<br>
These checks do not make sense in the context of a docker container.
The agent now skips these sections when executed in a container.
For some of these checks, docker specific ones have been added (see above).
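
The exclusions described above for lnx_if and df can be summarized as simple filters. The following sketch is an illustration only: the function names are hypothetical and the actual filtering happens in the Linux agent, not in Python.

```python
# Mounts that docker adds to containers for name resolution.
NAME_RESOLUTION_MOUNTS = {"/etc/resolv.conf", "/etc/hostname", "/etc/hosts"}


def is_docker_veth(interface_name):
    """True for the veth* interfaces created for docker containers."""
    return interface_name.startswith("veth")


def is_docker_storage_mount(mount_point, docker_root="/var/lib/docker"):
    """True for filesystems mounted below the docker local storage."""
    return mount_point.startswith(docker_root.rstrip("/") + "/")


def is_name_resolution_mount(mount_point):
    """True for the name resolution mounts inside a container."""
    return mount_point in NAME_RESOLUTION_MOUNTS


def filter_node_interfaces(interface_names):
    """Keep only the interfaces that remain relevant on the docker node."""
    return [name for name in interface_names if not is_docker_veth(name)]
```

The storage filter compares against the directory prefix including the trailing slash, so a path such as <tt>/var/lib/dockerfoo</tt> is not excluded by accident.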