<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<h1>KVM/QEMU hypervisor driver</h1>
The libvirt KVM/QEMU driver can manage any QEMU emulator of
version 1.5.0 or later.
<h2><a id="project">Project Links</a></h2>
The <a href="https://www.linux-kvm.org/">KVM</a> Linux
The <a href="https://wiki.qemu.org/Index.html">QEMU</a> emulator
<h2><a id="prereq">Deployment pre-requisites</a></h2>
<strong>QEMU emulators</strong>: The driver will probe <code>/usr/bin</code>
for the presence of <code>qemu</code>, <code>qemu-system-x86_64</code>,
<code>qemu-system-microblaze</code>,
<code>qemu-system-microblazeel</code>,
<code>qemu-system-mips</code>, <code>qemu-system-mipsel</code>,
<code>qemu-system-sparc</code>, <code>qemu-system-ppc</code>. The results
of this can be seen from the capabilities XML output.
<strong>KVM hypervisor</strong>: The driver will probe <code>/usr/bin</code>
for the presence of <code>qemu-kvm</code> and the <code>/dev/kvm</code> device
node. If both are found, then KVM fully virtualized, hardware accelerated
guests will be available.
<h2><a id="uris">Connections to QEMU driver</a></h2>
The libvirt QEMU driver is a multi-instance driver, providing a single
system-wide privileged driver (the "system" instance), and per-user
unprivileged drivers (the "session" instance). The URI driver protocol
is "qemu". Some example connection URIs for the libvirt driver are:
<pre>
qemu:///session                      (local access to per-user instance)
qemu+unix:///session                 (local access to per-user instance)

qemu:///system                       (local access to system instance)
qemu+unix:///system                  (local access to system instance)
qemu://example.com/system            (remote access, TLS/x509)
qemu+tcp://example.com/system        (remote access, SASL/Kerberos)
qemu+ssh://root@example.com/system   (remote access, SSH tunnelled)
</pre>
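<p>The URI forms listed above can be taken apart with a short Python
sketch. It is illustrative only: the helper name
<code>describe_qemu_uri</code> and the returned dictionary keys are
assumptions made for this example, not part of libvirt.</p>

```python
from urllib.parse import urlsplit


def describe_qemu_uri(uri):
    """Split a libvirt QEMU connection URI into its visible parts.

    Hypothetical helper for illustration; libvirt performs its own parsing.
    """
    parts = urlsplit(uri)
    # The scheme is "qemu" optionally followed by "+transport".
    driver, _, transport = parts.scheme.partition("+")
    if driver != "qemu":
        raise ValueError(f"not a QEMU driver URI: {uri}")
    return {
        "transport": transport or None,      # e.g. "unix", "tcp", "ssh"
        "user": parts.username,              # only set for forms like root@host
        "host": parts.hostname,              # None for local connections
        "instance": parts.path.lstrip("/"),  # "system" or "session"
    }


if __name__ == "__main__":
    for example in ("qemu:///session", "qemu+ssh://root@example.com/system"):
        print(example, describe_qemu_uri(example))
```

<p>An empty host part corresponds to the local forms in the list, and the
path names the instance being addressed.</p>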
<h2><a id="security">Driver security architecture</a></h2>
There are multiple layers to security in the QEMU driver, allowing for
flexibility in the use of QEMU-based virtual machines.
<h3><a id="securitydriver">Driver instances</a></h3>
As explained above, there are two ways to access the QEMU driver
in libvirt. The "qemu:///session" family of URIs connects to a
libvirtd instance running as the same user/group ID as the client
application. Thus the QEMU instances spawned from this driver will
share the same privileges as the client application. The intended
use case for this driver is desktop virtualization, with virtual
machines storing their disk images in the user's home directory and
being managed from the local desktop login session.
The "qemu:///system" family of URIs connects to a
libvirtd instance running as the privileged system account 'root'.
Thus the QEMU instances spawned from this driver may have much
higher privileges than the client application managing them.
The intended use case for this driver is server virtualization,
where the virtual machines may need to be connected to host
resources (block, PCI, USB, network devices) whose access requires
<h3><a id="securitydac">POSIX users/groups</a></h3>
In the "session" instance, the POSIX users/groups model restricts QEMU
virtual machines (and libvirtd in general) to only have access to resources
with the same user/group ID as the client application. There is no
finer level of configuration possible for the "session" instances.
In the "system" instance, libvirt releases from 0.7.0 onwards allow
control over the user/group that the QEMU virtual machines are run
as. A build of libvirt with no configuration parameters set will
still run QEMU processes as root:root. It is possible to change
this default by using the --with-qemu-user=$USERNAME and
--with-qemu-group=$GROUPNAME arguments to 'configure' during
build. It is strongly recommended that vendors build with both
of these arguments set to 'qemu'. Regardless of this build time
default, administrators can set a per-host default setting in
the <code>/etc/libvirt/qemu.conf</code> configuration file via
the <code>user=$USERNAME</code> and <code>group=$GROUPNAME</code>
parameters. When a non-root user or group is configured, the
libvirt QEMU driver will change uid/gid to match immediately
before executing the QEMU binary for a virtual machine.
If QEMU virtual machines from the "system" instance are being
run as non-root, there will be greater restrictions on what
host resources the QEMU process will be able to access. The
libvirtd daemon will attempt to manage permissions on resources
to minimise the likelihood of unintentional security denials,
but the administrator / application developer must be aware of
some of the consequences / restrictions.
The directories <code>/var/run/libvirt/qemu/</code>,
<code>/var/lib/libvirt/qemu/</code> and
<code>/var/cache/libvirt/qemu/</code> must all have their
ownership set to match the user / group ID that QEMU
guests will be run as. If the vendor has set a non-root
user/group for the QEMU driver at build time, the
permissions should be set automatically at install time.
If a host administrator customizes user/group in
<code>/etc/libvirt/qemu.conf</code>, they will need to
manually set the ownership on these directories.
When attaching USB and PCI devices to a QEMU guest,
QEMU will need to access files in <code>/dev/bus/usb</code>
and <code>/sys/bus/pci/devices</code> respectively. The libvirtd daemon
will automatically set the ownership on specific devices
that are assigned to a guest at start time. There should
not be any need for administrator changes in this respect.
Any files/devices used as guest disk images must be
accessible to the user/group ID that QEMU guests are
configured to run as. The libvirtd daemon will automatically
set the ownership of the file/device path to the correct
user/group ID. Applications / administrators must be aware,
though, that the parent directory permissions may still
deny access. The directories containing disk images
must either have their ownership set to match the user/group
configured for QEMU, or their UNIX file permissions must
have the 'execute/search' bit enabled for 'others'.
The simplest option is the latter one, of just enabling
the 'execute/search' bit. For any directory to be used
for storing disk images, this can be achieved by running
the following command on the directory itself, and any
<pre>chmod o+x /path/to/directory</pre>
In particular, note that if using the "system" instance
and attempting to store disk images in a user home
directory, the default permissions on $HOME are typically
too restrictive to allow access.
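<p>As a rough aid for the directory permission check described above,
the following Python sketch lists which directories on the way to a
disk image lack the 'execute/search' bit for 'others'. The function name
<code>dirs_blocking_others</code> is hypothetical, and the sketch only
covers the <code>chmod o+x</code> option; it does not consider the
alternative of matching directory ownership to the QEMU user/group.</p>

```python
import os
import stat


def dirs_blocking_others(image_path):
    """Return directories above image_path that lack the o+x bit.

    Hypothetical helper: it only inspects the 'others' execute/search
    bit and ignores ownership, ACLs and security labels.
    """
    blocked = []
    current = os.path.dirname(os.path.abspath(image_path))
    while True:
        mode = os.stat(current).st_mode
        if not mode & stat.S_IXOTH:
            blocked.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return blocked


if __name__ == "__main__":
    # Example path only; substitute a real disk image location.
    for directory in dirs_blocking_others("/var/lib/libvirt/images/demo2.img"):
        print("missing o+x:", directory)
```

<p>Each directory reported would be a candidate for the
<code>chmod o+x</code> command shown earlier.</p>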
<h3><a id="securitycap">Linux process capabilities</a></h3>
The libvirt QEMU driver has a build time option allowing it to use
the <a href="http://people.redhat.com/sgrubb/libcap-ng/index.html">libcap-ng</a>
library to manage process capabilities. If this build option is
enabled, then the QEMU driver will use this to ensure that all
process capabilities are dropped before executing a QEMU virtual
machine. Process capabilities are what give the 'root' account
its high power; in particular, the CAP_DAC_OVERRIDE capability
is what allows a process running as 'root' to access files owned
If the QEMU driver is configured to run virtual machines as non-root,
then they will already lose all their process capabilities at time
of startup. The Linux capability feature is thus aimed primarily at
the scenario where the QEMU processes are running as root. In this
case, before launching a QEMU virtual machine, libvirtd will use
libcap-ng APIs to drop all process capabilities. It is important
for administrators to note that this implies the QEMU process will
<strong>only</strong> be able to access files owned by root, and
not files owned by any other user.
Thus, if a vendor / distributor has configured their libvirt package
to run as 'qemu' by default, a number of changes will be required
before an administrator can change a host to run guests as root.
In particular it will be necessary to change ownership on the
directories <code>/var/run/libvirt/qemu/</code>,
<code>/var/lib/libvirt/qemu/</code> and
<code>/var/cache/libvirt/qemu/</code> back to root, in addition
to changing the <code>/etc/libvirt/qemu.conf</code> settings.
<h3><a id="securityselinux">SELinux basic confinement</a></h3>
The basic SELinux protection for QEMU virtual machines is intended to
protect the host OS from a compromised virtual machine process. There
is no protection between guests.
In the basic model, all QEMU virtual machines run under the confined
domain <code>root:system_r:qemu_t</code>. It is required that any
disk image assigned to a QEMU virtual machine is labelled with
<code>system_u:object_r:virt_image_t</code>. In a default deployment,
package vendors/distributors will typically ensure that the directory
<code>/var/lib/libvirt/images</code> has this label, such that any
disk images created in this directory will automatically inherit the
correct labelling. If attempting to use disk images in another
location, the user/administrator must ensure the directory has been
given this requisite label. Likewise, physical block devices must
be labelled <code>system_u:object_r:virt_image_t</code>.
Not all filesystems allow for labelling of individual files. In
particular NFS, VFat and NTFS have no support for labelling. In
these cases administrators must use the 'context' option when
mounting the filesystem to set the default label to
<code>system_u:object_r:virt_image_t</code>. In the case of
NFS, there is an alternative option, of enabling the <code>virt_use_nfs</code>
<h3><a id="securitysvirt">SELinux sVirt confinement</a></h3>
The SELinux sVirt protection for QEMU virtual machines builds on the
basic level of protection, to also allow individual guests to be
protected from each other.
In the sVirt model, each QEMU virtual machine runs under its own
confined domain, which is based on <code>system_u:system_r:svirt_t:s0</code>
with a unique category appended, e.g. <code>system_u:system_r:svirt_t:s0:c34,c44</code>.
The rules are set up such that a domain can only access files which are
labelled with the matching category level, e.g.
<code>system_u:object_r:svirt_image_t:s0:c34,c44</code>. This prevents one
QEMU process accessing any file resources that belong to another QEMU
There are two ways of assigning labels to virtual machines under sVirt.
In the default setup, if sVirt is enabled, guests will get an automatically
assigned unique label each time they are booted. The libvirtd daemon will
also automatically relabel exclusive access disk images to match this
label. Disks that are marked as &lt;shared&gt; will get a generic
label <code>system_u:system_r:svirt_image_t:s0</code> allowing all guests
read/write access to them, while disks marked as &lt;readonly&gt; will
get a generic label <code>system_u:system_r:svirt_content_t:s0</code>
which allows all guests read-only access.
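<p>The category matching described above can be pictured with a small
Python sketch. It is a simplified model, not SELinux policy evaluation:
it assumes labels of the form <code>user:role:type:level[:categories]</code>
with comma-separated categories (no ranges), ignores type enforcement
entirely, and treats access as allowed when every category on the file is
also held by the process. The function names are made up for this
example.</p>

```python
def parse_context(label):
    """Split 'user:role:type:level[:categories]' into its fields.

    Simplified: category ranges such as 'c0.c1023' are not expanded.
    """
    fields = label.split(":")
    if len(fields) < 4:
        raise ValueError(f"incomplete security context: {label}")
    user, role, type_, level = fields[:4]
    cats = frozenset(fields[4].split(",")) if len(fields) > 4 else frozenset()
    return user, role, type_, level, cats


def categories_allow(process_label, file_label):
    """True when the file's categories are a subset of the process's."""
    proc_cats = parse_context(process_label)[4]
    file_cats = parse_context(file_label)[4]
    return file_cats <= proc_cats


if __name__ == "__main__":
    guest = "system_u:system_r:svirt_t:s0:c34,c44"
    print(categories_allow(guest, "system_u:object_r:svirt_image_t:s0:c34,c44"))
    print(categories_allow(guest, "system_u:object_r:svirt_image_t:s0:c1,c2"))
```

<p>Under this model a generic label without categories, such as the one
given to shared disks, is accessible to every guest, while an image
carrying another guest's categories is not.</p>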
With statically assigned labels, the application should include the
desired guest and file labels in the XML at time of creating the
guest with libvirt. In this scenario the application is responsible
for ensuring the disk images and similar resources are suitably
labelled to match; libvirtd will not attempt any relabelling.
If the sVirt security model is active, then the node capabilities
XML will include its details. If a virtual machine is currently
protected by the security model, then the guest XML will include
its assigned labels. If enabled at compile time, the sVirt security
model will always be activated if SELinux is available on the host
OS. To disable sVirt, and revert to the basic level of SELinux
protection (host protection only), the <code>/etc/libvirt/qemu.conf</code>
file can be used to change the setting to <code>security_driver="none"</code>
<h3><a id="securitysvirtaa">AppArmor sVirt confinement</a></h3>
When using basic AppArmor protection for the libvirtd daemon and
QEMU virtual machines, the intention is to protect the host OS
from a compromised virtual machine process. There is no protection
The AppArmor sVirt protection for QEMU virtual machines builds on
this basic level of protection, to also allow individual guests to
be protected from each other.
In the sVirt model, if a profile is loaded for the libvirtd daemon,
then each <code>qemu:///system</code> QEMU virtual machine will have
a profile created for it when the virtual machine is started if one
does not already exist. This generated profile uses a profile name
based on the UUID of the QEMU virtual machine and contains rules
allowing access to only the files it needs to run, such as its disks,
pid file and log files. Just before the QEMU virtual machine is
started, the libvirtd daemon will change into this unique profile,
preventing the QEMU process from accessing any file resources that
are present in another QEMU process or the host machine.
The AppArmor sVirt implementation is flexible in that it allows an
administrator to customize the template file in
<code>/etc/apparmor.d/libvirt/TEMPLATE</code> for site-specific
access for all newly created QEMU virtual machines. Also, when a new
profile is generated, two files are created:
<code>/etc/apparmor.d/libvirt/libvirt-&lt;uuid&gt;</code> and
<code>/etc/apparmor.d/libvirt/libvirt-&lt;uuid&gt;.files</code>. The
former can be fine-tuned by the administrator to allow custom access
for this particular QEMU virtual machine, and the latter will be
updated appropriately when required file access changes, such as when
a disk is added. This flexibility allows for situations such as
having one virtual machine in complain mode with all others in
While users can define their own AppArmor profile scheme, a typical
configuration will include a profile for <code>/usr/sbin/libvirtd</code>,
<code>/usr/lib/libvirt/virt-aa-helper</code> (a helper program which the
libvirtd daemon uses instead of manipulating AppArmor directly), and
an abstraction to be included by <code>/etc/apparmor.d/libvirt/TEMPLATE</code>
(typically <code>/etc/apparmor.d/abstractions/libvirt-qemu</code>).
An example profile scheme can be found in the examples/apparmor
directory of the source distribution.
If the sVirt security model is active, then the node capabilities
XML will include its details. If a virtual machine is currently
protected by the security model, then the guest XML will include
its assigned profile name. If enabled at compile time, the sVirt
security model will be activated if AppArmor is available on the host
OS and a profile for the libvirtd daemon is loaded when libvirtd is
started. To disable sVirt, and revert to the basic level of AppArmor
protection (host protection only), the <code>/etc/libvirt/qemu.conf</code>
file can be used to change the setting to <code>security_driver="none"</code>.
<h3><a id="securityacl">Cgroups device ACLs</a></h3>
Linux kernels have a capability known as "cgroups" which is used
for resource management. It is implemented via a number of "controllers",
each controller covering a specific task/functional area. One of the
available controllers is the "devices" controller, which is able to
set up whitelists of block/character devices that a cgroup should be
allowed to access. If the "devices" controller is mounted on a host,
then libvirt will automatically create a dedicated cgroup for each
QEMU virtual machine and set up the device whitelist so that the QEMU
process can only access shared devices, and explicitly assigned disk images
backed by block devices.
The list of shared devices a guest is allowed access to is
/dev/null, /dev/full, /dev/zero,
/dev/random, /dev/urandom,
In the event of unanticipated needs arising, this can be customized
via the <code>/etc/libvirt/qemu.conf</code> file.
To mount the cgroups device controller, the following command
should be run as root, prior to starting libvirtd:
<pre>mount -t cgroup none /dev/cgroup -o devices</pre>
libvirt will then place each virtual machine in a cgroup at
<code>/dev/cgroup/libvirt/qemu/$VMNAME/</code>
<h2><a id="imex">Import and export of libvirt domain XML configs</a></h2>
<p>The QEMU driver currently supports a single native
config format known as <code>qemu-argv</code>. The data for this format
is expected to be a single line: first a list of environment variables,
then the QEMU binary name, finally followed by the QEMU command line
<h3><a id="xmlimport">Converting from QEMU args to domain XML</a></h3>
<b>Note:</b> this operation is <span class="removed">deleted as of
5.5.0</span> and will return an error.
The <code>virsh domxml-from-native</code> command provides a way to
convert an existing set of QEMU args into a guest description
using libvirt Domain XML that can then be used by libvirt.
Please note that this command is intended to be used to convert
existing qemu guests previously started from the command line to
be managed through libvirt. It should not be used as a method of
creating new guests from scratch. New guests should be created
using an application calling the libvirt APIs (see
the <a href="apps.html">libvirt applications page</a> for some
examples) or by manually crafting XML to pass to virsh.
<h3><a id="xmlexport">Converting from domain XML to QEMU args</a></h3>
The <code>virsh domxml-to-native</code> command provides a way to convert a
guest description using libvirt Domain XML into a set of QEMU args
that can be run manually. Note that currently the command line formatted
by libvirt is no longer suited for manually running qemu as the
configuration expects various resources and open file descriptors passed
to the process which are usually prepared by libvirtd.
<h2><a id="qemucommand">Pass-through of arbitrary qemu</a></h2>
<p>Libvirt provides an XML namespace and an optional
library <code>libvirt-qemu.so</code> for dealing specifically
with qemu. When used correctly, these extensions allow testing
specific qemu features that have not yet been ported to the
generic libvirt XML and API interfaces. However, they
are <b>unsupported</b>, in that the library is not guaranteed to
have a stable API, abusing the library or XML may result in
an inconsistent state that crashes libvirtd, and upgrading either
qemu-kvm or libvirtd may break behavior of a domain that was
relying on a qemu-specific pass-through. If you find yourself
needing to use them to access a particular qemu feature, then
please post an RFE to the libvirt mailing list to get that
feature incorporated into the stable libvirt XML and API
<p>The library provides two
APIs: <code>virDomainQemuMonitorCommand</code>, for sending an
arbitrary monitor command (in either HMP or QMP format) to a
qemu guest (<span class="since">Since 0.8.3</span>),
and <code>virDomainQemuAttach</code>, for registering a qemu
domain that was manually started so that it can then be managed
by libvirtd (<span class="since">Since 0.9.4</span>,
<span class="removed">removed as of 5.5.0</span>).</p>
<p>Additionally, the following XML additions allow fine-tuning of
the command line given to qemu when starting a domain
(<span class="since">Since 0.8.3</span>). In order to use the
XML additions, it is necessary to issue an XML namespace request
(the special <code>xmlns:<i>name</i></code> attribute) that
pulls in <code>http://libvirt.org/schemas/domain/qemu/1.0</code>;
typically, the namespace is given the name
of <code>qemu</code>. With the namespace in place, it is then
possible to add an element <code>&lt;qemu:commandline&gt;</code>
under <code>domain</code>, with the following sub-elements
repeated as often as needed:</p>
<dt><code>qemu:arg</code></dt>
<dd>Add an additional command-line argument to the qemu
  process when starting the domain, given by the value of the
  attribute <code>value</code>.</dd>
<dt><code>qemu:env</code></dt>
<dd>Add an additional environment variable to the qemu
  process when starting the domain, given with the name-value
  pair recorded in the attributes <code>name</code>
  and optional <code>value</code>.</dd>
<pre>
&lt;domain type='qemu' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'&gt;
  &lt;name&gt;QEMU-fedora-i686&lt;/name&gt;
  &lt;memory&gt;219200&lt;/memory&gt;
  &lt;os&gt;
    &lt;type arch='i686' machine='pc'&gt;hvm&lt;/type&gt;
  &lt;/os&gt;
  &lt;devices&gt;
    &lt;emulator&gt;/usr/bin/qemu-system-x86_64&lt;/emulator&gt;
  &lt;/devices&gt;
  &lt;qemu:commandline&gt;
    &lt;qemu:arg value='-newarg'/&gt;
    &lt;qemu:env name='QEMU_ENV' value='VAL'/&gt;
  &lt;/qemu:commandline&gt;
&lt;/domain&gt;
</pre>
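<p>For applications that assemble domain XML programmatically, the
namespace handling can be sketched in Python with the standard
<code>xml.etree.ElementTree</code> module. This is an illustrative sketch
only: the helper name <code>add_qemu_commandline</code> is an assumption
of this example, and the resulting document is deliberately minimal
rather than a complete guest definition.</p>

```python
import xml.etree.ElementTree as ET

QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"
# Ask ElementTree to serialise this namespace with the "qemu" prefix.
ET.register_namespace("qemu", QEMU_NS)


def add_qemu_commandline(domain, args=(), env=None):
    """Append a qemu:commandline element to a <domain> element.

    Hypothetical helper: 'args' become qemu:arg elements and the
    'env' mapping becomes qemu:env elements.
    """
    cmdline = ET.SubElement(domain, f"{{{QEMU_NS}}}commandline")
    for value in args:
        ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": value})
    for name, value in (env or {}).items():
        ET.SubElement(cmdline, f"{{{QEMU_NS}}}env",
                      {"name": name, "value": value})
    return cmdline


domain = ET.Element("domain", {"type": "qemu"})
ET.SubElement(domain, "name").text = "QEMU-fedora-i686"
add_qemu_commandline(domain, args=["-newarg"], env={"QEMU_ENV": "VAL"})
xml_text = ET.tostring(domain, encoding="unicode")

if __name__ == "__main__":
    print(xml_text)
```

<p>Registering the prefix up front keeps the serialised output using the
conventional <code>qemu</code> name instead of an auto-generated
prefix.</p>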
<h2><a id="xmlnsfeatures">QEMU feature configuration for testing</a></h2>
In some cases, e.g. when developing a new feature or for testing, it may
be required to control a given qemu feature (or qemu capability) to test
it before it's complete or disable it for debugging purposes.
<span class="since">Since 5.5.0</span> it's possible to use the same
special qemu namespace as above
(<code>http://libvirt.org/schemas/domain/qemu/1.0</code>) and use the
<code>&lt;qemu:capabilities&gt;</code> element to add
(<code>&lt;qemu:add capability="capname"/&gt;</code>) or remove
(<code>&lt;qemu:del capability="capname"/&gt;</code>) capability bits.
The naming of the feature bits is the same libvirt uses in the status
XML. Note that this feature is meant for experiments only and should
<b>not</b> be used in production.
<pre>
&lt;domain type='qemu' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'&gt;
  &lt;name&gt;testvm&lt;/name&gt;
  &lt;qemu:capabilities&gt;
    &lt;qemu:add capability='blockdev'/&gt;
    &lt;qemu:del capability='drive'/&gt;
  &lt;/qemu:capabilities&gt;
&lt;/domain&gt;
</pre>
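<p>To see which capability bits a given domain definition overrides, the
XML can be inspected with a few lines of Python. This is a minimal
sketch using the standard library parser; the function name
<code>capability_overrides</code> and the embedded sample document are
assumptions made for illustration.</p>

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used only for the lookups below.
QEMU_NS = {"qemu": "http://libvirt.org/schemas/domain/qemu/1.0"}

EXAMPLE = """\
<domain type='qemu' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>testvm</name>
  <qemu:capabilities>
    <qemu:add capability='blockdev'/>
    <qemu:del capability='drive'/>
  </qemu:capabilities>
</domain>"""


def capability_overrides(domain_xml):
    """Return (added, removed) capability names found in the domain XML."""
    root = ET.fromstring(domain_xml)
    added = [el.get("capability")
             for el in root.findall("qemu:capabilities/qemu:add", QEMU_NS)]
    removed = [el.get("capability")
               for el in root.findall("qemu:capabilities/qemu:del", QEMU_NS)]
    return added, removed


if __name__ == "__main__":
    print(capability_overrides(EXAMPLE))
```

<p>A check like this could be used to flag definitions carrying such
overrides before they reach a production host, in line with the warning
above.</p>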
<h2><a id="xmlconfig">Example domain XML config</a></h2>
<h3>QEMU emulated guest on x86_64</h3>
<pre>&lt;domain type='qemu'&gt;
  &lt;name&gt;QEMU-fedora-i686&lt;/name&gt;
  &lt;uuid&gt;c7a5fdbd-cdaf-9455-926a-d65c16db1809&lt;/uuid&gt;
  &lt;memory&gt;219200&lt;/memory&gt;
  &lt;currentMemory&gt;219200&lt;/currentMemory&gt;
  &lt;vcpu&gt;2&lt;/vcpu&gt;
  &lt;os&gt;
    &lt;type arch='i686' machine='pc'&gt;hvm&lt;/type&gt;
    &lt;boot dev='cdrom'/&gt;
  &lt;/os&gt;
  &lt;devices&gt;
    &lt;emulator&gt;/usr/bin/qemu-system-x86_64&lt;/emulator&gt;
    &lt;disk type='file' device='cdrom'&gt;
      &lt;source file='/home/user/boot.iso'/&gt;
      &lt;target dev='hdc'/&gt;
    &lt;/disk&gt;
    &lt;disk type='file' device='disk'&gt;
      &lt;source file='/home/user/fedora.img'/&gt;
      &lt;target dev='hda'/&gt;
    &lt;/disk&gt;
    &lt;interface type='network'&gt;
      &lt;source network='default'/&gt;
    &lt;/interface&gt;
    &lt;graphics type='vnc' port='-1'/&gt;
  &lt;/devices&gt;
&lt;/domain&gt;</pre>
<h3>KVM hardware accelerated guest on i686</h3>
<pre>&lt;domain type='kvm'&gt;
  &lt;name&gt;demo2&lt;/name&gt;
  &lt;uuid&gt;4dea24b3-1d52-d8f3-2516-782e98a23fa0&lt;/uuid&gt;
  &lt;memory&gt;131072&lt;/memory&gt;
  &lt;vcpu&gt;1&lt;/vcpu&gt;
  &lt;os&gt;
    &lt;type arch="i686"&gt;hvm&lt;/type&gt;
  &lt;/os&gt;
  &lt;clock sync="localtime"/&gt;
  &lt;devices&gt;
    &lt;emulator&gt;/usr/bin/qemu-kvm&lt;/emulator&gt;
    &lt;disk type='file' device='disk'&gt;
      &lt;source file='/var/lib/libvirt/images/demo2.img'/&gt;
      &lt;target dev='hda'/&gt;
    &lt;/disk&gt;
    &lt;interface type='network'&gt;
      &lt;source network='default'/&gt;
      &lt;mac address='24:42:53:21:52:45'/&gt;
    &lt;/interface&gt;
    &lt;graphics type='vnc' port='-1' keymap='de'/&gt;
  &lt;/devices&gt;
&lt;/domain&gt;</pre>