===============================
IOMMUFD BACKEND usage with VFIO
===============================

(Same meaning for backend/container/BE)

With the introduction of iommufd, the Linux kernel provides a generic
interface for user space drivers to propagate their DMA mappings to the kernel
for assigned devices. While the legacy kernel interface is group-centric,
the new iommufd interface is device-centric, relying on a device fd and an
iommufd.

To support both interfaces in the QEMU VFIO device, a base container abstracts
the common parts of the VFIO legacy and iommufd containers, so that the
generic VFIO code can use either container.

The base container implements generic functions such as memory_listener and
address space management, whereas the derived containers implement callbacks
specific to either legacy or iommufd. Each container has its own way to set up
the secure context and its own DMA management interface. The diagram below
shows how this looks with both containers.

::

                      VFIO                           AddressSpace/Memory
      +-------+  +----------+  +-----+  +-----+
      |  pci  |  | platform |  |  ap |  | ccw |
      +---+---+  +----+-----+  +--+--+  +--+--+     +----------------------+
          |           |           |        |        |   AddressSpace       |
          |           |           |        |        +------------+---------+
      +---V-----------V-----------V--------V----+               /
      |           VFIOAddressSpace              | <------------+
      |                  |                      |  MemoryListener
      |        VFIOContainerBase list           |
      +-------+----------------------------+----+
              |                            |
              |                            |
      +-------V------+            +--------V----------+
      |   iommufd    |            |    vfio legacy    |
      |  container   |            |     container     |
      +-------+------+            +--------+----------+
              |                            |
              | /dev/iommu                 | /dev/vfio/vfio
              | /dev/vfio/devices/vfioX    | /dev/vfio/$group_id
  Userspace   |                            |
  ============+============================+===========================
  Kernel      |  device fd                 |
              +---------------+            | group/container fd
              | (BIND_IOMMUFD |            | (SET_CONTAINER/SET_IOMMU)
              |  ATTACH_IOAS) |            | device fd
              |               |            |
              |       +-------V------------V-----------------+
      iommufd |       |                vfio                  |
  (map/unmap  |       +---------+--------------------+-------+
  ioas_copy)  |                 |                    | map/unmap
              |                 |                    |
      +-------V------+    +-----V------+      +------V--------+
      | iommufd core |    |  device    |      |  vfio iommu   |
      +--------------+    +------------+      +---------------+

* Secure Context setup

  - iommufd BE: uses device fd and iommufd to set up the secure context
    (bind_iommufd, attach_ioas)
  - vfio legacy BE: uses group fd and container fd to set up the secure
    context (set_container, set_iommu)

* Device access

  - iommufd BE: device fd is opened through ``/dev/vfio/devices/vfioX``
  - vfio legacy BE: device fd is retrieved from group fd ioctl

* DMA Mapping flow

  1. VFIOAddressSpace receives MemoryRegion add/del via MemoryListener
  2. VFIO populates DMA map/unmap via the container BEs

     * iommufd BE: uses iommufd
     * vfio legacy BE: uses container fd

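
The split between the base container and the two backends can be modelled in a
few lines of Python. This is only an illustrative sketch, not QEMU code: the
class and method names (``ContainerBase``, ``IommufdContainer``,
``LegacyContainer``, ``AddressSpace``, ``region_add``, ``region_del``) are
hypothetical, and the backends merely record what they would do instead of
issuing ioctls.

```python
from abc import ABC, abstractmethod


class ContainerBase(ABC):
    """Generic part shared by both backends (hypothetical model)."""

    def __init__(self):
        self.mappings = {}  # iova -> size, tracked by the generic code
        self.calls = []     # backend operations, recorded for inspection

    def region_add(self, iova, size):
        # Generic listener logic: ask the backend to map, then remember it.
        self.dma_map(iova, size)
        self.mappings[iova] = size

    def region_del(self, iova):
        # Generic listener logic: forget the mapping, ask the backend to unmap.
        size = self.mappings.pop(iova)
        self.dma_unmap(iova, size)

    @abstractmethod
    def dma_map(self, iova, size):
        """Backend-specific map callback."""

    @abstractmethod
    def dma_unmap(self, iova, size):
        """Backend-specific unmap callback."""


class IommufdContainer(ContainerBase):
    """Stands in for the iommufd BE, which maps through the iommufd."""

    def dma_map(self, iova, size):
        self.calls.append(("iommufd", "map", iova, size))

    def dma_unmap(self, iova, size):
        self.calls.append(("iommufd", "unmap", iova, size))


class LegacyContainer(ContainerBase):
    """Stands in for the vfio legacy BE, which maps through the container fd."""

    def dma_map(self, iova, size):
        self.calls.append(("container fd", "map", iova, size))

    def dma_unmap(self, iova, size):
        self.calls.append(("container fd", "unmap", iova, size))


class AddressSpace:
    """Fans memory-region events out to every attached container."""

    def __init__(self):
        self.containers = []

    def region_add(self, iova, size):
        for container in self.containers:
            container.region_add(iova, size)

    def region_del(self, iova):
        for container in self.containers:
            container.region_del(iova)
```

The generic code only ever calls ``region_add`` and ``region_del``; which fd
ends up carrying the map/unmap request is decided entirely by the derived
class, which is the property the base container is meant to provide.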
Step 1: configure the host device
---------------------------------

It's exactly the same as for a VFIO device with the legacy VFIO container.

Step 2: configure QEMU
----------------------

Interactions with ``/dev/iommu`` are abstracted by a new iommufd
object (compiled in with the ``CONFIG_IOMMUFD`` option).

Any QEMU device (e.g. a VFIO device) wishing to use ``/dev/iommu`` must
be linked with an iommufd object. Such a device gets a new optional property
named ``iommufd``, which allows an iommufd object to be passed. Take
``vfio-pci`` as an example:

.. code-block:: bash

    -object iommufd,id=iommufd0
    -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0

Note that ``/dev/iommu`` and the VFIO cdev can be opened externally by a
management layer. In that case the fds are passed to QEMU through the ``fd``
property, which accepts either a string naming the fd or a number, for
example:

.. code-block:: bash

    -object iommufd,id=iommufd0,fd=22
    -device vfio-pci,iommufd=iommufd0,fd=23

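
As a rough illustration of the management-layer side, the sketch below opens
both nodes, builds the two options and hands the fds to the child process. It
is an assumption-laden example rather than a reference implementation: the
helper names (``open_passthrough_fds``, ``fd_passing_args``, ``launch``) are
hypothetical, the cdev path must be supplied by the caller, and error handling
is minimal.

```python
import os
import subprocess


def open_passthrough_fds(cdev_path, iommu_path="/dev/iommu"):
    """Open the iommufd node and the VFIO cdev; return (iommu_fd, device_fd)."""
    iommu_fd = os.open(iommu_path, os.O_RDWR)
    try:
        device_fd = os.open(cdev_path, os.O_RDWR)
    except OSError:
        os.close(iommu_fd)  # do not leak the first fd if the second open fails
        raise
    return iommu_fd, device_fd


def fd_passing_args(iommu_fd, device_fd, iommufd_id="iommufd0"):
    """Build the -object/-device pair that refers to already-open fds."""
    return [
        "-object", f"iommufd,id={iommufd_id},fd={iommu_fd}",
        "-device", f"vfio-pci,iommufd={iommufd_id},fd={device_fd}",
    ]


def launch(qemu_binary, base_args, cdev_path):
    """Start QEMU with both fds inherited under their original numbers."""
    iommu_fd, device_fd = open_passthrough_fds(cdev_path)
    try:
        cmd = [qemu_binary, *base_args, *fd_passing_args(iommu_fd, device_fd)]
        # pass_fds keeps these descriptors open, with the same numbers,
        # in the child process.
        return subprocess.Popen(cmd, pass_fds=(iommu_fd, device_fd))
    finally:
        # The child holds its own copies once Popen has returned.
        os.close(iommu_fd)
        os.close(device_fd)
```

With fds 22 and 23, ``fd_passing_args(22, 23)`` produces the same two options
as the example above.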
If the ``fd`` property is not passed, the fd is opened by QEMU.

If no ``iommufd`` object is passed to the ``vfio-pci`` device, iommufd
is not used and the user gets the behavior based on the legacy VFIO
container:

.. code-block:: bash

    -device vfio-pci,host=0000:02:00.0

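
The choice between the two backends therefore comes down to whether an iommufd
object is linked to the device. A small helper makes that explicit; the
function name ``vfio_pci_args`` and its parameters are hypothetical and only
reproduce the options shown in this document.

```python
def vfio_pci_args(host, iommufd_id=None):
    """Return QEMU options for one assigned PCI device.

    With iommufd_id set, an iommufd object is created and linked to the
    device; without it, only the -device option is emitted and the legacy
    VFIO container is used.
    """
    args = []
    device = f"vfio-pci,host={host}"
    if iommufd_id is not None:
        args += ["-object", f"iommufd,id={iommufd_id}"]
        device += f",iommufd={iommufd_id}"
    args += ["-device", device]
    return args
```

For example, ``vfio_pci_args("0000:02:00.0", "iommufd0")`` yields the iommufd
variant, while ``vfio_pci_args("0000:02:00.0")`` yields the legacy one.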

The iommufd backend currently supports x86, ARM and s390x.

Dirty page sync is not yet supported with the iommufd backend, so live
migration is disabled by default. It can be force-enabled as shown below,
although it is inefficient:

.. code-block:: bash

    -object iommufd,id=iommufd0
    -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0,enable-migration=on

PCI p2p DMA is unsupported because IOMMUFD does not yet support mapping
hardware PCI BAR regions. The warning below is shown for an assigned PCI
device; it is not a bug.

.. code-block:: none

    qemu-system-x86_64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI BAR?
    qemu-system-x86_64: vfio_container_dma_map(0x560cb6cb1620, 0xe000000021000, 0x3000, 0x7f32ed55c000) = -14 (Bad address)

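
When triaging logs it can help to separate this expected pair of messages from
other mapping failures. The sketch below does that with two regular
expressions. It assumes the message layout shown above and assumes that the
four arguments printed by ``vfio_container_dma_map`` are the container pointer,
the IOVA, the size and the host virtual address, in that order; the function
name ``expected_bar_failures`` is hypothetical.

```python
import re

BAR_WARNING = re.compile(
    r"warning: IOMMU_IOAS_MAP failed: Bad address, PCI BAR\?")
DMA_MAP_FAILURE = re.compile(
    r"vfio_container_dma_map\((0x[0-9a-f]+), (0x[0-9a-f]+), "
    r"(0x[0-9a-f]+), (0x[0-9a-f]+)\) = (-?\d+) \(([^)]*)\)")


def expected_bar_failures(lines):
    """Return (iova, size) for each map failure that directly follows
    the 'PCI BAR?' warning; any other failure is left out."""
    found = []
    previous_was_warning = False
    for line in lines:
        match = DMA_MAP_FAILURE.search(line)
        if match and previous_was_warning:
            # Assumed argument order: container, iova, size, vaddr.
            found.append((int(match.group(2), 16), int(match.group(3), 16)))
        previous_was_warning = bool(BAR_WARNING.search(line))
    return found
```

A map failure that is not preceded by the warning is deliberately not reported
as expected, since the same error code can have other causes.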
The ``vfio-pci`` device checks the ``sysfsdev`` property to decide whether the
backend is an mdev. If FD passing is used, there is no way to know that, and
the mdev is treated like a real PCI device. The error below is reported if the
user tries to enable RAM discarding for the mdev.

.. code-block:: none

    qemu-system-x86_64: -device vfio-pci,iommufd=iommufd0,x-balloon-allowed=on,fd=9: vfio VFIO_FD9: x-balloon-allowed only potentially compatible with mdev devices

``vfio-ap`` and ``vfio-ccw`` devices don't have the same issue, as their
backend devices are always mdevs and RAM discarding is force-enabled.

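
To see why FD passing loses the mdev information, consider how such a check can
be made from a sysfs path. The sketch below is not QEMU's implementation; it
assumes that an mdev can be recognised by its ``subsystem`` link resolving to
``/sys/bus/mdev``, and the function name ``is_mdev`` and the ``mdev_bus``
parameter are hypothetical.

```python
import os


def is_mdev(sysfsdev, mdev_bus="/sys/bus/mdev"):
    """Return True if the device's 'subsystem' link resolves to the mdev bus.

    sysfsdev is the device's sysfs directory, or None when only an fd was
    passed; in that case nothing can be looked up and the answer is False.
    """
    if not sysfsdev:
        return False
    subsystem = os.path.join(sysfsdev, "subsystem")
    if not os.path.exists(subsystem):
        return False
    return os.path.realpath(subsystem) == os.path.realpath(mdev_bus)
```

With only an fd there is no sysfs path to inspect, so ``is_mdev(None)`` is
``False`` and the device is handled like a regular PCI device, as described
above.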