Intel Graphics Device (IGD) assignment with vfio-pci
====================================================

Using vfio-pci, we can pass an Intel Graphics Device (IGD) through to a guest,
either as the primary and exclusive graphics adapter or in combination with an
emulated primary graphics device, depending on the configuration and guest
driver support. However, IGD devices are not "clean" PCI devices: they use
extra memory regions other than BARs. Special handling is required to make them
work properly, including:

* OpRegion for accessing the Virtual BIOS Table (VBT), which contains display
  output information.
* Data Stolen Memory (DSM) region, used as VRAM at an early stage (BIOS/UEFI).

Certain guest software also depends on the following conditions to work
(* = required by):

| Condition                                   | Linux | Windows | VBIOS | EFI GOP |
|---------------------------------------------|-------|---------|-------|---------|
| #1 IGD has a valid OpRegion containing VBT  |  * ^1 |    *    |   *   |    *    |
| #2 VID/DID of LPC bridge at 00:1f.0 matches |       |         |   *   |    *    |
| #3 IGD is assigned to BDF 00:02.0           |       |         |   *   |    *    |
| #4 IGD has VGA controller device class      |       |         |   *   |    *    |
| #5 Host's VGA ranges are mapped to IGD      |       |         |   *   |         |
| #6 Guest has valid VBIOS or UEFI Option ROM |       |         |   *   |    *    |

^1 Although the i915 driver is able to mock an OpRegion, it is still
   recommended to use the VBT copied from the host OpRegion to prevent
   incorrect configuration.

For #1, the "x-igd-opregion=on" option exposes a copy of the host IGD OpRegion
to the guest via fw_cfg, so that guest firmware can set up the guest OpRegion
with it.

For #2, the "x-igd-lpc=on" option copies the IDs of the host LPC bridge and
host bridge to the guest. Currently this is only supported on i440fx machines:
q35 machines already have an ICH9 LPC bridge, and overwriting its IDs may lead
to unexpected behavior.

For #3, "addr=2.0" assigns the IGD to 00:02.0.

For #4, the primary display must be set to IGD in the host BIOS.

For #5, "x-vga=on" enables guest access to the standard VGA IO/MMIO ranges.

For #6, a ROM provided either via the ROM BAR or the romfile= option is needed.
The Intel document [1] shows how to dump the VBIOS to a file. For a UEFI Option
ROM, see the "Guest firmware" section.
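
Condition #4 can be checked on the host before starting the guest. The
following is a minimal sketch, not part of QEMU: it assumes the host IGD sits
at 0000:00:02.0 and that the sysfs "class" file reports the 24-bit PCI class
code (0x030000 for a VGA-compatible controller). The function names are
hypothetical.

```shell
#!/bin/sh
# Sketch: check condition #4 (VGA controller device class) on the host.
# Assumption: the host IGD is at 0000:00:02.0, as in the other examples.

# is_vga_class CLASS: succeed if CLASS (sysfs format, e.g. 0x030000) has
# base class 0x03 and subclass 0x00, i.e. a VGA-compatible controller.
is_vga_class() {
    # Bits 23:8 of the class value hold the base class and the subclass.
    [ $(( ($1 >> 8) & 0xffff )) -eq $(( 0x0300 )) ]
}

# check_igd_class [SYSFS_DEVICE_DIR]: hypothetical wrapper that reads the
# class file from sysfs and reports the result.
check_igd_class() {
    dev=${1:-/sys/bus/pci/devices/0000:00:02.0}
    class=$(cat "$dev/class") || return 2
    if is_vga_class "$class"; then
        echo "$class: VGA controller, condition #4 is met"
    else
        echo "$class: not a VGA controller, check the primary display setting in host BIOS"
        return 1
    fi
}
```

Calling check_igd_class without arguments reads the class of the host device
at 0000:00:02.0; a directory can be passed instead to check another device.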

QEMU also provides a "Legacy" mode that implicitly enables full functionality
on IGD. It is automatically enabled when all of the following conditions are
met:
* IGD generation is 6 to 9 (Sandy Bridge to Comet Lake)
* Machine type is i440fx
* IGD is assigned to guest BDF 00:02.0
* ROM BAR or romfile is present

In "Legacy" mode, QEMU automatically sets up the OpRegion, LPC bridge IDs and
VGA range access, which is equivalent to:
  x-igd-opregion=on,x-igd-lpc=on,x-vga=on

By default, "Legacy" mode does not fail: it continues on error. Users can set
"x-igd-legacy-mode=on" to force-enable legacy mode; this also checks whether
the conditions above for legacy mode are met, and if any error occurs, QEMU
fails immediately. Users can also set "x-igd-legacy-mode=off" to disable legacy
mode.

In legacy mode, as the guest VGA ranges are assigned to the IGD device, all
other graphics devices should be removed. This can be done using "-nographic",
"-vga none" or "-nodefaults", along with adding the device using vfio-pci.

For either mode, depending on the host kernel, the i915 driver in the host
may generate faults and errors upon re-binding to an IGD device after it
has been assigned to a VM. It's therefore generally recommended to prevent
such driver binding unless the host driver is known to work well for this.
There are numerous ways to do this: i915 can be blacklisted on the host,
the driver_override option can be used to ensure that only vfio-pci can bind
to the device on the host[2], virsh nodedev-detach can be used to bind the
device to vfio drivers and then managed='no' set in the VM xml to prevent
re-binding to i915, etc. Also note that IGD is typically the primary
graphics in the host, and special options may be required beyond simply
blacklisting i915 or using pci-stub/vfio-pci to take ownership of IGD as a
PCI class device. Lower level drivers exist that may still claim the device.
It may therefore be necessary to use kernel boot options video=vesafb:off or
video=efifb:off (depending on host BIOS/UEFI), or these can be combined into
a catch-all, video=vesafb:off,efifb:off. Error messages such as:

    Failed to mmap 0000:00:02.0 BAR <>. Performance may be slow

are a good indicator that such a problem exists. The host files /proc/iomem
and /proc/ioports are often useful for identifying drivers consuming ranges
of the device to cause such conflicts.

Additionally, IGD devices are known to generate small numbers of DMAR faults
when initially assigned. It is believed that this is simply the IGD attempting
to access the reserved GTT space after reset, which it no longer has access to
when accessed from userspace.
So long as the DMAR faults are small in number and, most importantly, not
ongoing, they are not an indication of an error.

Furthermore, analog VGA output (as opposed to digital outputs like HDMI,
DVI, or DisplayPort) may be unsupported in some use cases. In the author's
experience, even DP to VGA adapters can be troublesome while adapters between
digital formats work well.


Options
=======
* x-igd-opregion=[*on*|off]
    Copy host IGD OpRegion and expose it to guest with fw_cfg

* x-igd-lpc=[on|*off*]
    Creates a dummy LPC bridge at 00:1f.0 with host VID/DID (i440fx only)

* x-igd-legacy-mode=[on|off|*auto*]
    Enable/Disable legacy mode

* x-igd-gms=[hex, default 0]
    Overrides the DSM region size in the GGC register; 0 means use the host
    value. Use this only when the DSM size cannot be changed through the
    'DVMT Pre-Allocated' option in host BIOS.
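
The three values of x-igd-legacy-mode can be summarized as a small decision
table. The sketch below only restates the behaviour described earlier in this
document and is not taken from the QEMU source. The function names, the
argument order and the value spellings ("i440fx", "yes") are assumptions made
for illustration.

```shell
#!/bin/sh
# Sketch of the documented x-igd-legacy-mode behaviour. This models the text
# of this document only; it is not QEMU code.

# legacy_conditions_met GEN MACHINE GUEST_BDF HAS_ROM
# Succeeds when all four documented conditions for legacy mode hold.
legacy_conditions_met() {
    [ "$1" -ge 6 ] && [ "$1" -le 9 ] &&   # IGD generation is 6 to 9
    [ "$2" = "i440fx" ] &&                # machine type is i440fx
    [ "$3" = "00:02.0" ] &&               # IGD is at guest BDF 00:02.0
    [ "$4" = "yes" ]                      # ROM BAR or romfile is present
}

# legacy_mode_decision MODE GEN MACHINE GUEST_BDF HAS_ROM
# Prints "enabled", "disabled" or "error" (QEMU would fail immediately).
legacy_mode_decision() {
    mode=$1
    shift
    case $mode in
        off)
            echo disabled ;;
        on)
            # Forced: unmet conditions are treated as a fatal error.
            if legacy_conditions_met "$@"; then echo enabled; else echo error; fi ;;
        auto)
            # Default: legacy mode is silently skipped when conditions fail.
            if legacy_conditions_met "$@"; then echo enabled; else echo disabled; fi ;;
        *)
            echo "unknown mode: $mode" >&2
            return 2 ;;
    esac
}
```

For example, "legacy_mode_decision on 9 q35 00:02.0 yes" prints "error",
while the same arguments with "auto" print "disabled".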


Examples
========
* Adding IGD with automatic legacy mode support
  -device vfio-pci,host=00:02.0,id=hostdev0,addr=2.0

* Adding IGD with OpRegion and LPC ID hack, but without VGA ranges
  (For UEFI guests)
  -device vfio-pci,host=00:02.0,id=hostdev0,addr=2.0,x-igd-legacy-mode=off,x-igd-lpc=on,romfile=efi_oprom.rom


Guest firmware
==============
Guest firmware is responsible for setting up the OpRegion and the Base of Data
Stolen Memory (BDSM) in guest address space. IGD passthrough support imposes
two fw_cfg requirements on the VM firmware:

1) "etc/igd-opregion"

   This fw_cfg file exposes the OpRegion for the IGD device. A reserved
   region should be created below 4GB (recommended 4KB alignment), sized
   sufficient for the fw_cfg file size, and the content of this file copied
   to it. The dword based address of this reserved memory region must also
   be written to the ASLS register at offset 0xFC on the IGD device. It is
   recommended that firmware should make use of this fw_cfg entry for any
   PCI class VGA device with Intel vendor ID. The behavior with multiple such
   devices within a VM is undefined.

2) "etc/igd-bdsm-size"

   This fw_cfg file contains an 8-byte, little endian integer indicating
   the size of the reserved memory region required for IGD stolen memory.
   Firmware must allocate a reserved memory region of this size below 4GB,
   with the required 1MB alignment. Additionally, the base address of this
   reserved region must be written to the dword BDSM register in PCI config
   space of the IGD device at offset 0x5C (or 0xC0 for Gen 11+ devices using
   64-bit BDSM). As this support is related to running the IGD ROM, which
   has other dependencies on the device appearing at guest address 00:02.0,
   it's expected that this fw_cfg file is only relevant to a single PCI
   class VGA device with Intel vendor ID, appearing at PCI bus address 00:02.0.

   Starting from Meteor Lake, IGD devices access stolen memory via the MMIO
   BAR2 (LMEMBAR), and the BDSM register has been removed from config space.
   There is no need for guest firmware to allocate data stolen memory in guest
   address space and write it to the BDSM register. The value of this fw_cfg
   file is 0 in such cases.

Upstream SeaBIOS has OpRegion and BDSM (pre-Gen11 devices only) support.
However, the support is not accepted by upstream EDK2/OVMF.
A recommended solution is to create a virtual OpRom with the following DXE
drivers:

* IgdAssignmentDxe: Set up OpRegion and BDSM according to fw_cfg (required)
* IntelGopDriver: Closed-source Intel GOP driver
* PlatformGopPolicy: Protocol required by IntelGopDriver

IntelGopDriver and PlatformGopPolicy are only required when enabling GOP on
IGD.

The original IgdAssignmentDxe can be found at [3]. An Intel-maintained version
with PlatformGopPolicy for industrial computing is at [4]. There is also an
unofficially maintained version with newer Gen11+ device support at [5].
You need to build them with EDK2.

Intel has never released the IntelGopDriver to the public. You may contact
Intel support to get one, as [4] says, if you are an Intel Premier Support
customer, or you can try extracting it from your host firmware using
"UEFI BIOS Updater"[6].

Once you have all the required DXE drivers, an Option ROM can be generated with
the EfiRom utility in EDK2, using
  EfiRom -f 0x8086 -i <Device ID of your IGD> -o output.rom \
    -e IgdAssignmentDxe.efi PlatformGOPPolicy.efi IntelGopDriver.efi


Known issues
============
When using OVMF as guest firmware, you may encounter the following warning:
warning: vfio_container_dma_map(0x55fab36ce610, 0x380010000000, 0x108000, 0x7fd336000000) = -22 (Invalid argument)

Solution:
Set the host physical address bits to the IOMMU address width using
  -cpu host,host-phys-bits-limit=<IOMMU address width>
Or in libvirt XML with
  <cpu>
    <maxphysaddr mode='passthrough' limit='<IOMMU address width>'/>
  </cpu>
The IOMMU address width can be determined with
  echo $(( ((0x$(cat /sys/devices/virtual/iommu/dmar0/intel-iommu/cap) & 0x3F0000) >> 16) + 1 ))
Refer to https://edk2.groups.io/g/devel/topic/patch_v1/102359124 for more details.


Memory View
===========
IGD has its own address space. To use system RAM as VRAM, a single-level page
table named Global Graphics Translation Table (GTT) is used for the address
translation. Each page table entry points to a 4KB page. The illustration below
shows the translation flow on IGD with 64-bit GTT PTEs.

(PTE_SIZE == 8)                 +-------------+---+
                                |   Address   | V |  V: Valid Bit
                                +-------------+---+
                                | ...         |   |
IGD:0x01ae9010            0xd740| 0x70ffc000  | 1 |  Mem:0x42ba3e010^
----------------------->  0xd748| 0x42ba3e000 | 1 +------------------>
(addr >> 12) * PTE_SIZE   0xd750| 0x42ba3f000 | 1 |
                                | ...         |   |
                                +-------------+---+
^ The address may be remapped by IOMMU

The memory region that stores the GTT is called GTT Stolen Memory (GSM). It is
located right below the Data Stolen Memory (DSM). Accessing this region
directly is not allowed: any access will immediately freeze the whole system.
The only way to access it is through the second half of MMIO BAR0.

The Data Stolen Memory is reserved by firmware, and acts as the VRAM in pre-OS
environments. In QEMU, guest firmware (SeaBIOS/OVMF) is responsible for
reserving a contiguous region and programming its base address into the BDSM
register, then letting the VBIOS/GOP driver initialize this region. The
illustration below shows how DSM is mapped.

       IGD Addr Space                 Host Addr Space         Guest Addr Space
       +-------------+                +-------------+         +-------------+
       |             |                |             |         |             |
       |             |                |             |         |             |
       |             |                +-------------+         +-------------+
       |             |                | Data Stolen |         | Data Stolen |
       |             |                |   (Guest)   |         |   (Guest)   |
       |             |  +------------>+-------------+<------->+-------------+<--Guest BDSM
       |             |  | Passthrough |             |   EPT   |             |   Emulated by QEMU
DSMSIZE+-------------+  | with IOMMU  |             | Mapping |             |   Programmed by guest FW
       |             |  |             |             |         |             |
       |             |  |             |             |         |             |
      0+-------------+--+             |             |         |             |
                        |             +-------------+         |             |
                        |             | Data Stolen |         +-------------+
                        |             |   (Host)    |
                        +------------>+-------------+<--Host BDSM
                            Non-      |             |   "real" one in HW
                         Passthrough  |             |   Programmed by host FW
                                      +-------------+

Footnotes
=========
[1] https://www.intel.com/content/www/us/en/docs/graphics-for-linux/developer-reference/1-0/dump-video-bios.html
[2] # echo "vfio-pci" > /sys/bus/pci/devices/0000:00:02.0/driver_override
[3] https://web.archive.org/web/20240827012422/https://bugzilla.tianocore.org/show_bug.cgi?id=935
    Tianocore bugzilla has been down since Jan 2025 :(
[4] https://eci.intel.com/docs/3.3/components/kvm-hypervisor.html, Patch 0001-0004
[5] https://github.com/tomitamoeko/VfioIgdPkg
[6] https://winraid.level1techs.com/t/tool-guide-news-uefi-bios-updater-ubu/30357