Security
========

Overview
--------

This chapter explains the security requirements that QEMU is designed to meet
and principles for securely deploying QEMU.

Security Requirements
---------------------

QEMU supports many different use cases, some of which have stricter security
requirements than others. The community has agreed on the overall security
requirements that users may depend on. These requirements define what is
considered supported from a security perspective.

Virtualization Use Case
'''''''''''''''''''''''

The virtualization use case covers cloud and virtual private server (VPS)
hosting, as well as traditional data center and desktop virtualization. These
use cases rely on hardware virtualization extensions to execute guest code
safely on the physical CPU at close-to-native speed.

The following entities are untrusted, meaning that they may be buggy or
malicious:

- Guest
- User-facing interfaces (e.g. VNC, SPICE, WebSocket)
- Network protocols (e.g. NBD, live migration)
- User-supplied files (e.g. disk images, kernels, device trees)
- Passthrough devices (e.g.
  PCI, USB)

Bugs affecting these entities are evaluated on whether they can cause damage in
real-world use cases and treated as security bugs if this is the case.

Non-virtualization Use Case
'''''''''''''''''''''''''''

The non-virtualization use case covers emulation using the Tiny Code Generator
(TCG). In principle the TCG and device emulation code used in conjunction with
the non-virtualization use case should meet the same security requirements as
the virtualization use case. However, for historical reasons much of the
non-virtualization use case code was not written with these security
requirements in mind.

Bugs affecting the non-virtualization use case are not considered security
bugs at this time. Users with non-virtualization use cases must not rely on
QEMU to provide guest isolation or any security guarantees.

Architecture
------------

This section describes the design principles that ensure the security
requirements are met.

Guest Isolation
'''''''''''''''

Guest isolation is the confinement of guest code to the virtual machine. When
guest code gains control of execution on the host this is called escaping the
virtual machine.
Isolation also includes resource limits such as throttling of
CPU, memory, disk, or network. Guests must be unable to exceed their resource
limits.

QEMU presents an attack surface to the guest in the form of emulated devices.
The guest must not be able to gain control of QEMU. Bugs in emulated devices
could allow malicious guests to gain code execution in QEMU. At this point the
guest has escaped the virtual machine and is able to act in the context of the
QEMU process on the host.

Guests often interact with other guests and share resources with them. A
malicious guest must not gain control of other guests or access their data.
Disk image files and network traffic must be protected from other guests unless
explicitly shared between them by the user.

Principle of Least Privilege
''''''''''''''''''''''''''''

The principle of least privilege states that each component only has access to
the privileges necessary for its function. In the case of QEMU this means that
each process only has access to resources belonging to the guest.

The QEMU process should not have access to any resources that are inaccessible
to the guest. This way the guest does not gain anything by escaping into the
QEMU process since it already has access to those same resources from within
the guest.

Following the principle of least privilege immediately fulfills guest isolation
requirements. For example, guest A only has access to its own disk image file
``a.img`` and not guest B's disk image file ``b.img``.

In reality certain resources are inaccessible to the guest but must be
available to QEMU to perform its function. For example, host system calls are
necessary for QEMU but are not exposed to guests. A guest that escapes into
the QEMU process can then begin invoking host system calls.

New features must be designed to follow the principle of least privilege.
Should this not be possible for technical reasons, the security risk must be
clearly documented so users are aware of the trade-off of enabling the feature.

Isolation mechanisms
''''''''''''''''''''

Several isolation mechanisms are available to realize this architecture of
guest isolation and the principle of least privilege. With the exception of
Linux seccomp, these mechanisms are all deployed by management tools that
launch QEMU, such as libvirt. They are also platform-specific so they are only
described briefly for Linux here.

The fundamental isolation mechanism is that QEMU processes must run as
unprivileged users. Sometimes it seems more convenient to launch QEMU as
root to give it access to host devices (e.g.
``/dev/net/tun``) but this poses a
huge security risk. File descriptor passing can be used to give an otherwise
unprivileged QEMU process access to host devices without running QEMU as root.
It is also possible to launch QEMU as a non-root user and configure UNIX groups
for access to ``/dev/kvm``, ``/dev/net/tun``, and other device nodes.
Some Linux distros already ship with UNIX groups for these devices by default.

- SELinux and AppArmor make it possible to confine processes beyond the
  traditional UNIX process and file permissions model. They restrict the QEMU
  process from accessing processes and files on the host system that are not
  needed by QEMU.

- Resource limits and cgroup controllers provide throughput and utilization
  limits on key resources such as CPU time, memory, and I/O bandwidth.

- Linux namespaces can be used to make process, file system, and other system
  resources unavailable to QEMU. A namespaced QEMU process is restricted to only
  those resources that were granted to it.

- Linux seccomp is available via the QEMU ``--sandbox`` option. It disables
  system calls that are not needed by QEMU, thereby reducing the host kernel
  attack surface.
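
As a rough illustration of two of the mechanisms above (running unprivileged
and enabling seccomp), a launcher script might assemble the QEMU command line
as sketched below. The helper names ``build_qemu_argv`` and
``ensure_unprivileged`` are hypothetical and not part of QEMU. The binary name,
the ``-sandbox`` sub-options, and the drive settings are assumptions chosen for
the example and should be checked against the QEMU version in use. Management
tools such as libvirt normally handle this.

```python
# Assumption: these sub-options are an illustrative choice. Denying "spawn",
# for instance, also breaks features that need to start helper processes.
SANDBOX_OPTS = (
    "on,obsolete=deny,elevateprivileges=deny,"
    "spawn=deny,resourcecontrol=deny"
)


def build_qemu_argv(disk_image, binary="qemu-system-x86_64"):
    """Return an argv list with the seccomp sandbox enabled (hypothetical helper)."""
    return [
        binary,
        "-accel", "kvm",
        # Enable the seccomp filter described in the list above.
        "-sandbox", SANDBOX_OPTS,
        # Only this guest's own image is named on the command line.
        "-drive", f"file={disk_image},format=raw,if=virtio",
    ]


def ensure_unprivileged(euid):
    """Raise PermissionError if the given effective UID is root."""
    if euid == 0:
        raise PermissionError("refusing to launch QEMU as root")
```

A caller would typically run ``ensure_unprivileged(os.geteuid())`` before
passing the list returned by ``build_qemu_argv("a.img")`` to
``subprocess.run``. Access to ``/dev/kvm`` would then come from UNIX group
membership or file descriptor passing rather than from root privileges.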

Sensitive configurations
------------------------

There are aspects of QEMU that can have security implications which users and
management applications must be aware of.

Monitor console (QMP and HMP)
'''''''''''''''''''''''''''''

The monitor console (whether used with QMP or HMP) provides an interface
to dynamically control many aspects of QEMU's runtime operation. Many of the
commands exposed will instruct QEMU to access content on the host file system
and/or trigger spawning of external processes.

For example, the ``migrate`` command allows for the spawning of arbitrary
processes for the purpose of tunnelling the migration data stream. The
``blockdev-add`` command instructs QEMU to open arbitrary files, exposing
their content to the guest as a virtual disk.

Unless QEMU is otherwise confined using technologies such as SELinux, AppArmor,
or Linux namespaces, the monitor console should be considered to have privileges
equivalent to those of the user account QEMU is running under.

It is further important to consider the security of the character device backend
over which the monitor console is exposed. It needs to have protection against
malicious third parties which might try to make unauthorized connections, or
perform man-in-the-middle attacks.
Many of the character device backends do not
satisfy this requirement and so must not be used for the monitor console.

The general recommendation is that the monitor console should be exposed over
a UNIX domain socket backend to the local host only. Use of the TCP based
character device backend is inappropriate unless configured to use both TLS
encryption and authorization control policy on client connections.

In summary, the monitor console is considered a privileged control interface to
QEMU and as such should only be made accessible to a trusted management
application or user.
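
To make the UNIX domain socket recommendation concrete, the sketch below shows
how a trusted local management script might talk to QMP. It assumes QEMU was
started with an option along the lines of
``-qmp unix:/path/to/qmp.sock,server=on,wait=off``, where the socket path is a
placeholder, and that file system permissions on the socket restrict access to
the management user. The function names are hypothetical. Only the greeting
and the ``qmp_capabilities`` exchange are handled, with no asynchronous events
or error recovery.

```python
import json
import socket


def connect_qmp(path):
    """Connect to a QMP UNIX domain socket and return a text stream (sketch)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    return sock.makefile("rw", encoding="utf-8")


def qmp_handshake(stream):
    """Read the QMP greeting, then leave capabilities negotiation mode.

    Returns the contents of the greeting's "QMP" member.
    """
    greeting = json.loads(stream.readline())
    if "QMP" not in greeting:
        raise RuntimeError("peer did not send a QMP greeting")

    # Commands are JSON objects; qmp_capabilities must be issued first.
    stream.write(json.dumps({"execute": "qmp_capabilities"}) + "\r\n")
    stream.flush()

    reply = json.loads(stream.readline())
    if "return" not in reply:
        raise RuntimeError(f"capabilities negotiation failed: {reply}")
    return greeting["QMP"]
```

Calling ``qmp_handshake(connect_qmp("/path/to/qmp.sock"))`` would leave the
connection ready for further commands. Any process able to open that socket
gains the privileges described above, so the socket should live in a directory
that only the management application can access.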