# Memory

Cloud-Hypervisor has many ways to expose memory to the guest VM. This document
aims to explain what Cloud-Hypervisor is capable of and how it can be used to
meet the needs of very different use cases.

## Basic Parameters

`MemoryConfig` or what is known as `--memory` from the CLI perspective is the
easiest way to get started with Cloud-Hypervisor.

```rust
struct MemoryConfig {
    size: u64,
    mergeable: bool,
    shared: bool,
    hugepages: bool,
    hugepage_size: Option<u64>,
    hotplug_method: HotplugMethod,
    hotplug_size: Option<u64>,
    hotplugged_size: Option<u64>,
    zones: Option<Vec<MemoryZoneConfig>>,
}
```

```
--memory <memory>    Memory parameters "size=<guest_memory_size>,mergeable=on|off,shared=on|off,hugepages=on|off,hotplug_method=acpi|virtio-mem,hotplug_size=<hotpluggable_memory_size>,hotplugged_size=<hotplugged_memory_size>"
```

### `size`

Size of the RAM in the guest VM.

This option is mandatory when using the `--memory` parameter.

Value is an unsigned integer of 64 bits.

_Example_

```
--memory size=1G
```

### `mergeable`

Specifies if the pages from the guest RAM must be marked as _mergeable_. In
case this option is `true` or `on`, the pages will be marked with `madvise(2)`
to let the host kernel know which pages are eligible for being merged by the
KSM daemon.

This option can be used when trying to reach a higher density of VMs running
on a single host, as it will reduce the amount of memory consumed by each VM.

By default this option is turned off.

_Example_

```
--memory size=1G,mergeable=on
```

### `shared`

Specifies if the memory must be `mmap(2)` with `MAP_SHARED` flag.

By sharing a memory mapping, one can share the guest RAM with other processes
running on the host.
One can use this option when running vhost-user devices
as part of the VM device model, as they will be driven by standalone daemons
needing access to the guest RAM content.

By default this option is turned off, which results in performing `mmap(2)`
with `MAP_PRIVATE` flag.

_Example_

```
--memory size=1G,shared=on
```

### `hugepages` and `hugepage_size`

Specifies if the memory must be created and `mmap(2)` with `MAP_HUGETLB` and size
flags. This performs a memory mapping relying on the specified huge page size.
If no huge page size is supplied, the system's default huge page size is used.

By using hugepages, one can improve the overall performance of the VM, assuming
the guest will allocate hugepages as well. Another interesting use case is VFIO
as it speeds up the VM's boot time since the number of IOMMU mappings is
reduced.

The user is responsible for ensuring there are sufficient huge pages of the
specified size for the VMM to use. Failure to do so may result in strange VMM
behaviour.

By default this option is turned off.

_Example_

```
--memory size=1G,hugepages=on,hugepage_size=2M
```

### `hotplug_method`

Selects the way of adding and/or removing memory to/from a booted VM.

Possible values are `acpi` and `virtio-mem`. Default value is `acpi`.

_Example_

```
--memory size=1G,hotplug_method=acpi
```

### `hotplug_size`

Amount of memory that can be dynamically added to the VM.

Value is an unsigned integer of 64 bits. A value of 0 is invalid.

_Example_

```
--memory size=1G,hotplug_size=1G
```

### `hotplugged_size`

Amount of memory that will be dynamically added to the VM at boot. This option
allows for starting a VM with a certain amount of memory that can be reduced
during runtime.

This is only valid when the `hotplug_method` is `virtio-mem` as it does not
make sense for the `acpi` use case. When using ACPI, the memory can't be
resized after it has been extended.

This option is only valid when `hotplug_size` is specified, and its value can't
exceed the value of `hotplug_size`.

Value is an unsigned integer of 64 bits. A value of 0 is invalid.

_Example_

```
--memory size=1G,hotplug_method=virtio-mem,hotplug_size=1G,hotplugged_size=512M
```

## Advanced Parameters

`MemoryZoneConfig` or what is known as `--memory-zone` from the CLI perspective
is a power user parameter. It allows for a full description of the guest RAM,
describing how every memory region is backed and exposed to the guest.

```rust
struct MemoryZoneConfig {
    id: String,
    size: u64,
    file: Option<PathBuf>,
    shared: bool,
    hugepages: bool,
    host_numa_node: Option<u32>,
    hotplug_size: Option<u64>,
    hotplugged_size: Option<u64>,
}
```

```
--memory-zone <memory-zone>    User defined memory zone parameters "size=<guest_memory_region_size>,file=<backing_file>,shared=on|off,hugepages=on|off,host_numa_node=<node_id>,id=<zone_identifier>,hotplug_size=<hotpluggable_memory_size>,hotplugged_size=<hotplugged_memory_size>"
```

This parameter expects one or more occurrences, allowing for a list of memory
zones to be defined. It must be used with `--memory size=0`, clearly indicating
that the memory will be described through advanced parameters.

Each zone is given a list of options which we detail through the following
sections.

### `id`

Memory zone identifier. This identifier must be unique, otherwise an error will
be returned.

This option is useful when referring to a memory zone previously created. In
particular, the `--numa` parameter can associate a memory zone to a specific
NUMA node based on the memory zone identifier.
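
The identifier rules above can be illustrated with a minimal sketch. The helper `validate_zone_refs` is hypothetical and is not Cloud-Hypervisor's actual validation code; rejecting a reference to an undefined zone is an assumption made for illustration, while the duplicate check mirrors the uniqueness requirement stated above.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of Cloud-Hypervisor): checks that memory
/// zone identifiers are unique and that every zone referenced from a NUMA
/// node was previously defined.
fn validate_zone_refs(zone_ids: &[&str], numa_zone_refs: &[&str]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for id in zone_ids {
        // `insert` returns false when the identifier is already present.
        if !seen.insert(*id) {
            return Err(format!("duplicate memory zone id '{}'", id));
        }
    }
    for zone_ref in numa_zone_refs {
        // Assumption: a reference to an undefined zone is treated as an error.
        if !seen.contains(zone_ref) {
            return Err(format!("unknown memory zone id '{}'", zone_ref));
        }
    }
    Ok(())
}

fn main() {
    // Zones as they would be declared with `--memory-zone id=mem0,... id=mem1,...`
    let zones = ["mem0", "mem1"];
    // Zones as they would be referenced by `--numa ...,memory_zones=mem0`
    let refs = ["mem0"];
    match validate_zone_refs(&zones, &refs) {
        Ok(()) => println!("zone configuration is consistent"),
        Err(e) => println!("invalid configuration: {}", e),
    }
}
```
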

This option is mandatory when using the `--memory-zone` parameter.

Value is a string.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G
```

### `size`

Size of the memory zone.

This option is mandatory when using the `--memory-zone` parameter.

Value is an unsigned integer of 64 bits.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G
```

### `file`

Path to the file backing the memory zone. This can be either a file or a
directory. In case of a file, it will be opened and used as the backing file
for the `mmap(2)` operation. In case of a directory, a temporary file with no
hard link on the filesystem will be created. This file will be used as the
backing file for the `mmap(2)` operation.

This option can be particularly useful when trying to back a part of the guest
RAM with a well known file. In the context of the snapshot/restore feature, and
if the provided path is a file, the snapshot operation will not perform any
copy of the guest RAM content for this specific memory zone since the user has
access to it and it would duplicate data already stored on the current
filesystem.

Value is a string.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G,file=/foo/bar
```

### `shared`

Specifies if the memory zone must be `mmap(2)` with `MAP_SHARED` flag.

By sharing a memory zone mapping, one can share part of the guest RAM with
other processes running on the host. One can use this option when running
vhost-user devices as part of the VM device model, as they will be driven
by standalone daemons needing access to the guest RAM content.

By default this option is turned off, which results in performing `mmap(2)`
with `MAP_PRIVATE` flag.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G,shared=on
```

### `hugepages`

Specifies if the memory zone must be `mmap(2)` with `MAP_HUGETLB` and
`MAP_HUGE_2MB` flags. This performs a memory zone mapping relying on 2MiB
pages instead of the default 4KiB pages.

By using hugepages, one can improve the overall performance of the VM, assuming
the guest will allocate hugepages as well. Another interesting use case is VFIO
as it speeds up the VM's boot time since the number of IOMMU mappings is
reduced.

By default this option is turned off.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G,hugepages=on
```

### `host_numa_node`

Node identifier of a node present on the host. This option will let the user
pick a specific NUMA node from which the memory must be allocated. After the
memory zone is `mmap(2)`, the NUMA policy for this memory mapping will be
applied through `mbind(2)`, relying on the provided node identifier. If the
node does not exist on the host, the call to `mbind(2)` will fail.

This option is useful when trying to back a VM memory with a specific type of
memory from the host. Assuming a host has two types of memory, with one slower
than the other, each related to a distinct NUMA node, one could create a VM
with slower memory accesses by backing the entire guest RAM from the furthest
NUMA node on the host.

This option also gives the opportunity to create a VM with non-uniform memory
accesses as one could define a first memory zone backed by fast memory, and a
second memory zone backed by slow memory.

Value is an unsigned integer of 32 bits.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G,host_numa_node=0
```

### `hotplug_size`

Amount of memory that can be dynamically added to the memory zone.
Since `virtio-mem` is the only way of resizing a memory zone, one must pass
`hotplug_method=virtio-mem` to the `--memory` parameter.

Value is an unsigned integer of 64 bits. A value of 0 is invalid.

_Example_

```
--memory size=0,hotplug_method=virtio-mem
--memory-zone id=mem0,size=1G,hotplug_size=1G
```

### `hotplugged_size`

Amount of memory that will be dynamically added to a memory zone at VM's boot.
This option allows for starting a VM with a certain amount of memory that can
be reduced during runtime.

This is only valid when the `hotplug_method` is `virtio-mem` as it does not
make sense for the `acpi` use case. When using ACPI, the memory can't be
resized after it has been extended.

This option is only valid when `hotplug_size` is specified, and its value can't
exceed the value of `hotplug_size`.

Value is an unsigned integer of 64 bits. A value of 0 is invalid.

_Example_

```
--memory size=0,hotplug_method=virtio-mem
--memory-zone id=mem0,size=1G,hotplug_size=1G,hotplugged_size=512M
```

## NUMA settings

`NumaConfig` or what is known as `--numa` from the CLI perspective has been
introduced to define a guest NUMA topology. It allows for a fine-grained
description of the CPUs and memory ranges associated with each NUMA node.
Additionally, it allows for specifying the distance between each NUMA node.

```rust
struct NumaConfig {
    guest_numa_id: u32,
    cpus: Option<Vec<u8>>,
    distances: Option<Vec<NumaDistance>>,
    memory_zones: Option<Vec<String>>,
    sgx_epc_sections: Option<Vec<String>>,
}
```

```
--numa <numa>    Settings related to a given NUMA node "guest_numa_id=<node_id>,cpus=<cpus_id>,distances=<list_of_distances_to_destination_nodes>,memory_zones=<list_of_memory_zones>,sgx_epc_sections=<list_of_sgx_epc_sections>"
```

### `guest_numa_id`

Node identifier of a guest NUMA node.
This identifier must be unique, otherwise
an error will be returned.

This option is mandatory when using the `--numa` parameter.

Value is an unsigned integer of 32 bits.

_Example_

```
--numa guest_numa_id=0
```

### `cpus`

List of virtual CPUs attached to the guest NUMA node identified by the
`guest_numa_id` option. This allows for describing a list of CPUs which
must be seen by the guest as belonging to the NUMA node `guest_numa_id`.

One can use this option for a fine-grained description of the NUMA topology
regarding the CPUs associated with it, which might help the guest run more
efficiently.

Multiple values can be provided to define the list. Each value is an unsigned
integer of 8 bits.

For instance, if one needs to attach all CPUs from 0 to 4 to a specific node,
the syntax using `-` will help define a contiguous range with `cpus=0-4`. The
same example could also be described with `cpus=0:1:2:3:4`.

A combination of both `-` and `:` separators is useful when one might need to
describe a list containing all CPUs from 0 to 99 and the CPU 255, as it could
simply be described with `cpus=0-99:255`.

_Example_

```
--cpus boot=8
--numa guest_numa_id=0,cpus=1-3:7 guest_numa_id=1,cpus=0:4-6
```

### `distances`

List of distances between the current NUMA node referred to by `guest_numa_id`
and the destination NUMA nodes listed along with distances. This option lets
the user choose the distances between guest NUMA nodes. This is important to
provide an accurate description of the way non-uniform memory accesses will
perform in the guest.

One or more tuples of two values must be provided through this option. The first
value is an unsigned integer of 32 bits as it represents the destination NUMA
node.
The second value is an unsigned integer of 8 bits as it represents the
distance between the current NUMA node and the destination NUMA node. The two
values are separated by `@` (`value1@value2`), meaning the destination NUMA
node `value1` is located at a distance of `value2`. Each tuple is separated
from the others with the `:` separator.

For instance, if one wants to define 3 NUMA nodes, with each node located at
different distances, it can be described with the following example.

_Example_

```
--numa guest_numa_id=0,distances=1@15:2@25 guest_numa_id=1,distances=0@15:2@20 guest_numa_id=2,distances=0@25:1@20
```

### `memory_zones`

List of memory zones attached to the guest NUMA node identified by the
`guest_numa_id` option. This allows for describing a list of memory ranges
which must be seen by the guest as belonging to the NUMA node `guest_numa_id`.

This option can be very useful and powerful when combined with the
`host_numa_node` option from the `--memory-zone` parameter as it allows for
creating a VM with non-uniform memory accesses, and lets the guest know about
it. It allows for exposing memory zones through different NUMA nodes, which can
help the guest workload run more efficiently.

Multiple values can be provided to define the list. Each value is a string
referring to an existing memory zone identifier. Values are separated from
each other with the `:` separator.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G id=mem1,size=1G id=mem2,size=1G
--numa guest_numa_id=0,memory_zones=mem0:mem2 guest_numa_id=1,memory_zones=mem1
```

### `sgx_epc_sections`

List of SGX EPC sections attached to the guest NUMA node identified by the
`guest_numa_id` option. This allows for describing a list of SGX EPC sections
which must be seen by the guest as belonging to the NUMA node `guest_numa_id`.

Multiple values can be provided to define the list. Each value is a string
referring to an existing SGX EPC section identifier. Values are separated from
each other with the `:` separator.

_Example_

```
--sgx-epc id=epc0,size=32M id=epc1,size=64M id=epc2,size=32M
--numa guest_numa_id=0,sgx_epc_sections=epc1 guest_numa_id=1,sgx_epc_sections=epc0:epc2
```

### PCI bus

Cloud Hypervisor supports only one PCI bus, which is why it has been tied to
the NUMA node 0 by default. It is the user's responsibility to organize the NUMA
nodes correctly so that vCPUs and guest RAM which should be located on the same
NUMA node as the PCI bus end up on the NUMA node 0.
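
When deciding which vCPUs to place on NUMA node 0, it can help to see how a `cpus` list expands. The sketch below follows the list syntax described in the `cpus` section (`-` for contiguous ranges, `:` between entries, each value an unsigned integer of 8 bits). The function `parse_cpu_list` is a hypothetical helper written for illustration, not Cloud Hypervisor's actual parser, and rejecting a descending range such as `4-1` is an assumption.

```rust
/// Hypothetical helper (not Cloud Hypervisor's parser): expands a CPU list
/// such as "0-99:255" into individual CPU identifiers.
fn parse_cpu_list(list: &str) -> Result<Vec<u8>, String> {
    let mut cpus = Vec::new();
    // Entries are separated by ':'.
    for item in list.split(':') {
        match item.split_once('-') {
            // "start-end" describes a contiguous, inclusive range.
            Some((start, end)) => {
                let start: u8 = start
                    .parse()
                    .map_err(|e| format!("invalid CPU id '{}': {}", start, e))?;
                let end: u8 = end
                    .parse()
                    .map_err(|e| format!("invalid CPU id '{}': {}", end, e))?;
                // Assumption: a descending range is treated as an error.
                if start > end {
                    return Err(format!("invalid CPU range '{}'", item));
                }
                cpus.extend(start..=end);
            }
            // A single CPU identifier.
            None => {
                let cpu: u8 = item
                    .parse()
                    .map_err(|e| format!("invalid CPU id '{}': {}", item, e))?;
                cpus.push(cpu);
            }
        }
    }
    Ok(cpus)
}

fn main() {
    // Same list as in the `cpus` example: `--numa guest_numa_id=0,cpus=1-3:7`
    let cpus = parse_cpu_list("1-3:7").expect("valid CPU list");
    println!("{:?}", cpus); // prints [1, 2, 3, 7]
}
```

Parsing each value as `u8` means an identifier above 255 is rejected at parse time, which matches the 8-bit value type stated for `cpus`.
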