# Memory

Cloud Hypervisor has many ways to expose memory to the guest VM. This document
aims to explain what Cloud Hypervisor is capable of and how it can be used to
meet the needs of very different use cases.

## Basic Parameters

`MemoryConfig` or what is known as `--memory` from the CLI perspective is the
easiest way to get started with Cloud Hypervisor.

```rust
struct MemoryConfig {
    size: u64,
    mergeable: bool,
    hotplug_method: HotplugMethod,
    hotplug_size: Option<u64>,
    hotplugged_size: Option<u64>,
    shared: bool,
    hugepages: bool,
    hugepage_size: Option<u64>,
    prefault: bool,
    thp: bool,
    zones: Option<Vec<MemoryZoneConfig>>,
}
```

```
--memory <memory>	Memory parameters "size=<guest_memory_size>,mergeable=on|off,shared=on|off,hugepages=on|off,hugepage_size=<hugepage_size>,hotplug_method=acpi|virtio-mem,hotplug_size=<hotpluggable_memory_size>,hotplugged_size=<hotplugged_memory_size>,prefault=on|off,thp=on|off" [default: size=512M,thp=on]
```

### `size`

Size of the RAM in the guest VM.

This option is mandatory when using the `--memory` parameter.

Value is an unsigned integer of 64 bits.

_Example_

```
--memory size=1G
```
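
The example above passes `1G` where a 64-bit byte count is expected. As a rough
illustration of how such a value could be turned into a `u64`, the sketch below
parses an optional binary suffix. The function name `parse_size` and the set of
accepted suffixes (`K`, `M`, `G`) are assumptions made for this example; it is
not a description of Cloud Hypervisor's own parser.

```rust
/// Hypothetical helper: parse a size string such as "512M" or "1G" into bytes.
/// Assumes binary multipliers (K = 1024); plain digits are taken as bytes.
fn parse_size(s: &str) -> Option<u64> {
    let s = s.trim();
    // Split off a one-character suffix, if any, and pick the matching shift.
    let (digits, shift) = match s.chars().last()? {
        'K' => (&s[..s.len() - 1], 10),
        'M' => (&s[..s.len() - 1], 20),
        'G' => (&s[..s.len() - 1], 30),
        _ => (s, 0),
    };
    let value: u64 = digits.parse().ok()?;
    // checked_mul rejects values that do not fit in 64 bits.
    value.checked_mul(1u64 << shift)
}
```

With this sketch, `parse_size("1G")` yields `Some(1073741824)`, while an empty
string or a bare suffix yields `None`.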

### `mergeable`

Specifies if the pages from the guest RAM must be marked as _mergeable_. In
case this option is `true` or `on`, the pages will be marked with `madvise(2)`
to let the host kernel know which pages are eligible for being merged by the
KSM daemon.

This option can be used when trying to reach a higher density of VMs running
on a single host, as it will reduce the amount of memory consumed by each VM.

By default this option is turned off.

_Example_

```
--memory size=1G,mergeable=on
```

### `hotplug_method`

Selects the way of adding and/or removing memory to/from a booted VM.

Possible values are `acpi` and `virtio-mem`. Default value is `acpi`.

_Example_

```
--memory size=1G,hotplug_method=acpi
```

### `hotplug_size`

Amount of memory that can be dynamically added to the VM.

Value is an unsigned integer of 64 bits. A value of 0 is invalid.

_Example_

```
--memory size=1G,hotplug_size=1G
```

### `hotplugged_size`

Amount of memory that will be dynamically added to the VM at boot. This option
allows for starting a VM with a certain amount of memory that can be reduced
during runtime.

This is only valid when the `hotplug_method` is `virtio-mem` as it does not
make sense for the `acpi` use case. When using ACPI, the memory can't be
resized after it has been extended.

This option is only valid when `hotplug_size` is specified, and its value can't
exceed the value of `hotplug_size`.

Value is an unsigned integer of 64 bits. A value of 0 is invalid.

_Example_

```
--memory size=1G,hotplug_method=virtio-mem,hotplug_size=1G,hotplugged_size=512M
```
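
The constraints listed above for `hotplug_size` and `hotplugged_size` can be
summarised as a small validation routine. This is a minimal sketch under
assumptions: the function `validate_hotplug`, the local `HotplugMethod` enum
and the error strings are made up for illustration and do not reproduce the
checks as implemented in Cloud Hypervisor.

```rust
#[derive(Debug, PartialEq)]
enum HotplugMethod {
    Acpi,
    VirtioMem,
}

/// Hypothetical check mirroring the documented rules for the hotplug options.
fn validate_hotplug(
    method: &HotplugMethod,
    hotplug_size: Option<u64>,
    hotplugged_size: Option<u64>,
) -> Result<(), &'static str> {
    if hotplug_size == Some(0) {
        return Err("hotplug_size must not be 0");
    }
    // Nothing more to check when hotplugged_size is not given.
    let hotplugged = match hotplugged_size {
        Some(v) => v,
        None => return Ok(()),
    };
    if hotplugged == 0 {
        return Err("hotplugged_size must not be 0");
    }
    if *method != HotplugMethod::VirtioMem {
        return Err("hotplugged_size requires hotplug_method=virtio-mem");
    }
    match hotplug_size {
        None => Err("hotplugged_size requires hotplug_size"),
        Some(max) if hotplugged > max => Err("hotplugged_size exceeds hotplug_size"),
        Some(_) => Ok(()),
    }
}
```

The configuration from the example (`virtio-mem`, 1G hotpluggable, 512M
hotplugged) passes this check, while the same sizes with `acpi` are rejected.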

### `shared`

Specifies if the memory must be mapped by `mmap(2)` with the `MAP_SHARED` flag.

By sharing a memory mapping, one can share the guest RAM with other processes
running on the host. One can use this option when running vhost-user devices
as part of the VM device model, as they will be driven by standalone daemons
needing access to the guest RAM content.

By default this option is turned off, which results in performing `mmap(2)`
with the `MAP_PRIVATE` flag.

If `hugepages=on` then the value of this field is ignored as huge pages always
require `MAP_SHARED`.

_Example_

```
--memory size=1G,shared=on
```

### `hugepages` and `hugepage_size`

Specifies if the memory must be created and mapped by `mmap(2)` with the
`MAP_HUGETLB` and size flags. This performs a memory mapping relying on the
specified huge page size. If no huge page size is supplied the system's default
huge page size is used.

By using hugepages, one can improve the overall performance of the VM, assuming
the guest will allocate hugepages as well. Another interesting use case is VFIO
as it speeds up the VM's boot time since the number of IOMMU mappings is
reduced.

The user is responsible for ensuring there are sufficient huge pages of the
specified size for the VMM to use. Failure to do so may result in strange VMM
behaviour, e.g. an error with `ReadKernelImage` is common. If there is a strange
error with `hugepages` enabled, disable it or check whether there are enough
huge pages.

If `hugepages=on` then the value of `shared` is ignored as huge pages always
require `MAP_SHARED`.

By default this option is turned off.

_Example_

```
--memory size=1G,hugepages=on,hugepage_size=2M
```

### `prefault`

Specifies if the memory must be mapped by `mmap(2)` with the `MAP_POPULATE`
flag.

By triggering prefault, one can allocate all required physical memory and create
its page tables while calling `mmap`. With physical memory already allocated,
fewer page faults occur at runtime, which improves performance.

Note that the VM will boot more slowly with `prefault` enabled because physical
memory is allocated and page tables are created in advance, and physical memory
of the specified size is consumed right away.

This option only takes effect when the VM boots. There is also a `prefault`
option in restore, and its value overrides the `prefault` value from the memory
configuration.

By default this option is turned off.

_Example_

```
--memory size=1G,prefault=on
```
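
The `shared`, `hugepages`, `hugepage_size` and `prefault` options described
above each correspond to `mmap(2)` flags. The sketch below shows one way those
options could be combined into a flag word. The helper `mmap_flags` is
hypothetical, the constants are the Linux values for x86_64 and other targets
using the generic flag layout, and the code only follows the description in
this document; it is not taken from the Cloud Hypervisor source.

```rust
// Linux mmap(2) flag values (x86_64 and other asm-generic targets).
const MAP_SHARED: i32 = 0x01;
const MAP_PRIVATE: i32 = 0x02;
const MAP_POPULATE: i32 = 0x8000;
const MAP_HUGETLB: i32 = 0x40000;
const MAP_HUGE_SHIFT: u32 = 26;

/// Hypothetical helper: derive mmap(2) flags from the `--memory` options.
/// `hugepage_size` is assumed to be a power of two, e.g. 2M or 1G.
fn mmap_flags(shared: bool, hugepages: bool, hugepage_size: Option<u64>, prefault: bool) -> i32 {
    // Huge pages always imply MAP_SHARED, so `shared` is ignored in that case.
    let mut flags = if shared || hugepages { MAP_SHARED } else { MAP_PRIVATE };
    if hugepages {
        flags |= MAP_HUGETLB;
        if let Some(size) = hugepage_size {
            // The page size is encoded as log2(size), shifted by MAP_HUGE_SHIFT.
            flags |= (size.trailing_zeros() as i32) << MAP_HUGE_SHIFT;
        }
    }
    if prefault {
        flags |= MAP_POPULATE;
    }
    flags
}
```

For `hugepages=on,hugepage_size=2M` this produces `MAP_SHARED | MAP_HUGETLB`
plus the encoding of a 2 MiB page size (`21 << MAP_HUGE_SHIFT`), regardless of
the value of `shared`.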

### `thp`

Specifies if private anonymous memory for the guest (i.e. `shared=off` and no
backing file) should be labelled `MADV_HUGEPAGE` with `madvise(2)` indicating
to the kernel that this memory may be backed with huge pages transparently.

The use of transparent huge pages can improve the performance of the guest as
there will be fewer virtualisation related page faults. Unlike with
`hugepages=on`, a specific number of huge pages does not need to be allocated
by the kernel.

By default this option is turned on.

_Example_

```
--memory size=1G,thp=on
```

## Advanced Parameters

`MemoryZoneConfig` or what is known as `--memory-zone` from the CLI perspective
is a power user parameter. It allows for a full description of the guest RAM,
describing how every memory region is backed and exposed to the guest.

```rust
struct MemoryZoneConfig {
    id: String,
    size: u64,
    file: Option<PathBuf>,
    shared: bool,
    hugepages: bool,
    hugepage_size: Option<u64>,
    host_numa_node: Option<u32>,
    hotplug_size: Option<u64>,
    hotplugged_size: Option<u64>,
    prefault: bool,
}
```

```
--memory-zone <memory-zone>	User defined memory zone parameters "size=<guest_memory_region_size>,file=<backing_file>,shared=on|off,hugepages=on|off,hugepage_size=<hugepage_size>,host_numa_node=<node_id>,id=<zone_identifier>,hotplug_size=<hotpluggable_memory_size>,hotplugged_size=<hotplugged_memory_size>,prefault=on|off"
```

This parameter expects one or more occurrences, allowing for a list of memory
zones to be defined. It must be used with `--memory size=0`, clearly indicating
that the memory will be described through advanced parameters.

Each zone is given a list of options which we detail through the following
sections.

### `id`

Memory zone identifier. This identifier must be unique, otherwise an error will
be returned.

This option is useful when referring to a memory zone previously created. In
particular, the `--numa` parameter can associate a memory zone to a specific
NUMA node based on the memory zone identifier.

This option is mandatory when using the `--memory-zone` parameter.

Value is a string.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G
```
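
Two rules stated so far apply to every set of memory zones: `--memory` must be
given `size=0`, and each zone identifier must be unique. A minimal sketch of
such a check follows. The function `validate_zones` and its error messages are
hypothetical and only illustrate the rules; they are not Cloud Hypervisor's
actual validation code.

```rust
use std::collections::HashSet;

/// Hypothetical check: `--memory size` must be 0 and zone ids must be unique.
fn validate_zones(memory_size: u64, zone_ids: &[&str]) -> Result<(), String> {
    if memory_size != 0 {
        return Err("--memory-zone requires --memory size=0".to_string());
    }
    let mut seen = HashSet::new();
    for id in zone_ids {
        // `insert` returns false when the id was already present.
        if !seen.insert(*id) {
            return Err(format!("duplicate memory zone id: {}", id));
        }
    }
    Ok(())
}
```

Calling `validate_zones(0, &["mem0", "mem1"])` succeeds, while repeating
`mem0` or passing a non-zero `--memory` size returns an error.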

### `size`

Size of the memory zone.

This option is mandatory when using the `--memory-zone` parameter.

Value is an unsigned integer of 64 bits.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G
```

### `file`

Path to the file backing the memory zone. The file will be opened and used as
the backing file for the `mmap(2)` operation.

This option can be particularly useful when trying to back a part of the guest
RAM with a well known file. In the context of the snapshot/restore feature, and
if the provided path is a file, the snapshot operation will not perform any
copy of the guest RAM content for this specific memory zone since the user has
access to it and it would duplicate data already stored on the current
filesystem.

Value is a string.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G,file=/foo/bar
```

### `shared`

Specifies if the memory zone must be mapped by `mmap(2)` with the `MAP_SHARED`
flag.

By sharing a memory zone mapping, one can share part of the guest RAM with
other processes running on the host. One can use this option when running
vhost-user devices as part of the VM device model, as they will be driven
by standalone daemons needing access to the guest RAM content.

If `hugepages=on` then the value of this field is ignored as huge pages always
require `MAP_SHARED`.

By default this option is turned off, which results in performing `mmap(2)`
with the `MAP_PRIVATE` flag.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G,shared=on
```

### `hugepages` and `hugepage_size`

Specifies if the memory must be created and mapped by `mmap(2)` with the
`MAP_HUGETLB` and size flags. This performs a memory mapping relying on the
specified huge page size. If no huge page size is supplied the system's default
huge page size is used.

By using hugepages, one can improve the overall performance of the VM, assuming
the guest will allocate hugepages as well. Another interesting use case is VFIO
as it speeds up the VM's boot time since the number of IOMMU mappings is
reduced.

The user is responsible for ensuring there are sufficient huge pages of the
specified size for the VMM to use. Failure to do so may result in strange VMM
behaviour, e.g. an error with `ReadKernelImage` is common. If there is a strange
error with `hugepages` enabled, disable it or check whether there are enough
huge pages.

If `hugepages=on` then the value of `shared` is ignored as huge pages always
require `MAP_SHARED`.

By default this option is turned off.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G,hugepages=on,hugepage_size=2M
```

### `host_numa_node`

Node identifier of a node present on the host. This option will let the user
pick a specific NUMA node from which the memory must be allocated. After the
memory zone is `mmap(2)`, the NUMA policy for this memory mapping will be
applied through `mbind(2)`, relying on the provided node identifier. If the
node does not exist on the host, the call to `mbind(2)` will fail.

This option is useful when trying to back a VM memory with a specific type of
memory from the host. Assuming a host has two types of memory, with one slower
than the other, each related to a distinct NUMA node, one could create a VM
with slower memory accesses by backing the entire guest RAM from the furthest
NUMA node on the host.

This option also gives the opportunity to create a VM with non uniform memory
accesses as one could define a first memory zone backed by fast memory, and a
second memory zone backed by slow memory.

Value is an unsigned integer of 32 bits.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G,host_numa_node=0
```

### `hotplug_size`

Amount of memory that can be dynamically added to the memory zone. Since
`virtio-mem` is the only way of resizing a memory zone, one must specify
the `hotplug_method=virtio-mem` to the `--memory` parameter.

Value is an unsigned integer of 64 bits. A value of 0 is invalid.

_Example_

```
--memory size=0,hotplug_method=virtio-mem
--memory-zone id=mem0,size=1G,hotplug_size=1G
```

### `hotplugged_size`

Amount of memory that will be dynamically added to a memory zone at VM's boot.
This option allows for starting a VM with a certain amount of memory that can
be reduced during runtime.

This is only valid when the `hotplug_method` is `virtio-mem` as it does not
make sense for the `acpi` use case. When using ACPI, the memory can't be
resized after it has been extended.

This option is only valid when `hotplug_size` is specified, and its value can't
exceed the value of `hotplug_size`.

Value is an unsigned integer of 64 bits. A value of 0 is invalid.

_Example_

```
--memory size=0,hotplug_method=virtio-mem
--memory-zone id=mem0,size=1G,hotplug_size=1G,hotplugged_size=512M
```

### `prefault`

Specifies if the memory must be mapped by `mmap(2)` with the `MAP_POPULATE`
flag.

By triggering prefault, one can allocate all required physical memory and create
its page tables while calling `mmap`. With physical memory already allocated,
fewer page faults occur at runtime, which improves performance.

Note that the VM will boot more slowly with `prefault` enabled because physical
memory is allocated and page tables are created in advance, and physical memory
of the specified size is consumed right away.

This option only takes effect when the VM boots. There is also a `prefault`
option in restore, and its value overrides the `prefault` value from the memory
configuration.

By default this option is turned off.

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G,prefault=on
```

## NUMA settings

`NumaConfig` or what is known as `--numa` from the CLI perspective has been
introduced to define a guest NUMA topology. It allows for a fine-grained
description of the CPUs and memory ranges associated with each NUMA node.
Additionally it allows for specifying the distance between each NUMA node.

```rust
struct NumaConfig {
    guest_numa_id: u32,
    cpus: Option<Vec<u8>>,
    distances: Option<Vec<NumaDistance>>,
    memory_zones: Option<Vec<String>>,
    sgx_epc_sections: Option<Vec<String>>,
}
```

```
--numa <numa>	Settings related to a given NUMA node "guest_numa_id=<node_id>,cpus=<cpus_id>,distances=<list_of_distances_to_destination_nodes>,memory_zones=<list_of_memory_zones>,sgx_epc_sections=<list_of_sgx_epc_sections>"
```

### `guest_numa_id`

Node identifier of a guest NUMA node. This identifier must be unique, otherwise
an error will be returned.

This option is mandatory when using the `--numa` parameter.

Value is an unsigned integer of 32 bits.

_Example_

```
--numa guest_numa_id=0
```

### `cpus`

List of virtual CPUs attached to the guest NUMA node identified by the
`guest_numa_id` option. This allows for describing a list of CPUs which
must be seen by the guest as belonging to the NUMA node `guest_numa_id`.

One can use this option for a fine-grained description of the NUMA topology
regarding the CPUs associated with it, which might help the guest run more
efficiently.

Multiple values can be provided to define the list. Each value is an unsigned
integer of 8 bits.

For instance, if one needs to attach all CPUs from 0 to 4 to a specific node,
the syntax using `-` will help define a contiguous range with `cpus=0-4`. The
same example could also be described with `cpus=[0,1,2,3,4]`.

A combination of both `-` and `,` separators is useful when one might need to
describe a list containing all CPUs from 0 to 99 and the CPU 255, as it could
simply be described with `cpus=[0-99,255]`.

As soon as one tries to describe a list of values, `[` and `]` must be used to
demarcate the list.

_Example_

```
--cpus boot=8
--numa guest_numa_id=0,cpus=[1-3,7] guest_numa_id=1,cpus=[0,4-6]
```
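
To make the `cpus` syntax concrete, the following sketch expands a value such
as `[1-3,7]` into the individual 8-bit CPU identifiers. The function
`parse_cpus` is a hypothetical helper written for this document; it follows
the rules described above (ranges with `-`, lists with `,` inside `[` and `]`)
and is not the parser used by Cloud Hypervisor.

```rust
/// Hypothetical parser for the `cpus` value, e.g. "[0-99,255]" or "0-4".
fn parse_cpus(s: &str) -> Option<Vec<u8>> {
    // Brackets are optional for a single value or range, required for a list.
    let inner = match s.strip_prefix('[') {
        Some(rest) => rest.strip_suffix(']')?,
        None if s.contains(',') => return None,
        None => s,
    };
    let mut cpus = Vec::new();
    for item in inner.split(',') {
        match item.split_once('-') {
            // "a-b" expands to every CPU from a to b inclusive.
            Some((start, end)) => {
                let start: u8 = start.parse().ok()?;
                let end: u8 = end.parse().ok()?;
                if start > end {
                    return None;
                }
                cpus.extend(start..=end);
            }
            // A single CPU identifier.
            None => cpus.push(item.parse().ok()?),
        }
    }
    Some(cpus)
}
```

Under these assumptions `parse_cpus("[1-3,7]")` returns the CPUs 1, 2, 3 and 7,
and a value above 255 is rejected because it does not fit in 8 bits.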

### `distances`

List of distances between the current NUMA node referred to by `guest_numa_id`
and the destination NUMA nodes listed along with distances. This option lets
the user choose the distances between guest NUMA nodes. This is important to
provide an accurate description of the way non uniform memory accesses will
perform in the guest.

One or more tuples of two values must be provided through this option. The
first value is an unsigned integer of 32 bits as it represents the destination
NUMA node. The second value is an unsigned integer of 8 bits as it represents
the distance between the current NUMA node and the destination NUMA node. The
two values are separated by `@` (`value1@value2`), meaning the destination NUMA
node `value1` is located at a distance of `value2`. Each tuple is separated
from the others with the `,` separator.

As soon as one tries to describe a list of values, `[` and `]` must be used to
demarcate the list.

For instance, if one wants to define 3 NUMA nodes, with each node located at
different distances, it can be described with the following example.

_Example_

```
--numa guest_numa_id=0,distances=[1@15,2@25] guest_numa_id=1,distances=[0@15,2@20] guest_numa_id=2,distances=[0@25,1@20]
```
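
The `destination@distance` tuples can be read as pairs of a 32-bit node
identifier and an 8-bit distance. The sketch below parses one `distances`
value into such pairs. The function `parse_distances` is hypothetical and only
illustrates the syntax described above; it is not Cloud Hypervisor's own
parsing code.

```rust
/// Hypothetical parser for `distances`, e.g. "[1@15,2@25]".
/// Returns (destination node, distance) pairs.
fn parse_distances(s: &str) -> Option<Vec<(u32, u8)>> {
    // Brackets are optional for a single tuple, required for a list.
    let inner = match s.strip_prefix('[') {
        Some(rest) => rest.strip_suffix(']')?,
        None if s.contains(',') => return None,
        None => s,
    };
    inner
        .split(',')
        .map(|pair| -> Option<(u32, u8)> {
            // "value1@value2": destination node, then distance.
            let (dest, dist) = pair.split_once('@')?;
            let dest: u32 = dest.parse().ok()?;
            let dist: u8 = dist.parse().ok()?;
            Some((dest, dist))
        })
        .collect()
}
```

For node 0 in the example, `parse_distances("[1@15,2@25]")` gives the pairs
`(1, 15)` and `(2, 25)`.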

### `memory_zones`

List of memory zones attached to the guest NUMA node identified by the
`guest_numa_id` option. This allows for describing a list of memory ranges
which must be seen by the guest as belonging to the NUMA node `guest_numa_id`.

This option can be very useful and powerful when combined with the
`host_numa_node` option from the `--memory-zone` parameter as it allows for
creating a VM with non uniform memory accesses, and lets the guest know about
it. It allows for exposing memory zones through different NUMA nodes, which can
help the guest workload run more efficiently.

Multiple values can be provided to define the list. Each value is a string
referring to an existing memory zone identifier. Values are separated from
each other with the `,` separator.

As soon as one tries to describe a list of values, `[` and `]` must be used to
demarcate the list.

Note that a memory zone must belong to a single NUMA node. The following
configuration is incorrect, therefore not allowed:
`--numa guest_numa_id=0,memory_zones=mem0 guest_numa_id=1,memory_zones=mem0`

_Example_

```
--memory size=0
--memory-zone id=mem0,size=1G id=mem1,size=1G id=mem2,size=1G
--numa guest_numa_id=0,memory_zones=[mem0,mem2] guest_numa_id=1,memory_zones=mem1
```
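
The rule that a memory zone must belong to a single NUMA node can be expressed
as a lookup from zone identifier to owning node. The following is a minimal
sketch; the function `check_zone_ownership` and its input layout are
assumptions made for illustration, not the check as implemented in Cloud
Hypervisor.

```rust
use std::collections::HashMap;

/// Hypothetical check: each memory zone may be attached to one guest NUMA node only.
/// `nodes` pairs a `guest_numa_id` with the zone ids listed in its `memory_zones`.
fn check_zone_ownership(nodes: &[(u32, Vec<&str>)]) -> Result<(), String> {
    let mut owner: HashMap<&str, u32> = HashMap::new();
    for (node_id, zones) in nodes {
        for zone in zones {
            // `insert` returns the previous owner if the zone was already claimed.
            if let Some(previous) = owner.insert(*zone, *node_id) {
                return Err(format!(
                    "memory zone {} is attached to NUMA nodes {} and {}",
                    zone, previous, node_id
                ));
            }
        }
    }
    Ok(())
}
```

The valid example above (`mem0` and `mem2` on node 0, `mem1` on node 1) passes,
while attaching `mem0` to both nodes returns an error.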

### `sgx_epc_sections`

List of SGX EPC sections attached to the guest NUMA node identified by the
`guest_numa_id` option. This allows for describing a list of SGX EPC sections
which must be seen by the guest as belonging to the NUMA node `guest_numa_id`.

Multiple values can be provided to define the list. Each value is a string
referring to an existing SGX EPC section identifier. Values are separated from
each other with the `,` separator.

As soon as one tries to describe a list of values, `[` and `]` must be used to
demarcate the list.

_Example_

```
--sgx-epc id=epc0,size=32M id=epc1,size=64M id=epc2,size=32M
--numa guest_numa_id=0,sgx_epc_sections=epc1 guest_numa_id=1,sgx_epc_sections=[epc0,epc2]
```

### PCI bus

Cloud Hypervisor supports guests with one or more PCI segments. The default PCI segment always
has affinity to NUMA node 0. By default, all other PCI segments have affinity to NUMA node 0.
The user may configure the NUMA affinity for any additional PCI segments.

_Example_

```
--platform num_pci_segments=2
--memory-zone size=16G,host_numa_node=0,id=mem0
--memory-zone size=16G,host_numa_node=1,id=mem1
--numa guest_numa_id=0,memory_zones=mem0,pci_segments=[0]
--numa guest_numa_id=1,memory_zones=mem1,pci_segments=[1]
```