QEMU Virtual NVDIMM
===================

This document explains the usage of the virtual NVDIMM (vNVDIMM)
feature, which is available since QEMU v2.6.0.

QEMU currently implements only the persistent memory mode of the
vNVDIMM device, not the block window mode.

Basic Usage
-----------

The storage of a vNVDIMM device in QEMU is provided by a memory
backend (i.e. memory-backend-file or memory-backend-ram). A simple
way to create a vNVDIMM device at startup time is to use the
following command line options:

 -machine pc,nvdimm
 -m $RAM_SIZE,slots=$N,maxmem=$MAX_SIZE
 -object memory-backend-file,id=mem1,share=on,mem-path=$PATH,size=$NVDIMM_SIZE
 -device nvdimm,id=nvdimm1,memdev=mem1

Where,

 - the "nvdimm" machine option enables the vNVDIMM feature.

 - "slots=$N" should be equal to or larger than the total number of
   normal RAM devices and vNVDIMM devices, e.g. $N should be >= 2 here.

 - "maxmem=$MAX_SIZE" should be equal to or larger than the total size
   of normal RAM devices and vNVDIMM devices, e.g. $MAX_SIZE should be
   >= $RAM_SIZE + $NVDIMM_SIZE here.

 - "object memory-backend-file,id=mem1,share=on,mem-path=$PATH,size=$NVDIMM_SIZE"
   creates a backend storage of size $NVDIMM_SIZE on the file $PATH. All
   accesses to the virtual NVDIMM device go to the file $PATH.

   "share=on/off" controls the visibility of guest writes. If
   "share=on", guest writes are applied to the backend file. If
   another guest uses the same backend file with "share=on", those
   writes are visible to it as well. If "share=off", guest writes are
   not applied to the backend file and are therefore invisible to
   other guests.

 - "device nvdimm,id=nvdimm1,memdev=mem1" creates a virtual NVDIMM
   device whose storage is provided by the above memory backend.

Multiple vNVDIMM devices can be created by providing multiple pairs of
"-object" and "-device" options.

With the above command line options, a guest OS that has the proper
NVDIMM driver should detect an NVDIMM device that is in the persistent
memory mode and whose size is $NVDIMM_SIZE.

Note:

1. Prior to QEMU v2.8.0, if memory-backend-file is used and the actual
   backend file size is not equal to the size given by the "size"
   option, QEMU truncates the backend file with ftruncate(2), which
   corrupts the existing data in the backend file, especially when
   the file is shrunk.

   QEMU v2.8.0 and later compare the backend file size with the "size"
   option. If they do not match, QEMU reports an error and aborts in
   order to avoid data corruption.

2. QEMU v2.6.0 only puts a basic alignment requirement on the "size"
   option of memory-backend-file, e.g. 4KB alignment on x86. However,
   QEMU v2.7.0 puts an additional alignment requirement, which may
   require a larger value than the basic one, e.g. 2MB on x86. This
   change breaks uses of memory-backend-file that only satisfy the
   basic alignment.

   QEMU v2.8.0 and later remove the additional alignment on non-s390x
   architectures, so the affected memory-backend-file setups work
   again.

Label
-----

QEMU v2.7.0 and later implement label support for vNVDIMM devices.
To enable labels on a vNVDIMM device, add the "label-size=$SZ" option
to "-device nvdimm", e.g.

 -device nvdimm,id=nvdimm1,memdev=mem1,label-size=128K

Note:

1. The minimal label size is 128KB.

2. QEMU v2.7.0 and later store labels at the end of the backend
   storage. If a memory backend file that was previously used as the
   backend of a vNVDIMM device without labels is now used for a
   vNVDIMM device with labels, the data in the label area at the end
   of the file becomes inaccessible to the guest. If any useful data
   (e.g. the metadata of a file system) was stored there, the latter
   usage may result in guest data corruption (e.g. breakage of the
   guest file system).

Hotplug
-------

QEMU v2.8.0 and later implement hotplug support for vNVDIMM devices.
As with RAM hotplug, vNVDIMM hotplug is accomplished by the two
monitor commands "object_add" and "device_add".

For example, the following commands add another 4GB vNVDIMM device to
the guest:

 (qemu) object_add memory-backend-file,id=mem2,share=on,mem-path=new_nvdimm.img,size=4G
 (qemu) device_add nvdimm,id=nvdimm2,memdev=mem2

Note:

1. Each hotplugged vNVDIMM device consumes one memory slot. Users
   should always ensure that the memory option "-m ...,slots=N"
   specifies enough slots, i.e.
     N >= number of RAM devices +
          number of statically plugged vNVDIMM devices +
          number of hotplugged vNVDIMM devices

2. A similar constraint applies to the memory option
   "-m ...,maxmem=M", i.e.
     M >= size of RAM devices +
          size of statically plugged vNVDIMM devices +
          size of hotplugged vNVDIMM devices

Alignment
---------

QEMU uses mmap(2) to map vNVDIMM backends and aligns the mapping
address to the page size (getpagesize(2)) by default. However, some
types of backends may require an alignment different from the page
size. For such backends, QEMU v2.12.0 and later provide the 'align'
option of memory-backend-file, which allows users to specify the
proper alignment.

For example, device DAX requires 2 MB alignment, so the following QEMU
command line options can be used to make a device DAX (/dev/dax0.0)
the backend of a vNVDIMM:

 -object memory-backend-file,id=mem1,share=on,mem-path=/dev/dax0.0,size=4G,align=2M
 -device nvdimm,id=nvdimm1,memdev=mem1
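
The slot and maxmem requirements listed in the Hotplug notes are easy
to get wrong when several vNVDIMM devices are planned. The following
is a minimal sketch of a pre-launch check, not something QEMU
provides. The names parse_size and check_memory_options are
hypothetical helpers, sizes are assumed to carry a binary suffix
(K, M, G, T), and a bare number is treated as bytes, which is an
assumption of this sketch and not QEMU's parsing rule for every
option.

```python
# Hypothetical helper, not part of QEMU: checks the "slots" and "maxmem"
# rules from the Hotplug notes before a command line is assembled.

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Convert a size such as '4G' or '128K' to bytes.

    Assumption of this sketch: suffixes are binary units and a bare
    number means bytes.
    """
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def check_memory_options(ram_size, slots, maxmem,
                         static_nvdimms, hotplug_nvdimms, ram_devices=1):
    """Return a list of problems; an empty list means both rules hold.

    static_nvdimms and hotplug_nvdimms are lists of size strings, one
    entry per vNVDIMM device present at startup or planned for hotplug.
    """
    problems = []

    # N >= RAM devices + statically plugged + hotplugged vNVDIMM devices
    needed_slots = ram_devices + len(static_nvdimms) + len(hotplug_nvdimms)
    if slots < needed_slots:
        problems.append(
            "slots=%d is too small, need at least %d" % (slots, needed_slots))

    # M >= size of RAM + size of static vNVDIMMs + size of hotplugged ones
    needed_bytes = parse_size(ram_size)
    needed_bytes += sum(parse_size(s) for s in static_nvdimms)
    needed_bytes += sum(parse_size(s) for s in hotplug_nvdimms)
    if parse_size(maxmem) < needed_bytes:
        problems.append(
            "maxmem=%s is too small, need at least %d bytes"
            % (maxmem, needed_bytes))

    return problems


if __name__ == "__main__":
    # One 4G RAM device, one static 4G vNVDIMM, one 4G vNVDIMM to hotplug.
    for line in check_memory_options("4G", 2, "8G", ["4G"], ["4G"]):
        print(line)
```

Run as a script, the example reports both rules as violated, because
the planned hotplug device needs a third slot and another 4G of
maxmem. Counting planned hotplug devices up front matters because
"slots" and "maxmem" are fixed by the "-m" option at startup.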