f3518400 | 24-Oct-2024 | Junjie Mao <junjie.mao@hotmail.com>
rust: introduce alternative implementation of offset_of!
offset_of! was stabilized in Rust 1.77.0. Use an alternative implementation that was found on the Rust forums, and whose author agreed to license it as MIT for use in QEMU.
The alternative allows only one level of field access, but apart from this can be used just by replacing core::mem::offset_of! with qemu_api::offset_of!.
The actual implementation of offset_of! is done in a declarative macro, but for simplicity and to avoid introducing an extra level of indentation, the trigger is a procedural macro #[derive(offsets)].
The procedural macro is perhaps a bit overengineered, but it helps introduce some idioms that will be useful in the future as well.
Signed-off-by: Junjie Mao <junjie.mao@hotmail.com>
Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
646b5378 | 08-Aug-2024 | Paolo Bonzini <pbonzini@redhat.com>
rust: fix cfgs of proc-macro2 for 1.63.0
Replay the configuration that would be computed by build.rs when compiling on a 1.63.0 compiler.
Reviewed-by: Junjie Mao <junjie.mao@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ebdede64 | 18-Jun-2024 | Stefano Garzarella <sgarzare@redhat.com>
libvhost-user: mask F_INFLIGHT_SHMFD if memfd is not supported
libvhost-user will panic when receiving VHOST_USER_GET_INFLIGHT_FD message if MFD_ALLOW_SEALING is not defined, since it's not able to create a memfd.
VHOST_USER_GET_INFLIGHT_FD is used only if VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD is negotiated. So, let's mask that feature if the backend is not able to properly handle these messages.
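A minimal sketch of the idea (illustrative only, not the exact libvhost-user change; the helper name and the surrounding feature bookkeeping are assumptions):

#include <stdint.h>
#include "libvhost-user.h"   /* VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD */

static uint64_t supported_protocol_features(void)
{
    uint64_t features = 0;   /* other supported protocol feature bits elided */

#if defined(MFD_ALLOW_SEALING)
    /*
     * VHOST_USER_GET_INFLIGHT_FD needs a sealable memfd to back the
     * inflight region, so only advertise the feature when that can work.
     */
    features |= 1ULL << VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD;
#endif

    return features;
}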
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240618100043.144657-5-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
92b58bc7 | 18-Jun-2024 | Stefano Garzarella <sgarzare@redhat.com>
libvhost-user: fail vu_message_write() if sendmsg() is failing
In vu_message_write() we use sendmsg() to send the message header, then a write() to send the payload.
If sendmsg() fails we should avoid sending the payload, since we were unable to send the header.
This was discovered before fixing the issue addressed by the previous patch, where sendmsg() failed on macOS due to wrong parameters, but the frontend still sent the payload, which the backend then misinterpreted as a malformed header.
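The intended control flow is roughly the following (a minimal sketch with approximate names, not the verbatim vu_message_write() code; assumes POSIX sockets):

#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <unistd.h>

static bool send_header_then_payload(int sock, struct msghdr *hdr,
                                     const void *payload, size_t payload_len)
{
    ssize_t rc;

    do {
        rc = sendmsg(sock, hdr, 0);
    } while (rc < 0 && (errno == EINTR || errno == EAGAIN));

    if (rc <= 0) {
        /* The header was not sent, so do not send the payload either. */
        return false;
    }

    if (payload_len > 0) {
        do {
            rc = write(sock, payload, payload_len);
        } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
    }

    return rc > 0;
}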
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20240618100043.144657-4-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
52767e10 | 14-Feb-2024 | David Hildenbrand <david@redhat.com>
libvhost-user: Mark mmap'ed region memory as MADV_DONTDUMP
We already use MADV_NORESERVE to deal with sparse memory regions. Let's also set madvise(MADV_DONTDUMP), otherwise a crash of the process can result in us allocating all memory in the mmap'ed region for dumping purposes.
This change implies that the mmap'ed rings won't be included in a coredump. If ever required for debugging purposes, we could mark only the mapped rings MADV_DODUMP.
Ignore errors during madvise() for now.
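A minimal sketch of the mapping path (simplified; here the sparse mapping is expressed with mmap's MAP_NORESERVE flag, and error reporting is omitted):

#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

static void *map_region(int fd, off_t fd_offset, size_t len)
{
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_NORESERVE, fd, fd_offset);
    if (addr == MAP_FAILED) {
        return NULL;
    }

#ifdef MADV_DONTDUMP
    /* Best effort: a failure here is deliberately ignored for now. */
    madvise(addr, len, MADV_DONTDUMP);
#endif

    return addr;
}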
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-15-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
67f4f663 | 14-Feb-2024 | David Hildenbrand <david@redhat.com>
libvhost-user: Dynamically remap rings after (temporarily?) removing memory regions
Currently, we try to remap all rings whenever we add a single new memory region. That doesn't quite make sense, because we already map rings when setting the ring address, and panic if that goes wrong. Likely, that handling was simply copied from set_mem_table code, where we actually have to remap all rings.
Remapping all rings might require us to walk quite a lot of memory regions to perform the address translations. Ideally, we'd simply remove that remapping.
However, let's be a bit careful. There might be some weird corner cases where we temporarily remove a single memory region (e.g., to resize it), which the current handling would have coped with. Further, a ring might be located on hotplugged memory, and as the VM reboots, we might unplug that memory and hotplug memory again before the ring addresses are reset.
So let's unmap affected rings as we remove a memory region, and try dynamically mapping the ring again when required.
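Conceptually, it looks roughly like this (a sketch assuming libvhost-user's VuDev/VuVirtq/VuDevRegion types; the helper names are made up for illustration):

#include <stdbool.h>
#include <stdint.h>
#include "libvhost-user.h"

static bool ring_uses_region(const VuVirtq *vq, const VuDevRegion *r)
{
    uintptr_t lo = (uintptr_t)r->mmap_addr;
    uintptr_t hi = lo + r->mmap_offset + r->size;
    uintptr_t desc = (uintptr_t)vq->vring.desc;

    return desc >= lo && desc < hi;
}

static void unmap_rings_in_region(VuDev *dev, const VuDevRegion *r)
{
    for (unsigned i = 0; i < dev->max_queues; i++) {
        VuVirtq *vq = &dev->vq[i];

        if (vq->vring.desc && ring_uses_region(vq, r)) {
            /* Drop the stale translation; it is redone when the vq is used. */
            vq->vring.desc = NULL;
            vq->vring.avail = NULL;
            vq->vring.used = NULL;
        }
    }
}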
Acked-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-14-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2a290227 | 14-Feb-2024 | David Hildenbrand <david@redhat.com>
libvhost-user: Factor out vq usability check
Let's factor it out to prepare for further changes.
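For reference, the factored-out check is essentially a predicate of this shape (a sketch; the exact name and conditions in the tree may differ):

#include <stdbool.h>
#include "libvhost-user.h"

static inline bool vu_is_vq_usable(VuDev *dev, VuVirtq *vq)
{
    /* A queue is only usable if the device is healthy and its ring is set. */
    return !dev->broken && vq->vring.avail != NULL;
}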
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-13-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
b2b63008 | 14-Feb-2024 | David Hildenbrand <david@redhat.com>
libvhost-user: Use most of mmap_offset as fd_offset
In the past, QEMU would create memory regions that could partially cover hugetlb pages, making mmap() fail if we would use the mmap_offset as an fd_offset. For that reason, we never used the mmap_offset as an offset into the fd and instead always mapped the fd from the very start.
However, that can easily result in us mmap'ing a lot of unnecessary parts of an fd, possibly repeatedly.
QEMU nowadays does not create memory regions that partially cover huge pages -- it never really worked with postcopy. QEMU has handled merging of regions that partially cover huge pages (due to holes in boot memory) since 2018, in commit c1ece84e7c93 ("vhost: Huge page align and merge").
Let's be a bit careful and not unconditionally convert the mmap_offset into an fd_offset. Instead, let's simply detect the hugetlb size and pass as much as we can as fd_offset, making sure that we call mmap() with a properly aligned offset.
With QEMU and a virtio-mem device that is fully plugged (50GiB using 50 memslots) the qemu-storage daemon process consumes in the VA space 1281GiB before this change and 58GiB after this change.
================ Vhost user message ================
Request: VHOST_USER_ADD_MEM_REG (37)
Flags:   0x9
Size:    40
Fds: 59
Adding region 4
    guest_phys_addr: 0x0000000200000000
    memory_size:     0x0000000040000000
    userspace_addr:  0x00007fb73bffe000
    old mmap_offset: 0x0000000080000000
    fd_offset:       0x0000000080000000
    new mmap_offset: 0x0000000000000000
    mmap_addr:       0x00007f02f1bdc000
Successfully added new region
================ Vhost user message ================
Request: VHOST_USER_ADD_MEM_REG (37)
Flags:   0x9
Size:    40
Fds: 59
Adding region 5
    guest_phys_addr: 0x0000000240000000
    memory_size:     0x0000000040000000
    userspace_addr:  0x00007fb77bffe000
    old mmap_offset: 0x00000000c0000000
    fd_offset:       0x00000000c0000000
    new mmap_offset: 0x0000000000000000
    mmap_addr:       0x00007f0284000000
Successfully added new region
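The offset split can be sketched as follows (simplified; hugepagesize is assumed to have been detected from the fd beforehand, falling back to the normal page size for non-hugetlb memory):

#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>

static void *map_with_fd_offset(int fd, uint64_t mmap_offset, uint64_t size,
                                uint64_t hugepagesize)
{
    /* Keep the fd offset aligned; only the remainder stays in the mapping. */
    uint64_t fd_offset = mmap_offset & ~(hugepagesize - 1);
    uint64_t in_mapping = mmap_offset - fd_offset;

    void *addr = mmap(NULL, size + in_mapping, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_NORESERVE, fd, (off_t)fd_offset);
    if (addr == MAP_FAILED) {
        return NULL;
    }
    /* The region's data starts 'in_mapping' bytes into the new mapping. */
    return (uint8_t *)addr + in_mapping;
}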
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-12-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
a3c0118c | 14-Feb-2024 | David Hildenbrand <david@redhat.com>
libvhost-user: Speedup gpa_to_mem_region() and vu_gpa_to_va()
Let's speed up GPA to memory region / virtual address lookup. Store the memory regions ordered by guest physical addresses, and use binary search for address translation, as well as when adding/removing memory regions.
Most importantly, this will speed up GPA->VA address translation when we have many memslots.
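With the regions kept sorted by guest physical address, the lookup becomes a plain binary search, roughly (a sketch using libvhost-user's VuDevRegion fields, simplified):

#include <stdint.h>
#include "libvhost-user.h"

static VuDevRegion *gpa_to_mem_region(VuDev *dev, uint64_t gpa)
{
    uint32_t low = 0, high = dev->nregions;

    while (low < high) {
        uint32_t mid = low + (high - low) / 2;
        VuDevRegion *r = &dev->regions[mid];

        if (gpa < r->gpa) {
            high = mid;
        } else if (gpa >= r->gpa + r->size) {
            low = mid + 1;
        } else {
            return r;   /* gpa lies inside this region */
        }
    }
    return NULL;
}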
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-11-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
60ccdca4 | 14-Feb-2024 | David Hildenbrand <david@redhat.com>
libvhost-user: Factor out search for memory region by GPA and simplify
Memory regions cannot overlap; if we ever hit that case, something would be really flawed.
For example, when vhost code in QEMU decides to increase the size of memory regions to cover full huge pages, it makes sure to never create overlaps, and if there were overlaps, it would bail out.
QEMU commits 48d7c9757749 ("vhost: Merge sections added to temporary list"), c1ece84e7c93 ("vhost: Huge page align and merge") and e7b94a84b6cb ("vhost: Allow adjoining regions") added and clarified that handling, and why overlaps are impossible.
Consequently, each GPA can belong to at most one memory region; anything else would not make sense. Let's factor out our search to prepare for further changes.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-10-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
9c254cb4 | 14-Feb-2024 | David Hildenbrand <david@redhat.com>
libvhost-user: Don't search for duplicates when removing memory regions
We cannot have duplicate memory regions; if we did, something would be deeply flawed elsewhere. Let's just stop the search once we have found an entry.
We'll add more sanity checks when adding memory regions later.
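The removal loop can then simply return after handling the match, along these lines (a simplified sketch, not the verbatim code):

#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include "libvhost-user.h"

static void drop_region(VuDev *dev, uint64_t gpa, uint64_t size)
{
    for (uint32_t i = 0; i < dev->nregions; i++) {
        VuDevRegion *r = &dev->regions[i];

        if (r->gpa != gpa || r->size != size) {
            continue;
        }

        munmap((void *)(uintptr_t)r->mmap_addr, r->size + r->mmap_offset);
        memmove(r, r + 1, (dev->nregions - i - 1) * sizeof(*r));
        dev->nregions--;
        return;   /* duplicates cannot exist, so stop searching here */
    }
}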
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-9-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
c6f90b78 | 14-Feb-2024 | David Hildenbrand <david@redhat.com>
libvhost-user: Don't zero out memory for memory regions
dev->nregions always covers only valid entries. Stop zeroing out other array elements that are unused.
Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-8-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>