Re: [RFC v3 3/3] vhost: Allocate memory for packed vring
From: Sahil Siddiq
Subject: Re: [RFC v3 3/3] vhost: Allocate memory for packed vring
Date: Wed, 13 Nov 2024 10:40:49 +0530
User-agent: Mozilla Thunderbird
Hi,
On 10/28/24 11:07 AM, Sahil Siddiq wrote:
[...]
The payload that VHOST_SET_VRING_BASE accepts depends on whether
split virtqueues or packed virtqueues are used [6]. In
hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_setup() [7], the following payload
is used which is not suitable for packed virtqueues:

struct vhost_vring_state s = {
    .index = vq_index,
};
Based on the implementation in the Linux kernel, the payload needs to
be as shown below for the ioctl to succeed for packed virtqueues:

struct vhost_vring_state s = {
    .index = vq_index,
    .num = 0x80008000,
};
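As far as I can tell from the vring ioctl handling in drivers/vhost/vdpa.c,
the "num" field packs both the avail and used ring state for packed
virtqueues. The snippet below is only my reading of that encoding (worth
double-checking against the kernel source); it shows why 0x80008000
corresponds to the initial state of a packed vring:

/*
 * My understanding of how the kernel decodes vring_state.num for
 * packed virtqueues:
 *   bits 0-14  : last_avail_idx
 *   bit  15    : last_avail wrap counter
 *   bits 16-30 : last_used_idx
 *   bit  31    : last_used wrap counter
 */
uint32_t num = (0 & 0x7fff)            /* last_avail_idx = 0 */
               | (1u << 15)            /* last_avail wrap counter = 1 */
               | ((0 & 0x7fff) << 16)  /* last_used_idx = 0 */
               | (1u << 31);           /* last_used wrap counter = 1 */
/* num == 0x80008000: both indices 0, both wrap counters 1 */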
After making these changes, it looks like QEMU is able to set up the
virtqueues and shadow virtqueues are enabled as well.
Unfortunately, before the L2 VM can finish booting, the kernel crashes.
The reason is that even though packed virtqueues are to be used, the
kernel tries to run
drivers/virtio/virtio_ring.c:virtqueue_get_buf_ctx_split() [8]
(instead of virtqueue_get_buf_ctx_packed()) and throws an "invalid vring
head" error. I am still investigating this issue.
I made a mistake here. virtqueue_get_buf_ctx_packed() [1] in the Linux
kernel also throws the same error. I think the issue might be that
hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_map_rings() [2] does not handle
mapping packed virtqueues at the moment.

Probably because of this, vq->packed.desc_state[id].data [1] is NULL in
the kernel.
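To show what I mean, this is roughly the relevant part of
hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_map_rings() as I read it today
(simplified and paraphrased, so the exact names may be slightly off).
It only knows about the split-ring driver and device areas reported by
vhost_svq_get_vring_addr(), so the packed descriptor ring and its event
suppression areas never get mapped:

struct vhost_vring_addr svq_addr;
DMAMap driver_region, device_region;
bool ok;

vhost_svq_get_vring_addr(svq, &svq_addr);

/* Driver area (descriptor + avail rings), read-only for the device */
driver_region = (DMAMap) {
    .translated_addr = svq_addr.desc_user_addr,
    .size = vhost_svq_driver_area_size(svq) - 1,
    .perm = IOMMU_RO,
};
ok = vhost_vdpa_svq_map_ring(v, &driver_region, errp);

/* Device area (used ring), read-write for the device */
device_region = (DMAMap) {
    .translated_addr = svq_addr.used_user_addr,
    .size = vhost_svq_device_area_size(svq) - 1,
    .perm = IOMMU_RW,
};
ok = vhost_vdpa_svq_map_ring(v, &device_region, errp);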
Regarding one of the earlier reviews in the same thread [3]:
On 8/7/24 9:52 PM, Eugenio Perez Martin wrote:
On Fri, Aug 2, 2024 at 1:22 PM Sahil Siddiq <icegambit91@gmail.com> wrote:
Allocate memory for the packed vq format and support
packed vq in the SVQ "start" and "stop" operations.
Signed-off-by: Sahil Siddiq <sahilcdq@proton.me>
---
[...]
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 4c308ee53d..f4285db2b4 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
[...]
@@ -672,6 +674,16 @@ size_t vhost_svq_device_area_size(const VhostShadowVirtqueue *svq)
     return ROUND_UP(used_size, qemu_real_host_page_size());
 }
+size_t vhost_svq_memory_packed(const VhostShadowVirtqueue *svq)
+{
+    size_t desc_size = sizeof(struct vring_packed_desc) * svq->num_free;
+    size_t driver_event_suppression = sizeof(struct vring_packed_desc_event);
+    size_t device_event_suppression = sizeof(struct vring_packed_desc_event);
+
+    return ROUND_UP(desc_size + driver_event_suppression + device_event_suppression,
+                    qemu_real_host_page_size());
+}
+
 /**
  * Set a new file descriptor for the guest to kick the SVQ and notify for avail
  *
@@ -726,17 +738,30 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
     svq->vring.num = virtio_queue_get_num(vdev, virtio_get_queue_index(vq));
     svq->num_free = svq->vring.num;
-    svq->vring.desc = mmap(NULL, vhost_svq_driver_area_size(svq),
-                           PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
-                           -1, 0);
-    desc_size = sizeof(vring_desc_t) * svq->vring.num;
-    svq->vring.avail = (void *)((char *)svq->vring.desc + desc_size);
-    svq->vring.used = mmap(NULL, vhost_svq_device_area_size(svq),
-                           PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
-                           -1, 0);
-    svq->desc_state = g_new0(SVQDescState, svq->vring.num);
-    svq->desc_next = g_new0(uint16_t, svq->vring.num);
-    for (unsigned i = 0; i < svq->vring.num - 1; i++) {
+    svq->is_packed = virtio_vdev_has_feature(svq->vdev, VIRTIO_F_RING_PACKED);
+
+    if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_RING_PACKED)) {
+        svq->vring_packed.vring.desc = mmap(NULL, vhost_svq_memory_packed(svq),
+                                            PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
+                                            -1, 0);
+        desc_size = sizeof(struct vring_packed_desc) * svq->vring.num;
+        svq->vring_packed.vring.driver = (void *)((char *)svq->vring_packed.vring.desc + desc_size);
+        svq->vring_packed.vring.device = (void *)((char *)svq->vring_packed.vring.driver +
+                                                  sizeof(struct vring_packed_desc_event));
This is a great start, but it will be problematic when you start
mapping the areas to the vdpa device. The driver area should be
read-only for the device, but it is placed in the same page as a RW
one. More on this later.
+    } else {
+        svq->vring.desc = mmap(NULL, vhost_svq_driver_area_size(svq),
+                               PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
+                               -1, 0);
+        desc_size = sizeof(vring_desc_t) * svq->vring.num;
+        svq->vring.avail = (void *)((char *)svq->vring.desc + desc_size);
+        svq->vring.used = mmap(NULL, vhost_svq_device_area_size(svq),
+                               PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
+                               -1, 0);
+    }
I think it will be beneficial to avoid "if (packed)" conditionals on
the exposed functions that give information about the memory maps.
These need to be replicated at
hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_map_rings.
Based on what I have understood, I'll need to have an if(packed)
condition in vhost_vdpa_svq_map_rings() because the mappings will
differ in the packed and split cases.
However, the current one depends on the driver area to live in the
same page as the descriptor area, so it is not suitable for this.
Right, for the split case, svq->vring.desc and svq->vring.avail are
mapped to the same page.
    svq->vring.desc = mmap(NULL, vhost_svq_driver_area_size(svq),
                           PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
                           -1, 0);
    desc_size = sizeof(vring_desc_t) * svq->vring.num;
    svq->vring.avail = (void *)((char *)svq->vring.desc + desc_size);
vhost_svq_driver_area_size() encompasses the descriptor area and avail ring.
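For reference, vhost_svq_driver_area_size() looks roughly like the sketch
below (paraphrased from the current code in
hw/virtio/vhost-shadow-virtqueue.c, so the details may be slightly off).
Since the descriptor and avail ring sizes are summed before rounding up
to the host page size, both rings end up in the same mapping:

size_t vhost_svq_driver_area_size(const VhostShadowVirtqueue *svq)
{
    size_t desc_size = sizeof(vring_desc_t) * svq->vring.num;
    /* avail ring plus the trailing used_event entry */
    size_t avail_size = offsetof(vring_avail_t, ring[svq->vring.num]) +
                        sizeof(uint16_t);

    return ROUND_UP(desc_size + avail_size, qemu_real_host_page_size());
}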
So what about this action plan:
1) Make the avail ring (or driver area) independent of the descriptor ring.
2) Return the mapping permissions of the descriptor area (not needed
here, but needed for hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_map_rings).
Does point 1 refer to mapping the descriptor and avail rings to separate
pages for both the split and packed cases? I am not sure if my
understanding is correct.
I believe, however, that this approach will make it easier to map the
rings in the vdpa device. It might also help in removing the if(packed)
condition in vhost_svq_start().
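To check my understanding of point 1, here is a rough, untested sketch
of what I think the split case inside vhost_svq_start() would look like,
with the avail ring mmap'd independently of the descriptor ring (the
names and sizes here are only illustrative):

/* Hypothetical sketch: give each area its own page-aligned mapping so
 * that each one can later be mapped into the vdpa device with its own
 * permissions.
 */
desc_size = ROUND_UP(sizeof(vring_desc_t) * svq->vring.num,
                     qemu_real_host_page_size());
svq->vring.desc = mmap(NULL, desc_size,
                       PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
                       -1, 0);

avail_size = ROUND_UP(offsetof(vring_avail_t, ring[svq->vring.num]) +
                      sizeof(uint16_t),
                      qemu_real_host_page_size());
svq->vring.avail = mmap(NULL, avail_size,
                        PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
                        -1, 0);

svq->vring.used = mmap(NULL, vhost_svq_device_area_size(svq),
                       PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
                       -1, 0);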
Thanks,
Sahil
[1] https://github.com/torvalds/linux/blob/master/drivers/virtio/virtio_ring.c#L1708
[2] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-vdpa.c#L1178
[3] https://lists.nongnu.org/archive/html/qemu-devel/2024-08/msg01145.html