On Fri, Apr 5, 2024 at 3:59 PM Jonah Palmer <jonah.palmer@oracle.com> wrote:
On 4/4/24 12:33 PM, Eugenio Perez Martin wrote:
On Thu, Apr 4, 2024 at 4:42 PM Jonah Palmer <jonah.palmer@oracle.com> wrote:
On 4/4/24 7:35 AM, Eugenio Perez Martin wrote:
On Wed, Apr 3, 2024 at 6:51 PM Jonah Palmer <jonah.palmer@oracle.com> wrote:
On 4/3/24 6:18 AM, Eugenio Perez Martin wrote:
On Thu, Mar 28, 2024 at 5:22 PM Jonah Palmer <jonah.palmer@oracle.com> wrote:
Initialize sequence variables for VirtQueue and VirtQueueElement
structures. A VirtQueue's sequence variables are initialized when a
VirtQueue is being created or reset. A VirtQueueElement's sequence
variable is initialized when a VirtQueueElement is being initialized.
These variables will be used to support the VIRTIO_F_IN_ORDER feature.
A VirtQueue's used_seq_idx represents the next expected index in a
sequence of VirtQueueElements to be processed (put on the used ring).
The next VirtQueueElement added to the used ring must match this
sequence number before additional elements can be safely added to the
used ring. It's also particularly useful for helping find the number of
new elements added to the used ring.
A VirtQueue's current_seq_idx represents the current sequence index.
This value is essentially a counter where the value is assigned to a new
VirtQueueElement and then incremented. Given its uint16_t type, this
sequence number can be between 0 and 65,535.
A VirtQueueElement's seq_idx represents the sequence number assigned to
the VirtQueueElement when it was created. This value must match with the
VirtQueue's used_seq_idx before the element can be put on the used ring
by the device.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
---
hw/virtio/virtio.c | 18 ++++++++++++++++++
include/hw/virtio/virtio.h | 1 +
2 files changed, 19 insertions(+)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index fb6b4ccd83..069d96df99 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -132,6 +132,10 @@ struct VirtQueue
uint16_t used_idx;
bool used_wrap_counter;
+ /* In-Order sequence indices */
+ uint16_t used_seq_idx;
+ uint16_t current_seq_idx;
+
I'm having a hard time understanding the difference between these and
last_avail_idx and used_idx. It seems to me if we replace them
everything will work? What am I missing?
For used_seq_idx, it does work like used_idx except the difference is
when their values get updated, specifically for the split VQ case.
As you know, for the split VQ case, the used_idx is updated during
virtqueue_split_flush. However, imagine a batch of elements coming in
where virtqueue_split_fill is called multiple times before
virtqueue_split_flush. We want to make sure we write these elements to
the used ring in-order and we'll know its order based on used_seq_idx.
Alternatively, I thought about replicating the logic for the packed VQ
case (where this used_seq_idx isn't used) where we start looking at
vq->used_elems[vq->used_idx] and iterate through until we find a used
element, but I wasn't sure how to handle the case where elements get
used (written to the used ring) and new elements get put in used_elems
before the used_idx is updated. Since this search would require us to
always start at index vq->used_idx.
For example, say, of three elements getting filled (elem0 - elem2),
elem1 and elem0 come back first (vq->used_idx = 0):
elem1 - not in-order
elem0 - in-order, vq->used_elems[vq->used_idx + 1] (elem1) also now
in-order, write elem0 and elem1 to used ring, mark elements as
used
Then elem2 comes back, but vq->used_idx is still 0, so how do we know to
ignore the used elements at vq->used_idx (elem0) and vq->used_idx + 1
(elem1) and iterate to vq->used_idx + 2 (elem2)?
Hmm... now that I'm thinking about it, maybe for the split VQ case we
could continue looking through the vq->used_elems array until we find an
unused element... but then again how would we (1) know if the element is
in-order and (2) know when to stop searching?
Ok I think I understand the problem now. It is aggravated if we add
chained descriptors to the mix.
We know that the order of used descriptors must be the exact same as
the order they were made available, leaving out in order batching.
What if vq->used_elems at virtqueue_pop and then virtqueue_push just
marks them as used somehow? Two booleans (or flag) would do for a
first iteration.
If we go with this approach I think used_elems should be renamed actually.
If I'm understanding correctly, I don't think adding newly created
elements to vq->used_elems at virtqueue_pop will do much for us.
By knowing what descriptor id must go in each position of the used ring.
Following your example, let's say avail_idx is 10 at that moment.
Then, the driver makes available the three elements you mention, so:
used_elems[10] = elem0
used_elems[11] = elem1
used_elems[12] = elem2
Now the device uses elem1. virtqueue_push can search linearly for
elem->index in used_elems[used_idx]...used_elems[avail_idx] range. As
the device is mis-behaving, no need to optimize this behavior.
used_elems[11].index == elem->index, so we mark it as used somehow.
Let's say we add a boolean to VirtQueueElement.
virtqueue_flush can scan used_elems[used_idx]...used_elems[avail_idx]
looking for used elements. At this moment used_elems[used_idx] is not
used so it stops there.
Now elem0 is pushed. It is the first one in the
used_elems[used_idx]...used_elems[avail_idx] range, so we can write it
to the used ring at fill. used_idx++. We use the rest of the
descriptor until we find one in used_elems that is not used, which is
used_elems[12].
After that virtqueue_flush is called. At its scanning, used_elems[10]
is used, so it writes it to the used ring. After that, used_elems[11]
is also used, so it is written also. used_elems[12] is not used, so it
stops there.
Finally, elem2 is pushed, so used_elems[12] is written.
virtqueue_flush detects it, so it writes it to the guest's used ring.
Let me know what you think. I find it simpler than declaring new
indexes, but I may be wrong.
I think I see where you're getting at, but I just have a few clarifying
questions about your proposal here.
So you're proposing to add entries to used_elems at virtqueue_pop, ok.
avail_idx = 10, then the driver makes some new entries (elems) available
in the avail ring:
used_elems[10] = elem0
used_elems[11] = elem1
used_elems[12] = elem2
At this point, avail_idx = 13, used_idx = 10.
elem1 gets used first, ok.
Now, if I'm understanding correctly, you're saying that in
virtqueue_push explicitly (not virtqueue_fill/virtqueue_flush), we scan
used_elems[used_idx] - used_elems[avail_idx] to find used_elems[i].index
== elem->index and mark it as used, e.g. used_elems[i].used = true.
Okay, now used_elems[11] has been marked as used.
Now we make it to virtqueue_fill. What's the role you want
virtqueue_fill to take here (if any)?
Sorry I meant virtqueue_fill here.
You say that after we mark this element as used, we go to
virtqueue_flush and scan for used elements in used_elems[used_idx] -
used_elems[avail_idx]. Of course, the first one of this order will be in
used_elems[used_idx], which is currently showing the element as unused,
so we're done with this element for now.
So, what exactly is the role of virtqueue_flush here? I'm inferring here
that you want the virtqueue_flush role (for split VQs) to do both the
writing to the used ring (normally done in virtqueue_fill) as well as
updating the used_idx (normally done in virtqueue_flush). Is this correct?
I modelled this after the packed vq scenario, where all is updated at
_flush. But yes, I think you got it totally right.
Next, elem0 gets used second.
Again, in virtqueue_push we scan scan used_elems[used_idx] -
used_elems[avail_idx] to find used_elems[i].index == elem->index and
mark it as used. Okay, used_elems[10] has been marked as used.
Then you say, "It is the first one in the used_elems[used_idx] -
used_elems[avail_idx] range, so we can write it to the used ring at
fill. used_idx++. We use the rest of the descriptor until we find one in
used_elems that is not used, which is used_elems[12]."
This, to me, sounds like "in virtqueue_fill, when we find an order (e.g.
used_elems[used_idx].index == elem->index) write it to the used ring AND
increment the used_idx. Keep writing elements to the used ring if
used_elems[used_idx].used == true and, for each element being written,
incremented used_idx."
This is a bit confusing to me since next you say "After that,
virtqueue_flush is called. At its scanning, used_elems[10] is used, so
it writes it to the used ring. After that, used_elems[11] is also used,
so it is written also. used_elems[12] is not used, so it stops there."
This sounds very similar to what you proposed for virtqueue_fill, except
it looks like you're also saying to do this in virtqueue_flush, hence my
confusion.
If you wouldn't mind, could you clarify the roles of virtqueue_fill and
virtqueue_flush here for me? Thanks :)!
I see how they're confusing if following the split vq way, sorry!
* _fill: Only update used_elems (or equivalent)
* _flush: Write information to used ring or descriptor ring.
This makes it difficult to actually batch used descriptors. My
proposal is to address it in another series, by delegating it to the
caller and recovering proper usage of virtqueue_push(idx) parameter.
The caller can specify either to batch as usual, or to delegate the
automatic (and potentially inefficient) ordering I'm proposing here.
Just to be clear, for this series, you'd like me to implement a solution
that does *not* consider the case where virtqueue_fill is called
multiple times before virtqueue_flush (and to put a solution for this in
a separate series)?
No, it is supported. It is just not very efficient because of the linear search.
For it to be supported properly the caller should indicate
virtqueue_fill idx properly. But that requires modifications to all
devices, so I'm proposing to do it on top.