[Qemu-devel] Re: [PATCH v3 0/2] Inter-VM shared memory PCI device


From: Avi Kivity
Subject: [Qemu-devel] Re: [PATCH v3 0/2] Inter-VM shared memory PCI device
Date: Thu, 25 Mar 2010 19:02:16 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc12 Thunderbird/3.0.3

On 03/25/2010 06:50 PM, Cam Macdonell wrote:

Please put the spec somewhere publicly accessible with a permanent URL.  I
suggest a new qemu.git directory specs/.  It's more important than the code
IMO.
Sorry to be pedantic: do you want a URL, or the spec as part of a patch
that adds it as a file in qemu.git/docs/specs/?

I leave it up to you. If you are up to hosting it independently, then just post a URL as part of the patch. Otherwise, I'm sure qemu.git will be more than happy to be the official repository for the memory sharing device specification. In that case, make the spec the first patch in the series.

Possible later extensions:
- multiple doorbells that trigger different vectors
- multicast doorbells
Since the doorbells are exposed the multicast could be done by the
driver.  If multicast is handled by qemu, then we have different
behaviour when using ioeventfd/irqfd since only one eventfd can be
triggered by a write.

Multicast by the driver would require one exit per guest signalled. Multicast by the shared memory server needs one exit to signal an eventfd, then the shared memory server signals the irqfds of all members of the multicast group.
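To make the server-side variant concrete, here is a minimal user-space sketch, assuming plain Linux eventfds stand in for both the doorbell ioeventfd and the members' irqfds. The names multicast_forward() and multicast_demo() are hypothetical and are not part of qemu, kvm, or the posted patches; in a real setup the doorbell would be rung by the guest's single exiting write rather than by a write() call.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical server-side step: wait for one doorbell event, then signal
 * every member of the multicast group. Returns the number of members
 * signalled, or -1 on error. */
int multicast_forward(int doorbell_fd, const int *member_fds, int n)
{
    uint64_t count;
    const uint64_t one = 1;

    /* Blocks until the doorbell has been rung at least once. */
    if (read(doorbell_fd, &count, sizeof(count)) != sizeof(count))
        return -1;

    for (int i = 0; i < n; i++) {
        if (write(member_fds[i], &one, sizeof(one)) != sizeof(one))
            return -1;
    }
    return n;
}

/* Stand-alone demo: ring the doorbell once, forward to n members, and
 * count how many members observe exactly one pending event. */
int multicast_demo(int n)
{
    enum { MAX_MEMBERS = 16 };
    int members[MAX_MEMBERS];
    int created = 0, delivered = -1;
    const uint64_t one = 1;
    uint64_t seen;
    int doorbell;

    assert(n >= 0 && n <= MAX_MEMBERS);

    doorbell = eventfd(0, 0);
    if (doorbell < 0)
        return -1;
    while (created < n) {
        members[created] = eventfd(0, EFD_NONBLOCK);
        if (members[created] < 0)
            goto out;
        created++;
    }

    /* Stands in for the single guest exit that rings the doorbell. */
    if (write(doorbell, &one, sizeof(one)) != sizeof(one))
        goto out;
    if (multicast_forward(doorbell, members, n) != n)
        goto out;

    delivered = 0;
    for (int i = 0; i < n; i++) {
        if (read(members[i], &seen, sizeof(seen)) == sizeof(seen) && seen == 1)
            delivered++;
    }
out:
    for (int i = 0; i < created; i++)
        close(members[i]);
    close(doorbell);
    return delivered;
}
```

The sender pays for one doorbell write regardless of group size; the fan-out loop runs in the server, outside any guest.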

The semantics of the value written to the doorbell depend on whether the
device is using MSI or a regular pin-based interrupt.

I recommend against making the semantics interrupt-style dependent.  It
means the application needs to know whether MSI is in use or not, while it
is generally the OS that is in control of that.
It is basically the use of the status register that is the difference.
  The application view of what is happening doesn't need to change,
especially with UIO: write to doorbells, block on read until interrupt
arrives.  In the MSI case I could set the status register to the
vector that is received and then the two would be equivalent from the view
of the application.  But, if future MSI support in UIO allows MSI
information (such as vector number) to be accessible in userspace,
then applications would become MSI dependent anyway.

Ah, I see.  You adjusted for the different behaviours in the driver.

Still, I recommend dropping the status register: this allows single-MSI and PIRQ to behave the same way. It is also racy: if two guests signal a third, they will overwrite each other's status.

ioeventfd/irqfd are an implementation detail.  The spec should not depend on
them.  It needs to be written as if qemu and kvm do not exist.  Again, I
recommend Rusty's virtio-pci for inspiration.

Applications should see exactly the same thing whether ioeventfd is enabled
or not.
The challenge I recently encountered with this is one line in the
eventfd implementation

from kvm/virt/kvm/eventfd.c

/* MMIO/PIO writes trigger an event if the addr/val match */
static int
ioeventfd_write(struct kvm_io_device *this, gpa_t addr, int len,
         const void *val)
{
     struct _ioeventfd *p = to_ioeventfd(this);

     if (!ioeventfd_in_range(p, addr, len, val))
         return -EOPNOTSUPP;

     eventfd_signal(p->eventfd, 1);
     return 0;
}

IIUC, no matter what value is written to an ioeventfd by a guest, a
value of 1 is written.  So ioeventfds work differently than eventfds.
Can we add a "multivalue" flag to ioeventfds so that the value that
the guest writes is written to eventfd?

Eventfd values are a counter, not a register. A read() on the other side returns the sum of all write()s (or eventfd_signal()s). In the context of irqfd it just means the number of interrupts we coalesced.
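The counter behaviour is easy to check from user space. The following is a small sketch using the Linux eventfd(2) call in its default (non-semaphore) mode; the helper name eventfd_sum_demo() is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Write each value in vals[] to a fresh eventfd, then read it once.
 * Returns the value seen by the reader, or 0 on error. */
uint64_t eventfd_sum_demo(const uint64_t *vals, int n)
{
    uint64_t seen = 0;
    int fd;

    assert(vals != NULL && n > 0);

    fd = eventfd(0, 0);   /* counter starts at 0 */
    if (fd < 0)
        return 0;

    for (int i = 0; i < n; i++) {
        /* Each write() adds its 8-byte value to the counter. */
        if (write(fd, &vals[i], sizeof(vals[i])) != sizeof(vals[i])) {
            close(fd);
            return 0;
        }
    }

    /* One read() returns the accumulated sum and resets the counter. */
    if (read(fd, &seen, sizeof(seen)) != sizeof(seen))
        seen = 0;
    close(fd);
    return seen;
}
```

Writing 1, 5 and 7 and then reading once yields 13, not 7: the individual values are not recoverable on the reading side, which is why an eventfd cannot carry a per-sender status word.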

Multivalue was considered at one time for a different need and rejected. Really, to solve the race you need a queue, and that can only be done in the shared memory segment using locked instructions.
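As an illustration of what such a queue could look like, here is a sketch of a bounded multi-producer, single-consumer ring built on C11 atomics (which compile to locked instructions on x86). It is not taken from the patches or from any spec: the struct layout, the names, and the ring size are all assumptions, and it further assumes the structure sits in the shared segment, that every peer maps it cache-coherently, and that the atomics are lock-free. The per-slot sequence number scheme follows the well-known bounded-queue design by Dmitry Vyukov.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8u
static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0,
              "RING_SLOTS must be a power of two");

struct ring_slot {
    atomic_uint seq;     /* says whether the slot is free or filled */
    uint32_t sender_id;  /* payload: which peer rang the doorbell */
};

struct notify_ring {
    atomic_uint head;    /* next position a producer may claim */
    unsigned tail;       /* next position to read; one consumer only */
    struct ring_slot slots[RING_SLOTS];
};

void ring_init(struct notify_ring *r)
{
    atomic_init(&r->head, 0);
    r->tail = 0;
    for (unsigned i = 0; i < RING_SLOTS; i++)
        atomic_init(&r->slots[i].seq, i);
}

/* Any number of senders may call this concurrently.
 * Returns false if the ring is full. */
bool ring_push(struct notify_ring *r, uint32_t sender_id)
{
    unsigned pos = atomic_load_explicit(&r->head, memory_order_relaxed);

    for (;;) {
        struct ring_slot *s = &r->slots[pos & (RING_SLOTS - 1)];
        unsigned seq = atomic_load_explicit(&s->seq, memory_order_acquire);
        int diff = (int)(seq - pos);

        if (diff == 0) {
            /* Slot is free: claim it with a compare-and-swap on head. */
            if (atomic_compare_exchange_weak_explicit(
                    &r->head, &pos, pos + 1,
                    memory_order_relaxed, memory_order_relaxed)) {
                s->sender_id = sender_id;
                /* Publish the payload to the consumer. */
                atomic_store_explicit(&s->seq, pos + 1,
                                      memory_order_release);
                return true;
            }
            /* CAS failed: pos now holds the current head; retry. */
        } else if (diff < 0) {
            return false;   /* consumer has not freed this slot yet */
        } else {
            pos = atomic_load_explicit(&r->head, memory_order_relaxed);
        }
    }
}

/* Called only by the receiving guest. Returns false if nothing is pending. */
bool ring_pop(struct notify_ring *r, uint32_t *sender_id)
{
    struct ring_slot *s = &r->slots[r->tail & (RING_SLOTS - 1)];
    unsigned seq = atomic_load_explicit(&s->seq, memory_order_acquire);

    if (seq != r->tail + 1)
        return false;
    *sender_id = s->sender_id;
    /* Hand the slot back to producers for the next lap. */
    atomic_store_explicit(&s->seq, r->tail + RING_SLOTS,
                          memory_order_release);
    r->tail++;
    return true;
}
```

With something along these lines, each sender pushes its id before ringing the doorbell, and the receiver drains the ring on interrupt, so two guests signalling a third no longer overwrite each other.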

--
error compiling committee.c: too many arguments to function




