qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v8 7/7] virtio-console: Enable port throttling w


From: Amit Shah
Subject: Re: [Qemu-devel] [PATCH v8 7/7] virtio-console: Enable port throttling when chardev is slow to consume data
Date: Mon, 6 Dec 2010 15:41:02 +0530
User-agent: Mutt/1.5.21 (2010-09-15)

On (Mon) Dec 06 2010 [09:35:10], Paul Brook wrote:
> > On (Thu) Dec 02 2010 [17:31:36], Paul Brook wrote:
> > > > > > when there's a partial write, it tries to do a write again, which
> > > > > > will fail with -EAGAIN.
> > > > > 
> > > > > Doesn't that cause the first partial chunk to be incorrectly
> > > > > transmitted twice? You may only return EAGAIN if no data was
> > > > > transmitted.
> > > > 
> > > > Except for the fact that no caller of qemu_chr_write() resubmits (or
> > > > even checks) partial writes.
> > > 
> > > I don't buy this argument. The current implementation of qemu_chr_write
> > > never generates transient failures, so they don't need to.
> > 
> > And applying this patch won't change the situation.
> 
> Sure it will. The whole point of the patch is to allow transient failures 
> (i.e. avoid blocking) when writing to char backends.  You should expect to 
> have to modify the device code to cope with this.

Looks like we're talking of two different cases.  I'm talking here of
current code that uses qemu chardevs and that it'll continue working
fine with this patchset (ie. changes required only to code that wants
-EAGAIN returns).

> As with the DMA interface added a while ago, I believe it's important to get 
> these APIs right.  Quick hacks to support limited use-cases just end up 
> needing a complete rewrite (or even worse multiple concurrent 
> APIs/implementations) once we actually start using them seriously.

Sure.  My proposal is for qemu_chr_write() to succeed all the time.  If
the backend can block, and the caller can handle it, it can get a
-EAGAIN (or WSAEWOULDBLOCK) return.  When the backend becomes writable,
the chardev can call the ->writes_unblocked() callback for that caller.
Individual callers need not bother about re-submitting partial writes,
since the buffering can be done in common code in one place
(qemu-char.c).

My previous implementation for leaving out the buffering details to
individual users of qemu chardevs was OK'ed by you but not Anthony.

> I'm extremely reluctant to add a new layer of buffering that is not visible 
> to 
> ether the kernel or the device.  In practice we still need to be able to 
> split 
> oversize requests, so handling small split requests should be pretty much 
> free.

So do you propose to propagate this -EAGAIN error all the way to the
guest?  That won't work for older virtio guests (for virtio, host and
guest changes will be needed).

> > What I proposed in the earlier mail was to buffer only the data that has
> > to be re-submitted in case the caller is capable of stopping further
> > output till the char layer indicates it's free to start again.
> 
> That's case (b) below.
> 
> > > Once data has been transmitted, we have three options:
> > > 
> > > a) Block until the write completes. This makes the whole patch fairly
> > > pointless as host and guest block boundaries are unlikely to align.
> > 
> > This is what currently happens and will remain so for callers of
> > qemu_chr_write() which don't have a .write_unblocked() pointer assigned
> > in the char dev struct.
> 
> Obviously if the device doesn't supply an unbocked() hook then the behavior 
> is 
> unchanged.  That's trivially uninteresting.  I'm talking about devices that 
> do 
> provide the unblocked() hook.
> 
> > > b) Store the data on the side somewhere. Tell the device all data has
> > > been sent, and arrange for this data to be flushed before accepting any
> > > more data. This is bad because it allows the guest to allocate
> > > arbitrarily large[1] buffers on the host. i.e. a fairly easily
> > > exploitable DoS attack.
> > 
> > With virtio-serial, this is what's in use.  The buffer is limited to the
> > length of the vq (which is a compile-time constant) and there also is
> > the virtio_serial_throttle_port() call that tells the guest to not send
> > any more data to the host till the char layer indicates it's OK to send
> > more data.
> 
> No.
> 
> Firstly you're assuming all users are virtio based. That may be all you care 
> about, but is not acceptable if you want to get this code merged.

OK, but it's assumed that once a -EAGAIN is returned, the caller will
take appropriate actions to restrict the data sent.  Especially,
send_all has:

    if (chr->write_blocked) {
        /*
         * We don't handle this situation: the caller should not send
         * us data while we're blocked.
         *
         * We could buffer this data here but that'll only encourage
         * bad behaviour on part of the callers.
         *
         * Also, the data already in fd's buffers isn't easily
         * migratable.  If we want full migration support, all the
         * data landing here needs to be buffered and on migration,
         * anything that's unsent needs to be transferred to the
         * dest. machine (which again isn't a very good way of solving
         * the problem, as the src may become writable just during
         * migration and the reader could receive some data twice,
         * essentially corrupting the data).
         */
        return -1;
    }

> Secondly, the virtqueue only restricts the number of direct ring buffer 
> entries. It does not restrict the quantity of data each ring entry points to.

But that's entirely in guest memory, so it's limited to the amount of
RAM that has been allocated to the guest.

> As a side note, I notice that the virtio-serial-buf code is already 
> allocating 
> buffers and calling iov_to_buf on arbitrary sized requests. This is wrong for 
> the same reason. Don't do it.

You mean the code that allocates one buffer from the iov?

> > > c) Return a partial write to the guest. The guest already has to handle
> > > retries due to EAGAIN, and DMA capable devices already have to handle
> > > partial mappings, so this doesn't seem too onerous a requirement. This
> > > is not a new concept, it's the same as the unix write(2)/send(2)
> > > functions.
> > 
> > This isn't possible with the current vq design.
> 
> You need to fix that then.  I'm fairly sure it must be possible as virtio-blk 
> has to handle similar problems.

This was one of the items that was debated during the lead-up to
virtio-serial merge:  whether a write() call in the guest means data has
been flushed out to the host chardev or if the guest has just passed it
on to the host to take care of it.  We merged the latter approach.

I'll check with what virtio-blk does, but I guess the block layer has
semantics for allowing such usage whereas the char layer doesn't.

                Amit



reply via email to

[Prev in Thread] Current Thread [Next in Thread]