qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 1/4] block: Correct bs->growable


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH 1/4] block: Correct bs->growable
Date: Fri, 5 Sep 2014 15:13:42 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

Am 05.09.2014 um 14:46 hat Max Reitz geschrieben:
> On 05.09.2014 12:01, Kevin Wolf wrote:
> >Am 04.09.2014 um 22:01 hat Max Reitz geschrieben:
> >>On 20.08.2014 13:40, Kevin Wolf wrote:
> >>>Am 12.07.2014 um 00:23 hat Max Reitz geschrieben:
> >>>>Currently, the field "growable" in a BDS is set iff the BDS is opened in
> >>>>protocol mode (with O_BDRV_PROTOCOL). However, not every protocol block
> >>>>driver allows growing: NBD, for instance, does not. On the other hand,
> >>>>a non-protocol block driver may allow growing: The raw driver does.
> >>>>
> >>>>Fix this by correcting the "growable" field in the driver-specific open
> >>>>function for the BDS, if necessary.
> >>>>
> >>>>Signed-off-by: Max Reitz <address@hidden>
> >>>I'm not sure I agree with bs->growable = true for raw. It's certainly
> >>>true that the backend can technically provide the functionality that
> >>>writes beyond EOF grow the file. That's not the point of bs->growable,
> >>>though.
> >>>
> >>>The point of it was to _forbid_ it to grow even when it's technically
> >>>possible (non-file protocols weren't really a thing back then, apart
> >>>from vvfat, so the assumption was that it's always technically
> >>>possible). growable was introduced with bdrv_check_request(), which is
> >>>supposed to reject guest requests after the end of the virtual disk (and
> >>>this fixed a CVE, see commit 71d0770c). You're now disabling this check
> >>>for raw.
> >>>
> >>>I think we need to make sure that bs->growable is only set if it is
> >>>opened for an image that has drv->requires_growing_file set and
> >>>therefore not directly used by a guest.
> >>>
> >>>Well, except that with node-name a guest will be able to use any image
> >>>in the chain... Might this mean that it's really a BlockBackend
> >>>property?
> >>Okay, the more I sit at this problem, the harder it seems to get.
> >>The only thing I currently know for sure is that I disagree with
> >>Anthony's opinion in 71d0770c ("this patch makes the BlockDriver API
> >>guarantee that all requests are within 0..bdrv_getlength() which to
> >>me seems like a Good Thing").
> >>
> >>The initial point was to range check guest requests. In 2009, it may
> >>have been enough to statically check the BDS type (protocol or
> >>format) to know whether the guest directly accesses it (format) or
> >>not (protocol). However, this is no longer sufficient. Now, as far
> >>as I know, guests can access basically any BDS (as you yourself
> >>say). Therefore, it seems to me like it's impossible to determine
> >>whether the device should be marked growable or not when opening it.
> >>Instead, I think we have to check for each single request whether it
> >>comes from the guest or from within the block layer and do range
> >>checking only for the former case; but this should not be the task
> >>of the block layer core, but of the block devices instead.
> >>Theoretically, guests may write beyond the image end and grow it
> >>that way all they want, but in practice this should be limited by
> >>the block devices which have a fixed length.
> >What about block jobs, built-in NBD server, etc.?
> 
> Well, I honestly think they should do the check themselves as well.
> 
> >Also, we have many device models that I don't trust a bit and having one
> >central check is much easier to get right than n duplicates in each
> >device emulation.
> 
> That's why I (later) said that I do understand the concept of having
> a backup check.

Okay, then we're probably in violent agreement.

> I still think the block device drivers have to detect out-of-bounds
> accesses to be able to act differently than when just seeing an EIO
> from the block layer. The same applies to the NBD server; and then
> getting the range check into block jobs shouldn't hurt too much
> either.

True for devices, because they shouldn't apply rerror/werror to
invalid requests, but always report them to the guest. I'm not so sure
about the NBD server, I don't think it behaves differently for this kind
of errors.

> I don't think all those guest interfaces (block jobs are not really
> a guest interface, imho, so I'd even keep them out of this) can just
> rely on the block layer to do the check for them.

Block jobs and NBD servers are in the same category, I think. They are
somewhat more harmless than guest requests, but they are still outside
the core block layer and shouldn't be trusted.

> >>So I scrapped that. Instead, we can just test whether
> >>BDRV_O_PROTOCOL is given or not. If it is, the BDS is used from
> >>within the block layer; if it isn't, it probably isn't, and even if
> >>it is, the user just has to cope with activated range checks. That's
> >>at least the idea.
> >Making any difference based on BDRV_O_PROTOCOL is a dead end in
> >blockdev-add times, where users are fully expected to build up a BDS
> >graph node by node with separate blockdev-add invocations.
> 
> Right, but that just means the protection is already broken.

Is it really? I see how you can get BDSes _without_ BDRV_O_PROTOCOL
where you would have it when created automatically, so you're imposing
additional restriction (probably to the point of making the resulting
BDS useless), but can you actually bypass the protection?

> >>But this doesn't work either: You can create a protocol BDS and then
> >>pass it to the guest; on second thought, however, this is already
> >>possible, so I wouldn't bother about this. But on the other hand,
> >>this breaks 071 as well because 071 creates a (non-guest accessible,
> >>but it could be) non-protocol BDS and then tries to put blkdebug on
> >>top of that. I do know that this is not a general use case but it
> >>should work anyway.
> >>
> >>So, in my honest and very humble opinion, I'd reevaluate the
> >>usefulness of BDS.growable regarding it's original purpose and
> >>instead change it to be what this patch does.
> >What use case is left if you don't use it for range checks any more?
> >Shouldn't we remove rather than change it?
> >
> >Do your patches still need a per-BDS flag, or would a static per-driver
> >flag be enough?
> 
> Well, that's not so easy. See blkdebug (I know blkdebug is a rather
> bad example, because noone really uses it except for the iotests,
> but it's there) and raw (which is another bad example, because it's
> what we're talking about here and don't really know how to handle
> it): A blkdebug or raw BDS is growable iff the underlying file BDS
> is growable. We could signal this through a special value in the
> BDS, but enter blkverify (just as bad as blkdebug, I know): It's
> growable iff both the file BDS and the tested BDS are growable; a
> similar thing applies to quorum (which I, for whatever reason, did
> not touch during this series; probably I should do that in v2).

In short: Filters don't work this way. You're right.

Still begs the question if a callback doesn't make more sense than a BDS
field.

> >Also, should qcow2 over host_device print a warning? It's a legitimate
> >setup that is frequently used, but a host_device isn't really growable
> >on its own. We rely on management taking care of it.
> 
> I most certainly think it should warn. But I'll keep that in mind
> and reformulate the message (strip the "please fix your
> configuration" and just replace it by something like "write
> operations may fail").

Sounds reasonable.

> >>tl;dr, I see only two ways to go on: Either I wait until
> >>BlockBackend exists; or I simply leave this patch as it is because
> >>it's actually the block device driver's fault if an out-of-range
> >>request comes in from the guest. Since I remember talking about the
> >>former a year ago personally with you and Markus, I fear it'll still
> >>take a considerable amount of time. Therefore, I'm strongly in favor
> >>of the latter. If the block device drivers do their job, it won't
> >>break anything. If they don't, they should be fixed (and at least
> >>it'll be only raw that's broken).
> >Wait until BlockBackend exists. Markus seems to have switched from "do
> >everything sometime" to "do something now" and is working on it, so I
> >hope we'll be there soon (before KVM Forum) with the initial version.
> >That should be enough for your requirement.
> 
> Good.

I hope he agrees. ;-)

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]