[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v6 22/42] block: Fix bdrv_get_allocated_file_size's fallback
From: |
Max Reitz |
Subject: |
Re: [PATCH v6 22/42] block: Fix bdrv_get_allocated_file_size's fallback |
Date: |
Thu, 14 Nov 2019 14:11:27 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.1 |
On 11.09.19 13:00, Max Reitz wrote:
> On 11.09.19 12:31, Kevin Wolf wrote:
>> Am 11.09.2019 um 12:00 hat Max Reitz geschrieben:
>>> On 11.09.19 10:27, Kevin Wolf wrote:
>>>> Am 11.09.2019 um 09:37 hat Max Reitz geschrieben:
>>>>> On 11.09.19 08:55, Kevin Wolf wrote:
>>>>>> Well, by default the primary child, which should cover like 90% of the
>>>>>> drivers?
>>>>>
>>>>> Hm, yes.
>>>>>
>>>>> But I still think that the drivers that do not want to count every
>>>>> single non-COW child are the exception.
>>>>
>>>> They are, but drivers that want to count more than their primary node
>>>> are exceptions, too. And I think you're more likely to remember adding
>>>> the callback when you want to have a certain feature, not when you don't
>>>> want to have it.
>>>>
>>>> I really think we're likely to forget adding the callback where we need
>>>> to disable the feature.
>>>
>>> Well, I mean, we did forget adding it for qcow2.
>>
>> I'm afraid I have to agree. So the conclusion is that we won't get it
>> right anyway?
>>
>>>> I can see two options that should address both of our views:
>>>>
>>>> 1. Just don't have a fallback at all, make the callback mandatory and
>>>> provide implementations in block.c that can be referred to in
>>>> BlockDriver. Not specifying the callback causes an assertion failure,
>>>> so we'd hopefully notice it quite early (assuming that we run either
>>>> 'qemu-img info' or 'query-block' on a configuration with the block
>>>> driver, but I think that's faily safe to assume).
>>>
>>> Hm. Seems a bit much, but if we can’t agree on what’s a good general
>>> implementation that works for everything, this is probably the only
>>> thing that would actually keep us from forgetting to add special cases.
>>>
>>> Though I actually don’t know. I’d probably add two globally available
>>> helpers, one that returns the sum of everything but the backing node,
>>> and one that just returns the primary node.
>>
>> Yes, I think this is the same as I meant by "provide implementations in
>> block.c".
>>
>>> Now if I were to make qcow2 use the primary node helper function, would
>>> we have remembered changing it once we added a data file?
>>>
>>> Hmm. Maybe not, but it should be OK to just make everything use the sum
>>> helper, except the drivers that want the primary node. That should work
>>> for all cases. (I think that whenever a format driver suddenly gains
>>> more child nodes, we probably will want to count them. OTOH, everything
>>> that has nodes that shouldn’t be counted probably always wants to use
>>> the primary node helper function from the start.)
>>
>> The job filter nodes have only one child currently, which should be
>> counted. We'll add other children that shouldn't be counted only later.
>>
>> But we already have an idea of what possible extensions look like, so we
>> can probably choose the right function from the start.
>
> Yep.
>
>>>> 2. Make the 90% solution a 100% solution: Allow drivers to have multiple
>>>> storage children (for vmdk) and then have the fallback add up the
>>>> primary child plus all storage children. This is what I suggested as
>>>> the documented semantics in my initial reply to this patch (that you
>>>> chose not to answer).
>>>
>>> I didn’t answer that because I didn’t disagree.
>>>
>>>> Adding the size of storage children covers qcow2 and vmdk.
>>>
>>> That’s of course exactly what we’re trying to do, but the question is,
>>> how do we figure out that storage children? Make it a per-BdrvChild
>>> attribute? That seems rather heavy-handed, because I think we’d need it
>>> only here.
>>
>> Well, you added bdrv_storage_child().I'd argue this interface is wrong
>
> Yes, it probably is.
>
>> because it assumes that only one storage child exists. You just didn't
>> implement it for vmdk so that the problem didn't become apparent. It
>> would have to return a list rather than a single child. So fixing the
>> interface and then using it is what I was thinking.
>>
>> Now that you mention a per-BdrvChild attribute, however, I start to
>> wonder if the distinction between COW children, filter children, storage
>> children, metadata children, etc. isn't really what BdrvChildRole was
>> supposed to represent?
>
> That’s a good point.
>
>> Maybe we want to split off child_storage from child_file, though it's
>> not strictly necessary for this specific case because we want to treat
>> both metadata and storage nodes the same. But it could be useful for
>> other users of bdrv_storage_child(), if there are any.
>
> Possible. Maybe it turns out that at least for this series I don’t need
> bdrv_storage_child() at all.
>
>>>> As the job filter won't declare the target or any other involved
>>>> nodes their storage nodes (I hope), this will do the right thing for
>>>> them, too.
>>>>
>>>> For quorum and blkverify both ways could be justifiable. I think they
>>>> probably shouldn't declare their children as storage nodes. They are
>>>> more like filters that don't have a single filtered node. So some
>>>> kind of almost-filters.
>>>
>>> I don’t think quorum is a filter, and blkverify can only be justified to
>>> be a filter because it quits qemu when there is a mismatch.
>>>
>>> The better example is replication, but that has a clear filtered child
>>> (the primary node).
>>>
>>>
>>> So all in all I think it’s best to make the callback mandatory and add
>>> two global helper functions. That’s simple enough and should prevent
>>> us from making mistakes by forgetting to adjust something in the
>>> future.
>>
>> Yes, that should work.
>>
>> We should probably still figure out what the relationship between the
>> child access functions and child roles is, even if we don't need it for
>> this solution. But it feels like an important part of the design.
>
> Hm. It feels like something that should be done before this series,
> actually.
>
> So I think we should add at least a child role per child access function
> so that they match? And then maybe in bdrv_attach_child() assert that a
> BDS never has more than one primary or filtered child (a filtered child
> acts as a primary child, too), or more than one COW child. (And that
> these are always in bs->file or bs->backing so the child access
> functions do work.)
I’ve been trying to make this work, but I don’t think it does. It just
feels all wrong and I need up with things like
“child_metadata_and_data”. The last straw was that blkverify should
have the raw file be the filtered child (because, well, it’s bs->file),
but then the format file would need to be a non-filtered child, and
those would default to BDRV_O_PROTOCOL (which we decidedly don’t want).
Anyway, I’m currently attempting to solve this differently:
BdrvChildRole isn’t suitable for the job, I think. The name is
completely what we want, but it actually doesn’t look like something
that describes the child role to me.
Instead, I’m introducing a new BdrvChildRole enum mask that describes
how the child is going to be used: stay-at-node, cow, metadata, data, etc.
I’m going to rename the current BdrvChildRole structure to
BdrvChildParent (in want of a better name), because really most of what
it does is describe the parent, but precisely not the child. I’m moving
.stay_as_node to the new BdrvChildRole enum.
I hope this lets me unify child_file, child_backing, and child_format
into a child_of_bds object. The callbacks should then decide the
particularities based on the BdrvChildRole enum.
Hope that makes sense. (? :S)
At least I feel much happier implementing it this way, which I suppose
is a good sign.
Max
signature.asc
Description: OpenPGP digital signature
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: [PATCH v6 22/42] block: Fix bdrv_get_allocated_file_size's fallback,
Max Reitz <=