Re: [Qemu-block] Addressability of Block-jobs after bdrv_swap removal


From: John Snow
Subject: Re: [Qemu-block] Addressability of Block-jobs after bdrv_swap removal
Date: Thu, 17 Dec 2015 11:42:00 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0


On 12/17/2015 08:53 AM, Markus Armbruster wrote:
> Kevin Wolf <address@hidden> writes:
> 
>>> On 09.12.2015 at 21:59, John Snow wrote:
>>> I have a question about how the device name of block jobs gets reported
>>> now after the bdrv_swap removal patch.
>>>
>>> Previously, it appears that the job stayed attached to the root-node (if
>>> somewhat hackishly, apparently) so that we could at all stages report
>>> our name without having to cache the device we started under.
>>
>> Yes, I think that was effectively what happened.
>>
>> I don't remember the exact details, but what I remember from the
>> bdrv_swap() removal work is that it was confusing. After removing
>> bdrv_swap(), doing it either way turned out wrong because the job had to
>> stay at the same BDS and at the same time at the root, too, so that the
>> device name would continue to exist. Doing both is impossible without
>> swapping the C objects.
>>
>>> However, since QMP commands refer to block-jobs solely through their
>>> device name, do we have any cases post-removal where a job becomes
>>> "unreachable" through its advertised name?
>>>
>>> e.g.
>>>
>>> the block-job-ready event uses the device name to advertise which job
>>> has just converged. The user/client would then be responsible for
>>> sending qmp-block-job-complete device=XXX to finish the job when desired.
>>>
>>> I don't see one at a quick glance, but we don't have any cases where we
>>> perform any graph manipulation before we expect the user to interface
>>> with the job again, right?
>>>
>>> (It's always done right at completion time, at least for drive-mirror.
>>> Do any other jobs adjust the graph? If it's ever anything except right
>>> before completion time, we may lose the ability to pause/resume,
>>> set-speed, etc.)
>>>
>>> Does this sound about right, or have I fatally misunderstood the situation?
>>
>> Other jobs change the graph as well, but none do so before completion,
>> so I don't think there will be further user interaction. The only
>> interesting part so far is the device name that is sent in the
>> completion event.
>>
>>> (post-script: I was thinking of adding a unique per-job ID that could be
>>> reported alongside any events or errors where the job's device name was
>>> reported, so that users could provide this ID to find the job. Each BB
>>> would have a per-tree list of jobs with globally unique IDs, and
>>> regardless of what node the job was currently attached to, we could
>>> retrieve that job unambiguously. This would be useful if the above
>>> question reveals an API problem, or more generally for multiple block
>>> jobs where we'll need IDs to reference jobs anyway.)
>>
>> I actually introduced a job->id field, but we're not using it
>> consistently yet. I needed it so I could still access the right name
>> for the completion message, as I said above.
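
As a concrete sketch of what such a field could look like (the layout
below is illustrative only, not the actual QEMU definition):

    /* Illustrative only: the job carries its own stable name, so events
     * can keep reporting it no matter which node the job currently sits
     * on in the graph. */
    struct BlockJob {
        const BlockJobDriver *driver;
        BlockDriverState *bs;   /* node the job happens to be attached to */
        char *id;               /* stable identifier reported in QMP */
        /* ... remaining per-job state ... */
    };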
>>
>> What could turn out a bit nasty is that we have to maintain API
>> compatibility. The best option that I could think of so far is that we
>> change the current device name in all QMP commands into a block job ID
>> while still calling it 'device' for compatibility. If an ID isn't
>> specified in the command that starts a block job, the device_name is
>> used like today. If that default ID is already taken, we fail the
>> command; this doesn't impact compatibility because today you can't set
>> any ID other than the device_name and you can only have one job per
>> device.
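
A rough sketch of that defaulting rule (start_job_with_id() and
block_job_find() are made-up names here; blk_name(), error_setg() and
the glib helpers are the existing ones):

    /* Sketch only: default the job ID to the device name and refuse to
     * start a job under an ID that is already taken. */
    static BlockJob *start_job_with_id(const char *job_id, BlockBackend *blk,
                                       Error **errp)
    {
        BlockJob *job;

        if (!job_id) {
            job_id = blk_name(blk);        /* legacy default: device name */
        }
        if (block_job_find(job_id)) {      /* hypothetical lookup helper */
            error_setg(errp, "Job ID '%s' is already in use", job_id);
            return NULL;
        }

        job = g_new0(BlockJob, 1);
        job->id = g_strdup(job_id);
        return job;
    }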
>>
>> If we don't like the misnomer 'device' for the block job ID, we would
>> have to store whether the id == device_name default was applied and then
>> add a 'device' key in all return values and events where this was the
>> case.
>>
>>> Each BB would have a per-tree list of jobs with globally unique IDs
>>
>> Why that? Block jobs don't belong to a single BB. They are artificially
>> attached to a single BDS today, but design-wise that doesn't make a lot
>> of sense for any job other than streaming, because all other currently
>> existing jobs have a source and a target rather than a single node.
>>
>> Or maybe we should now introduce a completely new API for background jobs
>> (not just block jobs) and implement the old API on top of that. In the
>> long run we could move e.g. migration to the same API.
> 
> Yes, please.
> 
> We clearly need more general block jobs without the bogus 1:1 tie
> between job and "device" (whatever we make that mean).  We might be able
> to shoehorn them into the existing interface, but it won't be pretty,
> and it may well get complicated.
> 
> I think having even more general background jobs makes the most sense.
> I believe shoehorning them into the existing interface would go beyond
> ugly into "bad idea" territory.  Let's create a sane new interface
> instead.
> 
> We obviously have to keep the legacy block job interface working, but we
> should be able to limit it to just legacy block jobs.  Once you start
> creating new-style background jobs, you need to use the background job
> interface to manage them.  In particular, the legacy interface shows you
> only the legacy block jobs, i.e. not the complete picture.
> 

I wrote a little thing with my ideas on the situation.
Message-ID: <address@hidden>

The tl;dr is that I agree with you: let's create a new generic API for
"jobs" and leave the old block-job commands as legacy interfaces. Block
job commands will become subsets of "job" commands (i.e. sys=block as a
union/subclass key).
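
Roughly, something like this as an internal sketch (Job, JobSystem and
the 'sys' field are all hypothetical names, not existing code):

    /* Hypothetical: a generic job base that block jobs (and later e.g.
     * migration) would embed; 'sys' is what the union/subclass key in
     * QMP would map to. */
    typedef enum JobSystem {
        JOB_SYS_BLOCK,
        JOB_SYS_MIGRATION,
    } JobSystem;

    typedef struct Job {
        char *id;        /* globally unique, used by query/cancel/pause */
        JobSystem sys;   /* which subsystem owns the job */
        /* common state: status, progress, error, ... */
    } Job;

    typedef struct BlockJob {
        Job common;      /* generic part first, so a Job * can address it */
        /* block-specific state: nodes involved, speed limit, ... */
    } BlockJob;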

Where I think I (may) differ is that instead of the legacy block job
commands showing an incomplete picture, I would like them to return an
error if they cannot show you the complete picture.

This avoids the heartache of using one of the cool, shiny new commands
and then expecting the legacy query (etc) to give you a proper picture.
"All or nothing" should be easy to enforce, I think.

--js


