From: Paolo Bonzini
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH v2 04/11] blockjobs: Always use block_job_get_aio_context
Date: Thu, 13 Oct 2016 11:03:37 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0


On 13/10/2016 02:49, John Snow wrote:
> As context to everyone else as to why I'm going down the rabbit hole of
> trying to remove external references to AioContext at all, see
> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg00795.html
> 
> On 10/07/2016 03:49 AM, Paolo Bonzini wrote:
>>
>>
>> On 06/10/2016 22:22, John Snow wrote:
>>> Calls .complete(), for which the only implementation is
>>> mirror_complete. Uh, this actually seems messy. Looks like there's
>>> nothing to prevent us from calling this after we've already told it to
>>> complete once.
>>
>> Yeah, it should have an
>>
>>     if (s->should_complete) {
>>         return;
>>     }
>>
>> at the beginning.  I have other mirror.c patches in my queue so I can
>> take care of that.
>>
> 
> Or something up the stack at block_job_complete so it's not up to job
> implementations? What if the next implementer "forgets."

Yeah, that would require moving s->should_complete up to BlockJob.
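
Something along these lines, as a rough sketch (the "completion_requested"
field is invented; it is basically mirror's s->should_complete hoisted into
BlockJob, and the real block_job_complete has other checks that I am
omitting here):

    /* Untested sketch: hoist the guard into blockjob.c so that job
     * implementations cannot forget it. */
    void block_job_complete(BlockJob *job, Error **errp)
    {
        Error *local_err = NULL;

        if (job->completion_requested) {
            error_setg(errp, "Job '%s' has already been asked to complete",
                       job->id);
            return;
        }
        job->driver->complete(job, &local_err);
        if (local_err) {
            error_propagate(errp, local_err);
            return;
        }
        job->completion_requested = true;
    }

That way the next implementer gets the double-completion check for free.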

>>> block_job_cancel and block_job_complete are different.
>>>
>>> block_job_cancel is called in many places, but we can just add a similar
>>> block_job_user_cancel if we wanted a version which takes care to acquire
>>> context and one that does not. (Or we could just acquire the context
>>> regardless, but Paolo warned me ominously that recursive locks are EVIL.
>>> He sounded serious.)
>>
>> Not that many places:
>>
>> - block_job_finish_sync calls it, and you can just release/reacquire
>> around the call to "finish(job, &local_err)".
> 
> This makes me a little nervous because we went through the trouble of
> creating this callback, but we're going to assume we know that it's a
> public interface that will take the lock for itself (or otherwise does
> not require a lock.)
> 
> In practice it works, but it seems needlessly mystifying in terms of
> proving correctness.

It's _much_ easier to assume that all callbacks take the lock
themselves.  It's counterintuitive (just like the idea that recursive
locks are bad :)), but the point is that if you second-guess the
callbacks and take the lock on their behalf, you might get the locking
order wrong.  This ends up with ugly AB-BA deadlocks.
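
Concretely, for block_job_finish_sync the relevant part would look
something like this (heavily simplified sketch, assuming the caller
holds the job's AioContext on entry):

    /* Simplified sketch: drop the AioContext around the callback so that
     * "finish" (block_job_cancel or block_job_complete) can acquire it
     * itself, like any other public entry point, without recursion. */
    AioContext *aio_context = blk_get_aio_context(job->blk);
    Error *local_err = NULL;

    aio_context_release(aio_context);
    finish(job, &local_err);
    aio_context_acquire(aio_context);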

>> - replication_stop is not acquiring s->secondary_disk->bs's AioContext.
> 
> Seems like a bug on their part. Would be fixed by having cancel acquire
> context for itself.

Yep.
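
The user-facing cancel you mention could then just do the acquisition
itself, something like this (sketch; the context could equally come from
block_job_get_aio_context for as long as that function exists):

    /* Sketch: acquire the job's AioContext internally, so callers such
     * as replication_stop cannot forget to do it. */
    void block_job_user_cancel(BlockJob *job)
    {
        AioContext *aio_context = blk_get_aio_context(job->blk);

        aio_context_acquire(aio_context);
        block_job_cancel(job);
        aio_context_release(aio_context);
    }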

> Yeah, I should be reffing it anyway.
> 
> The rest of this... What I think you mean is acquiring and releasing the
> context as needed for EACH of prepare, commit, abort, and clean as
> necessary, right?
> 
> And then in this case, it simply wouldn't be necessary for abort, as the
> sync cancel would do it for us.

Right.

> Alright.
> 
> Say I *do* push the acquisitions down into blockjob.c. What benefit does
> that provide? Won't I still need the block_job_get_aio_context()
> function (At least internally) to know which context to acquire? This
> would preclude you from deleting it.
> 
> Plus... we remove some fairly simple locking mechanisms and then inflate
> it tenfold. I'm not convinced this is an improvement.

The improvement is that you can now replace the huge lock with a smaller
one (you don't have to do it now, but you could).  Furthermore, the
small lock is (a) non-recursive and (b) a leaf lock.  This means that
you've removed a lot of opportunities for deadlock, and generally made
things easier to reason about.

Furthermore, with large locks you never know what they actually protect;
cue the confusion between aio_context_acquire/release and
bdrv_drained_begin/end.  They are also easier to forget if you make the
caller take them, as the replication example above shows.

Again, it can be counterintuitive that it's better, but it is. :)
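
To make the "small leaf lock" concrete, the shape I have in mind is
roughly this (field and usage invented purely for illustration):

    /* Illustrative only: a per-job mutex protecting just the job state
     * fields (busy, paused, pause_count, cancelled).  It is never held
     * while calling back into the block layer or into job callbacks,
     * so it stays non-recursive and a leaf in the lock hierarchy. */
    QemuMutex lock;    /* new field in BlockJob */

    void block_job_pause(BlockJob *job)
    {
        qemu_mutex_lock(&job->lock);
        job->pause_count++;
        qemu_mutex_unlock(&job->lock);
    }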

> (1) QMP interface for job management
> (2) bdrv_drain_all, in block/io.c
> 
> (1) AFAICT, the QMP interface is concerned with assuring itself it has
> unique access to the BlockJob structure itself, and it doesn't really
> authentically care about the AIOContext itself -- just race-free access
> to the Job.

Yes.  The protection you need here is mostly against concurrent
completion of the job (and concurrent manipulation of fields such as
job->busy, job->paused, etc.).
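
So the QMP side keeps the same shape it has today in blockdev.c, just
with whatever lock ends up protecting job state underneath (sketch, from
memory of what qmp_block_job_complete does):

    /* Sketch of a QMP handler: look up the job, take the lock that
     * protects job state (currently the BlockBackend's AioContext,
     * later perhaps a per-job lock), and only then touch the job. */
    AioContext *aio_context;
    BlockJob *job = find_block_job(device, &aio_context, errp);

    if (!job) {
        return;
    }
    block_job_complete(job, errp);    /* or cancel/pause/resume */
    aio_context_release(aio_context);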

> This is not necessarily buggy today because, even though we grab the
> BlockBackend's context unconditionally, we already know the main/monitor
> thread is not accessing the blockjob. It's still silly, though.
> 
> (2) bdrv_drain_all appears to be worried about the same thing; we just
> need to safely deliver pause/resume messages.
> 
> I'm less sure about where this can run from, and suspect that if the job
> has deferred to main that this could be buggy. If bdrv_drain_all is
> called from context A and the job is running on context M having
> deferred from B, we may lock against context B (from A) and have racey
> access from between A/M. (Maybe?)

bdrv_drain_all is BQL-only (because it acquires more than one AioContext).

> Would you be terribly offended if I left this patch as-is for now and we
> can work on removing the AioContext locks afterwards, or are you adamant
> that I cooperate in getting block_job_get_aio_context removed before I
> add more usages?

Removing block_job_get_aio_context is necessary to fix dataplane bugs,
so I'd really prefer to have that in 2.8.  But why do you actually need
this patch at all?  You could do neither thing (neither push the lock
down nor add more usages), which is always best. :)

Paolo


