Re: [Qemu-devel] [PATCH 05/10] block: add block job transactions


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH 05/10] block: add block job transactions
Date: Tue, 30 Jun 2015 17:20:25 +0100
User-agent: Mutt/1.5.23 (2014-03-12)

On Mon, Jun 29, 2015 at 06:38:13PM -0400, John Snow wrote:
> On 06/25/2015 08:12 AM, Stefan Hajnoczi wrote:
> > +/* The purpose of this is to keep txn alive until all jobs have been added */
> > +void block_job_txn_begin(BlockJobTxn *txn)
> > +{
> > +    block_job_txn_unref(txn);
> > +}
> > +
> 
> Maybe it's not entirely clear to the caller that "begin" may in fact
> delete the BlockJobTxn. Shall we update the caller's pointer to NULL in
> this case as a hint? Passing a **txn will imply that we are giving up
> our ownership of the object.

Good idea.
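
Something like this, as a rough untested sketch of your suggestion (reusing
block_job_txn_unref() from this patch):

/* Takes ownership of the caller's reference.  Clears the caller's pointer
 * so the txn cannot be used after it may already have been freed.
 */
void block_job_txn_begin(BlockJobTxn **txn)
{
    block_job_txn_unref(*txn);
    *txn = NULL;
}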

> > +/* Cancel all other jobs in case of abort, wake all waiting jobs in case of
> > + * successful completion.  Runs from main loop.
> > + */
> > +static void block_job_txn_complete(BlockJob *job, void *opaque)
> > +{
> > +    BlockJobTxn *txn = opaque;
> > +    BlockJob *other_job;
> > +    bool aborting = txn->aborting;
> > +
> > +    qemu_mutex_lock(&txn->lock);
> > +    txn->ref++; /* keep txn alive until the end of this loop */
> > +
> > +    QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
> > +        AioContext *ctx;
> > +
> > +        qemu_mutex_unlock(&txn->lock);
> > +        ctx = bdrv_get_aio_context(other_job->bs);
> > +        aio_context_acquire(ctx);
> > +
> > +        /* Cancel all other jobs if aborting.  Don't cancel our own failed job
> > +         * since cancellation throws away the error value.
> > +         */
> > +        if (aborting && other_job != job) {
> > +            block_job_cancel(other_job);
> > +        } else {
> > +            block_job_enter(other_job);
> > +        }
> > +
> > +        aio_context_release(ctx);
> > +        qemu_mutex_lock(&txn->lock);
> > +    }
> > +
> > +    qemu_mutex_unlock(&txn->lock);
> > +    block_job_txn_unref(txn);
> > +}
> > +
> 
> Maybe we can name this one along the lines of
> block_job_txn_complete_impl or something clearly internal, so that we
> can name the public interface simply "block_job_txn_complete."
> 
> Maybe I'm just bike shedding, but calling only a "prepare to X" function
> without a matching "X" call in the exposed API seems odd.
> 
> I suppose it doesn't matter, because I can't think of anything nicer :)
> 
> > +void coroutine_fn block_job_txn_prepare_to_complete(BlockJobTxn *txn,
> > +                                                    BlockJob *job,
> > +                                                    int ret)

I'll rename the function to:
void coroutine_fn block_job_txn_job_done(BlockJobTxn *txn, BlockJob *job, int ret)

"complete" is already overloaded.  block_job_completed() is called by
jobs to terminate.  block_job_complete() is called by the monitor to
nudge a waiting job to finish.  They are two different things.  Maybe
using the term in BlockJobTxn makes things more confusing.

(It's possible to drop the txn argument since it can be fetched from
job->txn, but that makes the boundary between BlockJob and BlockJobTxn
harder to understand.)
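
For illustration (hypothetical call sites; job->txn is assumed to exist as
in this series), the two options would look like:

    block_job_txn_job_done(job->txn, job, ret);   /* explicit txn argument */
    block_job_txn_job_done(job, ret);             /* txn taken from job->txn */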

> > +{
> > +    if (!txn) {
> > +        return;
> > +    }
> > +
> > +    qemu_mutex_lock(&txn->lock);
> > +
> > +    /* This function is entered in 3 cases:
> > +     *
> > +     * 1. Successful job completion - wait for other jobs
> > +     * 2. First failed/cancelled job in txn - cancel other jobs and wait
> > +     * 3. Subsequent cancelled jobs - finish immediately, don't wait
> > +     */
> > +    trace_block_job_txn_prepare_to_complete_entry(txn, job, ret,
> > +                                                  block_job_is_cancelled(job),
> > +                                                  txn->aborting,
> > +                                                  txn->jobs_pending);
> > +
> > +    if (txn->aborting) { /* Case 3 */
> > +        assert(block_job_is_cancelled(job));
> > +        goto out; /* already cancelled, don't yield */
> > +    }
> > +
> 
> So the first failure forces all jobs not-yet-complete to cancel. Is
> there any chance for a race condition of two jobs completing almost
> simultaneously, where the first fails and the second completes, and the
> 2nd job makes it here before it gets canceled?
> 
> BOD: I really assume the answer is "no," but it's not immediately
> evident to me.

Good catch!  It is possible if the 2nd job is entered after the 1st job
yielded but before block_job_txn_complete() is called.  There would have
to be a callback pending for the 2nd job.

It can be fixed by adding a new case for jobs that call
block_job_txn_prepare_to_complete() after block_job_txn_complete() has
already been scheduled.  The job should simply wait for
block_job_txn_complete().
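
Roughly the shape I have in mind, as an untested sketch on top of the code
above ("txn->completing" is a new flag that would be set by whichever job
schedules block_job_txn_complete(); the name is provisional):

    if (txn->completing) { /* Case 4: completion already scheduled */
        /* Another job has already completed successfully and
         * block_job_txn_complete() is pending in the main loop.  Don't
         * treat this as a new completion; just wait until the callback
         * enters this job again.
         */
        qemu_mutex_unlock(&txn->lock);
        block_job_yield(job);
        return;
    }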

I'll add a test case for this scenario.
