qemu-devel

Re: [Qemu-devel] nbd: Possible regression in 2.9 RCs


From: Kevin Wolf
Subject: Re: [Qemu-devel] nbd: Possible regression in 2.9 RCs
Date: Thu, 6 Apr 2017 10:48:24 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On 05.04.2017 at 23:13, Paolo Bonzini wrote:
> On 05/04/2017 13:01, Kevin Wolf wrote:
> > On 04.04.2017 at 17:09, Paolo Bonzini wrote:
> >> On 04/04/2017 16:53, Kevin Wolf wrote:
> >>>> The big question is how this fits into release management.  We have
> >>>> another important regression from the op blocker work and only a week
> >>>> to go before the last rc.  Are we going to delay 2.9 arbitrarily?  Are
> >>>> we going to shorten the 2.10 development period correspondingly?  (I
> >>>> vote yes and yes, FWIW).
> >>> Which is the other regression?
> >>
> >> The assertion failure for snapshot_blkdev with iothreads.
> > 
> > Ah, right, I keep forgetting that this started appearing with the op
> > blocker series because the failure mode is completely different, so it
> > seems to have been a latent bug somewhere else that was uncovered by it.
> > 
> > If we're sure that the change of the order in bdrv_append() is what
> > caused the bug to appear, we can just undo that for 2.9, at the cost of
> > a messed up graph in the error case when bdrv_set_backing_hd() fails
> > (because we have no way to undo bdrv_replace_node()).
> 
> I don't know if that is enough to fix all of the issues, but the bug is
> easy to reproduce.
> 
> The issue is the lack of understanding of what node movement does to
> quiesce_counter.  The invariant is that children cannot have a lower
> quiesce_counter than parents, I think (paths in the graph can only join
> in the children direction, right?).

Maybe I'm missing something, but I think this isn't true at all. Drains
are propagated to the parents, so that this specific node doesn't
receive new requests, but not to the children. The assumption is that
children don't do anything anyway without requests from their parents,
so they are effectively quiesced even with quiesce_counter == 0.

So if anything, the invariant should be the exact opposite: Parents
cannot have a lower quiesce_counter than their children.

I think what the quiesce_counter of a node is expected to be, exactly,
is the number of paths from itself to an explicitly drained node in the
directed block driver graph (counting one path if it is explicitly
drained itself). A path counts multiple times if the node at its end is
explicitly drained multiple times.
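
To make the path-counting definition concrete, here is a minimal toy
model in Python. It is not QEMU code: the graph, the node names and the
quiesce_counter() helper are made up for illustration, and it assumes the
graph is acyclic with edges pointing from parent to child.

```python
def quiesce_counter(children, explicit_drains, node):
    """Count paths from `node` to explicitly drained nodes.

    A node that is explicitly drained itself contributes one path per
    explicit drain; every path through a child adds that child's count.
    Assumes an acyclic graph (illustrative model only).
    """
    count = explicit_drains.get(node, 0)
    for child in children.get(node, []):
        count += quiesce_counter(children, explicit_drains, child)
    return count


# Hypothetical graph: edges point from parent to child.
children = {
    "top_a": ["overlay"],
    "top_b": ["base"],
    "overlay": ["base"],
    "base": [],
}
# "base" is explicitly drained once, "overlay" twice.
explicit_drains = {"base": 1, "overlay": 2}

counters = {n: quiesce_counter(children, explicit_drains, n) for n in children}

# Under this definition, a parent never has a lower counter than a child.
for parent, kids in children.items():
    for kid in kids:
        assert counters[parent] >= counters[kid]
```

In this model the counter of a parent is its own explicit drains plus
the sum of its children's counters, which is why the "parents are never
lower than children" direction falls out automatically.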

> Is it checked, and are there violations already?  Maybe we need a
> get_quiesce_counter method in BdrvChildRole, to cover BlockBackend's
> quiesce_counter?  Then we can use that information to adjust the
> quiesce_counter when nodes move in the graph.

We would need that if we had a downwards propagation and if a
BlockBackend could be drained, but as it stands, I don't see what could
be missing from bdrv_replace_child_noperm() (well, except that I think
your patch is right to avoid calling drained_end/begin if both nodes
were drained because new requests could sneak in this way in theory).
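
As a rough sketch of why skipping the callbacks matters, the following
Python toy compares two ways of notifying a parent when its child is
replaced. The function names are hypothetical and the drained state is
reduced to a boolean; this is not the actual bdrv_replace_child_noperm()
logic, only the ordering problem in isolation.

```python
def naive_replace(old_drained, new_drained):
    """Always end the old drain, then begin the new one."""
    calls = []
    if old_drained:
        calls.append("drained_end")
    if new_drained:
        calls.append("drained_begin")
    return calls


def careful_replace(old_drained, new_drained):
    """Only notify the parent when the drained state actually changes."""
    calls = []
    if old_drained and not new_drained:
        calls.append("drained_end")
    if new_drained and not old_drained:
        calls.append("drained_begin")
    return calls


# Both nodes drained: the naive variant briefly undrains the parent,
# which is the window in which new requests could sneak in.
print(naive_replace(True, True))
print(careful_replace(True, True))
```

With both nodes drained, the naive variant emits drained_end followed
by drained_begin, while the careful variant emits nothing and the parent
stays quiesced throughout.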

> The block layer has good tests, but as the internal logic grows more
> complex we should probably have more C level tests.  I'm constantly
> impressed by the amount of tricky cases that test-replication.c catches
> in the block job code.

I never really noticed test-replication specifically catching things
when I worked on the op blockers code, which changed a lot around block
jobs, but it is probably a good point that we should consider this type
of test more often.

Kevin


