From: Pavel Dovgalyuk
Subject: Re: [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay
Date: Mon, 15 Feb 2016 14:19:08 +0300

> From: Kevin Wolf [mailto:address@hidden
> Am 15.02.2016 um 10:14 hat Pavel Dovgalyuk geschrieben:
> > > From: Pavel Dovgalyuk [mailto:address@hidden
> > > > From: Kevin Wolf [mailto:address@hidden
> > > > > >
> > > > > > int blkreplay_co_readv()
> > > > > > {   
> > > > > >     BlockReplayState *s = bs->opaque;
> > > > > >     int reqid = s->reqid++;
> > > > > >
> > > > > >     bdrv_co_readv(bs->file, ...);
> > > > > >
> > > > > >     if (mode == record) {
> > > > > >         log(reqid, time);
> > > > > >     } else {
> > > > > >         assert(mode == replay);
> > > > > >         bool *done = req_replayed_list_get(reqid);
> > > > > >         if (done) {
> > > > > >             *done = true;
> > > > > >         } else {
> > > > > point A
> > > > > >             req_completed_list_insert(reqid, qemu_coroutine_self());
> > > > > >             qemu_coroutine_yield();
> > > > > >         }
> > > > > >     }
> > > > > > }
> > > > > >
> > > > > > /* called by replay.c */
> > > > > > int blkreplay_run_event()
> > > > > > {
> > > > > >     if (mode == replay) {
> > > > > >         co = req_completed_list_get(e.reqid);
> > > > > >         if (co) {
> > > > > >             qemu_coroutine_enter(co);
> > > > > >         } else {
> > > > > >             bool done = false;
> > > > > >             req_replayed_list_insert(reqid, &done);
> > > > > point B
> > > > > >             /* wait synchronously for completion */
> > > > > >             while (!done) {
> > > > > >                 aio_poll();
> > > > > >             }
> > > > > >         }
> > > > > >     }
> > > > > > }
> > > >
> > Now I've encountered a situation where blkreplay_run_event is called from
> > a read coroutine:
> > bdrv_prwv_co -> aio_poll -> qemu_clock_get_ns -> replay_read_clock -> blkreplay_run_event
> >            \--> bdrv_co_readv -> blkreplay_co_readv -> bdrv_co_readv (lower layer)
> >
> > bdrv_co_readv inside blkreplay_co_readv can't proceed in this situation.
> > This is probably because aio_poll has taken the aio context?
> > How can I resolve this?
> 
> First of all, I'm not sure if running replay events from
> qemu_clock_get_ns() is such a great idea. This is not a function that
> callers expect to change any state. If you absolutely have to do it
> there instead of in the clock device emulations, maybe restricting it to
> replaying clock events could make it a bit more harmless.

qemu_clock_get_ns() wants to read some clock data from the log.
While reading, it finds a block driver event and tries to process it.
These block events may occur at any moment, because blkreplay_co_readv()
writes them to the log as soon as it executes.

An alternative approach is to add these events to a queue and execute them
when a checkpoint is reached. I'm not sure that this is easy to implement with
coroutines.
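
A rough sketch of what such a queue could look like, leaving coroutines
aside. All names here (PendingQueue, pending_queue_defer,
pending_queue_flush, record_run) are hypothetical and not part of QEMU;
the fixed-size ring buffer is only an assumption to keep the example
short. The log reader would defer each block event instead of running
it, and the checkpoint handler would flush the queue in log order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is existing QEMU API. */
#define MAX_PENDING 64

typedef void (*block_event_fn)(int reqid, void *opaque);

typedef struct {
    int reqid;
    block_event_fn fn;
    void *opaque;
} PendingBlockEvent;

typedef struct {
    PendingBlockEvent items[MAX_PENDING];
    size_t head;   /* index of the oldest queued event */
    size_t count;  /* number of queued events */
} PendingQueue;

/* Called while reading the log: remember the block event, do not run it.
 * Returns 0 on success, -1 if the queue is full. */
static int pending_queue_defer(PendingQueue *q, int reqid,
                               block_event_fn fn, void *opaque)
{
    assert(q != NULL && fn != NULL);
    if (q->count == MAX_PENDING) {
        return -1;
    }
    size_t tail = (q->head + q->count) % MAX_PENDING;
    q->items[tail].reqid = reqid;
    q->items[tail].fn = fn;
    q->items[tail].opaque = opaque;
    q->count++;
    return 0;
}

/* Called at a checkpoint: run every queued event in the order it was
 * deferred. Each entry is removed before its callback runs, so a
 * callback may defer further events. Returns the number of events run. */
static size_t pending_queue_flush(PendingQueue *q)
{
    assert(q != NULL);
    size_t ran = 0;
    while (q->count > 0) {
        PendingBlockEvent ev = q->items[q->head];
        q->head = (q->head + 1) % MAX_PENDING;
        q->count--;
        ev.fn(ev.reqid, ev.opaque);
        ran++;
    }
    return ran;
}

/* Example callback: records the order in which events were run. */
typedef struct {
    int order[MAX_PENDING];
    size_t n;
} RunLog;

static void record_run(int reqid, void *opaque)
{
    RunLog *log = opaque;
    log->order[log->n++] = reqid;
}
```

With this shape, reading a clock value from the log would only call
pending_queue_defer() and so would not change any block layer state.
How the flush should interact with a yielded coroutine is exactly the
part I'm unsure about.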

> Anyway, what does "can't proceed" mean? The coroutine yields because
> it's waiting for I/O, but it is never reentered? Or is it hanging while
> trying to acquire a lock?

bdrv_co_io_em() (raw file read/write) yields and then waits indefinitely to
be reentered, because aio_poll hangs in a clock read.

> Calling the callbacks that reenter a yielded coroutine is generally the
> job of aio_poll(). After reentering the coroutine, blkreplay_run_event()
> should return back to its caller and therefore indirectly to aio_poll(),
> which should drive the events. Sounds like it should be working.

The yield occurred inside blkreplay_co_readv()->bdrv_co_readv(), before
the request was added to the queue.
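
To make the ordering concrete, here is a small model of the two lists
from the quoted pseudocode, without real coroutines or I/O. The names
(ReplayModel, model_request_completed, model_run_event, the ACT_*
values) are hypothetical, and the boolean arrays only stand in for
req_completed_list and req_replayed_list. It shows which branch each
side takes depending on who arrives first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the quoted pseudocode; not QEMU code. */
enum { MAX_REQS = 16 };

typedef enum {
    ACT_RESUME_REQUEST,  /* event found a parked request: reenter it */
    ACT_MUST_POLL,       /* event arrived first: caller has to wait */
    ACT_PARK_REQUEST,    /* request finished first: yield until the event */
    ACT_RELEASE_WAITER   /* request finished, the event was already waiting */
} ReplayAction;

typedef struct {
    bool request_parked[MAX_REQS]; /* stands in for req_completed_list */
    bool event_waiting[MAX_REQS];  /* stands in for req_replayed_list */
} ReplayModel;

/* The lower-layer read for reqid has returned in blkreplay_co_readv(). */
static ReplayAction model_request_completed(ReplayModel *m, int reqid)
{
    assert(reqid >= 0 && reqid < MAX_REQS);
    if (m->event_waiting[reqid]) {
        m->event_waiting[reqid] = false;
        return ACT_RELEASE_WAITER;
    }
    m->request_parked[reqid] = true;
    return ACT_PARK_REQUEST;
}

/* The replay log yields the block event for reqid. */
static ReplayAction model_run_event(ReplayModel *m, int reqid)
{
    assert(reqid >= 0 && reqid < MAX_REQS);
    if (m->request_parked[reqid]) {
        m->request_parked[reqid] = false;
        return ACT_RESUME_REQUEST;
    }
    m->event_waiting[reqid] = true;
    return ACT_MUST_POLL;
}
```

In the situation I described, the lower-layer read is still in flight
when the event is read from the log, so the request is not parked yet
and the event side ends up in the ACT_MUST_POLL case.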


Pavel Dovgalyuk