[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-stable] [Qemu-devel] [PATCH for 2.11] virtio-net: don't touch
From: |
Stefan Hajnoczi |
Subject: |
Re: [Qemu-stable] [Qemu-devel] [PATCH for 2.11] virtio-net: don't touch virtqueue if vm is stopped |
Date: |
Mon, 27 Nov 2017 11:19:25 +0000 |
User-agent: |
Mutt/1.9.1 (2017-09-22) |
On Mon, Nov 27, 2017 at 11:26:34AM +0800, Jason Wang wrote:
>
>
> On 2017年11月24日 18:44, Stefan Hajnoczi wrote:
> > On Fri, Nov 24, 2017 at 10:57:11AM +0800, Jason Wang wrote:
> > > On 2017年11月23日 18:59, Stefan Hajnoczi wrote:
> > > > On Thu, Nov 23, 2017 at 11:37:46AM +0800, Jason Wang wrote:
> > > > > Guest state should not be touched if VM is stopped, unfortunately we
> > > > > didn't check running state and tried to drain tx queue unconditionally
> > > > > in virtio_net_set_status(). A crash was then noticed as a migration
> > > > > destination when user type quit after virtqueue state is loaded but
> > > > > before region cache is initialized. In this case,
> > > > > virtio_net_drop_tx_queue_data() tries to access the uninitialized
> > > > > region cache.
> > > > >
> > > > > Fix this by only dropping tx queue data when vm is running.
> > > > hw/virtio/virtio.c:virtio_load() does the following:
> > > >
> > > > for (i = 0; i < num; i++) {
> > > > if (vdev->vq[i].vring.desc) {
> > > > uint16_t nheads;
> > > >
> > > > /*
> > > > * VIRTIO-1 devices migrate desc, used, and avail ring
> > > > addresses so
> > > > * only the region cache needs to be set up. Legacy
> > > > devices need
> > > > * to calculate used and avail ring addresses based on the
> > > > desc
> > > > * address.
> > > > */
> > > > if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
> > > > virtio_init_region_cache(vdev, i);
> > > > } else {
> > > > virtio_queue_update_rings(vdev, i);
> > > > }
> > > >
> > > > So the region caches should be initialized after virtqueue state is
> > > > loaded.
> > > >
> > > > It's unclear to me which code path triggers this issue. Can you add a
> > > > backtrace or an explanation?
> > > >
> > > > Thanks,
> > > > Stefan
> > > Migration coroutine was yield before region cache was initialized. The
> > > backtrace looks like:
> > [...]
> > > #16 0x0000555555b1c199 in vmstate_load_state (f=0x555556f7c010,
> > > vmsd=0x5555562b8160 <vmstate_virtio>, opaque=0x555557d68610, version_id=1)
> > > at migration/vmstate.c:160
> > > #17 0x0000555555865cc3 in virtio_load (vdev=0x555557d68610,
> > > f=0x555556f7c010, version_id=11) at
> > > /home/devel/git/qemu/hw/virtio/virtio.c:2110
> > Reviewed-by: Stefan Hajnoczi <address@hidden>
> >
> > Thanks for the backtrace! Your patch is fine but I have a larger
> > concern:
> >
> > The backtrace shows that the virtio code is re-entrant during savevm
> > load. That's probably a bad thing because set_status() and other APIs
> > are probably not intended to run while we are half-way through savevm
> > load. The virtqueue is only partially set up at this point :(. I
> > wonder if a more general cleanup is necessary to avoid problems like
> > this in the future...
> >
> > Stefan
>
> Yes, this needs some thought. An idea is to guarantee the atomicity of the
> virtio state and don't expose partial state. But looks like this needs lots
> of changes.
>
> Anyway, I will apply this patch first.
Sounds good, thanks!
Stefan
signature.asc
Description: PGP signature