qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] assert during internal snapshot


From: Li, Liang Z
Subject: Re: [Qemu-devel] assert during internal snapshot
Date: Mon, 9 Nov 2015 02:28:09 +0000

> Hello, All!
> 
> This commit
> 
> commit 94f5a43704129ca4995aa3385303c5ae225bde42
> Author: Liang Li <address@hidden>
> Date:   Mon Nov 2 15:37:00 2015 +0800
> 
>      migration: defer migration_end & blk_mig_cleanup
> 
>      Because of the patch 3ea3b7fa9af067982f34b of kvm, which introduces a
>      lazy collapsing of small sptes into large sptes mechanism, now
>      migration_end() is a time consuming operation because it calls
>      memroy_global_dirty_log_stop(), which will trigger the dropping of small
>      sptes operation and takes about dozens of milliseconds, so call
>      migration_end() before all the vmsate data has already been transferred
>      to the destination will prolong VM downtime. This operation should be
>      deferred after all the data has been transferred to the destination.
> 
>      blk_mig_cleanup() can be deferred too.
> 
>      For a VM with 8G RAM, this patch can reduce the VM downtime about
> 30 ms.
> 
>      Signed-off-by: Liang Li <address@hidden>
>      Reviewed-by: Paolo Bonzini <address@hidden>
>      Reviewed-by: Juan Quintela <address@hidden>al3
>      Reviewed-by: Amit Shah <address@hidden>al3
>      Signed-off-by: Juan Quintela <address@hidden>al3
> 
> introduces the following regression
> 
> (gdb) bt
> #0  0x00007fd5d314a267 in __GI_raise (address@hidden)
>      at ../sysdeps/unix/sysv/linux/raise.c:55
> #1  0x00007fd5d314beca in __GI_abort () at abort.c:89
> #2  0x00007fd5d314303d in __assert_fail_base (
>      fmt=0x7fd5d32a5028 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
>      address@hidden "i != mr->ioeventfd_nb",
>      address@hidden "/home/den/src/qemu/memory.c",
>      address@hidden,
>      address@hidden <__PRETTY_FUNCTION__.32545>
> "memory_region_del_eventfd") at assert.c:92
> #3  0x00007fd5d31430f2 in __GI___assert_fail (
>      assertion=0x557288ed5b69 "i != mr->ioeventfd_nb",
>      file=0x557288ed5a36 "/home/den/src/qemu/memory.c", line=1731,
>      function=0x557288ed5fb0 <__PRETTY_FUNCTION__.32545>
> "memory_region_del_eventfd") at assert.c:101
> #4  0x0000557288b108fa in memory_region_del_eventfd
> (mr=0x55728ad83700,
>      addr=16, size=2, match_data=true, data=0, e=0x55728b21ff40)
>      at /home/den/src/qemu/memory.c:1731
> #5  0x0000557288d9fc18 in virtio_pci_set_host_notifier_internal (
>      proxy=0x55728ad82e80, n=0, assign=false, set_handler=false)
>      at hw/virtio/virtio-pci.c:178
> #6  0x0000557288da19a9 in virtio_pci_set_host_notifier (d=0x55728ad82e80,
> n=0,
>      assign=false) at hw/virtio/virtio-pci.c:984
> #7  0x0000557288b523df in virtio_scsi_dataplane_start (s=0x55728ad8afa0)
>      at /home/den/src/qemu/hw/scsi/virtio-scsi-dataplane.c:268
> #8  0x0000557288b50210 in virtio_scsi_handle_cmd (vdev=0x55728ad8afa0,
>      vq=0x55728b21ffc0) at /home/den/src/qemu/hw/scsi/virtio-scsi.c:574
> #9  0x0000557288b65cb7 in virtio_queue_notify_vq (vq=0x55728b21ffc0)
>      at /home/den/src/qemu/hw/virtio/virtio.c:966
> #10 0x0000557288b67bbf in virtio_queue_host_notifier_read
> (n=0x55728b220010)
>      at /home/den/src/qemu/hw/virtio/virtio.c:1643
> #11 0x0000557288e12a2b in aio_dispatch (ctx=0x55728acaeab0) at
> aio-posix.c:160
> #12 0x0000557288e03194 in aio_ctx_dispatch (source=0x55728acaeab0,
>      callback=0x0, user_data=0x0) at async.c:226
> #13 0x00007fd5d409fff7 in g_main_context_dispatch ()
>     from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> ---Type <return> to continue, or q <return> to quit---
> #14 0x0000557288e1110d in glib_pollfds_poll () at main-loop.c:211
> #15 0x0000557288e111e8 in os_host_main_loop_wait (timeout=0) at
> main-loop.c:256
> #16 0x0000557288e11295 in main_loop_wait (nonblocking=0) at main-
> loop.c:504
> #17 0x0000557288c1c31c in main_loop () at vl.c:1890
> #18 0x0000557288c23dec in main (argc=105, argv=0x7ffca9a6fa08,
>      envp=0x7ffca9a6fd58) at vl.c:4644
> (gdb)
> 
> during 'virsh create-snapshot' operation over alive VM.
> It happens 100% of time when VM is run using the following command line:
> 
>   7498 ?        tl     0:37 qemu-system-x86_64 -enable-kvm -name rhel7
> -S -machine pc-i440fx-2.2,accel=kvm,usb=off -cpu SandyBridge -m 1024 -
> realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -object
> iothread,id=iothread1 -uuid 456af4d3-5d67-41c6-a229-c55ded6098e9
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-
> shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot
> strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7
> -device
> ich9-usb-
> uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6
> -device
> ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1
> -device
> ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2
> -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x5 -device
> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
> file=/var/lib/libvirt/images/rhel7.qcow2,if=none,id=drive-scsi0-0-0-
> 0,format=qcow2,cache=none,aio=native
> -device
> scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-
> 0-0,bootindex=1
> -drive
> file=/home/den/tmp/CentOS-7.0-1406-x86_64-DVD.iso,if=none,id=drive-ide0-0-
> 0,readonly=on,format=raw
> -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev
> tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device
> virtio-net-
> pci,netdev=hostnet0,id=net0,mac=52:54:00:97:2f:1d,bus=pci.0,addr=0x3
> -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/rhel7.org.qe
> mu.guest_agent.0,server,nowait
> -device
> virtserialport,bus=virtio-
> serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agen
> t.0
> -chardev spicevmc,id=charchannel1,name=vdagent -device
> virtserialport,bus=virtio-
> serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0
> -chardev pty,id=charconsole0 -device
> virtconsole,chardev=charconsole0,id=console0 -device
> usb-tablet,id=input0 -spice
> port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device
> qxl-
> vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=
> pci.0,addr=0x2
> -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device
> hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev
> spicevmc,id=charredir0,name=usbredir -device
> usb-redir,chardev=charredir0,id=redir0 -chardev
> spicevmc,id=charredir1,name=usbredir -device
> usb-redir,chardev=charredir1,id=redir1 -chardev
> spicevmc,id=charredir2,name=usbredir -device
> usb-redir,chardev=charredir2,id=redir2 -chardev
> spicevmc,id=charredir3,name=usbredir -device
> usb-redir,chardev=charredir3,id=redir3 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -set
> device.scsi0.iothread=iothread1 -msg timestamp=on
> 
> Minimal dumb (!!!) fix is the following:
> 
> hades ~/src/qemu $ cat 1.diff
> diff --git a/migration/ram.c b/migration/ram.c index 25e9eeb..25ebf0d 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1100,7 +1100,8 @@ static void migration_bitmap_free(struct BitmapRcu
> *bmap)
>       g_free(bmap);
>   }
> 
> -static void migration_end(void)
> +extern void migration_end(void);
> +void migration_end(void)
>   {
>       /* caller have hold iothread lock or is in a bh, so there is
>        * no writing race against this migration_bitmap diff --git
> a/migration/savevm.c b/migration/savevm.c index dbcc39a..01da865 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -914,6 +914,7 @@ void qemu_savevm_state_cancel(void)
>       }
>   }
> 
> +extern void migration_end(void);
>   static int qemu_savevm_state(QEMUFile *f, Error **errp)
>   {
>       int ret;
> @@ -942,6 +943,8 @@ static int qemu_savevm_state(QEMUFile *f, Error
> **errp)
>           qemu_savevm_state_complete(f);
>           ret = qemu_file_get_error(f);
>       }
> +    migration_end();
> +
>       if (ret != 0) {
>           qemu_savevm_state_cancel();
>           error_setg_errno(errp, -ret, "Error while writing VM state"); hades
> ~/src/qemu $
> 
> (patch attached to start a discussion).
> 
> For me it looks like commit is wrong and should be reverted and reworked.
> 
> Den


Hi Den,

   I don't know the reason  for this issue,  could you please point out how can 
I reproduce the issue just with QEMU? with the fewest command option, I do not 
have the libvirt environment.

 Liang



reply via email to

[Prev in Thread] Current Thread [Next in Thread]