
Re: [Qemu-devel] QEMU/KVM SCSI lock up


From: Avi Kivity
Subject: Re: [Qemu-devel] QEMU/KVM SCSI lock up
Date: Thu, 03 Apr 2008 11:38:08 +0300
User-agent: Thunderbird 2.0.0.12 (X11/20080226)

Matteo Frigo wrote:
kvm-64 hangs under heavy disk I/O with scsi disks.  To reproduce,
create a fresh qcow2 disk, boot linux, and execute

  dd if=/dev/sdX of=/dev/null bs=1M

on the fresh disk.  See also 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1895893&group_id=180599

I have attached a patch that appears to fix the problem.  The bug
seems to be as follows.  scsi_read_data() executes these steps in order:

    bdrv_aio_read()
    r->sector += n;
    r->sector_count -= n;

For reasons that I do not fully understand, bdrv_aio_read() does
not return immediately, but instead it calls scsi_read_data()
recursively.

What happens (I think) is that bdrv_aio_read() completes immediately and calls the completion callback, which starts a read for the next batch of sectors.


Since ``r->sector += n;'' has not been executed
yet, the re-entrant call triggers a read of the same sector, which
breaks the producer-consumer lockstep.  The fix is to swap the operations
as follows:

    r->sector += n;
    r->sector_count -= n;
    bdrv_aio_read()

A similar fix applies to scsi_write_data().
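
The ordering problem described above can be reproduced in a toy model. This is a minimal sketch, not QEMU code: toy_req, toy_aio_read(), toy_read_data(), toy_run(), CHUNK and MAX_LOG are all hypothetical names, and toy_aio_read() simply assumes that the read completes synchronously and re-enters the read path, as the thread suspects bdrv_aio_read() does. In the swapped order the sketch saves the start sector in a local variable before advancing the counters; that detail is an assumption of the sketch, not something taken from the attached patch.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK   4    /* sectors per request (arbitrary toy value) */
#define MAX_LOG 8    /* cap so the original order cannot recurse forever */

struct toy_req {
    long   sector;         /* next sector to read */
    long   sector_count;   /* sectors still to read */
    int    advance_first;  /* 0 = original order, 1 = swapped order */
    long   log[MAX_LOG];   /* start sector of each read issued */
    size_t nlog;
};

static void toy_read_data(struct toy_req *r);

/* Hypothetical stand-in for bdrv_aio_read(): it completes at once and
 * re-enters the read path for the next batch before returning. */
static void toy_aio_read(struct toy_req *r, long start)
{
    if (r->nlog == MAX_LOG)
        return;                   /* stop the toy from recursing forever */
    r->log[r->nlog++] = start;
    toy_read_data(r);
}

static void toy_read_data(struct toy_req *r)
{
    if (r->sector_count <= 0)
        return;

    long n = r->sector_count < CHUNK ? r->sector_count : CHUNK;
    long start = r->sector;       /* sector this request should read */
    assert(n > 0);

    if (r->advance_first) {
        r->sector += n;
        r->sector_count -= n;
        toy_aio_read(r, start);
    } else {
        toy_aio_read(r, start);   /* re-enters before the lines below run */
        r->sector += n;
        r->sector_count -= n;
    }
}

/* Runs an 8-sector transfer and returns the resulting request state. */
static struct toy_req toy_run(int advance_first)
{
    struct toy_req r = { .sector = 0, .sector_count = 8,
                         .advance_first = advance_first };
    toy_read_data(&r);
    return r;
}
```

With advance_first = 0 every logged read starts at sector 0 until the MAX_LOG cap stops the recursion. With advance_first = 1 the same 8-sector transfer logs exactly two reads, starting at sectors 0 and 4.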


Will that not issue the read for the wrong sector?

I think the correct fix is to move the r->sector and r->sector_count adjustments into scsi_read_complete() and scsi_write_complete().
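
A rough illustration of that idea, as a toy model rather than the real scsi-disk code: the read path only issues the request, and the completion callback is the single place that advances the counters. All names here (toy_req, its pending field, toy_read_complete(), toy_aio_read(), toy_read_data(), toy_run()) are hypothetical, and the sketch assumes the completion callback is what drives the next batch, which simplifies how the real device model is driven.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK   4    /* sectors per request (arbitrary toy value) */
#define MAX_LOG 8    /* size of the read log */

struct toy_req {
    long   sector;         /* next sector to read */
    long   sector_count;   /* sectors still to read */
    long   pending;        /* hypothetical: size of the read in flight */
    long   log[MAX_LOG];   /* start sector of each read issued */
    size_t nlog;
};

static void toy_read_data(struct toy_req *r);

/* Completion callback: the only place that advances the counters. */
static void toy_read_complete(struct toy_req *r)
{
    r->sector += r->pending;
    r->sector_count -= r->pending;
    r->pending = 0;
    toy_read_data(r);             /* ask for the next batch */
}

/* Hypothetical stand-in for bdrv_aio_read() that completes before
 * returning, which is the case that exposed the bug. */
static void toy_aio_read(struct toy_req *r, long start)
{
    assert(r->nlog < MAX_LOG);
    r->log[r->nlog++] = start;
    toy_read_complete(r);
}

static void toy_read_data(struct toy_req *r)
{
    if (r->sector_count <= 0)
        return;

    long n = r->sector_count < CHUNK ? r->sector_count : CHUNK;
    r->pending = n;               /* remembered for the completion callback */
    toy_aio_read(r, r->sector);   /* no request state is touched after this */
}

/* Runs a transfer of `count` sectors and returns the resulting state. */
static struct toy_req toy_run(long count)
{
    struct toy_req r = { .sector = 0, .sector_count = count };
    toy_read_data(&r);
    return r;
}
```

Because the read is issued with the unmodified r->sector, it targets the intended sector, and because the counters are updated before the next batch is requested, a synchronous completion no longer re-reads the same range. For a 10-sector transfer the log holds reads starting at sectors 0, 4 and 8.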

Long term, we want to replace the recursion with queuing.
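
One way to picture the queuing approach, again as a toy model with hypothetical names only (toy_req, toy_read_complete(), toy_aio_read(), toy_read_data(), toy_run()): the completion callback merely records that more data is wanted, and an outer loop picks the work up later. A single "queued" flag stands in for a real queue here, and no QEMU API is used or implied.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK   4    /* sectors per request (arbitrary toy value) */
#define MAX_LOG 8    /* size of the read log */

struct toy_req {
    long   sector;         /* next sector to read */
    long   sector_count;   /* sectors still to read */
    int    queued;         /* 1 when a "next batch" request is waiting */
    int    depth;          /* current nesting of toy_read_data() */
    int    max_depth;      /* deepest nesting observed */
    long   log[MAX_LOG];   /* start sector of each read issued */
    size_t nlog;
};

/* Completion callback: queue the follow-up work instead of calling
 * back into the read path. */
static void toy_read_complete(struct toy_req *r)
{
    r->queued = 1;
}

/* Hypothetical stand-in for bdrv_aio_read() that completes at once. */
static void toy_aio_read(struct toy_req *r, long start)
{
    assert(r->nlog < MAX_LOG);
    r->log[r->nlog++] = start;
    toy_read_complete(r);
}

static void toy_read_data(struct toy_req *r)
{
    if (++r->depth > r->max_depth)
        r->max_depth = r->depth;

    if (r->sector_count > 0) {
        long n = r->sector_count < CHUNK ? r->sector_count : CHUNK;
        toy_aio_read(r, r->sector);
        r->sector += n;           /* safe: nothing re-entered during the call */
        r->sector_count -= n;
    }
    r->depth--;
}

/* Drains the queued work in a flat loop instead of by recursion. */
static struct toy_req toy_run(long count)
{
    struct toy_req r = { .sector = 0, .sector_count = count, .queued = 1 };
    while (r.queued) {
        r.queued = 0;
        toy_read_data(&r);
    }
    return r;
}
```

In this model the original statement order is harmless, because toy_read_data() is never re-entered: max_depth stays at 1 however many batches the transfer needs.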

--
error compiling committee.c: too many arguments to function




