qemu-devel

Re: Race condition in overlayed qcow2?


From: dovgaluk
Subject: Re: Race condition in overlayed qcow2?
Date: Tue, 25 Feb 2020 08:58:00 +0300
User-agent: Roundcube Webmail/1.4.1

Vladimir Sementsov-Ogievskiy wrote 2020-02-21 16:23:
21.02.2020 15:35, dovgaluk wrote:
Vladimir Sementsov-Ogievskiy wrote 2020-02-21 13:09:
21.02.2020 12:49, dovgaluk wrote:
Vladimir Sementsov-Ogievskiy wrote 2020-02-20 12:36:
1 or 2 are ok, and 4 or 8 lead to the failures.


That is strange. I would have thought it was caused by bugs in
deterministic CPU execution, but the first difference in the logs
occurs in a READ operation (I dump the read/write buffers in blk_aio_complete).


Aha, yes, looks strange.

Then next steps:

1. Does the problem hit the same offset every time?
2. Do we write to this region before this strange read?

2.1. If yes, we need to check that we read what we wrote. You say you dump buffers in blk_aio_complete. I think it would be more reliable to dump at the start of bdrv_co_pwritev and at the end of bdrv_co_preadv. Also, the guest may modify its buffers
during the operation, which would be strange but possible.

2.2. If not, hmm...
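
A minimal sketch of a checksum helper that could be called at the start of bdrv_co_pwritev and at the end of bdrv_co_preadv, as suggested in 2.1. The name iov_byte_sum is hypothetical and not part of QEMU; it takes a plain struct iovec array, so the caller is assumed to pass in the iovec entries of whatever request it wants to dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical debugging helper (not a QEMU function): byte-wise sum over
 * all buffers of an iovec array. If 'total' is non-NULL, it receives the
 * number of bytes covered, so the caller can log size and sum together.
 */
uint32_t iov_byte_sum(const struct iovec *iov, int iovcnt, size_t *total)
{
    uint32_t sum = 0;
    size_t bytes = 0;

    assert(iovcnt == 0 || iov != NULL);
    for (int i = 0; i < iovcnt; i++) {
        const uint8_t *p = iov[i].iov_base;
        for (size_t j = 0; j < iov[i].iov_len; j++) {
            sum += p[j];
        }
        bytes += iov[i].iov_len;
    }
    if (total) {
        *total = bytes;
    }
    return sum;
}
```

Logging the same sum on the write path and on the read path for a given offset would show whether the data differs already when it is written or only when it is read back.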



Another idea to check: use blkverify

I added logging of the file descriptor and discovered that the differing results are obtained
when reading from the backing file.
Moreover, replay runs of the same recording produce different results. The logs show that there is a preadv race, but I can't figure out the source of the failure.

Log1:
preadv c 30467e00
preadv c 30960000
--- sum = a2e1e
bdrv_co_preadv_part complete offset: 30467e00 qiov_offset: 0 len: 8200
--- sum = 10cdee
bdrv_co_preadv_part complete offset: 30960000 qiov_offset: 8200 len: ee00

Log2:
preadv c 30467e00
--- sum = a2e1e
bdrv_co_preadv_part complete offset: 30467e00 qiov_offset: 0 len: 8200
preadv c 30960000
--- sum = f094f
bdrv_co_preadv_part complete offset: 30960000 qiov_offset: 8200 len: ee00


The checksum calculation was added to preadv in file-posix.c.


So, preadv in file-posix.c returns different results for the same
offset, for a file which is always opened in RO mode? Sounds impossible
:)

True.
Maybe my logging is wrong?

static ssize_t
qemu_preadv(int fd, const struct iovec *iov, int nr_iov, off_t offset)
{
     ssize_t res = preadv(fd, iov, nr_iov, offset);
     qemu_log("preadv %x %"PRIx64"\n", fd, (uint64_t)offset);
     int i;
     uint32_t sum = 0;   /* byte-wise sum of all iov buffers after the read */
     int cnt = 0;        /* total number of bytes covered by iov */
     for (i = 0 ; i < nr_iov ; ++i) {
         int j;
         for (j = 0 ; j < (int)iov[i].iov_len ; ++j)
         {
             sum += ((uint8_t*)iov[i].iov_base)[j];
             ++cnt;
         }
     }
     qemu_log("size: %x sum: %x\n", cnt, sum);
     /* debug only: trips on an error or a short read */
     assert(cnt == res);
     return res;
}


Hmm, I don't see any issues here..

Are you absolutely sure that all these reads are from the backing file,
which is read-only and never changed (maybe by other processes)?

Yes, I made a copy and compared the files with binwalk.

2. The guest modifies buffers during the operation (you can catch this if
you allocate a private buffer for preadv, then calculate the checksum, then
memcpy to the guest buffer).

I added the following to qemu_preadv:

    // do it again
    unsigned char *buf = g_malloc(cnt);
    struct iovec v = {buf, cnt};
    res = preadv(fd, &v, 1, offset);
    assert(cnt == res);
    uint32_t sum2 = 0;
    for (i = 0 ; i < cnt ; ++i)
        sum2 += buf[i];
    g_free(buf);
    qemu_log("--- sum2 = %x\n", sum2);
    assert(sum2 == sum);

These two reads give different results.
But who can modify the buffer while the qcow2 workers are filling it with data from the disk?
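
For reference, the private-buffer variant suggested earlier in the thread (read into a buffer nobody else can see, checksum it there, then copy to the destination) could look roughly like the sketch below. It is not QEMU code: bounce_preadv and its sum_out parameter are hypothetical names, and it uses plain pread and memcpy instead of QEMU's own helpers.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical debugging helper (not QEMU code): read into a private
 * buffer, checksum the data there, and only then scatter it into the
 * caller's iovec. The checksum cannot be affected by anything that
 * touches the caller's buffers while the read is in flight.
 */
ssize_t bounce_preadv(int fd, const struct iovec *iov, int iovcnt,
                      off_t offset, uint32_t *sum_out)
{
    size_t total = 0;
    for (int i = 0; i < iovcnt; i++) {
        total += iov[i].iov_len;
    }

    uint8_t *buf = malloc(total ? total : 1);
    if (!buf) {
        return -1;
    }

    ssize_t res = pread(fd, buf, total, offset);
    if (res < 0) {
        free(buf);
        return res;
    }

    /* Checksum the private copy before anyone else can see the data. */
    uint32_t sum = 0;
    for (ssize_t k = 0; k < res; k++) {
        sum += buf[k];
    }

    /* Copy only the bytes actually read; a short read leaves the tail untouched. */
    size_t done = 0;
    for (int i = 0; i < iovcnt && done < (size_t)res; i++) {
        size_t n = iov[i].iov_len;
        if (n > (size_t)res - done) {
            n = (size_t)res - done;
        }
        memcpy(iov[i].iov_base, buf + done, n);
        done += n;
    }
    assert(done == (size_t)res);

    free(buf);
    if (sum_out) {
        *sum_out = sum;
    }
    return res;
}
```

If the sum taken from the private buffer stays stable across replay runs while a sum taken later over the destination buffers does not, that would point at something writing to the destination memory rather than at the file contents.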



Pavel Dovgalyuk


