qemu-stable
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] block/blklogwrites: Fix a bug when logging "write zeroes" op


From: Kevin Wolf
Subject: Re: [PATCH] block/blklogwrites: Fix a bug when logging "write zeroes" operations.
Date: Wed, 10 Jan 2024 14:39:19 +0100

Am 09.01.2024 um 19:46 hat megari@gmx.com geschrieben:
> From: Ari Sundholm <ari@tuxera.com>
> 
> There is a bug in the blklogwrites driver pertaining to logging "write
> zeroes" operations, causing log corruption. This can be easily observed
> by setting detect-zeroes to something other than "off" for the driver.
> 
> The issue is caused by a concurrency bug pertaining to the fact that
> "write zeroes" operations have to be logged in two parts: first the log
> entry metadata, then the zeroed-out region. While the log entry
> metadata is being written by bdrv_co_pwritev(), another operation may
> begin in the meanwhile and modify the state of the blklogwrites driver.
> This is as intended by the coroutine-driven I/O model in QEMU, of
> course.
> 
> Unfortunately, this specific scenario is mishandled. A short example:
>     1. Initially, in the current operation (#1), the current log sector
> number in the driver state is only incremented by the number of sectors
> taken by the log entry metadata, after which the log entry metadata is
> written. The current operation yields.
>     2. Another operation (#2) may start while the log entry metadata is
> being written. It uses the current log position as the start offset for
> its log entry. This is in the sector right after the operation #1 log
> entry metadata, which is bad!
>     3. After bdrv_co_pwritev() returns (#1), the current log sector
> number is reread from the driver state in order to find out the start
> offset for bdrv_co_pwrite_zeroes(). This is an obvious blunder, as the
> offset will be the sector right after the (misplaced) operation #2 log
> entry, which means that the zeroed-out region begins at the wrong
> offset.
>     4. As a result of the above, the log is corrupt.
> 
> Fix this by only reading the driver metadata once, computing the
> offsets and sizes in one go (including the optional zeroed-out region)
> and setting the log sector number to the appropriate value for the next
> operation in line.
> 
> Signed-off-by: Ari Sundholm <ari@tuxera.com>
> Cc: qemu-stable@nongnu.org

Thanks, applied to the block branch.

Note that while this fixes the single threaded case, it is not thread
safe and will still break with multiqueue enabled (see the new
iothread-vq-mapping option added to virtio-blk very recently). Most
block drivers already take a lock when modifying their global state, and
it looks like we missed blklogwrites when checking what needs to be
prepared for the block layer to be thread safe.

So I think we'll want another patch to add a QemuMutex that can be taken
while you do the calculations on s->cur_log_sector. But this patch is
a good first step because it means that we don't need to keep a lock
across an I/O request (just for the sake of completeness, this would
have had to be a CoMutex rather than a QemuMutex).

Kevin




reply via email to

[Prev in Thread] Current Thread [Next in Thread]