Re: [Qemu-devel] qemu-img convert cache mode for source


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] qemu-img convert cache mode for source
Date: Mon, 3 Mar 2014 13:03:49 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Feb 28, 2014 at 03:35:05PM +0100, Peter Lieven wrote:
> On 27.02.2014 09:57, Stefan Hajnoczi wrote:
> >On Wed, Feb 26, 2014 at 05:01:52PM +0100, Peter Lieven wrote:
> >>On 26.02.2014 16:41, Stefan Hajnoczi wrote:
> >>>On Wed, Feb 26, 2014 at 11:14:04AM +0100, Peter Lieven wrote:
> >>>>I was wondering whether it would be a good idea to set O_DIRECT mode
> >>>>for the source files of a qemu-img convert process if the source is
> >>>>a host_device.
> >>>>
> >>>>Currently the backup of a host device is polluting the page cache.
> >>>Points to consider:
> >>>
> >>>1. O_DIRECT does not work on Linux tmpfs; you get EINVAL when opening
> >>>    the file.  A fallback is necessary (see the sketch after these
> >>>    points).
> >>>
> >>>2. O_DIRECT has no readahead so performance could actually decrease.
> >>>    The question is, how important is readahead versus polluting the
> >>>    page cache?
> >>>
> >>>3. For raw files it would make sense to tell the kernel that access is
> >>>    sequential and data will be used only once.  Then we can get the best
> >>>    of both worlds (avoid polluting page cache but still get readahead).
> >>>    This is done using posix_fadvise(2).
> >>>
> >>>    The problem is what to do for image formats.  An image file can be
> >>>    very fragmented so the readahead might not be a win.  Does this mean
> >>>    that for image formats we should tell the kernel access will be
> >>>    random?
> >>>
> >>>    Furthermore, maybe it's best to do readahead inside QEMU so that even
> >>>    network protocols (nbd, iscsi, etc) can get good performance.  They
> >>>    act like O_DIRECT is always on.
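
To make points 1 and 3 concrete, here is a minimal sketch of what the
open path could look like (open_source is a made-up helper name, not
from any posted patch; POSIX_FADV_NOREUSE has historically been a no-op
on Linux):

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>

/* Sketch only: open a raw source with O_DIRECT, fall back to a buffered
 * open when the filesystem (e.g. tmpfs) rejects the flag, and hint
 * sequential, single-use access in the buffered case. */
static int open_source(const char *path)
{
    int fd = open(path, O_RDONLY | O_DIRECT);

    if (fd < 0 && errno == EINVAL) {
        /* tmpfs and friends refuse O_DIRECT; fall back to the page cache */
        fd = open(path, O_RDONLY);
        if (fd >= 0) {
            /* keep readahead, but tell the kernel pages are read once */
            posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
            posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);
        }
    }
    return fd;
}
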
> >>your comments are regarding qemu-img convert, right?
> >>How would you implement this? A new open flag, since the fadvise
> >>call would have to go inside the protocol driver?
> >>
> >>I would start with host_devices first and see how it performs there.
> >>
> >>For qemu-img convert I would issue a FADV_DONTNEED after
> >>a write for the bytes that have been written
> >>(I have tested this on Linux and it seems to work quite well).
> >>
> >>The question is, what is the right parameter for reads? Also FADV_DONTNEED?
> >I think so but this should be justified with benchmark results.
> 
> I ran some benchmarks and found that a FADV_DONTNEED issued after
> a read does not hurt performance, but it does keep the buffers from
> growing while I read from a host_device or raw file.

It was mentioned in this thread that a sequential read shouldn't promote
the pages anyway - they should be dropped by the kernel if there is
memory pressure.
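
For reference, the read-side pattern being measured would presumably
look something like the following sketch (read_and_drop is a made-up
name; fd, buf, len and offset stand in for whatever the convert copy
loop uses):

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Sketch: read a chunk sequentially, then tell the kernel to drop
 * exactly that range from the page cache. */
static ssize_t read_and_drop(int fd, void *buf, size_t len, off_t offset)
{
    ssize_t n = pread(fd, buf, len, offset);

    if (n > 0) {
        posix_fadvise(fd, offset, n, POSIX_FADV_DONTNEED);
    }
    return n;
}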

So what is the actual performance problem you are trying to solve, and
what benchmark output do you get when you compare runs with and without
FADV_DONTNEED?

I think there's a danger that the discussion will go around in circles.
Please post the performance results that kicked off this whole effort
and let's focus on the data.  That way it's much easier to evaluate
which changes to QEMU are a win and which are unnecessary.

> As for writing, it only works if I issue an fdatasync after each write,
> but that should be equivalent to O_DIRECT. So I would keep the patch
> limited to qemu-img convert sources that are a host_device or file.

fdatasync(2) is much more heavyweight than writing out pages because
it sends a disk write cache flush command and waits for it to complete.
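
For comparison, a lighter-weight combination worth benchmarking on
Linux is sync_file_range(2) followed by FADV_DONTNEED; unlike fdatasync
it does not flush the disk write cache, so it gives no durability
guarantee (sketch only, writeback_and_drop is a made-up name):

#define _GNU_SOURCE
#include <fcntl.h>

/* Sketch: force writeback of just the written range and evict it from
 * the page cache, without asking the disk to flush its write cache. */
static void writeback_and_drop(int fd, off_t offset, off_t len)
{
    sync_file_range(fd, offset, len,
                    SYNC_FILE_RANGE_WAIT_BEFORE |
                    SYNC_FILE_RANGE_WRITE |
                    SYNC_FILE_RANGE_WAIT_AFTER);
    posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
}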

Stefan


