qemu-devel



From: Anthony Liguori
Subject: Re: [Qemu-devel] Re: [PATCH][v2] Align file accesses with cache=off (O_DIRECT)
Date: Wed, 21 May 2008 13:25:59 -0500
User-agent: Thunderbird 2.0.0.14 (X11/20080501)

Andrea Arcangeli wrote:
> On Wed, May 21, 2008 at 12:53:52PM -0500, Anthony Liguori wrote:
>> MAP_SHARED cannot be done transparently to the guest; that's the motivating reason behind MAP_PRIVATE.
>
> Could you elaborate on what 'done transparently' means? The only
> difference is for writes. When the guest writes, MAP_PRIVATE will
> copy-on-write. How can it be good if the guest generates many
> copy-on-writes, eliminating the cache from the mapping and replacing
> it with anonymous memory?

I think we're talking about different things. What I'm talking about is the following:

The guest issues a DMA read from disk at offset N of size M to physical address X. Today, we essentially read from the backing disk image at offset N into a temporary buffer of size M, and then memcpy() to physical address X.
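A minimal sketch of the copy path described above, for illustration only. It is not QEMU's block layer code: `dma_read_bounce` and its parameters are hypothetical names, and `guest_dst` stands for the host address that backs guest physical address X.

```c
#define _DEFAULT_SOURCE          /* for pread() under strict -std= modes on glibc */
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a QEMU function): service a guest DMA read of
 * `len` bytes at image offset `offset` by reading into a temporary buffer
 * and then copying into guest memory at `guest_dst`.
 * Returns 0 on success, -1 on error or short read.
 */
int dma_read_bounce(int fd, off_t offset, size_t len, void *guest_dst)
{
    unsigned char *tmp;
    size_t done = 0;

    if (len == 0)
        return 0;

    tmp = malloc(len);
    if (tmp == NULL)
        return -1;

    while (done < len) {
        ssize_t n = pread(fd, tmp + done, len - done, offset + (off_t)done);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, retry */
        if (n <= 0) {               /* I/O error or unexpected end of file */
            free(tmp);
            return -1;
        }
        done += (size_t)n;
    }

    memcpy(guest_dst, tmp, len);    /* the extra copy the proposal would avoid */
    free(tmp);
    return 0;
}
```

Every byte is touched twice here: once when the host copies it out of the page cache into `tmp`, and once more in the memcpy() into guest memory.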

What I would like to do, if N and M are multiples of PAGE_SIZE, is replace the memory at guest physical address X with the host's page cache pages for the range (N, M). The guest is unaware of this, though, and it may decide to reuse that memory for something else. When this happens, we need to unmap guest physical address X and replace it with normal memory (essentially, CoW'ing).

The effect of this would be that if multiple guests are using the same disk image, they would end up sharing memory transparently.

With MMU notifiers, this is possible by just using mmap(MAP_PRIVATE | MAP_FIXED), assuming we fix gfn_to_pfn() to take a 'write' parameter. Right now we always write-fault CoW mappings because we unconditionally call get_user_pages() with write=1.
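The userspace half of that idea can be sketched as follows. This is only an illustration under assumptions: `map_image_over_guest_ram` and `page_aligned` are hypothetical names, `guest_hva` stands for the host virtual address backing guest physical address X, and the KVM side (MMU notifiers, the gfn_to_pfn() change) is not shown.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS etc. on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: true if v is a multiple of the host page size. */
static int page_aligned(uintptr_t v)
{
    long psz = sysconf(_SC_PAGESIZE);
    return psz > 0 && (v % (uintptr_t)psz) == 0;
}

/*
 * Hypothetical helper (not a QEMU or KVM function): replace the guest RAM at
 * host address `guest_hva` with a private, copy-on-write view of the image
 * file range [offset, offset + len). Until something writes to it, the
 * mapping is backed by the host page cache pages for that file range.
 * Returns 0 on success, -1 if the request is not page aligned or mmap() fails.
 */
int map_image_over_guest_ram(void *guest_hva, int fd, off_t offset, size_t len)
{
    void *p;

    if (len == 0 ||
        !page_aligned((uintptr_t)guest_hva) ||
        !page_aligned((uintptr_t)offset) ||
        !page_aligned((uintptr_t)len))
        return -1;

    /* MAP_FIXED atomically replaces whatever was mapped at guest_hva. */
    p = mmap(guest_hva, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_FIXED, fd, offset);
    return p == MAP_FAILED ? -1 : 0;
}
```

A write through this mapping gives the process its own anonymous copy of the page and leaves the file untouched, which is the CoW behaviour discussed above. Going back to ordinary memory explicitly would be another mmap() at the same address with MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED.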

As has been pointed out, this is probably not ideal since it would cause heavy vma fragmentation. We may be able to simulate this using the slots API, although slots are quite similar to vmas in that we optimize for a small number of them.
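The fragmentation concern can be observed with a small Linux-only experiment. This is a sketch under assumptions: `count_vmas` and `vmas_added_by_scattered_maps` are hypothetical names, the vma count is taken from the number of lines in /proc/self/maps, and `fd` must refer to a file at least `npages` pages long.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Count this process's mappings via /proc/self/maps (Linux). -1 on error. */
long count_vmas(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    long lines = 0;
    int c;

    if (f == NULL)
        return -1;
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            lines++;
    fclose(f);
    return lines;
}

/*
 * Hypothetical experiment: allocate `npages` pages of anonymous "guest RAM",
 * then map page i of `fd` over every odd page i with MAP_PRIVATE | MAP_FIXED.
 * Returns how many mappings the process gained, or -1 on error.
 */
long vmas_added_by_scattered_maps(int fd, size_t npages)
{
    long psz_l = sysconf(_SC_PAGESIZE);
    size_t psz, i;
    long before, after;
    unsigned char *ram;

    if (psz_l <= 0 || npages == 0)
        return -1;
    psz = (size_t)psz_l;

    ram = mmap(NULL, npages * psz, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ram == MAP_FAILED)
        return -1;

    before = count_vmas();
    for (i = 1; i < npages; i += 2) {
        /* Each file-backed page lands between two anonymous ones. */
        if (mmap(ram + i * psz, psz, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_FIXED, fd, (off_t)(i * psz)) == MAP_FAILED) {
            munmap(ram, npages * psz);
            return -1;
        }
    }
    after = count_vmas();

    munmap(ram, npages * psz);
    return (before < 0 || after < 0) ? -1 : after - before;
}
```

Each isolated file-backed page becomes its own vma and splits the anonymous region around it, so the count grows roughly in proportion to the number of scattered disk reads. Linux also caps the number of mappings per process (the vm.max_map_count sysctl), so a large guest could hit that limit.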

I'm not really sure what's the best approach.

Regards,

Anthony Liguori

> I can't see how MAP_PRIVATE could replace O_DIRECT. There's no way to
> write anything to disk with MAP_PRIVATE; msync on a MAP_PRIVATE
> mapping is a pure-overhead no-op, for example. Only MAP_SHARED has a
> chance to modify any bit present on disk, and it'll require msync at
> least every time the guest OS waits for I/O completion and assumes the
> journal metadata/data is written on disk.

> The one real advantage I see in MAP_PRIVATE/MAP_SHARED vs O_DIRECT is
> that the guest would boot the second time without triggering reads
> from disk. But after the guest is booted, its runtime is likely going
> to be better with O_DIRECT. The guest has its own filesystem caches in
> guest memory, and replicating them shouldn't pay off significantly for
> the guest runtime even on a laptop. It also brings disadvantages in
> the host by polluting host caches with data already cached in the
> guest, and it'll decrease fairness of the system, not to mention the
> need of msync for journaling. So besides the initial boot time I don't
> see many advantages for MAP_PRIVATE/MAP_SHARED, at least unless you're
> running msdos ;).




