Re: Multipage requests for GNU Mach 1.3
From: Sergio Lopez
Subject: Re: Multipage requests for GNU Mach 1.3
Date: Sat, 18 Dec 2004 05:02:38 +0100
On Fri, 17 Dec 2004 11:09:26 +0000
"Neal H. Walfield" <neal@walfield.org> wrote:
> Hi,
>
> > I've been playing a little with GNU Mach, and I think there is a
> > thing that could be nice to implement in it. In "vm/vm_fault.c",
> > when the kernel is requesting some data from a translator for a
> > memory_object, we can read this code:
> >
> > if ((rc = memory_object_data_request(object->pager,
> >                 object->pager_request,
> >                 m->offset + object->paging_offset,
> >                 PAGE_SIZE, access_required)) != KERN_SUCCESS) {
> >
> > And this is the syntax for m_o_d_request (from The GNU Mach
> > Reference Manual):
> >
> > kern_return_t seqnos_memory_object_data_request (
> > memory_object_t memory_object,
> > mach_port_seqno_t seqno,
> > memory_object_control_t memory_control,
> > vm_offset_t offset,
> > vm_offset_t length,
> > vm_prot_t desired_access)
> >
> > As you can see, the parameter for "length" is always "PAGE_SIZE"
> > (you know, 4K in x86) in GNU Mach. This means that for a translator
> > which works by reading and writing from a disk (like ext2fs), every
> > I/O operation is split into 4K fragments.
>
> A small nit: data_request is only for page in, not page outs. That
> is, this does not affect every i/o operation, only input operations
> (and specifically those resulting from vm faults). Page outs are only
> grouped together in memory_object_lock_request (vm/memory_object.c)
> which is only invoked from user space. When Mach evicts pages, it
> calls vm_pageout_page (vm/vm_pageout.c) which only returns a single
> page at a time (using memory_object_data_return) rather than trying to
> coalesce them in, for instance, vm_object_terminate (vm/vm_object.c).
> Maybe you would be interested in looking at this problem after you get
> page in clusters to work.
>
Yes, I forgot to talk about page outs :-) As you said,
m_o_lock_request calls m_o_data_return sending
multiple pages at once, but due to a limitation in libpager's API,
writing is done page by page:
  for (i = 0; i < npages; i++)
    if (!(omitdata & (1 << i)))
      pagerrs[i] = pager_write_page (p->upi,
                                     offset + (vm_page_size * i),
                                     data + (vm_page_size * i));
By changing libpager's API and doing a little work in the translators, this
issue can be solved easily. I'll also take a look at vm_pageout_page and
vm_pageout_scan (I've done that before, but touching them made Mach
somewhat unstable; I was probably doing it the wrong way).
> > But, in OSF Mach, things are a bit different. The memory_objects
> > have a property named "cluster_size", and "length" in m_o_d_request
> > is determined by that. I don't know where OSF Mach sets the value of
> > cluster_size, but we can do it in m_o_ready/m_o_set_attributes, so
> > every translator can set this as it wants.
>
> Mac OS X, which is based off of OSF's Mach, supports this in
> memory_object_change_attributes [1]. I think it makes sense to be
> compatible with their interface even if it means updating our API.
>
Fine; in any case, having multipage support requires API changes.
> > but even with this issue, benchmarks [1] (I've
> > made a fast (ugly, buggy and dirty) implementation over GNU Mach
> > to test it [2]) show that the performance for I/O operations is
> > slightly increased.
>
> That looks promising. I assume that you must have also changed
> libpager as hurd/libpager/data_request.c only supports length ==
> vm_page_size? I looked in the CVS repository on bee.nopcode.org,
> however, I did not see a Hurd tree.
Yes, I changed libpager, ext2fs and isofs for testing purposes. I've
just uploaded the Hurd tree to Bee's CVS (*.mp are the modified
directories).
If you look at that code, you'll find some calls to *_direct functions,
and that there are no memory allocations in file_page_read_multipage().
As I said before, the code in CVS contains other changes, not only the
ones related to multipage requests. Those *_direct functions are an
experimental change which makes the glue code insert the requested
pages directly into the pager's memory object.
I think that I'll reimplement multipage support on a clean GNU Mach
tree, to keep the changes incremental and easy to review.
> > But with this strategy we have a problem that must be resolved. Many
> > times, GNU Mach requests more pages than the translator (ext2fs in
> > my tests) can fill (if you are dumping a 17K file with a 16K
> > cluster_size (4 pages), the first call will fill all the pages, and
> > the second only 1), and we must free them some way. I think that
> > m_o_d_unavailable and m_o_d_error don't fit well for this purpose,
> > so I've hacked the glue code (linux/dev/glue/block.c) to make
> > "device_read" write the pages directly to the memory_object,
> > freeing the unused ones at the same time (probably there is a much
> > better way to do this ;-).
>
> The memory object operates at the page level granularity. So if you
> do a memory_object_data_supply for the pages you do have and
> memory_object_data_error for those you don't, it would seem to me that
> it should work. Is this not the case?
>
I think I tried with m_o_d_error and ran into some trouble, but I can't
remember right now. Anyway, I'll take a look at this again.
> > What do you think about this?
>
> This looks promising. I look forward to seeing the patch and the
> results of benchmarks with various cluster sizes with ext2fs before I
> advocate upstream inclusion. (Also, we need to think about copyright
> assignment to the FSF for both the Hurd and Mach if you have not done
> those yet?)
For the copyright issue, what do I need to do?
>
> Thanks for your work,
Hey, playing with GNU Mach is really fun! ;-)