l4-hurd

From: Jonathan S. Shapiro
Subject: Re: Comparing "copy" and "map/unmap"
Date: Tue, 18 Oct 2005 08:05:06 -0400

On Tue, 2005-10-18 at 01:42 +0200, Matthieu Lemerre wrote:
> >> 3/
> >> > One difference between this approach and UNMAP is that the EROS
> >> > approach interacts well with persistence, while the UNMAP
> >> > approach does not appear to (to me). My question about unmap is
> >> > that I do not understand how it scales successfully when the
> >> > mapping database grows large enough to become an out-of-memory
> >> > data structure, and I have already described in a previous note
> >> > why flushing and reconstructing the mappings doesn't work without
> >> > severe restrictions on system architecture.
> >> 
> >> Maybe the L4 people have thought about this.  There is already a paper
> >> about persistence on the L4ka web site.
> >
> > That paper assumes that kernel virtual memory is not exceeded, so it is
> > not scalable. It is a "dump to memory" design rather than a first-class
> > persistence design. In my opinion, the current L4 system would need a
> > complete redesign to enable scalable persistence. It simply wasn't a
> > design goal.
> >
> 
> OK.  Maybe they have a plan for it, though.

If so, I would be very interested to hear what the plan is. I believe
that I have spoken to all of the implementors of the various persistence
mechanisms in L4. The best answer I have received (so far) is "mumble,
umm, uh."

However, it would be good if there were a solution, and I would be
interested to see one.

> >> 4/
> >> You have to allocate kernel memory for the mapping database.  But
> >> there may be solutions to that problem ("in-kernel heaps that were
> >> backed by an out-of-kernel storage allocator". Is it sigma1?)
> >
> > Yes. No, this is probably not sigma1. The complication is how to make
> > this design correctly transactional in a persistent system, and how to
> > design a kernel that can survive revocation of kernel heap storage by
> > application code.
> >
> > This is deeply challenging. It may be solved, but I do not expect that
> > it will be solved robustly any time soon.
> >
> >> So, if points 3/ and 4/ could be solved, MAP/UNMAP has better
> >> performance.  This is really important if it becomes a common usage
> >> pattern, and I have some ideas that would require MAP/UNMAP
> >> operations.
> >
> > I would be very interested to see a list! Perhaps there are simpler
> > answers.
> >
> 
> It seems to me that you avoid REVOCABLE COPY because it is less
> efficient than COPY (need to create a new object, etc.).  My point
> is that there are cases where REVOCABLE COPY can be a replacement for
> COPY.

As the Russians used to say: quantity has a quality of its own. When
the latency becomes high enough, the problem is qualitative. The problem
is that low latency implies kernel allocation, and kernel allocation is
incompatible with a robust system architecture.

There is also a formal issue. This probably should not interest the Hurd
group, but I will explain it.

I have mentioned previously that the ability to connect the semantics of
capabilities with the semantics of programming languages is very
beautiful and powerful. It is therefore important to me to preserve
this.

Until Coyotos, the only example I had for a viable REVOCABLE COPY
implementation was the L4 UNMAP operation. My problem from a theoretical
perspective is that I have been unable to construct a formal statement
of the L4 MAP/UNMAP semantics that leads to a sensible formal semantics
for client/server computation.

I should add that we have not been able to do this for EROS or Coyotos
either. The difference is that in the EROS/Coyotos case, we have made
significant progress and we are beginning to see what this structure
would look like.

It is very possible that we simply have not tried hard enough for L4
UNMAP, but my intuition says otherwise.

> > Yes, in both systems this can be done, but with the cost that the server
> > must now deal with revoke during mid-operation. The EROS "trusted buffer
> > object" provides a pre-packaged solution for this that is not doable in
> > L4sec (at least, not at the moment).
> 
> * What are they?  Are they something that make sure that the buffer
> cannot be revoked for a certain amount of time?  Or do you have a
> more clever way to deal with this?

See "Vulnerabilities in Synchronous IPC Designs":

        http://www.eros-os.org/papers/IPC-Assurance.ps

> * I don't see the difference between COPY and REVOCABLE COPY with
>   regard to dealing with revoke during mid-operation.  The memory can
>   be revoked in either case (either selectively REVOKED or the memory
>   object could be destroyed, and this makes no difference to me).

Not necessarily.

I think you are struggling here with something difficult. Taken alone,
you are correct that COPY and REVOCABLE COPY are not enough to resolve
this problem. The interesting part about EROS is the ways in which the
pieces interact to let problems like this be solved. The difficulty is
that you need to step back and figure out how to put those pieces
together. One of my tasks over the next month or so is to write a
journal paper that captures some of this.

Have a look at the IPC Vulnerabilities paper. It is an attempt to
capture *some* of the issues.

> >> In L4ng threads are mappable.  So a client could in fact map a thread
> >> to a server so that it could serve its own requests (to avoid thread
> >> explosion, maybe we should have one thread per principal.  But maybe
> >> having a bunch of inactive threads isn't really harmful).
> >
> > The real problem is not page faults per se. You aren't going to recover
> > the thread anyway. The real problem is re-establishing consistency for
> > all of your data structures after a client defaults. Sending a thread
> > just introduces another way that the client can default.
> 
> It seems to me that:
> 
> -Either the objects given to a server are "alone", or have relations
>  only between objects allocated by the same (homogeneous) source. In
>  this case, if the memory is revoked during mid-operation, then we
>  don't care if the data is inconsistent (because the client just lost
>  the thread providing the resource allocated by itself). I don't see
>  additional harm in this case of homogeneous storage.

I agree. In practice, all situations of homogeneous storage appear (so
far) to involve symmetric trust (i.e. all parties trust each other), so
there is rarely any problem in this case.

> -"Global states" (like, for a global thread server, which thread has
>  the server mapped (which is something needed on L4)) would be run on
>  the server's memory (and cannot be unmapped except by the owner of
>  the memory on which the server is run, which we must assume to be
>  trusted), and would be manipulated by the "global thread".

How can this possibly lead to a robust system? You have just architected
a denial of resource opportunity!

> I agree that this type of server should then be written with care,
> but the approach I describe above seems to me quite similar to your
> "paranoidmemcpy" method.  Still, performance may be quite bad.

ParanoidMemcpy is quite different -- the global storage involved is
transient and boundable. Performance is actually quite good, because the
copy can usually be avoided in practice.

> >> A hostile client could then do nothing to the server, aside from
> >> making its own thread fault.
> >
> > Is it clear from my comment above that this is FAR from true?
> 
> I should rewrite this sentence: A hostile client could then do nothing
> to the server, aside from making its own thread fault, provided that
> the server is written with care (which may not be easy, but maybe
> easier with having two threads communicating than using a
> specialmemcpy operation)

No. Consider the following counter-example:

We assume from your initial conditions that the server has multiple
threads. Therefore, I will assume that the server is implemented with at
least one critical section.

The client thread enters the critical section. Before exiting the
critical section it is killed. Note that this does NOT release the
critical section lock!
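
To make the hazard concrete, here is a minimal pthreads sketch; the
names are illustrative, not taken from EROS or L4:

    /* serve_request() executes on a thread supplied by the client.
     * If that thread is destroyed between lock and unlock, table_lock
     * is held forever and table_entries may be half-updated; every
     * later request then blocks. */
    #include <pthread.h>

    static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
    static int table_entries;        /* server state guarded by the lock */

    void serve_request(void)         /* runs on the client's thread */
    {
        pthread_mutex_lock(&table_lock);
        table_entries++;             /* state is mid-update here... */
        /* ...kill this thread now and the lock is never released,
         * and table_entries is suspect. */
        pthread_mutex_unlock(&table_lock);
    }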

Please give a general strategy for recovery, or at least a set of
architectural principles for how to design a multithreaded server that
can recover from this situation.

Hint: it does not exist in the general case, and the field of computer
science has been looking for useful design patterns that would support
practically useful cases for 40 years without success. The best
solutions so far involve designing the entire subsystem transactionally,
with a single atomic update operation that commits the operation.
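
As a rough illustration of that transactional shape (assuming all
shared state hangs off a single root pointer; hypothetical names, not
EROS code):

    /* All work happens on a private copy; the only visible mutation
     * is one atomic pointer swap.  A thread that dies before the swap
     * leaves the committed state untouched.  Assumes 'committed' was
     * initialized to a valid state; safely freeing the old copy needs
     * a reclamation scheme such as RCU, elided here. */
    #include <stdatomic.h>
    #include <stdlib.h>

    struct state { int table_entries; /* ... */ };
    static _Atomic(struct state *) committed;

    int serve_request_tx(void)
    {
        for (;;) {
            struct state *cur  = atomic_load(&committed);
            struct state *next = malloc(sizeof *next);
            if (next == NULL)
                return -1;
            *next = *cur;            /* private working copy        */
            next->table_entries++;   /* all updates off to the side */
            /* The single commit point: nothing visible has changed
             * if the thread dies before this line. */
            if (atomic_compare_exchange_strong(&committed, &cur, next))
                return 0;
            free(next);              /* lost a race; retry          */
        }
    }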

> >> This has another advantage: not only does the client pay for its
> >> memory usage on the server, but it also pays for its CPU time usage
> >> and can adjust it (provided that it has some control over the time
> >> slices allocated to the thread, which it can take from its "CPU
> >> time container")
> >
> > CPU time has nothing to do with passing/mapping of thread resources. It
> > has to do with passing/mapping of scheduling resources, which is an
> > orthogonal issue.
> 
> Threads are a basis for scheduling in the L4 world, so they are not
> completely orthogonal.

It is true that threads are the unit of dispatch. They are therefore the
unit that is controlled by the scheduler. This is true in EROS/Coyotos
as well.

However, threads do not migrate from one process to another in L4 in the
way that you imagine. What you are really proposing is that the
*binding* between the thread and its scheduling slot should migrate.

Actually, there are two separate decisions here:

  1. Should the binding migrate?
  2. Should a preemption point exist in the IPC path?

     A preemption point is any point where the kernel asks: "Should
     I consider scheduling something else now?"

The concept of "thread migrating IPC" is really just question (2) where
the answer is decided to be "there are no preemption points in the IPC
path". The name "thread migrating IPC" was a horrible name, because it
simply does not describe what is actually happening.
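
In kernel terms the distinction is roughly the following. This is a
hypothetical sketch; none of these names come from the actual L4 or
EROS IPC paths:

    struct thread;                           /* opaque here          */
    void transfer_message(struct thread *, struct thread *);
    int  preemption_pending(void);
    void schedule(void);
    void switch_to(struct thread *);

    void ipc_send(struct thread *sender, struct thread *receiver)
    {
        transfer_message(sender, receiver);  /* copy the payload     */
    #ifdef IPC_PREEMPTION_POINT
        /* "Should I consider scheduling something else now?" */
        if (preemption_pending())
            schedule();
    #endif
        switch_to(receiver);                 /* run the receiver     */
    }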

You are considering a design in which the answer to (1) is "yes", or at
least "possibly". This is known under various names, but is most
commonly called "schedule donation", "priority donation", or "priority
inheritance scheduling". It's really quite a horrible idea.

> >> So the user provided thread just has to make a call to
> >> the server thread.  This is local IPC, so potentially very fast.
> >
> > Local IPC was a bad idea, and is going away. It cannot be implemented
> > cleanly in a protected system in any case.
> 
> Why is that?  If the protected objects are stored in immutable kernel
> memory, how could a thread modify it?  Or is the problem something
> completely different?

The problem is that the semantics of local IPC was wrong, and it wasn't
wrong in a fixable way.

> >> * I also noted, in a previous answer, that when you use a private
> >>   server, you can just map memory to it: you don't have to copy it.
> >> 
> >> Thus, I am wondering if MAP/UNMAP isn't the best operation for some
> >> kinds of patterns (with the above example of memory usage).
> >
> > Sharing memory (or storage) is a fine design for many patterns. This is
> > orthogonal to whether the primitive is MAP or not.
> >
> 
> What I wanted to say is that this design pattern fits well with the
> MAP/UNMAP operations, and seemed to have better performance.  Of
> course, this has to be balanced against the "global thread/local
> thread" communications (when needed).

It is true that you need support for establishing a shared mapping. It
does not follow that MAP/UNMAP is the best design, or that creating this
kind of mapping should be conflated with revocability.

However: I agree that it is desirable to have an in-kernel mechanism
that allows two processes A and B to quickly establish a shared region.
Further, I agree that the flexpage notion (a poor name, since it has
nothing to do with pages) is useful here. What we wish to do is quickly
establish a shared mapping of 2^k units that are naturally aligned.
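
The useful property is simply that a region of size 2^k whose base is
a multiple of 2^k can be named by a (base, k) pair alone. A tiny
illustrative check (not L4 or Coyotos code):

    #include <stdint.h>
    #include <stdbool.h>

    struct region { uintptr_t base; unsigned k; };  /* covers 2^k bytes */

    /* "Naturally aligned": the low k bits of the base are zero. */
    static bool naturally_aligned(struct region r)
    {
        uintptr_t size = (uintptr_t)1 << r.k;
        return (r.base & (size - 1)) == 0;
    }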

This is one of the reasons that Coyotos is shifting from the Node data
structure to the PATT data structure. It will allow us to establish this
sharing by a single capability copy.

> > The last I heard, the L4sec group was considering dropping GRANT. In any
> > case, GRANT does not help. In order for grant to help, *all* transfers
> > would need to be done with GRANT, and there are now two problems:
> >
> >   1. It precludes two processes or threads ever holding the same
> >      capability at the same time.
> >   2. There is no way for the recipient to check whether this
> >      condition has actually been satisfied for the capabilities
> >      that it receives.
> >
> 
> Right.  As for 2., I imagine this is a problem because the chain of
> processes can be potentially untrusted, but the implementation of a
> capability can be checked?  I didn't think of it because (as stated in
> another discussion) we don't have (yet?) such things in the Hurd.

Yes. You understand exactly.

> I didn't know there was an "endpoint pager" (the L4sec spec isn't out
> yet).  I have never even thought about how it could be useful.

The term "pager" in L4 is a legacy from the days when the only things
that were mapped were pages. In their new system, the idea is
generalized to other mappable resources as well.

> > In practice, it would not be A' revoking the mapping either. The problem
> > is that A' ultimately got its storage from sigma1, and if sigma1 needs
> > to page something out it is going to need to revoke the mappings. This
> > revokes A', and consequently revokes A and B. The problem is that when B
> > now touches that object, it doesn't have enough information to get it
> > paged back in successfully.
> 
> But when sigma1 needs some memory, it could be arranged that it does
> not revoke an fpage containing endpoint information.

Please explain how this can be robustly achieved. Where should sigma1
store this information? Why should it believe someone who says "this
page is not revocable because it contains endpoints"? Isn't this just a
denial of resource opportunity?


shap




