l4-hurd
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: On hierarchy


From: Jonathan S. Shapiro
Subject: Re: On hierarchy
Date: Fri, 21 Oct 2005 16:26:06 -0400

On Fri, 2005-10-21 at 20:22 +0200, Udo A. Steinberg wrote:
> On Fri, 21 Oct 2005 10:43:33 -0400 Jonathan S. Shapiro (JSS) wrote:

> JSS> 1. System calls are only preemptable between well-defined "units of
> JSS>    operation".
> 
> In order to use compatible terminology, please write "preemptible" ;)

Happily. My splling is terribull. Actually, I believe that the correct
spelling in this case is with an "a".

> JSS> 2. Because they are non-preemptable, units of operation must complete in
> JSS>    a small *constant* number of steps.
> JSS> 
> JSS>   => In my *opinion*, the L4 string bound is too large to satisfy
> JSS>      this requirement, but this is definitely a matter of opinion
> JSS>      and preference. In effect, I want a smaller constant bound.
> 
> Note that there exist at least one fully preemptible L4 microkernel, namely
> Fiasco, where your concern is a non-issue.

Yes. I was aware of this. There are two cases in the API that I do not
understand how to implement correctly if operations are preemptable. Can
you explain them:

1. Two threads exist in a task. One performs an unmap from a location
while the other (simultaneously) executes a loop performing a map. Is
the unmap guaranteed to terminate in any predictable amount of time?

[You may have already answered this below]

2. One thread performs a long unmap that is preempted after making
partial progress. Because it is preempted, the scheduler permits other
threads to run. In particular, a debugger thread performs a "stop"
operation on the thread that is performing the UNMAP. I do not see any
outcome in which this is possible within the requirements I have stated
without leaving the system overall in an inconsistent state.

> JSS> 4. Locks may not be held across preemptions. This is necessary to avoid
> JSS> priority inversion.
> 
> Not quite. Priority inversion can also be avoided by using helping mechanisms
> such that higher-priority threads help a low-priority lock holder out of its
> critical section. There's a whole set of formulas for such things in common
> real-time literature.

Unfortunately, none of these mechanisms work in the general case -- a
discussion we should take up separately. The problem is that there is no
way in general to know which thread to escalate if the IPC mechanism is
general-purpose. Both the L4 and EROS IPC mechanisms are general
purpose.

Setting this aside, long-held locks also create opportunities for kernel
deadlock that we would prefer to avoid, and they make analysis of kernel
correctness about two orders of magnitude harder. I don't know if formal
analysis is an active objective for L4.sec, but for Coyotos this is
definitely a concern.

> JSS> 2. The UNMAP operation requires an "all children" visit. The question
> JSS> is: can this operation be divided successfully into constant-bounded
> JSS> units of operation? The answer appears to me to be "no". 
> 
> You can descend down the mapping tree starting at the root node, marking each
> node "unmapped" as you come across it. Such marked nodes don't allow creation
> of any subnodes underneath them, which ensures the mapping tree does not grow
> in such branches. For each leaf node reached, you remove the node, remove the
> PTE and flush the relevant TLB entry. The per-node time overhead is bounded
> and you can be preempted in between such node operations.
> 
> Granted, unmap operations can take a long time to complete for the unmapper,
> depending on the size of the mapping tree for the unmapped page frame.

If I understand you, this is the case where the unmap is a total unmap:
the PTE's are going to invalid. I had in mind a different algorithm, but
I agree that yours works. Also, I *think* that your solution resolves my
"map during unmap" question.

The case that I do not currently understand how to do is the *partial*
unmap -- e.g. taking away write permission while leaving read
permission. For this case, I think you will need to add some form of
index to keep track of progress. Is there a way around this?

Do the performance measurements that have been published for the map
operation include this locking, or are they measuring an incorrect
implementation?

Finally this strategy violates the long-held locks requirement. I
recognize that this requirement needs further discussion, and that other
design rules may be feasible here.

shap





reply via email to

[Prev in Thread] Current Thread [Next in Thread]