Re: HURDNG : Which type of OS design could we have to think nowadays ?

l4-hurd

From:	Marcus Brinkmann
Subject:	Re: HURDNG : Which type of OS design could we have to think nowadays ?
Date:	Fri, 11 Aug 2006 22:56:44 +0200
User-agent:	Wanderlust/2.14.0 (Africa) SEMI/1.14.6 (Maruoka) FLIM/1.14.7 (Sanjō) APEL/10.6 Emacs/21.4 (i486-pc-linux-gnu) MULE/5.0 (SAKAKI)

At Fri, 11 Aug 2006 22:12:40 +0200,
"Ernst Rohlicek jun." <address@hidden> wrote:
> As I see it, there are a few options:
> 
>   - Not saving caps and re-issuing caps every machine reboot and server 
> restart; processes are giving their children the caps they need.
>   - Reconstruct environment.
>   - Save all (if I remember correctly, EROS was doing this continuously 
> in the background).
> 
> The first point is the safest and easiest. You never restore a 
> check-pointed process back into the running system with the system 
> having changed state in a way which negatively impacts the program.

Obviously, this is also the least useful: You lose all your state.
Basically, at every reboot, you start with a fresh installation of the
operating system.  No users, no accessible files, no configuration,
etc.

If you want to "re-issue" anything, you need to remember what to
re-issue to whom.  But that is exactly the state we are talking about.

> If you re-create the same environment security-wise and start it, the 
> program should run the same way. The environment is the POSIX-compatible 
> environment (env vars, working directory, user:group) and the 
> Hurd-specific environment (capabilities / ports, grants / memory maps).

It is true that if you can reestablish the environment of a process,
you can reinstantiate it from scratch.  There are two issues: First,
how do you do this in the first place (orthogonal persistence comes to
mind).  Second, how do we know that the program's state is consistent
with the rest of the system?  This is of course exactly what you meant
by "having changed state in a way which negatively impacts the
program".

> I'm not sure as to what is the best place to save these things to (a 
> protected configuration file? into the filesystem inode?) though this 
> should be sufficient to run the program with the same privileges and 
> avoid the Confused Deputy Problem.

In UNIX, there is only one place to store this state: The filesystem.
A lot of the complexity in the UNIX kernel and applications is related
to difficulties in ensuring consistent file system updates.

In the Hurd, the filesystem is still used.  The user ID in the file
system is what remembers which user has access to which files.  The
"user ID object" is passed to a user in exchange for a password.  This
means that underlying the capability system is an ACL based protection
scheme (with all the problems that entails).

To facilitate reconstruction of server processes, the Hurd has the
"passive translator" filesystem extension, which stores a command
line string used to restart the service.  This approach has led to
subtle bugs: Because only part of the environment is saved, the newly
instantiated version of the server may run in a different environment
than the original server, and in fact that environment may include
capabilities that the installer of the server did not have, thereby
violating a protection boundary.

In the context of Hurd translators, this could be fixed, but the
mechanisms of the fix do not generalize to arbitrary objects and
services.
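
To make the failure mode concrete, here is a toy model in Python.  It
is only a sketch, not Hurd code: the classes, the capability strings
and the path /hurd/example-translator are all hypothetical, and the
assumption that the restarted server simply runs in the environment of
whoever starts it is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Hypothetical stand-in for the set of capabilities a process holds."""
    caps: frozenset


@dataclass(frozen=True)
class PassiveTranslator:
    """The record stored on the node: a command line and nothing else."""
    argv: tuple


def install(installer_env: Environment, argv: tuple) -> PassiveTranslator:
    # The installer's environment is not recorded.  This is exactly the
    # information that the stored command line fails to capture.
    return PassiveTranslator(argv=argv)


def start(record: PassiveTranslator, starter_env: Environment) -> Environment:
    # Simplifying assumption: the restarted server runs with whatever
    # environment the starting party happens to have.
    return starter_env


installer = Environment(frozenset({"read:/home/alice"}))
starter = Environment(frozenset({"read:/home/alice", "write:/etc"}))

record = install(installer, ("/hurd/example-translator",))
server_env = start(record, starter)

# Capabilities the server now holds that its installer never had.
leaked = server_env.caps - installer.caps
```

In this model `leaked` is non-empty whenever the starter holds more
than the installer did, which is the protection boundary violation
described above.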

> The third has a performance impact; though not as big as it is 
> subjectively assumed in the first moment, it is still there and I'm 
> still unsure what the advantages are...

@InProceedings{Shap02,
  author = {Jonathan S. Shapiro and Jonathan Adams},
  title = {Design Evolution of the {EROS} Single-Level Store},
  booktitle = {Proceedings of the 2002 USENIX Technical Conference},
  publisher = {USENIX Association},
  month = {October},
  year = {2002},
  pages = {2--20},
  abstract = {Explains the design of the EROS single-level store.},
  keyword = {EROS, Persistence, Performance},
  url = {http://www.eros-os.org/papers/storedesign2002.pdf}
}

> Anyway, you will not be able to restore the same state as before 
> check-pointing because in case of a crash the underlying hardware was 
> reset and all remote connections were cut off. As soon as you have 
> non-local dependencies, processes & kernel & drivers will (1) not be in 
> the same state as when they were check-pointed because they get back 
> errors because hardware is not in the same state as before 
> check-pointing and (2) because network connections and especially the 
> remote parties will not be there or in the same protocol-wise state as 
> before.

That's well known, and in fact it is usually desired.  Distributed
persistent systems are possible, but the concept doesn't scale: A
crash of a single machine would cause a "time jump" of all machines
back to the last checkpoint.
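
The "time jump" can be illustrated with a small Python model.  This is
a sketch under strong assumptions: the `Cluster` class is hypothetical,
progress is reduced to a counter per machine, and the checkpoint is
assumed to be taken atomically across all machines.

```python
class Cluster:
    """Toy model of machines sharing one globally consistent checkpoint."""

    def __init__(self, names):
        # Logical progress of each machine since system start.
        self.progress = {name: 0 for name in names}
        self.checkpointed = dict(self.progress)

    def work(self, name, steps=1):
        self.progress[name] += steps

    def checkpoint(self):
        # A global checkpoint records every machine at the same instant.
        self.checkpointed = dict(self.progress)

    def crash(self, name):
        # `name` has lost its volatile state.  To stay consistent with
        # it, every other machine is rolled back as well.
        self.progress = dict(self.checkpointed)


cluster = Cluster(["a", "b", "c"])
cluster.work("a", 3)
cluster.checkpoint()
cluster.work("b", 5)   # b makes progress after the checkpoint
cluster.crash("c")     # an unrelated machine fails
```

After the crash of "c", machine "b" has lost the five steps it did
since the checkpoint, although it never failed itself.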

In practice, the machine boundary is well understood by programmers of
networked applications, and programs are written to deal with these
issues.  They have to be in any case: There are other reasons for
intermittent communication failures over the network as well.

> In the best case, processes might be able to recover from the state 
> change, but I'm skeptical about that.

That's what they already do.

> As a solution, you can just save process images of selected processes, 
> ie. no Hurd servers, kernel and drivers, just the applications and store 
> what caps they had, what files they had open ...

There is no general way to do that correctly.  Any solution like this
will need to have a lot of semantic knowledge about the objects
involved.  As you said:

> For Linux, this exists, though it is very limited (works only for daemon 
> processes as they say):
> 
>       http://www.openmosixview.com/chpox/
>       -> http://www.cluster.kiev.ua/tasks/chpx_eng.html
>       (also with a paper about it)
>       -> http://www.checkpointing.org/

In this case, it is possible, because the Linux kernel only supports a
limited number of different object types in the environment of a
process.
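
A sketch of why the number of object types matters, in Python.  The
registry and the type names are hypothetical; the only point is that a
selective checkpointer needs one save routine per object type, and has
to give up on anything it does not know.

```python
import io

# Hypothetical registry: one save routine per object type that the
# checkpointer has semantic knowledge about.
SAVERS = {
    int: lambda obj: ("int", obj),
    str: lambda obj: ("str", obj),
    io.StringIO: lambda obj: ("stringio", obj.getvalue(), obj.tell()),
}


class UnknownObjectType(Exception):
    """Raised for an object the checkpointer knows nothing about."""


def checkpoint(environment: dict) -> dict:
    """Save every object in a process environment, or fail."""
    saved = {}
    for name, obj in environment.items():
        saver = SAVERS.get(type(obj))
        if saver is None:
            # No semantic knowledge about this type: there is no
            # correct generic way to save it.
            raise UnknownObjectType(f"{name}: {type(obj).__name__}")
        saved[name] = saver(obj)
    return saved


snapshot = checkpoint({"cwd": "/tmp", "fd0": io.StringIO("abc")})
```

With a fixed set of kernel object types, such a table can be complete.
Once users can add object types of their own, every new type needs a
new entry before a process holding it can be checkpointed.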

I agree with you that there are basically two options: selective
persistence, or global persistence.  Both have a number of advantages
and disadvantages.  It's difficult for me to see how selective
persistence could also be user extensible (one of the Hurd's design
goals is maximizing user freedom).

Thanks,
Marcus
