Re: Questions
From: Espen Skoglund
Subject: Re: Questions
Date: Mon, 27 Oct 2003 14:40:27 +0100

[Martin Schaffner]
> I have two questions:
> 1) Performance:
> On 06/04/2001, "sjames" wrote on slashdot
> (http://slashdot.org/comments.pl?sid=11531&cid=310196):
>> The biggest problem for microkernels is that they have to switch
>> contexts far more frequently than a monokernel (in general).
>>
>> For a simple example, a user app making a single system call. In a
>> monokernel, the call is made. The process transitions from the user
>> to the kernel ring (ring 3 to ring 0 for x86). The kernel copies any
>> data and parameters (other than what would fit into registers) from
>> userspace, handles the call, possibly copies results back to
>> userspace, and transitions back to ring 3 (returns).
>>
>> In a microkernel, the app makes a call, switch to ring 0, copy
>> data, change contexts, copy data to daemon, transition to ring3 in
>> server daemon's context, server daemon handles call, transitions to
>> ring 0, data copied to kernelspace, change contexts back to user
>> process, copy results into user space, transition back to ring3
>> (return).
>>
In short, the guy is saying that in a microkernel the single
user-to-kernel switch for syscalls is translated into an RPC from
user-task to kernel-task (I don't understand why everyone has to get
all tangled up in this ring-terminology). And yes, this will incur
some overhead. You should take the data copying part of his argument
with a grain of salt, though, since a) there might be no arguments
that need copying (e.g., they may reside in registers), and b)
arguments may be copied directly from user-task to kernel-task without
using a temporary kernel buffer.
> The app has previously acquired a capability for the file it wants to
> read, and allocated a buffer. When it calls "read", the glibc
> function makes an RPC directly to the filesystem translator (no
> going to ring 0, assuming the translator is owned by the same user
> as the calling process),
This will still need to enter kernel-level unless the translator is
within the same address space as the application.
> which in turn RPCs the driver of the backing store (will probably
> reside in ring 0) for the data.
Except that the driver will probably *not* reside in kernel-land.
> Assuming zero-copy, there is a minimum of data copying, but there
> are still four context switches, which can't be done with super-fast
> IPCs, since they concern three different processes (app, translator,
> driver).
I'm by no means a Hurd expert, but I suspect that the intermediate IPC
(i.e., the one to the translator) can probably be circumvented for
common case operations (i.e., read/write).
> Is it likely that l4/hurd will be slower than linux, for things like
> filesystem operations?
The major overhead of filesystem operations tends to be with the
hardware itself. The overhead of IPC will of course still be present.
However, considering a 1GHz processor, an IPC time of 200 cycles, and
a syscall every 1 millisecond, the pure overhead of the syscall RPC
(two IPCs: request and reply) will amount to 0.04%. Other factors,
such as cache working sets and TLB flushes due to untagged TLBs, will
then have a higher impact.
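
For anyone who wants to redo the arithmetic, here is a rough sketch in
Python. It assumes that one syscall RPC costs two IPCs (request and
reply); the function name and parameters are made up for illustration
and are not part of any L4 or Hurd interface. With the numbers above,
0.04% corresponds to about one syscall per millisecond; at one syscall
per microsecond the same inputs give 40%.

```python
def rpc_overhead(clock_hz, ipc_cycles, syscalls_per_second, ipcs_per_rpc=2):
    """Fraction of CPU cycles spent purely on IPC for syscall RPCs.

    Assumption (not from the original post): each RPC is modelled as
    `ipcs_per_rpc` IPCs of `ipc_cycles` cycles each, and nothing else
    (cache or TLB effects are ignored).
    """
    ipc_cycles_per_second = ipc_cycles * ipcs_per_rpc * syscalls_per_second
    return ipc_cycles_per_second / clock_hz


if __name__ == "__main__":
    clock_hz = 1_000_000_000  # 1GHz processor
    ipc_cycles = 200          # cost of a single IPC
    # One syscall per millisecond, then one per microsecond.
    for rate in (1_000, 1_000_000):
        frac = rpc_overhead(clock_hz, ipc_cycles, rate)
        print(f"{rate} syscalls/s: {frac:.2%}")
```

The model deliberately leaves out cache working sets and TLB flushes,
so it only gives a lower bound on the cost of going through IPC.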
eSk