
Re: [Chicken-users] Re: An alternative thread system?


From: Elf
Subject: Re: [Chicken-users] Re: An alternative thread system?
Date: Mon, 11 Aug 2008 15:19:18 -0700 (PDT)

On Tue, 12 Aug 2008, Aleksej Saushev wrote:

Elf <address@hidden> writes:

On Mon, 11 Aug 2008, Aleksej Saushev wrote:

<snip>
is shared access to memory, which you can easily avoid.
Pipes are not that simple actually, to pass some complex structure
through pipe, you need to pack it to some structure on one end,
parse and unpack on the other end (note all those XML/YAML encodings),
while with _some_ shared memory you could just pass the reference.

Again, the problem isn't in threads or shared memory, it is in
abstractions used. It is like modular programming vs. the
ancient non-modular one (with lots of shared variables &c).

Disagreed absolutely.  Threads can be incredibly complex to debug, even
ignoring shared memory issues.  Threads can get stuck in dependency loops
very quickly while waiting on I/O, even from possibly different
sources.

How does it differ from communicating processes in this respect?

Single-threaded processes can't get stuck in mutual dependency loops.

Oh, really? Create socket, start two processes, and wait on input on both sides.
Of course, the OS will save you, it magically knows how to reschedule them.

a) that's not a deadlock.  that's gino programming (garbage in, nothing out).
b) a realistic, feasible version of this (say, three threads, one writer and
   two readers) is trivial to avoid in the process model, but incredibly complex
   at best in the thread model.  plus, in the thread model, you'd have one
   fd repeated twice in a single select call, which is always entertaining.

The OS can reschedule them.  Single threaded processes can be
written in such a way to guarantee that whatever is being
waited for is the next thing that needs to be run.  Threads
generally cannot do either of these.  There is no general
purpose method of breaking deadlocks in pure thread models
that does not violate consistency of one or more of the deadlockers.

Have you ever used operating system with no MMU? Processes and
threads are the same in relevant parts here.  There's no
difference in behaviour of multiple processes and multiple
threads, in both cases you see deadlocks, races and contention.
All relevant mechanisms are just the same.

the first sentence made no sense.  processes all work in their own space,
with their own interrupt and error handling, with a completely deterministic
state.  threads share space but more importantly they share the timeslice
for the same process.  the order of the threads being run is indeterminate.
it is not possible to absolutely determine the control flow of the threaded
model as soon as more than one thread is active.

This is the biggest difference between threads and processes,
from a user viewpoint.  Processes only have to know about
themselves, and how they operate, to maintain
consistency/correctness.

That's not true. If processes operate cooperatively, they should
know about each other. E.g. SMTP and NNTP server and client, ssh
server and client, etc. If some SMTP goes wild, the other has to
drop connection after timeout. Database replication daemons are
even more dependent on mutual communication protocol.

no, they can follow a scripted protocol.  they don't need to know anything
about the other's actual internal state.  they don't need to know what else
it's doing, or if it's waiting, or if it can talk right then, because they
KNOW it can talk right then, because they've either received the message or
exceeded the timeout.  designing deterministic protocols is an entirely
separate issue and one you yelled at me about below.
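
A minimal sketch of the "received the message or exceeded the timeout" idea, in Python. The function name `await_reply`, the reply string, and the timeout values are made up for illustration and come from no real protocol implementation. The receiver needs no knowledge of the peer's internal state: each step ends in exactly one of three observable outcomes.

```python
import socket


def await_reply(sock: socket.socket, expected: bytes, timeout: float) -> str:
    """Wait for one scripted protocol step and classify the outcome.

    Returns "ok" if the expected bytes arrived, "timeout" if nothing
    arrived in time, and "protocol-error" for anything else.
    Assumes the reply is short enough to arrive in a single recv().
    """
    sock.settimeout(timeout)
    try:
        data = sock.recv(len(expected))
    except socket.timeout:
        # The peer said nothing in time; drop it or retry, as the script says.
        return "timeout"
    if data == expected:
        return "ok"
    # Wrong bytes (or a closed connection): the breakdown is detectable.
    return "protocol-error"
```

A caller would typically drop the connection on anything other than "ok", which is the behaviour described for a misbehaving SMTP peer earlier in the thread.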


Threads can get interrupted in all kinds of unpleasant
ways without being detectable as such.

Irrelevant again, so are processes.

No, they can't.  There isn't a per-thread signal handler

The original topic has nothing to do with signal handlers,
OS scheduler can just stop process and swap it off for a while,
you have rather limited control over it, unless you're doing
real-time job.

i said that threads can get interrupted in all kinds of unpleasant ways that
processes weren't susceptible to.  the original topic had to do with thread
model vs process model.  signal handling is an important aspect of this, as
thread-kill/thread-signal are signal-oriented and
thread-terminate/thread-suspend are condition-handler oriented, and the ORIGINAL
topic was thread-terminate.  i have no idea why you're talking about os
scheduler swapping, which is totally invisible to the process, whereas i was
saying that even which thread a given signal is applied to is indeterminate
in the current model, because there's only one handler for the entire process,
not one per thread.  do you actually read the entire thing i write, or just
half of the first sentence?


(at least not at the moment) in chicken, there is one signal
handler that accepts everything. Which signal may be aimed at
whom is for the signal sender to know and the scheduler to
guess.  (Do you feel lucky, punk? :) )

That's the limitation of particular virtual machine, not the
thread at all. This has nothing to do with original topic.


the original topic was threads vs process model for concurrency.


Conversely, processes each have their own signal handler and
any well-designed program installs the relevant cleanup handlers.

This is the limitation of particular process model, which has
nothing to do with original topic.

If you're the sort of person who uses a debugger (I'm not), to
the best of my knowledge no debuggers handle multithreaded
software particularly gracefully.

This is your personal taste and your personal problems, isn't it?

No.  I don't use debuggers.  The #1 problem that I hear from
people who DO have to use them constantly is that it's well-nigh
impossible to debug any non-trivial threaded program.

Again, this is irrelevant to differences between communicating
threads and processes. Actively communicating processes are as
hard to debug.

Flat out wrong on multiple counts.  Actively communicating processes are
trivial to debug if you either write a proper protocol or use one
of the gazillion already written.  PROTOCOLS ARE DETERMINISTIC.  this
forces the things speaking the protocol to act in a well-behaved way,
at least with each other, or the communication cannot continue.  Furthermore,
it is always detectable when the communication breaks down, because the
messages have to follow the format in a particular order.  Whereas with
thread communication, you don't even know if the same thread will be
listening at two sequential steps of a handshake.
I also said, as my first line, above, that i was focusing on non-shared-memory
aspects of the thread model being more fragile, which means everything
BESIDES direct communication, unless you're proposing that threads
only talk to each other through pipes/ports/sockets.



You don't need all the XML/YAML/Whatever-This-Months-Buzzword-Is.
It's pretty trivial to define a small, customised format for
messages in your application.  It's also easy to add sanity
checking to messages, so the receiver can detect if the sender
is in an inconsistent state (or if the message was, at very
least) and try to correct for it.  It's easy to add new
message capability to the system with a new process handling
it, because it's a lot more loosely coupled.  It's easier
to keep data formats small.
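
As a rough illustration of a small, customised message format with sanity checking, here is a sketch in Python. The layout (a 4-byte tag, a length, and a CRC32) and the names `MAGIC`, `pack_message`, and `unpack_message` are assumptions invented for this example; the post does not specify any particular format.

```python
import struct
import zlib

MAGIC = b"MSG1"  # hypothetical 4-byte tag identifying this format
HEADER = struct.Struct("!4sII")  # tag, payload length, CRC32 of payload


def pack_message(payload: bytes) -> bytes:
    """Frame a payload with the tag, its length, and a checksum."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def unpack_message(frame: bytes) -> bytes:
    """Return the payload, or raise ValueError if the frame is inconsistent."""
    if len(frame) < HEADER.size:
        raise ValueError("truncated header")
    magic, length, checksum = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:]
    if magic != MAGIC:
        raise ValueError("unknown message type")
    if len(payload) != length:
        raise ValueError("length mismatch")
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch")
    return payload
```

The receiver can treat any ValueError as a sign that the sender (or at least the message) is in an inconsistent state and react accordingly, for example by asking for a resend.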

No one argued against message-passing interface above. Really.

My point was only that it's much simpler to do message-passing
(4 calls needed at system level) than to do inter-thread
communication (some ungodly number, depending on how things
like mutexes are handled).
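
The post does not name the four calls. One plausible reading, assumed here, is pipe, fork, write, and read. The sketch below (Unix only, with a hypothetical function name `send_via_pipe`) passes one message from a child process to its parent using those four, plus close/waitpid housekeeping.

```python
import os


def send_via_pipe(message: bytes) -> bytes:
    """Pass a message from a child process to the parent over a pipe."""
    read_fd, write_fd = os.pipe()        # call 1: create the channel
    pid = os.fork()                      # call 2: create the second process
    if pid == 0:
        # Child: the writer.  It never returns into the caller's code.
        os.close(read_fd)
        os.write(write_fd, message)      # call 3: send
        os.close(write_fd)
        os._exit(0)
    # Parent: the reader.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)   # call 4: receive (empty means EOF)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)                   # reap the child
    return b"".join(chunks)
```

No mutex or condition variable appears anywhere: the kernel's pipe buffer does the synchronisation, and end-of-file tells the reader that the writer is done.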

You can pass messages with _no_ context switches at all.
Faster than processes.


except for grabbing the mutex, waiting for the mutex, saving the registers and
stack, loading the new thread's registers and stack, and cleaning up all the
continuation references because your stack position is probably different...

this is NOTHING like a context switch.  of course.


The overhead of interthread communication is implementation
dependent, and it is still faster than IPC due to no user-to-kernel,
kernel-to-user copy. With IPC you still have same locks, only in
kernel space.

You can use those very pipes to communicate between threads.

remember the first line above? even if we restrict ourselves to issues
besides shared memory? if you're already using pipes to talk between
all your threads, and you've written a scheduler to handle your threads,
and all of this... congratulations, you have miniprocesses.  you no longer
are talking about threads.


The potential for race conditions is greatly diminished if the
implementor was not a total idiot about security.  The flow of
the program is deterministic and verifiable.  This last condition
is not possible, to the best of my knowledge, in threading
systems, without having the scheduler be so restrictive and
have so much information about everything beforehand that it
essentially runs as a single-threaded program.

How does that differ for multiple processes?

do you know what a race condition is?



If multiple processes are spawned by the same program, they can be timed
to not conflict with each other.  If multiple threads are spawned by the
same program, there's no assurance that the creating thread won't lose its
timeslice before it should (and in fact almost always will, if you do sleep
calls), nor is there any guarantee as to which thread will be started.

Processes can't be timed, if some I/O interrupt happens,
scheduler may decide to stop the process. When it will be
resumed, you've lost time. There may be one common scheduler
for both, processes and threads. So, all may be equal before
the Scheduler.

timing means a lower bound on delay of some number.  an interrupt would
not affect this, even if handling it exceeded the sleep time.  it just means
that it would get started a little later, that one process will be idling
for slightly longer than it would be otherwise.  THREADS DON'T HAVE A
SCHEDULER THE WAY PROCESSES HAVE A SCHEDULER.



It seems that you've lost the point of the message you reply to:
given some restrictions to communication between threads, there's
almost no difference between processes and threads, semantically,
and at the same time threads are more flexible and more effective.

I was responding to it, merely disagreeing with it.  threads
are neither more flexible, more effective, safer, nor easier
to work with.

You failed to prove it.

And on the other hand we have enough empirical material, where
converting to threads raised overall performance.


that is what we in the business refer to as an 'unverified customer anecdote'.
whereas here are some concrete ones:
   have you noticed that computer stresstesting apps, where there's a certain
   desired threshold of traffic/signaling/io/whatever, are not written in a
   multithreaded way, but rather spawn processes?
      examples: SARA (vuln scanner), NESSUS (vuln scanner), dnsstress
   the same holds true for servers with high load:
      examples: sendmail, rbldnsd, apache
   we can look at other language implementations, all of which either have
   green threading and recommend processes, or have broken attempts at native.
   we can look at games... if you're running windows, look at the process log.

   we can look historically at why things like distributed message passing
   architectures were first explored, for the biggest, highest performance
   applications in supercomputing... (netlib.org is a good resource for this)

so, where have threads raised overall performance on real applications?
i believe for user-intensive ui elements, the event loop dispatcher is a good
use of threading.  i know from my own experiences writing HA test code that
threads slowed it down considerably as well as adding ridiculous complexity
(except in cases like perl, which is stupid about spawning).

i think the whole point here is that threads are not an elegant way of
handling concurrency.  you have made several unverified claims, cited no
references, attacked my points by calling them 'irrelevant' without responding,
and failed to raise a single point since the first post, which was also
unsubstantiated claims.  give me some numbers.  show me some code.  show me
some application, some bugtracker, some readme, some changelog, anything, where
people note an overall improvement in performance attributable to the threading.


-elf



