swarm-support

Re: Multithreading question


From: Marcus G. Daniels
Subject: Re: Multithreading question
Date: Wed, 29 Jan 2003 11:20:30 -0700
User-agent: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.3b) Gecko/20030117

> With Javaswarm we're stuck (the problem being with mosix anyway, not with multiprocessor pc's).
>
> I personally have an Objective-C application which would benefit a lot from going multithreaded (either on Mosix or on a multiprocessor single machine), being inherently parallel.

Java threads can be `native' threads on most systems. `Native' threads are threads that the kernel can schedule, and thus distribute in parallel to different CPUs. (Ultimately, the kernel gets to decide how to use the hardware.)

Native thread performance is limited by how much bookkeeping the kernel must do to track threads. In the past, on Linux, the bottleneck was that threads looked like processes, each carrying around its own bookkeeping data. So things tended to slow down when there were thousands of threads, as the kernel had to wade through a lot of irrelevant data. Now, with the new NPTL threads library (Native POSIX Thread Library) and the Linux 2.5 kernel, this problem is gone. On a big SMP system (say http://www.sgi.com/servers/altix), you should now be able to get scalable parallelism in multithreaded applications, and that includes multithreaded Java applications.

But MOSIX won't work for Java because it migrates jobs at the process level, not at the thread level. Nor will it help for multithreaded Objective-C applications, or any multithreaded programs for that matter. As soon as an application calls `clone' to create threads, all of those threads share one address space, and thus can't be migrated independently. However, systems like the SGI Altix are different because, at the hardware level, they migrate chunks of memory around (NUMAlink @ 3.2 GB/sec) and give the appearance of one big memory pool.
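
To make the shared-memory point concrete, here is a minimal sketch in Python (my choice of language for illustration only; the names `shared` and `worker` are hypothetical and nothing here comes from Swarm or MOSIX). Two threads of one process write into the very same list object. It only demonstrates sharing, not speedup, but it shows why a process-level migrator has nothing smaller than the whole process to move.

```python
import threading

shared = []  # one list object in the process's single address space

def worker(tag, count):
    # Every thread appends to the very same list; nothing is copied.
    for i in range(count):
        shared.append((tag, i))

threads = [threading.Thread(target=worker, args=(tag, 3)) for tag in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread sees the writes of both workers because memory is shared.
print(len(shared), "entries written by", sorted({tag for tag, _ in shared}))
```

Had the two workers been separate processes, each would have had its own copy of the list, which is exactly the property that lets a system like MOSIX move them independently.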

> It's about a genetic algorithm-based production planner: if you're into GAs you've probably seen the point already. The fact is many computationally heavy independent tasks (computing every candidate solution's fitness) can be run simultaneously on different CPUs. The idea of going multithread has been in my mind for a long time, and I expect to start working on it seriously as soon as I get some spare time.

I think what you really want is a client/server architecture that chunks fitness evaluations into big batches, and ships the results back in a compact way. You probably have much bigger populations than you do CPUs, and probably each individual is small. You'll never amortize the communication costs unless you limit the number of messages.
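
The batching idea might be sketched roughly as follows, again in Python purely for illustration. Everything named here is an assumption: `fitness` is a hypothetical stand-in for the real planner's evaluation, and `ThreadPoolExecutor` merely plays the role of the remote evaluation servers so the sketch runs on one machine. The point is the message count: one request and one compact reply per worker, rather than one per individual.

```python
from concurrent.futures import ThreadPoolExecutor

def fitness(individual):
    # Hypothetical stand-in for an expensive fitness function.
    return sum(x * x for x in individual)

def evaluate_batch(batch):
    # One request carries a whole batch; one reply carries a flat list of scores.
    return [fitness(ind) for ind in batch]

def split_into_batches(population, n_batches):
    # Contiguous, near-equal slices so replies can be concatenated in order.
    size, extra = divmod(len(population), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            batches.append(population[start:end])
        start = end
    return batches

def evaluate_population(population, n_workers):
    batches = split_into_batches(population, n_workers)
    # ThreadPoolExecutor is only a local stand-in for remote servers (assumption).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        replies = list(pool.map(evaluate_batch, batches))  # map keeps batch order
    scores = [s for reply in replies for s in reply]
    return scores, len(batches)

population = [[i, i + 1] for i in range(1000)]
scores, n_messages = evaluate_population(population, n_workers=4)
print(n_messages, "requests instead of", len(population))
```

With a real network in between, each request pays a fixed latency, so cutting the number of round trips from the population size down to the worker count is what lets the computation dominate the communication.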

Yes, it would be nice if Swarm would parallelize concurrent actions, but that would only be a benefit on multi-CPU boxes. There aren't many of those machines that are close to affordable. The Altix, for example, starts at $70,000 US.





