From: Reuti
Subject: Re: [help-gnubatch] ( [PATCH] Bump version in debian/changelog to 1.4 also ) + newbie question
Date: Thu, 7 Jun 2012 23:58:51 +0200
On 06.06.2012 at 14:46, Peter Valdemar Mørch wrote:
> Thanks for replying!
>
> On Wed, Jun 6, 2012 at 12:52 PM, Reuti <address@hidden> wrote:
>> You could put a sleep with the necessary amount of seconds in the jobscript
>> and start it at the full minute before, so that you can start it at any
>> second.
>
> :-( I was afraid of that. That also means, I take it, that all the
> details of the scheduling - when to run what - are left up to me/us.
Well, with GNUbatch you can specify the absolute start time to the minute, but
not with GridEngine, and I think not with Torque either.
This could be worked around with a minimum start time for the job and an
unlimited number of slots, though. Then it's guaranteed: when the minimum start
time is reached, the job will start (limited by the number of available slots
defined on this machine). I think this is what you are looking for: don't start
1200 jobs at once, but only 50 or whatever at a time.
>> Is it to run in a cluster of machines or only on a local one? To me it
>> sounds like you need some features from GridEngine (like the accounting
>> which you can access later), and on the other hand some kind of
>> "real-time-feature" from GNUbatch.
>
> No, it is to run all 1200 mostly tiny, light-weight jobs on one
> machine, just like nagios does. Typically ping, check_tcp which
> basically only tries to connect to a TCP port. But could be anything.
>
> Didn't know about GridEngine, will look into that. I'm just getting
> worried that the scheduler's overhead is much, much higher than the
> jobs' with this many tiny jobs!
Yes, GridEngine is geared more toward scheduling by certain criteria, such as
past usage and backfilling, i.e. classic batch processing. But it keeps
accounting records for each job, which can be checked after the job has
finished.
> What we're currently doing is starting a process every 15 min, that
> runs N child processes at a time in parallel until all 1200 are done.
> The consequence of that is we get high-ish load around 0:15, 0:30,
> 0:45 etc. and would love to use "a scheduler" to smooth out that load,
> and hopefully use existing tool infrastructure to get more debugging
> insight (execution times, output etc.) and runtime control.
I think GridEngine is one size too large for this task, and you can achieve
similar scheduling with GNUbatch by using one variable, which you preset
beforehand with the number of jobs allowed to run at once, and giving each job
a condition:
$ gbch-var -C -s 5 master
$ gbch-r -c "master>0" -s "master-=1" test.sh
As -s will undo the assignment it made at the start of the job once the job
ends, "master" will always reflect the number of free slots (you could also do
it the other way round: start with 0, test for -c "master<5" and increment the
variable).
But AFAIK there is no accounting of used memory or runtime for each job in
GNUbatch. What information would you need from the last runs?
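To make the slot counting with the "master" variable more concrete, here is a
small toy model in plain shell. It is not GNUbatch code and does not call any
gbch-* command; try_start and finish are made-up helper names that only mimic
the condition ("master>0"), the assignment at job start ("master-=1") and the
undo at job end, assuming five slots as in the example above.

```shell
#!/bin/sh
# Toy model of the "master" slot counter. This is NOT GNUbatch code;
# try_start and finish are hypothetical helpers for illustration only.

master=5   # preset number of free slots

# A job may start only while master > 0; starting it takes one slot.
try_start() {
    if [ "$master" -gt 0 ]; then
        master=$((master - 1))
        return 0
    fi
    return 1   # condition not met: the job would stay queued
}

# When a job ends, the assignment made at its start is undone.
finish() {
    master=$((master + 1))
}

# Offer seven jobs to five slots and count how many could start.
started=0
for job in 1 2 3 4 5 6 7; do
    if try_start; then
        started=$((started + 1))
    fi
done
echo "started=$started free=$master"

finish   # one job ends and hands its slot back
echo "free after one job finished: $master"
```

Only five of the seven jobs get a slot; the other two have to wait until a
running job ends and its slot is handed back.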
>> Do you have more machines than jobs, i.e. all 1200 jobs should run at the
>> same time on a bunch of machines?
>
> One machine. Cool if it lends itself to more machines, but this is
> handled fine by one machine with our brute force N parallel approach.
> The trick is picking N since execution times vary. And then I thought:
> "Somebody must have tackled this previously and created an open source
> project!" :-)
Setting up GridEngine on only one machine is also possible (we use it to
serialize our workflow on local machines and use them over the weekend for
computations), but I fear it needs more knowledge to get started with than
GNUbatch. And: there is no built-in repeating mechanism.
As a consequence you would need a cron job that submits the jobs every 15
minutes, some minutes before they are due to start.
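A rough sketch of such a cron-driven submit script could look like this. The
script name, the paths and the SUBMIT override are assumptions for
illustration; the only gbch-r options used are the -c and -s from the example
further up. The option for setting the job's start time is left out on purpose
rather than guessed, so check the gbch-r manual for it.

```shell
#!/bin/sh
# Sketch of a submit script to be called from cron; names and paths are
# hypothetical. Example crontab line, 5 minutes before each quarter hour:
#   10,25,40,55 * * * * /usr/local/bin/submit_checks.sh /etc/checks

# SUBMIT defaults to gbch-r; set SUBMIT=echo for a dry run.
SUBMIT=${SUBMIT:-gbch-r}

submit_all() {
    dir=$1
    count=0
    for job in "$dir"/*.sh; do
        [ -e "$job" ] || continue   # skip if the glob matched nothing
        # Same condition and assignment as in the gbch-r example above.
        $SUBMIT -c "master>0" -s "master-=1" "$job" || return 1
        count=$((count + 1))
    done
    echo "submitted $count jobs"
}

# Submit only when a job directory is given as an argument.
if [ $# -ge 1 ]; then
    submit_all "$1"
fi
```

Running it once by hand with SUBMIT=echo prints the commands instead of
submitting anything, which is handy before putting it into the crontab.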
-- Reuti