
Re: job race!


From: Ole Tange
Subject: Re: job race!
Date: Thu, 25 Apr 2013 13:19:09 +0200

On Wed, Apr 24, 2013 at 9:25 PM, Ozgur Akgun <ozgurakgun@gmail.com> wrote:

> I want to be able to say something like `parallel --timeout (fastest * 2)`
> and get the same output.

I have been pondering if I could somehow make a '--timeout 5%'. It should:

1. Run the first 3 jobs to completion (no --timeout)
2. Compute the average and standard deviation for all completed jobs
3. Adjust --timeout based on the new average, standard deviation and user input
4. Go to 2 until all jobs are finished
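A minimal sketch of those four steps in Python. The function name, the warm-up of 3 jobs, and z = 1.645 (the one-sided 95% point of the normal distribution, i.e. a hypothetical '--timeout 5%') are my own illustrative choices, not parallel's actual code:

```python
import math

def run_with_adaptive_timeout(job_times, z=1.645, warmup=3):
    """Simulate the 4-step scheme: run the first `warmup` jobs with no
    timeout, then before each further job recompute
    timeout = mean + z * stddev over the jobs completed so far.
    Returns the indices of the jobs that would have been killed."""
    completed = []
    killed = []
    for i, t in enumerate(job_times):
        if i < warmup:                      # step 1: no --timeout yet
            completed.append(t)
            continue
        n = len(completed)                  # step 2: average + stddev
        mean = sum(completed) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in completed) / n)
        timeout = mean + z * sd             # step 3: adjust --timeout
        if t > timeout:
            killed.append(i)                # job runs past the timeout
        else:
            completed.append(t)             # step 4: loop until done
    return killed
```

For example, with run times [10, 11, 9, 10, 100] only the last job exceeds the adaptive timeout and would be killed.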

The user input would be a percentage, e.g. 5% - meaning "I want the job
killed if it takes longer to run than the fastest 95% of jobs". We can
compute that limit statistically if we assume that the run times of the
jobs are normally distributed (the bell curve
https://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg) and
that the run time of a job does not depend on its position in the
queue (e.g. it will not work if all the fast jobs run first).
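Under the normality assumption, the limit is just the inverse CDF of the fitted normal at the chosen percentile. A sketch using Python's statistics.NormalDist (the function name and signature are mine):

```python
from statistics import NormalDist

def timeout_cutoff(mean, stddev, kill_fraction=0.05):
    """Cutoff above which the slowest `kill_fraction` of jobs are
    expected to fall, assuming run times ~ Normal(mean, stddev)."""
    return NormalDist(mean, stddev).inv_cdf(1 - kill_fraction)
```

With mean 10 and stddev 2, a 5% kill fraction gives a cutoff of roughly 13.29 (mean + 1.645 stddev).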

I am not sure whether job run times generally are normally
distributed or closer to a chi-square or some other continuous
distribution, but in this case it probably does not matter, because
the percentage of jobs that people want timed out will always be <
30%. If you have some insight into this, please speak up.

With the above, '--timeout 5%' will normally kill 5% of the jobs - even
if they are not "bad" - and that might be less useful than a
percentage of the median run time:

  --timeout 200%

which would kill any job taking more than twice as long as the median
run time (using the remedian to compute the median in finite memory).
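A sketch of the remedian (Rousseeuw and Bassett) in Python: keep small buffers; whenever a buffer fills, push its median into the next buffer, so memory stays bounded however long the stream is. The finishing rule here - a plain, unweighted median over the leftover buffer contents - is a simplification of the published algorithm:

```python
def remedian(stream, b=11):
    """Approximate the median of a stream using O(b * log n) memory."""
    def median(xs):
        s = sorted(xs)
        n = len(s)
        return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

    buffers = [[]]
    for x in stream:
        buffers[0].append(x)
        i = 0
        while len(buffers[i]) == b:         # buffer full: collapse it
            m = median(buffers[i])
            buffers[i] = []
            if i + 1 == len(buffers):
                buffers.append([])
            buffers[i + 1].append(m)        # push median one level up
            i += 1
    # simplified finishing rule: median of everything left over
    leftovers = [m for buf in buffers for m in buf]
    return median(leftovers)
```

When the stream length is an exact power of b the estimate is quite good: for the values 1..121 with b = 11 it returns 61, the true median.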

I do not think looking at the fastest jobs is a good indicator: You
can have an odd job that is extremely fast while the median is much
slower.


/Ole


