Re: Replacement string for process number


From: Ole Tange
Subject: Re: Replacement string for process number
Date: Wed, 12 Jan 2011 17:58:22 +0100

On Wed, Jan 12, 2011 at 4:49 PM, Jay Hacker <jayqhacker@gmail.com> wrote:
> On 12/23/10, Ole Tange <ole@tange.dk> wrote:
>> On Wed, Dec 22, 2010 at 3:51 PM, Jay Hacker <jayqhacker@gmail.com> wrote:
>>> I'd like to be able to use the number of a process in a GNU parallel
>>> command.
:
>> GNU Parallel cannot do that at the moment.
>>
>> $PARALLEL_PID and $PARALLEL_SEQ are a bit similar to this.
:
>> parallel printf '%02d\\t%s\\n' \$\(\(\$PARALLEL_SEQ%16\)\) :::
>> ~/files/*.txt | parallel --colsep '\t' -P16 "cat {2} >>
>> output-file{1}.txt"
>
> It seems possible that, say, process 0 gets input 0, process 1 gets
> input 16 (because 1-15 finished quickly), and they both write to
> output file 0 at the same time, clobbering the output.  That's what
> I'm trying to avoid.  I want a number that only gets used by one
> process at a time.

That is a valid point.
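
To make the collision concrete, here is a minimal sketch in plain shell (no GNU Parallel needed). The helper name bucket is made up for illustration; it only mirrors the $PARALLEL_SEQ%16 arithmetic from the quoted pipeline. Two sequence numbers that are sixteen apart map to the same output file, and nothing stops both jobs from running at the same time if the jobs in between finish quickly.

```shell
# Hypothetical helper: the output bucket that the quoted pipeline
# derives from a job sequence number (seq modulo 16).
bucket() {
    echo $(( $1 % 16 ))
}

# Sequence numbers 1 and 17 are sixteen apart, so they share a bucket.
for seq in 1 17; do
    printf 'job %d -> output-file%02d.txt\n' "$seq" "$(bucket "$seq")"
done
```

Both lines name the same output file, which is exactly the case where two concurrent appends could interleave.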

I have therefore looked at the code to estimate how hard this would be
to implement. If a jobslot number already existed internally, it
would be relatively easy to export it as an environment variable.
Unfortunately, no jobslot number exists in the code, and adding one
would require extensive changes.

>> {n} seems to be what $PARALLEL_SEQ is today.
>
> I had not seen $PARALLEL_SEQ before.  Maybe environment variables
> would be the way to go for what I was calling {p} and {P} as well.

Environment variables would be easier to implement, but that is not
the major hurdle. The major hurdle is that the concept of a jobslot
(and thus a jobslot number) is not used in the code.
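
Until something like that exists, the job itself could claim a number that only one process holds at a time. The following is only a sketch under assumptions, not something GNU Parallel provides: SLOT_DIR, claim_slot and release_slot are hypothetical names, and the approach relies on mkdir being atomic, so that only one process can create a given slot directory. SLOT_DIR must point to a directory shared by all jobs.

```shell
# Hypothetical sketch: emulate a jobslot number with lock directories.
# SLOT_DIR must be the same directory for every job that shares the slots.
SLOT_DIR="${SLOT_DIR:-${TMPDIR:-/tmp}/jobslots}"
mkdir -p "$SLOT_DIR"

# Print the lowest free slot number in 1..$1, waiting if all are taken.
claim_slot() {
    max=$1
    while :; do
        i=1
        while [ "$i" -le "$max" ]; do
            # mkdir fails if the directory already exists, so at most
            # one process can win each slot.
            if mkdir "$SLOT_DIR/slot.$i" 2>/dev/null; then
                echo "$i"
                return 0
            fi
            i=$((i + 1))
        done
        sleep 1   # all slots busy: wait and retry
    done
}

# Give the slot number back so another job can claim it.
release_slot() {
    rmdir "$SLOT_DIR/slot.$1"
}
```

A job would then run something like slot=$(claim_slot 16), append to output-file$slot.txt, and call release_slot "$slot" afterwards. A job that dies before releasing leaves its slot locked, so this is a workaround with rough edges rather than a replacement for a real jobslot number.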

>> Also have a look at https://savannah.gnu.org/bugs/?31678. It is a
>> feature that would solve your two examples provided that the number of
>> arguments fit a single line (because scp and cat can take more than
>> one argument).
>
> But this only works for the case where the command accepts multiple
> arguments. :(

Ahh, but any command _does_ accept multiple arguments (at least when
you wrap it with parallel):

cat list | parallel -X -j+0 -I {outer} --argsep dummy parallel -j1 \
  singlearg_cmd {} \$PARALLEL_SEQ ::: {outer}

Currently I do not feel it is worth the effort to implement {p}. If
someone makes a patch that does not break other functionality, I would
be willing to look at it again. Until then, I encourage you to post
the real-world problems that {p} would solve. Maybe we can
help you find another way to do it.


/Ole


