
Re: parallel cat


From: Ole Tange
Subject: Re: parallel cat
Date: Sun, 17 Jul 2011 16:08:25 +0200

On Fri, Jul 15, 2011 at 8:39 PM, Dan Kokron <daniel.kokron@nasa.gov> wrote:

> I have a bunch (~200) small (1K to 100K) binary files that I want to
> 'cat' into a larger file.  I usually use "cat pe* > diag", but this
> takes considerable time on the Lustre file system we are using.  I am
> exploring using GNU parallel for this task but have run into some
> difficulties.  Basically the resulting diag file only contains one of
> the input files.
>
> I've tried the following variations.
>
> parallel "cat {} >diag_amsua_n18_03.2011041700" ::: pe*
> parallel cat {} ">"diag_amsua_n18_03.2011041700 ::: pe*
> ls pe* | parallel cat {} ">"diag_amsua_n18_03.2011041700
> ls pe* | parallel -j4 -k cat {} ">"diag_amsua_n18_03.2011041700
> ls pe* | parallel -k cat {} ">"diag_amsua_n18_03.2011041700
> parallel -j4 -k "cat {} >diag_amsua_n18_03.2011041700" ::: pe*

You are _so_ close.

parallel cat >diag_all ::: pe*

UNIX users will probably find this form more readable (it does
exactly the same):

parallel cat ::: pe* >diag_all

Or, if you want the output kept in the same order as the input:

parallel -k cat ::: pe* >diag_all
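
The difference from the quoted attempts is where the redirection
happens. When ">file" is part of the command given to each job, every
cat opens and truncates the file again, so only one input survives.
When the redirection is applied once around the whole run, the file is
opened once and all output goes into it. A minimal sketch of that
effect, using plain sh loops instead of GNU Parallel so it runs
anywhere; the file names (pe1..pe3, diag_per_job, diag_once) are made
up for the demo, and a serial loop always keeps the last file whereas
parallel jobs would race:

```shell
#!/bin/sh
# Demo of per-job redirection versus one redirection for the whole run.
# All names below are hypothetical and live in a throwaway directory.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1

printf 'aaa\n' >pe1
printf 'bbb\n' >pe2
printf 'ccc\n' >pe3

# Redirection inside each job: every cat reopens and truncates the
# output file, so only one input is left at the end.
for f in pe*; do
    cat "$f" >diag_per_job
done

# Redirection around the whole run: the output file is opened once
# and every cat writes to the same stream.
for f in pe*; do
    cat "$f"
done >diag_once

# Show the resulting sizes of both files.
wc -c diag_per_job diag_once
```

The second loop has the same shape as "parallel cat ::: pe* >diag_all":
the shell opens diag_all once and hands it to GNU Parallel as stdout.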

I have no experience with Lustre, but I would imagine that Lustre is
slow at getting the first byte and pretty fast after that, and that
the slowness comes from waiting rather than from doing work. If that
is the case, it will be OK to run a lot of cats simultaneously:

parallel -j0 cat ::: pe* >diag_all
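
Without -k the files may land in diag_all in a different order than
"cat pe*" would give, so a byte-for-byte comparison is not a fair
check, but the total size should still match. A small sketch of such a
check, assuming a POSIX sh; check_total is a hypothetical helper, not
part of GNU Parallel, and it reads all inputs once more, so it is
meant as a one-off sanity check rather than part of the normal run:

```shell
#!/bin/sh
# check_total OUTPUT INPUT...
# Hypothetical helper: succeeds if OUTPUT has exactly as many bytes
# as all INPUT files together.
check_total() {
    out=$1
    shift
    # $(( ... )) strips the padding some wc implementations print.
    expected=$(( $(cat "$@" | wc -c) ))
    actual=$(( $(wc -c <"$out") ))
    [ "$expected" -eq "$actual" ]
}

# Usage, assuming diag_all was produced by:
#   parallel -j0 cat ::: pe* >diag_all
#
#   check_total diag_all pe* && echo "sizes match"
```

With -k the stricter check "cat pe* | cmp - diag_all" should also
pass, since the order is then the same as the input order.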

These sections of the man page touch on the subject of using the
output from GNU Parallel:

EXAMPLE: Rewriting a for-loop and a while-read-loop
EXAMPLE: Rewriting nested for-loops
EXAMPLE: Keep order of output same as order of input
EXAMPLE: Processing a big file using more cores

If you believe it can be explained better, please post your suggestion
for discussion here.


/Ole


