parallel

Re: Distributing work to local and remote computers


From: Ole Tange
Subject: Re: Distributing work to local and remote computers
Date: Tue, 18 Apr 2017 22:36:46 +0200

On Mon, Apr 17, 2017 at 11:09 PM, Eric Geoffroy
<eric.geoffroy@pearson.com> wrote:

> I had a working command until I ran into files whose paths exceeded the
> maximum for the shell (or socket).
>
> This works unless the paths exceed the max-
>
> cat '/Volumes/Cinera/SBO- Video/Python Videos/videofile_paths.txt' |
> parallel -q -S :,8/eric@serverA,8/eric@ServerB/eric@ServerC ffmpeg -i {}
> -vcodec libvpx -qmin 4 -qmax 10 -crf 6 -acodec libvorbis
> '/Volumes/Cinera/SBO- Video/Python
> Videos/Informit/9780134745954/media/video/pyfs_'{/.}.webm
>
> Notes:
> The files are on a shared drive mounted on local and remote computers.
> The files and directories have spaces in them. I use -q
> I send jobs to the remote computers in order of power
> FFMPEG converts files into webm
> I should be able to get— 8 cores * 4 computers = 32 simultaneous encodes. If
> I’m understanding correctly, the first 8 files go to the fastest host
> (local), the next 8 to serverA, and so on.
>
> And all was going well until the long file paths. I scoured the man page and
> examples and found two possibilities:
>
> Plan B
> this gem --workingdir.  I figured I would replace the "cat file" with
> sending the workingdir.
>
> To test this outside of ffmpeg I tried:
> parallel -q --workdir '/Volumes/Cinera/SBO- Video/Python Videos
> 9780134745916/Safari/9780134745923/' -S 8/eric@10.105.241.211 file {} :::
> '.mp4
> output:
> *.mp4: cannot open `*.mp4' (No such file or directory)

I think you meant: *.mp4 and not '.mp4

That failed because *.mp4 was not expanded by the shell and GNU
Parallel quotes special chars so it will also not expand *.mp4.
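
The difference between a quoted and an unquoted pattern can be seen with
the shell alone, without GNU Parallel. This is a minimal sketch; the
scratch directory and the two file names are made up for the demo:

```shell
#!/usr/bin/env bash
# Scratch directory with two empty files; names are hypothetical.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.mp4" "$demo_dir/b c.mp4"
cd "$demo_dir" || exit 1

# Unquoted: the shell replaces *.mp4 with the matching file names.
set -- *.mp4
unquoted_count=$#

# Quoted: the pattern is passed on as the literal string *.mp4,
# which is what a program like "file" then tries to open.
set -- '*.mp4'
quoted_count=$#
quoted_first=$1

echo "unquoted: $unquoted_count args; quoted: $quoted_count arg ($quoted_first)"
```

In the unquoted case each file name arrives as its own argument, spaces
included, so no extra quoting is needed after :::.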

Try this instead:

cd '/Volumes/Cinera/SBO- Video/Python Videos
9780134745916/Safari/9780134745923/'
parallel --workdir . -S 8/eric@10.105.241.211 file ::: *.mp4

GNU Parallel determines which dir . is and cd's into that on the remote system.

This only works if the dir exists on the remote system. In practice this
means it will be a shared dir (but in theory you _could_ have a
different dir with the same name on the remote system).

> But it worked when point to a file by name.
> parallel -q --workdir '/Volumes/Cinera/SBO- Video/Python Videos/videos/' -S
> 8/eric@ServerA file {} ::: 00_00_00.mp4
> output:
> 00_00_00.mp4: ISO Media, MP4 v2 [ISO 14496-14]

This works because the name is already expanded.

> Plan C
> In the Example-
> Convert *.mp3 to *.ogg running one process per CPU core on local computer
> and server2:
>
> parallel --trc {.}.ogg -S :,server2 'mpg321 -w - {} | oggenc -q0 - -o
> {.}.ogg' ::: *.mp3

If the dir is already shared then this is kinda crazy: --trc would copy
files back and forth that every host can already read.

> The output file seems to come first. then the hosts. then the mpg321 command
> I don't grok. then that is piped to the ogg encoder with a duplicate output
> file. Why two outputs?

You are asking GNU Parallel to do the following:

--transfer {} to the remote system

Run 'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg' which decodes the mp3
file to WAV, which is sent through a pipe to oggenc that encodes it
and writes the result to {.}.ogg (the file named with -o).

--return {.}.ogg from the remote system to the dir you are in.

--cleanup {} and {.}.ogg by removing them on the remote system.

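So --trc is shorthand for three options (--transfer, --return,
--cleanup) that act both before and after the command is run. That is
why {.}.ogg appears twice: once to tell GNU Parallel which file to
return, and once to tell oggenc where to write.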
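
Written out in long form, --trc {.}.ogg is --transfer --return {.}.ogg
--cleanup. The order of the steps can be simulated locally. This is only
a sketch under loud assumptions: two scratch dirs stand in for the local
and remote machines, cp stands in for the file transfer, and tr stands
in for the mpg321 | oggenc pipeline. It is not what GNU Parallel runs
internally.

```shell
#!/usr/bin/env bash
# Local simulation only; dirs, file name and the tr "encoder" are made up.
local_dir=$(mktemp -d)
remote_dir=$(mktemp -d)
printf 'abc\n' > "$local_dir/song.mp3"

input=song.mp3            # plays the role of {}
output=${input%.*}.ogg    # plays the role of {.}.ogg

# --transfer: copy the input to the "remote" side
cp "$local_dir/$input" "$remote_dir/"

# the job itself, run in the "remote" dir
( cd "$remote_dir" && tr a-z A-Z < "$input" > "$output" )

# --return: copy the result back to the dir you started in
cp "$remote_dir/$output" "$local_dir/"

# --cleanup: remove both files on the "remote" side
rm "$remote_dir/$input" "$remote_dir/$output"

result=$(cat "$local_dir/$output")
remaining=$(ls -A "$remote_dir")
echo "result: $result; left on remote: '${remaining}'"
```

With a shared dir the first, third and fourth steps are pure overhead,
since the job can read and write the files in place.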

The order of GNU Parallel's options does not matter. But the command
to run must come after the last option to GNU Parallel.

> And now I'm stuck. Plan B or Plan C. Which is better? Where have I gone
> wrong?

I would use the --workdir . solution above.
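
Applied to the ffmpeg job, that could look roughly like the sketch
below. The ffmpeg options are copied from your mail; the function name
encode_all, the output name pattern (pyfs_ prefix, written next to the
input) and the example host list are assumptions for illustration. It
relies on the dir being mounted at the same path on every host, and the
relative file names may help with the long-path problem.

```shell
#!/usr/bin/env bash
# Sketch only: encode_all is a hypothetical helper, not part of GNU Parallel.
# The parentheses run the body in a subshell, so the caller's cwd is untouched.
encode_all() (
    video_dir=$1
    hosts=$2
    cd "$video_dir" || exit 1

    # Let the local shell expand the pattern; stop if nothing matches.
    set -- *.mp4
    [ -e "$1" ] || { echo "no .mp4 files in $video_dir" >&2; exit 1; }

    # {} is the relative input name, {.} the same name without extension.
    # Output naming is an assumption; adjust to the real target dir.
    parallel --workdir . -S "$hosts" \
        ffmpeg -i {} -vcodec libvpx -qmin 4 -qmax 10 -crf 6 \
        -acodec libvorbis pyfs_{.}.webm ::: "$@"
)

# Example call (not run here):
# encode_all '/Volumes/Cinera/SBO- Video/Python Videos/videos' ':,8/eric@serverA'
```

-q is left out on purpose: GNU Parallel quotes the replacement strings
itself, so file names with spaces should survive.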


/Ole


