
Re: Using pre-existing SSH connection


From: Ole Tange
Subject: Re: Using pre-existing SSH connection
Date: Fri, 12 May 2017 00:24:19 +0200

On Mon, May 8, 2017 at 10:20 PM, Joe Sapp <sappj@ieee.org> wrote:
> On Mon, May 8, 2017 at 4:09 PM, Ole Tange <ole@tange.dk> wrote:
>> On Mon, May 8, 2017 at 9:02 PM, Joe Sapp <sappj@ieee.org> wrote:
:
> I am using --controlmaster in my calls to parallel, but it seems to
> still be making new connections even though I don't have a custom ssh
> command.  I will try to come up with a test case and some evidence.

GNU Parallel starts a 'ssh -MTS socket' which is kept open for the whole run:

$ time ssh lo echo 1
real    0m0.319s
# 300 ms for a normal SSH-login
$ parallel -M -S lo sleep ::: 1000 &
$ sleep 1
$ ps aux | grep MTS
<<shows socket ala /tmp/control_path_dir-9ZBF/ssh-user@lo:22>>
$ time ssh -S /tmp/control_path_dir-9ZBF/ssh-%r@%h:%p lo echo 1
real    0m0.013s
# 10-13 ms for a multiplexed login

So 'ssh -S socket' is clearly using the connection held open by 'ssh -MTS
socket' - otherwise it would be as slow as the first command.
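Instead of inferring it from the timing, you can also ask ssh directly whether a master is listening on a socket. This is only a sketch: socket_alive is a made-up helper name, and the socket path is the one from the ps output above, so yours will differ. 'ssh -O check' is the standard OpenSSH control command for this.

```shell
# Hypothetical helper: succeed only if an ssh master is listening on the socket.
# Usage: socket_alive SOCKET_PATH HOST
socket_alive() {
    # -O check asks the master process behind the -S socket whether it is running.
    ssh -S "$1" -O check "$2" >/dev/null 2>&1
}

# The socket path is copied from the ps output; it is an example, not a fixed name.
if socket_alive '/tmp/control_path_dir-9ZBF/ssh-%r@%h:%p' lo; then
    echo "master is running"
else
    echo "no master on that socket"
fi
```

If this reports no master while parallel -M is still running, the path is probably not the one parallel created for this run.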

Now the normal job commands *do* use this:

$ parallel -M -S lo -vv echo ::: 1 2 3
ssh -S /tmp/control_path_dir-3LbD/ssh-%r@%h:%p lo -- [...]

And the timing shows this, too:
$ time parallel -M -S lo true ::: {1..1000}
real    0m9.686s
# 10 ms per login with 8 running in parallel = clearly multiplexed.

$ time parallel -j10 -M -S lo true ::: {1..1000}
real    0m10.224s

$ time parallel -j30 -M -S lo true ::: {1..1000}
parallel: Warning: ssh to lo only allows for 26 simultaneous logins.
parallel: Warning: You may raise this by changing
/etc/ssh/sshd_config:MaxStartups and MaxSessions on lo.
parallel: Warning: You can also try --sshdelay 0.1
parallel: Warning: Using only 25 connections to avoid race conditions.

Ahh, so we change MaxStartups in /etc/ssh/sshd_config on lo to this:

MaxStartups 300:30:1800
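For reference, the three fields are start:rate:full as described in sshd_config(5): once 'start' unauthenticated connections are pending, sshd refuses new ones with probability rate/100, rising linearly until all are refused at 'full'. A small sketch that splits such a value into its fields (explain_maxstartups is a made-up name, not part of any tool):

```shell
# Hypothetical helper: split a MaxStartups value (start:rate:full) into fields.
explain_maxstartups() {
    # IFS=: applies only to this read; the here-document feeds the value in.
    IFS=: read -r start rate full <<EOF
$1
EOF
    printf 'start=%s rate=%s%% full=%s\n' "$start" "$rate" "$full"
}

explain_maxstartups 300:30:1800
# prints: start=300 rate=30% full=1800
```

So with the value above sshd starts dropping 30% of new logins only when 300 are pending, which is far above what -j30 or -j100 will produce.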

restart sshd and try again:

$ time parallel -j30 -M -S lo true ::: {1..1000}
real    0m12.596s

Looks good so far.

$ time parallel -j100 -M -S lo true ::: {1..1000}
real    0m25.229s

There is clearly a penalty for running more jobs than there are CPUs.
It can be reduced by also raising:

MaxSessions 100

restart sshd and try again:

$ time parallel -j100 -M -S lo true ::: {1..1000}
real    0m20.725s

Compare that to non-multiplexed (no -M):

$ time parallel -j100  -S lo true ::: {1..1000}
real    1m43.682s
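To put the two -j100 runs side by side, here is the arithmetic as a small sketch. The helper names per_job_ms and speedup are made up for illustration; the numbers are the wall-clock times measured above (1m43.682s = 103.682 s).

```shell
# Average wall-clock milliseconds per job: total seconds * 1000 / job count.
per_job_ms() {
    LC_ALL=C awk -v total="$1" -v jobs="$2" \
        'BEGIN { printf "%.1f\n", total * 1000 / jobs }'
}

# Ratio of two wall-clock times (slow / fast).
speedup() {
    LC_ALL=C awk -v slow="$1" -v fast="$2" \
        'BEGIN { printf "%.1f\n", slow / fast }'
}

per_job_ms 20.725 1000     # -j100 with -M:    prints 20.7
per_job_ms 103.682 1000    # -j100 without -M: prints 103.7
speedup 103.682 20.725     # prints 5.0
```

That is roughly a factor of 5 in favour of multiplexing in this test. These are averages over the whole run, not the latency of a single login.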


/Ole


