Re: [rdiff-backup-users] Client dying "randomly" on 1.1.16
From: Oliver Hookins
Date: Fri, 29 Aug 2008 18:07:28 +1000
User-agent: Mutt/1.5.17+20080114 (2008-01-14)
On Thu Aug 28, 2008 at 11:56:07 -0300, Armando M. Baratti wrote:
> Oliver Hookins wrote:
>
> We run rdiff-backup with standard priority, but I think for whatever
> reason it is getting held up on large files while lots of I/O is taking
> place. I grabbed a tcpdump of the traffic and it looks like there is normal
> SSH traffic until it hits whatever snag, then there are no packets for two
> hours.
>
> Then the connecting machine sends a TCP keep-alive, to which the machine
> being backed up sends a TCP reset. So I think maybe my keep-alive settings
> need some tweaking.
>
> What sort of settings do people use in their ssh configs for BatchMode,
> ServerAliveCountMax, and TCPKeepAlive?
>
> I've seen the same behaviour as described above.
> I did a first rsync of my target (about 1.7 GB of data) to my backup
> machine. That took about 20 minutes.
> Then I ran rdiff-backup with the "--force" option just to build the
> rdiff-backup-data directory.
> That took 07:43h (no typo, more than 7 hours).
> During most of this time the connection remained idle (I don't know why).
>
> The only measure that kept the connection alive was setting
> "ServerAliveInterval 120" on the ssh client side (/etc/ssh/ssh_config).
> (I had also set "KeepAlive" to "no" on the server side, so that the
> connection would not be aborted if the client was not responding, but this
> alone did not prevent the connection from breaking, and I did not revert
> that option when testing "ServerAliveInterval 120" on the client side.)
>
> I'm using version 1.0.5 of rdiff-backup (on both sides).
> The CPU and memory consumption are very low, and the same goes for disk
> activity.
>
> I was blaming this on the difference in OpenSSH versions between
> the machines (4.3p2 on the backup, 3.5p1 on the target machine), but I'm
> not convinced of this anymore.
Well, I enabled BatchMode on the initiating server yesterday, and it looks
like all of the backups succeeded last night. Thanks for the advice,
everyone.
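
For anyone finding this thread later, here is a minimal sketch of the
client-side settings discussed above. It is only an illustration, not a
tested recommendation: "backup-target" is a hypothetical host alias, the
120-second interval is the value Armando reported, and the other two
values simply spell out what I understand to be the OpenSSH defaults.

```
# ~/.ssh/config (or /etc/ssh/ssh_config) on the machine that starts the backup.
# "backup-target" is a made-up alias; substitute the real host name.
Host backup-target
    # Never prompt for a password or passphrase (unattended runs).
    BatchMode yes
    # After 120 seconds without data from the server, send a keep-alive
    # request through the encrypted channel.
    ServerAliveInterval 120
    # Give up after this many unanswered keep-alive requests (assumed default).
    ServerAliveCountMax 3
    # Leave TCP-level keep-alives enabled (assumed default).
    TCPKeepAlive yes
```

With these values, ssh would drop an unresponsive connection after roughly
120 x 3 = 360 seconds, while a connection that is merely idle stays open.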
--
Regards,
Oliver Hookins