Efficiency of remote copy
From: Mark.Burgess
Subject: Efficiency of remote copy
Date: Sun, 7 Jul 2002 13:02:51 +0200 (MET DST)
Daniel,
I tried to reproduce some numbers to check how cfengine copying
compares with other tools. I don't have rsync set up here, so the
closest I got to your test was scp. The results on 500MB were:
SCP --
real 22m10.360s (plus-minus 15sec)
user 0m34.320s
sys 0m14.180s
daneel# time /local/sbin/cfagent -K -f cftest
real 20m26.420s (plus minus 15 sec)
user 1m11.800s
sys 0m23.570s
This is not a huge difference, so I cannot see that anything is
actually wrong. I don't fully understand why cfagent/cfservd don't work
up more of a sweat and make things go faster. Possibly the network
is the bottleneck. I'm afraid I don't have any appropriate tools
(or time) to try to evaluate the efficiency at the moment, but
I would be interested to hear from anyone who does and could do this --
e.g. using some tool like a code analyzer to find out where the
programs are spending most of their time, and looking at resource
usage in detail on both hosts.
This would be a good exercise -- perhaps a paper for LISA 2003?
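For what it's worth, the arithmetic behind the "network is the bottleneck" guess can be sketched in a few lines of Python. This is only a rough illustration: the helper names parse_time and summarise are made up here, and it assumes the 500MB figure is the full payload of both runs and that the timings are client-side only.

```python
import re

def parse_time(text):
    """Convert a `time`-style duration such as '22m10.360s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

def summarise(real, user, sys_t, size_mb):
    """Return wall-clock seconds, throughput and CPU share for one run."""
    real_s = parse_time(real)
    cpu_s = parse_time(user) + parse_time(sys_t)
    return {
        "real_s": real_s,
        "throughput_mb_s": size_mb / real_s,  # payload per wall-clock second
        "cpu_fraction": cpu_s / real_s,       # share of wall-clock spent on CPU
    }

# Timings copied from the two runs above; 500 is the assumed payload in MB.
runs = {
    "scp": ("22m10.360s", "0m34.320s", "0m14.180s"),
    "cfagent": ("20m26.420s", "1m11.800s", "0m23.570s"),
}

for name, (real, user, sys_t) in runs.items():
    s = summarise(real, user, sys_t, 500)
    print(f"{name:8s} {s['throughput_mb_s']:.2f} MB/s  CPU {s['cpu_fraction']:.1%}")
```

On these figures both tools move roughly 0.4 MB/s, and the client spends well under a tenth of the wall-clock time on the CPU (about 4% for scp, about 8% for cfagent). That is consistent with the process mostly waiting on something else, such as the network or the server, but it does not say which.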
cheers,
Mark
On 6 Jul, Daniel Riek wrote:
> Hi Mark,
>
> I can't say anything on the general performance but in our case, it
> looks like this:
>
> Cfengine config:
> [...]
> www.!www1.do_mirror_test::
> /usr/local/sourcedir
> dest=/opt/mirror_test
> syslog=false
> encrypt=false
> purge=true
> timestamps=preserv
> type=ctime
> backup=false
> recurse=inf
> trustkey=true
> server=www1.mydomain
> [...]
>
>
> - First run without existing dest dir:
>
> [root@www2 root]# time cfagent -K -D do_mirror_test
> real 8m52.785s
> user 0m1.270s
> sys 0m12.810s
>
> Ah, one more thing, the first data in my mail to Adrian was with
> "syslog=true". - So without logging it looks much better...
>
> - Second run with unchanged source dir:
>
> [root@www2 root]# time cfagent -K -D do_mirror_test
>
> real 7m20.412s
> user 0m0.430s
> sys 0m0.120s
>
> And now RSync:
>
> First run with nonexistent dest dir:
> [root@www2 opt]# time rsync -az -e ssh root@www1:/usr/local/sourcedir mirror_test
>
> real 3m43.060s
> user 0m37.400s
> sys 0m12.590s
>
>
> And unchanged sourcedir:
> [root@www2 opt]# time rsync -az -e ssh root@www1:/usr/local/sourcedir mirror_test
>
> real 0m1.404s
> user 0m0.610s
> sys 0m0.240s
>
>
> To be fair, we have to take into account that cfengine is doing more
> than just copying that dir. So here is cfagent without that test:
> [root@www2 opt]# time cfagent -K
>
> real 0m16.229s
> user 0m0.320s
> sys 0m0.170s
>
>
> So the difference is still quite big. You are probably right regarding the
> security checks, but I think those are not really necessary in this scenario,
> where we just mirror in an HA environment. So RSync seems to be
> doing just the right things here...
>
> All tests have been run several times with differences <=10 sec. Encryption
> does not seem to make a difference for cfagent.
>
> One thing is that "ps aux" shows very different behaviour on the server:
> while rsync takes about 50% of the CPU on the server, cfservd stays under 2%.
>
> To me this seems to be the reason for the big difference. Did I miss any
> configuration option?
>
>
> Regards, Daniel
>
> On Fri, Jul 05, 2002 at 05:30:59PM +0200, Mark.Burgess@iu.hio.no wrote:
>>
>> In a paper (not written by me) in 2001, with the older (slower)
>> protocol, it was shown that cfengine was *faster* than rsync
>> at distributing files first time around. Rsync is faster at
>> updating certain kinds of changes (that the algorithm was designed
>> for -- small changes to large files). If cfengine is slower
>> at certain things it is because it is doing extra checking
>> for security reasons, but that applies to secure copy. This
>> is also much faster since version 2, so I do not know of any
>> real studies on this.
>>
>> In short, I just do not believe the assertion that cfengine
>> is slow at copying large amounts of filespace, compared
>> to rsync. It doesn't tally with experiments done. Is this
>> just an assumption, or have you actually tried to measure
>> it and compare?
>>
>> I am not keen on the idea of including librsync in cfengine.
>> It would not be a straightforward task, and cfengine is much
>> more security conscious than rsync. Sometimes there is a reason
>> to take your time and check stuff.
>>
>> M
>>
>>
>> On 5 Jul, Adrian Phillips wrote:
>> >>>>>> "Daniel" == Daniel Riek <riek@de.alcove.com> writes:
>> >
>> > Daniel> Hi, we are using Cfengine in an environment where we need
>> > Daniel> to copy large amounts of data from one machine to
>> > Daniel> another. There are mainly two scenarios: software
>> > Daniel> distribution (rpm packages and tarballs) and mirroring for
>> > Daniel> a failover cluster.
>> >
>> > Large as in? I use copy for the whole cfengine "setup" from one
>> > machine to a backup, approximately 1GB, which takes some minutes. I can
>> > understand anyone trying to do anything more than this having problems.
>> >
>> > Daniel> One way would be to use RSync. That is what we would do in
>> > Daniel> this environment if we had no Cfengine. But as we have
>> > Daniel> some security issues and rsync would require at least
>> > Daniel> minimal root access from the mirroring machine, we would
>> > Daniel> prefer to use Cfengine.
>> >
>> > Daniel> Another reason for using Cfengine to copy the data is the
>> > Daniel> possibility to have services restarted depending on the
>> > Daniel> copy...
>> >
>> > Daniel> Unfortunately Cfengine seems to be very slow when doing
>> > Daniel> such things. This raises the question if anyone else
>> > Daniel> tried to use Cfengine in this manner and what his
>> > Daniel> experience is like?
>> >
>> > One thing I'd like to do, if possible, is link in librsync and add
>> > an option to copy that makes it use librsync instead. How much
>> > work this is and how much it will help I have no idea.
>> >
>> > Sincerely,
>> >
>> > Adrian Phillips
>> >
>>
>>
>>
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> Work: +47 22453272 Email: Mark.Burgess@iu.hio.no
>> Fax : +47 22453205 WWW : http://www.iu.hio.no/~mark
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>>
>>
>> _______________________________________________
>> Help-cfengine mailing list
>> Help-cfengine@gnu.org
>> http://mail.gnu.org/mailman/listinfo/help-cfengine
>>
>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Work: +47 22453272 Email: Mark.Burgess@iu.hio.no
Fax : +47 22453205 WWW : http://www.iu.hio.no/~mark
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~