Re: Mirroring / distributing large amounts of data
From: Daniel Riek
Subject: Re: Mirroring / distributing large amounts of data
Date: Sat, 6 Jul 2002 12:53:02 +0200
User-agent: Mutt/1.3.28i
Hi Mark,
I can't say anything about the general performance, but in our case it
looks like this:
Cfengine config:
[...]
www.!www1.do_mirror_test::
/usr/local/sourcedir
dest=/opt/mirror_test
syslog=false
encrypt=false
purge=true
timestamps=preserv
type=ctime
backup=false
recurse=inf
trustkey=true
server=www1.mydomain
[...]
- First run without existing dest dir:
[root@www2 root]# time cfagent -K -D do_mirror_test
real 8m52.785s
user 0m1.270s
sys 0m12.810s
One more thing: the first figures in my mail to Adrian were measured with
"syslog=true". So without logging it looks much better...
- Second run with unchanged source dir:
[root@www2 root]# time cfagent -K -D do_mirror_test
real 7m20.412s
user 0m0.430s
sys 0m0.120s
And now RSync:
First run with nonexistent dest dir:
[root@www2 opt]# time rsync -az -e ssh root@www1:/usr/local/sourcedir mirror_test
real 3m43.060s
user 0m37.400s
sys 0m12.590s
And unchanged sourcedir:
[root@www2 opt]# time rsync -az -e ssh root@www1:/usr/local/sourcedir mirror_test
real 0m1.404s
user 0m0.610s
sys 0m0.240s
To be fair, we have to take into account that cfengine is doing more
than just copying that directory. So here is cfagent without that test:
[root@www2 opt]# time cfagent -K
real 0m16.229s
user 0m0.320s
sys 0m0.170s
So the difference is still quite big. You are probably right regarding the
security checks, but I think those are not really necessary in this scenario,
where we just mirror in an HA environment. So rsync seems to be
doing just the right things here...
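The comparison above can be reduced to two ratios. The following is only a minimal sketch, not part of the original measurements: it converts the "real" times quoted above to seconds, subtracts the plain "cfagent -K" run as a baseline, and divides by the rsync time. The helper names (to_seconds, net_ratio) are made up for illustration, and subtracting the baseline assumes the other cfagent work costs about the same in every run.

```python
import re


def to_seconds(real: str) -> float:
    """Convert a `time`-style duration such as '8m52.785s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", real)
    if match is None:
        raise ValueError(f"unrecognised duration: {real!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Wall-clock ("real") figures copied from the runs quoted in this mail.
CFAGENT_FIRST = to_seconds("8m52.785s")
CFAGENT_SECOND = to_seconds("7m20.412s")
CFAGENT_BASELINE = to_seconds("0m16.229s")  # cfagent -K without the mirror test
RSYNC_FIRST = to_seconds("3m43.060s")
RSYNC_SECOND = to_seconds("0m1.404s")


def net_ratio(cfagent: float, rsync: float,
              baseline: float = CFAGENT_BASELINE) -> float:
    """How many times longer the cfagent copy took than rsync,
    after removing the time cfagent spends on everything else."""
    return (cfagent - baseline) / rsync


if __name__ == "__main__":
    print(f"first run:     {net_ratio(CFAGENT_FIRST, RSYNC_FIRST):.1f}x")
    print(f"unchanged run: {net_ratio(CFAGENT_SECOND, RSYNC_SECOND):.1f}x")
```

With these figures the first run comes out at roughly 2.3 times the rsync time, and the unchanged run at around 300 times.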
All tests have been run several times with differences <=10 sec. Encryption
does not seem to make a difference for cfagent.
One more observation: "ps aux" shows very different behaviour on the server.
While rsync takes about 50% of the CPU on the server, cfservd stays under 2%.
To me this seems to be the reason for the big difference. Did I miss any
configuration option?
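The client-side "time" output above points the same way. As a rough sketch (the function name and the table of runs are illustrative, with the durations converted to seconds by hand from the output quoted above), the share of wall-clock time the client actually spent on the CPU is (user + sys) / real. This only covers the client, so it is an indication and not a measurement of the server.

```python
def cpu_fraction(real_s: float, user_s: float, sys_s: float) -> float:
    """Fraction of wall-clock time spent on the CPU (user + sys)."""
    return (user_s + sys_s) / real_s


# (real, user, sys) in seconds, taken from the `time` output in this mail.
RUNS = {
    "cfagent first run": (532.785, 1.270, 12.810),
    "cfagent unchanged": (440.412, 0.430, 0.120),
    "rsync first run": (223.060, 37.400, 12.590),
    "rsync unchanged": (1.404, 0.610, 0.240),
}

if __name__ == "__main__":
    for name, (real_s, user_s, sys_s) in RUNS.items():
        print(f"{name:18s} {cpu_fraction(real_s, user_s, sys_s):7.2%}")
```

In the unchanged cfagent run the client is on the CPU for well under 1% of the elapsed time, so most of the seven minutes is spent waiting and not computing.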
Regards, Daniel
On Fri, Jul 05, 2002 at 05:30:59PM +0200, Mark.Burgess@iu.hio.no wrote:
>
> In a paper (not written by me) in 2001, with the older (slower)
> protocol, it was shown that cfengine was *faster* than rsync
> at distributing files first time around. Rsync is faster at
> updating certain kinds of changes (that the algorithm was designed
> for -- small changes to large files). If cfengine is slower
> at certain things it is because it is doing extra checking
> for security reasons, but that applies to secure copy. This
> is also much faster since version 2, so I do not know of any
> real studies on this.
>
> In short, I just do not believe the assertion that cfengine
> is slow at copying large amounts of filespace, compared
> to rsync. It doesn't tally with experiments done. Is this
> just an assumption, or have you actually tried to measure
> it and compare?
>
> I am not keen on the idea of including librsync in cfengine.
> It would not be a straightforward task, and cfengine is much
> more security conscious than rsync. Sometimes there is a reason
> to take your time and check stuff.
>
> M
>
>
> On 5 Jul, Adrian Phillips wrote:
> >>>>>> "Daniel" == Daniel Riek <riek@de.alcove.com> writes:
> >
> > Daniel> Hi, we are using Cfengine in an environment where we need
> > Daniel> to copy large amounts of data from one machine to
> > Daniel> another. There are mainly two scenarios: software
> > Daniel> distribution (rpm packages and tarballs) and mirroring for
> > Daniel> a failover cluster.
> >
> > Large as in ? I use copy for the whole cfengine "setup" from one
> > machine to a backup, approximately 1GB which takes some minutes. I can
> > understand anyone trying to do anything more than this having problems.
> >
> > Daniel> One way would be to use RSync. That is what we would do in
> > Daniel> this environment if we had no Cfengine. But as we have
> > Daniel> some security issues and rsync would require at least a
> > Daniel> minimal root access from the mirroring machine, we would
> > Daniel> prefer to use Cfengine.
> >
> > Daniel> Another reason for using Cfengine to copy the data is the
> > Daniel> possibility to have services restarted depending on the
> > Daniel> copy...
> >
> > Daniel> Unfortunately Cfengine seems to be very slow when doing
> > Daniel> such things. This raises the question if anyone else
> > Daniel> tried to use Cfengine in this manner and what his
> > Daniel> experience is like?
> >
> > One thing I'd like to do, if possible, is link in librsync and have
> > some additional option to copy to make it use it instead. How much
> > work this is and how much it will help I have no idea.
> >
> > Sincerely,
> >
> > Adrian Phillips
> >
>
>
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Work: +47 22453272 Email: Mark.Burgess@iu.hio.no
> Fax : +47 22453205 WWW : http://www.iu.hio.no/~mark
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
>
> _______________________________________________
> Help-cfengine mailing list
> Help-cfengine@gnu.org
> http://mail.gnu.org/mailman/listinfo/help-cfengine
>
--
Daniel Riek <riek@de.alcove.com> - http://www.alcove.com/de/
* Technical Manager - Tel.: +49 (0)2 28 / 9 08 69 85
* ALCOVE Deutschland GmbH - Fax: +49 (0)2 28 / 9 08 69 84
* Liberating Software - Mobil: +49 (0)1 71 / 2 80 08 79