Re: [rdiff-backup-users] Restarting development ... or starting over


From: covici
Subject: Re: [rdiff-backup-users] Restarting development ... or starting over
Date: Wed, 07 Apr 2010 17:28:15 -0400

Randy Syring <address@hidden> wrote:

> Daniel Miller wrote:
> > I wasn't really prepared to make this announcement so soon, but now seems 
> > like a good time to let the community know. I've been working on a new 
> > implementation of rdiff-backup since about a month ago when I dug into the 
> > current codebase and discovered its disappointing quality. While what I 
> > have right now is functional and works on simple cases, it does not cover 
> > the broad range of features currently offered by rdiff-backup. I could use 
> > some help in bringing it up to par if others are interested in the path I 
> > have taken. While I have used the current codebase for direction and 
> > inspiration, I have started with a clean slate for several reasons:
> >   
> I'm interested and am looking forward to seeing the code.
> > - An automated test suite makes adding new features and long-term 
> > maintenance much easier. Adding this to the current codebase is both hard 
> > and boring. One thing that makes it very hard to write tests for the 
> > current codebase is the widespread use of globals. My new implementation 
> > has been developed using TDD and minimal use of globals (e.g. for loggers 
> > and constants).
> >   
> YAY TDD!  :)
> > - The current repository layout has a critical design flaw that causes 
> > performance degradation as a repository grows. Most difference information 
> > is stored in a single file tree (rdiff-backup-data/increments) that has a 
> > very similar structure to the mirror. The problem is that as files get 
> > added/deleted/changed the directories in the increments tree are always 
> > growing in size, meaning it takes longer and longer to list the contents of 
> > directories in the tree. This performance problem is negligible in 
> > small-to-medium sized backup sets, but becomes apparent in very large 
> > backup sets as the number of increments grows. I have redesigned the 
> > repository layout in my new implementation to eliminate this performance 
> > issue. Note that I do not know for sure if my new layout will completely 
> > eliminate this problem since I have not tested it yet with a very large 
> > backup set over a long period of time.
> >   
> Can this be tested further?  It would suck to get further down the
> road with this repository structure and find out it didn't really help
> the problem.
> 

Also, would periodically removing older increments, rather than keeping
all of them, help with this problem?
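
For anyone who wants to see how large the directories in their own
increments tree have become, here is a rough Python sketch. It is not
part of rdiff-backup; the function name and the default of five results
are made up for illustration. It walks a tree such as
rdiff-backup-data/increments and reports which directories hold the most
entries and how long each took to list.

```python
import os
import time


def largest_directories(root, top=5):
    """Return (entry_count, seconds_to_list, path) tuples for the
    directories under root that contain the most entries.

    Hypothetical helper for illustration; not part of rdiff-backup.
    """
    results = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        # Time a plain directory listing, the operation said to slow
        # down as the increments tree grows.
        start = time.perf_counter()
        count = len(os.listdir(dirpath))
        elapsed = time.perf_counter() - start
        results.append((count, elapsed, dirpath))
    # Largest directories first.
    results.sort(reverse=True)
    return results[:top]


# Example use (the path is the increments tree inside a repository):
# for count, elapsed, path in largest_directories("rdiff-backup-data/increments"):
#     print(f"{count:8d} entries  {elapsed:.4f}s  {path}")
```

The timings are only a rough indication, since a second run will mostly
hit the filesystem cache. Comparing the entry counts before and after
removing old increments would at least show whether the directories
actually shrink.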

-- 
Your life is like a penny.  You're going to lose it.  The question is:
How do you spend it?

         John Covici
         address@hidden



