From: Eric Wheeler
Subject: [rdiff-backup-users] Re: feedback to blog entry rdiff-backup-lvm-snapshot
Date: Fri, 28 Jan 2011 10:46:18 -0800
On Fri, 2011-01-28 at 11:49 +0100, Sebastian J. Bronner wrote:
> Hi Eric,
>
> thanks for your input.
>
> Patching block-fuse didn't work: the file handle the kernel passed to the
> application was created without O_DIRECT in every case, despite our adding
> O_DIRECT in the proper block-fuse function.
>
> Patching rdiff-backup to support O_DIRECT would also not help us. We
> discovered that it is not possible to open files from a fuse volume
> using O_DIRECT. The attempt is rejected with "Invalid argument".
>
> However, inspired by the specs you mentioned in your setup, we finally
> found a mention of the NFS mount option '-o noac'. Since we are backing up
> from RAID1 to NFS, this was interesting.
Interesting! Yes, that would do the trick! It may be useful to patch
blockfuse with a rate-limit option, slowing reads out of blockfuse and
preventing write-side saturation; a rough sketch of the idea follows below.
-Eric
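
A rough sketch of that rate-limit idea, written here in Python against fusepy
rather than the real C blockfuse. Only the read path is shown, and the class
name, device path, and fixed one-second-window throttle are illustrative
assumptions, not blockfuse code:

    import time
    from fuse import FUSE, Operations  # fusepy

    class ThrottledBlockFS(Operations):
        """Expose a block device read-only, capping read throughput."""

        def __init__(self, device_path, max_bytes_per_sec):
            self.device = open(device_path, "rb")
            self.rate = max_bytes_per_sec
            self.window_start = time.time()
            self.bytes_in_window = 0

        def read(self, path, size, offset, fh):
            # Fixed one-second window: once the byte budget is spent, sleep
            # out the remainder of the window before serving the next read.
            if self.bytes_in_window >= self.rate:
                elapsed = time.time() - self.window_start
                if elapsed < 1.0:
                    time.sleep(1.0 - elapsed)
                self.window_start = time.time()
                self.bytes_in_window = 0
            self.bytes_in_window += size
            self.device.seek(offset)
            return self.device.read(size)

    # Hypothetical usage: serve an LVM snapshot with reads capped at 50 MB/s.
    # (A mountable filesystem also needs getattr/readdir; omitted here.)
    # FUSE(ThrottledBlockFS("/dev/vg0/vm-snap", 50 * 1024 * 1024),
    #      "/mnt/blockfuse", foreground=True, ro=True)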
> In principle, this flag is meant to disable attribute caching, but it has
> the nice side effect of throttling the copy process's read speed to match
> its write speed when writing to NFS. This way, the output cache (buffers)
> doesn't fill up with unwritten data, so the system never reaches a point
> where it has to flush to disk before it can free buffers and hand that RAM
> to another process.
>
> This is exactly what we need, and the procedure now seems to be working for
> us without degrading server performance.
>
> Thanks again.
>
> Cheers,
> Sebastian
>
>
>
> On 27.01.2011 21:10, Eric Wheeler wrote:
> >> Hi Eric,
> >
> > Hi Sebastian,
> >
> > I'm cc'ing the rdiff-backup-users list too, they may have some insight
> > as well.
> >
> >> on LVM snapshots and came across your blog and your articles in that
> >> regard:
> >>
> >> http://www.globallinuxsecurity.pro/blog.php?q=rdiff-backup-lvm-snapshot
> >>
> >> I'm very impressed both with your rdiff-backup patch and the block-fuse
> >> application.
> >
> > I'm glad you find it useful! Unfortunately, I have found that the
> > sparse-destination patch for rdiff-backup is sometimes slow. I'm
> > running without sparse files until I can figure out a faster way to
> > detect blocks of 0-bytes. If you or someone on the list knows python
> > better than I, please take a look!
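
A minimal sketch of one way the zero-block test could be sped up in Python:
compare each block against a preallocated buffer of NUL bytes, so the check
runs in C rather than in a per-byte Python loop. The 64 KiB block size and
the helper names below are assumptions, not what the patch actually uses:

    import os

    BLOCK_SIZE = 64 * 1024                 # assumed block size
    ZERO_BLOCK = b"\x00" * BLOCK_SIZE      # preallocated once, reused for every test

    def is_zero_block(block):
        # Equality against the preallocated zero buffer is evaluated in C and
        # stops at the first non-zero byte.
        if len(block) == BLOCK_SIZE:
            return block == ZERO_BLOCK
        return block == b"\x00" * len(block)

    def sparse_copy(src_path, dst_path, block_size=BLOCK_SIZE):
        # Seek over zero blocks instead of writing them, leaving holes that
        # the destination filesystem stores sparsely.
        with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            while True:
                block = src.read(block_size)
                if not block:
                    break
                if is_zero_block(block):
                    dst.seek(len(block), os.SEEK_CUR)
                else:
                    dst.write(block)
            dst.truncate()                 # fix the length if the file ends in a hole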
> >
> >> Since you mentioned that you use this combination to back up images up
> >> to 350GB, I am interested to find out whether you have encountered
> >> problems with I/O-Wait.
> >
> > I'm using blockfuse+rdiff-backup after business hours, so if the VM
> > slows down, nobody (or very few) notice. The server runs 4x 1TB drives
> > in RAID-10, and block-IO peaks at ~225MB/sec. That 350GB volume was
> > recently extended to 600GB.
> >
> >> There is a Linux Kernel bug that causes I/O-Wait to skyrocket when
> >> copying large files, especially when those files are larger than the
> >> available memory.
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=12309
> >
> > Good to know; I was unaware of this bug. See comment #128: it looks like
> > using ext4 works a little better for writing, possibly because of
> > delayed allocation ("delalloc"). Since I'm using ext4 as my destination
> > backup filesystem, this could be the reason I am not experiencing the
> > same issue. I suppose it could be my RAID controller (LSI 9240)
> > buffering the IO overhead from the host CPU, too.
> >
> > What disk hardware are you using for source and destination?
> >
> >> In our case, a quad-core server with 8GB of RAM, running rdiff-backup on
> >> a block-fuse directory, is basically made unavailable by the symptoms
> >> I described above. All the virtual machines on it become unreachable.
> >
> > I have a feeling that this is due to backup-destination contention
> > rather than backup-source contention. BlockFuse mmaps the source
> > device, and I'm not certain if mmap'ed IO is cached or not. To
> > guarantee you are missing the source's disk cache, you could patch
> > blockfuse to use direct-IO (O_DIRECT), or backup from a "/dev/raw/rawX"
> > device. (Missing disk cache is important for backups, because backups
> > tend to be read-once. Thus, thrashing the cache affects the "good
> > stuff" in the cache.)
> >
> > For large files, rdiff-backup may benefit from writing with the O_DIRECT
> > flag (a hint from comment#128). Again, this would help miss the disk
> > cache.
> >
> > I'm backing up local-to-local; the source is a RAID-10 array, and the
> > destination is a slow 5400rpm 2TB single-disk as tertiary storage. Do
> > you backup local-to-local, or over a network?
> >
> >> If you have any experience with this in your backup scenarios, I would
> >> love to hear back from you.
> >
> > So far it works great on my side. I'm deploying this to back up LVM
> > snapshots of Windows VMs under KVM in about 2 weeks on different
> > hardware. I might have better insight then if I run into new issues.
> >
> >>
> >> Cheers,
> >> Sebastian
> >
>
>