From: Eric Wheeler
Subject: [rdiff-backup-users] Re: feedback to blog entry rdiff-backup-lvm-snapshot
Date: Fri, 28 Jan 2011 10:46:18 -0800
On Fri, 2011-01-28 at 11:49 +0100, Sebastian J. Bronner wrote:
> Hi Eric,
>
> thanks for your input.
>
> Patching block-fuse didn't work: the file handle the kernel passed to the
> application was created without O_DIRECT in every case, despite our adding
> O_DIRECT in the proper block-fuse function.
>
> Patching rdiff-backup to support O_DIRECT would also not help us. We
> discovered that it is not possible to open files from a fuse volume
> using O_DIRECT. The attempt is rejected with "Invalid argument".
>
> However, inspired by the specs you mentioned in your setup, we finally
> found a mention of the NFS mount option '-o noac'. Since we are backing up
> from RAID1 to NFS, this was interesting.
Interesting! Yes, that would do the trick! It may be useful to patch
blockfuse with a rate-limit option, slowing reads out of blockfuse and
preventing write-side saturation; a rough sketch of the idea follows below.
-Eric
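
A rough sketch of that rate-limit idea, written here in Python against fusepy
rather than the real C blockfuse. Only the read path is shown, and the class
name, device path, and fixed one-second-window throttle are illustrative
assumptions, not blockfuse code:

    import time
    from fuse import FUSE, Operations  # fusepy

    class ThrottledBlockFS(Operations):
        """Expose a block device read-only, capping read throughput."""

        def __init__(self, device_path, max_bytes_per_sec):
            self.device = open(device_path, "rb")
            self.rate = max_bytes_per_sec
            self.window_start = time.time()
            self.bytes_in_window = 0

        def read(self, path, size, offset, fh):
            # Fixed one-second window: once the byte budget is spent, sleep
            # out the remainder of the window before serving the next read.
            if self.bytes_in_window >= self.rate:
                elapsed = time.time() - self.window_start
                if elapsed < 1.0:
                    time.sleep(1.0 - elapsed)
                self.window_start = time.time()
                self.bytes_in_window = 0
            self.bytes_in_window += size
            self.device.seek(offset)
            return self.device.read(size)

    # Hypothetical usage: serve an LVM snapshot with reads capped at 50 MB/s.
    # (A mountable filesystem also needs getattr/readdir; omitted here.)
    # FUSE(ThrottledBlockFS("/dev/vg0/vm-snap", 50 * 1024 * 1024),
    #      "/mnt/blockfuse", foreground=True, ro=True)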
> In principle, this flag is meant to disable attribute caching, but it has
> the nice side effect of throttling the copy process's read speed to match
> its write speed when writing to NFS. This way, the output cache (buffers)
> doesn't fill up with unwritten data, so the system never reaches a point
> where it has to flush to disk before it can free buffers and hand that RAM
> to another process.
>
> This is exactly what we need, and the procedure now seems to be working for
> us without degrading server performance.
>
> Thanks again.
>
> Cheers,
> Sebastian
>
>
>
> On 27.01.2011 21:10, Eric Wheeler wrote:
> >> Hi Eric,
> >
> > Hi Sebastian,
> >
> > I'm cc'ing the rdiff-backup-users list too, they may have some insight
> > as well.
> >
> >> on LVM snapshots and came across your blog and your articles in that
> >> regard:
> >>
> >> http://www.globallinuxsecurity.pro/blog.php?q=rdiff-backup-lvm-snapshot
> >>
> >> I'm very impressed both with your rdiff-backup patch and the block-fuse
> >> application.
> >
> > I'm glad you find it useful! Unfortunately, I have found that the
> > sparse-destination patch for rdiff-backup is sometimes slow. I'm
> > running without sparse files until I can figure out a faster way to
> > detect blocks of 0-bytes. If you or someone on the list knows python
> > better than I, please take a look!
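
A minimal sketch of one way the zero-block test could be sped up in Python:
compare each block against a preallocated buffer of NUL bytes, so the check
runs in C rather than in a per-byte Python loop. The 64 KiB block size and
the helper names below are assumptions, not what the patch actually uses:

    import os

    BLOCK_SIZE = 64 * 1024                 # assumed block size
    ZERO_BLOCK = b"\x00" * BLOCK_SIZE      # preallocated once, reused for every test

    def is_zero_block(block):
        # Equality against the preallocated zero buffer is evaluated in C and
        # stops at the first non-zero byte.
        if len(block) == BLOCK_SIZE:
            return block == ZERO_BLOCK
        return block == b"\x00" * len(block)

    def sparse_copy(src_path, dst_path, block_size=BLOCK_SIZE):
        # Seek over zero blocks instead of writing them, leaving holes that
        # the destination filesystem stores sparsely.
        with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            while True:
                block = src.read(block_size)
                if not block:
                    break
                if is_zero_block(block):
                    dst.seek(len(block), os.SEEK_CUR)
                else:
                    dst.write(block)
            dst.truncate()                 # fix the length if the file ends in a hole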
> >
> >> Since you mentioned that you use this combination to back up images up
> >> to 350GB, I am interested to find out whether you have encountered
> >> problems with I/O-Wait.
> >
> > I'm using blockfuse+rdiff-backup after business hours, so if the VM
> > slows down, nobody (or very few) notice. The server runs 4x 1TB drives
> > in RAID-10, and block-IO peaks at ~225MB/sec. That 350GB volume was
> > recently extended to 600GB.
> >
> >> There is a Linux Kernel bug that causes I/O-Wait to skyrocket when
> >> copying large files, especially when those files are larger than the
> >> available memory.
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=12309
> >
> > Good to know; I was unaware of this bug. See comment #128: it looks like
> > using ext4 works a little better for writing, possibly because of
> > delayed allocation ("delalloc"). Since I'm using ext4 as my destination
> > backup filesystem, this could be the reason I am not experiencing the
> > same issue. I suppose it could be my RAID controller (LSI 9240)
> > buffering the IO overhead from the host CPU, too.
> >
> > What disk hardware are you using for source and destination?
> >
> >> In our case, a quad-core server with 8GB of RAM, running rdiff-backup on
> >> a block-fuse directory, is basically made unavailable by the symptoms
> >> I described above. All the virtual machines on it become unreachable.
> >
> > I have a feeling that this is due to backup-destination contention
> > rather than backup-source contention. BlockFuse mmaps the source
> > device, and I'm not certain if mmap'ed IO is cached or not. To
> > guarantee you are missing the source's disk cache, you could patch
> > blockfuse to use direct-IO (O_DIRECT), or backup from a "/dev/raw/rawX"
> > device. (Missing disk cache is important for backups, because backups
> > tend to be read-once. Thus, thrashing the cache affects the "good
> > stuff" in the cache.)
> >
> > For large files, rdiff-backup may benefit from writing with the O_DIRECT
> > flag (a hint from comment#128). Again, this would help miss the disk
> > cache.
> >
> > I'm backing up local-to-local; the source is a RAID-10 array, and the
> > destination is a slow 5400rpm 2TB single-disk as tertiary storage. Do
> > you backup local-to-local, or over a network?
> >
> >> If you have any experience with this in your backup scenarios, I would
> >> love to hear back from you.
> >
> > So far it works great on my side. I'm deploying this to back up LVM
> > snapshots of Windows VMs under KVM in about 2 weeks on different
> > hardware. I might have better insight then if I run into new issues.
> >
> >>
> >> Cheers,
> >> Sebastian
> >
>
>