bug-tar

Re: [Bug-tar] [PATCH] improved sparse file detection


From: Bernd Schubert
Subject: Re: [Bug-tar] [PATCH] improved sparse file detection
Date: Wed, 25 Aug 2010 01:38:02 +0200
User-agent: KMail/1.13.2 (Linux/2.6.32-24-generic; KDE/4.4.2; x86_64; ; )

On Wednesday, August 25, 2010, Eric Blake wrote:
> [adding coreutils]
> 
> On 08/24/2010 09:17 AM, Bernd Schubert wrote:
> > Hi all,
> > 
> > for improved stat() performance the Lustre filesystem uses entirely empty
> > sparse files on its metadata target (MDT). Now with hundreds of millions
> > of sparse files of huge sizes, creating a backup of the MDT using
> > vanilla gnu-tar is basically impossible, as it needs far too much time
> > to detect sparse files.
> 
> Coreutils cp(1) has recently started using code to efficiently iterate
> over the locations of all holes within sparse files, with the goal of
> eventually being able to target both Linux ioctls and Solaris SEEK_HOLE
> directives.  I think that could also be leveraged rather nicely for
> tar's detection of sparse files, by stopping the iteration after the
> first hole has been found; in particular, it would rapidly detect files
> that are not completely sparse (whereas the description of your patch
> implies that you only address the subset of quickly detecting a
> completely sparse file, but offer no speedup on partially sparse files).
>   Thus, coreutils' sparse file management is a great candidate for
> migrating into gnulib and sharing among several projects.

Yes, we know about FIEMAP and the Solaris lseek() approach, and that should be
the next step to add. However, the particular system on which we urgently need
efficient sparse file handling right now is running a RHEL 5.3 kernel, which
does not have FIEMAP support yet (it was added in RHEL 5.4).
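
For illustration only, here is a minimal sketch of the "stop after the first
hole" idea from the quoted paragraph, using lseek() with SEEK_HOLE. It is
written in Python rather than C for brevity and is not taken from coreutils or
from the patch; the function name first_hole_offset is hypothetical. It assumes
a platform where os.SEEK_HOLE is available and treats "not supported" the same
as "no hole found".

```python
import os


def first_hole_offset(path):
    """Return the offset of the first hole before EOF, or None.

    Hypothetical helper: None means either that no hole was found or
    that hole detection is unavailable on this platform/filesystem.
    """
    size = os.stat(path).st_size
    if size == 0 or not hasattr(os, "SEEK_HOLE"):
        return None
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            # One seek is enough: we only want the first hole.
            hole = os.lseek(fd, 0, os.SEEK_HOLE)
        except OSError:
            return None  # SEEK_HOLE rejected for this file
    finally:
        os.close(fd)
    # Every file has an implicit hole at EOF, so an offset equal to the
    # file size means no real hole was reported.
    return hole if hole < size else None
```

A caller would archive the file as sparse only when the result is not None.
A result of 0 does not by itself mean the file is entirely sparse; that would
need a further SEEK_DATA call, which is left out of this sketch.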

> 
> Meanwhile, if you are indeed correct that there are easy ways to detect
> completely sparse files, even when the ioctl or SEEK_HOLE directives are
> not present, then the coreutils cp(1) hole iteration routine should
> probably be taught that corner case to recognize an entirely sparse file
> as a single hole.
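
The patch itself is not reproduced in this message, so the following is only a
guess at what such an "easy way" could look like without FIEMAP or SEEK_HOLE:
compare the allocated block count from stat() against the file size. The
function name looks_entirely_sparse is hypothetical, and the check is a
heuristic. It may misfire on filesystems that store small files inline or that
update st_blocks lazily.

```python
import os


def looks_entirely_sparse(path):
    """Heuristic: non-empty file with zero allocated blocks.

    Assumption, not the method from the patch: a file whose st_size is
    positive but whose st_blocks is 0 contains no data at all.
    """
    st = os.stat(path)
    # st_blocks is missing on some platforms; treat that as "unknown".
    blocks = getattr(st, "st_blocks", None)
    return st.st_size > 0 and blocks == 0
```

The appeal of this test is that it needs only the stat() result, with no
open() or read() per file, which matters with hundreds of millions of files.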
> 
> > PS: I'm used to linux-style indentation and I'm not sure if I did it the
> > right way. If it is wrong, please complain and I will try to reformat
> > it.
> 
> Thanks for taking the time to contribute a patch.  However, the diffstat
> says that your patch is large enough to fall outside the bounds of
> trivial submissions, so I quit reading it to avoid any copyright issues.
>   Would you be willing to assign copyright to the FSF?  If so, we can
> start the paperwork process off-list.

Gosh, that patch is small, and I thought that code contributed to a GPL
project automatically becomes GPL code.
Could we please proceed with the paperwork ASAP? For Kit and me, please.


Thanks,
Bernd


-- 
Bernd Schubert
DataDirect Networks


