coreutils

Re: RFC: new cp option: --efficient-sparse=HOW


From: Jim Meyering
Subject: Re: RFC: new cp option: --efficient-sparse=HOW
Date: Mon, 31 Jan 2011 23:27:44 +0100

Eric Blake wrote:

> On 01/31/2011 02:46 PM, Jim Meyering wrote:
>> Now that we can read sparse files efficiently,
>> what if I want to copy a 20PiB sparse file, and I want to
>> be sure that cp does so efficiently?  Few people can afford
>> to wait around while a normal processor and storage system process
>> that much raw data.  But if it's a sparse file and the src and dest
>> file systems have the right support (FIEMAP ioctl), then it'll be
>> copied in the time it takes to make a few syscalls.
>>
>> Currently, when the efficient sparse copy fails, cp falls back
>> on the regular, expensive, read-every-byte approach.
>>
>> This proposal adds an option, --efficient-sparse=required,
>> to make cp fail if the initial attempt to read the sparse file fails,
>> rather than resorting to the regular (very slow in the above case) copy
>> procedure.
>>
>> The default is --efficient-sparse=auto, and for symmetry,
>> I've provided --efficient-sparse=never, in case someone finds
>> a reason to want to skip the ioctl.
>
> Conversely, what happens if I have a file that contains large blocks of
> zeros but is NOT fully sparse (plausible, since we're still facing the
> fact that it is still not easy to punch holes into existing files when
> data in that portion of the file is no longer needed)?  Does all the new
> fiemap code still have the ability for me to request that the cp code
> specifically look for large blocks of zero in the source, rather than
> trusting the fiemap, so that I can create a copy that is more sparse
> than the original?  Does that also need a tunable; and if so, should we
> try to combine it into this tunable or is it orthogonal?

It's orthogonal.

--sparse=always still does the hole-punching, independently
of whether we're copying normally or via the efficient FIEMAP-based
code.

E.g., if you have a sparse file, where one non-sparse chunk
contains all-zero blocks (currently 32KiB minimum), then --sparse=always
will convert those blocks to holes, with or without
--efficient-sparse=never.
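
To make the hole-punching side concrete, here is a minimal sketch in
Python.  It is not cp's actual code (cp is written in C); the function
name copy_punching_holes and the BLOCK constant are made up for
illustration, with BLOCK set to the 32KiB figure mentioned above.  The
idea is only that any block read from the source that turns out to be
all zeros is skipped with a seek on the destination instead of being
written, regardless of how the source was read.

```python
import os

# Assumed granularity for this sketch, taken from the 32KiB figure above.
BLOCK = 32 * 1024


def copy_punching_holes(src_path, dst_path, block=BLOCK):
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    Hypothetical helper for illustration; whether the skipped ranges become
    real holes depends on the destination file system.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            buf = src.read(block)
            if not buf:
                break
            if buf.count(0) == len(buf):
                # All zeros: advance the output offset without writing.
                dst.seek(len(buf), os.SEEK_CUR)
            else:
                dst.write(buf)
        # If the file ends in a skipped block, extend it to the full size.
        dst.truncate()
```

Note that a run of zeros shorter than one block, or one that straddles
a block boundary next to real data, is written out as data in this
sketch.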

--efficient-sparse=... controls efficiency while reading
--sparse=...           controls hole-punching (or preserving)

BTW, the fact that the existing hole-punching code ignores any
run of zeros shorter than 32KiB is a bug that I will fix very soon.
I think it was introduced as an unwanted side effect of
increasing the buffer size for efficiency.

Thanks for the feedback.


