bug-coreutils

bug#6131: [PATCH]: fiemap support for efficient sparse file copy


From: jeff.liu
Subject: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Thu, 15 Jul 2010 09:48:28 +0800
User-agent: Thunderbird 2.0.0.14 (X11/20080505)

Hi Pádraig and Paul,

Thanks for your quick response.

Pádraig Brady wrote:
> On 14/07/10 18:45, Paul Eggert wrote:
>>>> I see fiemap just as a way to efficiently detect/read holes,
>>>> and should have no bearing on the destination.
>> Hmm, but the proposal quoted below would mean that fiemap does have a
>> bearing on the destination, in the --sparse=auto case.
>> I guess this is OK, but it should be documented.
>>
>>>> cp --sparse=auto (this is currently what cp does by default)
>>>>   recreate the original fiemap holes or resort to existing
>>>>   heuristic if fiemap not available
>> It's not just fiemap.  It's also the Solaris interface with SEEK_HOLE
>> and SEEK_DATA.  The change should involve a module that isolates these
>> low-level details from copy.c.  copy.c should ask the new module for the
>> locations of the holes (or the non-holes: that could be more convenient).
For extensibility, I think it is better to add a new file that wraps both fiemap and the
Solaris interface (I'll implement only the fiemap part for now), much as 'copy.c' shares
functions between cp(1) and mv(1).
Other utilities might later use it to add related features of their own.
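
To make the idea concrete, here is a rough sketch of what such a module could expose,
shown for the SEEK_DATA/SEEK_HOLE side.  All names here (scan_status, data_region,
next_data_region, count_data_bytes) are made up for illustration and are not existing
coreutils code; a fiemap backend would sit behind the same interface.

```c
#define _GNU_SOURCE             /* for SEEK_DATA and SEEK_HOLE on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical interface, for illustration only.  */
enum scan_status { SCAN_DATA, SCAN_END, SCAN_UNKNOWN, SCAN_ERROR };

struct data_region
{
  off_t start;                  /* first byte of data */
  off_t end;                    /* one past the last byte of data */
};

/* Find the first data region at or after POS in FD.
   SCAN_UNKNOWN means "I don't know where the holes are", so the
   caller can fall back to the existing heuristic.  */
static enum scan_status
next_data_region (int fd, off_t pos, struct data_region *r)
{
  assert (r != NULL);
#if defined SEEK_DATA && defined SEEK_HOLE
  off_t data = lseek (fd, pos, SEEK_DATA);
  if (data < 0)
    {
      if (errno == ENXIO)
        return SCAN_END;        /* no more data before end of file */
      if (errno == EINVAL)
        return SCAN_UNKNOWN;    /* this whence value is not supported */
      return SCAN_ERROR;
    }
  off_t hole = lseek (fd, data, SEEK_HOLE);
  if (hole < 0)
    return SCAN_ERROR;
  r->start = data;
  r->end = hole;
  return SCAN_DATA;
#else
  (void) fd;
  (void) pos;
  return SCAN_UNKNOWN;
#endif
}

/* Example caller: sum the data bytes of PATH, or return -1 if the
   hole layout is unknown or an error occurred.  */
static off_t
count_data_bytes (const char *path)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;
  off_t total = 0, pos = 0;
  struct data_region r;
  enum scan_status s;
  while ((s = next_data_region (fd, pos, &r)) == SCAN_DATA)
    {
      total += r.end - r.start;
      pos = r.end;
    }
  close (fd);
  return s == SCAN_END ? total : -1;
}
```

copy.c would then loop over the regions the way count_data_bytes does, copying each one
and seeking over the gaps, without knowing which kernel interface produced them.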

>> On traditional hosts without fiemap or SEEK_DATA, the module should report
>> that it doesn't know where the holes are; this can let copy.c resort to
>> the existing heuristic of looking at the size and the disk usage and
>> using the --sparse=always approach if the file "smells" like it's sparse.
>>
>>>> cp --sparse=never
>>>>   write all data, but use fiemap if available to efficiently read
>> Surely there's no need to write all the data if fallocate works.
>>
>>>> cp --sparse=always
>>>>   recreate original holes and perhaps extend add to them for
>>>>   other runs of zero bytes. Without having looked at the code
>>>>   I see this as a little tricky to mix with fiemap.
>>>>   Now since fiemap is only an optimization we can skip it
>>>>   completely for this uncommon case if too tricky (just add a FIXME for now).
>> Yes, that makes sense.  --sparse=always should never invoke fallocate.
>>
>>> For 'cp --sparse=never', when detected holes from SRC file, do not lseek(2) against DST file,
>>> instead, write ZEROs to DST file, Am I right?
>> Only if fallocate doesn't work.  If fallocate works, there's no need
>> to write zeros to the destination.
> 
> What you're describing here is posix_fallocate()
> which uses fallocate() if available or falls back
> to an implementation that writes a single 0 byte
> to each block.
> 
>>> 2. Performance optimization, invoke fallocate(2) if an extent flag is UNWRITTEN
>> This doesn't sound right.  A FIEMAP_EXTENT_UNWRITTEN extent is all zeros, and
>> so it should act as if it were a hole.  The goal is not to copy the exact
>> fiemap structure of the source (that's impossible): the goal is to use as
>> little time and space as possible.
A FIEMAP_EXTENT_UNWRITTEN extent is marked as allocated, although reading it returns
zeros through the filesystem.  So why not use fallocate(2) to handle it?  IMHO, that meets
the goal of using as little time and space as possible.  Am I missing something?

>> 
>>> If you decide to do that, then please do it as a separate patch.
>> It's not clear to me that the fiemap stuff can be cleanly separated
>> from the fallocate stuff.  To some extent they're the same issue.
>> If they can easily be separated, that's better of course.
> 
> I see fiemap as optimizing reads,
> posix_fallocate() as optimizing writing zeros
> and fallocate() as optimizing allocation.
> 
> So not having thought much about implementation details,
> it seems like they could be logically separated.
> I.E. we could optimize the writing zeros and allocation
> later when we have the fallocate and posix_fallocate
> gnulib modules in place.
I think so; it's better to wait until the changes are done in gnulib.  For now, we
can add a FIXME for both cases.
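
For reference, the code behind the FIXME might eventually look something like the sketch
below for the --sparse=never case.  This is only my guess at the shape: reserve_zero_range
is a made-up name, it calls plain posix_fallocate() rather than any gnulib wrapper, and it
is only valid for a destination range that has not been written yet, because
posix_fallocate() does not zero data that already exists.

```c
#define _POSIX_C_SOURCE 200809L /* for posix_fallocate and pwrite */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make [OFFSET, OFFSET + LEN) of FD allocated
   and readable as zeros.  Intended for a fresh destination range
   that corresponds to a hole in the source.  Return 0 on success,
   -1 with errno set on failure.  */
static int
reserve_zero_range (int fd, off_t offset, off_t len)
{
  if (len <= 0)
    return 0;

  /* posix_fallocate returns an error number; it does not set errno.  */
  int err = posix_fallocate (fd, offset, len);
  if (err == 0)
    return 0;
  if (err != EOPNOTSUPP && err != EINVAL && err != ENOSYS)
    {
      errno = err;
      return -1;
    }

  /* Allocation is not supported here: write the zeros explicitly.  */
  static const char zeros[4096];
  while (len > 0)
    {
      size_t n = len < (off_t) sizeof zeros ? (size_t) len : sizeof zeros;
      ssize_t w = pwrite (fd, zeros, n, offset);
      if (w < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      offset += w;
      len -= w;
    }
  return 0;
}
```

Either path leaves the range allocated and reading as zeros, which is what
--sparse=never needs, but the first one avoids pushing the zeros through write(2).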

> 
> In saying that, doing both now is better
> when these details are in everyone's minds.
> I'll not get to resubmitting my fallocate gnulib patch,
> or doing a posix_fallocate module, this week at least I think.
> 
> cheers,
> Pádraig.

Thanks,
-Jeff

-- 
With Windows 7, Microsoft is asserting legal control over your computer and is
using this power to abuse computer users.




