From: Andreas Dilger
Subject: Re: [Bug-tar] [PATCH] Remove nonportable check for files containing only zeroes
Date: Thu, 25 Jan 2018 15:49:04 -0700

On Jan 23, 2018, at 11:32 PM, Mark H Weaver <address@hidden> wrote:
> Andreas Dilger <address@hidden> writes:
> 
>> On Jan 23, 2018, at 1:44 AM, Mark H Weaver <address@hidden> wrote:
>>> Andreas Dilger <address@hidden> writes:
>>> 
>>>> On Jan 20, 2018, at 5:06 PM, Mark H Weaver <address@hidden> wrote:
>>>>> Yes, on Btrfs I reliably see (st_blocks == 0) on a recently written,
>>>>> mostly sparse file with size > 8G, using linux-libre-4.14.14.  More
>>>>> specifically, the "storing sparse files > 8G" test in tar's test suite
>>>>> reliably fails on my system:
>>>>> 
>>>>> 140: storing sparse files > 8G                       FAILED (sparse03.at:29)
>>>> 
>>>> I'd consider this a bug in Btrfs.
>>> 
>>> On what basis?  Can you formulate a precise rule regarding 'st_blocks'
>>> that is worth defending, that would enable this optimization, and that
>>> Btrfs is violating here?
>> 
>> We considered it a bug in ext4 and Lustre on the basis that it broke
>> existing tools (tar, and AFAIR cp) that were working fine when delayed
>> allocation and inline data features were not enabled.  Since we were in
>> a position to fix the filesystems faster than other tools (potentially
>> those beyond tar/cp that we were not aware of), we decided to return an
>> approximation for the st_blocks value to ensure that userspace tools do
>> not behave incorrectly.
> 
> Okay, but this is not an argument that the bug is in Btrfs.  It's merely
> a statement that you chose to work around the problem in your own code.

Well, if a filesystem behaves in a way that causes userspace applications
to lose data, then my preference is to fix the filesystem so that apps do
not lose data, if practical.  That is doubly true if there was a change in
the filesystem that introduced this data loss (delayed allocation and
inline data in ext4, or apparently tail packing in Btrfs).

When delayed allocation was first added to ext4, there was a different problem:
some tools, like the GTK preferences code, were creating hundreds of small files
in /etc but only calling fsync on the last one written.  That caused problems
because with delayed allocation only the data from the fsync'd file would be
written immediately (instead of all previously-written files, as with ext3).
A hard reboot or crash shortly before all of the files were flushed (either by
fsync, or by the 30s kernel page aging, rather than ext3's 5s journal commit
time) could leave many zero-length files in the filesystem.  While POSIX allows
the ext4 delayed-allocation behaviour, it was a surprise to many users, and we
implemented a heuristic to avoid the data loss rather than just pointing at the
standard and saying "but POSIX!!!!".
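The fragile write pattern described above, and the now-standard safe alternative, can be sketched roughly like this (the function names and file layout are hypothetical illustrations, not the actual GTK code):

```python
import os

# Fragile pattern from the ext4 story above: many small files written,
# but fsync called only on the last one.  With delayed allocation, a
# crash before writeback could leave the earlier files zero-length.
def save_prefs_fragile(prefs, dirname):
    names = sorted(prefs)
    for i, name in enumerate(names):
        with open(os.path.join(dirname, name), "w") as f:
            f.write(prefs[name])
            if i == len(names) - 1:       # only the final file is fsync'd
                f.flush()
                os.fsync(f.fileno())

# Safer pattern: write to a temporary name, fsync, then rename over the
# old file, so a crash leaves either the old or the new contents intact,
# never an empty file.
def save_pref_safe(path, data):
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)                 # atomic rename on POSIX
```

The write-fsync-rename dance is exactly what the ext4 heuristic ended up protecting applications that skipped it.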

Of course, it is also preferable that the heuristics in tar that trigger this
issue in userspace be fixed, via SEEK_DATA and improved handling of the small
files that fit in the inode and report zero blocks; but a very large Btrfs file
that holds data yet reports zero blocks just seems plain wrong to me.
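As a rough sketch of the SEEK_DATA approach (this is not tar's actual code; os.SEEK_DATA requires Linux >= 3.1, and filesystems without native support either fall back to treating the whole file as data or return EINVAL on older kernels):

```python
import errno
import os

def file_has_data(path):
    """Probe for data extents with SEEK_DATA instead of trusting st_blocks.

    Returns False only if the kernel says there is no data extent at or
    after offset 0 (ENXIO), i.e. the file is empty or entirely a hole.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            os.lseek(fd, 0, os.SEEK_DATA)
            return True                   # some data extent exists
        except OSError as e:
            if e.errno == errno.ENXIO:
                return False              # no data at or after offset 0
            raise                         # e.g. EINVAL: SEEK_DATA unsupported
    finally:
        os.close(fd)
```

Unlike the st_blocks check, this asks the filesystem directly where the data is, so delayed allocation, inline data, and tail packing cannot make a populated file look empty.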

>>>> As mentioned previously, we had the same problem with ext4 (twice) and
>>>> Lustre, and in both cases fixed this by adding in the (dirty page
>>>> cache pages/512) if the current block count is zero:
>>> 
>>> Would you like to propose a fix to the Btrfs developers?
>> 
>> I don't use Btrfs and don't know anything about the code, so am not really
>> in a position to do that, but would be happy to discuss it with them if
>> you CC me on a thread/bugzilla related to the issue.
> 
> I'm not going to make a proposal to the Btrfs developers that I don't
> believe in.  I believe that the bug is in GNU tar.

If one considers evidence like "50 filesystem implementations behave one way,
but Btrfs behaves differently and loses data as a result", then it makes sense
to me that Btrfs would change its behaviour to conform to the "normal" one.
That is doubly true if the change is trivial to implement.

> You are one of the few people here arguing that GNU tar should continue
> making this assumption about file system behavior, and that Btrfs should
> be changed to conform to that assumption.  You are also a Linux file
> system developer.  This makes you uniquely motivated and qualified to
> make this proposal to the Btrfs developers.

Sure.  I thought this issue arose out of an ongoing discussion with the Btrfs
developers, but I couldn't find any links to a mail thread or bugzilla ticket
to start from.  If this has already been discussed with the Btrfs developers,
I'd like to at least read the background material so I don't get flamed into
an ash heap for raising it there again.

> If no one wants to engage with the Btrfs developers about this issue,
> then we need to accept the fact that GNU tar's assumptions about
> 'st_blocks' are simply false, and no one wants to do the work to make
> them true again.  In that case, we must remove the faulty optimization,
> as in my proposed patch.  One way or another, we need for GNU tar to
> work well with Linux.
> 
> Does this make sense?

Partly, yes.  That said, older versions of tar will still exist, and
other applications may have made the same assumptions (st_blocks == 0
means a file has no data), so even if tar is updated it will not fix
the problem completely for Btrfs users.
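A conservative application-side version of that assumption (a sketch for illustration, not tar's implementation) would trust st_blocks == 0 only when st_size is also zero, and otherwise read the file rather than skip it:

```python
import os

def appears_empty(path):
    """Cautious replacement for the 'st_blocks == 0 => no data' shortcut.

    Only trust a zero block count when the size is also zero; for any
    nonzero-size file with zero blocks (a hole, delayed allocation, or
    inline data), read the contents rather than assume they are absent.
    """
    st = os.stat(path)
    if st.st_size == 0:
        return True
    if st.st_blocks > 0:
        return False
    # st_blocks == 0 but st_size > 0: read to find out, don't assume.
    with open(path, "rb") as f:
        chunk = f.read(65536)
        while chunk:
            if chunk.count(b"\0") != len(chunk):
                return False              # found nonzero bytes
            chunk = f.read(65536)
    return True                           # entirely zero bytes (a hole)
```

The cost is one full read in the rare zero-blocks-nonzero-size case, which is cheap insurance against silently archiving a populated file as empty.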

Cheers, Andreas
