bug-tar

Re: [Bug-tar] Multiple path headers mixing sparse and xattrs


From: Dominique Martinet
Subject: Re: [Bug-tar] Multiple path headers mixing sparse and xattrs
Date: Thu, 23 Jun 2016 15:35:51 +0200
User-agent: Mutt/1.5.23 (2014-03-12)

Hi,

Dominique Martinet wrote on Thu, Jun 09, 2016 at 01:22:53PM +0200:
> (For archive-digging purposes, this looks a lot like
> http://lists.gnu.org/archive/html/bug-tar/2010-11/msg00095.html ; except
> that the file name must contain a UTF-8 (non-ASCII) component)
> 
> We've noticed the extracted path for some files is wrong IF both --sparse
> and --xattrs are used AND the file is sparse and its path contains some
> "weird" (non-ASCII) characters.
> 
> Here's a full reproducer, run on today's git master branch:
> 
> $ cd $(mktemp -d)
> $ mkdir -p t
> $ dd if=/dev/urandom of=t/barbarbar bs=1M seek=50 count=1
> $ cp t/barbarbar t/mumuµmu
> $ tar --xattrs -S -c t | tar -t
> t/
> t/barbarbar
> t/GNUSparseFile.6221/mumuµmu
> 
> I'm just listing here, but the file would be extracted under that path
> as well. Looking at the binary tar, the problem is that the path is
> recorded twice for mumuµmu:
> 30 GNU.sparse.name=t/mumuµmu
> ...
> 38 path=t/GNUSparseFile.6236/mumuµmu
> 
> (while barbarbar only has GNU.sparse.name, and no path attribute)
> 
> 
> For now I've just quick & dirty patched the path_decode function in my
> own src/xheader.c to take the first path, because it seems to work™ and
> we're in a bit of a hurry;
> another workaround, as given in the mail I quoted at the start, would be
> to use --sparse-version=0
> 
> 
> I guess the main fix should be to only output the header once though;
> looking at the code (src/create.c, write_header_name), it seems that we
> explicitly check !string_ascii_p (st->file_name) and write the extra
> header then.
> I'm not quite sure how to cleanly check that we already wrote the
> filename in another attribute then...
> 
> (Thinking back, we might want to stay backward compatible and handle
> archives made with existing tar versions rather than only changing the
> output; so maybe always preferring GNU.sparse.name over path, without
> relying on order, would be a better solution?)


Does anyone have an opinion on this?
Would you take a patch if I went through the trouble of implementing
either solution?

I don't mind which solution I implement, and both look feasible (either
not writing the improper path record when creating the archive, or
ignoring path if GNU.sparse.name is set when extracting); but I'd rather
not pick one and be told "no, we prefer the other one" after getting no
feedback... or just be plainly ignored.
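
To make the extraction-side option concrete, here is a rough Python sketch
of the idea (not tar's actual C code in src/xheader.c). It parses pax
extended header records of the form "LEN key=value\n" and then picks the
member name, preferring GNU.sparse.name over path whatever order they
appear in. The function names and the EXAMPLE header are made up for
illustration; the two records are the ones from the dump above, with the
elided records in between left out.

```python
def parse_pax_records(data):
    """Parse pax extended header records: b"LEN key=value\\n", repeated.

    LEN is the decimal byte length of the whole record, including the
    length digits, the space and the trailing newline.
    """
    records = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space].decode("ascii"))
        if length <= space - pos + 1:
            raise ValueError("record length too small")
        record = data[space + 1:pos + length]
        if not record.endswith(b"\n"):
            raise ValueError("record does not end with a newline")
        key, sep, value = record[:-1].partition(b"=")
        if not sep:
            raise ValueError("record has no '=' separator")
        records.append((key.decode("utf-8"), value.decode("utf-8")))
        pos += length
    return records


def resolve_member_name(records):
    """Return the name to extract to, independent of record order.

    Assumption for this sketch: when GNU.sparse.name is present it always
    wins over path, since path then holds the GNUSparseFile.<pid>
    placeholder rather than the real file name.
    """
    sparse_name = None
    path = None
    for key, value in records:
        if key == "GNU.sparse.name":
            sparse_name = value
        elif key == "path":
            path = value
    return sparse_name if sparse_name is not None else path


# Illustrative header built from the two records shown in the dump above.
EXAMPLE = (
    "30 GNU.sparse.name=t/mumuµmu\n"
    "38 path=t/GNUSparseFile.6236/mumuµmu\n"
).encode("utf-8")

if __name__ == "__main__":
    print(resolve_member_name(parse_pax_records(EXAMPLE)))
```

With this rule the placeholder path is ignored for archives that already
contain both records, so archives written by existing tar versions would
still extract to the right name.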


Thank you,
-- 
Dominique Martinet


