[Bug-tar] Multiple path headers mixing sparse and xattrs


From: Dominique Martinet
Subject: [Bug-tar] Multiple path headers mixing sparse and xattrs
Date: Thu, 9 Jun 2016 13:22:53 +0200
User-agent: Mutt/1.5.23 (2014-03-12)

Hi,

(For archive-digging purposes: this looks a lot like
http://lists.gnu.org/archive/html/bug-tar/2010-11/msg00095.html, except
that the file name must contain a UTF-8, i.e. non-ASCII, component.)

We've noticed that the extracted path for some files is wrong if both
--sparse and --xattrs are used, the file is sparse, and its path contains
some "weird" (non-ASCII) characters.

Here's a full reproducer, run on today's git master branch:

$ cd $(mktemp -d)
$ mkdir -p t
$ dd if=/dev/urandom of=t/barbarbar bs=1M seek=50 count=1
$ cp t/barbarbar t/mumuµmu
$ tar --xattrs -S -c t | tar -t
t/
t/barbarbar
t/GNUSparseFile.6221/mumuµmu

I'm only listing here, but the file would be extracted to that path as well.
Looking at the raw archive, the problem is that the name is recorded twice
in the extended header for mumuµmu:
30 GNU.sparse.name=t/mumuµmu
...
38 path=t/GNUSparseFile.6236/mumuµmu

(while barbarbar only has GNU.sparse.name, and no path attribute)
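
For anyone who wants to check their own archives: pax extended header
records have the form "<length> <keyword>=<value>\n", where <length> counts
the whole record, including the digits, the space and the newline. Below is
a minimal sketch, in C, of walking such a block and counting the records for
a given keyword. It is not code from GNU tar; count_records is a made-up
helper name, and it assumes the extended header data is already in memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GNU tar: count the records in a pax
   extended header block whose keyword is exactly KEY.
   Returns -1 if the block is malformed.  */
int
count_records (const char *buf, size_t size, const char *key)
{
  size_t pos = 0, keylen = strlen (key);
  int count = 0;

  while (pos < size)
    {
      size_t len = 0, i = pos;

      /* Parse the decimal record length.  */
      while (i < size && buf[i] >= '0' && buf[i] <= '9')
        {
          if (len > size)
            return -1;
          len = len * 10 + (size_t) (buf[i++] - '0');
        }

      /* Need a digit, a space, and a record that fits and ends in '\n'.  */
      if (i == pos || i >= size || buf[i] != ' '
          || len > size - pos || len < i - pos + 2
          || buf[pos + len - 1] != '\n')
        return -1;

      const char *kw = buf + i + 1;
      size_t avail = pos + len - (i + 1);  /* bytes after the space */

      if (avail > keylen && memcmp (kw, key, keylen) == 0
          && kw[keylen] == '=')
        count++;

      pos += len;
    }
  return count;
}
```

Fed the two records shown above, this finds one "GNU.sparse.name" record
and one "path" record in the same header, which is the duplication in
question.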


For now I've just quick-and-dirty patched the path_decode function in my
own copy of src/xheader.c to take the first path, because it seems to
work™ and we're in a bit of a hurry.
Another workaround, as given in the mail linked at the start, would be to
use --sparse-version=0.


I guess the main fix should be to only output the name once, though.
Looking at the code (src/create.c, write_header_name), it seems that we
explicitly check !string_ascii_p (st->file_name) and write the extra
header in that case.
I'm not quite sure how to cleanly check that we have already written the
file name in another attribute, though...
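
To illustrate why only mumuµmu is affected: a check of this kind treats a
name as ASCII only if no byte has its high bit set, and µ is encoded in
UTF-8 as the two bytes 0xC2 0xB5. The sketch below is a standalone
illustration of such a test, not the code from src/create.c;
name_is_ascii is a made-up name.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an "is this name pure ASCII?" test:
   returns false as soon as a byte with the high bit set is seen.  */
bool
name_is_ascii (const char *name)
{
  for (; *name; name++)
    if ((unsigned char) *name & 0x80)
      return false;
  return true;
}
```

With this, "t/barbarbar" passes the test and "t/mumuµmu" does not, which
matches the observation that only the latter gets the extra path record.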

(Thinking about it again, we might want to keep backward compatibility and
handle archives made with existing tar versions, rather than only changing
how we write the output; so maybe always preferring GNU.sparse.name over
path, without relying on order, would be a better solution?)
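
A rough sketch of what "prefer GNU.sparse.name regardless of order" could
look like, in C. This is not a patch against src/xheader.c: the names
(struct name_ref, match_key, pick_member_name) are made up for the example,
and it assumes the whole extended header block is available in memory as
"<length> <keyword>=<value>\n" records.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types and helpers, not part of GNU tar.  */
struct name_ref { const char *ptr; size_t len; };

/* If the record body KW (AVAIL bytes, ending in '\n') has keyword KEY,
   store a reference to its value in OUT and return 1.  */
static int
match_key (const char *kw, size_t avail, const char *key,
           struct name_ref *out)
{
  size_t keylen = strlen (key);
  if (avail < keylen + 2 || memcmp (kw, key, keylen) != 0
      || kw[keylen] != '=')
    return 0;
  out->ptr = kw + keylen + 1;
  out->len = avail - keylen - 2;  /* drop '=' and the trailing '\n' */
  return 1;
}

/* Scan every record first, then choose: GNU.sparse.name wins over path
   no matter which one came first.  Returns 1 if a name was found,
   0 if none, -1 on a malformed block.  */
int
pick_member_name (const char *buf, size_t size, struct name_ref *name)
{
  struct name_ref path = { NULL, 0 }, sparse = { NULL, 0 };
  size_t pos = 0;

  while (pos < size)
    {
      size_t len = 0, i = pos;
      while (i < size && buf[i] >= '0' && buf[i] <= '9')
        {
          if (len > size)
            return -1;
          len = len * 10 + (size_t) (buf[i++] - '0');
        }
      if (i == pos || i >= size || buf[i] != ' '
          || len > size - pos || len < i - pos + 2
          || buf[pos + len - 1] != '\n')
        return -1;

      match_key (buf + i + 1, pos + len - (i + 1), "GNU.sparse.name", &sparse);
      match_key (buf + i + 1, pos + len - (i + 1), "path", &path);
      pos += len;
    }

  if (sparse.ptr)
    *name = sparse;
  else if (path.ptr)
    *name = path;
  else
    return 0;
  return 1;
}
```

The point of deferring the choice until all records have been read is that
archives already written by existing tar versions would still extract to
t/mumuµmu, whichever record happens to come last.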


Thanks,
-- 
Dominique Martinet


