From: | Antonio Diaz Diaz |
Subject: | Re: [PATCH] change default --format from gnu to posix |
Date: | Wed, 15 Dec 2021 17:10:01 +0100 |
User-agent: | Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14 |
Paul Eggert wrote:
This discussion suggests the need for a new, easy-to-use format option, which is like '-Hpax' except that it omits atime and ctime, and omits the subseconds part of mtime. Using this format would mean that pax extensions won't be used unless they're needed (a file with a long name, a timestamp past the year 2246, etc.) and so the tarball would be more portable to platforms with older or buggy tarball extractors. Maybe we could call this the 'art' format, for "archive reproducible tarball", so that people could use 'tar -Hart' for it. Like 'ustar', 'art' format would be a strict subset of 'pax' format so it would be POSIX-conforming. We could introduce the new option in the next release of GNU tar, and think about changing the default format to it in a later release. What do you think?
I like the idea. In fact, it is what I do in tarlz. I would only suggest making 'tar -Hart' protect the extended records with a CRC, as tarlz does. See
http://www.nongnu.org/lzip/manual/tarlz_manual.html#Amendments-to-pax-format and
http://www.nongnu.org/lzip/manual/tarlz_manual.html#key_005fcrc32

GNU.crc32
CRC32-C (Castagnoli) of the extended header data excluding the 8 bytes representing the CRC <value> itself. The <value> is represented as 8 hexadecimal digits in big endian order, '22 GNU.crc32=00000000\n'. The keyword of the CRC record is protected by the CRC to guarantee that corruption is always detected (except in case of CRC collision). A CRC was chosen because a checksum is too weak for a potentially large list of variable sized records. A checksum can't detect simple errors like the swapping of two bytes.
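
To make the GNU.crc32 record concrete, here is a minimal Python sketch of how such a record might be produced and checked. It is not the tarlz code. It assumes that the CRC record is the last record of the extended header, that "excluding the 8 bytes" means those 8 bytes are skipped when computing the CRC, and that the hexadecimal digits are upper case. The helper names (pax_record, add_crc, verify, byte_sum) are made up for illustration. The last function shows why a plain byte sum misses a swap of two bytes that the CRC catches.

```python
def _make_table():
    """Build the byte-wise lookup table for reflected CRC32-C."""
    poly = 0x82F63B78  # CRC32-C (Castagnoli) polynomial, bit-reflected
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ poly if c & 1 else c >> 1
        table.append(c)
    return table

_TABLE = _make_table()

def crc32c(data: bytes) -> int:
    """CRC32-C with the usual initial value and final XOR of 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF

def pax_record(keyword: str, value: str) -> bytes:
    """Format one pax extended record: '<length> <keyword>=<value>\\n'.
    The length counts the whole record, including its own digits."""
    body = f" {keyword}={value}\n".encode()
    total = len(body) + len(str(len(body)))
    while total != len(body) + len(str(total)):
        total = len(body) + len(str(total))
    return str(total).encode() + body

# Placeholder record from the manual; its length field is 22 bytes.
CRC_PLACEHOLDER = b"22 GNU.crc32=00000000\n"

def add_crc(records: bytes) -> bytes:
    """Append the CRC record (assumed last) and fill in its value."""
    data = records + CRC_PLACEHOLDER
    at = len(data) - 9  # 8 hex digits followed by the final newline
    crc = crc32c(data[:at] + data[at + 8:])  # skip the 8 value bytes
    return data[:at] + b"%08X" % crc + data[at + 8:]

def verify(data: bytes) -> bool:
    """Recompute the CRC and compare it with the stored value."""
    at = len(data) - 9
    try:
        stored = int(data[at:at + 8], 16)
    except ValueError:
        return False
    return crc32c(data[:at] + data[at + 8:]) == stored

def byte_sum(data: bytes) -> int:
    """A plain additive checksum, for comparison only."""
    return sum(data) & 0xFFFFFFFF

header = add_crc(pax_record("path", "ab/cd"))
i = header.index(b"ab")
swapped = header[:i] + b"ba" + header[i + 2:]  # swap two adjacent bytes
print(verify(header), verify(swapped))
print(byte_sum(header) == byte_sum(swapped))
```

Because the keyword and length of the CRC record are included in the computation, damage to the text "22 GNU.crc32=" also makes verification fail in this sketch.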
Antonio.