emacs-bug-tracker
[debbugs-tracker] bug#33878: closed (zcat vs zcat -f -- different output)


From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#33878: closed (zcat vs zcat -f -- different output)
Date: Wed, 26 Dec 2018 18:05:01 +0000

Your message dated Wed, 26 Dec 2018 10:03:57 -0800
with message-id <address@hidden>
and subject line Re: bug#33878: zcat vs zcat -f -- different output
has caused the debbugs.gnu.org bug report #33878,
regarding zcat vs zcat -f -- different output
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
33878: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=33878
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: zcat vs zcat -f -- different output
Date: Wed, 26 Dec 2018 17:24:14 +0100

Hello guys.

I have a large amount of confidential gzip-compressed binary data.
These files _all_ share a very specific property: the output differs
depending on whether I use the "-f" flag of zcat (or gzip -d -c). One
additional line appears when I use the -f flag.

- I don't have the uncompressed versions of these files, nor the
actual tool used to compress them
- I am trying to create a reproducible example but have not yet succeeded

Here is what it looks like, with null bytes replaced by dots for
readability: (sorry for gmail's automatic line wrap, there are of
course only two lines per output)
$ file p.gz
p.gz: gzip compressed data, was "20181218.TXT", last modified: Wed Dec 19 08:59:07 2018, from NTFS filesystem (NT)
$ wc -c p.gz
9099264 p.gz
$ zcat p.gz | wc -c
48085600
$ zcat -f p.gz | wc -c
48085955
$ gzip -d -c p.gz | tail -2 | sed -n 'l 0' | sed 's/\\000/./g'
20010101AAAAAAAA 010120010101Q   AA....00A0000000AA0AA0AAA 0A AA 0101
         0012001010101:01T2001012001:0101:01AAAAAAD/S\r$
T000378625.....................
...............................................\r$
$ gzip -d -c -f p.gz | tail -2 | sed -n 'l 0' | sed 's/\\000/./g'
T000378625.....................
...............................................\r$
...................................................................................................................................................................................................................................................................................................................................................................$

That additional line containing only null bytes is not supposed to
appear. Is it some kind of padding that gzip did not handle
correctly?

If this is not yet an identified bug, here are my questions:

Do you know what could be happening?
Do you know how I could try to reproduce the problem on
non-confidential data for you to be able to debug?
(I already tried re-compressing both versions of the decompressed
files with this binary from 2007:
http://gnuwin32.sourceforge.net/packages/gzip.htm but the problem does
not happen)
I can contact the guys who created the files and ask them anything,
but I'd like to be sure of what to ask them because contacting them
repeatedly would be considered very rude. What should I ask them?

Thank you very much in advance for any reply which could make me
understand what is happening :)

Minato
PS: I am not subscribed to the mailing list yet



--- End Message ---
--- Begin Message ---
Subject: Re: bug#33878: zcat vs zcat -f -- different output
Date: Wed, 26 Dec 2018 10:03:57 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1

Namikaze Minato wrote:

> Do you know what could be happening?

When gzip -cdf sees junk input data, it simply copies it to standard output; this behavior is documented in the gzip manual (look for --force). Your input files have NUL-byte padding at the end, contrary to Internet RFC 1952.

> Do you know how I could try to reproduce the problem on
> non-confidential data for you to be able to debug?

$ (gzip </dev/null; printf '\0') >t.gz
$ gzip -cd <t.gz | od -c
0000000
$ gzip -cdf <t.gz | od -c
0000000  \0
0000001

Though it's not a bug....


--- End Message ---
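
Editorial note: the reproduction in the reply uses an empty payload and a single NUL byte. The sketch below extends it to a non-empty payload with several padding bytes, which is closer to the reported case, where the outputs differ by 48085955 - 48085600 = 355 bytes. It is only an illustration under stated assumptions: the file name padded.gz, the "hello" payload, and the 5-byte padding are hypothetical sample data, and the script assumes a gzip that treats trailing NUL bytes as described in the reply (ignored without -f, copied through with -cdf).

```shell
#!/bin/sh
# Sketch: compare plain and forced decompression of a gzip stream that is
# followed by NUL padding. Sample data and file names are hypothetical.
set -eu

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# One valid gzip member followed by 5 NUL bytes of padding.
{ printf 'hello\n' | gzip -c; printf '\0\0\0\0\0'; } > "$tmp/padded.gz"

# Byte counts without and with --force (tr strips any padding that wc adds).
plain=$(gzip -cd < "$tmp/padded.gz" | wc -c | tr -d ' ')
forced=$(gzip -cdf < "$tmp/padded.gz" | wc -c | tr -d ' ')
extra=$((forced - plain))

# Count the non-NUL bytes among the extra trailing bytes; 0 means the
# difference is pure NUL padding.
nonnul=$(gzip -cdf < "$tmp/padded.gz" | tail -c "$extra" | tr -d '\000' | wc -c | tr -d ' ')

echo "plain=$plain forced=$forced extra=$extra nonnul=$nonnul"
```

Running the same comparison on one of the real files (replacing padded.gz with p.gz) would show whether the extra bytes are all NUL. The reported compressed size, 9099264 bytes, is an exact multiple of 512 (512 x 17772), which may suggest that the tool that wrote the files pads them to a block boundary, though the thread does not confirm this.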