From: David Ryskalczyk
Subject: Re: [Bug-wget] Invalid Content-Length header in WARC files, on some platforms
Date: Tue, 13 Nov 2012 10:17:31 -0500

I found the bug in the first place after using wget in WARC mode on
ARM and PPC systems and having trouble extracting the files.

I believe the issue stems from this line in warc.c:

  if (! asprintf (&content_length, "%ld", ftello (data_in)))

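ftello returns a value of type off_t, which can be 32 or 64 bits wide.
%ld is the format specifier for a long, and on most Unix-like systems
a long is 32 bits on 32-bit platforms but 64 bits on 64-bit platforms.
On Windows a long is 32 bits, whether the platform is 32-bit or 64-bit.
So whenever off_t is 64 bits wide but long is only 32, the argument
passed to asprintf does not match the format specifier.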
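
A minimal sketch of one way to avoid the mismatch, assuming a C99
library that provides intmax_t and the %jd conversion. The helper name
format_offset is hypothetical and is not taken from warc.c; it only
illustrates casting the off_t to a type whose format specifier is known.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper (not part of wget): write OFFSET as decimal text
   into BUF.  The cast to intmax_t makes the argument match %jd no matter
   how wide off_t is on the target platform.
   Returns 0 on success, -1 if BUF is too small or formatting fails.  */
static int
format_offset (char *buf, size_t size, off_t offset)
{
  assert (buf != NULL && size > 0);
  int n = snprintf (buf, size, "%jd", (intmax_t) offset);
  return (n >= 0 && (size_t) n < size) ? 0 : -1;
}
```

The same cast would work with asprintf; snprintf is used here only
because it is standard C and needs no feature-test macro.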

What confused me here is that this works fine with Intel 32-bit x86,
at least on Mac OS X and Linux. It does not work at all with 32-bit
PowerPC or 32-bit ARM.

I'm fairly certain that the configure script for wget sets
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 (or whatever is necessary
for the platform) unless --disable-largefile is specified.
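
One way to check what a given build actually gets is to compare the
widths directly. This is a rough sketch, assuming a POSIX system where
_FILE_OFFSET_BITS is honoured; the function names are hypothetical and
the macro must be defined before any system header is included.

```c
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64 /* assumption: set here instead of via -D */
#endif
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>

/* Width of off_t in bits for this build.  */
static int
off_t_bits (void)
{
  return (int) (sizeof (off_t) * CHAR_BIT);
}

/* Nonzero when long and off_t have the same size, which is the only
   case where passing an off_t to "%ld" can be expected to work.  */
static int
long_matches_off_t (void)
{
  return sizeof (long) == sizeof (off_t);
}

/* Print both widths so the result can be compared across platforms.  */
static void
print_widths (FILE *out)
{
  assert (out != NULL);
  fprintf (out, "off_t: %d bits, long: %d bits, match: %s\n",
           off_t_bits (), (int) (sizeof (long) * CHAR_BIT),
           long_matches_off_t () ? "yes" : "no");
}
```

Calling print_widths (stdout) on each affected platform would show
whether the large-file flags took effect there.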

Again, the main reason I'm a bit confused here is that I can't
trigger this issue on 32-bit Intel platforms.

There's also another, less critical error: asprintf returns -1 if it
fails and the number of bytes printed if it succeeds. It does not
return 0 on failure, so the "! asprintf (...)" test never detects an
error. This shouldn't cause the current problem though.
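
A sketch of a check that follows that return convention, assuming a
libc that provides asprintf (it is a GNU/BSD extension, not ISO C).
The helper name format_content_length is hypothetical; it is not the
code in warc.c.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc may need this to declare asprintf */
#endif
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: store a malloc'd decimal string for SIZE in *OUT.
   Returns 0 on success and -1 on failure, in which case *OUT is NULL.
   The caller frees *OUT.  */
static int
format_content_length (char **out, off_t size)
{
  assert (out != NULL);
  /* asprintf signals failure with a negative value, never with 0.  */
  if (asprintf (out, "%jd", (intmax_t) size) < 0)
    {
      *out = NULL; /* contents of *out are unspecified after a failure */
      return -1;
    }
  return 0;
}
```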


--Dave
