
Re: [Bug-wget] WARC output


From: Gijs van Tulder
Subject: Re: [Bug-wget] WARC output
Date: Wed, 10 Aug 2011 11:38:51 +0200

Giuseppe Scrivano writes:

>> The implementation makes use of the open source WARC Tools library
>> (Apache License 2.0):
>>   http://code.google.com/p/warc-tools/
>
> how much code is really needed from that library?  I wonder if we can
> avoid this dependency at all.

The library comes with some utilities, an HTTrack plugin, a Java module, etc. Wget does not need these extras. Of the C library itself, however, I used pretty much everything: it handles all of the WARC writing. It can also read WARCs, but that's not needed here.

Rough estimate: 12,000 lines of code (excluding comments).

It's probably important to note that I have changed a few small things in the warc-tools library. (The changes are recorded in Git.)


As for the other dependencies:
- I used an MIT-licensed base32 encoder (there seems to be no such
  module in Gnulib), but it is quite small, so it could be replaced;
- it links to the UUID library.


> Can you please track all contributors?  Any contribution to GNU wget
> requires copyright assignments to the FSF.

Yes, it's all in the Git history, so it's easy to make a list. (There's only one other contributor of code, others helped with testing.)

> In the meanwhile, can you check if you are following the GNU Coding
> Standards for the new code?

I tried to do that. So except for the warc-tools library, which uses a different standard, all new code follows the GNU standards (I hope).

Thanks,

Gijs


