Re: [Bug-wget] --header="Accept-encoding: gzip"

From:	andreas wpv
Subject:	Re: [Bug-wget] --header="Accept-encoding: gzip"
Date:	Wed, 23 Sep 2015 21:09:37 -0500

Thanks for the insights, and for working on the next version.
andreas

On Wed, Sep 23, 2015 at 3:10 AM, Tim Ruehsen <address@hidden> wrote:

> > wget --user-agent "Mozilla/5.0 (Windows NT x.y; WOW64; rv:10.0)
> > Gecko/20100101 Firefox/10.0" -e robots=off --header="accept-encoding:
> > gzip" -p -H "www.google.com"
> >
> > Still only gives me 52 kb! and one file: index.html
> >
> > So, accept encoding seems to work, but only for the main file?
>
> As Ángel said, the main file is gzipped, but wget can't parse it.
> That's why you get just one file (index.html). (This file could be named
> index.html.gz to reflect its content.)
> You could decompress it manually with gzip -d and feed the resulting HTML
> file to wget, like wget -r --force-html --input-file index.html --base
> www.google.com
>
> There have been patches to support gzip encoding, but either they were
> half-baked or the authors did not sign the FSF copyright assignment.
>
> *Note*
> [Meanwhile, we are working on wget2. Content encodings like gzip and
> deflate are already built in there, as are lzma and bzip2 for even better
> compression (but servers don't support them out of the box yet).]
>
> Regards, Tim
>
>
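
The workaround Tim describes above could be sketched as a small shell script.
This is only an illustration under stated assumptions: gunzip_if_needed and
fetch_page_with_requisites are hypothetical helper names (not part of wget),
the http:// prefix on --base is an assumption, and the wget options are the
ones quoted in the message. The page is decompressed through stdin/stdout so
that the missing .gz suffix on index.html does not matter.

```shell
#!/bin/sh
# Hypothetical helper: decompress a file in place if it holds gzip data.
gunzip_if_needed() {
    file=$1
    # gzip -t exits 0 only when the input is a valid gzip stream.
    if gzip -t < "$file" 2>/dev/null; then
        gzip -dc < "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    fi
}

# Hypothetical wrapper around the steps from the message.
# Nothing here touches the network until the function is called.
fetch_page_with_requisites() {
    host=$1
    # 1. Fetch the main page, telling the server that gzip is acceptable.
    wget --header="Accept-Encoding: gzip" -O index.html "$host" || return 1
    # 2. wget stores the body as received, so decompress it if necessary.
    gunzip_if_needed index.html
    # 3. Feed the plain HTML back to wget so it can parse the links.
    #    The http:// scheme on --base is an assumption.
    wget -r --force-html --input-file index.html --base "http://$host/"
}
```

It would be invoked as: fetch_page_with_requisites www.google.com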

Current Thread

[Bug-wget] --header="Accept-encoding: gzip", andreas wpv, 2015/09/22
- Re: [Bug-wget] --header="Accept-encoding: gzip", Ander Juaristi, 2015/09/22
- Re: [Bug-wget] --header="Accept-encoding: gzip", Ángel González, 2015/09/22
  - Re: [Bug-wget] --header="Accept-encoding: gzip", andreas wpv, 2015/09/22
    - Re: [Bug-wget] --header="Accept-encoding: gzip", Tim Ruehsen, 2015/09/23
    - Re: [Bug-wget] --header="Accept-encoding: gzip", andreas wpv <=
