Re: [Bug-wget] How to ignore link like "index.html?lang=ja"?
From: Micah Cowan
Subject: Re: [Bug-wget] How to ignore link like "index.html?lang=ja"?
Date: Sun, 06 Jun 2010 13:54:28 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4
On 06/06/2010 01:46 PM, Guillaume Turri wrote:
> Tony Lewis wrote:
>> Guillaume Turri wrote:
>>
>>
>>> In fact, why is this option treated after a download?
>>>
>>
>> When mirroring, all HTML files have to be downloaded (whether or not it is
>> desired to ultimately keep the HTML file) in order to find all the
>> interesting files. For example:
>>
>> wget http://www.somesite.com/index.html --mirror --accept=pdf
> Indeed. I didn't realise it could be used that way.
>
> Thank you for this explanation.
Yeah, that was the original thinking. But I still hate it. For one
thing, there are no longer any guarantees that recurse-able HTML files
end in ".html"; for another, it does the wrong thing if you want to do
-r -l1 -A.pdf (just grab all the PDF links from the given page). It
would be better to let you explicitly specify which files to download,
plus a separately specified set of files to be deleted afterwards (or,
more accurately, files to download only for parsing/recursion purposes,
since at some point in the future we might not actually download all
files directly to disk just in order to parse them).
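
To make that split concrete, here is a minimal sketch of what two separately specified lists could look like. It is only an illustration: KEEP_PATTERNS, PARSE_ONLY_PATTERNS, matches and classify are hypothetical names, they do not correspond to any existing wget option, and the patterns are plain shell globs chosen for the example.

```shell
#!/bin/sh
# Hypothetical sketch of the proposed split; none of these names are wget options.
KEEP_PATTERNS='*.pdf'                      # files to download and keep
PARSE_ONLY_PATTERNS='*.html *.htm *.php'   # files fetched only to find links

# matches NAME PATTERN...: succeed if NAME matches any of the glob patterns.
matches() {
    name=$1
    shift
    for pattern in "$@"; do
        # $pattern is left unquoted here so that it is treated as a glob.
        case $name in
            $pattern) return 0 ;;
        esac
    done
    return 1
}

# classify NAME: print "keep", "parse-only" or "skip".
classify() {
    set -f    # no pathname expansion while the pattern lists are split
    if matches "$1" $KEEP_PATTERNS; then
        result=keep
    elif matches "$1" $PARSE_ONLY_PATTERNS; then
        result=parse-only
    else
        result=skip
    fi
    set +f
    echo "$result"
}

classify report.pdf    # prints "keep"
classify index.php     # prints "parse-only"
classify logo.png      # prints "skip"
```

With the two lists kept apart, a page such as index.php can be fetched for its links without having to end in ".html" and without ever being counted as a file to keep.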
--
Micah J. Cowan
http://micah.cowan.name/