bug-wget

Re: [Bug-wget] feature request : wget --delete-missing


From: Micah Cowan
Subject: Re: [Bug-wget] feature request : wget --delete-missing
Date: Thu, 04 Mar 2010 15:16:09 -0800
User-agent: Thunderbird 2.0.0.23 (X11/20090817)

aneeskA wrote:
> Hi,
> 
> I am making use of wget to mirror my remote server. Previously I used to do
> it using curlftpfs (mount the remote directory and then rsyncing it). I made
> the switch to wget because it does a better job than curlftpfs.
> 
> In rsync, I can use the option "--delete" to delete extraneous files from my
> local repo. When I changed to wget everything worked fine except this. Wget
> simply has no option for it. Can you guys please add such a feature?
> 
> note: I found a patch for version 1.5.3 here :
> http://mrmt.net/linux/wget.html

The reason rsync can do this is that it has full knowledge of the
complete list of files on both the local host and the remote server.
HTTP doesn't provide this information, so having wget delete files
simply because it didn't happen to find a link to them would be
inappropriate behavior in general. This is particularly true if you're
using timestamping, which prevents wget from even parsing a file whose
timestamp hasn't changed, so it fails to find any links that appear
only in that file.

The page you linked to said (in Japanese) that it's foolish to have to
rm -r the directory before each mirror attempt with wget. I don't really
agree: since wget's going to have to download each and every file
_anyway_, just in order to ensure that it finds all the available links,
it hardly seems useful to leave the previous files around, just so they
get overwritten. If wget were made to parse the local files when it
realizes it doesn't have to re-download them, then that would help a
lot, but it doesn't currently, and trying to make it do so has some
potential problems (though it still might be worth it).
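
To make the "parse the local files" idea concrete, here is a rough
sketch in Python (not Wget code, and not how Wget currently behaves).
It collects the link targets from an already-downloaded HTML file so
that a crawler could follow them without fetching the page again. The
names LinkCollector and links_in_local_file are made up for this
illustration, and it only looks at href and src attributes, which is a
simplifying assumption.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: records href/src attribute values in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def links_in_local_file(path):
    """Return the raw link targets found in a locally stored HTML file."""
    collector = LinkCollector()
    with open(path, encoding="utf-8", errors="replace") as fh:
        collector.feed(fh.read())
    collector.close()
    return collector.links
```

The returned values are unresolved; a real implementation would still
have to resolve them against the original URL of the page, which is
one of the places where the potential problems come in.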

Such a feature might make more sense for FTP, where directory listings
do give Wget sufficient knowledge of what exists on the server, but the
patch you linked doesn't provide that.
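
For what it's worth, once a complete remote listing is available the
rsync-style comparison itself is simple. The following is a minimal
sketch in Python, not a patch for Wget: extraneous_files is a
hypothetical name, and remote_paths is assumed to be the full set of
server-side file paths, relative to the mirror root and separated by
"/". It only reports candidates and deletes nothing.

```python
import os


def extraneous_files(local_root, remote_paths):
    """Return local files (relative, '/'-separated) absent from remote_paths.

    remote_paths is assumed to be a COMPLETE listing of the remote tree;
    with an incomplete listing this would report files that still exist.
    """
    remote = set(remote_paths)
    extra = []
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), local_root)
            rel = rel.replace(os.sep, "/")  # normalise to remote-style paths
            if rel not in remote:
                extra.append(rel)
    return sorted(extra)
```

The completeness assumption is the whole point: a set of paths gathered
by following HTTP links can't be trusted to satisfy it, whereas a
recursive FTP listing plausibly can.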

Anyway, Wget currently lacks a maintainer (as of January), so I'm afraid
no one's going to add new features until that changes. I'm the former
maintainer, and am occasionally willing to apply easy bugfix patches,
but nothing beyond that.

-- 
Micah J. Cowan
http://micah.cowan.name/



