[Bug-wget] [bug #20398] Save a list of the links that were not followed
From: Giuseppe Scrivano
Subject: [Bug-wget] [bug #20398] Save a list of the links that were not followed
Date: Mon, 10 Aug 2015 11:46:11 +0000
User-agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0
Update of bug #20398 (project wget):
Assigned to: tdlewis77 => gscrivano
Open/Closed: Open => Closed
Planned Release: 1.12.x => None
_______________________________________________________
Follow-up Comment #7:
Fixed upstream with:
commit e4db00d74d7c8ade43e57f39344d8505d607308a
Author: Jookia <address@hidden>
Date: Fri Jul 31 23:41:36 2015 +1000
Add option to write URL rejections to a tab-delimited CSV log.
* main.c: Add "--rejected-log" option.
* init.c: Add "rejectedlog" command.
* options.h: Add "rejected_log" parameter string.
* wget.texi: Add brief documentation on new --rejected-log option.
* recur.c: Optionally log details of URLs not traversed.
Add reject_reason enum.
(download_child_p -> download_child): Return a reject_reason.
(descend_redirect_p -> descend_redirect): Return a reject_reason.
(retrieve_tree): Support logging reasons for rejection.
Add write_reject_log_header that writes a CSV format header to a file.
Add write_reject_log_url that writes a url struct to a file in CSV format.
Add write_reject_log_reason that writes the URL and parent URL as well as the rejection reason to a CSV file.
* Test--rejected-log.px: Add a basic test for the --rejected-log command.
* tests/Makefile.am: Run Test--rejected-log.px.
This lets you work out why URLs are being rejected, with some context
around each rejection. CSV is used as the output format because it is
easy to parse. Fields are delimited by tabs instead of commas so that all
(quoted) URL characters can be used, and the file includes column names,
which may be used for compatibility.
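
As a rough illustration of how such a log might be consumed, here is a minimal Python sketch that reads a tab-delimited file with a header row and counts rows per value of one column. The column names (REASON, U_URL, P_URL), the reason values, and the sample rows below are assumptions made up for the example; they are not taken from this message, so check the header of a real log before relying on them.

```python
import csv
import io
from collections import Counter


def count_by_column(log_file, column):
    """Count rows per distinct value of `column` in a tab-delimited log
    whose first line is a header row of column names."""
    # QUOTE_NONE: fields are split on tabs only; no CSV-style quote handling.
    reader = csv.DictReader(log_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in header")
    return Counter(row[column] for row in reader)


# Hypothetical sample data (assumed column names and values, not real output).
SAMPLE_LOG = (
    "REASON\tU_URL\tP_URL\n"
    "DOMAIN\thttp://other.example/a\thttp://site.example/\n"
    "ROBOTS\thttp://site.example/private\thttp://site.example/\n"
    "DOMAIN\thttp://other.example/b\thttp://site.example/\n"
)

counts = count_by_column(io.StringIO(SAMPLE_LOG), "REASON")
for reason, n in counts.most_common():
    print(f"{reason}\t{n}")
```

To run this against a file written via --rejected-log, something like open(path, newline="") in place of the io.StringIO object should work. Looking columns up by header name rather than by position is one way to use the column names for compatibility if the set of columns changes.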
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?20398>
_______________________________________________
Message sent via Savannah
http://savannah.gnu.org/