Re: [Bug-wget] Relative links / destination target
From: Michael
Subject: Re: [Bug-wget] Relative links / destination target
Date: Fri, 28 Jul 2017 21:44:48 +0300
Hi Darshit Shah,
I use this script to download my WordPress site, which sits at a hidden,
password-protected location (http://rootdir.finestmall.com/sustainabilityWordpress/):
wget2 --mirror -p --page-requisites --html-extension --adjust-extension
--convert-links --user=XXXXXX --password=XXXXXXX -e robots=off -P .
http://rootdir.finestmall.com/sustainabilityWordpress/
I do use --convert-links in the script, but I still get lines like these in the HTML file:
<link rel="canonical"
href="http://rootdir.finestmall.com/sustainabilityWordpress/" />
<meta name="twitter:image"
content="http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/plugins/all-in-one-seo-pack/images/default-user-image.png"
/>
<script type="text/javascript">
window._wpemojiSettings =
{"baseUrl":"https:\/\/s.w.org\/images\/core\/emoji\/2.3\/72x72\/","ext":".png","svgUrl":"https:\/\/s.w.org\/images\/core\/emoji\/2.3\/svg\/","svgExt":".svg","source":{"concatemoji":"http:\/\/rootdir.finestmall.com\/sustainabilityWordpress\/wp-includes\/js\/wp-emoji-release.min.js?ver=4.8"}};
<link rel='stylesheet' id='toc-screen-css'
href='http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/plugins/table-of-contents-plus/screen.min.css?ver=1509'
type='text/css' media='all' />
<link rel='stylesheet' id='parent-style-css'
href='http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/themes/virtue/style.css?ver=4.8'
type='text/css' media='all' />
<link rel='stylesheet' id='kadence_theme-css'
href='http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/themes/virtue/assets/css/virtue.css?ver=303'
type='text/css' media='all' />
<link rel='stylesheet' id='virtue_skin-css'
href='http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/themes/virtue/assets/css/skins/default.css'
type='text/css' media='all' />
<link rel='stylesheet' id='virtue_child-css'
href='http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/themes/virtuechildtheme/style.css'
type='text/css' media='all' />
I want the yellow-highlighted parts (the
http://rootdir.finestmall.com/sustainabilityWordpress/ prefix in the lines
above) removed or changed to the real domain name I use:
http://www.condo-farm.com/
So I suggest filtering the yellow parts to $new-domain-name and, if it
is not specified, removing the yellow parts altogether.
So what does "--convert-links" do? :)
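As a rough sketch of that filtering done outside wget: the helper below is hypothetical (the name rewrite_prefix is not a wget feature), it assumes a sed that supports -E, and it assumes the URLs contain no "%" or "&" characters. Dots in the old prefix are left unescaped, so they match any character. It rewrites both the plain form and the JSON-escaped form ("\/") seen in the wpemojiSettings line.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): reads HTML on stdin and replaces
# the old site prefix with a new one, in plain and JSON-escaped form.
#   $1 = old host and path without scheme and without trailing slash
#   $2 = new prefix without trailing slash (empty to drop the prefix)
rewrite_prefix() {
    old=$1
    new=$2
    # Build the JSON-escaped variants: every "/" becomes "\\/", which sed
    # reads as a literal backslash followed by a slash.
    old_esc=$(printf '%s' "$old" | sed 's%/%\\\\/%g')
    new_esc=$(printf '%s' "$new" | sed 's%/%\\\\/%g')
    sed -E \
        -e "s%https?://$old%$new%g" \
        -e "s%https?:\\\\/\\\\/$old_esc%$new_esc%g"
}
```

For example, `rewrite_prefix rootdir.finestmall.com/sustainabilityWordpress http://www.condo-farm.com < index.html` would print the page with the prefix swapped; passing an empty second argument leaves root-relative paths such as /wp-content/... instead.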
Michael
From: Darshit Shah [mailto:address@hidden]
Sent: Friday, July 28, 2017 3:44 PM
To: Michael
Cc: Bug-Wget
Subject: Re: [Bug-wget] Relative links / destination target
Hi Michael,
There is already a similar option to what you ask for. Is the "--convert-links"
option not sufficient?
On 28 July 2017 at 14:18, Michael <address@hidden> wrote:
Hello there,
To create static html file, I use the following script:
echo wget 3d-print-master site to html.
wget --mirror -p --page-requisites --html-extension --adjust-extension
--convert-links -e robots=off -P . http://3d-print-master.com/
destination="http://somewhere.com"
destination=""
find 3d-print-master -name "*.html" -exec sed -i -r -e
"s%https?:[\\]?/[\\]?/3d-print-master.com/%$destination%g" {} +
The generated HTML files contain references to the original site. I want to
convert the links to relative links, so that the site can be viewed from a
local disk for example, or to rewrite them to point to the target location.
I suggest adding a flag such as -destination http://www.somewhere.com
If it is not specified, the links become relative.
What do you think?
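One way to picture the proposed behaviour, as a sketch only: the two helpers below are hypothetical (relative_prefix and link_prefix are illustrative names, not wget functions). They assume each HTML file is identified by its path relative to the mirror root, and that the replacement should end in "/" because the sed pattern in the script consumes the trailing slash of the site URL.

```shell
#!/bin/sh
# Hypothetical helper: print the "../" prefix that leads from an HTML file
# back to the mirror root, given the file's path relative to that root.
relative_prefix() {
    path=$1
    prefix=""
    # Add one "../" for every directory component in the path.
    while [ "${path#*/}" != "$path" ]; do
        prefix="../$prefix"
        path=${path#*/}
    done
    printf '%s' "$prefix"
}

# Hypothetical helper: choose the replacement text for one file. Use the
# destination URL (with exactly one trailing slash) when it is set,
# otherwise fall back to the relative prefix for that file.
link_prefix() {
    destination=$1
    file=$2
    if [ -n "$destination" ]; then
        printf '%s/' "${destination%/}"
    else
        relative_prefix "$file"
    fi
}

# Usage sketch, run from inside the mirror directory:
#   find . -name '*.html' | while read -r f; do
#       p=$(link_prefix "$destination" "${f#./}")
#       sed -i -r -e "s%https?:[\\]?/[\\]?/3d-print-master.com/%$p%g" "$f"
#   done
```

Computing the prefix per file matters for the relative case: a page two directories deep needs "../../" in front of a root-level asset, while index.html at the root needs nothing.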
How can I implement it in wget code? Can I call system("find 3d-print-master
-name \"*.html\" -exec sed -i -r -e
\"s%https?:[\\]?/[\\]?/3d-print-master.com/%$destination%g\" {} +"); for that?
Michael
--
Thanking You,
Darshit Shah