bug-wget

Re: [Bug-wget] Relative links / destination target


From: Michael
Subject: Re: [Bug-wget] Relative links / destination target
Date: Fri, 28 Jul 2017 21:44:48 +0300

Hi Darshit Shah,

 

I use this command to download my WordPress site, which sits at a hidden, 
password-protected location, http://rootdir.finestmall.com/sustainabilityWordpress/:

wget2 --mirror -p --page-requisites --html-extension --adjust-extension \
  --convert-links --user=XXXXXX --password=XXXXXXX -e robots=off -P . \
  http://rootdir.finestmall.com/sustainabilityWordpress/

 

I do use --convert-links in the script, yet the generated HTML file still 
contains lines such as:

 

<link rel="canonical" 
href="http://rootdir.finestmall.com/sustainabilityWordpress/" />

<meta name="twitter:image" 
content="http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/plugins/all-in-one-seo-pack/images/default-user-image.png" />

 

<script type="text/javascript">

                        window._wpemojiSettings = 
{"baseUrl":"https:\/\/s.w.org\/images\/core\/emoji\/2.3\/72x72\/","ext":".png","svgUrl":"https:\/\/s.w.org\/images\/core\/emoji\/2.3\/svg\/","svgExt":".svg","source":{"concatemoji":"http:\/\/rootdir.finestmall.com\/sustainabilityWordpress\/wp-includes\/js\/wp-emoji-release.min.js?ver=4.8"}};

 

<link rel='stylesheet' id='toc-screen-css'  
href='http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/plugins/table-of-contents-plus/screen.min.css?ver=1509'
 type='text/css' media='all' />

<link rel='stylesheet' id='parent-style-css'  
href='http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/themes/virtue/style.css?ver=4.8'
 type='text/css' media='all' />

<link rel='stylesheet' id='kadence_theme-css'  
href='http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/themes/virtue/assets/css/virtue.css?ver=303'
 type='text/css' media='all' />

<link rel='stylesheet' id='virtue_skin-css'  
href='http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/themes/virtue/assets/css/skins/default.css'
 type='text/css' media='all' />

<link rel='stylesheet' id='virtue_child-css'  
href='http://rootdir.finestmall.com/sustainabilityWordpress/wp-content/themes/virtuechildtheme/style.css'
 type='text/css' media='all' />

 

I want the yellow (highlighted) parts, i.e. the original host prefix, removed 
or changed to the real domain name I use: http://www.condo-farm.com/

 

So I suggest filtering the yellow parts: replace them with $new-domain-name, 
and if that is not specified, remove the yellow parts altogether.
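
The filtering suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: GNU sed is available, rewrite_prefix is a hypothetical helper name (not a wget or wget2 option), and neither prefix contains "%", "&" or regex metacharacters other than "." and "/". It rewrites both the plain form of the prefix and the JSON-escaped form (http:\/\/...) that appears in the wp-emoji script block.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): replace URL prefix OLD with NEW
# in the given files. Usage: rewrite_prefix OLD NEW FILE...
rewrite_prefix() {
    old=$1; new=$2; shift 2
    # Escape dots so they match literally in the regex.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    # JSON-escaped variants: every "/" appears as "\/" in the HTML, so the
    # regex and the replacement both need "\\/" for each slash.
    old_json_re=$(printf '%s' "$old_re" | sed 's%/%\\\\/%g')
    new_json=$(printf '%s' "$new" | sed 's%/%\\\\/%g')
    # Assumes GNU sed (-i without a backup suffix).
    sed -i -e "s%$old_json_re%$new_json%g" -e "s%$old_re%$new%g" "$@"
}
```

For the site above this might be called as rewrite_prefix 'http://rootdir.finestmall.com/sustainabilityWordpress/' 'http://www.condo-farm.com/' index.html; passing an empty second argument removes the prefix instead.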

 

So, what does "--convert-links" do? :)

 

Michael

 

 

From: Darshit Shah [mailto:address@hidden]
Sent: Friday, July 28, 2017 3:44 PM
To: Michael
Cc: Bug-Wget
Subject: Re: [Bug-wget] Relative links / destination target

 

Hi Michael,

There is already a similar option to what you ask for. Is the "--convert-links" 
option not sufficient?

 

On 28 July 2017 at 14:18, Michael <address@hidden> wrote:


Hello there,

To create a static HTML copy of a site, I use the following script:
echo wget 3d-print-master site to html.
wget --mirror -p --page-requisites --html-extension --adjust-extension \
  --convert-links -e robots=off -P . http://3d-print-master.com/

# Target prefix for the rewritten links; the second (empty) assignment
# overrides the first and strips the prefix instead.
destination="http://somewhere.com"
destination=""
find 3d-print-master -name "*.html" -exec sed -i -r -e \
  "s%https?:[\\]?/[\\]?/3d-print-master.com/%$destination%g" {} +

The generated HTML files contain references to the original site. I want to
convert the links to relative links, so that the site can be viewed from a
local disk, for example, or to rewrite them to point at a new target location.

I suggest adding a flag like -destination http://www.somewhere.com
If it is not specified, the links will become relative.
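
One caveat with simply deleting the prefix: a link such as wp-content/style.css then resolves relative to each page's own directory, so it only works for pages at the top of the mirror. A rough sketch of one way around this, assuming GNU sed and a hypothetical helper named relativize (not part of wget), is to replace the prefix with the right number of ../ segments for each file's depth below the mirror root. It assumes paths without "%" or newlines and handles only the plain (not JSON-escaped) form of the URL.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): make links under URL prefix OLD
# relative in every *.html file below ROOT. Usage: relativize ROOT OLD
relativize() {
    root=${1%/}; old=$2
    # Escape dots so they match literally in the regex.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    find "$root" -name '*.html' | while IFS= read -r file; do
        # Path of the file relative to ROOT, e.g. "blog/post/index.html".
        rel=${file#"$root"/}
        # Directory part only, e.g. "blog/post/" (empty for top-level files).
        case $rel in
            */*) dir=${rel%/*}/ ;;
            *)   dir= ;;
        esac
        # Turn every directory component into "../".
        up=$(printf '%s' "$dir" | sed 's%[^/]*/%../%g')
        # Assumes GNU sed (-i without a backup suffix).
        sed -i -e "s%$old_re%$up%g" "$file"
    done
}
```

With the script above this might be run as relativize 3d-print-master 'http://3d-print-master.com/' after wget has finished.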

What do you think?

How can I implement it in wget code? Can I call system("find 3d-print-master
-name \"*.html\" -exec sed -i -r -e
\"s%https?:[\\]?/[\\]?/3d-print-master.com/%$destination%g\" {} +");
for that?

Michael




-- 

Thanking You,
Darshit Shah


