I am running a WordPress site on an Ubuntu 20.04-based LEMP server.
I have the PageSpeed plugin enabled, and to force it to cache my website I am using wget from a second box to mirror the site. However, wget stops downloading after the first page (index.html) with the error:
nofollow attribute found in /tmp/ramdisk/www.example.com/index.html. Will not follow any links on this page
Below is the wget command I am using and its output:
wget -m -p -E -k -P /tmp/ramdisk/ https://www.example.com
--2022-05-17 16:41:40-- https://www.example.com/
Resolving www.example.com (www.example.com)... 1**.2*.1**.*
Connecting to www.example.com (www.example.com)|1**.2*.1**.*|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘/tmp/ramdisk/www.example.com/index.html’
www.example.com/index.html [ <=> ] 130.71K 210KB/s in 0.6s
Last-modified header missing -- time-stamps turned off.
2022-05-17 16:41:42 (210 KB/s) - ‘/tmp/ramdisk/www.example.com/index.html’ saved [133848]
nofollow attribute found in /tmp/ramdisk/www.example.com/index.html. Will not follow any links on this page
FINISHED --2022-05-17 16:41:42--
Total wall clock time: 2.0s
Downloaded: 1 files, 131K in 0.6s (210 KB/s)
Converting links in /tmp/ramdisk/www.example.com/index.html... 135.
42-93
Converted links in 1 files in 0.004 seconds.
How can I go about finding the nofollow attributes and removing them so wget will fully download my website?
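For context, wget prints this message when the page it just saved contains a robots meta tag whose content includes nofollow. Below is a minimal sketch of how the tag could be located in the saved copy, and how the mirror could be rerun with wget's documented `-e robots=off` setting, which tells wget to ignore robots.txt and the nofollow meta tag. The function names find_nofollow and mirror_ignoring_robots are hypothetical helpers made up for illustration, and the sketch assumes GNU grep and the paths from the command above.

```shell
#!/bin/sh

# Hypothetical helper: print every <meta name="robots" ...> tag found in a
# saved HTML file (path passed as the first argument).
# -o prints only the matched text, -i ignores case, -E enables extended regex.
# Returns non-zero when no such tag is present.
find_nofollow() {
    grep -oiE "<meta[^>]*name=[\"']robots[\"'][^>]*>" "$1"
}

# Hypothetical helper: rerun the mirror from the question, but with robot
# exclusion switched off so the nofollow meta tag no longer stops recursion.
# First argument is the URL, second is the target directory.
mirror_ignoring_robots() {
    wget -e robots=off -m -p -E -k -P "$2" "$1"
}

# Example usage (not executed here):
#   find_nofollow /tmp/ramdisk/www.example.com/index.html
#   mirror_ignoring_robots https://www.example.com /tmp/ramdisk/
```

If find_nofollow prints a tag such as one with content "noindex, nofollow", the source of that tag still has to be tracked down on the server side. On a WordPress site a likely candidate, though only a guess for this particular setup, is the "Discourage search engines from indexing this site" option under Settings > Reading, or an SEO plugin. Using `-e robots=off` avoids changing the site at all, which may be preferable when the tag is intentional.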