I am using the following command:

wget -r -p --page-requisites robots=off -U mozilla

to download a file called, I have also tried: wget -r -p --page-requisites robots=off -U mozilla

In the first case, an index.html file is downloaded. The HTML file appears to be the same size as the file I need.

I have also tried using the reject parameter to exclude HTML files, but then wget just downloads a file called index.html.tmp.

In the second case, wget fails with an error saying the file cannot be found.

If you visit the link in a browser, it will initiate a file download.

I'm not sure what is happening here. Any help is appreciated.


waltinator
Read `man wget`. There's an option to treat `index.html` as a list of links it should get. I'm not telling you which option, since I regard reading `man` pages as a sacred duty.

Maybe the website is using a script to start the file download. Let's try curl instead of wget:

curl -O -J -L -A "Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0"

If that still doesn't work, the website may require some specific event first, such as accepting cookies or running JavaScript.
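
A rough sketch of how one might check for those cases from the shell. The URL `https://example.com/download` is a placeholder, and the function names and the `cookies.txt` file name are made up for illustration. The first function prints only the response headers, so you can look for a redirect or a `Content-Disposition` header. The second retries the download while storing and resending cookies. Neither will help if the site needs JavaScript to trigger the download.

```shell
#!/bin/sh
# User agent string taken from the curl command above.
UA="Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0"

# Print the response headers only (following redirects) and discard the body.
# Look for "Location:" and "Content-Disposition:" lines in the output.
show_headers() {
    curl -sS -L -A "$UA" -D - -o /dev/null "$1"
}

# Download the file, saving any cookies the server sets to cookies.txt
# and sending them back on later requests in the redirect chain.
fetch_with_cookies() {
    curl -L -O -J -A "$UA" -c cookies.txt -b cookies.txt "$1"
}

# Usage (placeholder URL, replace with the real link):
#   show_headers "https://example.com/download"
#   fetch_with_cookies "https://example.com/download"
```

If the headers show a `Content-Disposition` line, the server is naming the file itself, which would explain why a plain fetch saves it as index.html.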

