Recursive Wget download

Wget can recursively download data or web pages. This is a key feature Wget has that cURL lacks: cURL is a library with a command-line front end, while Wget is purely a command-line tool. Recursive download requires several Wget options, for example:

wget --recursive -np -nc -nH --cut-dirs=4 --random-wait --wait 1 -e robots=off https://site.example/aaa/bbb/ccc/ddd/

This downloads the files into whatever directory you run the command from. To recursively download over FTP instead, simply change https:// to ftp:// and give the FTP directory path.
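
For example, the equivalent FTP command (the host name and path here are placeholders) is:

wget --recursive -np -nc -nH --cut-dirs=4 --random-wait --wait 1 -e robots=off ftp://site.example/aaa/bbb/ccc/ddd/

(-e robots=off only matters for HTTP, but leaving it in does no harm over FTP.)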

Wget recursive download options:

--recursive
download recursively, recreating the site’s directory structure on your PC
--recursive --level=1
recurse, but --level=1 limits the depth so Wget doesn’t descend below the specified directory
-Q 1g
sets a total download quota (--quota), for example -Q 1g stops retrieving new files once 1 GB has been downloaded altogether
-np
never get parent directories (sometimes a site links upwards, and without this option Wget would follow those links)
-nc
no clobber: don’t re-download files you already have, so an interrupted download can be restarted cheaply
-nd
no directories: put all files in one directory, the one specified with -P (default: the current directory)
-nH
no host directories: don’t create a vestigial top-level directory named after the site on your PC
-A
only accept files matching the globbed pattern, e.g. -A '*.csv' (see the combined example after this list)
--cut-dirs=4
don’t recreate the vestigial hierarchy of server directories above the directory you want on your PC. Set the number equal to the number of directory levels on the server (here aaa/bbb/ccc/ddd is four)
-e robots=off
Many sites use robots.txt to ask crawlers not to fetch their data, and Wget obeys it by default, which can block the whole download. This setting tells Wget to ignore robots.txt, so use it considerately.
--random-wait
to avoid rapid-fire download requests (which can get you auto-banned), politely wait a randomized interval between file downloads
--wait 1
sets the base wait time; with --random-wait, the pause between files varies from 0.5 to 1.5 seconds, averaging about 1 second. This helps avoid anti-leeching measures.
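
As a combined example (a sketch: the host, path, and *.csv pattern are placeholders), the following grabs only CSV files, descends at most one level, puts everything into a local data/ directory, and stops after about 1 GB:

wget --recursive --level=1 -np -nc -nd -P data -A '*.csv' -Q 1g --random-wait --wait 1 -e robots=off https://site.example/aaa/bbb/ccc/ddd/

Note that with -A, Wget may still download HTML pages it needs to discover links, then deletes the ones that don’t match the pattern.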