Description
Copying an entire website isn’t straightforward. Tools exist for viewing individual web pages offline, but saving a whole site is another matter: you can’t realistically use Save As on every page and follow every link by hand.
Solution
To attempt to copy an entire website, use the httrack utility. We say “attempt” because results vary: websites can be complex behind the scenes, and httrack can only copy (mirror) what it can “see”. For simple sites this may be enough; for others, the mirror may not work properly.
Here is an example of the httrack command:
httrack "https://website.com/" -O "/tmp/website.com" -v
-O specifies the path for the mirror, log files, and cache.
-v enables verbose logging.
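The command above can be wrapped in a small shell function so that a missing httrack install is reported clearly instead of failing with a generic error. This is only a sketch: the function name mirror_site, the default output directory /tmp/mirror, and the exit code 127 are assumptions made for illustration, not part of httrack itself. Only the -O and -v options described above are used.

```shell
#!/bin/sh
# mirror_site URL [OUTPUT_DIR]
# Hypothetical helper (the name and defaults are illustrative assumptions).
mirror_site() {
    mirror_url="$1"
    mirror_out="${2:-/tmp/mirror}"   # assumed default output directory

    # Refuse to run without a URL.
    if [ -z "$mirror_url" ]; then
        echo "usage: mirror_site URL [OUTPUT_DIR]" >&2
        return 2
    fi

    # Check that httrack is on the PATH before calling it.
    if ! command -v httrack >/dev/null 2>&1; then
        echo "httrack is not installed" >&2
        return 127
    fi

    # Same invocation as the example above: -O sets the path for the
    # mirror, log files, and cache; -v turns on verbose logging.
    httrack "$mirror_url" -O "$mirror_out" -v
}
```

For example, mirror_site "https://website.com/" "/tmp/website.com" runs the same command shown earlier. As noted above, a successful run does not guarantee a complete mirror, so it is worth opening the result and checking it.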
References
- man httrack
- https://www.httrack.com/