Multi-Threaded Downloading with Wget
Jan 24th, 2011 | By admin | Category: Random Musings
While copying a few thousand log files from one server to another, I suddenly needed to do some serious multithreaded downloading on BSD, preferably with Wget, since that was the simplest tool I could think of for the job. A little looking around led me to this nugget:
wget -r -np -N [url] & wget -r -np -N [url] & wget -r -np -N [url] & wget -r -np -N [url]
Just repeat wget -r -np -N [url] for as many threads as you need (each one is really a separate background process, started by the &). Granted, this isn’t pretty, and there are surely better ways to do it, but if you want something quick and dirty it should do the trick. Enjoy!
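If typing the command out N times gets tedious, the same idea can be wrapped in a small loop. This is only a sketch: run_parallel is a made-up helper name, and the URL in the usage comment is a placeholder. It does nothing smarter than the one-liner above; it just starts the copies and waits for them.

```shell
#!/bin/sh
# run_parallel N CMD [ARGS...]
# Hypothetical helper: start N background copies of CMD, then wait for all of them.
run_parallel() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" &            # same effect as chaining the command with "&" by hand
        i=$((i + 1))
    done
    wait                  # block until every background copy has exited
}

# Usage (placeholder URL, left commented out on purpose):
# run_parallel 4 wget -r -np -N "http://example.com/logs/"
```

Every copy still gets the same arguments, so nothing here stops two of them from working on the same file.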



I am downloading some large files, and from watching the logs it seems like multiple threads attempt the same file. Is that expected here? How do you get some sort of locking?
You’re going to need some more sophisticated scripting to do that. If you have a .txt file containing the URLs you are going to be pulling, you could have wget build you a list of all of the recursive files and add it to that .txt file. Then you have multiple wget instances go through the file and download it one line at a time, with each instance deleting a line immediately after taking it, so that no other instance grabs the same URL. That’s what I can think of off the top of my head, but I’m sure there’s a better way.
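One way to get the "each line goes to exactly one instance" behaviour without deleting lines by hand is to let xargs hand out the list. This swaps in a different technique from the one described above, and it is a rough sketch under a few assumptions: the URL list already exists with one URL per line (urls.txt is a placeholder name), fetch_list is a made-up helper, and your xargs supports -P, which GNU and BSD versions do although it is not part of POSIX.

```shell
#!/bin/sh
# fetch_list FILE JOBS [CMD...]
# Hypothetical helper: pass each line of FILE to exactly one worker process,
# running at most JOBS workers at a time. CMD defaults to "wget -N -q".
fetch_list() {
    list=$1
    jobs=$2
    shift 2
    if [ "$#" -eq 0 ]; then
        set -- wget -N -q     # -N: timestamping, -q: quiet
    fi
    # -n 1: one URL per invocation; -P: number of parallel invocations.
    # Note: xargs treats quotes and whitespace inside a line specially.
    xargs -n 1 -P "$jobs" "$@" < "$list"
}

# Usage (placeholder file name, left commented out on purpose):
# fetch_list urls.txt 4
```

Because xargs reads the list once and gives each line to a single wget, no locking is needed between the workers.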
You saved me time, thank you.
Hi,
To Marc: maybe “-m” can help you if you are downloading over FTP. It creates .listing files for better downloading.
Pavel