Download files with leading zero in name using wget
In my previous blog post I showed how wget can be used to download a file from a server using HTTP headers for authentication, and how to use the Content-Disposition header sent by the server to determine the correct file name. With the information from that post it's possible to download a single file from a server. But what if you must download several files? Maybe hundreds or thousands of files? Files whose names are created using a mask that adds leading zeros?
Add leading zeros
What you need is a list of files to download. I'll follow my example from the previous post, where the files follow a specific pattern: a number. All files are numbered from 1 to n. To make it more special / complicated, it's not just 1 to n. A mask is applied: 7 digits in total, padded with leading zeros. 123 becomes 0000123, and 5301 becomes 0005301. In recent versions of Bash, you can use a FOR loop to iterate over the numbers and printf to format the output and add the leading zeros. To get the numbers correctly formatted, the command is:
for i in {140000..140005}; do printf "%0*d\n" 7 "$i"; done
This prints the numbers 140000 to 140005, one per line, each padded to 7 digits with a leading zero (0140000 to 0140005).
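To try the mask on its own, the printf call can be wrapped in a small function. This is only a sketch: the name pad and the default width of 7 are my assumptions for this example and are not part of wget or Bash.

```shell
# pad NUMBER [WIDTH]: print NUMBER left-padded with zeros.
# "pad" is a hypothetical helper name; WIDTH defaults to 7, the mask used in this post.
pad() {
  # %0*d takes the field width from the argument list (here: WIDTH).
  printf '%0*d\n' "${2:-7}" "$1"
}

pad 123      # prints 0000123
pad 5301     # prints 0005301
pad 140000   # prints 0140000
```

Pass the number without leading zeros: printf reads an argument such as 010 as an octal value.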
Start download
Putting the wget command into the printf format string lets you download the files. The execution flow is: the FOR loop together with printf creates the number with the mask applied, and wget downloads the file. After the file is downloaded, the next iteration of the FOR loop starts and the next file is downloaded. Assuming that I have PDF documents named 0140000.pdf to 0140005.pdf on the server http://localhost:9080, the FOR loop with wget is:
for i in 140000 {140001..140005}; do `printf "wget -nc --content-disposition http://localhost:9080/%0*d.pdf\n" 7 $i`; done
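The one-liner above runs whatever printf prints as a command. A variant that only lets printf build the URL and calls wget directly may be easier to read and to extend. The following is a sketch under the same assumptions as the example (server http://localhost:9080, a 7-digit mask, .pdf files); the function names doc_url and download_range are hypothetical.

```shell
# doc_url NUMBER: print the URL of one document, with the 7-digit mask applied.
# Server, mask width and extension are the values assumed in this post.
doc_url() {
  printf 'http://localhost:9080/%0*d.pdf' 7 "$1"
}

# download_range FIRST LAST: download every document from FIRST to LAST.
# -nc skips files that already exist locally,
# --content-disposition uses the file name sent by the server.
download_range() {
  i=$1
  while [ "$i" -le "$2" ]; do
    wget -nc --content-disposition "$(doc_url "$i")"
    i=$((i + 1))
  done
}

# Example call (needs the server to be running):
# download_range 140000 140005
```

Because the URL is built in its own function, you can print it first with doc_url 140000 to check the mask before starting a long download.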
Result
Alternative
The example above uses wget. Of course, you can do the same using curl.
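A minimal sketch of how this could look with curl, assuming the same server and file names as above. curl expands numeric ranges in square brackets itself and keeps the leading zeros of the range bounds, so no loop is needed. The variable name url_pattern is only chosen for this example, and the echo makes it a dry run that prints the command instead of downloading anything.

```shell
# URL pattern with curl's own range syntax: [start-end], leading zeros are kept.
# Quoted so that the shell does not touch the brackets.
url_pattern='http://localhost:9080/[0140000-0140005].pdf'

# -O saves each file under its remote name,
# -J prefers the file name from the Content-Disposition header, if the server sends one.
# Remove the echo to actually run the download.
echo curl -O -J "$url_pattern"
```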
2 Comments
Mexsalem · September 10, 2019 at 17:04
Why such a complicated loop construction?
Use simply:
wget -nc --content-disposition http://localhost:9080/{0140000..0140005}.pdf
Tobias Hofmann · September 12, 2019 at 09:26
Hi,
The loop is there to help people (better) understand what happens. Also, printf is used to add the leading 0 to the number.
For instance, using your example, I get:
http://localhost:9080/140004.pdf
The leading zeros that I can add with %0*d 7 $i are missing.