Download files with leading zero in name using wget

Published by Tobias Hofmann on


In my previous blog post I showed how wget can download a file from a server using HTTP headers for authentication, and how to use the Content-Disposition header sent by the server to determine the correct file name. With that information it's possible to download a single file from a server. But what if you must download several files? Maybe hundreds or thousands of files? Files whose names are created using a mask that adds leading zeros?

Add leading zeros

What you need is a list of files to download. I'll follow my example from the previous post, where the file names follow a specific pattern: a number. All files are numbered from 1 to n. To make it more special / complicated, it's not only 1 to n. A mask is applied: 7 digits in total, padded with leading zeros. 123 becomes 0000123, and 5301 becomes 0005301. In recent versions of Bash, you can use a for loop to run through the numbers and printf to format the output and add the leading zeros. To get the numbers correctly formatted, the command is:

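for i in {140000..140005}; do
  printf "%0*d\n" 7 "$i"
done

This prints the numbers 140000 to 140005, each padded with a leading zero to 7 digits (0140000 to 0140005). The * in the format takes the field width from the first argument (7), and the 0 flag fills the unused positions with zeros.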
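
The padding can also be wrapped in a small helper so that the width becomes a parameter. This is only a sketch: the function name pad_number and the optional width argument are assumptions for illustration and not part of the original setup.

```shell
# pad_number (hypothetical helper): print NUMBER padded with leading zeros
# to WIDTH digits; WIDTH defaults to 7, the mask used in this post.
# Pass the number without leading zeros: printf reads a leading 0 as octal.
pad_number() {
  printf '%0*d\n' "${2:-7}" "$1"
}

pad_number 123    # prints 0000123
pad_number 5301   # prints 0005301
```

The two calls reproduce the mask examples given earlier in this section.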

Start download

Using the printf output inside the wget URL makes it possible to download the files. The for loop and printf together create the right number with the mask, and wget downloads the file. After the file is downloaded, the next iteration of the loop starts and the next file is downloaded. Assuming that I have PDF documents named 0140000.pdf to 0140005.pdf on the server http://localhost:9080, the for loop with wget is:

for i in {140000..140005}; do
  wget -nc --content-disposition "http://localhost:9080/$(printf "%0*d" 7 "$i").pdf"
done
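
An alternative to calling wget once per file is to write all URLs to a list first and hand that list to wget in one go. The sketch below assumes the same server and file names as above; the function name make_url_list and the file name urls.txt are made up for illustration.

```shell
# make_url_list (hypothetical helper): print one URL per number from FIRST
# to LAST, with the number zero-padded to 7 digits.
# The body runs in a subshell ( ... ) so its variables do not leak out.
make_url_list() (
  base=$1
  i=$2
  last=$3
  while [ "$i" -le "$last" ]; do
    printf '%s/%0*d.pdf\n' "$base" 7 "$i"
    i=$((i + 1))
  done
)

# Write the list to a file; urls.txt is an assumed name.
make_url_list "http://localhost:9080" 140000 140005 > urls.txt
```

The list can then be passed to wget with its -i (--input-file) option: wget -nc --content-disposition -i urls.txt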

Alternative

The example above uses wget. Of course, you can do the same with curl.
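
A minimal sketch of the curl variant, assuming the same server and file names as in the wget example. The function name download_with_curl is made up for illustration. -O saves the file under its remote name, -J asks curl to use the name from the Content-Disposition header instead when the server sends one, and -f makes curl report HTTP errors as failures instead of saving the error page. The sketch does not try to replicate wget's -nc behavior.

```shell
# download_with_curl (hypothetical helper): fetch BASE/<7-digit number>.pdf
# for every number from FIRST to LAST, one curl call per file.
# The body runs in a subshell ( ... ) so its variables do not leak out.
download_with_curl() (
  base=$1
  i=$2
  last=$3
  while [ "$i" -le "$last" ]; do
    # Build the URL with the zero-padded number, then download it.
    url=$(printf '%s/%0*d.pdf' "$base" 7 "$i")
    curl -f -O -J "$url"
    i=$((i + 1))
  done
)
```

Called as download_with_curl "http://localhost:9080" 140000 140005, this requests 0140000.pdf to 0140005.pdf in order.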

Categories: Technology

Tobias Hofmann

Doing stuff with SAP since 1998. Open, web, UX, cloud. I am not a Basis guy, but very knowledgeable about Basis stuff, as it's the foundation of everything I do (DevOps). Performance is king, and unit tests is something I actually do. Developing HTML5 apps when HTML5 wasn't around. HCP/SCP user since 2012, NetWeaver since 2002, ABAP since 1998.

2 Comments

Mexsalem · September 10, 2019 at 17:04

Why such a complicated loop construction ?
Use simply :
wget -nc --content-disposition http://localhost:9080/{0140000..0140005}.pdf

    Tobias Hofmann · September 12, 2019 at 09:26

    Hi,

    the loop is there for helping people to (better) understand. Also, the printf is used to add the leading 0 to the number.

    For instance, using your example, I get:
    http://localhost:9080/140004.pdf

    The leading 0s I can add with %0*d 7 $i are not added.
