Python Script for Automated Download!

Let's say you have a text file filled with links to PDFs you want to download. The non-geeky way would be to copy each link into your browser window and download the file, but what if the text file has 100000+ links!! Sounds time consuming. Even if we settle for a modest number, say 50 links, doing it by hand is still tedious. A mid-level Linux user will tell you to fetch each PDF with the following command

     curl --remote-name URL

But typing that for every single link is also a very LAME way to solve things, so here's a Python script that downloads the PDFs using curl only! You may be confused to see Python and curl in the same place, but yes, it is possible, thanks to Python's 'os' library. Recently Satyakaam Goswami posted on IRC a link to a Google Docs document containing the URLs of the PDFs to be downloaded, and gave us strict instructions to write a script for the download. So I chose Python and wrote the following program to fetch the PDFs from the given set of links. I first copied the links into a file (data.txt in the code) and saved it in the directory I was working in. Now all I need to do is execute the following code

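import os

# data.txt holds one URL per line and sits in the current directory
fo = open("data.txt", "r")
print("Name of the file:", fo.name)
# the first line is read here and not downloaded (for example, a header row)
arp = fo.readline()
for x in range(1, 67):
    # strip the trailing newline so curl gets a clean URL
    line = fo.readline().strip()
    if line:
        # note the space after --remote-name and the quotes around the URL
        os.system('curl --remote-name "' + line + '"')
fo.close()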
   

Instead of hard-coding 67, you can also read until the end of the file (EOF). I knew the number only because I copied the links from an Excel file, where the row numbers showed that there were 67 links.
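Here is a minimal sketch of the "read until EOF" idea, so the number of links never has to be counted. The function names read_links and download_all are my own, made up for illustration, and the runner parameter exists only so the loop can be tried without actually calling curl. Unlike the script above, this version does not skip the first line, and it uses subprocess instead of os.system so the URL is passed to curl without going through the shell.

```python
import subprocess


def read_links(path):
    """Yield every non-empty line of the file at `path`, stripped of whitespace."""
    with open(path, "r") as fo:
        # Iterating over a file object stops at end of file on its own.
        for line in fo:
            url = line.strip()
            if url:
                yield url


def download_all(path, runner=subprocess.call):
    """Run `curl --remote-name URL` for each link; return how many commands ran."""
    count = 0
    for url in read_links(path):
        # Passing a list avoids shell quoting problems with characters such as & or ?.
        runner(["curl", "--remote-name", url])
        count += 1
    return count


# Example call (data.txt is the same file name used in the script above):
# download_all("data.txt")
```

Blank lines in the file are ignored, so a stray empty row copied over from the spreadsheet will not turn into a broken curl call.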

And when I typed those magic words
     python FILENAME.py

VOILA!!!
All files downloaded! BINGO!!

Feel free to comment, criticize, or suggest improvements to this solution, which took only a few lines of Python code!
                                                    

 
