Simple Script to Download epaper from The Hindu- Not working.


Over the past year and a half I have received numerous emails and comments here saying that this script is not working. It's true :-(. This is because of changes to the content delivery mechanism at the publisher's end. I have also not given my time and resources towards fixing the problem. Sorry.

I am no longer posting this script on this page due to errors induced by WordPress code formatting. The script can be downloaded here

How to get it running

Download the script and copy it to your Linux desktop.
Open a command prompt in a terminal.
Type the following commands:

cd ~/Desktop
chmod +x thehindu.sh
./thehindu.sh

Script updated: it now has options for the Bangalore and Kochi editions.

Windows Users :  Follow these instructions

Rocky Singh submitted another modification to the script: you can now download the newspaper published on a specific date. Link

Ishan Karve

About Ishan Karve

Ishan Karve is just an everyday, normal guy next door who happens to be an Electronics Engineer by profession and dabbles with PHP, JavaScript, C++ and Python. His interests vary as the seasons change: they range from astronomy to soul searching. This site is just a reflection of what he does to keep his mind engaged when he is not occupied by work and family. He is an extremely objective guy and is always ready for some good arguments, of course over a glass of 40% proof alcohol.
This entry was posted in BASH, Programming.

69 Responses to Simple Script to Download epaper from The Hindu- Not working.

  1. jitendra says:

    hello ishan

    you really did a great job by putting script to download epaper of various newspaper.
    i was in search of epapers as i am preparing for competitive exams.though i am not a computer expert but still managed to run these scripts from my windows xp by spending whole night searching methods on google.thank you for writing these scripts
    i would really appreciate if you can put script for http://epaper.livehindustan.com

    thank you once again for this wonderful work

  2. A Mad says:

    Ishan
    Nice website, i am getting error while running the script, see below, can you help? Or is it possible to add different editions of the Hindu to your page?

    Enter edition you wish to selec[0-2]: 0
    Thanks.
    Please enter the starting page you wish to download from?1
    Please enter the ending page you wish to download?18
    ./thehindu.sh: line 45: syntax error near `e’
    ./thehindu.sh: line 45: `[Nn]* ) exit;;’

    • Ishan Karve Ishan Karve says:

      Sorry, there was indeed an error in the script. I guess it was a poor cut-and-paste job on my side. I have updated the script and am also mailing it to you.
      Thanks for visiting

  3. At Mad says:

    Thanks a lot Ishan, it works!!

  4. arch says:

    Hi Ishan,

    The script works really great, I am using ubuntu so didn’t have any issues running the script 🙂 Thanks!
    Do you have any script for times of india as well as I want to download different city edition.

  5. Abhishek Sharma says:

    Any script for windows ? i dont have linux,,,,,:(

  6. arch says:

    Hi Ishan,

    I was checking yesterday and the script seems to miss the supplement pages these days. any idea if the file naming has been changed?

  7. kaushik says:

    hi ishan,,

    need ur help..
    i’m a windows user.. i installed cygwin.
    and i pasted ur script(in cygwin)..
    i got the following message

    “Enter edition you wish to selec[0-2]: 0
    Please select the correct numeric serial.
    Enter edition you wish to selec[0-2]: 0
    Thanks.”

    but nothing happened then…..

    awaitin ur reply:-)

  8. Vardhan says:

    Hi Ishan,

    I wanted to download the Hindu Paper of Visakhapatnam & Vijayawada Edition for the date 01/02/2012 i.e. February 1st 2012. Will the above script help me in this or is there any change needed in the script to be done to download E-paper for a specific date and region like Visakhapatnam/Vijayawada.

    It would be great if you could reply here or send me a mail regarding this process.

    Note: I am using Cygwin for this process

    Thanks,
    Vardhan

    • Ishan Karve Ishan Karve says:

      Hi Vardhan.

      As of now (as displayed on hindu newspaper website) only five editions of the newspaper – Bangalore, Chennai, Hyderabad, Kochi and Delhi – are available in digital form.

      So you can tell me which one of the above is useful to you.. I will send you further instructions then..
      Thanks

      • Vardhan says:

        Hi Ishan,

        I think Hyderabad & Delhi versions will be helpful to me…If you are thinking of choosing any one in these…then I think probably Hyderabad version of 01/02/2012 i.e. February 1st 2012 will be useful to me.

        You can reply me the further instructions whenever possible here or in mail.

        Thanks,
        Vardhan

        • Ishan Karve Ishan Karve says:

          Delhi edition is already available at here

          For Downloading the Hyderabad edition.. the script already caters for the same.

          • Vardhan says:

            Hi Ishan,

            Actually I meant for the E-Paper of a past date which in this case is 01/02/2012 i.e. February 1st 2012. The script above gives the e-paper of today’s date.

            Can you please send me the procedure to download the 01/02/2012 e-paper of Hindu?

            Thanks,
            Vardhan

          • Ishan Karve Ishan Karve says:

            That’s pretty simple Vardhan

            Either change your system date (the easy part) or change the date parameter in the script.

            Search for `date +%d` (including the backquotes) and replace it with the day of the month you want (01) in the script.
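The search-and-replace described above can also be done from the command line instead of a text editor. The following is a minimal sketch, assuming the script contains the literal text `date +%d` inside backquotes exactly as quoted in the reply; the function name `pin_day` and the output file name are hypothetical. A later reply in this thread notes that the URL differs for a previous month, so pinning only the day presumably helps only for dates in the current month.

```shell
# Write a copy of a script with every literal `date +%d` replaced by a fixed day.
# Usage: pin_day <script> <two-digit day> <output file>
# (pin_day is a hypothetical helper name, not part of the original script.)
pin_day() {
    # Backquotes are literal inside single quotes, so sed sees them unchanged.
    sed 's/`date +%d`/'"$2"'/g' "$1" > "$3"
}

# Example: create hindu1.sh, pinned to the 1st of the month, from thehindu.sh.
if [ -f thehindu.sh ]; then
    pin_day thehindu.sh 01 hindu1.sh
    chmod +x hindu1.sh
fi
```

Writing to a new file keeps the original script untouched, so it can still be used for today's edition.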

  9. Vardhan says:

    Hi Ishan,

    I tried replacing `date +%d` with 01 in the script. I am running the script in Windows XP using Cygwin application. I used the following commands. (Note: I changed the script name to hindu1.sh)

    cd /home/vardhan/desktop
    chmod +x hindu1.sh
    ./hindu1.sh

    Then I got the following menu:

    Hindu epaper editions are
    ————————————————-
    0. Chennai
    1. Hyderabad
    2. Delhi
    3. Bangalore
    4. Kochi
    ————————————————-
    Enter edition you wish to selec[0-4]: 0
    Thanks.
    Estimating number of pages in Chennai edition
    Searching for Page 001
    ./hindu1.sh: line 38: wget: command not found
    Searching for Page 002
    ./hindu1.sh: line 38: wget: command not found

    As you can see above after selecting the option 0, I was getting the error as
    line 38: wget: command not found

    This error continued until page 100, then afterwards I got the following error:

    Searching for Page 100
    ./hindu1.sh: line 38: wget: command not found
    ./hindu1.sh: line 46: clear: command not found
    Estimating number of pages in Chennai edition supplement
    Searching for Page 001
    ./hindu1.sh: line 61: wget: command not found
    Searching for Page 002

    Similarly this error too continued until page 100. Then I got the following errors:

    Searching for Page 100
    ./hindu1.sh: line 61: wget: command not found
    ./hindu1.sh: line 69: clear: command not found
    Please be patient..Bandwidth intensive operation starts..;-)
    Downloading Main Paper .. total 100 pages
    Downloading Page 001
    ./hindu1.sh: line 86: wget: command not found
    Downloading Page 002
    ./hindu1.sh: line 86: wget: command not found

    These errors too continued until the end, and then I got the final error as:

    Downloading Page 100
    ./hindu1.sh: line 99: wget: command not found
    Combining all pages into a single pdf document
    ./hindu1.sh: line 106: gs: command not found
    rm: cannot remove `/home/sunil.telidevulapall/Desktop/hindu_Chennai_01022012/*.*’: No such file or directory

    Is there any problem with the script that you posted here – http://karve.in/?attachment_id=590

    Can you please send me the rectifications in the script if there are any as mentioned above?

    Thank you,
    Vardhan

  10. Dilip says:

    Hi I am a windows user. i clearly dont understand any of the steps u have mentioned above..i need chennai edition thehindu epaper..can u please elaborate the steps for a windows xp user?

  11. Dilip says:

    Ok..now i downloaded thehindu..thanks mate..grt wrk..wat r the other uses of this crygwin…can u make a similar script for http://www.vikatan.com/article.php?mid=1&sid=445&aid=16232. actually this is a tamil weekly magazine..only for paid users online..also do mention of the other uses of crygwin..

    • Ishan Karve Ishan Karve says:

      Dilip .. I am not familiar with Tamil. I visited the site which u had requested. But it seems to be an uphill task… Actually all these scripts were just for fun learning..Any ways i will certainly have a look once I have a bit more spare time to tide over linguistic hurdles.

      Regarding Cygwin, I cannot say much authoritatively since I am not a regular user of it, as I rarely use my Windoze box. However, it appears to have a lot of command line goodies and so should be helpful for porting your command line programs and scripts to Windows. Programs would of course need to be cross-compiled for Windows from source.

  12. Dilip says:

    so much thnks for your immediate reply……..k can u send me some links to learn these scripts such as how do u write those thehindu scripts……..thnks..bye

  13. Suchendra says:

    Hello, this is a great script though I’m downloading pdf from http://www.karve.in/news_archive/pdf/

    Regards,
    Suchendra

  14. Harikrishna says:

    Hey Ishan….

    Wat a wonderful piece of work. Hats-off man…
    Hope u have a gr8 future ahead.

    Regards,
    Hari

  15. Harikrishna says:

    Hey Ishan,

    I tried and looked into ur website… its too good. but then i would love to view Hindu Chennai Edition. But I have a Mac System. And i would view the pdfs in a Tablet. I would download the PDF to the Tablet. Is it possible for u to give a link in your website from where i would be able to download the entire pdf.

    Thanks in advance for your kindest help.

    Regards,
    Hari

    • Ishan Karve Ishan Karve says:

      I am constrained by the bandwidth and server time allotted to me by my hosting provider. That's precisely the reason I have published the scripts, so that others can download editions of their choice.
      However, you can download the PDFs from here.

      • Harikrishna says:

        Thats Fine..

        Anyways lemme figure out the way.
        Thanks for the wonderful script…

        Regards,
        Hari

  16. Suchendra says:

    Hi, today there are lot of papers saying 1 day old in the pdf’s. Can you please fix that. Will i be able to download with the script? Thanks. keep up the nice work!!

    Regards,
    Suchendra

  17. Bashir says:

    Hai Ishan…
    I can’t believe this kind of work…marvellous,
    i am a simple windows xp user not familiar with cygwin, still tried to get Kochi edition of The Hindu.
    Same error received as ‘Vardhan’ mentioned. And your reply was cygwin does not have necessary programs installed, could you elaborate the same? Any special attention needed while installing cygwin?
    after installaton of cygwin I got setup as ‘io_stream_cygfile: Fopen(/etc/setup/net-proxy-host) failed 2 No such file or directory’ and similarly ‘ proxy-port’, ‘extrakeys’ and ‘chooser_window_settings’ lines also got sames messages.
    Is my error for downloding was due to this missing?
    waiting for your earliest reply
    thanks
    bashir

    • Ishan Karve Ishan Karve says:

      Hi bashir, Thanks for ur comments. you need to install following programs.. ghostscript(gs), pdftk, wget,curl etc.. i guess i have illustrated the steps quite clearly here.. in case you are still facing problem .. please let me know which OS u are using …
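The `wget: command not found` and `gs: command not found` errors reported in this thread are what the shell prints when those programs are not installed or not on the PATH. A minimal sketch of a pre-flight check is shown below; the tool list (`gs`, `pdftk`, `wget`, `curl`) comes from the reply above, while the function name `missing_tools` is hypothetical and not part of the original script.

```shell
# Print the names of any given programs that cannot be found on the PATH.
# (missing_tools is a hypothetical helper name.)
missing_tools() {
    local tool missing=""
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Drop the leading space; the result is empty when nothing is missing.
    echo "${missing# }"
}

missing=$(missing_tools gs pdftk wget curl)
if [ -n "$missing" ]; then
    echo "Missing programs: $missing"
else
    echo "All required programs found."
fi
```

Running this in the Cygwin or Linux terminal before the download script shows which packages still need to be installed.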

      Thanks for passing by.

      • Bashir says:

        Hai Ishan…..
        Finally I got The Hindu…Thank you for your kind support, can u please elaborate the scripts for Business Line & Economic Times?
        waiting for your reply…bashir

  18. Ram says:

    Hi Ishan, Thank you for the cool script, it works great I first tried today (Saturday) the Chennai Edition of “The Hindu”, but downloads Main paper + just one Supplement (MetroPlus) whereas there are more supplements i.e “Properties” etc., I wonder if you can modify the script so it can download all supplements please.
    Ram

  19. Ram says:

    Hello Ishan, To get all Supplements of “The Hindu” though didn’t know anything about Bash Script I tried to modify the script (using Note++ on Windows 7 PC) i.e basically I copied the “Supplement” part of the steps (Line 54 t0 72 and 90 t0 100) in your Script and modified occurrence of “B” to “C” and pasted into your original Script, but it failed fetch more than one supplement 🙁 – wonder why ?? so here again looking for your help!!!

  20. rocky singh says:

    modified the script a little bit to make it easy to download newspaper for previous dates as well..
    just run the script normally and it will ask for year, month, date.

    link — http://pastebin.com/R3dDuhSC

  21. rocky singh says:

    I have modified the script just a little bit to make downloading newspaper of previous date possible.
    just use the script as normal and it will ask you for year,month, date.

    link — http://pastebin.com/R3dDuhSC

  22. Sachin says:

    hi … thanx for this gr8 work… but ur pdf archive site isnt working anymore… plz rectify this problem… thanx in advance

  23. virender says:

    Ishan .. you are great man!! Even when I am paying them there view was a pain.. The compilation in one pdf is just awesome.. Thanks a lot! I am facing problem in script which someone here has posted to download previous edition. It throws a syntax error :

    $ ./the_hindu_newspaper_download_script.txt
    Hindu epaper editions are
    ————————————————-
    0. Chennai
    1. Hyderabad
    2. Delhi
    3. Bangalore
    4. Kochi
    ————————————————-
    ./the_hindu_newspaper_download_script.txt: line 20: syntax error near unexpected token `$’in\r”
    ‘/the_hindu_newspaper_download_script.txt: line 20: ` case $ed in

    —————————————————
    Please help in rectifying this problem… Thanks in advance..
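The `\r` in the token `$'in\r'` is a carriage return, which usually means the script file was saved with Windows (CRLF) line endings, for example after being edited or downloaded on Windows. The sketch below strips the carriage returns with `tr`; it assumes that line endings are the only problem, and the function name `strip_cr` and the output file name are hypothetical.

```shell
# Copy a script while deleting every carriage return (the "\r" of CRLF endings).
# Usage: strip_cr <input file> <output file>
# (strip_cr is a hypothetical helper name.)
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Example: convert the downloaded file and make the result executable.
if [ -f the_hindu_newspaper_download_script.txt ]; then
    strip_cr the_hindu_newspaper_download_script.txt thehindu_unix.sh
    chmod +x thehindu_unix.sh
fi
```

Where the `dos2unix` utility is installed, it performs the same conversion.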

  24. Anupam Yadav says:

    Hi,

    First of all , great script .. thanks a lot 🙂
    Second , after changing system date , I am able to download only current month;s paper .
    Is the URL of previous month is different ?

    Waiting for your reply 🙂

    • Ishan Karve Ishan Karve says:

      Thanks Anupam for your comments. The URL is different for a previous month; you can figure that out from the link generated by the script.
      Also, newspapers do not keep all copies of their previous editions online, generally three months at most.

      Thanks again for visiting.

  25. Dilip says:

    can u get a similar script for the website http://www.newyorker.com/ to dwnload the latest issues?

  26. Manish Kumar says:

    Hey Ishan. It seems that Hindu guys have woken up. The wget command is not working. Connection is refused for every page. Any workarounds?

    • Ishan Karve Ishan Karve says:

      Thanks Manish, however I tried the script for the Delhi Edition.. seems to be working fine for me.. Feel free to contact me just in case you are facing any issues.

  27. Danial Jose says:

    Awesome, You own the internet. Gnu/Linux ki Jay!
    Following you!
    I Using the Ubuntu OS!
    Kudos

  28. sandeep says:

    $ ls
    The_Hindu_Delhi.sh thehindu.sh

    sandeep@sandeep-PC ~
    $ chmod +x thehindu.sh

    sandeep@sandeep-PC ~
    $ ./thehindu.sh
    Hindu epaper editions are
    ————————————————-
    0. Chennai
    1. Hyderabad
    2. Delhi
    ————————————————-
    ./thehindu.sh: line 18: syntax error near unexpected token `$’in\r”
    ‘/thehindu.sh: line 18: ` case $ed in

    sandeep@sandeep-PC ~
    $ ls
    hindu.sh The_Hindu_Delhi.sh thehindu.sh

    sandeep@sandeep-PC ~
    $ chmod +x hindu.sh

    sandeep@sandeep-PC ~
    $ ./hindu.sh
    Hindu epaper editions are
    ————————————————-
    0. Chennai
    1. Hyderabad
    2. Delhi
    3. Bangalore
    4. Kochi
    ————————————————-
    ./hindu.sh: line 20: syntax error near unexpected token `$’in\r”
    ‘/hindu.sh: line 20: ` case $ed in

    sandeep@sandeep-PC ~
    $ chmod +x The_Hindu_Delhi.sh

    sandeep@sandeep-PC ~
    $ ./The_Hindu_Delhi.sh
    Hindu epaper editions are
    ————————————————-
    0. Chennai
    1. Hyderabad
    2. Delhi
    3. Bangalore
    4. Kochi
    ————————————————-
    Enter edition you wish to selec[0-4]: 2
    Thanks.
    Estimating number of pages in Delhi edition
    Searching for Page 001
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 002
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 003
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 004
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 005
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 006
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 007
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 008
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 009
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 010
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 011
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 012
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 013
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 014
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 015
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 016
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 017
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 018
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 019
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 020
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 021
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 022
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 023
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 024
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 025
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 026
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 027
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 028
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 029
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 030
    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    Searching for Page 031

    it goes on till 100 and again

    ./The_Hindu_Delhi.sh: line 38: wget: command not found
    ./The_Hindu_Delhi.sh: line 46: clear: command not found
    Estimating number of pages in Delhi edition supplement
    Searching for Page 001
    ./The_Hindu_Delhi.sh: line 61: wget: command not found
    Searching for Page 002
    ./The_Hindu_Delhi.sh: line 61: wget: command not found
    Searching for Page 003
    ./The_Hindu_Delhi.sh: line 61: wget: command not found
    Searching for Page 004
    ./The_Hindu_Delhi.sh: line 61: wget: command not found
    Searching for Page 005
    ./The_Hindu_Delhi.sh: line 61: wget: command not found
    Searching for Page 006
    ./The_Hindu_Delhi.sh: line 61: wget: command not found
    Searching for Page 007
    ./The_Hindu_Delhi.sh: line 61: wget: command not found
    Searching for Page 008
    ./The_Hindu_Delhi.sh: line 61: wget: command not found
    Searching for Page 009
    ./The_Hindu_Delhi.sh: line 61: wget: command not found
    Searching for Page 010
    ./The_Hindu_Delhi.sh: line 61: wget: command not found
    Searching for Page 011
    ./The_Hindu_Delhi.sh: line 61: wget: command not found

    please help me out
    thanks

  29. Bashir says:

    Hai Ishan…
    Hope you fine and well….
    No pdf archives are available nowadays.
    from The Hindu how can we get Business Line using your script. I think only single edition available for BL, can you help?…
    with thanks….bashir

  30. Mixe says:

    i tried running crygwin but the terminal window itself is not getting loaded. When i went through all programs the dialogue box says…… Windows is searching for mintty. To locate the file yourself, click Browse….. after this i am not able to proceed…Please help.

  31. Adi says:

    Hi..
    plz help me out.
    Prob: I couldn’t able to download Bangalore edition on Linux system.
    Odd number coded editions are not getting downloaded (Ex: 3-Bang, 1-Hyd)
    404 Not Found Remote file does not exist — broken link!!!
    ;
    ;
    ;
    some stuff
    ;
    ;
    Error: /undefinedfilename in (/home/Desktop/hindu_Bangalore_20092012/*.pdf)


    Thnx

  32. Aseem says:

    Hey, the script is downloading a blank pdf every time.

  33. Wonder says:

    Hi, The Hindu changed their epaper path. so, now this trick is not working.

  34. Ayan says:

    Wonderful site sir..but a blank pdf gets created..plz help
    Is it a problem of my proxy server..is so plz guide for appropriate code..
    I m getting the following message

    /home/Ayan: Is a directory
    Combining all pages into a single pdf document
    GPL Ghostscript 9.06 (2012-08-08)
    Copyright (C) 2012 Artifex Software, Inc. All rights reserved.
    This software comes with NO WARRANTY: see the file PUBLIC for details.
    Error: /ioerror in –run–
    Operand stack:
    –nostringval– –nostringval– (% Co)
    Execution stack:
    %interp_exit .runexec2 –nostringval– –nostringval– –nostringval– 2 %stopped_push –nostringval– –nostringval– –nostringval– false 1 %stopped_push 1910 1 3 %oparray_pop –nostringval–
    Dictionary stack:
    –dict:1169/1684(ro)(G)– –dict:0/20(G)– –dict:77/200(L)–
    Current allocation mode is local
    Last OS error: Is a directory
    GPL Ghostscript 9.06: Unrecoverable error, exit code 1
    rm: cannot remove ‘/home/Ayan’: Is a directory
    rm: cannot remove ‘G/Desktop/hindu_Delhi_02102014/*.*’: No such file or directory
    rmdir: failed to remove ‘G/Desktop/hindu_Delhi_02102014’: No such file or directory
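The messages `/home/Ayan: Is a directory` and `G/Desktop/hindu_Delhi_02102014/*.*: No such file or directory` suggest that the home directory path contains a space and that the script uses that path unquoted, so the shell splits it into two words. This is an inference from the output above, not something checked against the script itself. The sketch below shows the quoting pattern that avoids the split; the directory and file names are hypothetical.

```shell
# Create and clean up a working directory whose path contains a space.
# (The path below is a hypothetical stand-in built under a temporary directory.)
workdir="$(mktemp -d)/Ayan G/Desktop/hindu_Delhi"
mkdir -p "$workdir"
touch "$workdir/page001.pdf" "$workdir/page002.pdf"

# Quote the variable, but keep the glob outside the quotes so it still expands.
rm -f "$workdir"/*.pdf
rmdir "$workdir"
```

Without the double quotes, `rm` and `rmdir` would receive the two halves of the path as separate arguments, which matches the errors shown.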

  35. Naveen Kumar says:

    hi
    whenever i run the ls command it comes the .sh file
    but when i run chmod +x The_Hindu_Delhi.sh
    it comes blank
    what i did wrong to it
    plz suggest that i can also use the bash

  36. Pushpendra Rathi says:

    I want to download economic times epaper pdf version , how I can download that pls help me….
