Automatically downloading e-papers from Onetz

There are two kinds of people in our world: those who still read classic local newspapers and those who collect their news from the Internet, for example from Google News. I count myself among the latter, but my parents still prefer our local newspaper. They used to receive an old-school, physical newspaper, but some years ago they decided to switch to the digital version, as it is easier to handle and of course better for the environment.

Unfortunately, the user interface of our local German newspaper company Onetz was not very intuitive, sometimes quite slow, and of course not responsive. Nor was there a mobile app for my parents' mobile phones. This was back in the year 2016.

Thus, I wanted to help my parents by providing them with the current day's newspaper as a PDF file. The idea was to create a hyperlink that displays the most recent newspaper in their preferred browser.

My requirements for the tool were the following:

  • The script should be a cross-platform command line tool, as my servers run Linux and/or BSD but I sometimes use a Windows PC for development at home
  • The script should mimic "normal" user requests as closely as possible
  • The downloaded PDF files must not be available to the public (this is not a requirement of the tool itself, but for me it was a key requirement, also for legal reasons)
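One way to approach the second requirement is to send the same kind of headers a desktop browser would send. The following is only a minimal sketch using the standard library; the header values and the helper name build_request are assumptions for illustration and are not taken from the actual script.

```python
import urllib.request

# Hypothetical header values; the real script may send different ones.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
                  "Gecko/20100101 Firefox/115.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "de-DE,de;q=0.9,en;q=0.5",
}


def build_request(url: str) -> urllib.request.Request:
    """Return a GET request that carries browser-like headers."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)
```

The request object can then be passed to urllib.request.urlopen() in place of a bare URL.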

In the end, I decided to create a small Python script that does the magic for me. I have open-sourced it on GitHub as free software licensed under GPL 3.0. Feel free to clone or fork it from https://github.com/thomasreiser/onetz-epaper-downloader.

The script was in use on a Raspberry Pi running at my parents' place for many years with great success. The execution was controlled via crontab and worked perfectly fine:

*/5 6-12 * * MON-SAT python3 /opt/onetz-epaper-downloader/newspaper.py -c /opt/onetz-epaper-downloader/newspaper-custom.json 2>&1 >/dev/null

(As the newspaper is made available sometime in the morning, the script tries every 5 minutes, starting at 6 o'clock, to find and download the PDF. Once the PDF has been downloaded, every later run exits immediately after it is started.)
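The "exit immediately" behaviour can be sketched as a simple file check at the start of each run. This is an illustration only: the one-file-per-day naming scheme and the function names pdf_path_for and needs_download are assumptions, not the script's real layout.

```python
import datetime
from pathlib import Path


def pdf_path_for(download_dir: Path, day: datetime.date) -> Path:
    """Hypothetical naming scheme: one file per day, e.g. 2019-12-27.pdf."""
    return download_dir / f"{day.isoformat()}.pdf"


def needs_download(download_dir: Path, day: datetime.date) -> bool:
    """Return True if the issue for `day` has not been saved yet."""
    return not pdf_path_for(download_dir, day).is_file()
```

A run started by cron would call needs_download() first and return right away when it yields False, so the repeated five-minute invocations cost almost nothing once the day's PDF is on disk.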

The downloaded files were then served via an nginx server as static content (a symlink current.pdf is also created, pointing to the latest version of the newspaper).
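Updating such a symlink could look roughly like the sketch below. It assumes a POSIX system, where renaming over an existing link is atomic, so the web server never sees a missing current.pdf. The function name point_current_at is hypothetical, and the real script may handle this differently.

```python
import os
from pathlib import Path


def point_current_at(pdf: Path, link_name: str = "current.pdf") -> Path:
    """Make <dir>/current.pdf a symlink to `pdf`, replacing any old link."""
    link = pdf.parent / link_name
    tmp = pdf.parent / (link_name + ".tmp")
    # Remove a leftover temporary link from an interrupted earlier run.
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    # Relative target, so the link stays valid if the directory is moved.
    os.symlink(pdf.name, tmp)
    # Rename the temporary link over the old one in a single step.
    os.replace(tmp, link)
    return link
```

On Windows, creating symlinks usually needs extra privileges, so there a plain file copy may be the simpler choice.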

Well, it worked fine, until the Raspberry Pi died around one year ago, at the end of 2018. I did not have much spare time to reconfigure the system, my parents got used to the newspaper's web page again, and we all somehow forgot how convenient the automatically downloaded e-paper solution was.

Now I am on vacation, and so I wanted to resurrect my e-paper downloader. And I was shocked: it seems that Onetz has changed the whole e-paper backend, which made my script useless.

So I analyzed the new web frontend, and some hours later I was able to update my script so that it can download the newspaper PDFs again. Phew.

The Python script is still not perfect and of course there are bugs somewhere, but it works. And my parents can now again enjoy their daily newspaper with just one click.