Back to RSS

I've started using RSS again. For a long while there I was mostly remembering to check a handful of sites and relying on Twitter for everything else. Both of these are bad solutions: haphazard, unorganized, habit-forming. So, RSS: it's been around a long time, it still works, and almost every site I read uses it.

Some sites don't, though. I started writing a scraper that would generate an RSS feed, but... too much work, and surely a solved problem. I tried Apify, which crawls websites for you, and lets you scrape out the content you want into an RSS feed (and lots of other types of feeds, too). I set up a few scrapers, added the feeds to my RSS reader, and let them be.

When I revisited them a month or so later, they seemed not to have run at all. I might have configured something incorrectly, but I'm pretty sure they were all set up to run once a day. There was no record of any of them ever running, even though I'd tested each one before adding it to my reader.

So... I wrote the custom scraper. It ended up being pretty easy, mostly because almost everything I needed had already been written: axios for fetching the sites, cheerio for parsing what I wanted from the scraped data, jstoxml for converting data to an RSS feed, and express for serving the feed. scrape-to-feed is the end result. I deployed it with now; the example feed scrapes story headlines from the New York Times. The whole thing took maybe three hours, start to finish. And it's actually much more convenient than having to click through the UI of a web app.
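To make the "convert data to an RSS feed" step concrete, here is a minimal sketch of it in TypeScript. It is not the scrape-to-feed code: the `FeedItem` and `FeedInfo` shapes and the `buildRssFeed` and `escapeXml` names are hypothetical, and the XML is assembled by hand as a dependency-free stand-in for what jstoxml would do. It assumes the fetching and parsing have already produced a list of titles and links.

```typescript
// Hypothetical shape for one scraped entry (not taken from scrape-to-feed).
interface FeedItem {
  title: string;
  link: string;
  pubDate?: Date;
}

// Hypothetical channel-level metadata for the feed.
interface FeedInfo {
  title: string;
  link: string;
  description: string;
}

// Escape the five characters that are special in XML text and attributes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build an RSS 2.0 document from items that have already been scraped.
function buildRssFeed(info: FeedInfo, items: FeedItem[]): string {
  const itemXml = items
    .map((item) => {
      const parts = [
        `<title>${escapeXml(item.title)}</title>`,
        `<link>${escapeXml(item.link)}</link>`,
        // Assumption: the link doubles as the item's unique identifier.
        `<guid>${escapeXml(item.link)}</guid>`,
      ];
      if (item.pubDate) {
        // toUTCString() yields e.g. "Tue, 01 Jan 2019 00:00:00 GMT".
        parts.push(`<pubDate>${item.pubDate.toUTCString()}</pubDate>`);
      }
      return `<item>${parts.join("")}</item>`;
    })
    .join("");

  return (
    `<?xml version="1.0" encoding="UTF-8"?>` +
    `<rss version="2.0"><channel>` +
    `<title>${escapeXml(info.title)}</title>` +
    `<link>${escapeXml(info.link)}</link>` +
    `<description>${escapeXml(info.description)}</description>` +
    itemXml +
    `</channel></rss>`
  );
}
```

In a setup like the one described above, the string returned by `buildRssFeed` would be what the web server sends back for the feed URL, with the items coming from whatever the parsing step pulled out of the page.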

Give it a spin if you like.

Hi. My name is Zach Green; I'm a software developer and a writer. I currently work at GetYourGuide in Berlin. I used to work at the MIT Technology Review, and before that, at Alley Interactive. You can find me on GitHub and Twitter, or you can email me. If you feel like reading more, have a look around. Thanks for stopping by.

©2019 Zach Green