Reconnecting with an old flame: RSS
by 4eyes

So, if you haven't noticed yet, I am old, and naturally attracted to old things. RSS is one of them. I was using Google Reader in the 2000s, until it died from an incurable disease. Then I tried some alternatives, but I can't remember their names now. I kept the fire smoldering, but I never enjoyed RSS the way I used to. Then one day FreshRSS rekindled the old flame. It's a very nice program, although this post is not about it. You should definitely check FreshRSS out if you don't know it already. This post is about Miniflux, Gitea, a bit of Python and how I am using RSS these days. I will also mention some alternatives to my 'setup'.
What is RSS?
Let's get this out of the way first. What is RSS? I don't know... There is also something called Atom. From what I hear it's as nice as RSS (if not better), but I don't know what it is or what the difference is. I don't even know if RSS is the correct term when I mean web feed. Read about them here, here and here, I can't hold everyone's hand...
Anyway, you subscribe to some RSS feeds, like bloggers you like or newspapers you don't pay for (and even ad-block), and when they publish something new, you receive it in your RSS reader. It just gives you another excuse not to visit their website...
GitHub has a nice feature as well: you can subscribe to feeds of the programs you use and track new releases. Just add .atom to the end of the tags URL. For example https://github.com/Goaccess/Goaccess/tags.atom. Extremely useful for repos like GoAccess that update frequently. There is no way I am missing the next GoAccess release.
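To give a rough idea of what a reader does with such a feed, here is a minimal Python sketch that parses an Atom document with the standard library and lists the entry titles. The sample XML is made up for the example (it is not real GitHub output) and `list_entry_titles` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this XML namespace.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample in the general shape of a tags feed; not real GitHub output.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Tags from example</title>
  <entry>
    <title>v1.1</title>
    <updated>2023-02-01T00:00:00Z</updated>
  </entry>
  <entry>
    <title>v1.0</title>
    <updated>2023-01-01T00:00:00Z</updated>
  </entry>
</feed>"""


def list_entry_titles(atom_xml: str) -> list[str]:
    """Return the title of every <entry> in an Atom document, in feed order."""
    root = ET.fromstring(atom_xml)
    return [
        entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        for entry in root.findall("atom:entry", ATOM_NS)
    ]


if __name__ == "__main__":
    for title in list_entry_titles(SAMPLE_FEED):
        print(title)
```

A real reader like miniflux does the fetching, de-duplication and read/unread tracking on top of this, which is why I let it do the work.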
Oh and podcasts, apparently podcasts use them a lot.
What am I doing?
Today I am using Miniflux as my RSS reader. I have subscribed to some feeds like blogs, newspapers, release feeds of GitHub repos, Hacker News and r/selfhosted. Apart from that, I am running 2 Python scripts that create very personal RSS feeds.
1) The first one checks my Gmail account for 2 specific labels ('newsletter' and 'promo') every half hour, reads the emails, converts them to an RSS feed and pushes the XML file to a repository I host on my local Gitea instance. Gitea can serve static files, so I subscribe to that file with Miniflux.
2) The other one is a replacement for Google News alerts. Once a day it checks the news for the terms I defined, using the Microsoft Bing News API, and pushes the result to the remote repo, like the first one.
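Both scripts boil down to "turn a list of items into an RSS file". The real scripts use feedgen for this; the sketch below swaps in Python's standard library so it has no dependencies. The function name, the item fields and the sample values are assumptions for illustration, not the scripts' actual code.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(title: str, link: str, description: str, items: list[dict]) -> str:
    """Serialize items (dicts with title, link, html, published) as RSS 2.0."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        # ElementTree escapes the HTML body, which is how RSS carries markup.
        ET.SubElement(node, "description").text = item["html"]
        # RSS 2.0 expects RFC 822 style dates; format_datetime produces them.
        ET.SubElement(node, "pubDate").text = format_datetime(item["published"])
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical item, standing in for one email or one news result.
    sample = {
        "title": "Weekly newsletter",
        "link": "https://example.invalid/mail/1",
        "html": "<p>Hello</p>",
        "published": datetime(2023, 5, 1, 7, 0, tzinfo=timezone.utc),
    }
    print(build_rss("Newsletters", "https://example.invalid/feed.xml", "Mail as RSS", [sample]))
```

The resulting string is what gets written to the XML file and pushed to Gitea.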
Here is a list of what I am using:
Docker Containers
I have Miniflux and Gitea running in Docker containers. Gitea is only accessible from my local network. Miniflux is exposed to the internet with SWAG, so I can read my news while at work or on my mobile. Since Miniflux and Gitea are in the same network, Miniflux fetches the feeds from the Gitea files directly within the local network, so I don't need public access to my Gitea instance.
Gitea
I have been using Gitea for a while now. When I first started with it, to be honest, I didn't have a real use case for it. But now I am using it for my personal RSS feeds and beancount folders. I am also deploying my scripts to my servers via Gitea. Nowadays there is some controversy around it, but I think many of the self-hosted alternatives will work here. Here is my docker-compose file:
version: "3"
networks:
  gitea:
    external: false
services:
  server:
    # Docker image names must be lowercase
    image: gitea/gitea
    container_name: gitea
    environment:
      - USER_UID=1000
      - USER_GID=1000
      # Gitea reads settings from variables with the GITEA__section__KEY pattern
      - GITEA__database__DB_TYPE=mysql
      - GITEA__database__HOST=<mysql ip>:<port>
      - GITEA__database__NAME=<mysql database name>
      - GITEA__database__USER=<mysql user>
      - GITEA__database__PASSWD=<mysql password>
    restart: always
    networks:
      - gitea
    volumes:
      - /home/serkan/Containers/Gitea/data:/data
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    ports:
      # web UI
      - "3000:3000"
      # SSH for git
      - "222:22"
Gitea also requires a database; you can use an existing MySQL instance or add another service to your docker compose file. Here they have a few examples.
Miniflux
The second container I am using is the RSS reader, Miniflux. Miniflux is a minimalist and opinionated feed reader. Any questions? No? OK, moving on...
services:
  miniflux:
    image: miniflux/miniflux:latest
    restart: unless-stopped
    container_name: miniflux
    ports:
      - "6123:8080"
    depends_on:
      db:
        condition: service_healthy
    environment:
      # Postgres user, password and database name from the db service below
      - DATABASE_URL=postgres://user:password@db/miniflux?sslmode=disable
      - BASE_URL=https://your.domain.tld
  db:
    image: postgres:15
    container_name: miniflux-db
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      # must match the database name in DATABASE_URL; defaults to the user name if omitted
      - POSTGRES_DB=miniflux
    volumes:
      - ./data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "user", "-d", "miniflux"]
      interval: 10s
      start_period: 30s
I re-discovered RSS thanks to FreshRSS. It is a really solid program and gets the job done perfectly. It has a plug-in ecosystem to enhance the experience. But it feels a little old and sluggish, so I decided to switch to Miniflux. It has a very small set of configuration options and it feels much faster to me. It provides the Fever API, so you can use it with compatible mobile apps. It also has a few more integrations. The UI is clean and simple. No distractions, just thousands and thousands of unread news...
One thing I recommend is adding some custom CSS, because newsletters use tables heavily. That HTML renders very nicely in Gmail, for example, but it doesn't look nice with Miniflux's default style. Something like the below is sufficient to hide the table borders.
table, th, td {
  border-style: hidden;
}
Python, git and cron
You can find the scripts here and here. I tried to explain my code as much as possible, but as you may notice, I am not a programmer, so don't judge me too much. Both scripts use loguru for logging and feedgen to create the XML files.
Gmail2RSS
The gmail2RSS script uses the Google API. Finding the part of the email payload that contains the HTML content was especially tricky. I don't know if this is a peculiarity of Gmail.
The authentication flow for the Google API is also not easy, or at least it wasn't easy for me, but with a little research I think I figured it out. More on this below.
I have created multiple filters and rules in Gmail. Selected emails get the "promo" or "newsletter" label.
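To show why finding the HTML part is fiddly: a Gmail API message comes back as a tree of parts, and the HTML body can be nested a few levels deep. Below is a minimal sketch of one way to dig it out. It is not the code from my script; it assumes the payload shape of the Gmail API message resource (`mimeType`, `body.data`, `parts`), assumes the body is UTF-8, and `find_html_body` is a hypothetical name.

```python
import base64
from typing import Optional


def find_html_body(part: dict) -> Optional[str]:
    """Depth-first search of a Gmail message payload for the first text/html part."""
    data = part.get("body", {}).get("data")
    if part.get("mimeType") == "text/html" and data:
        # Gmail uses URL-safe base64; restore any padding that was stripped.
        padded = data + "=" * (-len(data) % 4)
        return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")
    # Multipart containers keep their children under "parts".
    for child in part.get("parts", []):
        html = find_html_body(child)
        if html is not None:
            return html
    return None
```

You would call it with the `payload` field of a message fetched in full format. If it returns `None`, the email has no HTML part and you have to fall back to the plain-text one.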
Bing2RSS
The second script gets the news from the Microsoft Bing News API. The method is the same. You need two YAML files. The first one is credentials.yaml, where you keep your Azure key. More on this below.
credentials.yaml
----------------
subscriptionKey: "somerandomlettersandnumbersazuregivesyou"
The other file is config.yaml, where you keep your search terms. The script loops through the terms. For example, if you want to be kept updated about when Tina Turner will release her next album, you can do something like this:
config.yaml
-----------
terms:
  - name: Queen of RocknRoll
    q: Tina Turner
You can add more terms, but keep in mind the free tier has 1000 calls per month and each term is one call per run, so with one run a day up to 30 terms should stay within the limit.
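If you want to check the math for your own schedule, here is a tiny sketch. The 1000 calls per month figure is the free tier quota mentioned above; the function names are made up, and I assume a 31-day month to stay on the safe side.

```python
def monthly_calls(num_terms: int, runs_per_day: int = 1, days_in_month: int = 31) -> int:
    """Each term costs one API call per run."""
    return num_terms * runs_per_day * days_in_month


def max_terms(monthly_quota: int = 1000, runs_per_day: int = 1, days_in_month: int = 31) -> int:
    """Largest number of terms that still fits in the monthly quota."""
    return monthly_quota // (runs_per_day * days_in_month)


if __name__ == "__main__":
    print(monthly_calls(30))  # 30 terms, once a day
    print(max_terms())        # how many terms fit with one run a day
```

With one run a day, 30 terms cost 930 calls in a 31-day month. Run the script twice a day and the budget halves.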
git
You need to initialize 2 separate git repositories, one for each script. Then add the remote repos. When you first create a repo in Gitea, it shows you how to do it. And I am pretty sure there are hundreds of tutorials on the internet that explain this a hundred times better than I can.
Cron
I have 2 cron tasks for the scripts. Gmail2RSS runs every 30 mins and bing2RSS runs at 7 am every morning.
0 7 * * * /usr/bin/python3 /home/4eyes.dad/python-scripts/bing-RSS/main.py >/dev/null 2>&1
*/30 * * * * /usr/bin/python3 /home/4eyes.dad/python-scripts/gmail2RSS/main.py >/dev/null 2>&1
The scripts create the XML files in the same folder, and GitPython pushes them to the remote.
Others: Gmail and Bing APIs
You will need two third-party APIs for this: the Google API to read the emails and the Bing Search API to get the news.
Google API
First you need to create a new project, or use one you already have, in the Google Cloud Console and add the Gmail API from the API library. Then go to the credentials tab and create an OAuth client ID for a web application. Once you have created the app, you need to follow the authorization flow. It's a bit tricky, but Google has a good tutorial here.
Azure
For the Bing Search API, go to Microsoft Azure and create an account if you don't already have one. Then create a subscription; you can create a free subscription and test it, or a pay-as-you-go subscription. Don't worry, Azure will not charge you if you don't select one of the paid tiers. Next, add a resource: find Bing Search v7, click create and fill in the form. Selecting F1 as the pricing tier is important. This plan gives you 1000 calls per month and 3 calls per second. That should be enough for 20-30 topics if you run the script daily.
Once your subscription and resource are created, go to the resource page and find your keys under "keys and endpoints". Paste KEY 1 into the credentials.yaml file. Keep it safe. I don't have to tell you this, I am not your father...
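For the curious, this is roughly how the key ends up being used. The sketch only builds the request and does not send it. The endpoint URL and the `Ocp-Apim-Subscription-Key` header are how I understand the Bing News Search v7 API, so double-check them against your resource page; `build_news_request` is a hypothetical name and this is not the code from my script.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for Bing News Search v7; verify it on your Azure resource page.
BING_NEWS_ENDPOINT = "https://api.bing.microsoft.com/v7.0/news/search"


def build_news_request(subscription_key: str, query: str, count: int = 10) -> Request:
    """Build (but do not send) a news search request for one search term."""
    params = urlencode({"q": query, "count": count})
    return Request(
        f"{BING_NEWS_ENDPOINT}?{params}",
        # The key from credentials.yaml travels in this header.
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
    )


if __name__ == "__main__":
    request = build_news_request("dummy-key", "Tina Turner")
    print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would make the actual call, and every call counts against the monthly quota.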
Optional: Healthchecks.io
Each script sends POST requests at the beginning and end of its run to my healthchecks.io service. If a script fails, it also sends an error message with the logs, so I can monitor the script and fix it if something breaks. I will hopefully write a proper article about Healthchecks later.
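The start/finish/fail pattern can be sketched like this. It relies on the Healthchecks convention of appending `/start` and `/fail` to the check's ping URL and sending the log in the request body. The function names are made up, the ping function is swappable so you can try it without a network, and it is a sketch rather than what my scripts literally do.

```python
import traceback
from typing import Callable
from urllib.request import Request, urlopen


def send_ping(url: str, body: str = "") -> None:
    """POST to a ping URL; a non-empty body is stored by Healthchecks as a log."""
    urlopen(Request(url, data=body.encode("utf-8"), method="POST"), timeout=10)


def run_monitored(
    job: Callable[[], None],
    ping_url: str,
    ping: Callable[[str, str], None] = send_ping,
) -> bool:
    """Ping /start, run the job, then ping success or /fail with the traceback."""
    ping(f"{ping_url}/start", "")
    try:
        job()
    except Exception:
        # Send the traceback so the failure shows up with its log.
        ping(f"{ping_url}/fail", traceback.format_exc())
        return False
    ping(ping_url, "")
    return True
```

In a script you would wrap the main function, something like `run_monitored(main, ping_url)`, with `ping_url` being the ping URL of your own check.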
Finally, I want to mention some alternatives. One is kill-the-newsletter. You can create purpose-specific email addresses and subscribe to newsletters with them. Then kill-the-newsletter transforms the emails into a web feed. It's also possible to self-host it if you are brave enough to run an email server.
The other one I want to mention is RSS-Bridge. As the name suggests, it bridges the gap to websites that don't provide a feed. There are hundreds of bridges you can use, including many for social networks.