The power of interning: making a time series database smaller

Published on: 2025-07-07 00:03:03

This weekend project started with browsing the open-data repository of Paris’ public transport network, which offers various APIs to query real-time departures, current disruptions, etc. The data reuse section caught my eye, as it features external projects that use this open data. In particular, the RATP status website provides a really nice interface to visualize historical disruptions on metro, RER/train and tramway lines.

(Figure: a usual day of disruptions on ratpstatus.fr.)

Under the hood, the ratpstatus.fr GitHub repository contains all the JSON files queried from the open-data API every 2 minutes, for almost a year now. A repository with 188K commits and more than 10 GB of accumulated data at the last commit alone (as measured by git clone --depth=1) is definitely an interesting database choice!

To be clear, this post isn’t in any way a critique of that. RATP status is an excellent website providing useful information that runs blazingly fast and smoothly without the usual bloat you ...