You can't unit test for taste

I’m building In the Long Run, an app where runners do virtual runs on famous routes around the world. The app tallies up your Strava mileage and plots your total distance as progress against country- or continent-spanning routes. The intention is to provide long-term inspiration and motivation; life is a marathon, not a sprint. You can have a bad month or season but still make progress on your virtual traversal of the world.

The app shows your progress on interactive maps, which lets users do some exploring of their own. But I had long wanted to enrich the maps with interesting sights or historical sites. For routes I was familiar with I could build such lists myself, but that doesn’t scale to routes spanning countries I don’t know. So I set out to find a data source for points of interest that I could build a pipeline on. Along the way I wrestled with taste and biases, and fought a hallucinating LLM. I initially thought AI would be the feature, but it ended up merely in a supporting role alongside other signals and data processing mainstays.

GeoNames was an obvious starting point, an extensive data source with locations, categories and links. The full data set can be downloaded and has a Creative Commons licence. So with my friend Claude I set about building a pipeline to go from the raw dumps to serving relevant points of interest to users of In the Long Run.

We used Python as the programming language (it has good library support for the tasks at hand), stored processed data locally as Apache Parquet files, and used DuckDB as the query layer. This was my first time using both Parquet and DuckDB, but the ergonomics of both felt good and Claude introduced me to their features step by step (and most of the DuckDB work was SQL, which I am very familiar with). In general I find adding one or two new tools or technologies to a project is the best way to learn. If the entire stack is new to you the learning curve will be too steep and it might put you off the project entirely. AI coding agents change this calculus somewhat, but even then I find having a handle on most of the technologies being used lets me steer the agent better and make informed decisions instead of blindly following its lead.

Point of interest feature screenshot for a runner on Route 66 near Springfield, Illinois.

I built a project plan with Claude before starting the implementation, outlining the different steps of the pipeline and feature work. As we went along we then built a spec/plan for each step that we could iterate on as we learned more from earlier work. This also meant I could start new agent sessions for each milestone. Condensing results from the previous milestones into short context and instructions for the next step gets you faster and better responses (I find big contexts quickly degrade the quality of agent work).

Notability and notable biases

To begin with we downloaded and unzipped all the required files from GeoNames and set up gitignores for the data files, as most are too large to be version controlled.

The first step of processing was joining the downloaded files on the relevant columns and filtering out rows that were not useful for our purposes. For instance we excluded administrative divisions: countries, states, regions etc. We also selected specific feature codes that we thought would be most interesting: parks, historic sites, castles, monuments, mountains, etc. Finally we added a population filter on populated places and an elevation filter on mountains. I’m sure this led to some false negatives, but we wanted a rough first draft.
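As a rough illustration of this kind of filter, here is a minimal sketch in plain Python using only the standard library (the actual pipeline did this work in DuckDB SQL over Parquet). The column order follows the GeoNames main table layout. The exact set of feature codes and both thresholds (`MIN_POPULATION`, `MIN_ELEVATION_M`) are assumptions made for the example, not the values the pipeline used.

```python
import csv

# Column order of the GeoNames main table (tab-separated, no header row).
GEONAME_COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude", "longitude",
    "feature_class", "feature_code", "country_code", "cc2",
    "admin1_code", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date",
]

# Assumed selection: parks, historical sites, castles, monuments, mountains.
INTERESTING_CODES = {"PRK", "HSTS", "CSTL", "MNMT", "MT"}
MIN_POPULATION = 50_000   # hypothetical threshold for populated places
MIN_ELEVATION_M = 1_000   # hypothetical threshold for mountains, in metres


def _to_int(value):
    """GeoNames leaves unknown numeric fields empty; treat those as 0."""
    return int(value) if value and value.strip() else 0


def keep_row(row):
    """Decide whether one GeoNames row survives the first-draft filter."""
    if row["feature_class"] == "A":   # countries, states, regions etc.
        return False
    if row["feature_class"] == "P":   # populated places: population filter
        return _to_int(row["population"]) >= MIN_POPULATION
    if row["feature_code"] == "MT":   # mountains: elevation filter
        return _to_int(row["elevation"]) >= MIN_ELEVATION_M
    return row["feature_code"] in INTERESTING_CODES


def filter_geonames(lines):
    """Parse tab-separated GeoNames lines and return the rows worth keeping."""
    reader = csv.DictReader(
        lines, fieldnames=GEONAME_COLUMNS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [row for row in reader if keep_row(row)]
```

`filter_geonames` accepts any iterable of lines, so an open file handle on the unzipped dump can be passed in directly. Treating a missing elevation as 0 drops mountains with no recorded height, which is one concrete source of the false negatives mentioned above.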

Somewhat unintuitively, the GeoNames alternateNames.txt dataset includes Wikipedia links (rows where isolanguage = 'link' and alternate_name like '%en.wikipedia.org%'). This use case feels bolted on to their schema after the fact, but it is very helpful data to have. We used the presence of a link as a notability/relevance signal, and it also gave us texts that we could build blurbs from, since Wikipedia summaries also have a Creative Commons licence.
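The link lookup can be sketched in a few lines of standard-library Python. This is only an illustration of the matching rule described above, not the DuckDB query the pipeline ran. The column names are my own labels for the tab-separated fields of alternateNames.txt, and `wikipedia_links` is a hypothetical helper name.

```python
import csv

# Labels for the tab-separated fields of alternateNames.txt (no header row).
ALTNAME_COLUMNS = [
    "alternate_name_id", "geonameid", "isolanguage", "alternate_name",
    "is_preferred_name", "is_short_name", "is_colloquial", "is_historic",
]


def wikipedia_links(lines):
    """Map geonameid -> first English Wikipedia URL found for that place."""
    reader = csv.DictReader(
        lines, fieldnames=ALTNAME_COLUMNS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    links = {}
    for row in reader:
        url = row["alternate_name"] or ""
        # Links are stored as ordinary alternate names with isolanguage "link".
        if row["isolanguage"] == "link" and "en.wikipedia.org" in url:
            links.setdefault(row["geonameid"], url)
    return links
```

A place that appears as a key in the returned dictionary has an English Wikipedia article, which is the signal used here; the URL itself is what a later step would need to fetch a summary for the blurb.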
