My Database Is My Application: Rethinking Webhook Logic with DuckDB and SQL

Sat May 10 2025 • duckdb, sql, webhooks

Imagine you need to build a system for processing incoming webhooks. You're probably picturing a familiar setup: a lightweight web server (FastAPI, Flask, Express.js, etc.), some Python (or Node.js, or Go) handlers to parse JSON, a sprinkle of business logic, and then maybe persisting data to a traditional database like PostgreSQL or MySQL. Perhaps you'd toss events onto a message queue like Kafka or RabbitMQ for downstream processing. Standard stuff, right?

Well, I’ve been experimenting with a different approach. What if I told you I let SQL handle almost all of it?

🏗️ I load the incoming webhook JSON directly into DuckDB. Then, I run a SQL transform—a query dynamically defined and stored in the database itself—to reshape the data. And yes, SQL even helps decide where that webhook payload gets routed next.

Sounds a bit... unconventional? Maybe. But it’s an attempt to solve some persistent challenges I've encountered, and it opens up a fascinating way to think about data, logic, and infrastructure. This isn't just a backend API; the project also includes a simple web UI to manage these configurations visually, making the whole system tangible. The code is available here.

The Familiar Friction: Why I Started Questioning "Normal"

I've built and maintained my fair share of webhook gateways and integration layers. A few common pain points kept cropping up:

The Code Bottleneck: Every time a new webhook source or a slight variation in transformation logic was needed, it meant code changes. A new handler, a modified Pydantic model, a redeploy. I, or my team, became the bottleneck.

Ownership Tangles: Giving multiple teams or users the ability to define their own webhook transformations often meant granting them broader application deployment privileges, or setting up complex, isolated microservices for each. Neither felt quite right.

Repetitive Logic: So many webhook handlers do similar things: pick a few fields, rename some keys, maybe enrich with a static lookup. Writing Python for output_payload['userName'] = input_payload['user']['login'] over and over felt like I was missing a more declarative way.

Observability Challenges: Understanding why a specific webhook failed or was transformed in a certain way often involved digging through application logs, which could be scattered or inconsistently formatted.

These issues led me to wonder: could there be a more data-centric, self-service approach?
