Kubernetes Egress Control with Squid Proxy ¶
This Way to the Egress!
— Sign at P.T. Barnum’s American Museum
Kubernetes ingress gets a lot of attention – Gateway API, Ingress controllers, service meshes – while egress is mostly ignored until someone asks “what exactly is our cluster talking to?”, or, even in simple deployments, “can we see what we are talking to?”. This is a (very) simple approach to answering that, using the venerable Squid proxy and a NetworkPolicy, without reaching for heavier machinery (but beginning to understand why we would).
Here’s an overview of what I’m about to describe:
Why do I care ¶

Most Kubernetes tutorials focus on getting traffic into your cluster, which is fair since that’s where it usually starts... but traffic flows both ways, and once your workloads start making outbound calls to APIs, databases, and services beyond your cluster boundary, there’s a discussion on visibility and security to be had. I ran into this while working with OpenShift’s egress policies years ago, in so-called “regulated industries”: while not the most flexible option at the time, they were the most straightforward answer to security requirements stating that outbound traffic should go through a proxy. I run Kubernetes through k3s (mostly) and kind (often, for development) for my own personal stuff (see Projects), so I went back to basics on this: what if we just used Squid – a proxy that has been solving this problem since 1996! – and enforced its use with a NetworkPolicy? Nothing fancy, nothing “next-gen cloud-native”, just a proxy with logs, and see where that got me.
Squid and k3s: the solution ¶

The architecture is deliberately simple:

┌──────────────────────────────────────────────────────────────┐
│ Cluster                                                       │
│                                                               │
│  ┌─────────────────────┐       ┌────────────────────────┐    │
│  │ workload namespace  │       │ egress-proxy namespace │    │
│  │                     │       │                        │    │
│  │  ┌─────┐            │ :3128 │  ┌───────┐             │    │
│  │  │ pod │ HTTP_PROXY ├───────┼─▶│ squid │─────────────┼────┼──▶ internet
│  │  └─────┘            │       │  └───────┘             │    │
│  │                     │       │                        │    │
│  │   x blocked         │       └────────────────────────┘    │
│  │   (direct egress)   │                                     │
│  └─────────────────────┘                                     │
└──────────────────────────────────────────────────────────────┘

Workloads configure HTTP_PROXY / HTTPS_PROXY environment variables pointing to Squid, and a NetworkPolicy on the workload namespace blocks direct egress, allowing traffic only to the proxy. Squid logs everything that passes through. That’s it, and this gives us:

Visibility: every outbound connection logged with timestamp, destination, bytes transferred
Enforcement: NetworkPolicy makes the proxy mandatory, not optional
Simplicity: no CNI plugins, no service mesh, no CRDs

Why Squid? I used Squid deliberately because it predates Kubernetes and most “cloud-native” tooling, but still does exactly what this problem requires: explicit HTTP/HTTPS egress control, logging, and policy enforcement. The point here isn’t to be original, but to show that older, well-understood components still fit naturally inside Kubernetes when used intentionally. Squid has a very good feature set around access control and visibility, and is much less “ingress-first” than common alternatives.
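To make the enforcement side concrete, here is a minimal sketch of the kind of NetworkPolicy this relies on. It is not the exact manifest from my setup: the namespace names (workload and egress-proxy) come from the diagram above, the kubernetes.io/metadata.name selector assumes a reasonably recent Kubernetes, and the DNS rule is an assumption of mine, since pods usually still need to resolve the proxy’s Service name.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-via-proxy-only
  namespace: workload              # the namespace on the left of the diagram
spec:
  podSelector: {}                  # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    # Allow traffic to Squid in the egress-proxy namespace, port 3128 only.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: egress-proxy
      ports:
        - protocol: TCP
          port: 3128
    # Allow DNS so pods can still resolve the proxy's Service name.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Anything not matched by those rules is dropped, so a pod that ignores the proxy variables simply times out instead of reaching the internet directly.

On the proxy side, the Squid configuration can be as small as a ConfigMap carrying a stripped-down squid.conf. Again a sketch, with the client ACL ranges and the log destination as assumptions; the point is that the access log goes to stdout, so kubectl logs on the Squid pod is all the visibility tooling needed:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: squid-conf
  namespace: egress-proxy
data:
  squid.conf: |
    http_port 3128
    # Only accept clients from private / in-cluster ranges (adjust to your pod CIDR).
    acl localnet src 10.0.0.0/8
    acl localnet src 172.16.0.0/12
    acl localnet src 192.168.0.0/16
    http_access allow localnet
    http_access deny all
    # Log every request (timestamp, destination, bytes, etc.) to stdout.
    access_log stdio:/dev/stdout squid
```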
The demo application ¶

To test this out, I’m using a small application I built: Horizons, a Common Lisp application using Datastar that displays the solar system and fetches data from NASA’s JPL Horizons API when you click on a planet. It’s a good test case because it makes real HTTPS calls to an external API – exactly the kind of traffic we want to observe. It’s a scaled-down version of DataSPICE, an app I made to test my Common Lisp SDK for Datastar and that uses NASA SPICE data for a 2D simulation of the Cassini-Huygens probe. It uses a multi-stage build that ends up with a reasonably small binary, horizons-server, at 16MB, which isn’t bad for an image-based language like Common Lisp (it can get down to ~13MB with some more compression optimisations), inside a trixie-slim Debian image for a total of ~100MB (this can also be optimised, aggressively so).
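Wiring the demo to the proxy is then just environment variables on the Deployment. A sketch under a few assumptions: the Squid Service is named squid in the egress-proxy namespace, the image reference is hypothetical, and the HTTP client inside the container actually honours HTTP_PROXY / HTTPS_PROXY (most do, but it’s worth checking):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: horizons
  namespace: workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: horizons
  template:
    metadata:
      labels:
        app: horizons
    spec:
      containers:
        - name: horizons-server
          image: horizons-server:latest   # hypothetical image reference
          env:
            # Send all outbound HTTP/HTTPS through Squid.
            - name: HTTP_PROXY
              value: "http://squid.egress-proxy.svc.cluster.local:3128"
            - name: HTTPS_PROXY
              value: "http://squid.egress-proxy.svc.cluster.local:3128"
            # Keep in-cluster and local traffic off the proxy.
            - name: NO_PROXY
              value: "localhost,127.0.0.1,.svc,.cluster.local"
```

With that in place, clicking a planet should show the JPL Horizons call in Squid’s access log.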