DNS Is for People – Not for IT Infrastructure

Why This Matters

This article highlights the importance of reevaluating the use of DNS within internal IT infrastructure, emphasizing that while DNS is vital for public-facing services, it can introduce unnecessary risks and complexity internally. Reducing reliance on DNS for internal systems can enhance overall reliability and minimize potential failure points, especially given past high-profile outages linked to DNS dependencies.

Key Takeaways

The Domain Name System exists because it's difficult for people to remember IP addresses (185.15.59.224) and much easier to remember domain names (wikipedia.org).

For internet-accessible services, it makes sense to publish websites, API endpoints, or similar services using DNS, because people have to interact with them. An added benefit of a domain name is that the associated IP address can change without the client being affected.

This article isn't against DNS for public services, but it questions whether we should use DNS for internal IT infrastructure (independent of cloud vs. on-prem).

It's always DNS

Although DNS can be a very beneficial service, it can also become a liability. If you want a reliable system, you want as few components as possible. Every additional component adds a potential risk of failure. In addition, more components may create unforeseen behaviour and interactions that can cause outages (circular dependencies, and so on). If you can avoid adding components, you'll have a better chance of building a reliable system.
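As a rough illustration of the "every additional component" point, here is a minimal sketch in Python. It assumes that all components sit in the critical path and fail independently, which real systems rarely do exactly; the 99.9% figure and the function name `serial_availability` are made up for the example.

```python
from math import prod


def serial_availability(availabilities):
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(availabilities)


# Hypothetical figures: each component is available 99.9% of the time.
without_extra = serial_availability([0.999] * 3)  # e.g. client, network, server
with_extra = serial_availability([0.999] * 4)     # same chain plus one more component

print(f"3 components: {without_extra:.4%}")
print(f"4 components: {with_extra:.4%}")
```

Under these assumptions, adding a component to the critical path can only lower the availability of the whole chain, never raise it.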

Within the IT operations space, DNS has made a bit of a name for itself. Many may remember this little haiku.

It’s not DNS
There’s no way it’s DNS
It was DNS

(source)

There are multiple(1) high-profile(2) incidents where DNS was involved. In these linked cases, the root cause of the incident isn't the DNS system itself. Yet because the root cause affects the DNS service, which is in the critical path for virtually all services, the incident has a huge impact.

The Facebook / Meta outage was so significant because it locked people out of buildings (physical access) due to 'circular' dependencies on DNS being available. Again, it can be said that the circular dependency is the root cause, but the blast radius of DNS is in many cases so enormous that it may be difficult to have a clear end-to-end picture of potential risk.
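One way to get a clearer picture of that risk is to write the dependencies down and query them. The sketch below is only an illustration: the service names and the dependency map are hypothetical and do not describe the Facebook / Meta setup, and the helpers `blast_radius` and `find_cycle` are names invented for this example.

```python
# Hypothetical dependency map: service -> services it needs in order to work.
DEPENDS_ON = {
    "dns": ["config_service"],
    "config_service": ["dns"],      # circular: DNS and its config need each other
    "badge_readers": ["dns"],
    "internal_tools": ["dns"],
    "monitoring": ["internal_tools"],
    "static_site": [],
}


def blast_radius(graph, failed):
    """Return every service that directly or transitively depends on `failed`."""
    affected = set()
    changed = True
    while changed:  # repeat until no new affected service is found
        changed = False
        for service, deps in graph.items():
            if service == failed or service in affected:
                continue
            if any(d == failed or d in affected for d in deps):
                affected.add(service)
                changed = True
    return affected


def find_cycle(graph):
    """Return one dependency cycle as a list of services, or None if there is none."""
    in_progress, done = set(), set()
    path = []

    def visit(node):
        in_progress.add(node)
        path.append(node)
        for dep in graph.get(node, []):
            if dep in in_progress:  # back edge: we returned to a node on the path
                return path[path.index(dep):] + [dep]
            if dep not in done:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for node in graph:
        if node not in done:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


print(sorted(blast_radius(DEPENDS_ON, "dns")))
print(find_cycle(DEPENDS_ON))
```

In this made-up map, a DNS failure reaches everything except `static_site`, and the cycle check flags that DNS and its configuration service depend on each other.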
