Tech News
← Back to articles

Choosing a Database Schema for Polymorphic Data (2024)

read original related products more articles

Designing a schema for your relational database is a daunting task that has long term implications for the database's performance, maintainability, and correctness. And it often requires making decisions before having a clear picture of the exact shape and distribution of your data, or what the common access patterns will look like. It's not a permanent decision: tables can be altered and databases migrated. But these migrations can be slow and expensive.

To top it all off, some data is less amenable to relational modeling than others. Polymorphic Data, in particular, is a common occurrence that is difficult to model in SQL. You've probably worked with polymorphic data before, even if you're not familiar with the term.

Example: Patient Payment Info

Imagine you're a medical provider storing patient payment methods. Every patient is either insured or uninsured, and you need to store different data for each patient based on their insurance status.

If you've written code in a functional language, you might be familiar with algebraic data types. For instance, you could represent this schema in Haskell with the following ADT:

type Provider = String type MemberId = Int type CardNumber = String type BillingAddress = String data PaymentInfo = Insured Provider MemberId | Uninsured CardNumber BillingAddress

If you're more accustomed to object oriented programming, you might implement this same data type via abstract classes and inheritance, like the following Java code:

abstract class PaymentInfo { } class Insured extends PaymentInfo { String provider ; int memberId ; } class Uninsured extends PaymentInfo { String cardNumber ; String billingAddress ; }

These are both examples of polymorphic data: a datatype where the data can take multiple different shapes. The word comes from Greek roots: "poly" meaning many, and "morph" meaning form. So, data that can take many forms. When a relational database needs to model polymorphic data, we typically use the term polymorphic association instead.

Sometimes, like in the above example, a polymorphic association is the best way to model your data type. But there's no obvious way to represent this data in a relational database. If you want to include data like this in your database, you're going to have to make some tradeoffs. Let's look at some different schemas that attempt to model this data. We're going to use MySQL as our chosen dialect here, but the process is similar regardless of dialect.

... continue reading