Data-at-Rest Encryption in DuckDB
Lotte Felius, Hannes Mühleisen · 22 min
TL;DR: DuckDB v1.4 ships database encryption capabilities. In this blog post, we dive into the implementation details of the encryption, show how to use it and demonstrate its performance implications.
If you would like to use encryption in DuckDB, we recommend using the latest stable version, v1.4.2. For more details, see the latest release blog post.
Many years ago, we read the excellent “Code Book” by Simon Singh. Did you know that Mary, Queen of Scots, used an encryption method harking back to Julius Caesar to encrypt her more saucy letters? But alas: the cipher was broken and the contents of the letters got her executed.
These days, strong encryption software and hardware is a commodity. Modern CPUs come with specialized cryptography instructions, and operating systems small and big contain mostly-robust cryptography software like OpenSSL.
Databases store arbitrary information, it is clear that many if not most datasets of any value should perhaps not be plainly available to everyone. Even if stored on tightly controlled hardware like a cloud virtual machine, there have been many cases of files being lost through various privilege escalations. Unsurprisingly, compliance frameworks like the common SOC 2 “highly recommend” encrypting data when stored on storage mediums like hard drives.
However, database systems and encryption have a somewhat problematic track record. Even PostgreSQL, the self-proclaimed “The World's Most Advanced Open Source Relational Database” has very limited options for data encryption. SQLite, the world’s “Most Widely Deployed and Used Database Engine” does not support data encryption out-of-the-box, its encryption extension is a $2000 add-on.
DuckDB has supported Parquet Modular Encryption for a while. This feature allows reading and writing Parquet files with encrypted columns. However, while Parquet files are great and reports of their impending death are greatly exaggerated, they cannot – for example – be updated in place, a pretty basic feature of a database management system.
Starting with DuckDB 1.4.0, DuckDB supports transparent data encryption of data-at-rest using industry-standard AES encryption.
... continue reading