A lost decade chasing distributed architectures for data analytics?
Published on: 2025-06-24 14:39:31
The Lost Decade of Small Data?
Hannes Mühleisen. Reading time: 9 minutes
TL;DR: We benchmark DuckDB on a 2012 MacBook Pro to decide: did we lose a decade chasing distributed architectures for data analytics?
Much has been said, not least by ourselves, about how data is actually not that “Big” and how the speed of hardware innovation is outpacing the growth of useful datasets. We may even have gone so far as to predict a data singularity in the near future, where 99% of useful datasets can be comfortably queried on a single node. As recently shown, the median scan in Amazon Redshift and Snowflake reads a doable 100 MB of data, and the 99.9th-percentile scan reads less than 300 GB. So the singularity might be closer than we think.
But we started wondering: when did this development really start? When did personal computers like the ubiquitous MacBook Pro, usually condemned to running Chrome, become the data processing powerhouses that they are today?
Let's turn our a
...