Dataframely: A polars-native data frame validation library
Published on: 2025-08-02 22:14:50
At QuantCo, we are constantly trying to improve the quality of our codebases to ensure that they remain easily maintainable. Recently, this has often involved migrating data pipelines from pandas to polars in order to achieve significant performance gains.
At the end of 2023, we started an effort to modernize a massive legacy codebase in one of our longest-running projects. While doing that, we realized that our existing data frame processing code had a fundamental flaw: column names, data types, value ranges, and other invariants were not obvious just from reading the code.
As a result, the typical approach for understanding a function's behavior involved executing it on client infrastructure — the only place the actual data is available. Then, we would manually step through each pandas transformation to inspect the data before and after every change. Naturally, this is tedious, error-prone, and far from efficient.
Once we'd rewritten a chain of transformati
... Read full article.