Mypy, Pyrefly, Pyright, ty, Zuban, and possibly more that will come in the future... how are library maintainers expected to cope?
TL;DR: Prioritise running as many type-checkers as possible on your test suite. Run at least one on your source code.
If you only read one section of this blog post, please make it this one, because this is where a lot of packages get it wrong. It's common to see packages run type checkers on their source code and leave their tests untyped. That approach has it backwards.
Suppose you maintain a Python package. As a hypothetical user of your code, I don't particularly care about your internal development practices. Whether you use ruff format or black, how you sort your imports, whether you use pytest or unittest: none of this affects me. What I do care about is your public API and my experience interacting with it.
When you run a type-checker on your internal source code, you're mostly testing your internal logic. You can do that with whichever type-checker you prefer; that's your choice. Which type-checker your users run, on the other hand, isn't.
By running as many type-checkers as possible over your test suite, you ensure that your package's public API works well for as many of your users as possible.
Polars is a modern dataframe library which, since its launch in 2020, has been taking the data science world by storm. As a heavy user of the library, I was very interested in making its developer experience even better. If Polars' types are accurate, then as a user I get better auto-complete, documentation, and protection from certain classes of bugs. What would it take to add Pyrefly to Polars' continuous integration jobs?
I started investigating this, and quickly ran into some roadblocks. Pyrefly is generally stricter than mypy, so it required rewriting parts of the codebase or adding more explicit type annotations when instantiating variables. I also encountered some bugs in Pyrefly; encouragingly, fixes for the vast majority of them shipped with the highly anticipated v1 release. I think it was worth it, especially as it uncovered a medium-priority bug, but I did have to ask myself whether going through this for another three type-checkers would be too.
To illustrate this point, let's look at the function DataType.__eq__. In Python, an __eq__ method is expected to return bool, and if it doesn't, then we need to explicitly tell type-checkers to ignore the type error. In Polars, this function can also return different types depending on its inputs, which requires overloads. To get this function to satisfy all of mypy, Pyrefly, and ty, we need to write:
@overload