


Comparing Python packages for A/B test analysis: tea-tasting, Pingouin, statsmodels, and SciPy

Mar 01, 2026 · a/b testing, statistics, tea-tasting, python

Disclosure: I am also the author of tea-tasting.

This article compares four Python packages that are relevant to A/B test analysis: tea-tasting, Pingouin, statsmodels, and SciPy. It does not try to pick a universal winner. Instead, it clarifies what each package does well for common experimentation tasks and how much manual work is needed to produce production-style A/B test outputs.

It assumes familiarity with A/B testing basics, including randomization, p-values, and confidence intervals.

A/B test setting and analysis requirements #

A/B tests in a nutshell #

An A/B test compares two (or more) variants of a product change by randomly assigning experimental units to variants and measuring outcomes. In online experiments, the randomization unit is usually the user, and the standard assumption is that units are independent.
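To make the randomization step concrete, here is a minimal sketch of deterministic user-to-variant assignment via hashing, a common pattern in online experimentation platforms. The function name and salting scheme are illustrative assumptions, not the API of any package discussed here.

```python
import hashlib


def assign_variant(
    user_id: str,
    experiment: str,
    variants: tuple[str, ...] = ("control", "treatment"),
) -> str:
    """Deterministically assign a user to a variant.

    Hashing the user id together with an experiment-specific salt makes
    assignments stable across sessions for the same user, while keeping
    them effectively independent across different experiments.
    """
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


# Same user, same experiment: always the same variant.
print(assign_variant("user_42", "new_checkout"))
```

Because assignment is a pure function of the user id and the experiment salt, no assignment table needs to be stored, and the same user sees a consistent experience for the duration of the test.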

A typical workflow is:

1. Design the experiment: choose the randomization unit (usually users), define the target population, and estimate sample size and duration with power analysis.
2. Run the experiment: ship the treatment, randomize traffic, and collect data.
3. Analyze and interpret the results: compute control and treatment metric values, estimate effects with confidence intervals, and report p-values.
