Simulating and Visualising the Central Limit Theorem

Categories: Statistics R

34 minutes read

I completed a Computer Science degree at uni, and bundled a lot of maths subjects in as electives: partial differential equations, vector calculus, discrete maths, linear algebra. For some reason, however, I always avoided statistics subjects. Maybe there’s a story to be told about a young person finding uncertainty uncomfortable, because twenty years later I find statistics, particularly the Bayesian flavour, really interesting.

One problem with a self-directed journey is that there’s foundational knowledge that has come to me in dribs and drabs, and one of the most foundational is the Central Limit Theorem (CLT). In this post I want to interrogate and explore the CLT using simulation and visualisation in an attempt to understand how it works in practice, not in theory. This is predominantly a process to help me better understand the CLT; you’re just here for the ride. Hopefully that ride can help you get where you need to go as well.

It’s been a while since I’ve included any code in a post, so where it makes sense I’ll show the generating R code, with a liberal sprinkling of comments so it’s hopefully not too inscrutable.

A Brief Recap

I don’t want this to be like an online recipe with pages of back story before you get to the meat and bones, but a brief summary of the CLT before we begin is unavoidable. In plain English the CLT can be described as such:

“If you take repeated samples of size n from a distribution and calculate the sample mean for each, as n approaches infinity, the distribution of sample means approaches a normal distribution.”
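
To make that statement concrete, here is a minimal R sketch of the idea, not necessarily the approach taken in the rest of the post. It assumes an exponential distribution with rate 1 as the skewed starting point; the helper name simulate_sample_means, the seed, the sample sizes and the number of repetitions are all illustrative choices rather than anything from the original.

```r
# A minimal sketch: how do sample means behave as the sample size n grows?
set.seed(42)  # arbitrary seed, only so the run is reproducible

# Hypothetical helper: draw `reps` samples of size `n` from an
# exponential(rate = 1) distribution and return the mean of each sample.
simulate_sample_means <- function(n, reps = 10000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

# Illustrative sample sizes, from "no averaging at all" up to n = 100
sample_sizes <- c(1, 5, 30, 100)
means_by_n <- lapply(sample_sizes, simulate_sample_means)
names(means_by_n) <- paste0("n = ", sample_sizes)

# One histogram per sample size, arranged in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
for (label in names(means_by_n)) {
  hist(means_by_n[[label]], breaks = 50, main = label, xlab = "Sample mean")
}
par(old_par)  # restore the previous plotting layout
```

With n = 1 the histogram is just the skewed exponential itself. As n increases, the histograms should look progressively more symmetric and bell-shaped, centred near the distribution's mean of 1.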
