Have you ever participated in a poll? Or any process that may have collected data that represents you and a variety of other people? Today’s topic will explain the intuition behind polling.

Understanding the Central Limit Theorem (CLT)

Consider the following scenario.

There’s a marathon race arranged in your country, and you are the organizer of this grand event. Runners from all over the world are going to participate and have already arrived in your country.

On the day of the event, all the participants are being transported to the venue, but one of the buses broke down on the way. Unfortunately, the bus is full of foreigners who can’t communicate with people nearby.

Now, to resolve this matter, you set off on the road with your team. Lucky for you, you immediately find a bus surrounded by a group of unhappy foreigners. You obtain information about those passengers, including their weights, and notice that the mean weight is nearly 220 pounds (~99 kg). (Anything is possible when you know statistics!) There’s no way a random group of marathon runners could all be this heavy. You immediately inform your team to keep searching.

Bus A (marathon runners): average weight 155 lbs

Bus B (other passengers): average weight 220 lbs

Congratulations! If you can grasp how someone who takes a quick look at the weights of passengers on a bus can infer that they are probably not on their way to a marathon’s starting line, then you now understand the basic idea of the Central Limit Theorem.

The core principle of the Central Limit Theorem is that a large, properly drawn sample will resemble the population from which it was drawn. It won’t be an exact replica, and samples will vary from one to the next, but the probability that a sample deviates massively from the population is very low.

It’s still possible for people weighing 220 pounds to run a marathon (among all the participants, there might be almost 100 of them), but the likelihood of so many being assigned to the same bus is extremely low. Hence, you can confidently conclude that it’s not the bus you were looking for.

This is the basic intuition of the CLT. The mean weight of a marathon runner is about 155 pounds, so if the 60 passengers on that bus were a random group of runners, there would be less than a 1 in 100 chance of their mean weight being as high as 220 pounds.
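The bus check can be sketched numerically. The text gives the population mean (155 pounds) and the group size (60), but not the spread of runner weights, so the 25-pound standard deviation below is purely an assumption for illustration, as is the function name.

```python
import math

def prob_sample_mean_at_least(observed_mean, pop_mean, pop_sd, n):
    """Approximate P(sample mean >= observed_mean) for a random sample of
    size n, using the normal approximation that the CLT justifies."""
    standard_error = pop_sd / math.sqrt(n)
    z = (observed_mean - pop_mean) / standard_error
    # Upper-tail probability of a standard normal: 0.5 * erfc(z / sqrt(2))
    return 0.5 * math.erfc(z / math.sqrt(2))

# 155 lb and 60 passengers come from the text; 25 lb is an ASSUMED spread.
ASSUMED_SD = 25
p = prob_sample_mean_at_least(observed_mean=220, pop_mean=155,
                              pop_sd=ASSUMED_SD, n=60)
print(f"standard error: {ASSUMED_SD / math.sqrt(60):.2f} lb")
print(f"P(mean >= 220 lb): {p:.3g}")
```

Under that assumption, a mean of 220 pounds sits roughly 20 standard errors above 155, so the probability is vanishingly small, far below 1 in 100. A wider assumed spread would raise it, but any plausible value leads to the same conclusion.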

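We just used our knowledge of the population (the mean weight of a runner is 155 pounds) to conclude that the people on the bus were not a sample of runners. The reasoning also works in reverse: from a sample, we can draw conclusions about the population it came from, and that’s how polling works.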

In reality, it’s usually impractical to collect data on an entire population. That’s why sufficiently large samples are used to draw inferences about the parent population. This is where the CLT comes into play.

The Central Limit Theorem tells us that a large sample will not typically deviate sharply from its underlying population. A mere poll of 2,000 appropriately chosen people can reveal a great deal about how an entire country is thinking.
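To make the polling claim concrete, here is a minimal sketch of the usual margin-of-error calculation for a poll result expressed as a proportion. It assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst-case proportion of 0.5; the function name is hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion estimated from a
    simple random sample of size n (p=0.5 is the worst case)."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (500, 2000, 10000):
    print(f"n = {n:>6}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

With 2,000 respondents the margin is about 2.2 percentage points, and it does not depend on the size of the country, only on the size of the sample.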

The Central Limit Theorem states that, for sufficiently large samples, the sample means will be distributed roughly normally around the population mean. The population from which the samples are drawn does not have to be normally distributed for this to hold, provided it has a finite variance.

Interactive distribution plots (sample size: 30; number of samples: 1,000). Panels: population distribution (skewed) and distribution of sample means.

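The larger the number of samples, the more smoothly the histogram of sample means traces out its underlying shape. The larger the size of each sample, the closer that shape is to a normal distribution and the tighter it is around the population mean: its spread, the standard error, is the population standard deviation divided by the square root of the sample size.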
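The effect of sample size can be checked with a small simulation. This is a minimal sketch: the exponential population (mean 1, standard deviation 1) is an assumed stand-in for a skewed population, and the function name is hypothetical.

```python
import random
import statistics

def simulate_sample_means(sample_size, num_samples, rng):
    """Draw num_samples samples from a skewed (exponential, rate 1)
    population and return the mean of each sample."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(num_samples)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable
for n in (5, 30, 200):
    means = simulate_sample_means(sample_size=n, num_samples=1000, rng=rng)
    print(
        f"sample size {n:>3}: "
        f"mean of sample means = {statistics.fmean(means):.3f}, "
        f"spread = {statistics.stdev(means):.3f}, "
        f"predicted standard error = {1 / n ** 0.5:.3f}"
    )
```

In each row the sample means center on the population mean of 1, and their spread should track the predicted standard error, shrinking as the sample size grows.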

Summary

If you draw large, random samples from any population with a finite variance, the means of those samples will be distributed approximately normally around the population mean.
Most sample means will lie reasonably close to the population mean; the standard error defines “reasonably close.”
The Central Limit Theorem tells us the probability that a sample mean will lie within a certain distance of the population mean. It’s relatively unlikely that a sample mean will lie more than two standard errors from the population mean, and extremely unlikely that it will lie three or more standard errors away.
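How unlikely "two" and "three standard errors" are can be computed directly from the normal approximation. The sketch below uses the standard library's `statistics.NormalDist`; the function name is hypothetical, and the figures hold only to the extent that the sample means are close to normally distributed.

```python
from statistics import NormalDist

def prob_beyond(k):
    """P(sample mean lies more than k standard errors from the
    population mean, in either direction), under normality."""
    return 2 * (1 - NormalDist().cdf(k))

for k in (1, 2, 3):
    print(f"more than {k} standard error(s) away: {prob_beyond(k):.4f}")
```

This gives roughly 0.32 for one standard error, 0.05 for two, and 0.003 for three.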