The Infinite Monkey Rule
I always learn things from listening to the radio. On 'The Infinite Monkey Cage', I heard Brian Cox rattle off a 'rule of thumb', which goes something like:
A number plus or minus the square root of the sample size is consistent with random sampling error.
https://www.bbc.co.uk/programmes/b04yfsst (listen from 20 minutes in)
I hadn't heard that one, but deploying those key research tools, the back of an envelope and a pencil, I could see where it comes from.
I hope I am allowed to paraphrase Brian's rule:
If the excess (or deficit) number of cases meeting a criterion is larger than the square root of the sample size, it's statistically significant (at roughly the 95% level).
The way I'd normally look for a statistically significant difference in a proportion is to calculate the standard error of proportion (SEP) and use twice that (strictly, 1.96 times) to find the 95% confidence interval (CI). For a percentage of 50% (proportion = 0.5) in a sample of 100, the SEP is 5%, so the interval works out as 10 percentage points either way (9.8 if you use 1.96 rather than 2). Suppose we toss a coin a hundred times: we'd expect 50 heads and 50 tails, but we would accept 10 either way, anywhere from 40 to 60 heads. Anything outside that range suggests something unusual has happened -- a biased coin, a biased recording method -- but maybe it's just an unusual run of results.
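The coin-toss arithmetic can be checked with a few lines of Python. This is only a sketch using the usual normal approximation; the function names are my own, not from any library.

```python
import math

def standard_error_of_proportion(p, n):
    """Standard error of a proportion p observed in a sample of size n."""
    return math.sqrt(p * (1 - p) / n)

def ci95(p, n, z=1.96):
    """Approximate 95% confidence interval (normal approximation)."""
    margin = z * standard_error_of_proportion(p, n)
    return p - margin, p + margin

# A fair coin tossed 100 times
low, high = ci95(0.5, 100)
print(f"SEP    = {standard_error_of_proportion(0.5, 100):.3f}")
print(f"95% CI = {low:.3f} to {high:.3f}")
```

With z = 1.96 the interval comes out a shade narrower than 40% to 60%; passing z=2 gives the round numbers used in the text.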
Anyhow, to get to the 'square root of the sample size', we can derive it from the formula:
- For a proportion p seen in a sample of size n, SEP = root(p(1-p)/n).
- The SEP will be highest for p=0.5 (p(1-p)=0.25) and gets smaller as p moves towards 0 or 1; for p=0.1, say, p(1-p)=0.09.
- For a sample of size n, the widest 95% CI (taking 2 SEP either side) is therefore plus or minus 2*root(0.25/n) = 2*0.5/root(n) = 1/root(n).
- So, if the excess is m cases out of n, the excess proportion m/n needs to be at least 1/root(n).
- We can multiply both sides by n to get m >= n/root(n) = root(n), which is the size of Brian Cox's thumb.
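As a sanity check on the derivation, here is a small Python sketch that works out the worst-case excess count (n times 2 SEP at p=0.5) and sets it beside the square root of n. The function name and the sample sizes are just my choices for illustration.

```python
import math

def worst_case_count(n, z=2.0):
    """Excess count needed at p = 0.5: n multiplied by z * SEP."""
    sep = math.sqrt(0.25 / n)  # SEP at p = 0.5, where p(1-p) = 0.25
    return n * z * sep

# Compare the worst-case threshold with root(n) for a few sample sizes
for n in (25, 50, 100, 400, 1000):
    print(n, round(worst_case_count(n), 2), round(math.sqrt(n), 2))
```

The two columns should agree for every n, which is all the rule of thumb claims.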
So, for a sample size of 100, we're looking for a difference larger than 10, the square root of 100; for a sample of 50, a difference greater than about 7 (the square root of 50 is roughly 7.07).
This rule uses the 'worst case' of a proportion at or near 50%, when the 95% CI is at its widest, so sometimes a smaller difference will be statistically significant.
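To see how much slack the worst case leaves, this Python sketch computes the excess count needed at a few different proportions for a sample of 100, using the same 2 SEP approximation. The function name and the proportions chosen are my own, for illustration only.

```python
import math

def excess_needed(p, n, z=2.0):
    """Excess count marking the edge of the approximate 95% CI at proportion p."""
    return n * z * math.sqrt(p * (1 - p) / n)

n = 100
print("rule of thumb:", math.sqrt(n))
for p in (0.5, 0.3, 0.1, 0.05):
    print(f"p = {p}: excess needed is about {excess_needed(p, n):.1f}")
```

At p=0.1 the threshold is about 6 rather than 10, so the rule of thumb errs on the cautious side away from 50%. Bear in mind that the normal approximation behind the SEP gets shaky for proportions very close to 0 or 1 in small samples.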