The Infinite Monkey Rule

I always learn things from listening to the radio. On 'The Infinite Monkey Cage', I heard Brian Cox rattle off a 'rule of thumb', which goes something like:

A number plus or minus the square root of the sample size is consistent with random sampling error. (listen from 20 minutes in)

I hadn't heard that one, but deploying those key research tools, the back of an envelope and a pencil, I could see where it comes from.

I hope I am allowed to paraphrase Brian's rule:

If the excess (or deficit) number of cases meeting a criterion is larger than the square root of the sample size, it's statistically significant.

The way I'd normally look for a statistically significant difference in a proportion is to calculate the standard error of proportion (SEP) and use twice that (strictly, 1.96 times) to find the 95% confidence interval (CI). For a percentage of 50% (proportion = 0.5) in a sample of 100, the SEP is 5%, so twice the SEP works out as exactly 10 percentage points. Suppose we toss a coin a hundred times: we'd expect 50 heads and 50 tails, but we would accept 10 either way, that is, anything from 40 to 60 heads. Anything outside that range suggests something unusual has happened -- a biased coin, a biased recording method -- but maybe just an unusual run of results.

[Figure: a scale from 0% to 100% for p = 50%, n = 100, marking SEP = 5%, 2 x SEP = 10%, and the 95% CI of ±10% running from 40% to 60%.]
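For anyone who would rather check the coin-toss numbers with a few lines of code than an envelope, here is a minimal sketch. The function names sep and ci95 are my own, made up for illustration, and the interval is the usual normal approximation rather than anything exact.

```python
import math

def sep(p, n):
    """Standard error of a proportion p observed in a sample of size n."""
    return math.sqrt(p * (1 - p) / n)

def ci95(p, n, z=1.96):
    """Approximate 95% confidence interval (normal approximation)."""
    half_width = z * sep(p, n)
    return p - half_width, p + half_width

# The coin-toss example: p = 0.5, n = 100
se = sep(0.5, 100)
low, high = ci95(0.5, 100)
print(f"SEP = {se:.3f}")
print(f"95% CI = {low:.3f} to {high:.3f}")
```

With the multiplier at 1.96 the interval comes out a shade narrower than 40% to 60%; passing z=2 gives the round numbers used in the text.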

Anyhow, to get to the 'square root of the sample size', we can derive it from the formula:

  • For a proportion p seen in a sample of size n, SEP = root(p(1-p)/n).
  • The SEP is highest for p=0.5 (p(1-p)=0.25) and shrinks as p moves towards 0 or 1; at p=0.1, for example, p(1-p)=0.09.
  • For a sample of size n, the widest 95% CI is therefore ±2*root(0.25/n) = ±2*0.5/root(n) = ±1/root(n).
  • So, if the excess proportion is expressed as a fraction m/n, where m is the excess number of cases, this needs to be at least 1/root(n).
  • We can multiply both sides by n to get m >= n/root(n) = root(n), which is the size of Brian Cox's thumb.
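The algebra in the steps above can be checked numerically. This is only a sketch: worst_case_margin_count is a hypothetical helper name, and it uses the rounded multiplier of 2 rather than 1.96, as the derivation does.

```python
import math

def worst_case_margin_count(n):
    """Twice the SEP at p = 0.5, converted from a proportion to a count."""
    margin_as_proportion = 2 * math.sqrt(0.25 / n)  # equals 1/root(n)
    return margin_as_proportion * n                 # equals root(n)

# Compare the worst-case margin with the square root of the sample size
for n in (25, 50, 100, 400, 10000):
    print(n, round(worst_case_margin_count(n), 3), round(math.sqrt(n), 3))
```

The two columns should agree for every n, which is all the rule of thumb claims.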

So, for a sample size of 100, we're looking for a difference larger than 10, the square root of 100; for a sample of 50, a difference greater than 7.
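As a cross-check that does not lean on the normal approximation, the sketch below adds up exact binomial probabilities for a fair coin to see how often chance alone produces an excess or deficit bigger than the square root of the sample size. The function name prob_beyond_root_n is an assumption of mine, and math.comb needs Python 3.8 or later.

```python
import math

def prob_beyond_root_n(n, p=0.5):
    """Exact probability that the count differs from its expected value n*p
    by more than root(n), under a binomial model."""
    expected = n * p
    threshold = math.sqrt(n)
    total = 0.0
    for k in range(n + 1):
        if abs(k - expected) > threshold:
            total += math.comb(n, k) * p**k * (1 - p)**(n - k)
    return total

for n in (50, 100):
    print(n, round(prob_beyond_root_n(n), 4))
```

If the rule is doing its job, these probabilities should come out somewhat under 5%.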

This rule uses the 'worst case' of a proportion at or near 50%, when the 95% CI is at its widest, so sometimes a smaller difference will be statistically significant.
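To see how much slack the worst case builds in, this sketch works out the 2 x SEP margin as a count for a few proportions in a sample of 100. The name margin_count and the proportions chosen are mine, picked purely for illustration.

```python
import math

def margin_count(p, n, z=2.0):
    """z times the SEP for proportion p, expressed as a number of cases."""
    return z * math.sqrt(p * (1 - p) / n) * n

n = 100
for p in (0.5, 0.3, 0.1):
    print(f"p = {p}: margin = {margin_count(p, n):.1f}, root(n) = {math.sqrt(n):.1f}")
```

At p = 0.1 the margin is about 6 cases rather than 10, so a difference of 7 or 8 would pass the usual test while failing the rule of thumb.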



