Yesterday I read a very strange quote:

2.86% of guinea pigs admitted to veterinary hospitals in the survey had been injured by karaoke machines

from Charlie Stross talking about an article in The Register.

It got me wondering what the minimum sample size would be to get 2.86%. It turns out that 1 in 35 is 2.857...%, which rounds to 2.86%, so it's possible the figure reflects just 1 case out of a survey of 35.

Here's some Python code for working out the minimum sample size that will result in a given decimal. Note that the decimal must be given as a string of the form "0.xxx" so that significant trailing zeroes count: float("0.10") and float("0.1") are the same number, but min_sample("0.1") == 7 whereas min_sample("0.10") == 10, as you would expect.

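    def min_sample(decimal):
        fl = float(decimal)
        assert 0 < fl < 1
        # number of digits after the leading "0."
        sig_digits = len(decimal) - 2
        for sample_size in range(1, (10 ** sig_digits) + 1):
            # take the nearest whole number of cases, then check that
            # cases / sample_size rounds back to the target decimal
            if round(round(fl * sample_size) / sample_size, sig_digits) == fl:
                return sample_size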

So the next time you read 22% of X or 24% of Y or 2.86% of Z, you can quickly work out that the sample size could be as low as 9, 17 or 35 respectively.
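To see which counts lie behind those sample sizes, here is a small variation on the function above that also returns the number of cases. The name `min_sample_with_count` is my own invention for illustration, and it keeps the same assumption that the input is a string of the form "0.xxx".

```python
def min_sample_with_count(decimal):
    """Return (cases, sample_size) for the smallest sample matching `decimal`.

    Hypothetical helper: same search as min_sample, but it keeps the count.
    Assumes `decimal` is a string of the form "0.xxx".
    """
    fl = float(decimal)
    assert 0 < fl < 1
    sig_digits = len(decimal) - 2  # digits after the leading "0."
    for sample_size in range(1, 10 ** sig_digits + 1):
        cases = round(fl * sample_size)  # nearest whole number of cases
        if round(cases / sample_size, sig_digits) == fl:
            return cases, sample_size


for pct in ("0.22", "0.24", "0.0286"):
    cases, size = min_sample_with_count(pct)
    print(f"{pct}: {cases} out of {size}")
```

Run under Python 3, this should report 2 out of 9, 4 out of 17 and 1 out of 35.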

P.S. I leave it as an exercise to the reader to rewrite using the decimal module and decide whether it's worth it.
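For anyone who wants something to compare their answer against, here is one possible sketch using the decimal module. The name `min_sample_decimal` is made up for this example, and the choice of ROUND_HALF_UP is an assumption about how the published figure was rounded. Python's built-in round rounds ties to even, so the two versions can disagree when a ratio lands exactly on a tie (1/8 = 0.125 is one such case).

```python
from decimal import Decimal, ROUND_HALF_UP


def min_sample_decimal(decimal):
    """Smallest sample size for which some whole count rounds to `decimal`.

    Sketch only: assumes `decimal` is a string such as "0.22" and that
    ties are rounded half up (an assumption, not a given).
    """
    target = Decimal(decimal)
    assert 0 < target < 1
    exponent = target.as_tuple().exponent    # e.g. -2 for "0.22"
    quantum = Decimal(1).scaleb(exponent)    # e.g. Decimal("0.01")
    for sample_size in range(1, 10 ** -exponent + 1):
        # nearest whole number of cases for this sample size
        cases = (target * sample_size).to_integral_value(rounding=ROUND_HALF_UP)
        # round the ratio to the same number of places as the input
        ratio = (cases / sample_size).quantize(quantum, rounding=ROUND_HALF_UP)
        if ratio == target:
            return sample_size
```

The main gain is that the precision comes from the parsed value rather than from len(decimal) - 2, and there is no exact comparison between floats. Whether that is worth the extra lines is still up to you.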

The original post was in the categories: python mathematics but I'm still in the process of migrating categories over.

The original post had **3 comments** I'm in the process of migrating over.