Why the distribution of adult heights doesn't go negative
Leonard Henry Courtney may have said that “There are three kinds of lies: lies, damned lies and Statistics”, but who complains when Statisticians lie to themselves? Quetelet fitted the normal distribution to height, and it has since become a staple of introductory Statistics textbooks. Subsequently Perlman [1] and other students at the University of Dortmund asked when they were likely to meet someone of negative height, i.e. realizability is broken because the normal distribution ranges from minus infinity to plus infinity. Reductio ad absurdum or not?
Assuming that the normal distribution is an approximation to a binomial distribution, is realizability broken in the underlying distribution? That binomial distribution can be derived from a coin tossing game in which the stake returned at the n'th toss, S(n), depends on the stake returned at the previous toss. With probability q (= 1 - p) the payout is
S(n) = S(n-1) + γ
and with probability p the payout is
S(n) = S(n-1) + δ
The binomial distribution tends, by the de Moivre-Laplace theorem, to the normal distribution
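S(n) ~ S(0) + n γ + (δ - γ) N(np, npq) = N(µ, σ^2)
where µ = S(0) + n γ + (δ - γ) np and σ^2 = (δ - γ)^2 npq. Hence the lower bound of the binomial, reached when every toss pays γ (taking δ > γ), can be expressed as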
min(S(n)) = S(0) + n γ = µ - σ sqrt(np/q)
and the upper bound as
max(S(n)) = S(0) + n δ = µ + σ sqrt(nq/p)
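Multiplying the two distances from the mean eliminates the probability p and leaves a relationship between the extrema and n. Since n tosses generate 2^n possible sequences, n is taken to be the logarithm base 2 of the population size.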
n σ^2 = (max(S(n)) - µ) (µ - min(S(n)))
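The relationship above can be checked numerically without the normal approximation. The following is a minimal sketch in Python; the function name game_summary and the parameter values are illustrative assumptions, not taken from the references. It computes the exact mean, variance and extremes of S(n) from the binomial weights and confirms both bounds and their product.

```python
from math import comb, isclose, sqrt


def game_summary(s0, gamma, delta, p, n):
    """Exact mean, variance and extremes of S(n) for the coin tossing game."""
    q = 1.0 - p
    # S(n) = s0 + n*gamma + (delta - gamma)*k, where k counts the delta payouts
    outcomes = [s0 + n * gamma + (delta - gamma) * k for k in range(n + 1)]
    weights = [comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)]
    mu = sum(w * s for w, s in zip(weights, outcomes))
    var = sum(w * (s - mu) ** 2 for w, s in zip(weights, outcomes))
    return mu, var, min(outcomes), max(outcomes)


# Illustrative values only (assumptions, not data from the text)
s0, gamma, delta, p, n = 1.0, 0.01, 0.05, 0.3, 25
q = 1.0 - p

mu, var, lowest, highest = game_summary(s0, gamma, delta, p, n)
sigma = sqrt(var)

# Lower and upper bounds expressed through mu and sigma
print(isclose(lowest, mu - sigma * sqrt(n * p / q)))
print(isclose(highest, mu + sigma * sqrt(n * q / p)))
# Product of the two distances from the mean equals n * sigma^2
print(isclose(n * var, (highest - mu) * (mu - lowest)))
```

Because the identity is exact for the binomial, it holds for any choice of p, which is why p drops out of the final relationship.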
Matthews [2] describes John William 'Bud' Rogan (1860s - 1905) of Tennessee, who was 2.67 metres tall. Given a mean of 1.70 m and a standard deviation of 0.07 m, this makes him more than 13 standard deviations taller than the average American man of his time. In 1900 the US population was 76.3 million, roughly half of them male, so n is about 25. With the maximum more than 13 standard deviations above the mean, the last equation puts the minimum less than 2 standard deviations below it. Given that the mean height is approximately equal to 24 standard deviations, either the minimum height is greater than zero and the Dortmund students are wrong, or this argument is flawed.
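The arithmetic in this paragraph can be reproduced in a few lines. This is a sketch under stated assumptions: the variable names are illustrative, and halving the 1900 population to estimate the number of males is an assumption made here, not a figure from the references.

```python
from math import log2

MEAN_M, SD_M = 1.70, 0.07        # mean and standard deviation quoted in the text
ROGAN_M = 2.67                   # Mr Rogan's height quoted in the text
US_POPULATION_1900 = 76.3e6      # population quoted in the text

# Mr Rogan's height in standard deviations above the mean
z_max = (ROGAN_M - MEAN_M) / SD_M

# n as the base 2 logarithm of the population size
n_everyone = log2(US_POPULATION_1900)
n_men = log2(US_POPULATION_1900 / 2)   # assumption: about half were male

# n sigma^2 = (max - mu)(mu - min), so (mu - min)/sigma = n / z_max
n = 25                                 # rounded value used in the text
z_min = n / z_max
min_height_m = MEAN_M - z_min * SD_M

# Standard deviations between zero height and the mean
z_zero = MEAN_M / SD_M

print(f"z_max = {z_max:.2f}, n (all) = {n_everyone:.2f}, n (men) = {n_men:.2f}")
print(f"z_min = {z_min:.2f}, implied minimum height = {min_height_m:.2f} m")
print(f"zero height lies {z_zero:.1f} standard deviations below the mean")
```

The implied minimum sits far closer to the mean than zero height does, so the bounded binomial never gets near a negative value for a population of this size.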
Matthews argued that the normal distribution isn't appropriate for height because it would make Mr Rogan one man in 10^44. A contemporary of Mr Rogan was Charles Sherwood Stratton (1838 – 1883) who, at 0.64 m tall, was 15 standard deviations shorter than the mean, or a man in underflow: his tail probability comes out as zero in a double precision calculation with the error function. To accommodate both gentlemen would require a US adult male population greater than 10^58 people, according to the last equation, given the low value of the variance. Under both arguments the extremes indicate that height is not normally distributed because the tails of the normal distribution are not fat enough, rather than because of any breach of realizability.
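Both claims in this paragraph can be explored with Python's standard math module. The sketch below is illustrative and its function names are assumptions. It contrasts a tail probability computed as 1 - Φ(z) through erf, which collapses to zero in double precision, with the same quantity computed through the complementary error function erfc, and then evaluates the population size 2^n implied by the last equation.

```python
from math import erf, erfc, log10, sqrt

MEAN_M, SD_M = 1.70, 0.07            # mean and standard deviation quoted in the text
ROGAN_M, STRATTON_M = 2.67, 0.64     # heights quoted in the text

z_tall = (ROGAN_M - MEAN_M) / SD_M       # standard deviations above the mean
z_short = (MEAN_M - STRATTON_M) / SD_M   # standard deviations below the mean


def tail_via_erf(z):
    """P(Z > z) as 1 - Phi(z); erf(z / sqrt(2)) rounds to 1 for large z."""
    return 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))


def tail_via_erfc(z):
    """P(Z > z) through erfc, which keeps the tiny tail value."""
    return 0.5 * erfc(z / sqrt(2.0))


def log10_population(z_above, z_below):
    """log10 of 2^n, with n = z_above * z_below from n sigma^2 = (max - mu)(mu - min)."""
    return z_above * z_below * log10(2.0)


print("tail via erf :", tail_via_erf(z_short))
print("tail via erfc:", tail_via_erfc(z_short))
print("log10 population, rounded 13 and 15:", log10_population(13, 15))
print("log10 population, unrounded        :", log10_population(z_tall, z_short))
```

Rounding the two deviations down to 13 and 15 gives a population above 10^58, in line with the figure in the text; the unrounded deviations give a larger value still.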
[1] P. Perlman, When will we see people of negative height?, Significance 10 (1) (2013) 46–48.
[2] R. Matthews, Chancing It: The Laws of Chance and How They Can Work for You, Profile Books, London, 2016.