How to Lie With Statistics

Darrell Huff

A quick read that teaches you to recognise the all-too-common misuse of statistics. Huff shows how often published statistics are flawed and how easily findings can be manipulated.

🤓 Learning · ⭐️ 3/5 · 🕐 4 min

Samples

Always ask: is this a representative sample?

A psychiatrist reported once that practically everybody is neurotic. Aside from the fact that such use destroys any meaning in the word “neurotic,” take a look at the man’s sample. That is, whom has the psychiatrist been observing? It turns out that he has reached this edifying conclusion from studying his patients, who are a long, long way from being a sample of the population.
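A minimal sketch of why the psychiatrist's sample misleads. The town size, the true rate, and the clinic attendance are all invented for illustration; none of these numbers come from the book.

```python
# Hypothetical town of 1,000 people; every number here is invented.
population = 1000
neurotic = 100                # assumed true count: 10% of the town

# Assumed clinic attendance: most neurotic people seek help, few others do.
patients_neurotic = 80
patients_not_neurotic = 20

true_rate = neurotic / population
rate_among_patients = patients_neurotic / (patients_neurotic + patients_not_neurotic)

print(f"rate in the whole town:  {true_rate:.0%}")
print(f"rate among the patients: {rate_among_patients:.0%}")
```

The patients are selected by the very trait being measured, so the rate seen in the clinic says little about the town.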

Averages

The image below shows how many people earn a certain salary in a company. The boss might like to express the situation as “average wage $5,700”—using that deceptive mean. The mode, however, is more revealing: most common rate of pay in this business is $2,000 a year. As usual, the median tells more about the situation than any other single figure does; half the people get more than $3,000 and half get less.
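The three "averages" can be checked with Python's standard library. The payroll below is a hypothetical one, chosen only so that it reproduces the three figures quoted above; treat the individual salaries and headcounts as assumptions.

```python
from statistics import mean, median, mode

# Hypothetical payroll as (salary, headcount) pairs, picked to reproduce
# the mean, median and mode quoted in the text.
payroll = [
    (45_000, 1), (15_000, 1), (10_000, 2), (5_700, 1),
    (5_000, 3), (3_700, 4), (3_000, 1), (2_000, 12),
]
salaries = [salary for salary, count in payroll for _ in range(count)]

print("mean:  ", mean(salaries))
print("median:", median(salaries))
print("mode:  ", mode(salaries))
```

One very large salary at the top is enough to pull the mean well above what most of the 25 people earn, while the median and mode barely notice it.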

Only when there is a substantial number of trials involved is the law of averages a useful description or prediction.
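To see how much the number of trials matters, here is a small sketch using a fair coin (my choice of example, not a calculation from the book). It computes the exact chance of a lopsided result, 80% heads or more, for a short run and a longer one.

```python
from math import comb

def prob_at_least(heads, tosses):
    """Exact chance of at least `heads` heads in `tosses` fair coin tosses."""
    favourable = sum(comb(tosses, k) for k in range(heads, tosses + 1))
    return favourable / 2 ** tosses

p_small = prob_at_least(8, 10)     # 80% heads or more in 10 tosses
p_large = prob_at_least(80, 100)   # 80% heads or more in 100 tosses
print(f"10 tosses:  {p_small:.4f}")
print(f"100 tosses: {p_large:.2e}")
```

A lopsided result is quite ordinary in ten tosses and practically never happens in a hundred, which is why a handful of trials proves very little.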

Never ignore the range.

Illustrations

Never trust graphs that use illustrations. The catch, of course, is this: when a pictorial chart shows a doubled quantity by drawing the second image twice as high as the first, the image is also twice as wide. It occupies not twice but four times as much area on the page.
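The arithmetic behind the trick is simple enough to sketch. The function name is my own; the point is only that the apparent size grows with the power of the number of dimensions being scaled.

```python
def apparent_ratio(height_ratio, dimensions):
    """How much bigger a figure looks when every dimension scales with its height."""
    return height_ratio ** dimensions

data_ratio = 2  # the quantity being shown has doubled
print("plain bar (height only):", apparent_ratio(data_ratio, 1))
print("flat picture (area):", apparent_ratio(data_ratio, 2))
print("drawn as a solid object (volume):", apparent_ratio(data_ratio, 3))
```

A plain bar chart keeps the visual ratio equal to the data ratio; a scaled picture does not.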

Expressing Data

The evidence: “Four times more fatalities occur on the highways at 7 P.M. than at 7 A.M.” Now that is approximately true, but the conclusion doesn’t follow. More people are killed in the evening than in the morning simply because more people are on the highways then to be killed. You, a single driver, may be in greater danger in the evening, but there is nothing in the figures to prove it either way.

By the same kind of nonsense that the article writer used you can show that clear weather is more dangerous than foggy weather. More accidents occur in clear weather, because there is more clear weather than foggy weather. All the same, fog may be much more dangerous to drive in.
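What is missing from both claims is exposure: how much driving happens under each condition. A rough sketch with invented accident counts and driving hours (none of these figures come from the book) shows how the raw count and the rate can point in opposite directions.

```python
# Invented figures for illustration only: accident counts and hours driven.
conditions = {
    "clear": {"accidents": 900, "hours_driven": 90_000},
    "fog":   {"accidents": 100, "hours_driven": 2_000},
}

def rate_per_1000_hours(accidents, hours_driven):
    """Accidents per 1,000 hours of driving under a given condition."""
    return 1000 * accidents / hours_driven

rates = {name: rate_per_1000_hours(**c) for name, c in conditions.items()}
for name, c in conditions.items():
    print(f"{name}: {c['accidents']} accidents, "
          f"{rates[name]:.0f} per 1,000 hours driven")
```

With these made-up numbers, clear weather has nine times as many accidents but fog is five times as dangerous per hour on the road.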

There are often many ways of expressing any figure. You can, for instance, express exactly the same fact by calling it a one per cent return on sales, a fifteen per cent return on investment, a ten-million-dollar profit, an increase in profits of forty per cent (compared with 1935-39 average), or a decrease of sixty per cent from last year.
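A short sketch of how one profit figure yields all of those descriptions. Only the ten-million-dollar profit and the percentages come from the text; the sales, investment, and earlier-profit figures are hypothetical values back-solved to match them.

```python
# Base figures other than `profit` are hypothetical, back-solved so that
# the percentages quoted in the text come out.
profit = 10_000_000
sales = 1_000_000_000
investment = 66_666_667
profit_1935_39_avg = 7_142_857
profit_last_year = 25_000_000

def pct(part, whole):
    """`part` as a percentage of `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

views = {
    "return on sales": pct(profit, sales),
    "return on investment": pct(profit, investment),
    "change vs 1935-39 average": pct(profit - profit_1935_39_avg, profit_1935_39_avg),
    "change vs last year": pct(profit - profit_last_year, profit_last_year),
}
for label, value in views.items():
    print(f"{label}: {value:+.1f}%")
```

Every line describes the same year of business; the choice of denominator decides whether it sounds modest, healthy, or alarming.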

"Correlation is Not Causation."

The post hoc fallacy is the one that says that if B follows A, then A has caused B.

When two things are correlated, it is often a good deal more probable that neither has produced the other and that both are a product of some third factor.

Spurious correlation: A mathematical relationship in which two or more events or variables are associated but not causally related, due to either coincidence or the presence of a certain third, unseen factor.

For example, there is a close relationship between the salaries of Presbyterian ministers in Massachusetts and the price of rum in Havana.
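A hidden third factor can be sketched in a few lines. Here the assumed third factor is a general price level that rises over time; all three series are invented, and the two "real" series are built to be unrelated to each other.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

# Invented series: a price level (the hidden third factor) and two
# inflation-adjusted quantities with no relationship to each other.
price_level = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
real_salary = [100, 102, 99, 101, 98, 100]
real_rum = [10.0, 10.1, 10.2, 9.8, 9.9, 10.0]

nominal_salary = [s * p for s, p in zip(real_salary, price_level)]
nominal_rum = [r * p for r, p in zip(real_rum, price_level)]

r_nominal = pearson(nominal_salary, nominal_rum)
r_real = pearson(real_salary, real_rum)
print(f"in current prices:              r = {r_nominal:.2f}")
print(f"after removing the price level: r = {r_real:.2f}")
```

Both nominal series climb together only because prices in general climb; once that shared driver is divided out, the relationship disappears.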

Watch out for the general conclusion that the more you go to school the more money you’ll make.

People with Ph.D.s quite often become college teachers and so do not become members of the highest income groups.

Percentages

Percentages offer a fertile field for confusion. And like the ever-impressive decimal they can lend an aura of precision to the inexact.

Any percentage figure based on a small number of cases is likely to be misleading. It is more informative to give the figure itself.
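One way to see the problem is to ask how far the percentage moves if a single case changes. The counts below are made up for illustration.

```python
def percentage(cases, total):
    return 100 * cases / total

def swing_from_one_case(total):
    """Percentage points the figure moves if a single case changes."""
    return percentage(1, total)

for cases, total in [(1, 3), (1_000, 3_000)]:
    print(f"{cases} of {total}: {percentage(cases, total):.1f}% "
          f"(one case more or less moves it by {swing_from_one_case(total):.2f} points)")
```

Both lines report the same percentage, but only the raw counts reveal that the first one rests on a single case.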

The odd thing about percentiles is that a student with a 99-percentile rating is probably quite a bit superior to one standing at 90, while those at the 40 and 60 percentiles may be of almost equal achievement.
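The effect can be illustrated under one explicit assumption: that the underlying scores are normally distributed. The sketch converts percentile ranks back into distances measured in standard deviations (SD).

```python
from statistics import NormalDist

# Assumption: scores follow a normal (bell-shaped) distribution.
scores = NormalDist(mu=0, sigma=1)

def gap_in_sd(lower_percentile, upper_percentile):
    """Distance between two percentile ranks, in standard deviations."""
    return (scores.inv_cdf(upper_percentile / 100)
            - scores.inv_cdf(lower_percentile / 100))

gap_top = gap_in_sd(90, 99)
gap_middle = gap_in_sd(40, 60)
print(f"90th to 99th percentile: {gap_top:.2f} SD")
print(f"40th to 60th percentile: {gap_middle:.2f} SD")
```

Under that assumption the nine percentile points at the top cover about twice the distance of the twenty points in the middle, because most people are bunched near the centre.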

How to Talk Back to a Statistic

Who Says So?

Look for conscious and unconscious bias (often more dangerous).

An improper measure may be used: a mean where a median would be more informative (perhaps all too informative).

How Does He Know?

Watch out for evidence of a biased sample, one that has been selected improperly.

Is the sample large enough to permit any reliable conclusion?

With a reported correlation: Is it big enough to mean anything? Are there enough cases to add up to any significance?

What’s Missing?

Many figures lose meaning because a comparison is missing. An article in Look magazine says, in connection with Mongolism (the term then used for Down syndrome), that “one study shows that in 2,800 cases, over half of the mothers were 35 or over.” Getting any meaning from this depends upon your knowing something about the ages at which women in general produce babies.
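The missing comparison is a baseline. As a sketch, the quoted "over half" is taken as 50% and set against two hypothetical baselines for the share of all mothers aged 35 or over; the baselines are invented to show how much the reading depends on them.

```python
def over_representation(share_among_cases, share_in_general):
    """Ratio above 1 means the group is more common among cases than in general."""
    return share_among_cases / share_in_general

share_among_cases = 0.5  # "over half" in the quoted study, taken as 50%

# Hypothetical baselines: the share of all mothers who are 35 or over.
for baseline in (0.10, 0.50):
    ratio = over_representation(share_among_cases, baseline)
    print(f"baseline {baseline:.0%}: {ratio:.1f}x as common among cases")
```

The same "over half" is striking against a low baseline and means nothing against a baseline of one half, so the figure cannot be interpreted on its own.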

Sometimes it is percentages that are given and raw figures that are missing, and this can be deceptive too. Long ago, when Johns Hopkins University had just begun to admit women students, someone not particularly enamored of coeducation reported a real shocker: Thirty-three and one-third per cent of the women at Hopkins had married faculty members! The raw figures gave a clearer picture. There were three women enrolled at the time, and one of them had married a faculty man.

Did Somebody Change the Subject?

More reported cases of a disease are not always the same thing as more cases of the disease.

Saying and doing may not be the same thing at all (i.e. people often lie when questioned or surveyed).

Does It Make Sense?

The trend-to-now may be a fact, but the future trend represents no more than an educated guess.
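A small sketch of how a perfectly accurate trend-to-now can produce an impossible forecast. The adoption figures are invented, and the straight-line extrapolation is my own illustrative choice.

```python
# Invented data: share of households owning some new gadget, by year.
years = [0, 1, 2, 3, 4]
share = [10, 20, 30, 40, 50]   # percent

# Straight line through the first and last observations.
slope = (share[-1] - share[0]) / (years[-1] - years[0])

def extrapolate(year):
    """Project the observed straight-line trend to a given year."""
    return share[0] + slope * year

for year in (4, 6, 10):
    print(f"year {year}: {extrapolate(year):.0f}% of households")
```

The line fits every observed year exactly, yet by year 10 it predicts more than 100% of households, which shows that the forecast depends on an assumption that the trend continues unchanged.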

Image credits: How to Lie with Statistics by Darrell Huff