Numbers appear to give precision, but they are often meaningless without a suitable context
“It never rains in Southern California”, the song by Albert Hammond goes. This is untrue – the average annual rainfall over the period between 1877 and 2018 was over 14 inches (37 cm). But as far as I know, nobody ever challenged Mr Hammond about this false claim.
We humans are more used to communicating with words than with numbers. When we want to express how likely we think something is or will be, for instance, chances are we don’t use percentages (excepting, perhaps, 100%) but words. Always, never, possibly, usually, more often than not, almost certainly – just a small sample of our probabilistic vocabulary.
Inevitably, that means a lack of precision, often a bit of hyperbole (as in the song), and even a risk of misinterpretation. A probability of 25% means the same to everyone, but is that also true for our human-language terms?
This is something that Michael Mauboussin, an adjunct professor of finance at Columbia Business School and an expert in how luck and skill influence investment behaviour, set out to investigate together with his son Andrew, a data scientist. They created a survey with 23 probabilistic terms, and asked people what numeric probability they would associate with each one. Unsurprisingly, as the figure shows, the interpretations varied considerably, even for the absolutes Always and Never (if everyone interpreted them the same way, each term would show as a single vertical line).
This lack of quantitative basis can lead to confusion. It’s the same story when we refer to quantities or amounts: how much money is ‘a lot’, and how many people are ‘a few’? We need context, or perspective.
A quantified perspective
In fact, we need perspective for quantitative information as well. If you heard that there were 3,000 more cases of breast cancer in Belgium in 2018 compared to 2017, would you know whether that is just noise, or a worrying trend? What if you learned that the UK cut its overall carbon emissions by 1.8 million tonnes between 2018 and 2019 – is that a lot, or a little?
There are plenty of (!) numbers in the news (as well as plenty of vague verbiage), but they often lack the context necessary to help us understand them. We need to be able to compare those numbers with something else – and not just anything else. Imagine a report stating that Microsoft’s profit in 2018 was $35 billion. (I could convert this to euros or pounds, but for most people that would make no difference).
35 billion is a very large number. If you wanted to argue that it is ‘excessive’ in some way, then you’d just leave it at that – a profit of that magnitude is obviously insane. If you wanted to pile it on, you could point out that it is equivalent to $4 million of profit per hour, or to the salaries of a million nurses, or to the cost of building 800 schools. But none of this really gives it much meaning. It’s like saying that, if every person on Earth stood shoulder to shoulder, you’d have a row that would stretch from the earth to the moon and back, five times over. So what?
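As a quick check on that rescaling (a minimal sketch; only the annual $35 billion figure comes from the article, the per-period values are simple division):

```python
# Rescaling an annual figure into per-hour and per-minute amounts.
HOURS_PER_YEAR = 365 * 24          # 8,760
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600

annual_profit = 35e9  # USD, the article's 2018 figure

print(f"${annual_profit / HOURS_PER_YEAR:,.0f} per hour")      # ≈ $4.0 million
print(f"${annual_profit / MINUTES_PER_YEAR:,.0f} per minute")  # ≈ $66,600
```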
A recent paper proposes a mechanism to provide appropriate perspective to numerical data in the media, and describes how that changes the effect numbers have on us. Researchers Pablo Barrio (a computer scientist at Columbia University), Dan Goldstein (a cognitive psychologist) and Jake Hofman (a physicist), the latter two at Microsoft Research, started from 10 perspective templates (e.g. “about X times larger than Y”, or “about X% of the Y of Z”). This allowed them to produce perspectives such as ‘one million left homeless after a storm in Honduras, about 12% of the country’s population’ (so here X=12, Y=’population’, Z=’Honduras’).
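The template idea can be sketched in a few lines. This is an illustration of the general approach, not the paper’s actual implementation, and the Honduran population figure is an assumed round number:

```python
# Fill the "about X% of the Y of Z" perspective template from a
# measurement and a reference quantity.

def percent_of(measurement: float, reference: float,
               ref_name: str, ref_owner: str) -> str:
    """Render 'about X% of the ref_name of ref_owner'."""
    pct = round(100 * measurement / reference)
    return f"about {pct}% of the {ref_name} of {ref_owner}"

# One million left homeless, against an assumed ~8.3 million Hondurans:
print(percent_of(1_000_000, 8_300_000, "population", "Honduras"))
# → "about 12% of the population of Honduras"
```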
Next, they hired 80 workers via Amazon’s Mechanical Turk platform to produce 370 actual perspectives on 64 New York Times quotes. Each worker then evaluated a selection of perspectives generated by their peers.
Finally, the researchers presented participants with 12 quotes (each either with the highest rated perspective, or without) and subsequently asked them to either recall the measurement, estimate a missing measurement, or detect whether a measurement had been manipulated. In each of these three experiments, they found that the perspectives improved comprehension. With perspectives, participants recalled about half of the numbers they saw, compared to a third without them. Similarly, 39% of estimates by participants who saw perspectives were within 10% of the original measure, compared to 33% without perspectives. And participants who had seen a perspective were better at detecting errors in most of the quotes (an average improvement of 3.2%, with a peak of 15%).
In search of the right analogy
A follow-up paper by Christopher Riederer (then a PhD candidate in computer science at Columbia University) and the same Microsoft team continues this line of thinking. Here, they consider what makes an effective perspective sentence, through randomized experiments involving geographical measures. A U.S. reader presented with the area of Pakistan (307,000 square miles, 880,000 km²) might understand this better if it were compared to an American state. But what would be best – twice the size of California, or ten times the size of South Carolina? Would a European get a better understanding with twice the size of Sweden or 45 times the size of Wales?
Participants in the first experiment were asked to estimate the size or the population of a country, using one of the U.S. states as a reference. Well-known references and simple scaling factors (e.g. twice the size of California) turned out to be more useful, even if some alternatives were more accurate (e.g. twice the size of Montana, or five times larger than Georgia).
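The trade-off the experiment exposes – familiar reference and round ratio beats raw accuracy – could be scored mechanically. A minimal sketch, in which the familiarity weights and the combined score are invented for illustration (only the state areas are real):

```python
# Among candidate reference states, prefer a familiar one whose
# scaling factor is close to a round number.

CANDIDATES = {            # state: (area in sq. miles, assumed familiarity 0-1)
    "California":     (163_696, 0.9),
    "Montana":        (147_040, 0.3),
    "South Carolina": ( 32_020, 0.4),
}

def best_comparison(target_area: float) -> str:
    def score(item):
        _, (area, familiarity) = item
        ratio = target_area / area
        simplicity = -abs(ratio - round(ratio))  # penalise non-integer ratios
        return familiarity + simplicity          # naive combined score
    name, (area, _) = max(CANDIDATES.items(), key=score)
    return f"about {round(target_area / area)} times the size of {name}"

print(best_comparison(307_000))  # Pakistan's area, in square miles
# → "about 2 times the size of California"
```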
In a second experiment, participants were randomly assigned to four groups with different perspectives, and asked to estimate the area or the population of a given country. The four perspectives were based on: (1) the best modelled error, i.e. those that performed best in the first experiment (e.g. Angola’s population is about the same size as New York’s); (2) their home state (…about 3.9 times the size of Minnesota’s); (3) the approximation with the lowest objective error (…about ten times the size of New Mexico’s); and (4) one that had a similar objective error to the first group, but a worse modelled error (…about five times Oregon’s). There was also a control group where participants were not given any perspective. All four perspectives led to significantly more accurate estimates than the control group (and the first condition, predictably, was the best of the bunch).
The final experiment gauged how long the improved comprehension provided by an effective perspective lasted. The researchers contacted the participants from the second experiment six weeks later, and asked them to estimate the size and population of the same country as in the prior experiment – but this time nobody was given a perspective. Remarkably, the improvement persisted, suggesting that perspectives can have a lasting effect on comprehension of numerical data.
A perspective for one and all?
It is easy to see how this could be widely implemented. Imagine a browser plug-in that, when you hover over a number in an online text, instantly provides a pop-up with a meaningful perspective, comparing a particular government expenditure to total tax revenue or GDP for example. It could also become a feature in document or presentation applications, suggesting perspectives to the authors as they work.
But of course, there are still many situations where we would be on our own. Picture yourself on a trip to a strange town, and you’re looking for a place to eat. You see one that looks OK, just a short walk from your hotel, with a ‘hygiene rating’ of 85 out of 100. If you got that result on your Master’s thesis, you’d be pretty chuffed. But for restaurant cleanliness, is that score good, mediocre or bad? Does it mean that the inspectors found a hair on the edge of the bin in the kitchen rather than in it? Or does it mean that there were fresh rat droppings on the worktop? We need a perspective.
As Dan Goldstein told me, what would help here is to convert the score into a percentile: regardless of the actual reasons why it didn’t score 100/100, a restaurant among the top 15% cleanest should be OK. Another way to provide a perspective is to compare it with an average McDonald’s restaurant – most people know what those are like.
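The percentile conversion itself is straightforward; what you need is the distribution of scores across all inspected restaurants. A sketch, with an entirely invented score distribution standing in for real inspection data:

```python
# Turn a raw hygiene score into a percentile against a (hypothetical)
# city-wide distribution of inspection scores.
import random

random.seed(0)
all_scores = [random.gauss(70, 12) for _ in range(1000)]  # invented data

def percentile(score: float, population: list) -> float:
    """Percentage of scores at or below `score`."""
    return 100 * sum(s <= score for s in population) / len(population)

p = percentile(85, all_scores)
print(f"85/100 puts this restaurant in the top {100 - p:.0f}% for cleanliness")
```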
In the absence of such perspectives, we need to be on our guard. What we can do is acknowledge we may not really know what a number means, and make sure that we don’t superimpose one of our own that is inappropriate. And even if a perspective is given, it’s good to be critical – but then that is good advice anyway.