I am revising a numerical physics course for the forthcoming semester, in particular the parts on data analysis. So I have been reading a couple of books, both to learn from them and to see whether they could be useful to the students. One compact but good summary is The Data Loom by Stephen Few. It is short and quite introductory, so I am thinking of recommending it to the students. It covers a lot of ground and I like the author’s practical, sceptical tone. It also has some excellent examples.
One of these illustrates how measuring the wrong thing, using the wrong metric, leads to the wrong conclusion. Stephen Few and I are, I think, of a similar mind when it comes to the misuse of poor metrics. For example, I think a lot of the metrics used by both newspapers and the UK government to assess universities do not pass the most basic test of fitness. Anyone trying to analyse data competently, for example to assess how well a university is doing, needs to start by asking basic questions such as “What questions do I want to answer with the help of the data?” and “Will what I am measuring be able to answer that question?”. All too often, even basic good practice like that is just not followed.
To back up his scepticism, Stephen Few provides an eye-opening example. He quotes from the 2010 annual report of a company called Transocean:
“As measured by these standards, we recorded the best year in safety performance in our Company’s history, which is a reflection on our commitment to achieving an incident free environment, all the time, everywhere.”
Transocean owned the oil rig Deepwater Horizon (seen on fire above). In 2010, the period to which this report refers, the Deepwater Horizon exploded, 11 men died and the fireball was visible over 60 km away. It is of course hard to square that with “best year in safety performance in our Company’s history”, unless you really are using the wrong metrics. The explosion caused what is estimated to be the largest marine oil spill in history.
A journalist called Jeff McMahon wrote an article on this in Forbes, which also quotes the Transocean annual report: “Notwithstanding the tragic loss of life in the Gulf of Mexico, we achieved an exemplary statistical safety record as measured by our total recordable incident rate and total potential severity rate.” I would say that the loss of life makes it obvious that you are measuring the wrong things if you think it is OK to claim an “exemplary statistical safety record” in these circumstances. But as Stephen Few states, and I agree, claims based on metrics that simply measure irrelevant, dumb, or downright odd things are all too common.
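For the students, a minimal Python sketch may help show how a count-based rate can point the wrong way. It assumes the common convention of quoting a total recordable incident rate as incidents per 200,000 hours worked; I do not know exactly how Transocean computed theirs. All the numbers below are invented for illustration and are not Transocean's data.

```python
def trir(recordable_incidents: int, hours_worked: float) -> float:
    """Incidents per 200,000 hours worked (assumed convention)."""
    return recordable_incidents * 200_000 / hours_worked


# Hypothetical figures, invented for illustration only.
years = {
    "quiet year": {"incidents": 40, "fatalities": 0, "hours": 10_000_000},
    "disaster year": {"incidents": 30, "fatalities": 11, "hours": 10_000_000},
}

# The rate only counts incidents; it carries no information about severity.
rates = {name: trir(y["incidents"], y["hours"]) for name, y in years.items()}

# The year with the lowest rate is the "best" year by this metric.
best_by_rate = min(rates, key=rates.get)

for name, rate in rates.items():
    print(f"{name}: rate = {rate:.2f}, fatalities = {years[name]['fatalities']}")
print("Best year by incident rate:", best_by_rate)
```

In this made-up example the year with 11 fatalities has the lower incident rate, so the metric ranks it as the safer year. The arithmetic is fine; the metric simply does not answer the question being asked.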