The 2020 Guardian University League Tables are out, and Saturday’s print edition ran with the headline “Oxford falls to third place in university rankings”. As someone who teaches data analysis, I found that a very definite statement: there is no obvious caveat to indicate how confident they are in it. This omission concerns me but, to be fair to The Guardian, they have made the 2020 league table data available for download as a spreadsheet. A fair number of the data values appear to be missing, so I turned to the 2019 league table data instead. This data set looks complete and has the same form. Each university has nine data values, and in each case the analysis assumes that bigger is better, i.e., a large value of each number indicates a good university, or good teaching, somehow*.
In 2019, The Guardian had the top three as Cambridge, Oxford, then St Andrews, while this year it is headline news that they have Cambridge, St Andrews and Oxford. They give details of how they analyse the data; in particular, they do not weight all nine numbers equally: some are weighted up to three times as heavily as others.
I do not quite understand why they chose the weights that they did, so I thought I would see what difference changing the weights would make. I downloaded the 2019 spreadsheet and quickly put together a Python Jupyter notebook that uses Pandas to calculate a ranking based on any weighted sum of these nine values. I then ran this for a large sample of random sets of nine weights, to see what they could have got if they had used different weights. Effectively, I generated a sample of 5000 league tables that use The Guardian’s data, together with sets of weights that are different from, but of similar size to, the ones they use.
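The calculation can be sketched roughly as follows. This is a minimal illustration rather than the notebook itself: the function names `rank_by_weighted_sum` and `sample_rankings`, the `spread` parameter, the standardisation of each column, the uniform perturbation of the weights and the equal base weights are all assumptions made for the example, and the scores are random placeholders, not The Guardian's data.

```python
import numpy as np
import pandas as pd


def rank_by_weighted_sum(scores: pd.DataFrame, weights: np.ndarray) -> pd.Series:
    """Rank the rows of `scores` (1 = best) by a weighted sum of their columns."""
    # Assumption: standardise each column so that no metric dominates
    # simply because of the units it is measured in.
    z = (scores - scores.mean()) / scores.std(ddof=0)
    total = z.to_numpy() @ weights
    ranks = pd.Series(total, index=scores.index).rank(ascending=False, method="first")
    return ranks.astype(int)


def sample_rankings(scores, base_weights, n_samples=5000, spread=0.5, seed=0):
    """Build `n_samples` league tables, each from a randomly perturbed weight set."""
    rng = np.random.default_rng(seed)
    base = np.asarray(base_weights, dtype=float)
    tables = {}
    for i in range(n_samples):
        # Assumption: scale each weight by a random factor in [1 - spread, 1 + spread],
        # so the sampled weights stay of similar size to the base weights.
        w = base * rng.uniform(1 - spread, 1 + spread, size=base.size)
        tables[i] = rank_by_weighted_sum(scores, w)
    # Rows are universities, columns are sampled league tables, values are ranks.
    return pd.DataFrame(tables)


# Placeholder input: five made-up institutions with nine made-up metrics.
demo_rng = np.random.default_rng(1)
scores = pd.DataFrame(
    demo_rng.normal(size=(5, 9)),
    index=[f"Uni {c}" for c in "ABCDE"],
    columns=[f"metric_{k}" for k in range(1, 10)],
)
base_weights = np.ones(9)  # hypothetical: equal weights as the starting point

tables = sample_rankings(scores, base_weights, n_samples=200)
# Share of sampled tables in which each institution comes first.
print((tables == 1).mean(axis=1))
```

With the real spreadsheet, `scores` would hold one row per university and the nine published columns, and `base_weights` would hold the published weights.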
In 2019, The Guardian put Oxford in second place, but Oxford was perhaps lucky to be there: in my set of sample league tables, Oxford was in third place 53% of the time. In other words, a slight majority of the ways of analysing the data along the lines The Guardian used last year would have put Oxford in third place. In 2019 St Andrews was in third place in their league table, but in 36% of the sampled league tables it would have made it to second.
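Percentages like these are just counts over the sample. As a small sketch in plain Python, assuming each sampled league table is stored as a list of names ordered from best to worst (the helper `rank_shares` and the toy tables are made up for illustration):

```python
from collections import Counter


def rank_shares(tables, name):
    """Fraction of league tables in which `name` holds each position (1 = top)."""
    counts = Counter(table.index(name) + 1 for table in tables)
    return {pos: n / len(tables) for pos, n in sorted(counts.items())}


# Toy input: four hypothetical league tables of three institutions each.
tables = [
    ["Uni A", "Uni B", "Uni C"],
    ["Uni A", "Uni C", "Uni B"],
    ["Uni A", "Uni C", "Uni B"],
    ["Uni B", "Uni A", "Uni C"],
]

print(rank_shares(tables, "Uni B"))  # {1: 0.25, 2: 0.25, 3: 0.5}
```

Here "Uni B" is third in half of the toy tables, which is the same kind of statement as Oxford being third in 53% of the sample.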
Finally, there is good news: over my sample of league tables, no fewer than 17 universities were in the top ten in at least one league table. That is, if many other newspapers set themselves up in competition and used the same data but with different weights, then 17 universities would be able to claim to be in the top ten. If you want to claim you are at a top-ten university, the complete list of seventeen is: Cambridge, St Andrews, Oxford, Imperial College, Durham, Glasgow, Strathclyde, Bath, Exeter, Dundee, London School of Economics, Bristol, Warwick, Leeds, UCL, Birmingham, Loughborough.
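Collecting that list amounts to taking the union of the top places across every sampled table. A minimal sketch, again assuming each table is a list of names ordered from best to worst, with a made-up helper `ever_in_top` and toy tables:

```python
def ever_in_top(tables, n=10):
    """Names that appear in the first `n` places of at least one league table."""
    seen = set()
    for table in tables:
        seen.update(table[:n])
    return seen


# Toy input: four hypothetical league tables of three institutions each.
tables = [
    ["Uni A", "Uni B", "Uni C"],
    ["Uni A", "Uni C", "Uni B"],
    ["Uni A", "Uni C", "Uni B"],
    ["Uni B", "Uni A", "Uni C"],
]

print(sorted(ever_in_top(tables, n=2)))  # ['Uni A', 'Uni B', 'Uni C']
```

Even in this toy case, three institutions can each claim a top-two finish, because the top two places are not filled by the same pair in every table.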
* In this blog post I don’t consider this assumption, although I should say that I think it is a bit of a stretch. I also don’t consider the fact that there are substantial statistical uncertainties in these numbers, which are not given by The Guardian.