# All models are wrong, unless you have limited data, in which case most models are right

I am very fond of the quote by the statistician George Box: “essentially, all models are wrong, but some are useful”. It is true: all our models are wrong if you look carefully enough. For example, Newton’s model of gravity is pretty good for calculating our orbit around the Sun, but if you look closely you will see it is wrong. Careful measurements and general relativity have told us that it is an approximation. The same thing applies to pretty much every model we have.

This might make you think that the main problem scientists have is finding a model that describes their data, but often we have the opposite problem. We have several different models that are all consistent with the data. We are spoilt for choice.

This is because our data is usually limited. For example, to see the deviations from Newton’s model of gravity due to general relativity, we would have to measure the motion of the Earth very accurately. A less accurate measurement would be consistent with both Newton and relativity.
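To make this concrete, here is a toy sketch in Python. Everything in it is made up for illustration: `model_a` and `model_b` are hypothetical stand-ins for a simple model and one with a small correction, the "measurements" are invented numbers rather than real data, and "consistent" is taken to mean that every point lies within two error bars of the model, which is only one of several reasonable choices.

```python
# Toy illustration only: the models and numbers below are invented,
# they are not real measurements of anything.

def model_a(x):
    """Hypothetical simpler model (plays the role of the approximation)."""
    return 2.0 * x

def model_b(x):
    """Hypothetical model with a small extra correction term."""
    return 2.0 * x + 0.05 * x ** 2

def is_consistent(model, xs, ys, sigma, n_sigma=2.0):
    """True if every data point lies within n_sigma error bars of the model."""
    return all(abs(y - model(x)) <= n_sigma * sigma for x, y in zip(xs, ys))

xs = [1.0, 2.0, 3.0, 4.0]
# Pretend nature follows model_b exactly; these are our "measured" values.
ys = [model_b(x) for x in xs]

def consistent_models(sigma):
    """Names of the models that agree with the data, given error bar sigma."""
    models = {"A": model_a, "B": model_b}
    return [name for name, m in models.items()
            if is_consistent(m, xs, ys, sigma)]

print(consistent_models(sigma=1.0))  # coarse measurement -> ['A', 'B']
print(consistent_models(sigma=0.1))  # precise measurement -> ['B']
```

With the large error bar both models pass, and only when the error bar shrinks does the data rule one of them out.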

Measuring the position of planets accurately is relatively easy, but crystallisation is a complex, flaky business, and so there are limits to how accurate our data on it can be. And the less accurate the data, the larger the number of models that are consistent with it. In our case, two models are both consistent with our data. I was kind of hoping just one of them would be, but that is not what the data says. I will have to listen: in science, data is the ultimate arbiter. It may be telling us that both models are a bit wrong, and a bit right.