Understanding how crystals start to form is tricky. We can’t see it happen because crystals start off microscopic, the process is very sensitive to pretty much every aspect of the experimental setup, and the standard theory (called classical nucleation theory) has basically zero ability to predict anything. So we are a bit stuck. But we don’t have the toughest job around. Arguably the most complex, and hardest to understand, thing around is the human body, so perhaps the toughest job belongs to medics and biomedical scientists studying diseases.

I am trying to learn some data analysis techniques used by (medical) doctors and statisticians. They use these techniques to analyse data on mortality from diseases. Not only are we both studying complex systems (crystal formation and the human body, respectively) but we are both studying an irreversible process: the formation of a crystal in our case, death in theirs. So, learning from smart medics and statisticians seems sensible.

One technique they use is applied to answering questions like: If you give patients drug X, does the rate at which they die decrease, increase, or stay the same, and if it decreases or increases, by how much? As you would expect, this is a popular question to ask of a drug. It is not as easy to answer as you might expect, as the rate at which people die is typically not a constant, with or without the drug, so you are not comparing just two numbers (as you would if both rates were constant) but two functions, which is a lot harder.

Handily, there are reasonable ways to proceed, one of which is due to a statistician called David Cox. He realised that you could not make much progress without making any assumptions at all, but that you could make progress if you assumed that when you made a change, such as prescribing a drug, the rate was scaled by the same constant factor at all times. He showed that you could estimate this factor without knowing exactly what the mortality rates were with and without the drug, and in particular without assuming either of them was constant. This goes by the not-very-catchy title of the Cox proportional hazards model, and there is a test to see if its assumption is reasonable.
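
In symbols, as a sketch in notation that is mine rather than anything standard to this blog: write h_0(t) for the rate (the "hazard") without the change, h_1(t) for the rate with it, S_0(t) and S_1(t) for the corresponding probabilities of surviving to time t, and r for the constant factor. The assumption, and what it implies for the survival probabilities, is then roughly:

```latex
h_1(t) = r\,h_0(t) \quad \text{for all } t,
\qquad
S_i(t) = \exp\!\left(-\int_0^t h_i(t')\,\mathrm{d}t'\right)
\;\Longrightarrow\;
S_1(t) = \left[S_0(t)\right]^{r}.
```

Nothing here fixes the shape of h_0(t); only the ratio r is assumed to be constant in time.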

This test is to plot the data as I have done at the top: the log of minus the log of the probability of surviving to a time *t*, as a function of the log of the time. If Cox’s assumption is correct then the curves should be parallel. The data* is for the fraction of droplets of a solution of a small amino acid (glycine) that have survived (by not crystallising) to a time *t*. The red and blue curves are for two different concentrations of glycine. The two curves are clearly not perfectly parallel but they are roughly parallel, except perhaps near the start where they almost cross.
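
As a rough sketch of how such a curve can be computed, the Python below turns a list of crystallisation times into points of ln(-ln S) against ln t. The function names `log_minus_log_curve` and `synthetic_times` are hypothetical, the times are made up purely for illustration (they are not the data from the paper), and the sketch assumes every droplet's crystallisation time is observed (no censoring). The reason to expect parallel curves is that if two rates differ by a constant factor r at all times, then ln(-ln S) for the two groups differs by the constant ln r at every time.

```python
import math

def log_minus_log_curve(event_times):
    """Return (ln t, ln(-ln S(t))) points for a list of event times.

    Assumes every time is an observed event (no censoring), so S(t) is
    the fraction of droplets that have not crystallised just after t.
    """
    times = sorted(event_times)
    n = len(times)
    points = []
    for i, t in enumerate(times, start=1):
        surviving = (n - i) / n  # fraction still uncrystallised after t
        # The last point has S = 0, where ln(-ln S) is undefined; skip it.
        if surviving > 0.0 and t > 0.0:
            points.append((math.log(t), math.log(-math.log(surviving))))
    return points

def synthetic_times(rate, n):
    """Made-up times whose empirical S(t) is exactly exp(-rate * t)."""
    times = [-math.log((n - i) / n) / rate for i in range(1, n)]
    times.append(times[-1] + 1.0)  # final event; its point is skipped
    return times

# Hypothetical illustration: two groups with constant rates 1.0 and 1.5.
slow = log_minus_log_curve(synthetic_times(1.0, 20))
fast = log_minus_log_curve(synthetic_times(1.5, 20))

# For a constant rate k, ln(-ln S) = ln k + ln t: slope 1, offset ln k,
# so the two curves are parallel and separated by ln(1.5).
offsets_slow = [y - x for x, y in slow]
offsets_fast = [y - x for x, y in fast]
```

Constant rates are used here only because they make the expected answer easy to check; the parallel-curves test itself does not need the rates to be constant.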

So, Cox’s assumption looks pretty reasonable for our data. With this assumption, I estimate that the rate at which a crystal forms is in the range 1.54 to 1.58 times faster at the higher glycine concentration of 333 mg/ml than at 326 mg/ml. In our original paper* we fitted exponentials and got a ratio of rates of 1.54, so I am basically getting the same result as we did there, just by using a method that does not assume that the fraction surviving decays exponentially with time. As the exponential fits in our paper were OK, this is not a big surprise, although it is nice to get this good agreement between two different techniques. It suggests both methods are working, which is good, and that we have a good grasp of how to measure how varying the concentration varies the rate at which crystals form.
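
For readers curious how a ratio of rates can be estimated without knowing the rates themselves, here is a minimal sketch of the idea in Python. It is not necessarily how the numbers above were computed: the function name `cox_hazard_ratio` and the example times are hypothetical, the times are made up rather than taken from the paper, and the sketch assumes every time is an observed crystallisation (no censoring, and no special treatment of an excluded first hour). It maximises Cox's partial likelihood for two groups using Newton's method, with the Breslow treatment of tied times.

```python
import math

def cox_hazard_ratio(times_a, times_b, max_iter=50, tol=1e-10):
    """Estimate the rate (hazard) ratio of group b relative to group a.

    Maximises the Cox partial likelihood for a single 0/1 covariate by
    Newton's method. Assumes every time is an observed event (no
    censoring or truncation) and uses the Breslow treatment of ties.
    """
    # One row per event: (group label, number at risk in a, number at risk in b).
    rows = []
    for label, group in ((0, times_a), (1, times_b)):
        for t in group:
            at_risk_a = sum(1 for s in times_a if s >= t)
            at_risk_b = sum(1 for s in times_b if s >= t)
            rows.append((label, at_risk_a, at_risk_b))

    beta = 0.0  # log of the hazard ratio
    for _ in range(max_iter):
        score = 0.0
        information = 0.0
        for label, at_risk_a, at_risk_b in rows:
            weight_b = at_risk_b * math.exp(beta)
            # Probability that the droplet crystallising now is from group b.
            p = weight_b / (at_risk_a + weight_b)
            score += label - p
            information += p * (1.0 - p)
        if information == 0.0:
            break
        step = score / information
        beta += step
        if abs(step) < tol:
            break
    return math.exp(beta)

# Hypothetical, made-up times (hours); not the data from the paper.
low_conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
high_conc = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ratio = cox_hazard_ratio(low_conc, high_conc)
```

The only information used is the order in which droplets from the two groups crystallise and how many of each are left at each event, which is why no functional form for the rate is needed. For real data, an established survival-analysis package would be the safer choice.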

* Data is from our paper, Little, Sear & Keddie, *Crystal Growth & Design*, 2015 (data sets G & H). As in the paper, I am neglecting crystals that form in the first hour.