In the plot above, the black circles are data for the probability of infection with COVID, as a function of the time exposed to an infected person. The data are from the NHS app many of us in the UK used during part of the pandemic, and were analysed by people at the UK Health Security Agency (HSA) and Oxford University. The analysis was published by Ferretti et al. The orange line is a power law fit to the data, with an exponent (= slope on this log-log plot) of 0.47. It is a decent but not perfect fit to the data. The fit can be viewed as a purely empirical function that just describes the data. But it can also be justified after the fact: if you combine the standard Wells-Riley model for disease transmission with a power law distribution of transmission rates, then you recover this power law.
From the fit you can back out an estimate for the distribution of transmission rates r: it is a power law, such that the probability of the rate of transmission being r is p(r) ~ 1/r^1.47 (the exponent is one plus the 0.47 of the fit). Interesting, but can we predict the distribution of rates?
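As a rough numerical check of that link, the sketch below averages the Wells-Riley probability 1 - exp(-r t) over rates weighted by 1/r^1.47 and measures the log-log slope of the result. The cutoffs r_min and r_max, the grid size and the range of times are my own assumptions, chosen only to make the integral tractable, and the result is left un-normalised because only the slope matters here.

```python
import numpy as np

def unnormalised_infection_probability(t, alpha=0.47, r_min=1e-8, r_max=1e8, n=4001):
    """Integral of r**-(1 + alpha) * (1 - exp(-r * t)) dr between assumed cutoffs."""
    log_r = np.linspace(np.log(r_min), np.log(r_max), n)
    r = np.exp(log_r)
    # Integrate in log r: dr = r dlog(r), so the weight becomes r**-alpha.
    integrand = r ** (-alpha) * (-np.expm1(-r * t))
    # Trapezoid rule written out by hand.
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(log_r))

times = np.logspace(-2, 2, 9)  # arbitrary time units
probs = np.array([unnormalised_infection_probability(t) for t in times])

# Slope on a log-log plot; this should come out close to alpha = 0.47.
slope = np.polyfit(np.log(times), np.log(probs), 1)[0]
print(f"log-log slope: {slope:.3f}")
```

The slope only matches the exponent for times well inside the window set by the two cutoffs; outside that window the curve bends away from a pure power law.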
No is the short answer – airborne disease transmission is very complex. But we can have a go.
We start by assuming the rate of transmission of COVID is proportional both to the amount of virus the infected person has in their system, and to the rate at which you breathe in air they have exhaled:
transmission rate ∝ amount of virus exhaled by infected person × fraction of air you breathe in that infected person breathed out
Unfortunately we can’t measure either of these two things. But we do have proxies for both.
A technique called PCR (Polymerase Chain Reaction) can measure how many copies of a gene of the virus there are in a person’s saliva. This is not the same as the amount of infectious virus in the saliva, and does not account for how much of this makes it into droplets that the infected person exhales. But it is a start, and we have data.
We also have a proxy for the rate at which you inhale infected air during time spent sharing a room with an infected person. This proxy is the excess amount of carbon dioxide (CO2) in the room. Here excess means above the level in the Earth’s atmosphere, which is about 400 ppm (parts per million). The additional CO2 in a room comes from people’s breath*: our metabolism means we breathe in oxygen and breathe out carbon dioxide, at concentrations around 40,000 ppm, which is 100 times that in the atmosphere.
So the fraction of second-hand air in a room can be estimated by just subtracting 400 ppm from the measured CO2 level, and dividing the result by 40,000 ppm. We can use this as a proxy for the fraction of air inhaled that comes from an infected person. And we have data for the fraction of CO2 in air – see previous post.
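The arithmetic can be written as a small function. This is a minimal sketch using the two round numbers quoted above; the function name and the example reading of 1200 ppm are hypothetical, and clipping readings below 400 ppm to zero is my own assumption.

```python
OUTDOOR_CO2_PPM = 400.0     # approximate atmospheric level quoted in the text
EXHALED_CO2_PPM = 40_000.0  # approximate level in exhaled breath quoted in the text

def second_hand_air_fraction(co2_ppm):
    """Estimate the fraction of room air that has already been breathed out."""
    # Readings at or below the outdoor level are treated as zero excess (assumption).
    excess = max(co2_ppm - OUTDOOR_CO2_PPM, 0.0)
    return excess / EXHALED_CO2_PPM

# Hypothetical reading of 1200 ppm: 800 / 40,000 = 0.02, i.e. about 2% second-hand air.
print(second_hand_air_fraction(1200.0))
```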
So we can multiply our proxy for amount of virus by our proxy for the fraction of second-hand air, and get something that should be proportional to the transmission rate. We can do this thousands of times using different measurements of the virus, and of CO2. Following the standard Wells-Riley model of transmission, we assume the time to infection is exponentially distributed, with a timescale given by one over the rate. As the equation above is just proportional to, not equal to, we obtain the distribution of infection times only up to a timescale – which we can’t estimate.
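The procedure can be sketched in a few lines. To be clear, the inputs below are synthetic placeholders drawn from made-up lognormal distributions, not the PCR or CO2 measurements used for the plot, and all variable names are my own. The sketch only shows the mechanics: multiply the two proxies, then average the exponential infection probability over the resulting spread of rates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# PLACEHOLDER inputs (assumptions for illustration, not real measurements):
# gene copies from PCR, and CO2 readings in ppm.
viral_copies = 10 ** rng.normal(6.0, 1.5, n)
co2_ppm = 400.0 + rng.lognormal(np.log(400.0), 0.6, n)

# Proxy for the fraction of inhaled air that is second hand.
air_fraction = (co2_ppm - 400.0) / 40_000.0

# Product of the proxies is proportional to the transmission rate.
relative_rate = viral_copies * air_fraction
# The absolute scale is unknown, so rescale to a median of one (arbitrary choice).
relative_rate /= np.median(relative_rate)

def infection_probability(t):
    """Average of 1 - exp(-rate * t) over the sampled rates."""
    return np.mean(-np.expm1(-relative_rate * t))

times = np.logspace(-3, 1, 9)  # in units of the unknown timescale
curve = np.array([infection_probability(t) for t in times])
print(curve)
```

Because the rates are only known up to a constant factor, the time axis here is in arbitrary units, so the curve can be slid along a log time axis but its shape is fixed by the spread of the rates.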
But we can get the shape of the distribution and so get the functional form of the probability of infection as a function of time. This is the blue curve above, which I have scaled along the x-axis so it overlaps with the data (black circles). In effect there is one fit parameter.
The blue curve is not a great one-parameter fit to the data from the NHS app. It overestimates how rapidly the infection probability varies with the duration of the contact between the susceptible person and the infected person.
But it is a lot better than the standard Wells-Riley model, which assumes that the transmission rate is the same for every contact. The standard Wells-Riley model gives the green line. As the probability of transmission is mostly much less than one, the exponential behaviour of the Wells-Riley model reduces to a line of slope 1. Note that this is an exponent of 1, while the best fit power law exponent was 0.47, about half as much.
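The slope of 1 follows because 1 - exp(-r t) is approximately r t when r t is much less than one. A quick numerical check, with an arbitrary rate and arbitrary time units that I have chosen only for illustration:

```python
import numpy as np

rate = 1.0                       # arbitrary single transmission rate (assumption)
times = np.logspace(-4, -2, 21)  # chosen so that rate * time is much less than one
prob = -np.expm1(-rate * times)  # 1 - exp(-rate * t), computed accurately for small arguments

# Slope on a log-log plot; this should be very close to 1.
slope = np.polyfit(np.log(times), np.log(prob), 1)[0]
print(f"log-log slope: {slope:.3f}")
```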
In other words, the standard Wells-Riley model with one rate greatly overestimates how rapidly the probability of infection increases with the duration of the contact, while our crude model with variable rates only overestimates the increase a bit.
So we have made progress. The model with variable rates is better. And given how simple it is, it is not surprising that it is far from perfect. Hopefully our model captures some part of the complexity of COVID transmission.
* It is mostly true that indoors we are the dominant source of CO2; the main exception is in kitchens with gas cookers, which produce a lot of CO2.