The semester is starting to loom large: lessons start in four weeks. I will be teaching several things, including biological physics, a course where I also try to blend in some learning of estimation. Of course, for biological physics I currently have no shortage of real-world examples. So let’s look at one that involves some estimation. Question: If Guildford’s nightclub *Casino* is full to capacity, what is the probability* that *none* of the patrons are asymptomatic carriers of COVID-19? *Casino’s* capacity is 1,500. ONS data on the fraction of the population that are infected is here. A reasonable estimate for the fraction of infected people who do not know they have COVID-19 is somewhere in the range of, say, one in five to one in three.

Answer: If I look at the ONS data, then at the time of writing (28th August 2021) they estimate the infection rates to be 3.5% for ages 12 to 24 years, and 1% for ages 25 to 34. I assume that most patrons will be 18 to 30, although I am 50 so am guessing here; most of my students should have a better idea of the demographics of *Casino* than I. So, to start with, let’s roughly split the difference and assume a 2% infection rate for the demographics from which *Casino* patrons come. If I also split the difference between one in five and one in three, that is one in four, i.e., a quarter. Then, for those that are unaware they have COVID-19, we have 2% divided by four, which is 0.5%. So, we have ended up assuming that of the relevant demographic, 0.5% have COVID-19 but are unaware of this, and so may go to *Casino*.

Then if there is one person in *Casino*, the probability that they don’t have COVID-19 without knowing it is 100% − 0.5% = 99.5% or 0.995**. If there are two people in *Casino*, and we assume that the probabilities that the two people have COVID-19 and are unaware of this are independent, then we can multiply these two probabilities together. It is a basic law of probability that you can multiply the probabilities of two *independent* things together to get the probability that they are both true. So then the probability that neither of the two people in *Casino* has COVID-19 is 0.995^{2} = 0.990025.

If we continue to assume independence, then for a full *Casino* the probability that nobody is infected with COVID-19 and unaware of it is 0.995^{1500} ≈ 0.0005. This is a chance of about one in 2,000, i.e., we estimate that roughly 1,999 times out of 2,000, at least one person in a full *Casino* will have COVID-19. In simple terms, with that many people, if: 1) the infected-but-unaware rate is anywhere near 0.5%, and 2) the probabilities are indeed independent, then it is very very likely that at least one of the 1,500 patrons is infected***.
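The arithmetic above is easy to check numerically. Here is a minimal Python sketch, assuming the 0.5% infected-but-unaware rate and independence between patrons used in the text; the function name `prob_nobody_infected` is just an illustrative choice.

```python
def prob_nobody_infected(p_unaware: float, n_people: int) -> float:
    """Probability that none of n_people independent patrons is infected-but-unaware."""
    return (1.0 - p_unaware) ** n_people


# Values assumed in the text: 2% infected, one in four of those unaware.
p_unaware = 0.02 / 4   # 0.5% per person
capacity = 1500        # Casino's capacity

p_two = prob_nobody_infected(p_unaware, 2)
p_full = prob_nobody_infected(p_unaware, capacity)

print(f"Two patrons, neither infected: {p_two:.6f}")
print(f"Full club, nobody infected:    {p_full:.4f}")
print(f"Full club, at least one:       {1 - p_full:.2%}")
```

Changing `p_unaware` or `capacity` shows how quickly the "nobody infected" probability collapses as the crowd grows.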

Making an estimate is often the easier bit; it is harder to estimate how accurate this estimate is. But we can try. There are two assumptions, 1) and 2) in the paragraph above. Assessing the effect of 1) is pretty easy. For example, if instead the probability of infection is at the low end, at 1%, and only one in five is unaware, then the probability that a single person is infected but unaware is 0.2%, so the probability that they are not is 99.8% or 0.998. Then if we still assume independent probabilities that people are infected, the probability that nobody in a full *Casino* is infected is 0.998^{1500} = 0.05, or 5%. This is a lot higher but still small.
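To see how sensitive the answer is to assumption 1), the two sets of numbers can be put side by side. This is only a sketch: the helper names `unaware_rate` and `prob_nobody_infected` are my own, and independence between patrons is still assumed.

```python
def unaware_rate(infection_rate: float, unaware_fraction: float) -> float:
    """Fraction of the population that is infected and does not know it."""
    return infection_rate * unaware_fraction


def prob_nobody_infected(p_unaware: float, n_people: int) -> float:
    """Probability that none of n_people independent patrons is infected-but-unaware."""
    return (1.0 - p_unaware) ** n_people


capacity = 1500

# The two scenarios discussed in the text.
scenarios = {
    "central (2%, one in four)": unaware_rate(0.02, 1 / 4),
    "low end (1%, one in five)": unaware_rate(0.01, 1 / 5),
}

results = {name: prob_nobody_infected(p, capacity) for name, p in scenarios.items()}

for name, p_none in results.items():
    print(f"{name}: P(nobody infected) = {p_none:.4f}")
```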

So reasonable values for assumption 1) give probabilities that *Casino* has at least one infected patron of around 95% to very close to 100%. Assumption 2) is trickier to assess. But we can test it a bit. For example, instead of 1500 independent *Casino* patrons, we can make a simple guess that they are, say, 750 couples, and that in each couple either both are infected or neither is (transmission inside a household is very common). Of course this is a pretty extreme example, but it is simple and allows us to do a simple calculation. If infection is by couples then the probability that a couple is not infected is still 99.5%, but now the 1500 patrons at capacity are not 1500 independent people but 750 independent couples****, so the probability that none are infected is 0.995^{750} = 0.02, i.e., 2%.

So if the patrons are not all independent of each other but are groups of housemates or partners where the probabilities of members of the group being infected are not independent, then this increases the probability of nobody in a full *Casino* having COVID-19. Indeed, if you assume the low infected-but-unaware rate of 0.2% (so a probability of 0.998 that a given couple is not infected), and 750 perfectly correlated couples, then the probability that at least one person is infected drops to 78%.
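The couples calculation only changes the number of independent units in the exponent. A short sketch, assuming perfectly correlated groups of equal size (the function name and the `group_size` parameter are illustrative, not from any library):

```python
def prob_nobody_infected_grouped(p_unaware: float, n_people: int, group_size: int = 1) -> float:
    """Probability nobody is infected when patrons come in perfectly correlated groups.

    Each group is treated as one independent unit that is either entirely
    infected (probability p_unaware) or entirely uninfected.
    """
    if n_people % group_size != 0:
        raise ValueError("n_people must be a multiple of group_size")
    n_groups = n_people // group_size
    return (1.0 - p_unaware) ** n_groups


capacity = 1500

# 750 couples with the central 0.5% rate, and with the low-end 0.2% rate.
couples_central = prob_nobody_infected_grouped(0.005, capacity, group_size=2)
couples_low = prob_nobody_infected_grouped(0.002, capacity, group_size=2)

print(f"Couples, 0.5% rate: P(nobody infected)    = {couples_central:.3f}")
print(f"Couples, 0.2% rate: P(at least one infected) = {1 - couples_low:.2f}")
```

Setting `group_size=1` recovers the fully independent case, so the same function covers both ends of assumption 2).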

So, we end up with a rough estimate for the probability that a full *Casino* has at least one infected person in it. This probability is estimated to be in the range around 80% to almost 100%. The exact value of the probability depends on exactly what we assume for the fraction that are infected and what fraction of the 1500 are in groups, and how correlated their infection probabilities are.
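The quoted range can be reproduced by looping over the four corner cases considered above: two infected-but-unaware rates, and either independent individuals or perfectly correlated couples. This is a sketch under exactly those assumptions, with variable names of my own choosing.

```python
from itertools import product

capacity = 1500

# Infected-but-unaware rates and group sizes used in the text.
rates = {"central (0.5%)": 0.005, "low (0.2%)": 0.002}
group_sizes = {"independent": 1, "couples": 2}

at_least_one = {}
for (rate_name, p), (group_name, size) in product(rates.items(), group_sizes.items()):
    n_independent_units = capacity // size
    p_none = (1.0 - p) ** n_independent_units
    at_least_one[(rate_name, group_name)] = 1.0 - p_none

for (rate_name, group_name), prob in at_least_one.items():
    print(f"{rate_name:>15}, {group_name:<11}: P(at least one infected) = {prob:.1%}")

lowest = min(at_least_one.values())
highest = max(at_least_one.values())
print(f"Range: {lowest:.0%} to {highest:.2%}")
```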

I hope this example is useful. It should be a real-world example for those students who enjoy Guildford’s nightlife. And estimation is a very useful skill, both within and outside of science.

* This webpage goes over elementary ideas of probability theory, and this webpage is a cheat sheet of handy rules of probability, if you need that. There is also a cute demonstration of conditional probabilities, i.e., probabilities that are not independent, here. The algebra of probabilities is pretty simple but the ideas can be counterintuitive, so if you don’t follow them, have a read of a few webpages like the two above to help you get your head around them.

** As someone either has COVID-19 and is unaware of it, or does not, the two probabilities must add up to one. So the probability that a person is not infected-with-COVID-19-and-unaware is one minus the probability that they are infected-with-COVID-19-and-unaware. Here this is 1 − 0.005 = 0.995. See * note for a link to a webpage with a guide to probability theory.

*** By contrast, taking the current infection rate of 1% for 50-year-olds like me, then in a pub with, say, 40 oldies like me in it, the probability that nobody has COVID is 0.99^{40} ≈ 0.67, i.e., about 67%. Sometimes it is good to be old :).

**** Note this assumption of 750 couples is silly but keeps the maths simple. Of course, it would be more realistic to assume a mix of single people, groups of two, three, four, etc., and with less than 100% correlation within a group. But that would make the maths longer and more boring, without making it much more accurate. So because I like minimum effort solutions, I went for 750 perfectly correlated couples.
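For what it is worth, a mix of group sizes is not much harder as long as the correlation within each group is kept at 100%: only the total number of independent groups enters the exponent. The sketch below uses a purely hypothetical mix of group sizes that I made up so that it adds up to 1,500 people; it is not based on any data about *Casino*.

```python
p_unaware = 0.005  # central infected-but-unaware rate from the text

# HYPOTHETICAL mix (group size -> number of such groups), chosen only to sum to 1500 people.
group_counts = {1: 300, 2: 300, 3: 100, 4: 75}

n_people = sum(size * count for size, count in group_counts.items())
n_groups = sum(group_counts.values())

# With perfect correlation inside each group, each group is one independent unit.
p_none = (1.0 - p_unaware) ** n_groups

print(f"People: {n_people}, independent groups: {n_groups}")
print(f"P(nobody infected) = {p_none:.3f}")
```

Handling less than 100% correlation within a group needs a proper model of within-group transmission, which this sketch deliberately leaves out.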