One of the tricky things with probability is what you do when your sample size isn’t very big. For example, say a new rocket has had one successful launch and no launch failures. How safe is it to ride on the next flight? A naive frequency-based approach would say, “1 launch, 1 success, therefore probability of success is 1”, discounting the possibility of failure altogether.
You know that this is wrong. Say you have the choice of a different rocket which has had 99 successful launches and one failure. The naive probability of success is 0.99. Which would you choose? I’d go for the one with the 100 launches; you just know a lot more about it.
Simple statistics don’t work properly with small sample sizes. What we need is a formula which takes account of the fact we don’t know anything at the beginning. Bayesian statistics to the rescue! On Space Launch Report I found this ace formula for estimating the probability properly:
P = (k + 1) / (n + 2)
Here P is the probability of success next time, k is the number of successful events so far, and n is the total number of trials so far. It’s called “the first level Bayesian estimate of mean predicted probability of success”. Therefore:
- New rocket: k=1, n=1 therefore P=0.67
- Established rocket: k=99, n=100 therefore P=0.98
For the new rocket it predicts a 2/3 chance that the next flight will be successful. This captures your state of ignorance about the rocket. It’s had one successful flight, and that’s good, but you still don’t know that much about it. For the established rocket it predicts a success probability of 100/102, or about 0.98, which is very close to the 0.99 that the simple frequency count would tell you.
What about if no launches have yet occurred? Well, the formula says the probability of success is 0.5. A fifty-fifty chance kind of makes sense: you’ve not tried anything yet, and you have no other information, so what else can you say?
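To check the three cases above, here is a minimal Python sketch. It assumes the estimate is P = (k + 1) / (n + 2), which matches the figures quoted for the new rocket and for zero launches. The function names `naive_estimate` and `bayes_estimate` are my own, made up for illustration.

```python
from typing import Optional


def naive_estimate(k: int, n: int) -> Optional[float]:
    """Plain frequency k / n; undefined (None) when there are no trials."""
    if n == 0:
        return None
    return k / n


def bayes_estimate(k: int, n: int) -> float:
    """Assumed form of the estimate: (k + 1) / (n + 2).

    k is the number of successes so far, n the total number of trials.
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return (k + 1) / (n + 2)


# The three cases discussed in the text.
for label, k, n in [("New rocket", 1, 1),
                    ("Established rocket", 99, 100),
                    ("No launches yet", 0, 0)]:
    naive = naive_estimate(k, n)
    naive_text = "undefined" if naive is None else f"{naive:.2f}"
    print(f"{label}: naive={naive_text}, Bayesian={bayes_estimate(k, n):.2f}")
```

The naive estimate has nothing to say with zero launches and jumps straight to 1 after a single success, while the Bayesian estimate starts at 0.5 and moves gradually as evidence comes in.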
This last point does raise a tricky problem with Bayesian reasoning. You have to make an initial guess at the probability, called the prior probability. In this case the choice was a 50-50 chance, which does sound reasonable, but it’s still a guess.
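To see how much that initial guess matters, here is a rough sketch that swaps in different priors. It relies on a standard result that the original formula does not spell out: if the prior on the success probability is a Beta(a, b) distribution, the estimate after k successes in n trials is (k + a) / (n + a + b). Setting a = b = 1 (every probability equally likely to begin with) gives back (k + 1) / (n + 2). The function name and the "optimistic" prior below are my own illustrative choices.

```python
def posterior_mean(k: int, n: int, a: float = 1.0, b: float = 1.0) -> float:
    """Mean predicted success probability under a Beta(a, b) prior.

    a and b act like imaginary prior successes and failures.
    The defaults a = b = 1 reproduce (k + 1) / (n + 2).
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    return (k + a) / (n + a + b)


# New rocket (1 success in 1 launch) under two different priors.
print(f"{posterior_mean(1, 1):.2f}")            # uniform prior -> 0.67
print(f"{posterior_mean(1, 1, a=9, b=1):.2f}")  # hypothetical optimistic prior -> 0.91

# Established rocket (99 in 100): the prior matters much less.
print(f"{posterior_mean(99, 100):.2f}")            # -> 0.98
print(f"{posterior_mean(99, 100, a=9, b=1):.2f}")  # -> 0.98
```

With only one launch the answer depends heavily on the prior, but after 100 launches the two priors give nearly the same number. The guess matters most when you have the least data.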
Of course the formula doesn’t just apply to rockets. It works fine for anything which is success/failure, true/false.