THE DATA ORGANISATION

Stumped by Bayes’ Theorem? Try This Simple Workaround

Bayes’ Theorem formula.
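In its standard form, with A and B standing for any two events and P(B) greater than zero, the formula reads:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```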

Bayes’ Theorem, which The Stanford Encyclopedia of Philosophy calls “…a simple mathematical formula”, can be surprisingly difficult to actually solve. If you struggle with Bayesian logic, solving the “simple” formula involves not much more than guesswork. You have to translate a problem into “A given B” and “B given A”, cross your fingers that your guess for what A and B are is right, double check your thoughts, get thoroughly lost, and punch the resulting fractions into a calculator. The calculator will spit out an answer which may or may not be correct, as you have no idea what your point-oh-something solution means in terms of the original problem. If this sounds like you, you’re not alone: various studies have shown that the vast majority of physicians can’t work the formula either.

But there’s a more intuitive way to get to the same answer, without the counter-intuitive formula. The procedure in question? None other than the humble probability tree.

How to Use a Tree to Solve Bayes’ Formula

This example problem is adapted from a problem in Gigerenzer & Hoffrage’s How to Improve Bayesian Reasoning Without Instruction: Frequency Formats.

Out of 1,000 patients, 10 have a rare disease. Eight of those diseased individuals display symptoms. Out of the 990 healthy individuals, 95 display symptoms. What is the probability a patient with symptoms actually has the disease?

Here’s the traditional textbook method, using the Bayesian algorithm.
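A worked version of that calculation might look like the following. The labels are chosen here for illustration: D stands for “has the disease”, S for “displays symptoms”, and the probabilities are read straight off the counts in the problem (10 of 1,000 diseased, 8 of those 10 with symptoms, 95 of the 990 healthy with symptoms).

```latex
\begin{aligned}
P(D \mid S)
  &= \frac{P(S \mid D)\, P(D)}{P(S \mid D)\, P(D) + P(S \mid \neg D)\, P(\neg D)} \\[4pt]
  &= \frac{\frac{8}{10} \cdot \frac{10}{1000}}
          {\frac{8}{10} \cdot \frac{10}{1000} + \frac{95}{990} \cdot \frac{990}{1000}} \\[4pt]
  &= \frac{0.008}{0.008 + 0.095}
   = \frac{0.008}{0.103}
   \approx 0.078
\end{aligned}
```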

If you’re good with numbers, you may be able to immediately see that you can answer this question with a simple ratio: number of diseased people with symptoms / total number of people with symptoms.

Now let’s construct the same answer with a probability tree:

From there, the math is a simple ratio:

Number of people with disease and symptoms (8) / Total number with symptoms (8 + 95)

which gives us:

8 / 103 = 0.078.
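If you would rather let a computer do the counting, the same tree can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the `posterior` helper are hypothetical names made up for this example, and the two “no symptoms” counts (2 and 895) are filled in by subtraction from the totals given in the problem.

```python
# Frequency tree for the rare-disease example.
# First branch: disease status. Second branch: symptoms or no symptoms.
tree = {
    "disease": {"symptoms": 8, "no symptoms": 2},      # 10 diseased patients
    "healthy": {"symptoms": 95, "no symptoms": 895},   # 990 healthy patients
}


def posterior(tree, branch, leaf):
    """Of everyone who ends up on `leaf`, what share came through `branch`?"""
    on_leaf = sum(leaves[leaf] for leaves in tree.values())
    return tree[branch][leaf] / on_leaf


total = sum(sum(leaves.values()) for leaves in tree.values())
p = posterior(tree, "disease", "symptoms")

print(total)        # 1000
print(round(p, 3))  # 0.078
```

The helper does nothing more than the ratio above: it picks out the leaf you care about (8) and divides by everyone who shares that leaf (8 + 95).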

Let’s try another example (borrowed from Bayes’ Theorem Problems):

You want to know a patient’s probability of having liver disease if they are an alcoholic. 10% of patients at a certain clinic have liver disease. Five percent of the clinic’s patients are alcoholics. Out of those patients diagnosed with liver disease, 7% are alcoholics.

Like the first problem, the first branch here is also “disease”, but the second branch needs to address “alcoholism” instead of “symptoms”. We’re not told how many patients there are, so I’ll use 1,000, which is usually a sufficient number for problems like this. You’re also not told explicitly the number of alcoholics (or the percentage of patients without liver disease who are alcoholic), but you can use a little logical deduction:

Out of 1,000 patients, 5% (50 total) are alcoholic.

10% of the 1,000 patients (100 total) have liver disease, and 7% of those are alcoholic. That gives you 7 (green box), leaving 43 alcoholics without liver disease for the orange box.

Now all we have to do is figure out the ratio:

Number of people with disease and alcoholism (7) / Total number with alcoholism (50)

which gives us:

7 / 50 = 0.14

Which is exactly the same answer you would get by actually working the formula. In fact, I’ve never come across a Bayes-related problem that can’t be answered with a probability tree and a little logical reasoning. So if the formula is giving you headaches, just do what I did and ditch it in favor of a more intuitive approach.
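If you want to check the liver disease example for yourself, here is a rough Python sketch that fills in the tree counts and then works the formula for comparison. The variable names are invented for this illustration, and the 1,000 patients is the same assumed round number used above, not a figure from the problem.

```python
patients = 1000                    # assumed round number, not given in the problem

liver = patients * 10 // 100       # 10% have liver disease -> 100
alcoholic = patients * 5 // 100    # 5% are alcoholics -> 50
both = liver * 7 // 100            # 7% of liver patients are alcoholics -> 7
alcoholic_only = alcoholic - both  # alcoholics without liver disease -> 43

# Tree method: alcoholics with liver disease / all alcoholics
tree_answer = both / alcoholic

# Formula method: P(L|A) = P(A|L) * P(L) / P(A)
formula_answer = 0.07 * 0.10 / 0.05

print(both, alcoholic_only)        # 7 43
print(tree_answer)                 # 0.14
print(round(formula_answer, 2))    # 0.14
```

The formula result is rounded before printing because the decimal inputs pick up a tiny floating-point error; the tree version works with whole-number counts, so the ratio needs no rounding.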

References

Gigerenzer, G. & Hoffrage, U. (1995). How to Improve Bayesian Reasoning Without Instruction: Frequency Formats. Psychological Review, 102(4), 684–704. www.apa.org/journals/rev/

Gould, S. J. (1992). Bully for brontosaurus: Further reflections in natural history. New York: Penguin Books.

Bayes’ Theorem

http://www.datasciencecentral.com/xn/detail/6448529:BlogPost:980607