Bayes-type reasoning
This chapter rigorously explores Bayes' theorem and its application to conditional probability problems. Mastery of the tree and table methods, along with an understanding of reverse probability and diagnostic test scenarios, is critical for success in advanced probability examinations.
---
Chapter Contents
| # | Topic |
|---|-------|
| 1 | Tree method |
| 2 | Table method |
| 3 | Reverse probability |
| 4 | Diagnostic test problems |

---
We begin with the tree method.
Part 1: Tree method
Tree Method
Overview
The tree method is one of the clearest ways to organise conditional probability. It is especially useful when events happen in stages, when a probability changes after an earlier event, or when we want to compute a reverse probability using Bayes-type reasoning. In exam problems, the main goal is not drawing a decorative tree, but using the tree to track all branches correctly and read probabilities from it with precision.

---
Learning Objectives
After studying this topic, you will be able to:
- Draw a probability tree for a multi-stage experiment.
- Label branch probabilities correctly.
- Compute the probability of a complete path by multiplying along the branches.
- Compute the probability of an event by adding relevant path probabilities.
- Use the tree method to solve Bayes-type reverse-probability questions.
Core Idea
A probability tree is a branching diagram used to represent sequential events.
Each level of the tree represents a stage of the experiment, and each branch is labeled by the conditional probability of that outcome given the previous stage.
A typical two-stage tree follows this pattern:
- first choose source/type/category
- then observe result/outcome
Main Rules
The probability of a complete path is the product of the probabilities written along that path.
If an event can happen through several disjoint paths, then its probability is the sum of the probabilities of those paths.
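The two rules can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the tree, its branch probabilities, and the helper names `path_probability` and `event_probability` are all hypothetical.

```python
from fractions import Fraction
from math import prod

# Hypothetical two-stage tree: each path is a list of branch probabilities.
# The numbers are illustrative only and are not taken from the text.
paths_to_event = [
    [Fraction(1, 3), Fraction(1, 2)],  # first-stage branch, then conditional branch
    [Fraction(2, 3), Fraction(1, 4)],
]

def path_probability(branches):
    """Multiply the branch probabilities along one complete path."""
    return prod(branches, start=Fraction(1))

def event_probability(paths):
    """Add the probabilities of the disjoint paths that lead to the event."""
    return sum((path_probability(p) for p in paths), Fraction(0))

print(event_probability(paths_to_event))  # 1/3
```

Exact fractions are used here so that the result can be compared directly with a hand calculation.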
Why the Tree Method Works Well
A good tree makes three things visible:
- stage-by-stage dependence
- complete-path probabilities
- all possible ways an event can happen
The method is especially suited to:
- conditional probability
- repeated draws
- test accuracy problems
- disease-screening problems
- manufacturing/source-identification problems
Bayes-Type Reasoning from a Tree
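Suppose an outcome $B$ can arise from multiple starting categories $A_1, A_2, \dots, A_n$.
Then
$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B \mid A_j)}$$
- numerator = one specific path probability
- denominator = sum of all paths leading to $B$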
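As a rough sketch of the "one path over all paths" reading, the following Python function divides a single path probability by the sum of all paths that reach the observed outcome. The function name `reverse_probability`, the category labels, and the numbers are assumptions made for illustration.

```python
from fractions import Fraction

def reverse_probability(tree, category):
    """Return P(category | outcome) from a two-stage tree.

    `tree` maps each first-stage category to a pair
    (P(category), P(outcome | category)).
    """
    # One path per category: first-stage branch times second-stage branch.
    path = {name: prior * cond for name, (prior, cond) in tree.items()}
    total = sum(path.values(), Fraction(0))  # all paths leading to the outcome
    return path[category] / total

# Hypothetical numbers, chosen only for illustration.
tree = {
    "source_1": (Fraction(1, 4), Fraction(4, 5)),
    "source_2": (Fraction(3, 4), Fraction(2, 5)),
}
print(reverse_probability(tree, "source_1"))  # 2/5
```

Here the two path probabilities are 1/5 and 3/10, so the observed outcome has total probability 1/2 and the first source accounts for 2/5 of it.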
Minimal Worked Example
Example 1 A box is chosen from two boxes:
- Box 1 with probability
- Box 2 with probability
- From Box 1, the probability of red is
- From Box 2, the probability of red is
Common Tree Structures
- source → outcome
- disease status → test result
- first draw → second draw
- biased choice → success/failure
- machine chosen → defective/non-defective
Drawing the Tree Correctly
When drawing a tree:
- every stage must be clearly separated
- branch probabilities leaving a node must add to $1$
- conditional labels must match the branch's parent node
- final answers should come from path multiplication and path addition
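One way to catch labeling errors early is to check the "branches add to 1" rule mechanically. The sketch below assumes a tree stored as nested dictionaries of the form `{label: (probability, subtree)}`, with `None` marking a leaf; this representation and the function name are hypothetical.

```python
from fractions import Fraction

def branches_sum_to_one(node):
    """Check every split of a tree given as {label: (probability, subtree)}.

    A leaf is marked by a subtree of None.
    """
    if sum(prob for prob, _ in node.values()) != 1:
        return False
    return all(sub is None or branches_sum_to_one(sub) for _, sub in node.values())

# Illustrative trees only; the numbers are not from the text.
good_tree = {
    "first": (Fraction(1, 3), {"yes": (Fraction(1, 2), None), "no": (Fraction(1, 2), None)}),
    "second": (Fraction(2, 3), {"yes": (Fraction(1, 4), None), "no": (Fraction(3, 4), None)}),
}
bad_tree = {
    "first": (Fraction(1, 3), None),
    "second": (Fraction(1, 3), None),  # 1/3 + 1/3 is not 1
}
print(branches_sum_to_one(good_tree), branches_sum_to_one(bad_tree))  # True False
```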
Common Mistakes
- multiplying probabilities from different paths together
- forgetting that second-stage probabilities may be conditional
- adding probabilities of non-disjoint events without care
- using a tree when the stages are not actually sequential
CMI Strategy
- Identify the stages clearly.
- Put first-stage probabilities on the first split.
- Put conditional probabilities on the next split.
- Multiply along paths.
- Add the relevant final paths.
- For reverse probability, divide the wanted path by the total probability of the observed event.
Practice Questions
:::question type="MCQ" question="In a probability tree, the probability of a complete path is found by" options=["adding the branch probabilities on that path","multiplying the branch probabilities on that path","subtracting the branch probabilities on that path","taking the average of the branch probabilities on that path"] answer="B" hint="Use the multiplication rule for sequential events." solution="For sequential events, the probability of a full path is the product of the probabilities along that path. Therefore the correct option is B."
:::
:::question type="NAT" question="A box is chosen: Box 1 with probability and Box 2 with probability . From Box 1, the probability of a red ball is ; from Box 2, it is . Find the probability of drawing a red ball." answer="1/2" hint="Add the red paths." solution="There are two red paths, one through Box 1 and one through Box 2. Let $R$ be the event of drawing a red ball. Multiply along each path and add the two products: $P(R) = P(B_1)P(R\mid B_1) + P(B_2)P(R\mid B_2) = \frac12$. Hence the answer is $\frac12$."
:::
:::question type="MSQ" question="Which of the following statements are true?" options=["Branch probabilities leaving the same node should add to $1$","A path probability is found by multiplying along the path","A tree method is useful for Bayes-type problems","In a tree, all probabilities must be equal"] answer="A,B,C" hint="Think about how a probability tree is built." solution="1. True. 2. True. 3. True. 4. False: branch probabilities need not be equal."
:::
Summary
- The tree method organizes conditional probability stage by stage.
- Multiply along a path and add across relevant disjoint paths.
- Branches from the same node must total $1$.
- Tree diagrams are especially effective in Bayes-type reasoning.
- A correct tree prevents logical mixing of cases.
---
Proceeding to Table method.
---
Part 2: Table method
Table Method
Overview
The table method is a compact and powerful way to solve conditional probability problems when the information naturally falls into categories. It is especially effective in Bayes-type reasoning, test-result problems, and classification problems. Instead of following paths stage by stage as in a tree, the table method organizes outcomes into rows and columns and lets us read totals, intersections, and conditional probabilities directly.

---
Learning Objectives
After studying this topic, you will be able to:
- Construct a probability or frequency table from given data.
- Fill row totals, column totals, and internal cells correctly.
- Use the table to compute conditional probabilities.
- Solve Bayes-type reverse-probability questions using table entries.
- Move cleanly between percentages, frequencies, and probabilities.
Core Idea
A probability table organizes events into categories so that:
- rows represent one classification
- columns represent another classification
- each cell represents an intersection event
- row and column totals represent marginal probabilities
For example, in a diagnostic test problem:
- rows may represent Disease / No Disease
- columns may represent Positive / Negative test result
Why the Table Method Works
The table method makes three things easy:
- seeing intersection probabilities such as $P(A \cap B)$
- seeing marginal totals such as $P(B)$
- computing conditional probabilities such as $P(A \mid B)$
Main Formula in Table Language
If a table gives you:
- intersection entry $P(A \cap B)$
- column or row total $P(B)$
then
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
- numerator = the favorable cell
- denominator = the total of the conditioning category
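The "cell divided by total" rule can be sketched in Python as follows. The joint-probability table, its labels, and the helper names are hypothetical and serve only to show the lookup.

```python
from fractions import Fraction

# Hypothetical joint-probability table: table[row][column] = P(row and column).
table = {
    "A":     {"B": Fraction(1, 10), "not B": Fraction(3, 10)},
    "not A": {"B": Fraction(2, 10), "not B": Fraction(4, 10)},
}

def column_total(table, col):
    """Marginal probability of a column: add the cells in that column."""
    return sum(row[col] for row in table.values())

def conditional_given_column(table, row, col):
    """P(row | column) = favorable cell / total of the conditioning column."""
    return table[row][col] / column_total(table, col)

print(conditional_given_column(table, "A", "B"))  # 1/3
```

Conditioning on a row instead of a column works the same way, with the row total as the denominator.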
Frequencies Often Make Tables Easier
In many problems with percentages, it is easier to imagine a sample of a convenient round size.
Then fill the table with frequencies instead of decimals.
At the end, convert back to probability if needed.
Minimal Worked Example
Example 1 A disease occurs in of a population. A test is positive:
- in of diseased people
- in of non-diseased people
- Disease:
- No Disease:
- Diseased and positive:
- Non-diseased and positive:
Table Structure Example
A standard table looks like this:
| Category | Positive | Negative | Total |
|---|---:|---:|---:|
| Disease | | | |
| No Disease | | | |
| Total | | | |
Table Method vs Tree Method
Use the table method when:
- the problem is classification-based
- you want totals and subtotals quickly
- Bayes-type reverse probability is asked
- data is already presented in percentage or count form
Use the tree method when:
- the experiment is sequential
- stages happen one after another
Common Patterns
- disease / test result
- machine / defective status
- class membership / success-failure
- source / observed outcome
- frequency table completion
Common Mistakes
- mixing row totals and column totals
- using percentages directly without a consistent base
- dividing by the wrong total in conditional probability
- forcing a tree when a table is simpler
CMI Strategy
- Identify the two classifications.
- Draw the table with clear row and column labels.
- Fill the easy totals first.
- Fill intersection cells using the given rates.
- Use row/column totals for conditional probability.
- Check that the full total is consistent.
Practice Questions
:::question type="MCQ" question="In a conditional probability table, the denominator of $P(A \mid B)$ should be" options=["the grand total","the total corresponding to $B$","the total corresponding to $A$","the sum of all unfavorable cells"] answer="B" hint="Use the definition of conditional probability." solution="By definition, $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$. So the denominator is the total corresponding to $B$. Hence the correct option is B."
:::
:::question type="NAT" question="In a school of students, are girls. Among the girls, play chess. Among the boys, play chess. Find the probability that a randomly chosen student plays chess." answer="1/2" hint="Fill the chess counts and divide by the total number of students." solution="Count the girls who play chess, then the boys in the school and the boys who play chess. Add the two chess counts and divide by the total number of students. Hence the answer is $\frac12$."
:::
:::question type="MSQ" question="Which of the following statements are true?" options=["A table method is useful for Bayes-type questions","A table can organize intersection events and totals together","Conditional probability is obtained by dividing a relevant cell by the relevant total","A table method can never use frequencies"] answer="A,B,C" hint="Think about what a probability table records." solution="1. True. 2. True. 3. True. 4. False: a table can be filled with frequencies as well as probabilities."
:::
Summary
- The table method is ideal for category-based conditional probability problems.
- Cells represent intersections; row and column totals represent marginals.
- Conditional probability is cell divided by the relevant row or column total.
- Frequencies often make tables simpler than raw percentages.
- The right structure makes Bayes-type reasoning much easier.
---
Proceeding to Reverse probability.
---
Part 3: Reverse probability
Reverse Probability
Overview
Reverse probability problems ask you to work backward from observed information to the hidden cause that produced it. This is the logic behind Bayes-type reasoning. In CMI-style questions, such problems often look simple but are dangerous because human intuition overweights the observed event and underweights the prior chances.

---
Learning Objectives
After studying this topic, you will be able to:
- Interpret reverse probability questions correctly.
- Apply Bayes' theorem in simple and multi-case situations.
- Compute posterior probabilities from prior probabilities and likelihoods.
- Handle box-selection, test-diagnosis, and coin-selection problems.
- Avoid base-rate neglect.
Core Idea
A reverse probability problem asks for
$$P(\text{cause} \mid \text{evidence})$$
rather than the forward probability
$$P(\text{evidence} \mid \text{cause})$$
Bayes' Theorem
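If $P(B) > 0$, then
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
where
- $P(A)$ is the prior probability,
- $P(B \mid A)$ is the likelihood,
- $P(A \mid B)$ is the posterior probability.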
Partition Form
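If $A_1, A_2, \dots, A_n$ form a partition of the sample space and $P(B) > 0$, then
$$P(A_i \mid B) = \frac{P(B \mid A_i)\,P(A_i)}{\sum_{j=1}^{n} P(B \mid A_j)\,P(A_j)}$$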
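The partition form translates directly into code. The sketch below is one possible way to compute all posteriors at once; the function name `posteriors` and the three-cause example are hypothetical.

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Return the posterior of every cause in a partition.

    priors[i]      = P(cause i)
    likelihoods[i] = P(evidence | cause i)
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint, Fraction(0))  # law of total probability
    if evidence == 0:
        raise ValueError("the observed event has probability zero")
    return [j / evidence for j in joint]

# Hypothetical three-cause example, for illustration only.
priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
likelihoods = [Fraction(1, 10), Fraction(3, 10), Fraction(6, 10)]
print(posteriors(priors, likelihoods))  # [Fraction(1, 5), Fraction(2, 5), Fraction(2, 5)]
```

The denominator is computed once and shared by every posterior, which mirrors the advice to find the total probability of the observed event before dividing.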
Standard Bayes Pattern
- Choose a hidden cause:
box, coin, machine, disease status, route, source
- Observe some event:
red ball, head, defective item, positive test
- Work backward using Bayes' theorem.
Base Rate Warning
A highly likely observation under one cause does not automatically make that cause the most probable.
You must also account for how common the cause was before the observation.
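A short worked calculation may make the warning concrete. The numbers are hypothetical: a rare cause $C_1$ with prior $\tfrac{1}{10}$, a common cause $C_2$ with prior $\tfrac{9}{10}$, and evidence $E$ that is four times as likely under $C_1$ as under $C_2$.

```latex
\[
P(E \mid C_1) = \tfrac{4}{5}, \qquad P(E \mid C_2) = \tfrac{1}{5}
\]
\[
P(E) = \tfrac{1}{10}\cdot\tfrac{4}{5} + \tfrac{9}{10}\cdot\tfrac{1}{5}
     = \tfrac{4}{50} + \tfrac{9}{50} = \tfrac{13}{50}
\]
\[
P(C_1 \mid E) = \frac{\tfrac{4}{50}}{\tfrac{13}{50}} = \tfrac{4}{13} \approx 0.31
\]
```

Under these assumed values the evidence favors $C_1$ in likelihood, yet $C_2$ remains the more probable cause because it was far more common to begin with.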
Minimal Worked Examples
Example 1 A box is chosen uniformly from:
- Box 1: red, blue
- Box 2: red, blue
- a fair coin,
- a two-headed coin,
- a coin with .
Common Mistakes
- Confusing $P(A \mid B)$ with $P(B \mid A)$.
- Forgetting to compute the total probability of the observed event.
- Ignoring prior probabilities.
- Using intuition instead of the formula in base-rate problems.
CMI Strategy
- Name the hidden causes clearly.
- Write their prior probabilities.
- Compute the probability of the observed event under each cause.
- Use Bayes' formula carefully.
- Simplify only at the end.
Practice Questions
:::question type="MCQ" question="In a Bayes-type problem, the quantity $P(\text{cause} \mid \text{evidence})$ is called the" options=["likelihood","prior probability","posterior probability","sample probability"] answer="C" hint="It is the probability after the evidence is observed." solution="The probability of the hidden cause after seeing the evidence is called the posterior probability. Hence the correct option is C."
:::
:::question type="NAT" question="A box is chosen uniformly from two boxes. Box 1 has red and blue balls, and Box 2 has red and blue ball. A red ball is drawn. Find the probability that Box 2 was chosen." answer="2/3" hint="Apply Bayes' theorem." solution="Let $R$ be the event that a red ball is drawn. Then $P(R\mid B_1)=\frac25$ and $P(R\mid B_2)=\frac45$. So $P(R)=\frac12\cdot\frac25+\frac12\cdot\frac45=\frac35$. Therefore $\qquad P(B_2\mid R)=\dfrac{P(R\mid B_2)P(B_2)}{P(R)} = \dfrac{\frac45\cdot \frac12}{\frac35} = \dfrac23$ Hence the answer is $\frac23$."
:::
:::question type="MSQ" question="Which of the following statements are true?" options=["Bayes' theorem computes $P(A \mid B)$ from forward probabilities","Reverse probability problems often require prior probabilities","$P(A \mid B)$ and $P(B \mid A)$ are always equal","A rare cause can still have a small posterior probability even if the evidence is likely under that cause"] answer="A,B,D" hint="One statement incorrectly treats conditional probabilities as symmetric." solution="1. True. 2. True. 3. False: $P(A \mid B)$ and $P(B \mid A)$ are equal only in special cases. 4. True."
:::
- : fair coin
- : two-headed coin
- : biased coin with
Summary
- Reverse probability means working from observed evidence back to a hidden cause.
- Bayes' theorem is the standard tool.
- Posterior probability depends on both likelihood and prior probability.
- Base-rate effects can make intuition unreliable.
- Good Bayes solutions start by naming the hidden causes clearly.
---
Proceeding to Diagnostic test problems.
---
Part 4: Diagnostic test problems
Diagnostic Test Problems
Overview
Diagnostic test problems are one of the most important applications of conditional probability and Bayes' theorem. The main difficulty is that people often confuse:
- the probability of testing positive given disease, and
- the probability of having the disease given a positive test.
Learning Objectives
After studying this topic, you will be able to:
- Interpret sensitivity, specificity, false positive rate, and false negative rate correctly.
- Compute the probability of a positive or negative test using total probability.
- Apply Bayes' theorem to find the probability of disease given a test result.
- Understand base-rate effects in rare-disease testing.
- Avoid the common mistake of confusing $P(T^+ \mid D)$ with $P(D \mid T^+)$.
Core Setup
Let
- $D$ = the event that a person has the disease
- $D^c$ = the event that a person does not have the disease
- $T^+$ = the event that the test result is positive
- $T^-$ = the event that the test result is negative
The prevalence of the disease is
$$P(D)$$
and the probability that a person does not have the disease is
$$P(D^c) = 1 - P(D)$$
Main Test Quantities
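- Sensitivity: $P(T^+ \mid D)$
- Specificity: $P(T^- \mid D^c)$
- False positive rate: $P(T^+ \mid D^c) = 1 - \text{specificity}$
- False negative rate: $P(T^- \mid D) = 1 - \text{sensitivity}$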
The Most Important Distinction
These are different quantities:
- $P(T^+ \mid D)$ = sensitivity
- $P(D \mid T^+)$ = probability that a person has the disease given a positive result
The first is about test performance on diseased people.
The second is about what a positive result means for a person.
Total Probability for Test Outcomes
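To compute the overall chance of a positive test, split into diseased and non-diseased cases:
$$P(T^+) = P(T^+ \mid D)\,P(D) + P(T^+ \mid D^c)\,P(D^c)$$
Similarly,
$$P(T^-) = P(T^- \mid D)\,P(D) + P(T^- \mid D^c)\,P(D^c)$$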
Bayes' Theorem
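The probability that a person has the disease given a positive test is
$$P(D \mid T^+) = \frac{P(T^+ \mid D)\,P(D)}{P(T^+)}$$
Using the total probability formula for $P(T^+)$, this becomes
$$P(D \mid T^+) = \frac{P(T^+ \mid D)\,P(D)}{P(T^+ \mid D)\,P(D) + P(T^+ \mid D^c)\,P(D^c)}$$
The probability that a person does not have the disease given a negative test is
$$P(D^c \mid T^-) = \frac{P(T^- \mid D^c)\,P(D^c)}{P(T^- \mid D^c)\,P(D^c) + P(T^- \mid D)\,P(D)}$$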
Standard Formula in Parameters
Let
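- prevalence = $p$
- sensitivity = $s$
- specificity = $c$
Then
$$P(T^+) = sp + (1-c)(1-p)$$
So
$$P(D \mid T^+) = \frac{sp}{sp + (1-c)(1-p)}$$
and
$$P(D^c \mid T^-) = \frac{c(1-p)}{c(1-p) + (1-s)p}$$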
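The parameter formulas can be sketched as three small Python functions. The function and argument names are hypothetical, and the example values (prevalence 0.01, sensitivity 0.9, specificity 0.9) are illustrative assumptions rather than data from the text.

```python
def positive_rate(prevalence, sensitivity, specificity):
    """P(positive test), by total probability over diseased / non-diseased."""
    return sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(disease | positive test), by Bayes' theorem."""
    true_pos = sensitivity * prevalence
    return true_pos / positive_rate(prevalence, sensitivity, specificity)

def negative_predictive_value(prevalence, sensitivity, specificity):
    """P(no disease | negative test), by Bayes' theorem."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Illustrative values only.
print(round(positive_predictive_value(0.01, 0.9, 0.9), 4))  # 0.0833
print(round(negative_predictive_value(0.01, 0.9, 0.9), 4))  # 0.9989
```

Floating-point numbers are used here for brevity; for exact exam-style answers the same formulas work with `fractions.Fraction` inputs.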
Table Method
In many diagnostic test problems, it is easiest to imagine a sample population.
For example, if
- prevalence =
- sensitivity =
- specificity =
then among people:
- diseased:
- non-diseased:
Among the diseased:
- true positives:
- false negatives:
Among the non-diseased:
- true negatives:
- false positives:
Then
$$P(D \mid T^+) = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$$
Why Rare Diseases Are Tricky
Even a very accurate test can have a surprisingly low value of $P(D \mid T^+)$ when the disease is rare.
Reason:
- the diseased group is tiny
- the healthy group is huge
- even a small false positive rate applied to a huge healthy group may create many false positives
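Counting people makes this effect visible. The sketch below assumes a population of 100,000 with prevalence 0.1%, sensitivity 99%, and specificity 95%, the same values as the first chapter review question; the population size and variable names are assumptions chosen so that every count is a whole number.

```python
from fractions import Fraction

population = 100_000
diseased = population // 1000        # 0.1% prevalence
healthy = population - diseased

true_pos = diseased * 99 // 100      # 99% sensitivity
false_pos = healthy * 5 // 100       # 5% false positive rate (95% specificity)

# Among everyone who tests positive, what fraction is actually diseased?
ppv = Fraction(true_pos, true_pos + false_pos)

print(true_pos, false_pos)           # 99 4995
print(round(float(ppv), 4))          # 0.0194
```

Under these assumptions the false positives outnumber the true positives by about fifty to one, which is why a positive result still leaves the disease unlikely.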
Minimal Worked Examples
Example 1 A disease affects of a population. A test has sensitivity and specificity . Find the probability of a positive test. We have So Hence

---
Example 2 Using the same data, find the probability that a person has the disease given a positive test. By Bayes' theorem, So This is only about , even though the test is quite accurate.

---
Common Derived Quantities
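- True positive probability: $P(D \cap T^+) = P(D)\,P(T^+ \mid D)$
- False positive probability: $P(D^c \cap T^+) = P(D^c)\,P(T^+ \mid D^c)$
- True negative probability: $P(D^c \cap T^-) = P(D^c)\,P(T^- \mid D^c)$
- False negative probability: $P(D \cap T^-) = P(D)\,P(T^- \mid D)$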
Common Mistakes
- Using sensitivity in place of $P(D \mid T^+)$
- Forgetting to include false positives when computing total positives
- Ignoring prevalence
- Mixing up specificity with false positive rate
- Forgetting that $P(D \mid T^+) \neq P(T^+ \mid D)$ in general
CMI Strategy
- Define the events $D$, $D^c$, $T^+$, and $T^-$ clearly.
- Write prevalence, sensitivity, and specificity first.
- Compute $P(T^+)$ or $P(T^-)$ using total probability.
- Then apply Bayes' theorem.
- If the numbers are awkward, use a frequency table for a round number of people.
- Always check whether the question is asking for:
  - $P(T^+ \mid D)$
  - $P(T^- \mid D^c)$
  - $P(D \mid T^+)$
  - $P(D^c \mid T^-)$
Practice Questions
:::question type="MCQ" question="Which of the following equals the sensitivity of a test?" options=["$P(D \mid T^+)$","$P(T^+ \mid D)$","$P(T^- \mid D^c)$","$P(D^c \mid T^-)$"] answer="B" hint="Sensitivity is the probability of a positive test among diseased people." solution="By definition, sensitivity is $P(T^+ \mid D)$. Hence the correct option is B."
:::
:::question type="NAT" question="A disease affects of a population. A test has sensitivity and specificity . Find the probability of a positive test." answer="0.17" hint="Use total probability." solution="Apply total probability: $P(T^+) = P(T^+ \mid D)P(D) + P(T^+ \mid D^c)P(D^c)$. Hence the answer is $0.17$."
:::
:::question type="MSQ" question="Which of the following statements are true?" options=["Specificity is $P(T^- \mid D^c)$","False positive rate is $P(T^+ \mid D^c)$","Positive predictive value is $P(D \mid T^+)$","Sensitivity is $P(D \mid T^+)$"] answer="A,B,C" hint="Separate test-quality quantities from posterior probabilities." solution="1. True. This is the definition of specificity. 2. True. This is the definition of the false positive rate. 3. True. This is the definition of positive predictive value. 4. False. Sensitivity is $P(T^+ \mid D)$, not $P(D \mid T^+)$."
:::
Summary
- Sensitivity is $P(T^+ \mid D)$ and specificity is $P(T^- \mid D^c)$.
- Use total probability to compute $P(T^+)$ and $P(T^-)$.
- Use Bayes' theorem to compute $P(D \mid T^+)$ and $P(D^c \mid T^-)$.
- Positive predictive value can be much smaller than sensitivity when prevalence is low.
- Diagnostic test questions are really conditional probability questions with careful interpretation.
Chapter Summary
Bayes-type reasoning: Key Points
* Conditional Probability Foundation: Conditional probability $P(A \mid B)$ quantifies the likelihood of event $A$ occurring given that event $B$ has already occurred. Distinguishing it from $P(B \mid A)$ is crucial.
* Law of Total Probability: This theorem, $P(B) = \sum_{i=1}^{n} P(B \mid A_i)\,P(A_i)$ for a partition $A_1, \dots, A_n$, is fundamental for calculating the marginal probability of an event $B$ and often forms the denominator in Bayes' Theorem.
* Bayes' Theorem: $P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$ provides a rigorous framework for updating prior beliefs to posterior beliefs based on new evidence $B$. This concept of "reverse probability" is central to the chapter.
* Tree Diagrams: An indispensable tool for visualizing sequential events, partitioning sample spaces, and systematically calculating joint and conditional probabilities, especially useful for understanding the flow of events in multi-stage problems.
* Contingency Tables: For problems involving multiple categorizations (e.g., disease status and test results), constructing a contingency table effectively organizes data, clarifies relationships, and simplifies the calculation of various conditional probabilities.
* Diagnostic Test Problems: A common application where sensitivity ($P(T^+ \mid D)$) and specificity ($P(T^- \mid D^c)$) must be carefully distinguished from the positive predictive value ($P(D \mid T^+)$) and negative predictive value ($P(D^c \mid T^-)$).
* Interpretation of Results: Beyond calculation, interpreting the updated probabilities (posterior probabilities) in the context of the problem is vital for drawing meaningful conclusions and demonstrating conceptual understanding.
Chapter Review Questions
:::question type="MCQ" question="A rare disease affects 0.1% of the population. A diagnostic test for this disease has a sensitivity of 99% and a specificity of 95%. If a randomly selected person tests positive, what is the probability that they actually have the disease?" options=["Approximately 1.94%", "Approximately 0.099%", "Approximately 99%", "Approximately 5%"] answer="Approximately 1.94%" hint="Use Bayes' Theorem. Let D be the event of having the disease and T+ be the event of testing positive. You need to find $P(D \mid T^+)$. Consider the prevalence, sensitivity, and specificity to calculate $P(D)$, $P(D^c)$, $P(T^+ \mid D)$, and $P(T^+ \mid D^c)$." solution="Let D be the event that a person has the disease, and T+ be the event that they test positive.
Given:
$P(D) = 0.001$ (prevalence)
$P(T^+ \mid D) = 0.99$ (sensitivity)
$P(T^- \mid D^c) = 0.95$ (specificity)
From specificity, $P(T^+ \mid D^c) = 1 - 0.95 = 0.05$.
We want to find $P(D \mid T^+)$. Using Bayes' Theorem:
$$P(D \mid T^+) = \frac{P(T^+ \mid D)\,P(D)}{P(T^+)}$$
First, calculate $P(T^+)$ using the Law of Total Probability:
$$P(T^+) = (0.99)(0.001) + (0.05)(0.999) = 0.00099 + 0.04995 = 0.05094$$
Now, substitute into Bayes' Theorem:
$$P(D \mid T^+) = \frac{0.00099}{0.05094} \approx 0.0194$$
Converting to percentage, this is approximately 1.94%."
:::
:::question type="NAT" question="Urn A contains 4 red and 6 blue balls. Urn B contains 7 red and 3 blue balls. A fair coin is flipped; if it lands heads, a ball is drawn from Urn A, and if tails, from Urn B. What is the probability that the ball drawn is red?" answer="0.55" hint="Use the Law of Total Probability. Define events for selecting each urn and drawing a red ball from each." solution="Let A be the event that Urn A is chosen, and B be the event that Urn B is chosen.
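Since a fair coin is flipped:
$P(A) = 0.5$ and $P(B) = 0.5$.
Let R be the event that a red ball is drawn.
From Urn A: $P(R \mid A) = \frac{4}{10} = 0.4$
From Urn B: $P(R \mid B) = \frac{7}{10} = 0.7$
Using the Law of Total Probability:
$$P(R) = P(R \mid A)\,P(A) + P(R \mid B)\,P(B) = (0.4)(0.5) + (0.7)(0.5) = 0.55$$"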
:::
:::question type="MCQ" question="A factory has two machines, M1 and M2, which produce 60% and 40% of the total output, respectively. Machine M1 produces 3% defective items, while Machine M2 produces 5% defective items. If a randomly selected item is found to be defective, what is the probability that it was produced by Machine M2?" options=["0.05", "0.40", "0.5263", "0.02"] answer="0.5263" hint="Apply Bayes' Theorem. Let D be the event that an item is defective. You need to find $P(M_2 \mid D)$." solution="Let M1 be the event an item is from Machine 1, and M2 be the event an item is from Machine 2.
Let D be the event that an item is defective.
Given:
$P(M_1) = 0.60$, $P(M_2) = 0.40$
$P(D \mid M_1) = 0.03$, $P(D \mid M_2) = 0.05$
We want to find $P(M_2 \mid D)$. Using Bayes' Theorem:
$$P(M_2 \mid D) = \frac{P(D \mid M_2)\,P(M_2)}{P(D)}$$
First, calculate $P(D)$ using the Law of Total Probability:
$$P(D) = (0.03)(0.60) + (0.05)(0.40) = 0.018 + 0.020 = 0.038$$
Now, substitute into Bayes' Theorem:
$$P(M_2 \mid D) = \frac{0.020}{0.038} = \frac{10}{19}$$
Rounding to four decimal places, $P(M_2 \mid D) \approx 0.5263$."
:::
What's Next?
Continue Your CMI Journey
With a solid understanding of Bayes-type reasoning and conditional probability, you are well-prepared to delve into the broader landscape of probability theory. The concepts learned here, particularly the foundational idea of updating beliefs with new information, are crucial for future chapters. You should now proceed to explore Discrete Random Variables and their Probability Distributions, followed by Continuous Random Variables and their Probability Distributions. These topics build directly upon the principles of probability to introduce methods for quantifying uncertainty and variability, leading naturally into Expected Value and Variance and eventually Sampling Distributions and Statistical Inference.