Quantitative Methods · Probability Trees and Conditional Expectations · LO 1 of 3

Why does a single bad year destroy a portfolio return that looks good on average?

Expected value and variance measure risk and reward as a probability forecast, not as a historical average, and knowing the difference changes how you evaluate every investment.

⏱ 8min-15min

6 questions

HIGH PRIORITY · APPLY · 🧮 Calculator

Why this LO matters


INSIGHT

Expected value is a bet on the future, not a summary of the past. When you forecast earnings per share across different economic scenarios, you assign probabilities to outcomes that have not yet occurred. That probability-weighted average is your expected value, your best guess of what will happen. It is fundamentally different from a historical average, which looks backward at what already happened. The variance and standard deviation around that expected value measure the dispersion of possible outcomes. They describe the range you must be prepared for, not the range you have already seen.

How to Measure What You Expect and How Wrong You Might Be

Think about how a weather forecaster handles tomorrow's temperature. They do not pick one number and declare it certain. They think through the scenarios (cold front arrives, cold front stalls, cold front reverses) and they assign rough odds to each. Their single forecast number is the weighted blend of all those scenarios. That blending process, weighting outcomes by probability, is exactly what these three tools formalise.

One distinction before we begin. The expected value you will calculate here is a forecast. It looks forward. A historical mean looks backward at data that already happened. The CFA curriculum treats these as fundamentally different things, and exam questions test whether you know the difference.

Expected value, variance, and standard deviation

Expected value. The probability-weighted average of all possible outcomes for a random variable. Multiply each outcome by its probability, then sum all the products. This is your single best forecast.

Variance. The probability-weighted average of squared deviations from the expected value. Use this to measure how spread out possible outcomes are. The wider the spread, the larger the variance, and the more uncertain your forecast.

Standard deviation. The positive square root of variance. Use this when you need dispersion measured in the same units as the original data. It is the number you quote to clients because it is interpretable.

The strict calculation order. Expected value must be computed first, then variance, then standard deviation. The order cannot be reversed: variance is defined around the expected value, and standard deviation is the square root of variance. On every question: expected value, then variance, then standard deviation.
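The three definitions and their required order can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the curriculum: the function names `expected_value`, `variance`, and `std_dev` are hypothetical, and the two-outcome distribution is borrowed from the glossary's bond example (60% chance of USD80, 40% chance of USD50).

```python
from math import sqrt, isclose

def expected_value(outcomes, probs):
    """Probability-weighted average of the outcomes."""
    if not isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for p, x in zip(probs, outcomes))

def variance(outcomes, probs):
    """Probability-weighted average of squared deviations from E(X)."""
    mean = expected_value(outcomes, probs)  # the expected value comes first
    return sum(p * (x - mean) ** 2 for p, x in zip(probs, outcomes))

def std_dev(outcomes, probs):
    """Positive square root of variance, back in the original units."""
    return sqrt(variance(outcomes, probs))

# Glossary example: 60% chance of USD80, 40% chance of USD50.
outcomes = [80, 50]
probs = [0.60, 0.40]

ev = expected_value(outcomes, probs)   # in USD
var = variance(outcomes, probs)        # in USD squared
sd = std_dev(outcomes, probs)          # in USD again

print(f"E(X) = {ev:.2f} USD, variance = {var:.2f} USD^2, sd = {sd:.2f} USD")
```

Note how `variance` calls `expected_value` and `std_dev` calls `variance`: the dependency chain in the code mirrors the calculation order described above.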

What Are the Units?

This trips up nearly every candidate the first time.

If the original data is in basis points, variance is in basis points squared. Standard deviation is back in basis points. If the data is in USD, variance is in USD squared, standard deviation is in USD.

The square root in the standard deviation formula restores the original units. Variance is never in the same units as the data. Standard deviation always is.

The wrong move candidates make: stopping at variance and reporting a number in squared units as if it were a standard deviation. The right move: always take the positive square root as your final step.

FORWARD REFERENCE

Net interest margin: what you need for this LO only

Net interest margin (NIM) is the difference between what a bank earns on its loans and what it pays on its deposits, expressed in percentage points or basis points. For this LO, treat it as a random variable that can take several discrete values, each with an assigned probability. You will study bank income statements fully in Financial Statement Analysis.

→ Financial Statement Analysis

Worked Examples: From One Scenario to Many

The examples below build in sequence. Worked Examples 1 and 2 use a single-level probability distribution. Worked Examples 3 and 4 add a second layer, scenarios within scenarios, and require the total probability rule. Follow the characters from example to example: the logic compounds just as the scenarios do.

Worked Example 1

Calculating Expected Value from a Discrete Probability Distribution

Priya Nair is a banking sector analyst at Meridian Capital. She is forecasting next year's net interest margin for Coastal Bank, a mid-sized regional lender. After reviewing interest rate forecasts and loan book data, she assembles four possible NIM outcomes with their associated probabilities.

Probability	NIM (basis points)
0.27	220
0.33	210
0.16	190
0.24	170

🧠Thinking Flow — Expected value from a discrete probability distribution

The question asks

What single number best summarises Coastal Bank's NIM outcome next year?

Key concept needed

Expected value: the probability-weighted average of all possible outcomes.

Step 1: Check that the distribution is complete

Many candidates skip this step and dive straight into the multiplication. The wrong move is to proceed without confirming the probabilities sum to 1.00, because a missing scenario silently distorts every number that follows. The right move: add the probabilities first. 0.27 + 0.33 + 0.16 + 0.24 = 1.00, so no outcomes are missing and it is safe to proceed.

Step 2: Apply the expected value formula

Multiply each outcome by its probability, then sum all four products.

E(NIM) = (0.27 × 220) + (0.33 × 210) + (0.16 × 190) + (0.24 × 170)
= 59.40 + 69.30 + 30.40 + 40.80
= 199.90 basis points

Step 3: Sanity check

The expected value must fall between the lowest outcome (170 bps) and the highest outcome (220 bps). Our answer is 199.90 bps, which sits inside that range. It does not equal any single outcome in the table, and this is normal: the expected value is a weighted centre of gravity, not a listed outcome.

✓ Answer: E(NIM) = 199.90 basis points.

🧮 BA II Plus Keystrokes

`2ND``CLRWORK`

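Exit any open worksheet → 0.00

Note: the chained entries below assume the calculator is set to AOS mode. In the default Chn mode, operations are evaluated in entry order, so compute each product separately and add the results.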

`.27``×``220``+`

First term: 0.27 × 220 → 59.40

`.33``×``210``+`

Second term: 0.33 × 210 → 69.30

`.16``×``190``+`

Third term: 0.16 × 190 → 30.40

`.24``×``170``=`

Fourth term: 0.24 × 170, then sums all → 199.90

⚠️ Using a simple average, (220 + 210 + 190 + 170) ÷ 4 = 197.50 bps, treats all outcomes as equally likely. This ignores the stated probabilities and gives the wrong answer. Always weight by the stated probabilities before summing.
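As a cross-check on the keystrokes, the same three steps can be sketched in Python. The variable names are illustrative assumptions; the numbers are Priya's table values.

```python
from math import isclose

# Priya's NIM distribution from the table above (basis points).
nim_bps = [220, 210, 190, 170]
probs = [0.27, 0.33, 0.16, 0.24]

# Step 1: the distribution must be complete.
assert isclose(sum(probs), 1.0), "probabilities must sum to 1"

# Step 2: probability-weighted average.
expected_nim = sum(p * x for p, x in zip(probs, nim_bps))

# The trap: an unweighted average treats every outcome as equally likely.
simple_average = sum(nim_bps) / len(nim_bps)

# Step 3: the expected value must lie inside the range of outcomes.
assert min(nim_bps) <= expected_nim <= max(nim_bps)

print(f"E(NIM)         = {expected_nim:.2f} bps")    # 199.90
print(f"Simple average = {simple_average:.2f} bps")  # 197.50
```

The 2.40 bps gap between the two figures is entirely due to the probability weights.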

Worked Example 2

Calculating Variance and Standard Deviation from a Discrete Distribution

Priya's supervisor at Meridian Capital asks her to quantify the dispersion around the 199.90 bps forecast, specifically the variance and standard deviation of Coastal Bank's NIM, so the risk team can set scenario bounds.

🧠Thinking Flow — Variance and standard deviation from a probability distribution

The question asks

How dispersed are the possible NIM outcomes around the expected value of 199.90 bps?

Key concept needed

Variance is the probability-weighted average of squared deviations from E(X). Standard deviation is the positive square root of variance. The mandatory sequence: expected value first, variance second, standard deviation third.

Step 1: Confirm E(NIM) from Worked Example 1

E(NIM) = 199.90 bps. This is the anchor for all deviation calculations.

Step 2: Calculate each squared deviation, weighted by probability

Outcome (bps)	Probability	Deviation	Deviation²	Prob × Deviation²
220	0.27	220 − 199.90 = +20.10	404.01	0.27 × 404.01 = 109.08
210	0.33	210 − 199.90 = +10.10	102.01	0.33 × 102.01 = 33.66
190	0.16	190 − 199.90 = −9.90	98.01	0.16 × 98.01 = 15.68
170	0.24	170 − 199.90 = −29.90	894.01	0.24 × 894.01 = 214.56

Step 3: Sum the weighted squared deviations to get variance

σ²(NIM) = 109.08 + 33.66 + 15.68 + 214.56 = 372.98 basis points²

Summing the unrounded terms gives 372.99; the difference comes only from rounding each term to two decimals and does not change the standard deviation at two decimals. Notice the unit: basis points squared. The squaring step changes the unit, so this number is not directly interpretable alongside the original NIM values.

Step 4: Take the positive square root to get standard deviation

σ(NIM) = √372.98 = 19.31 basis points

The unit is now back to basis points, the same unit as the original NIM data. NIM outcomes typically deviate from the forecast by about 19 basis points.

Step 5: Sanity check

Standard deviation must be (a) positive, (b) in the same units as the data, and (c) smaller than the full range of outcomes (220 − 170 = 50 bps). Our answer is 19.31 bps, so all three conditions hold.

✓ Answer: Variance = 372.98 basis points²; Standard deviation = 19.31 basis points.

🧮 BA II Plus Keystrokes

`2ND``CLRWORK`

Clear any open worksheet → 0.00

`.27``×``(``220``−``199.9``)``x²``+`

First term: 0.27 × (20.10)² = 109.08 → 109.08

`.33``×``(``210``−``199.9``)``x²``+`

Second term: 0.33 × (10.10)² = 33.66 → 33.66

`.16``×``(``190``−``199.9``)``x²``+`

Third term: 0.16 × (9.90)² = 15.68 → 15.68

`.24``×``(``170``−``199.9``)``x²``=`

Fourth term: 0.24 × (29.90)², then sums all → 372.99 (the calculator carries unrounded terms; summing the rounded terms by hand gives 372.98)

`√x`

Take positive square root of the displayed variance → 19.31

⚠️ The most common error on this calculation type is stopping after Step 3 and reporting 372.98 as the "standard deviation." The variance is an intermediate result. Always take the square root as the final step when the question asks for standard deviation.
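The full sequence (expected value, then variance, then standard deviation) can be sketched in Python as a check on the hand calculation. Variable names are illustrative assumptions. Because the code carries unrounded terms, it reports a variance of 372.99 rather than the 372.98 obtained by summing the rounded terms in Step 3; the standard deviation is 19.31 either way.

```python
from math import sqrt

# Priya's NIM distribution (basis points).
nim_bps = [220, 210, 190, 170]
probs = [0.27, 0.33, 0.16, 0.24]

# 1. Expected value: the anchor for every deviation.
expected_nim = sum(p * x for p, x in zip(probs, nim_bps))

# 2. Variance: probability-weighted squared deviations (units: bps squared).
variance_bps2 = sum(p * (x - expected_nim) ** 2 for p, x in zip(probs, nim_bps))

# 3. Standard deviation: positive square root (units: bps again).
std_dev_bps = sqrt(variance_bps2)

print(f"E(NIM)   = {expected_nim:.2f} bps")      # 199.90
print(f"Variance = {variance_bps2:.2f} bps^2")   # 372.99
print(f"Std dev  = {std_dev_bps:.2f} bps")       # 19.31
```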

Worked Example 3

Using the Total Probability Rule with Conditional Expectations

Marcus Osei is a fixed income analyst at Verdant Asset Management. He is estimating the expected recovery on a distressed bond issued by Harlow Logistics. Recovery per USD1 of principal depends on the prevailing economic scenario. There is a 0.75 probability of a moderate economic environment (Scenario M) and a 0.25 probability of a stressed environment (Scenario S). Under Scenario M, recovery is USD0.90 with probability 0.45, or USD0.80 with probability 0.55. Under Scenario S, recovery is USD0.50 with probability 0.85, or USD0.40 with probability 0.15.

🧠Thinking Flow — Total probability rule with conditional expectations

The question asks

What is the overall expected recovery per USD1 of principal, combining both economic scenarios?

Key concept needed

The total probability rule for expected value. Calculate E(recovery | each scenario) first using the conditional probabilities within each scenario. Then weight each conditional expectation by the probability of its scenario. The wrong approach is to ignore the scenario-level probabilities and compute a simple average of all four outcomes: this ignores the 0.75/0.25 split and gives a badly distorted answer.

Step 1: Calculate expected recovery under Scenario M

E(recovery | Scenario M) = (0.45 × USD0.90) + (0.55 × USD0.80) = USD0.405 + USD0.440 = USD0.845 per USD1 principal

Step 2: Calculate expected recovery under Scenario S

E(recovery | Scenario S) = (0.85 × USD0.50) + (0.15 × USD0.40) = USD0.425 + USD0.060 = USD0.485 per USD1 principal

Step 3: Apply the total probability rule

E(recovery) = E(recovery | Scenario M) × P(Scenario M) + E(recovery | Scenario S) × P(Scenario S)
= (USD0.845 × 0.75) + (USD0.485 × 0.25)
= USD0.63375 + USD0.12125
= USD0.755 per USD1 principal

Step 4: Verify by direct calculation (second method)

Compute joint probabilities for all four outcomes, then apply the expected value formula directly.

Outcome	Joint probability	Prob × Outcome
USD0.90	0.75 × 0.45 = 0.3375	0.3375 × 0.90 = 0.30375
USD0.80	0.75 × 0.55 = 0.4125	0.4125 × 0.80 = 0.33000
USD0.50	0.25 × 0.85 = 0.2125	0.2125 × 0.50 = 0.10625
USD0.40	0.25 × 0.15 = 0.0375	0.0375 × 0.40 = 0.01500
Sum	1.0000	0.75500

Both methods give USD0.755, which confirms the result.

Step 5: Sanity check

The overall expected value must fall between USD0.485 (Scenario S) and USD0.845 (Scenario M). USD0.755 sits inside that range and is pulled toward the Scenario M result, which makes sense because Scenario M carries a 0.75 probability weight.

✓ Answer: E(recovery) = USD0.755 per USD1 principal.

🧮 BA II Plus Keystrokes

`2ND``CLRWORK`

Clear any open worksheet → 0.00

`.45``×``.9``+``.55``×``.8``=`

E(recovery, Scenario M) = 0.845 → 0.845

`STO``1`

Save to Memory 1 → 0.845

`.85``×``.5``+``.15``×``.4``=`

E(recovery, Scenario S) = 0.485 → 0.485

`STO``2`

Save to Memory 2 → 0.485

`RCL``1``×``.75``+``RCL``2``×``.25``=`

Total probability rule: 0.845 × 0.75 + 0.485 × 0.25 → 0.755

⚠️ Computing (0.90 + 0.80 + 0.50 + 0.40) ÷ 4 = 0.65, the simple average of the four outcomes, ignores both the within-scenario probabilities and the between-scenario weights. The correct answer of 0.755 is substantially higher than 0.65 because the high-recovery moderate scenario carries a 75% probability.
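Both methods from this example can be sketched side by side in Python. The dictionary layout and variable names are assumptions made for illustration; the probabilities and recoveries are Marcus's inputs.

```python
from math import isclose

# scenario -> (P(scenario), [(P(outcome | scenario), recovery per USD1), ...])
tree = {
    "M": (0.75, [(0.45, 0.90), (0.55, 0.80)]),
    "S": (0.25, [(0.85, 0.50), (0.15, 0.40)]),
}

# Method 1: conditional expectations first, then weight by scenario probability.
conditional_ev = {
    name: sum(p * x for p, x in branches)
    for name, (_, branches) in tree.items()
}
ev_total_rule = sum(p_s * conditional_ev[name] for name, (p_s, _) in tree.items())

# Method 2: joint probabilities for all four outcomes, then a direct weighted sum.
joint = [(p_s * p, x) for p_s, branches in tree.values() for p, x in branches]
ev_direct = sum(p * x for p, x in joint)

# The joint probabilities must sum to 1, and the two methods must agree.
assert isclose(sum(p for p, _ in joint), 1.0)
assert isclose(ev_total_rule, ev_direct)

print(f"E(recovery | M) = {conditional_ev['M']:.3f}")  # 0.845
print(f"E(recovery | S) = {conditional_ev['S']:.3f}")  # 0.485
print(f"E(recovery)     = {ev_total_rule:.3f}")        # 0.755
```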

Worked Example 4

Conditional Variance and Scenario-Specific Risk

Elena Vasquez is a portfolio risk manager at Stonehaven Fund. She is reviewing Marcus's Harlow Logistics bond analysis. Her risk committee requires the standard deviation of recovery under each economic scenario separately, so that scenario-specific tail risk can be assessed independently.

🧠Thinking Flow — Conditional variance and standard deviation

The question asks

What is the standard deviation of recovery under Scenario M, and under Scenario S, separately?

Key concept needed

Conditional variance: variance computed using the conditional probabilities and the conditional expected value for a specific scenario. The formula is identical to the standard variance formula, but every input is the conditional version for that scenario. The wrong approach is to use the unconditional expected value of USD0.755 as the anchor, because that mixes across scenarios and gives a meaningless result for scenario-specific risk.

Step 1: Recall the conditional expected values from Worked Example 3

E(recovery | Scenario M) = USD0.845
E(recovery | Scenario S) = USD0.485

These are the anchors for each scenario's variance calculation.

Step 2: Calculate variance under Scenario M

σ²(recovery | Scenario M) = 0.45 × (0.90 − 0.845)² + 0.55 × (0.80 − 0.845)²
= 0.45 × (0.055)² + 0.55 × (−0.045)²
= 0.45 × 0.003025 + 0.55 × 0.002025
= 0.001361 + 0.001114
= 0.002475 USD²

σ(recovery | Scenario M) = √0.002475 = 0.049749 ≈ USD0.0497 per USD1 principal

Step 3: Calculate variance under Scenario S

σ²(recovery | Scenario S) = 0.85 × (0.50 − 0.485)² + 0.15 × (0.40 − 0.485)²
= 0.85 × (0.015)² + 0.15 × (−0.085)²
= 0.85 × 0.000225 + 0.15 × 0.007225
= 0.000191 + 0.001084
= 0.001275 USD²

σ(recovery | Scenario S) = √0.001275 = 0.035707 ≈ USD0.0357 per USD1 principal

Step 4: Sanity check

Both standard deviations must be positive and smaller than the spread of outcomes within their scenario. Under Scenario M, outcomes are 0.90 and 0.80, a spread of 0.10. Our σ of 0.0497 is just under half that spread, consistent with probabilities close to 50/50 (0.45/0.55). Under Scenario S, outcomes are 0.50 and 0.40, also a spread of 0.10. Our σ of 0.0357 is smaller, consistent with a skewed split (0.85/0.15) that pulls the expected value toward the higher outcome and reduces dispersion.

✓ Answer: σ(recovery | Scenario M) ≈ USD0.0497 per USD1 principal; σ(recovery | Scenario S) ≈ USD0.0357 per USD1 principal.

🧮 BA II Plus Keystrokes

`2ND``CLRWORK`

Clear any open worksheet → 0.00

`.45``×``(``.9``−``.845``)``x²``+``.55``×``(``.8``−``.845``)``x²``=`

Variance under Scenario M → 0.002475

`√x`

Standard deviation under Scenario M → 0.049749

`STO``1`

Store for reference → 0.049749

`.85``×``(``.5``−``.485``)``x²``+``.15``×``(``.4``−``.485``)``x²``=`

Variance under Scenario S → 0.001275

`√x`

Standard deviation under Scenario S → 0.035707

⚠️ Do not use the unconditional expected value of USD0.755 in place of the scenario-specific conditional expected values. Using USD0.755 as the anchor produces deviations like (0.90 − 0.755) and (0.80 − 0.755), which measure distance from the overall forecast, not within-scenario dispersion. Always anchor the variance calculation on the conditional expected value for the scenario being analysed.
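The scenario-by-scenario calculation can be sketched in Python as follows. The helper name `conditional_stats` is a hypothetical choice for this illustration; the inputs are the conditional probabilities and recoveries from the Harlow Logistics example.

```python
from math import sqrt

def conditional_stats(branches):
    """Return (mean, variance, std dev) for one scenario.

    `branches` holds (conditional probability, outcome) pairs, so the
    anchor is the scenario's own conditional expected value.
    """
    mean = sum(p * x for p, x in branches)
    var = sum(p * (x - mean) ** 2 for p, x in branches)
    return mean, var, sqrt(var)

scenario_m = [(0.45, 0.90), (0.55, 0.80)]
scenario_s = [(0.85, 0.50), (0.15, 0.40)]

mean_m, var_m, sd_m = conditional_stats(scenario_m)
mean_s, var_s, sd_s = conditional_stats(scenario_s)

print(f"Scenario M: mean {mean_m:.3f}, variance {var_m:.6f}, sd {sd_m:.6f}")
print(f"Scenario S: mean {mean_s:.3f}, variance {var_s:.6f}, sd {sd_s:.6f}")
```

Each scenario is passed in on its own, so the unconditional USD0.755 never enters the calculation.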

⚠️

Watch out for this

The Variance-as-Standard-Deviation Trap. A candidate who stops after computing the variance and reports that number as the standard deviation gets 372.98 (basis points)², or the equivalent squared figure in their problem's units: a result that is in squared units and, in the NIM example, about nineteen times larger than the correct figure. The correct standard deviation requires one additional step: take the positive square root of the variance, giving 19.31 basis points in the NIM example, a number in the original units and directly comparable to the expected value of 199.90 bps. Candidates make this error because the variance calculation involves many arithmetic steps. Arriving at a numeric result feels like completion. The square root reads as optional tidying rather than a mandatory unit restoration. Before submitting any standard deviation answer, confirm that the unit of your result matches the unit of the original data. If your answer is in basis points squared or dollars squared, you have not finished.

🧠

Memory Aid

FORMULA HOOK

Weight, square the gap, sum, then root: the root is never optional.

Practice Questions · LO1


Q 1 of 6 — REMEMBER

Which of the following best describes the expected value of a random variable in a probability distribution?

CORRECT: B

The expected value multiplies each possible outcome by its probability and sums all those products. It is a forward-looking, probability-weighted centre of gravity for an uncertain quantity. No outcome needs to be the most likely or the most common: the probability weights do the work, and the result can fall between two listed outcomes without equalling either of them.

Why not A? Option A describes the mode of a distribution, not the expected value. The mode is the most frequently occurring value. The expected value does not require any outcome to occur more often than others. It can land on a value that is never actually observed, as in Worked Example 1, where the expected NIM of 199.90 bps does not appear anywhere in Priya's four-row table.

Why not C? Option C describes a simple arithmetic mean, which treats all outcomes as equally likely by dividing by the count of outcomes. The expected value uses probability weights, not equal division. When probabilities differ across outcomes, as they almost always do in investment problems, the arithmetic mean and the expected value diverge. This is one of the most common first mistakes on this topic: computing (x₁ + x₂ + x₃) ÷ 3 instead of p₁x₁ + p₂x₂ + p₃x₃.

---

Q 2 of 6 — UNDERSTAND

An analyst states: "I computed the expected return of a portfolio and got 8.4%. The variance of that portfolio is therefore 8.4% squared, or 70.56 (%)²." Which part of this statement is incorrect, and why?

CORRECT: C

Variance measures how spread out outcomes are around the expected value. The formula is σ² = Σ p(xᵢ) × (xᵢ − E(X))². It requires computing the deviation of each outcome from E(X), squaring those deviations, weighting by probability, and summing. Simply squaring the expected return itself, (8.4%)² = 70.56 (%)², produces a number that is arithmetically defined but financially meaningless as a dispersion measure. It tells you nothing about how far outcomes might deviate from 8.4%.

Why not A? The expected return of 8.4% is not necessarily wrong based on the information given. The analyst's error is in the variance step alone. Dismissing the expected return without evidence is the wrong diagnosis. On exam questions that describe a multi-step error, always identify which specific step failed; do not reject all steps by default.

Why not B? There is no valid shortcut that produces variance by squaring the expected value. Even in the limiting case where every outcome equals the expected value, the variance is exactly zero, not E(X)². Squaring E(X) and squaring the deviations from E(X) are entirely different operations that produce entirely different numbers.

---

Q 3 of 6 — APPLY

Fatima Al-Rashidi is a credit analyst at Pinnacle Investments. She is forecasting recovery rates on a distressed loan under the three scenarios in the table below. What is the expected recovery rate?

Scenario	Probability	Recovery Rate
Optimistic	0.30	85%
Base	0.50	60%
Pessimistic	0.20	35%

CORRECT: A

Apply the expected value formula: E(recovery) = (0.30 × 85%) + (0.50 × 60%) + (0.20 × 35%) = 25.5% + 30.0% + 7.0% = 62.5%. Each outcome is weighted by its stated probability before summing. The result sits between the lowest (35%) and highest (85%) outcomes, and it is pulled above the base case of 60% because the optimistic scenario (85%) carries a 0.30 weight that more than offsets the pessimistic scenario's 0.20 weight.

Why not B? Option B is the base case outcome, 60%, not the expected value. A candidate selecting this option has either assumed the most-probable outcome equals the expected value, or has computed only the base case term (0.50 × 60% = 30%) and misread it as the final answer. The expected value is always a probability-weighted blend of all outcomes, not a single scenario's result. The base case equals the expected value only in the special case where the other scenarios' probability-weighted deviations from it cancel out, which is not true here.

Why not C? Option C (60.2%) likely results from computing a simple arithmetic average without applying probabilities: (85% + 60% + 35%) ÷ 3 = 60.0%, not 60.2%. A candidate arriving at 60.2% may have introduced a small rounding error while treating the outcomes as equally weighted. The core error is the same regardless: equal weighting when the scenarios explicitly carry different probabilities. Always weight each outcome by its stated probability before summing.

---

Q 4 of 6 — APPLY+

Using Fatima's data from Q3, with E(recovery) = 62.5%, the risk team asks for the standard deviation of recovery rates. What is the standard deviation?

Scenario	Probability	Recovery Rate
Optimistic	0.30	85%
Base	0.50	60%
Pessimistic	0.20	35%

CORRECT: C

First compute the variance. Deviations from E(recovery) = 62.5% are: (85 − 62.5) = 22.5, (60 − 62.5) = −2.5, (35 − 62.5) = −27.5. Squared deviations: 506.25, 6.25, 756.25. Weighted by probability: (0.30 × 506.25) + (0.50 × 6.25) + (0.20 × 756.25) = 151.875 + 3.125 + 151.25 = 306.25 (%)². Standard deviation: √306.25 = 17.50%. The unit is now back to percentage points, matching the original recovery rate data.

Why not A? Option A is the variance, 306.25 (%)², not the standard deviation. This is exactly the error the trap box describes: stopping at variance and reporting it as standard deviation. The unit alone reveals the error: (percentage points)² cannot be compared to a recovery rate expressed in percentage points. Standard deviation requires the positive square root as a final mandatory step.

Why not B? Option B (15.42%) results from anchoring the variance calculation on the base case outcome of 60% instead of the expected value of 62.5%. Using 60% as the anchor gives deviations of 25, 0, and −25 for the three scenarios, producing a variance of (0.30 × 625) + (0.50 × 0) + (0.20 × 625) = 312.5 (%)², and a standard deviation of √312.5 ≈ 17.68%. That is not exactly 15.42%, which suggests the candidate using this wrong approach also compounded a further arithmetic error. The correct anchor is always the probability-weighted expected value, not the most central or most probable listed outcome.
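The effect of the anchor choice can be checked with a short Python sketch. The helper name `weighted_sq_dev` is a hypothetical choice for this illustration; the inputs are Fatima's three scenarios.

```python
from math import sqrt

# Fatima's distribution (recovery rates in percent).
recovery_pct = [85, 60, 35]
probs = [0.30, 0.50, 0.20]

def weighted_sq_dev(anchor):
    """Probability-weighted squared deviations around a chosen anchor."""
    return sum(p * (x - anchor) ** 2 for p, x in zip(probs, recovery_pct))

expected = sum(p * x for p, x in zip(probs, recovery_pct))

# Correct: anchor on the expected value.
variance = weighted_sq_dev(expected)
std_dev = sqrt(variance)

# Wrong: anchor on the base case outcome of 60%.
wrong_variance = weighted_sq_dev(60)
wrong_std_dev = sqrt(wrong_variance)

print(f"E(recovery) = {expected:.2f}%")                                       # 62.50%
print(f"Anchored on E(X): variance {variance:.2f}, sd {std_dev:.2f}%")       # 306.25, 17.50%
print(f"Anchored on 60%:  variance {wrong_variance:.2f}, sd {wrong_std_dev:.2f}%")  # 312.50, 17.68%
```

Anchoring on anything other than the expected value always inflates the result, here by (62.5 − 60)² = 6.25.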

---

Q 5 of 6 — ANALYZE

Two analysts are discussing how to report dispersion in a bond recovery rate distribution. Analyst Yuki argues: "Variance is the better measure to report to clients because larger deviations get amplified by the squaring, capturing tail risk more explicitly." Analyst Dmitri argues: "Standard deviation is the better measure to report because its units are interpretable alongside the expected value." Which analyst makes the stronger argument for reporting to clients?

CORRECT: B

Dmitri's argument is stronger for the specific purpose of reporting to clients. Standard deviation is measured in the same units as the underlying data. If recovery rates are in percent, standard deviation is in percent. A client can interpret "expected recovery of 62.5% with a standard deviation of 17.5%" directly. Variance, at 306.25 (%)², carries the same information but must be square-rooted before it communicates anything intuitive. For internal risk modelling and portfolio mathematics, variance has its own advantages, but for client communication, standard deviation is the right tool.

Why not A? Yuki is correct that squaring amplifies larger deviations. This is a real and useful property in certain contexts, particularly when building portfolio risk models where variance terms combine algebraically. But amplification is a feature of how variance is constructed, not a reason to prefer it as the number you hand a client. A client comparing a 62.5% expected recovery to a 306.25 (%)² risk figure cannot make that comparison without doing the square root themselves. Yuki's argument is valid in a modelling context, not a communication context.

Why not C? It is true that variance and standard deviation carry equivalent mathematical information, since each can be derived from the other. But "equally correct" ignores the stated purpose: reporting to clients. For that purpose, units matter enormously. Calling the choice "purely aesthetic" incorrectly implies no practical difference, when in fact a squared-unit figure is not interpretable alongside a percentage-unit expected value without additional computation. The distinction is practical, not cosmetic.

---

Q 6 of 6 — TRAP

Kwame Mensah is a risk analyst at Bridgefield Capital. He is computing the risk of a loan portfolio's annual losses. The probability distribution of annual losses has an expected value of USD18 million. After completing his variance calculation, he arrives at 148.84 (USD million)². Kwame reports to his supervisor: "The standard deviation of annual losses is USD148.84 million." What error has Kwame made, and what is the correct standard deviation?

CORRECT: B

Kwame has stopped one step too early. Variance, 148.84 (USD million)², and standard deviation are not the same number and are not in the same units. Standard deviation is the positive square root of variance: √148.84 = 12.20. The correct standard deviation is USD12.20 million, in the same units as the original loss data (USD millions) and directly comparable to the expected loss of USD18 million. Kwame's reported figure is really a variance in (USD million)²: it is not interpretable as a risk measure, and as a stated standard deviation it is more than twelve times too large.

Why not A? The choice of anchor for the variance calculation is a separate issue from the square root step. The expected value (mean) is the correct anchor for computing variance from a probability distribution; using the median instead would produce a different metric that is not the standard variance. Even if Kwame had used the correct anchor, he would still arrive at the variance, not the standard deviation. Replacing the mean with the median does not fix the error Kwame made.

Why not C? The magnitude of the underlying data has no bearing on whether the square root step is required. Variance is always in squared units, regardless of whether losses are USD1 million or USD1 billion. A large expected loss does not make the variance interpretable as a standard deviation. The unit mismatch, (USD million)² versus USD million, is always present, always requires the square root to resolve, and is never made acceptable by the scale of the numbers involved. Accepting a squared-unit figure as a standard deviation because the underlying numbers are large is precisely the cognitive error the trap box names.

---

Glossary

total probability rule

A method for calculating an overall probability or expected value by breaking a problem into scenarios, computing the result within each scenario, then weighting and summing those results by the probability of each scenario. If you want to know the expected recovery on a bond, you might calculate expected recovery under a boom scenario and under a recession scenario, then combine them weighted by each scenario's probability.

conditional expectations

The expected value of an outcome given that a specific scenario or condition has occurred. Think of it as a forecast that assumes one particular thing is already true. For example, expected NIM if interest rates rise is conditional on that rate environment materialising. Different conditions produce different conditional expectations.

probability trees

A diagram showing how different starting scenarios branch into different outcomes, with probabilities marked on each branch. Like a decision tree, but for uncertain events: each branch represents a path the future might take, and the probabilities tell you how likely each path is.

expected value

The probability-weighted average of all possible outcomes of an uncertain quantity. Multiply each outcome by its probability of occurring, then add up all those products. If a bond has a 60% chance of returning USD80 and a 40% chance of returning USD50, the expected value is (0.60 × 80) + (0.40 × 50) = USD68.

random variable

A quantity whose future value is unknown because it depends on events that have not yet happened. Tomorrow's stock price, next quarter's earnings, or next year's net interest margin are all random variables: each could take one of several values depending on which scenario occurs.

variance

A measure of how spread out possible outcomes are around the expected value, calculated by squaring each deviation from the expected value and weighting by probability. Because of the squaring step, variance is always in squared units: if outcomes are in percent, variance is in (percent)², making it difficult to interpret directly alongside the expected value.

standard deviation

The positive square root of variance. It converts variance back into the original units of measurement, making it directly comparable to the expected value. If possible returns are in percent, standard deviation is also in percent.

basis points

One basis point equals 0.01% (one hundredth of a percent). A move from a net interest margin of 2.00% to 2.20% is a move of 20 basis points, often written as 20 bps. Used because it avoids ambiguity when comparing percentage changes in rates that are already expressed as percentages.

net interest margin

The difference between the interest income a bank earns on its loans and the interest it pays on its deposits, expressed as a percentage or in basis points. Think of it as the bank's lending profit spread: the gap between what it charges borrowers and what it pays depositors.

probability distribution

A complete list of all possible outcomes of an uncertain quantity, paired with the probability of each outcome occurring. The probabilities must sum to exactly 1.0. A distribution answers the question: what could happen, and how likely is each possibility?

total probability rule for expected value

A method for computing an overall expected value when outcomes depend on which scenario occurs. First calculate the expected value within each scenario (the conditional expectation). Then weight those conditional expectations by the probability of each scenario and sum them. The result is the unconditional expected value, valid regardless of which scenario actually occurs.

conditional probabilities

The probability of an outcome given that a specific scenario or condition is true. If you ask "what is the probability that recovery is USD0.90?" you get the unconditional answer. If you ask "what is the probability that recovery is USD0.90, given that the economy is in moderate shape?" you get a conditional probability, narrowed to one branch of the probability tree.

Quantitative Methods · Probability Trees and Conditional Expectations · LO 2 of 3

Why does a bank's expected earnings change when you find out interest rates will rise?

Learn to build a probability tree, calculate conditional expectations within each scenario, and use the total probability rule to aggregate them into one forecast.

⏱ 8min-15min

6 questions

HIGH PRIORITY · ANALYZE

INSIGHT

A probability tree is not a new calculation technique. It is a visual filing system. It separates what you know unconditionally (scenarios that may or may not occur) from what you know conditionally (if this scenario is true, these outcomes are possible). Once you have organised the information this way, one scenario per branch and one outcome per sub-branch, the calculation is mechanical. Multiply along the branches to get unconditional probabilities. Build conditional expected values within each scenario. Weight and sum. The tree makes invisible dependencies visible, which prevents the most common error: forgetting which probabilities apply only to a single branch.

How to Structure a Probability Problem Before You Calculate Anything

Think about how a weather forecaster organises information. She does not say "there is a 60% chance of sun and also a 70% chance of warm temperatures" without specifying which conditions depend on which. She separates what happens first (a high-pressure system moves in or it does not) from what follows if each case occurs (sunny or cloudy, warm or cool).

A probability tree does the same thing for investment scenarios. It forces you to separate two questions that candidates constantly mix up: "what happens in a specific scenario?" and "what happens overall?" Keeping those two questions separate is the entire skill this LO tests.

Probability Trees and Conditional Expectations

Probability tree diagram. A branching diagram where each fork represents a scenario and each branch endpoint represents a possible outcome. Use it to organise complex multi-scenario problems before computing anything.

Scenario. A mutually exclusive and exhaustive event that determines which branch of the tree applies, such as "interest rates rise" or "economic expansion occurs." Identify the scenarios first, assign each a probability, and confirm those probabilities sum to 1.

Conditional probability. The probability of an outcome given that a specific scenario has already occurred. These probabilities live inside a single branch and sum to 1 within that branch only, not across the whole tree.

Unconditional probability. The probability of an outcome without restricting to any scenario. Compute it by multiplying the scenario probability by the conditional probability along that path: P(outcome) = P(scenario) × P(outcome | scenario).

Conditional expected value. The weighted average of outcomes within one scenario, using that scenario's conditional probabilities as weights. Compute one conditional expected value for each scenario branch, then combine them using the total probability rule.

Total probability rule for expected value. The formula that combines conditional expected values into a single overall expected value. Weight each scenario's conditional expected value by the probability of that scenario occurring and sum all terms. This is the final aggregation step, the one candidates most often skip.

FORWARD REFERENCE

Unconditional expected value: what you need for this LO only

The overall expected value E(X) = Σ P(Xi) × Xi, computed directly from a single set of outcomes and their unconditional probabilities, was introduced in Learning Module 4.1. For this LO, you need to recognise that this same overall expected value can be reached two ways: directly from unconditional probabilities, or via the total probability rule using conditional expected values. Both routes must give the same answer. Verifying that they do is the check that confirms your tree is correct. Full treatment: Quantitative Methods, Learning Module 4.1.

Building the Tree: Structure Before Numbers

The wrong move is to immediately multiply the numbers you are given. When a problem has four or six outcomes spread across two scenarios, that shortcut assigns probabilities to the wrong outcomes. Build the structure first, then fill in the numbers.

Worked Example 1

Building a Probability Tree from Scratch

Priya Mehta is a credit analyst at Solaris Capital, evaluating a defaulted corporate bond issued by Meridian Logistics. Recovery depends on which of two legal rulings prevails. Scenario 1 (probability 0.75): the court enforces full collateral, giving a 0.45 probability of recovering USD0.90 per USD1 of principal or a 0.55 probability of recovering USD0.80. Scenario 2 (probability 0.25): the court applies a haircut, giving a 0.85 probability of recovering USD0.50 per USD1 or a 0.15 probability of recovering USD0.40. Priya must structure this as a probability tree and identify every unconditional probability before calculating any expected value.

🧠 Thinking Flow: Building the tree, assigning probabilities to every node

The question asks

How do we translate a word problem with two scenarios and two outcomes each into a formal probability tree with correctly labelled conditional and unconditional probabilities?

Key concept needed

Probability tree diagram. Many candidates immediately jump to multiplying numbers without first drawing the structure. That shortcut works when the tree is tiny. It systematically causes wrong probability assignments once the tree has more than two branches.

Step 1: Identify the structure

Many candidates treat all four outcomes (USD0.90, USD0.80, USD0.50, USD0.40) as directly comparable events with a flat probability each. That is wrong. These outcomes are nested inside scenarios. The correct structure has two levels: scenario first, outcome second. Draw two branches from the root: Scenario 1 (P = 0.75) and Scenario 2 (P = 0.25). Confirm these sum to 1.00. ✓

Step 2: Populate the conditional branches

From the Scenario 1 node, draw two sub-branches: USD0.90 with P(USD0.90 | Scenario 1) = 0.45, and USD0.80 with P(USD0.80 | Scenario 1) = 0.55. Confirm: 0.45 + 0.55 = 1.00. ✓ From the Scenario 2 node, draw two sub-branches: USD0.50 with P(USD0.50 | Scenario 2) = 0.85, and USD0.40 with P(USD0.40 | Scenario 2) = 0.15. Confirm: 0.85 + 0.15 = 1.00. ✓ These four probabilities are conditional probabilities. They live inside a single branch and describe what happens only if that scenario occurs.

Step 3: Compute all four unconditional probabilities by multiplying along each path

- P(USD0.90) = 0.75 × 0.45 = 0.3375
- P(USD0.80) = 0.75 × 0.55 = 0.4125
- P(USD0.50) = 0.25 × 0.85 = 0.2125
- P(USD0.40) = 0.25 × 0.15 = 0.0375

Step 4: Sanity check. Do the four unconditional probabilities sum to 1?

0.3375 + 0.4125 + 0.2125 + 0.0375 = 1.0000. ✓ If this sum is not exactly 1.00, either a scenario probability or a conditional probability was entered incorrectly.

✓ Answer: The four unconditional probabilities are 0.3375, 0.4125, 0.2125, and 0.0375. They sum to 1.00, confirming the tree is internally consistent.
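The tree-building steps can also be sketched in a few lines of Python. This is a minimal illustration that assumes the tree is stored as a nested dictionary; the variable names and layout are choices made for this example, not part of the curriculum.

```python
# Probability tree for the Meridian Logistics bond (figures from Worked Example 1).
# The nested-dict layout is an illustrative assumption, not a prescribed format.
tree = {
    "Scenario 1": {"p": 0.75, "outcomes": {0.90: 0.45, 0.80: 0.55}},
    "Scenario 2": {"p": 0.25, "outcomes": {0.50: 0.85, 0.40: 0.15}},
}

# Scenario probabilities must sum to 1 across the root branches.
assert abs(sum(s["p"] for s in tree.values()) - 1.0) < 1e-9

unconditional = {}
for name, scenario in tree.items():
    # Conditional probabilities must sum to 1 inside each branch.
    assert abs(sum(scenario["outcomes"].values()) - 1.0) < 1e-9
    for recovery, p_cond in scenario["outcomes"].items():
        # Multiply along the path: P(outcome) = P(scenario) * P(outcome | scenario).
        unconditional[recovery] = scenario["p"] * p_cond

for recovery, p in unconditional.items():
    print(f"P(USD{recovery:.2f}) = {p:.4f}")
print(f"Sum = {sum(unconditional.values()):.4f}")
```

The two assertions mirror the structure checks in Steps 1 and 2, and the final printed sum mirrors the sanity check on the four unconditional probabilities.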

Calculating Expected Values Within Each Scenario

Now that the tree is correctly built, calculate what Priya expects to recover in each legal scenario separately. This is the conditional expected value step. It uses only the information inside one branch at a time.

Worked Example 2

Calculating Conditional Expected Values Within Each Scenario

Continuing the Meridian Logistics bond from Worked Example 1, Priya now needs to calculate the expected recovery per USD1 of principal separately for each legal scenario. She will report these conditional expected values to her portfolio manager before any aggregation is done.

🧠 Thinking Flow: Conditional expected values, one scenario at a time

The question asks

What is E(recovery | Scenario 1) and E(recovery | Scenario 2) for the Meridian Logistics bond?

Key concept needed

Conditional expected value. A common wrong move is using the unconditional probabilities (0.3375, 0.4125, etc.) in place of the conditional ones (0.45, 0.55, etc.). This gives a different number and answers the wrong question. It mixes scenario-level information with branch-level information.

Step 1: Identify which probabilities belong inside Scenario 1

The conditional probabilities inside Scenario 1 are 0.45 for USD0.90 and 0.55 for USD0.80. These are the correct weights. The unconditional probabilities 0.3375 and 0.4125 are not used here.

Step 2: Calculate E(recovery | Scenario 1)

E(recovery | S1) = 0.45 × USD0.90 + 0.55 × USD0.80 = USD0.405 + USD0.440 = USD0.845

Step 3: Identify which probabilities belong inside Scenario 2

The conditional probabilities inside Scenario 2 are 0.85 for USD0.50 and 0.15 for USD0.40.

Step 4: Calculate E(recovery | Scenario 2)

E(recovery | S2) = 0.85 × USD0.50 + 0.15 × USD0.40 = USD0.425 + USD0.060 = USD0.485

Step 5: Sanity check. Are both conditional expected values within the range of their scenario's outcomes?

Scenario 1 outcomes span USD0.80 to USD0.90. E(recovery | S1) = USD0.845. It falls between the two. ✓ Scenario 2 outcomes span USD0.40 to USD0.50. E(recovery | S2) = USD0.485. It falls between the two, closer to USD0.50 because the 0.85 weight pulls it upward. ✓ A conditional expected value that falls outside its scenario's outcome range signals an arithmetic error. Check it immediately.

✓ Answer: E(recovery | S1) = USD0.845. E(recovery | S2) = USD0.485.
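For readers who want to check the arithmetic away from the calculator, here is a small Python sketch of the conditional expected value step. The helper name `conditional_expectation` is hypothetical and introduced only for this illustration.

```python
def conditional_expectation(outcomes):
    """Probability-weighted average within one branch.

    `outcomes` maps each outcome to its conditional probability
    (the weights that live inside that branch only).
    """
    return sum(value * p for value, p in outcomes.items())


# Conditional probabilities from Worked Example 1, one dict per branch.
scenario_1 = {0.90: 0.45, 0.80: 0.55}
scenario_2 = {0.50: 0.85, 0.40: 0.15}

e_s1 = conditional_expectation(scenario_1)
e_s2 = conditional_expectation(scenario_2)

# Range check: each conditional expectation must lie inside its own branch's outcomes.
assert min(scenario_1) <= e_s1 <= max(scenario_1)
assert min(scenario_2) <= e_s2 <= max(scenario_2)

print(f"E(recovery | S1) = USD{e_s1:.3f}")
print(f"E(recovery | S2) = USD{e_s2:.3f}")
```

Note that the function never sees the scenario probabilities 0.75 and 0.25. That is deliberate: inside a branch, only the conditional weights apply.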

Aggregating the Scenarios: The Step Candidates Most Often Skip

Priya now has two numbers: USD0.845 and USD0.485. These are inputs to the final step, not the final answer. The total probability rule connects the scenario-level picture to the overall picture.

Worked Example 3

Applying the Total Probability Rule to Find Overall Expected Recovery

Priya has her two conditional expected values. Her portfolio manager now asks for a single number: what is Solaris Capital's overall expected recovery per USD1 of principal on the Meridian Logistics bond, taking into account both the probability of each legal scenario and the expected outcome within each one?

🧠 Thinking Flow: Total probability rule, the aggregation step

The question asks

What is E(recovery) for the Meridian Logistics bond using the total probability rule?

Key concept needed

Total probability rule for expected value. The conditional expected values from Worked Example 2 are not the answer to this question. They are inputs into this step. Missing this step is the single most common error on exam questions about probability trees.

Step 1: State the formula

E(X) = E(X | S1) × P(S1) + E(X | S2) × P(S2)

This weights each scenario's conditional expected value by the probability of that scenario occurring.

Step 2: Substitute the values from Worked Examples 1 and 2

E(recovery) = USD0.845 × 0.75 + USD0.485 × 0.25

Step 3: Calculate each term

First term: USD0.845 × 0.75 = USD0.63375
Second term: USD0.485 × 0.25 = USD0.12125

Step 4: Sum the terms

E(recovery) = USD0.63375 + USD0.12125 = USD0.755

Step 5: Verify using unconditional probabilities (the second calculation path)

The unconditional probabilities from Worked Example 1 were 0.3375, 0.4125, 0.2125, and 0.0375.

E(recovery) = 0.3375(USD0.90) + 0.4125(USD0.80) + 0.2125(USD0.50) + 0.0375(USD0.40) = USD0.30375 + USD0.33000 + USD0.10625 + USD0.01500 = USD0.755 ✓

Both calculation paths give the same answer. This is the consistency check. If the two methods disagree, either a conditional probability or a scenario probability was assigned incorrectly. Go back to the tree.

Step 6: Sanity check. Does the overall expected value sit between the two conditional expected values?

USD0.755 lies between E(recovery | S1) = USD0.845 and E(recovery | S2) = USD0.485. ✓ It is closer to USD0.845 because Scenario 1 has the higher probability (0.75 vs 0.25). A weighted average must always fall between its components and lean toward the more heavily weighted one.

✓ Answer: E(recovery) = USD0.755 per USD1 of principal. Confirmed by both the total probability rule path and the unconditional probability path.
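The two calculation paths can be compared side by side in code. The sketch below is illustrative only; the dictionary names are assumptions made for this example, and the figures are those of the Meridian Logistics bond.

```python
import math

# Scenario probabilities and conditional distributions from Worked Example 1.
p_scenario = {"S1": 0.75, "S2": 0.25}
conditional = {
    "S1": {0.90: 0.45, 0.80: 0.55},
    "S2": {0.50: 0.85, 0.40: 0.15},
}

# Path 1: total probability rule (conditional expectations, then weight by scenario).
cond_ev = {
    s: sum(x * p for x, p in branch.items())
    for s, branch in conditional.items()
}
ev_rule = sum(p_scenario[s] * cond_ev[s] for s in p_scenario)

# Path 2: unconditional probabilities applied directly to each terminal outcome.
ev_direct = sum(
    p_scenario[s] * p * x
    for s, branch in conditional.items()
    for x, p in branch.items()
)

# The two paths must agree (up to floating-point rounding).
assert math.isclose(ev_rule, ev_direct)
print(f"E(recovery) = USD{ev_rule:.3f}")
```

If the assertion ever fails on a tree you have typed in yourself, the likely cause is the same as on paper: a probability attached to the wrong branch.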

Measuring Risk Within a Scenario: Conditional Variance

FORWARD REFERENCE

Conditional variance: what you need for this LO only

Conditional variance measures how dispersed the outcomes are within a single scenario branch. It is computed exactly like ordinary variance, but restricted to one branch: use the conditional probabilities as weights and subtract the conditional expected value (not the overall expected value) as the mean. For this LO, compute and interpret conditional variance within one named scenario. You will not be asked to reconcile conditional variances back to an overall unconditional variance. That relationship requires the law of total variance and is covered in later modules.

Priya's portfolio manager knows the overall expected recovery is USD0.755. But he also wants to know how uncertain the outcome is within each legal scenario. That is a conditional variance question.

Worked Example 4

Conditional Variance: Measuring Risk Within One Scenario

Priya's portfolio manager asks her to compute the variance of recovery outcomes under Scenario 1 (full-collateral enforcement) and Scenario 2 (haircut), so he can compare the relative risk of each legal path independently.

🧠 Thinking Flow: Conditional variance, dispersion inside a single scenario branch

The question asks

What are σ²(recovery | Scenario 1) and σ²(recovery | Scenario 2) for the Meridian Logistics bond?

Key concept needed

Conditional variance. The most common wrong move is subtracting the overall E(recovery) = USD0.755 instead of the scenario-specific conditional expected value. This gives a numerically different and conceptually wrong result.

Step 1: Recall the conditional expected values

E(recovery | S1) = USD0.845 (from Worked Example 2). E(recovery | S2) = USD0.485 (from Worked Example 2).

Step 2: Compute σ²(recovery | Scenario 1)

Formula: σ²(X | S) = Σ P(Xi | S) × [Xi − E(X | S)]²

Use only the Scenario 1 branch outcomes and their conditional probabilities.

Term 1: P(USD0.90 | S1) × (USD0.90 − USD0.845)² = 0.45 × (0.055)² = 0.45 × 0.003025 = 0.0013613
Term 2: P(USD0.80 | S1) × (USD0.80 − USD0.845)² = 0.55 × (−0.045)² = 0.55 × 0.002025 = 0.0011138
σ²(recovery | S1) = 0.0013613 + 0.0011138 = 0.002475

Step 3: Compute σ²(recovery | Scenario 2)

Term 1: P(USD0.50 | S2) × (USD0.50 − USD0.485)² = 0.85 × (0.015)² = 0.85 × 0.000225 = 0.0001913
Term 2: P(USD0.40 | S2) × (USD0.40 − USD0.485)² = 0.15 × (−0.085)² = 0.15 × 0.007225 = 0.0010838
σ²(recovery | S2) = 0.0001913 + 0.0010838 = 0.001275

Step 4: Compare and interpret

σ²(recovery | S1) = 0.002475, so σ(recovery | S1) = √0.002475 ≈ USD0.0497. σ²(recovery | S2) = 0.001275, so σ(recovery | S2) = √0.001275 ≈ USD0.0357. Scenario 1 has higher conditional variance. Both scenarios have a USD0.10 gap between their outcomes. The difference in variance comes from the probability split: Scenario 1 is nearly even (0.45 vs 0.55), which spreads weight across both outcomes. Scenario 2 is strongly skewed (0.85 vs 0.15), which concentrates weight near one outcome and reduces dispersion.

Step 5: Sanity check. Are both conditional standard deviations smaller than the outcome gap?

The outcome gap in each scenario is USD0.10. Both σ values (USD0.0497 and USD0.0357) are smaller than USD0.10. ✓ For a two-outcome distribution, the standard deviation cannot exceed half the gap between the outcomes (USD0.05 here), a limit reached only with a 50/50 split. If your result exceeds that limit, recheck the mean you subtracted.

Scope note

Do not attempt to reconcile these conditional variances back to an overall unconditional variance. That step is beyond this LO.

✓ Answer: σ²(recovery | S1) = 0.002475 (σ ≈ USD0.0497). σ²(recovery | S2) = 0.001275 (σ ≈ USD0.0357). Scenario 1 carries greater within-scenario risk because its outcome probabilities are more evenly balanced.
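A short Python sketch of the same conditional variance calculation follows. The function name `conditional_variance` and the `branches` dictionary are hypothetical names chosen for this illustration.

```python
import math


def conditional_variance(outcomes):
    """Return (mean, variance) for one branch.

    `outcomes` maps each outcome to its conditional probability. Deviations are
    measured from that branch's own conditional mean, not the overall mean.
    """
    mean = sum(x * p for x, p in outcomes.items())
    variance = sum(p * (x - mean) ** 2 for x, p in outcomes.items())
    return mean, variance


# Conditional distributions for the Meridian Logistics bond.
branches = {
    "Scenario 1": {0.90: 0.45, 0.80: 0.55},
    "Scenario 2": {0.50: 0.85, 0.40: 0.15},
}

results = {}
for name, outcomes in branches.items():
    mean, variance = conditional_variance(outcomes)
    results[name] = (mean, variance, math.sqrt(variance))
    print(f"{name}: mean = {mean:.3f}, variance = {variance:.6f}, "
          f"sd = {math.sqrt(variance):.4f}")
```

Because the mean is computed inside the function from the branch's own probabilities, it is not possible to subtract the overall expected value of USD0.755 by accident, which is the error this example warns about.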

Telling Conditional and Unconditional Probabilities Apart Under Pressure

Worked Example 5

Distinguishing Conditional from Unconditional Probabilities

Tariq Osei is a junior analyst at Northgate Asset Management studying Priya's Meridian Logistics report. He notices that the tree shows 0.45 next to the USD0.90 recovery outcome and 0.3375 in a separate column. His supervisor asks: "Which probability would you use if you wanted to know the chance of recovering USD0.90 given that Scenario 1 has already occurred? And which would you use if you wanted to know the overall chance of recovering USD0.90 before we know which scenario prevails?"

🧠 Thinking Flow: Conditional vs unconditional, which probability answers which question

The question asks

What is the difference between P(USD0.90 | Scenario 1) = 0.45 and P(USD0.90) = 0.3375, and when does each apply?

Key concept needed

The distinction between conditional probability and unconditional probability. Many candidates treat these interchangeably under time pressure. They use a conditional probability where an unconditional one is required, or vice versa, and arrive at a plausible-looking number that answers the wrong question.

Step 1: Identify the question being asked in each case

Question A: "Given that Scenario 1 has already occurred, what is the probability of USD0.90 recovery?" This restricts the universe to Scenario 1 only. The answer is the conditional probability: P(USD0.90 | S1) = 0.45. This number lives inside the Scenario 1 branch and sums to 1.00 only with the other outcomes in that branch.

Question B: "Before we know which scenario prevails, what is the overall probability of USD0.90 recovery?" This looks across the whole tree. The answer is the unconditional probability: P(USD0.90) = 0.75 × 0.45 = 0.3375. It is computed by multiplying along the path from root to terminal node.

Step 2: Apply the signal words

The phrase "given that" or "if Scenario 1 occurs" signals a conditional probability question. Use only the probabilities within that branch. The phrase "overall" or "before we know which scenario", or a question about an outcome with no condition attached, signals an unconditional probability question. Multiply along the path from root to terminal node.

Step 3: Sanity check. Which sets of numbers sum to 1.00?

The conditional probabilities within Scenario 1 (0.45 and 0.55) sum to 1.00. ✓ The conditional probabilities within Scenario 2 (0.85 and 0.15) sum to 1.00. ✓ The four unconditional probabilities (0.3375 + 0.4125 + 0.2125 + 0.0375) sum to 1.00. ✓ The conditional probabilities from different branches combined do not sum to 1.00 (0.45 + 0.85 = 1.30, which is meaningless). If a set of probabilities you are working with sums above 1.00, you have mixed conditional probabilities from separate branches.

✓ Answer: Use 0.45 (conditional) when the scenario is known or fixed. Use 0.3375 (unconditional) when no scenario has been specified. These two numbers answer two fundamentally different questions and must never be substituted for each other.

⚠️

Watch out for this

The Conditional-Stop Trap: computing E(X|S) and forgetting to aggregate. A candidate who computes E(recovery | Scenario 1) = USD0.845 and E(recovery | Scenario 2) = USD0.485 but reports only the Scenario 1 result gets USD0.845, not the correct overall expected recovery of USD0.755. A candidate who averages the two conditional expectations with equal weights gets (USD0.845 + USD0.485) / 2 = USD0.665, also wrong. The correct approach applies the total probability rule: E(recovery) = USD0.845 × 0.75 + USD0.485 × 0.25 = USD0.755. Candidates make this error because they treat the conditional expected value as the final answer, assuming the calculation is complete once each scenario's expectation is found, when the formula actually requires one more step: weighting those conditional expectations by the probability of each scenario before summing. Before submitting any expected value answer on a probability tree question, verify that you have multiplied each conditional expected value by its scenario probability and summed all terms.
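To see the three candidate answers next to each other, here is a minimal Python sketch. The variable names are illustrative assumptions; the inputs are the conditional expected values and scenario probabilities from the worked examples.

```python
# Inputs from Worked Examples 1 and 2.
p_s1, p_s2 = 0.75, 0.25
e_s1, e_s2 = 0.845, 0.485

# Error 1: stop at one conditional expectation and report it as the answer.
conditional_stop = e_s1

# Error 2: average the conditional expectations with equal weights.
equal_weight = (e_s1 + e_s2) / 2

# Correct: weight each conditional expectation by its scenario probability.
total_probability = p_s1 * e_s1 + p_s2 * e_s2

print(f"Conditional-stop answer:  USD{conditional_stop:.3f}")
print(f"Equal-weight average:     USD{equal_weight:.3f}")
print(f"Total probability rule:   USD{total_probability:.3f}")
```

Only the last line uses both the conditional expected values and the scenario probabilities, which is the final check recommended before submitting an answer.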

🧠

Memory Aid

CONTRAST ANCHOR

Conditional expectations describe what happens inside one branch. The total probability rule is what connects all branches into a single answer.

Practice Questions · LO2

Q 1 of 6 — REMEMBER

In a probability tree, conditional probabilities within a single scenario branch must:

CORRECT: B

Conditional probabilities describe what can happen given that a specific scenario has already occurred. Because the scenario is fixed, the outcomes within it are exhaustive and mutually exclusive for that branch. They must account for all possibilities within that one scenario, so they sum to 1.00 inside the branch. They tell us nothing about outcomes in other branches.

Why not A? Summing conditional probabilities across all branches produces a meaningless number. If Scenario 1 has conditional probabilities 0.45 and 0.55, and Scenario 2 has 0.85 and 0.15, adding all four gives 2.00, which cannot be a valid probability sum. Each branch is its own universe. The conditional probabilities within it are not comparable to those in another branch.

Why not C? Unconditional probabilities are computed by multiplying the scenario probability by the conditional probability along each path. They live at the terminal nodes and represent the overall chance of each outcome before any scenario is known. Saying conditional probabilities equal unconditional probabilities confuses the two levels of the tree entirely. P(USD0.90 | S1) = 0.45 and P(USD0.90) = 0.3375 are different numbers answering different questions.

---

Q 2 of 6 — UNDERSTAND

An analyst is working with a two-scenario probability tree. She has correctly computed E(earnings | Scenario A) = USD8.00 and E(earnings | Scenario B) = USD3.00. She reports USD8.00 as the overall expected earnings because Scenario A is the more likely scenario. Which statement best describes her reasoning?

CORRECT: C

The total probability rule states that E(X) = E(X|S_A) × P(S_A) + E(X|S_B) × P(S_B). Both conditional expected values contribute to the overall expected value, weighted by how likely each scenario is. Even if Scenario A is far more probable, Scenario B's contribution does not vanish. It is simply scaled down by its lower probability. Reporting only E(X|S_A) skips the aggregation step entirely and is never correct unless P(S_B) = 0.

Why not A? No rule in probability says the overall expected value equals the conditional expected value of the most probable scenario. That reasoning would produce the correct answer only in the degenerate case where P(S_A) = 1.00 and Scenario B cannot occur. If P(S_A) = 1.00, there is no tree to draw.

Why not B? There is no threshold probability above which the conditional expected value of one scenario approximates the overall expected value well enough to report as exact. Even at P(S_A) = 0.90, the contribution of Scenario B is 0.10 × USD3.00 = USD0.30, which shifts the true answer to USD7.50, a meaningful difference from USD8.00 on most investment decisions.

---

Q 3 of 6 — APPLY

Valentina Cruz is a portfolio manager at Dunebrook Investments analysing an emerging-market bond. She constructs a probability tree with two scenarios:

Scenario	P(Scenario)	Outcome 1	P(Outcome 1 | Scenario)	Outcome 2	P(Outcome 2 | Scenario)
Stable macro	0.60	USD0.85	0.70	USD0.65	0.30
Stressed macro	0.40	USD0.55	0.50	USD0.35	0.50

What is the overall expected recovery per USD1 of principal?

CORRECT: A

Apply the total probability rule in two steps. First, compute the conditional expected value within each scenario: E(recovery | Stable) = 0.70 × USD0.85 + 0.30 × USD0.65 = USD0.595 + USD0.195 = USD0.790. E(recovery | Stressed) = 0.50 × USD0.55 + 0.50 × USD0.35 = USD0.275 + USD0.175 = USD0.450. Then weight by scenario probabilities: E(recovery) = 0.60 × USD0.790 + 0.40 × USD0.450 = USD0.474 + USD0.180 = USD0.654. Verify using unconditional probabilities: P(USD0.85) = 0.60 × 0.70 = 0.42; P(USD0.65) = 0.60 × 0.30 = 0.18; P(USD0.55) = 0.40 × 0.50 = 0.20; P(USD0.35) = 0.40 × 0.50 = 0.20. E = 0.42(0.85) + 0.18(0.65) + 0.20(0.55) + 0.20(0.35) = 0.357 + 0.117 + 0.110 + 0.070 = USD0.654. ✓

Why not B? USD0.620 results from averaging the two conditional expected values with equal weights: (USD0.790 + USD0.450) / 2 = USD0.620. This treats both scenarios as equally likely, ignoring the actual scenario probabilities of 0.60 and 0.40. The total probability rule requires weighting by scenario probability, not simple averaging.

Why not C? USD0.700 sits too close to the Stable scenario's conditional expected value of USD0.790, which suggests the candidate gave the Stable scenario more than its 0.60 weight. It is not the simple average of the four raw outcomes either: (0.85 + 0.65 + 0.55 + 0.35) / 4 = USD0.600, which is a separate error. Neither procedure is correct. The total probability rule requires both the conditional expected values and the correct scenario probability weights.
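The correct figure and the two mechanical errors named in these explanations can be reproduced with a short Python sketch. The tuple-and-dictionary layout below is an illustrative assumption; the numbers come from the question's table.

```python
# Figures from the Q3 table. Each scenario maps to
# (P(scenario), {outcome: P(outcome | scenario)}).
scenarios = {
    "Stable macro": (0.60, {0.85: 0.70, 0.65: 0.30}),
    "Stressed macro": (0.40, {0.55: 0.50, 0.35: 0.50}),
}

# Conditional expected value inside each branch.
cond_ev = {
    name: sum(x * p for x, p in branch.items())
    for name, (_, branch) in scenarios.items()
}

# Correct: total probability rule.
correct = sum(p_s * cond_ev[name] for name, (p_s, _) in scenarios.items())

# Error: equal-weight average of the conditional expected values.
equal_weight = sum(cond_ev.values()) / len(cond_ev)

# Error: simple average of the four raw outcomes, ignoring all probabilities.
raw_outcomes = [x for _, branch in scenarios.values() for x in branch]
raw_average = sum(raw_outcomes) / len(raw_outcomes)

print(f"Total probability rule:      USD{correct:.3f}")
print(f"Equal-weight average of E|S: USD{equal_weight:.3f}")
print(f"Simple average of outcomes:  USD{raw_average:.3f}")
```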

---

Q 4 of 6 — APPLY+

Marcus Webb is a fixed income analyst at Cairnfield Partners evaluating a distressed loan. The borrower's repayment depends on whether refinancing succeeds. He builds the following probability tree:

Scenario	P(Scenario)	Recovery	Conditional probability
Refinancing succeeds	0.55	USD80,000	0.60
Refinancing succeeds	0.55	USD60,000	0.40
Refinancing fails	0.45	USD40,000	0.75
Refinancing fails	0.45	USD20,000	0.25

Marcus's supervisor asks for the conditional variance under the Refinancing Fails scenario only. What is σ²(recovery | Refinancing Fails)?

CORRECT: C

First, compute E(recovery | Fails) = 0.75 × USD40,000 + 0.25 × USD20,000 = USD30,000 + USD5,000 = USD35,000. Then compute the conditional variance using only the Fails branch outcomes and their conditional probabilities: σ²(X|Fails) = 0.75 × (USD40,000 − USD35,000)² + 0.25 × (USD20,000 − USD35,000)² = 0.75 × (USD5,000)² + 0.25 × (USD15,000)² = 0.75 × 25,000,000 + 0.25 × 225,000,000 = 18,750,000 + 56,250,000 = 75,000,000. Because the deviations are squared, this figure is in squared dollars (USD²), not dollars.

Why not A? USD200,000,000 results from treating both Fails outcomes as equally likely and computing the variance around their simple average of USD30,000: (40,000 − 30,000)² + (20,000 − 30,000)² = 100,000,000 + 100,000,000 = 200,000,000. This omits the probability weighting entirely. Conditional variance, like all variance calculations, requires each squared deviation to be weighted by its probability.

Why not B? USD166,250,000 likely results from subtracting the overall expected value of the entire loan as the mean instead of the scenario-specific conditional expected value. The overall E(recovery) = 0.55 × (0.60 × 80,000 + 0.40 × 60,000) + 0.45 × 35,000 = 0.55 × 72,000 + 0.45 × 35,000 = 39,600 + 15,750 = USD55,350. Using USD55,350 instead of USD35,000 as the mean inflates the deviations inside the Fails branch and produces a larger, incorrect variance. Conditional variance always uses the conditional expected value of that specific branch as the mean.

---

Q 5 of 6 — ANALYZE

An analyst builds a two-scenario probability tree for a project's cash flow. She correctly computes all conditional probabilities, all unconditional probabilities, both conditional expected values, and the overall expected value using the total probability rule. A colleague suggests she could have skipped the tree entirely and just used the unconditional probabilities directly to compute the overall expected value. Which statement best evaluates this claim?

CORRECT: B

Both calculation paths produce the same overall expected value. Worked Example 3 demonstrated this: computing E(recovery) via the total probability rule and computing it directly from unconditional probabilities both yielded USD0.755. The tree is not required for the overall expected value alone. However, the tree provides two additional outputs the direct approach cannot: conditional expected values for each scenario (which let a manager compare outcomes under different scenarios independently) and conditional variances (which measure within-scenario dispersion). These are analytically valuable and cannot be recovered from unconditional probabilities alone.

Why not A? This overstates the tree's necessity. The overall expected value E(X) = Σ P(Xi) × Xi, where P(Xi) are unconditional probabilities, is mathematically equivalent to the total probability rule formulation. They are two paths to the same number. Claiming that only the tree-based approach is valid is incorrect. It confuses the tool with the result.

Why not C? This understates the tree's value. While the overall expected value is reachable without the tree, conditional expected values and conditional variances are not. A manager who wants to know "what is our expected return if the recession scenario occurs?" cannot answer that question from unconditional probabilities alone. The tree preserves scenario-level information that aggregation discards.

---

Q 6 of 6 — TRAP

Isabelle Fontaine is a credit analyst at Redcliff Partners evaluating a sovereign bond. She constructs a probability tree with two scenarios: Scenario 1 (probability 0.70) yields a conditional expected recovery of USD0.78 per USD1 principal, and Scenario 2 (probability 0.30) yields a conditional expected recovery of USD0.42 per USD1 principal. Isabelle reports USD0.78 as the overall expected recovery, reasoning that Scenario 1 is the dominant scenario and its conditional expected value is therefore the best single estimate. What is the correct overall expected recovery, and what error did Isabelle make?

CORRECT: B

The total probability rule requires one final step after computing conditional expected values: weight each by its scenario probability and sum. E(recovery) = 0.70 × USD0.78 + 0.30 × USD0.42 = USD0.546 + USD0.126 = USD0.672. Isabelle's error was treating E(recovery | Scenario 1) = USD0.78 as the overall expected recovery. Conditional expected values describe what happens inside one branch. They become inputs to the overall answer only when weighted and aggregated across all branches using the total probability rule.

Why not A? USD0.600 results from a different error: averaging the two conditional expected values with equal weights, (USD0.78 + USD0.42) / 2 = USD0.600. This is not Isabelle's error. Isabelle ignored Scenario 2 entirely. The equal-weight average error ignores the unequal scenario probabilities but at least considers both scenarios. Both errors share the same root cause (failing to apply the total probability rule correctly) but they produce different wrong numbers: USD0.78 from ignoring one scenario, USD0.600 from ignoring the probability weights.

Why not C? There is no rule that makes a conditional expected value the correct answer for the overall expected value once one scenario exceeds a probability threshold. Even at P(Scenario 1) = 0.99, Scenario 2 still contributes 0.01 × USD0.42 = USD0.0042 to the overall expected value, and the correct answer would be 0.99 × USD0.78 + USD0.0042 = USD0.7764, not USD0.78. Choosing a conditional expected value as the overall expected value is never correct unless that scenario has probability 1.00, at which point no other scenario exists and there is no tree to build.
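The point about thresholds can be checked numerically. The sketch below varies P(Scenario 1) while holding the two conditional expected recoveries from this question fixed; the function name `overall_expected_recovery` is hypothetical, and the extra probability values are chosen only for illustration.

```python
# Conditional expected recoveries from Q6.
e_s1, e_s2 = 0.78, 0.42


def overall_expected_recovery(p_s1):
    """Total probability rule for two scenarios, with P(S2) = 1 - P(S1)."""
    return p_s1 * e_s1 + (1 - p_s1) * e_s2


# 0.70 is the question's own probability; the others are illustrative.
for p_s1 in (0.70, 0.90, 0.99, 1.00):
    ev = overall_expected_recovery(p_s1)
    print(f"P(S1) = {p_s1:.2f} -> E(recovery) = USD{ev:.4f}")
```

The overall expected recovery reaches USD0.78 only in the last row, where Scenario 2 has zero probability.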

---

Glossary

conditional probability

The probability of an outcome given that a specific scenario has already occurred. Instead of asking "what is the chance it will rain today?", you ask "what is the chance it will rain given that it is already cloudy?" The condition narrows the universe of possibilities to that branch only.

unconditional probability

The probability of an outcome without restricting to any particular scenario. This is the overall likelihood before you know which scenario is happening. Like asking "what is the chance it will rain today?" without looking outside first. In a probability tree, it is computed by multiplying the scenario probability by the conditional probability along each path from root to terminal node, then summing those products across every path that ends in that outcome.

Probability tree diagram

A branching diagram that maps out all possible scenarios and outcomes in sequence, with each fork representing a scenario and each branch endpoint representing a possible result. Think of it like a tournament bracket where each round determines which path continues, or a flowchart where each choice splits into new possibilities.

Scenario

A mutually exclusive and exhaustive event that determines which branch of a probability tree applies. Only one scenario can occur at a time, and together the scenarios cover all possibilities. Like a coin flip with heads and tails: one happens, both cannot happen together, and nothing else can happen instead.

Conditional expected value

The weighted average of outcomes within one scenario, computed using that scenario's conditional probabilities as weights. It answers "what do we expect to happen if this specific scenario occurs?" Like asking "what is the average temperature in July specifically?" rather than "what is the average temperature across the whole year?"

Total probability rule for expected value

The formula that combines conditional expected values into one overall expected value by weighting each scenario's conditional expected value by the probability of that scenario occurring: E(X) = E(X|S₁) × P(S₁) + E(X|S₂) × P(S₂) + ... Like computing a final grade by weighting each subject's score by its credit hours before summing to a GPA.

Conditional variance

The variance (dispersion) of outcomes within a single scenario branch, measuring how spread out the results are if that specific scenario occurs. Computed like ordinary variance, but using conditional probabilities as weights and subtracting the conditional expected value (not the overall expected value) as the mean. Like measuring how much temperatures vary in July specifically, rather than across the whole year.


Quantitative Methods · Probability Trees and Conditional Expectations · LO 3 of 3

You know 80% of defaults have poor credit reports, so what's the real risk when you see a poor credit report?

Use Bayes' formula to reverse conditional probability and update your beliefs about future outcomes when new information arrives.

⏱ 8min-15min

6 questions

HIGH PRIORITY · APPLY

Why this LO matters

Use Bayes' formula to reverse conditional probability and update your beliefs about future outcomes when new information arrives.

INSIGHT

New information reverses the arrow. You are given P(Information | Event): the probability of something happening if a condition is true. But you need P(Event | Information): the probability of the condition being true when you observe the information. These are not the same number. Bayes' formula is the tool that flips the arrow. It is not a complicated new idea. It is a reversal of direction using a denominator that accounts for all the ways the information could have appeared. Once you see this, every Bayes' problem reduces to the same two steps: calculate what normalises the likelihood, then divide the scaled likelihood by that normaliser.

Turning Information Around: How Bayes' Formula Updates What You Believe

Think about a doctor interpreting a positive test result.

The doctor knows that 95% of people who have a disease test positive. That is P(positive test | has disease). But the patient sitting across the desk is asking something different: "I just tested positive. Do I actually have the disease?" That is P(has disease | positive test). These are two completely different questions with two completely different answers. If the disease affects only 1 person in 10,000, the vast majority of positive tests are false positives, even with a 95% detection rate, unless the test's false-positive rate is vanishingly small.

The wrong move is to answer "95%" to the patient's question. Candidates make exactly this error on exam day. They read the likelihood off the page and hand it back as the answer.

The right move is to recognise the direction of conditioning has reversed, calculate the denominator, and apply Bayes' formula.

Bayes' formula lives at the intersection of what you knew before and what you just found out. The core skill for this LO: given P(Information | Event), calculate P(Event | Information). Exam questions are built around that gap.
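To put a number on the doctor's problem, here is a minimal sketch. The 1-in-10,000 prevalence and 95% detection rate come from the text; the 1% false-positive rate is an assumption added purely for illustration, because the passage does not state one.

```python
prior = 1 / 10_000            # P(has disease), from the text
sensitivity = 0.95            # P(positive | has disease), from the text
false_positive_rate = 0.01    # P(positive | no disease): ASSUMED, not in the text

# Denominator: every way a positive test can arise
p_positive = sensitivity * prior + false_positive_rate * (1 - prior)

# Posterior: the patient's actual question
posterior = sensitivity * prior / p_positive

print(f"P(positive test)               = {p_positive:.6f}")
print(f"P(has disease | positive test) = {posterior:.4f}")
```

Under that assumed false-positive rate, the posterior comes out below 1%, nowhere near the 95% likelihood. A different assumed rate would change the figure but not the lesson.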

The four building blocks of Bayes' formula

Prior probability. Your belief about an event before any new information arrives. It is the starting point, not the answer. Identify it by asking: "What probability am I given about the event itself, with no conditions attached?"

Likelihood. The probability of observing the new information, given that a specific event has already occurred. Written P(Information | Event). Identify it by finding the conditional probability where the information is "given," not the event. This is what the exam hands you, not what you are solving for.

Unconditional probability of the information. The overall probability that the new information occurs, across all possible events. Calculate it using the total probability rule: multiply each likelihood by its corresponding prior, then sum all the products. This is the denominator in Bayes' formula and the step most candidates skip.

Posterior probability. Your updated belief about the event after incorporating the new information. This is what Bayes' formula produces. Written P(Event | Information), it reverses the direction of the likelihood you were given.

Bayes' formula

P(A | B) = [P(B | A) / P(B)] × P(A)

P(A | B) = posterior probability: what you are solving for
P(B | A) = likelihood: the conditional probability you are given
P(B) = unconditional probability of the information: the denominator
P(A) = prior probability: your starting belief

Condition: P(B) > 0. The formula holds whether or not A and B are independent; if they are independent, the posterior simply equals the prior and the information changes nothing.
P(B) is calculated as: Σ [P(B | A_i) × P(A_i)] across all mutually exclusive and exhaustive scenarios.
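The formula and its denominator can be written as two small helpers. This is a sketch only: the function names `total_probability` and `bayes_posterior` and the two-scenario numbers in the usage lines are hypothetical, chosen to keep the arithmetic easy to follow.

```python
def total_probability(likelihoods, priors):
    """P(B) = sum over scenarios of P(B | A_i) * P(A_i)."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 (exhaustive scenarios)")
    return sum(lik * pri for lik, pri in zip(likelihoods, priors))


def bayes_posterior(prior, likelihood, p_information):
    """P(A | B) = [P(B | A) / P(B)] * P(A)."""
    if not 0 < p_information <= 1:
        raise ValueError("P(B) must be greater than 0 and at most 1")
    return likelihood / p_information * prior


# Hypothetical two-scenario example: equal priors, unequal likelihoods
priors = [0.50, 0.50]
likelihoods = [0.90, 0.30]

p_b = total_probability(likelihoods, priors)
posterior_first = bayes_posterior(priors[0], likelihoods[0], p_b)

print(f"P(B) = {p_b:.2f}")
print(f"P(A_1 | B) = {posterior_first:.2f}")
```

Keeping the denominator in its own function mirrors the order of work: build P(B) once, then reuse it for every scenario.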

FORWARD REFERENCE

The total probability rule: what you need for this LO only

The unconditional probability of any outcome equals the sum of its probability across all possible scenarios, each weighted by the probability of that scenario. For this LO, apply it as: multiply each P(Information | Event_i) by P(Event_i), then add up all the products to get P(Information). This is the denominator in every Bayes' calculation. You will study this rule in full context within Quantitative Methods, Learning Module 4.


Worked Examples: Applying the Direction Reversal

Now let us put the formula to work. In each example, notice that the first step is always the same: confirm the direction of the conditional probability you have been given, and confirm it is the opposite of what you need.

Worked Example 1

Reversing the conditional with frequency data

Priya Mehta is an equity analyst at Sentinel Asset Management. She is reviewing a universe of 500 listed companies. Of these, 200 are classified as financial sector firms and 300 are non-financial. Within the financial sector, 160 are large-cap companies and 40 are mid-cap. Within the non-financial group, 120 are large-cap and 180 are mid-cap. A colleague asks: "If you randomly select a large-cap company from this universe, what is the probability it is a financial sector firm?"

🧠Thinking Flow — Direction reversal , financial sector given large-cap

The question asks

Given that a company is large-cap, what is the probability it belongs to the financial sector?

Key concept needed

Direction reversal using Bayes' formula. Many candidates immediately answer "80%", which is P(large-cap | financial sector), the probability of being large-cap given you are already in the financial sector. The question asks the opposite: P(financial sector | large-cap). These are not the same number.

Step 1: Identify the direction of conditioning

The given likelihood is P(large-cap | financial sector) = 160/200 = 0.80. The question asks for P(financial sector | large-cap). The arrow must be reversed.

Step 2: Calculate P(large-cap), the denominator

Total large-cap companies = 160 (financial) + 120 (non-financial) = 280. P(large-cap) = 280 / 500 = 0.56. We also need: P(financial sector) = 200 / 500 = 0.40.

Step 3: Apply Bayes' formula

P(financial sector | large-cap) = [P(large-cap | financial sector) / P(large-cap)] × P(financial sector) = (0.80 / 0.56) × 0.40 = 1.42857 × 0.40 = 0.5714, or 57.14%

Step 4: Sanity check (frequency method)

Of the 280 total large-cap companies, 160 are financial sector firms. 160 / 280 = 0.5714. ✓ Both methods agree. The posterior (57.14%) is higher than the prior (40.0%). That makes sense: financial sector firms skew toward large-cap, so observing "large-cap" should increase our probability estimate for "financial sector." ✓ Answer: P(financial sector | large-cap) = 57.14%
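Both routes in this example, Bayes' formula and direct counting, can be reproduced from the raw company counts. A minimal sketch, using only the counts given in the example; the dictionary layout and variable names are illustrative choices.

```python
# Company counts from Worked Example 1, keyed by (sector, size)
counts = {
    ("financial", "large"): 160,
    ("financial", "mid"): 40,
    ("non-financial", "large"): 120,
    ("non-financial", "mid"): 180,
}
total = sum(counts.values())  # 500 companies

n_financial = sum(n for (sector, _), n in counts.items() if sector == "financial")
n_large = sum(n for (_, size), n in counts.items() if size == "large")

p_financial = n_financial / total                                 # prior
p_large = n_large / total                                         # denominator
p_large_given_fin = counts[("financial", "large")] / n_financial  # likelihood

# Route 1: Bayes' formula
via_bayes = p_large_given_fin / p_large * p_financial

# Route 2: count directly inside the large-cap group
via_counts = counts[("financial", "large")] / n_large

print(f"Bayes' formula: {via_bayes:.4f}")
print(f"Direct count:   {via_counts:.4f}")
```

The two routes agree because Bayes' formula is the counting argument expressed in probabilities.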

Worked Example 2

Identifying priors, likelihoods, and the denominator

Jake Bronson is a credit analyst at Thornfield Bank. He is trying to assess the risk of loan delinquency among the bank's corporate borrowers. He knows that 10% of borrowers fall delinquent on their loans. He also knows that 30% of all borrowers file their financial statements late. Among borrowers who eventually go delinquent, 80% filed their statements late. A new borrower has just filed their statements late. What is the probability this borrower will go delinquent?

🧠Thinking Flow — Labelling the three inputs correctly

The question asks

P(delinquent | late filing). Given that a borrower filed late, what is the probability they go delinquent?

Key concept needed

Correct identification of prior, likelihood, and unconditional probability before applying Bayes' formula. The most common error is using P(late filing | delinquent) = 0.80 as the answer. That is the likelihood (what you were given), not the posterior (what you are solving for).

Step 1: Label each piece of data

Symbol	Plain English	Value
P(A), prior	Probability of delinquency	0.10
P(B), unconditional	Probability of late filing	0.30
P(B | A), likelihood	Probability of late filing given delinquency	0.80
P(A | B), posterior	Probability of delinquency given late filing	Solve

Step 2: Apply Bayes' formula

P(A | B) = [P(B | A) / P(B)] × P(A)

Step 3: Substitute

P(delinquent | late filing) = (0.80 / 0.30) × 0.10 = 2.6667 × 0.10 = 0.2667, or 26.67%

Step 4: Sanity check (frequency table method)

Assume 100 borrowers. 10 are delinquent; 90 are not. Of the 10 delinquent borrowers, 80% file late: 8 borrowers. Total late filers = 30 (given as 30% of 100). Of those 30 late filers, 8 went delinquent. 8 / 30 = 0.2667. ✓ Both methods agree. The posterior (26.67%) is much higher than the prior (10.0%). That makes sense: late filing is a warning signal. Observing it raises the estimated probability of delinquency substantially. ✓ Answer: P(delinquent | late filing) = 26.67%
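In this example P(late filing) is handed to you directly, so the total probability rule is never shown. A short sketch makes the hidden piece visible: the three given inputs imply a late-filing rate among non-delinquent borrowers, and rebuilding the denominator from it recovers the 0.30 you were given. The variable names are illustrative, and the implied rate is derived here, not stated in the example.

```python
p_delinquent = 0.10            # prior, given
p_late = 0.30                  # unconditional probability of late filing, given
p_late_given_delinq = 0.80     # likelihood, given

# Posterior via Bayes' formula
posterior = p_late_given_delinq / p_late * p_delinquent

# Implied (not stated in the example): late-filing rate among non-delinquent borrowers
p_late_given_ok = (p_late - p_late_given_delinq * p_delinquent) / (1 - p_delinquent)

# Rebuilding the denominator with the total probability rule
rebuilt_p_late = (p_late_given_delinq * p_delinquent
                  + p_late_given_ok * (1 - p_delinquent))

print(f"P(delinquent | late filing)  = {posterior:.4f}")
print(f"P(late filing | not delinq.) = {p_late_given_ok:.4f}")
print(f"rebuilt P(late filing)       = {rebuilt_p_late:.4f}")
```

In the 100-borrower table this is the 22 late filers among the 90 non-delinquent borrowers.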

Worked Example 3

Using the total probability rule to build the denominator

Sofia Reyes is a portfolio manager at Meridian Capital. She holds shares in DriveMed, Inc., a manufacturer expanding into electric vehicle components. DriveMed is about to release last quarter's earnings. Before the release, Sofia estimates the following prior probabilities: P(EPS beat consensus) = 0.45, P(EPS met consensus) = 0.30, P(EPS missed consensus) = 0.25. DriveMed then announces it is expanding factory capacity in Singapore and Ireland. Sofia assesses the likelihood of this expansion announcement under each earnings scenario: P(expansion | beat) = 0.75, P(expansion | met) = 0.20, P(expansion | missed) = 0.05. What is the updated probability that DriveMed beat consensus earnings?

🧠Thinking Flow — Building P(Information) from scratch using the total probability rule

The question asks

P(beat consensus | expansion announced). What is the updated probability that DriveMed beat consensus, given the expansion announcement?

Key concept needed

The total probability rule to calculate P(expansion), the denominator, before applying Bayes' formula. Skipping this step and using a likelihood directly as the answer is the most common error.

Step 1: Identify the three scenarios and their priors

Scenario	Prior P(scenario)
EPS beat consensus	0.45
EPS met consensus	0.30
EPS missed consensus	0.25
Total	1.00 ✓

Step 2: Identify the likelihoods

Scenario	P(expansion | scenario)
EPS beat	0.75
EPS met	0.20
EPS missed	0.05

Step 3: Calculate P(expansion) using the total probability rule

P(expansion) = P(expansion | beat) × P(beat) + P(expansion | met) × P(met) + P(expansion | missed) × P(missed) = (0.75 × 0.45) + (0.20 × 0.30) + (0.05 × 0.25) = 0.3375 + 0.0600 + 0.0125 = 0.4100

Step 4: Apply Bayes' formula for the "beat" scenario

P(beat | expansion) = [P(expansion | beat) / P(expansion)] × P(beat) = (0.75 / 0.41) × 0.45 = 1.82927 × 0.45 = 0.8232, or 82.32%

Step 5: Sanity check (direction and magnitude)

Sofia's prior was 45%. After seeing an expansion announcement, which signals strong demand and earnings, her updated probability rises to 82.3%. The direction is correct: the expansion is far more likely if DriveMed beat earnings (75%) than under the overall base rate (41%). The ratio 0.75/0.41 is greater than 1, which pulls the posterior above the prior. ✓ ✓ Answer: P(EPS beat consensus | expansion announced) = 82.32%

Worked Example 4

Updating all three posteriors and verifying they sum to 1

Continuing with Sofia Reyes and DriveMed, Inc. from Worked Example 3. The priors are P(beat) = 0.45, P(met) = 0.30, P(missed) = 0.25. The likelihoods given the expansion announcement are P(expansion | beat) = 0.75, P(expansion | met) = 0.20, P(expansion | missed) = 0.05. The denominator P(expansion) = 0.41 has already been calculated. Sofia now wants to update all three scenario probabilities and confirm her arithmetic is correct.

🧠Thinking Flow — Updating three posteriors with one shared denominator

The question asks

What are the updated probabilities for all three EPS scenarios given the expansion announcement? Do they sum to 1.00?

Key concept needed

The same denominator P(expansion) = 0.41 applies to all three Bayes' calculations. Calculate it once. Reuse it three times. Many candidates recalculate the denominator separately for each scenario and introduce arithmetic errors. That is the wrong approach.

Step 1: Confirm the shared denominator

P(expansion) = 0.41 (calculated in Worked Example 3). This number does not change across the three scenarios.

Step 2: Calculate P(met | expansion)

P(met | expansion) = [P(expansion | met) / P(expansion)] × P(met) = (0.20 / 0.41) × 0.30 = 0.48780 × 0.30 = 0.1463

Step 3: Calculate P(missed | expansion)

P(missed | expansion) = [P(expansion | missed) / P(expansion)] × P(missed) = (0.05 / 0.41) × 0.25 = 0.12195 × 0.25 = 0.0305

Step 4: Compile all three posteriors

Scenario	Prior	Posterior
EPS beat consensus	0.45	0.8232
EPS met consensus	0.30	0.1463
EPS missed consensus	0.25	0.0305

Step 5: Sanity check (sum to 1.00)

0.8232 + 0.1463 + 0.0305 = 1.0000 ✓ The three scenarios are mutually exclusive and exhaustive. Exactly one of them must be true. So the posteriors must sum to 1.00, just as the priors did. If your sum deviates from 1.00, recalculate P(expansion) first, then recheck each numerator. The interpretation here is striking. Before the announcement, Sofia gave only a 45% chance to DriveMed having beaten earnings. After the expansion news, that probability jumps to 82.3%. Conversely, the probability that DriveMed merely met consensus collapses from 30% to 14.6%. The probability it missed drops from 25% to just 3.0%. New information moves all three numbers simultaneously. That is Bayesian updating in practice. ✓ Answer: Posteriors are 0.8232, 0.1463, and 0.0305. Sum = 1.0000. Arithmetic confirmed.
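The "calculate once, reuse three times" idea from Worked Examples 3 and 4 fits in a few lines. A minimal sketch using the priors and likelihoods from the examples; the dictionary names are illustrative.

```python
priors = {"beat": 0.45, "met": 0.30, "missed": 0.25}
likelihoods = {"beat": 0.75, "met": 0.20, "missed": 0.05}  # P(expansion | scenario)

# Bayes numerators: P(expansion and scenario)
joints = {s: likelihoods[s] * priors[s] for s in priors}

# Shared denominator, computed once with the total probability rule
p_expansion = sum(joints.values())

# One division per scenario, all against the same denominator
posteriors = {s: joints[s] / p_expansion for s in priors}

print(f"P(expansion) = {p_expansion:.4f}")
for s in priors:
    print(f"{s:<7} prior {priors[s]:.2f} -> posterior {posteriors[s]:.4f}")
print(f"sum of posteriors = {sum(posteriors.values()):.4f}")
```

Because every posterior is a numerator divided by the sum of all the numerators, the posteriors sum to 1.00 by construction. That is why a sum that misses 1.00 points to an arithmetic slip.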

⚠️

Watch out for this

The direction-reversal trap: P(B|A) mistaken for P(A|B)

A candidate who reads P(large-cap | financial sector) = 0.80 and reports 0.80 as the answer to "what is the probability a large-cap company is a financial sector firm?" has confused the given likelihood with the posterior they were asked to calculate. The correct posterior, applying Bayes' formula with P(financial sector) = 0.40 and P(large-cap) = 0.56, is 0.5714, or 57.14%. Candidates make this error because they assume the conditional probability they are handed points in the right direction, when Bayes' formula exists precisely because P(B|A) and P(A|B) are different numbers requiring separate calculation. Before submitting any Bayes' answer, read the question one more time and check: does the given probability have the same direction as the answer you need, or do you need to reverse it?

🧠

Memory Aid

FORMULA HOOK

Flip the arrow, weight by the prior, divide by the total.

Practice Questions · LO3

6 Questions LO3

Score: — / 6

Q 1 of 6 — REMEMBER

In Bayes' formula, the term P(Event | Information) is best described as the:

CORRECT: C

CORRECT: C. P(Event | Information) is the posterior probability. It is what Bayes' formula produces: the probability of the event, conditioned on new information having been observed. The posterior is the output of the calculation, not an input.

Why not A? The prior probability is P(Event), the unconditional belief about the event before any new information arrives. It carries no conditioning on information at all. A candidate who selects A is confusing the starting point of the calculation with the ending point. The prior is an input to Bayes' formula. The posterior is the result.

Why not B? The likelihood is P(Information | Event): the probability of observing the new information, assuming a specific event has already occurred. Its direction of conditioning is the reverse of the posterior. In the likelihood, Information is "given." In the posterior, Event is "given." Confusing the likelihood with the posterior is the single most common error in Bayes' problems, and it is precisely the error Bayes' formula was designed to correct.

---

Q 2 of 6 — UNDERSTAND

An analyst is given P(rating downgrade | earnings miss) and asked to find P(earnings miss | rating downgrade). Which statement best explains why these two probabilities are not interchangeable?

CORRECT: A

CORRECT: A. P(A|B) and P(B|A) are structurally different quantities. Bayes' formula shows that P(A|B) = [P(B|A) / P(B)] × P(A). The ratio P(A)/P(B) acts as a scaling factor. Unless P(A) = P(B), the two conditional probabilities will differ. The direction of conditioning always matters, regardless of the size of the priors.

Why not B? Independence means P(A|B) = P(A) and P(B|A) = P(B). Even under independence, P(A|B) equals P(A), not P(B|A). The claim that the two conditional probabilities become interchangeable under independence is false. In the independent case, both conditionals simply collapse to their respective unconditional probabilities, which are themselves two different numbers unless P(A) happens to equal P(B).

Why not C? The divergence between P(A|B) and P(B|A) is not triggered by any threshold in the prior probability. It is always present unless P(A) = P(B) exactly. A prior greater than 50% does not cause the divergence. The ratio P(A)/P(B) drives it regardless of where either prior sits relative to any particular level.

---

Q 3 of 6 — APPLY

Valentina Cruz is a fund manager reviewing technology stocks. She estimates that 35% of technology companies in her universe are profitable, and 65% are unprofitable. From her research, 60% of profitable companies are currently trading above their 52-week moving average, while only 20% of unprofitable companies are trading above that level. A stock is drawn at random and is found to be trading above its 52-week moving average. What is the probability that this stock is profitable?

CORRECT: C

CORRECT: C. Apply the total probability rule first: P(above average) = (0.60 × 0.35) + (0.20 × 0.65) = 0.210 + 0.130 = 0.340. Then apply Bayes' formula: P(profitable | above average) = (0.60 × 0.35) / 0.340 = 0.210 / 0.340 ≈ 0.618, or 61.8%. The prior was 35%. Observing the above-average signal raises it substantially to 61.8%.

Why not A? The value 0.210 is the numerator of Bayes' formula: P(above average | profitable) × P(profitable) = 0.60 × 0.35. A candidate who stops here has calculated P(above average AND profitable), the joint probability, not the conditional probability P(profitable | above average). The denominator step, dividing by P(above average) = 0.340, is essential and must not be skipped.

Why not B? The value 0.600 is the likelihood P(above average | profitable): the probability of trading above the average given that the company is already profitable. This number was given in the problem. Reporting it as the answer confuses the direction of conditioning. The question asks P(profitable | above average), which reverses the given likelihood. Bayes' formula is required precisely to perform that reversal.

---

Q 4 of 6 — APPLY+

Marcus Okonkwo is a credit risk officer at a regional bank. The bank's loan portfolio shows that 8% of borrowers default within three years. A proprietary risk model flags 70% of eventual defaulters and also flags 15% of eventual non-defaulters as high-risk. A borrower is flagged by the model. What are the updated probabilities that this borrower will default and will not default, and do they sum to 1.00?

CORRECT: A

CORRECT: A. First, calculate P(flagged) using the total probability rule: P(flagged) = (0.70 × 0.08) + (0.15 × 0.92) = 0.056 + 0.138 = 0.194. Then: P(default | flagged) = (0.70 × 0.08) / 0.194 = 0.056 / 0.194 ≈ 0.289. P(no default | flagged) = (0.15 × 0.92) / 0.194 = 0.138 / 0.194 ≈ 0.711. Sum: 0.289 + 0.711 = 1.000. Because "default" and "no default" are mutually exclusive and exhaustive, the posteriors must sum to 1.00.

Why not B? The values 0.700 and 0.300 come from using the likelihoods directly as posteriors. P(flagged | default) = 0.70, and a candidate might guess P(no default | flagged) = 1 − 0.70 = 0.30. Neither calculation applies Bayes' formula. The likelihoods point in the wrong direction. The denominator P(flagged) = 0.194 has been skipped entirely, producing an incorrect answer that sums to 1.00 only because the second figure was defined as the complement of the first.

Why not C? The first posterior, 0.289, is calculated correctly. But 0.850 is the complement of the likelihood P(flagged | no default): 1 − 0.15 = 0.85. A candidate who calculates P(default | flagged) correctly but then substitutes 1 − 0.15 for the second posterior has forgotten to apply Bayes' formula to the "no default" scenario. Both posteriors must share the same denominator, P(flagged) = 0.194. Any result that sums to something other than 1.00 signals an error in the denominator or one of the numerators.
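The sum-to-1.00 check that separates the answer choices can be run directly. A minimal sketch with the inputs from the question; the variable names are illustrative.

```python
p_default = 0.08                 # prior
p_flag_given_default = 0.70      # likelihood for defaulters
p_flag_given_no_default = 0.15   # likelihood for non-defaulters

joint_default = p_flag_given_default * p_default                # 0.056
joint_no_default = p_flag_given_no_default * (1 - p_default)    # 0.138
p_flagged = joint_default + joint_no_default                    # shared denominator

post_default = joint_default / p_flagged
post_no_default = joint_no_default / p_flagged

# Correct pair: both posteriors use the same denominator
print(f"{post_default:.3f} + {post_no_default:.3f} = "
      f"{post_default + post_no_default:.3f}")

# Pair described under "Why not C?": second figure is 1 - 0.15, not a posterior
print(f"{post_default:.3f} + 0.850 = {post_default + 0.850:.3f}")
```

The second line fails the sum check, which is the quickest way to spot that one of the two figures never went through Bayes' formula.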

---

Q 5 of 6 — ANALYZE

Two analysts are estimating the probability that a company's CEO will resign, given that a major shareholder has just filed a public activist letter. Analyst Fatima uses Bayes' formula, deriving a posterior from prior CEO turnover rates and historical activist success rates. Analyst Leon states: "In my experience, activist letters precede CEO departures about 65% of the time, so that is my estimate." Which approach better satisfies the formal requirements of Bayesian updating, and why?

CORRECT: B

CORRECT: B. Bayesian updating requires four explicit components: a prior P(A), a likelihood P(B|A), the unconditional probability of the information P(B), and the posterior P(A|B). Fatima's method addresses each component and uses Bayes' formula to reverse the direction of conditioning. Leon's statement does not distinguish whether his 65% represents P(activist letter | CEO departs) or P(CEO departs | activist letter), and it does not document or adjust for the base rate of CEO departures in general.

Why not A? Empirical frequency data can support either approach. Fatima's prior probabilities and likelihoods should also be derived from historical data. The strength of Bayesian analysis lies not in avoiding priors but in making them explicit and combining them correctly with new evidence. Leon's undocumented anecdotal figure is harder to challenge and update than an explicit Bayesian model, not easier.

Why not C? Two approaches are equivalent only if they produce the same numerical answer through the same logical structure. Leon's figure may reflect P(activist letter | CEO departs), collected from past cases where a departure eventually occurred. That would make it a likelihood, not a posterior. Without confirming that his 65% correctly accounts for all activist letters that did not result in CEO departures, we cannot claim equivalence. Implicit calculations that skip the denominator are a known source of error in probability reasoning.

---

Q 6 of 6 — TRAP

Ingrid Larsson is an equity analyst covering pharmaceutical firms. She estimates that 25% of pharmaceutical companies in her coverage universe are acquisition targets within any given year. She also knows that 80% of companies that were eventually acquired had received analyst upgrades in the prior quarter. Ingrid observes that a specific company in her coverage just received an analyst upgrade. She immediately states: "The probability this company will be acquired is 80%." Is Ingrid correct?

CORRECT: B

CORRECT: B. Ingrid has made the direction-reversal error. She was given P(upgrade | acquired) = 0.80, which answers "among companies that were acquired, how many had received upgrades beforehand?" The question she wants to answer is P(acquired | upgrade): "given this company just received an upgrade, what is the probability of acquisition?" To find that, Ingrid needs P(upgrade), the unconditional probability that any company in her universe receives an upgrade. Then she applies Bayes' formula: P(acquired | upgrade) = (0.80 × 0.25) / P(upgrade). Without P(upgrade), she cannot produce a valid posterior.

Why not A? The 80% figure is not the posterior. It is the likelihood. It describes how often upgrades appeared among companies that were already identified as acquired, not the probability of acquisition given an upgrade observed today. Treating a likelihood as a posterior is precisely the error Bayes' formula was designed to prevent. The prior (25%) and the total probability of upgrades are both required inputs that Ingrid has ignored entirely.

Why not C? Reverting entirely to the prior (25%) would ignore the new information, which is the opposite error from what Ingrid made. Bayesian updating exists to incorporate new signals, not to discard them. The correct answer uses both the prior and the new information, and it cannot be pinned down until P(upgrade) is known. Returning to the prior is only appropriate when the new information carries zero likelihood of distinguishing between outcomes, a specific condition that does not apply here.
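To see why the missing P(upgrade) matters, the sketch below tries several values for the upgrade rate among companies that are not acquired. Those values are hypothetical, since the question does not supply one; only the 25% prior and the 80% likelihood come from the question. The function name `posterior_acquired` is illustrative.

```python
prior = 0.25                  # P(acquired), from the question
p_upgrade_given_acq = 0.80    # P(upgrade | acquired), from the question


def posterior_acquired(p_upgrade_given_not_acq):
    """P(acquired | upgrade) for an ASSUMED P(upgrade | not acquired)."""
    p_upgrade = (p_upgrade_given_acq * prior
                 + p_upgrade_given_not_acq * (1 - prior))
    return p_upgrade_given_acq * prior / p_upgrade


# Hypothetical upgrade rates among non-acquired companies (not in the question)
for assumed in (0.10, 0.30, 0.50, 0.80):
    print(f"P(upgrade | not acquired) = {assumed:.2f} -> "
          f"P(acquired | upgrade) = {posterior_acquired(assumed):.4f}")
```

The posterior moves a long way as the assumed rate changes. When upgrades are just as common among non-acquired companies (0.80), the signal carries no information and the posterior equals the 25% prior.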

---

Glossary

prior probability

Your belief about an event before any new information arrives. Like estimating the chance of rain tomorrow before you check the forecast, based on the season and historical patterns, with no fresh data. The prior is where Bayesian updating always begins.

likelihood

The probability of observing specific new information, given that a particular event has already occurred. Written P(Information | Event). Like asking: "If it really is going to rain, how likely is it that my joints are aching this morning?" The event (rain) is assumed true. You are asking how probable the signal (joint pain) is given that assumption.

unconditional probability

The overall probability of an outcome, without conditioning on any other event. Like the probability that any randomly chosen person owns a car, regardless of their age, income, or city. In Bayes' problems, you calculate the unconditional probability of the new information using the total probability rule. This becomes the denominator in Bayes' formula.

total probability rule

A formula that calculates the unconditional probability of an event by summing its probability across all mutually exclusive scenarios, each weighted by the probability of that scenario. Like calculating the average wait time at a coffee shop by weighting the wait time during the morning rush, the lunch rush, and the quiet afternoon by how often each period occurs.

posterior probability

Your updated belief about an event after incorporating new information. It is the output of Bayes' formula, written P(Event | Information). Like revising your estimate that tomorrow will be rainy after you check the weather forecast and see a 90% rain prediction: your posterior is now much higher than your prior.

conditional probability

The probability of one event occurring given that another event is known to have occurred. Written P(A | B), read "probability of A given B." Like the probability of winning a card game, given that you have already been dealt two aces. The known condition (two aces in hand) changes the probability of the outcome (winning).


Quantitative Methods · Probability Trees and Conditional Expectations · Job Ready

From exam to career

Scenario analysis, risk measurement, and probability updating in investment management and credit analysis

Why this session exists

The exam tests whether you can calculate expected values and variances and apply Bayes' formula in a clean scenario. Interviewers test whether you know when to use them and what the numbers actually mean in practice. These are different questions. This session bridges them.

Where this module shows up professionally: Fixed income analysts use these tools to forecast bond recovery under different legal scenarios. Credit risk officers apply Bayes' formula to update default probabilities as new borrower data arrives. Portfolio managers use probability trees to structure scenarios before committing capital. Risk teams report variance and standard deviation to clients in ways that inform actual investment decisions.

LO 1

Return and risk forecasting: expected value, variance, and standard deviation in investment decisions

How analysts use this at work

Credit analysts at firms like BlackRock and PIMCO use expected value to produce a single best estimate of what a bond or loan will return. They assign probabilities to each possible outcome, weight them, and sum. The result is not a prediction that any one scenario will occur. It is a probability-weighted centre of gravity for the entire distribution. That number feeds directly into relative value decisions, where analysts compare a bond's expected return against its risk-adjusted required return.

Standard deviation is the number analysts quote to clients and risk committees. Variance, at 372 basis points squared in one worked example, is not in the same units as the original data and cannot be compared to the expected value directly. A risk analyst at a pension fund consultant who reports variance instead of standard deviation has produced a number that is arithmetically correct but practically meaningless. The square root restores interpretability. Clients understand "expected NIM of 200 basis points with a standard deviation of 19 basis points." They cannot interpret a variance figure.
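The unit conversion described above is a single square root. A minimal sketch, assuming the 372 (bp)² variance and the 200 bp expected NIM quoted in the passage belong to the same worked example.

```python
import math

variance_bp2 = 372.0       # basis points squared, from the passage
expected_nim_bp = 200.0    # basis points, from the passage (assumed same example)

# The square root returns dispersion to the units of the expected value
std_dev_bp = math.sqrt(variance_bp2)

print(f"expected NIM:       {expected_nim_bp:.0f} bp")
print(f"standard deviation: {std_dev_bp:.1f} bp")
```

The result rounds to the 19 basis points quoted to clients, and it sits in the same units as the 200 bp expected value.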

Interview questions

Vanguard Fixed Income Analyst "A distressed debt fund tells you their expected recovery on a position is 62 cents on the dollar. You calculate the variance as 306 (percentage points)². What is the standard deviation, and why does it matter?"

BlackRock Risk Analyst "Two analysts are arguing about a probability distribution for a bond's annual return. One says the expected value is 8%, so the variance must be 64 (%)². The other says the variance is 25 (%)² with a standard deviation of 5%. Which analyst is describing the problem correctly, and how can you verify?"

State Street Portfolio Analytics "A portfolio manager tells you she uses the historical average return as her expected value forecast for next year. What is wrong with this approach, and what would you use instead?"

One-line to use in your interview

Interviewers listen for industry-specific language. It signals you understand the concept, not just the definition. Use the plain English version to adapt it in your own words.

In practice, I treat expected value as a forward-looking forecast weighted by probability, not a historical average, and I always take the square root of variance before reporting dispersion to a client or risk committee because squared units are not interpretable.

In plain English

When I need a single best guess about a bond return or a loan recovery, I weight each possible outcome by how likely it is. That is the expected value. Then I check how spread out those outcomes are. Variance gives me a number in the wrong units. I have to take the square root to get a figure I can actually compare to the expected value itself.

LO 2

Scenario-based forecasting: probability trees, conditional expectations, and the aggregation step professionals forget

How analysts use this at work

Fixed income analysts at Goldman Sachs and Morgan Stanley use probability trees when a bond's outcome depends on which scenario unfolds. A distressed bond might recover USD0.90 if the court enforces collateral, or USD0.40 if it applies a haircut. These outcomes are not equally likely, and the odds of each can differ from one scenario to the next. The analyst must draw the tree, compute the expected recovery inside each scenario separately, then weight those conditional expectations by how likely each scenario is. That final step, the total probability rule, is the one analysts most often skip under time pressure. Skipping it leaves a conditional expected value, which answers the wrong question when the overall expected recovery is what was asked for.
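The aggregation step can be sketched as follows. Only the USD0.90 and USD0.40 recoveries come from the paragraph above. The two economic scenarios, their probabilities, and the within-scenario odds of each court ruling are hypothetical values chosen to keep the arithmetic clean.

```python
# Hypothetical probability tree. Each scenario has an unconditional
# probability and a list of (conditional probability, recovery in USD).
# All probabilities here are assumptions for illustration.
tree = {
    "strong economy": {"prob": 0.60, "branches": [(0.80, 0.90), (0.20, 0.40)]},
    "weak economy":   {"prob": 0.40, "branches": [(0.30, 0.90), (0.70, 0.40)]},
}

def conditional_expectation(branches):
    """Expected recovery inside one scenario, using within-scenario odds."""
    return sum(p * x for p, x in branches)

# Step 1: one conditional expected value per scenario.
cond_ev = {name: conditional_expectation(s["branches"]) for name, s in tree.items()}

# Step 2 (the step that gets skipped): weight each conditional expectation
# by its scenario probability. This is the total probability rule.
overall_ev = sum(s["prob"] * cond_ev[name] for name, s in tree.items())

# Sanity check: unconditional probabilities at the terminal nodes sum to 1.
terminal_probs = [s["prob"] * p for s in tree.values() for p, _ in s["branches"]]

for name, value in cond_ev.items():
    print(f"E[recovery | {name}] = {value:.3f}")
print(f"Overall E[recovery] = {overall_ev:.3f}")
print(f"Terminal probabilities sum to {sum(terminal_probs):.2f}")
```

With these assumed inputs the conditional expectations are 0.80 and 0.55, and the overall expected recovery is 0.70. Reporting 0.80 alone would answer "what do we expect if the economy is strong?" and nothing more.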

Investment consultants advising pension funds use scenario analysis to stress-test portfolio outcomes. They do not stop at "what do we expect if rates rise?" They also need to know how uncertain that outcome is within the rising-rate scenario specifically. That is a conditional variance question. Using the unconditional expected value as the anchor for the variance calculation produces a misleading risk figure that mixes across scenarios. The correct anchor is always the conditional expected value for that specific branch of the tree.
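A small numerical sketch makes the anchoring point concrete. The tree below is hypothetical (assumed scenario names, probabilities and recoveries), and the helper names are illustrative. It computes each branch's variance twice, once around the correct conditional mean and once around the unconditional mean, and then assembles the overall variance with the law of total variance, a standard identity that the text does not name explicitly.

```python
import math

# Hypothetical tree: scenario -> (scenario probability,
# list of (conditional probability, recovery in USD)). All values assumed.
scenarios = {
    "strong": (0.60, [(0.80, 0.90), (0.20, 0.40)]),
    "weak":   (0.40, [(0.30, 0.90), (0.70, 0.40)]),
}

def mean(branches):
    return sum(p * x for p, x in branches)

def variance_about(branches, anchor):
    """Probability-weighted squared deviation around a chosen anchor."""
    return sum(p * (x - anchor) ** 2 for p, x in branches)

overall_mean = sum(prob * mean(br) for prob, br in scenarios.values())

cond_var = {}
mixed_var = {}
for name, (prob, br) in scenarios.items():
    cond_var[name] = variance_about(br, mean(br))        # correct anchor
    mixed_var[name] = variance_about(br, overall_mean)   # wrong anchor

# Law of total variance: expected conditional variance
# plus the variance of the conditional means.
within = sum(prob * cond_var[name] for name, (prob, _) in scenarios.items())
between = sum(prob * (mean(br) - overall_mean) ** 2 for prob, br in scenarios.values())
total_var = within + between

# Probability-weighted average of conditional SDs, shown only for comparison.
avg_cond_sd = sum(prob * math.sqrt(cond_var[name]) for name, (prob, _) in scenarios.items())

print(f"conditional variances: {cond_var}")
print(f"variances about the overall mean (misleading): {mixed_var}")
print(f"overall SD = {math.sqrt(total_var):.4f}, averaged conditional SD = {avg_cond_sd:.4f}")
```

Under these assumptions the strong-scenario variance is 0.04 around its own mean but 0.05 around the overall mean, so the wrong anchor overstates the risk inside that branch. The averaged conditional standard deviation also falls short of the overall standard deviation, because averaging ignores the spread between the scenario means.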

Interview questions

Goldman Sachs Investment Analyst "A credit analyst computes expected recovery under a strong economy as USD0.845 and under a weak economy as USD0.485. She reports USD0.845 as the overall expected recovery because the strong economy is more likely. What is wrong with her reasoning?"

Morgan Stanley Fixed Income Strategist "You are building a probability tree for a sovereign bond's recovery rate. The two scenarios are restructuring and no restructuring. Under restructuring, the bond recovers USD0.60 with 70% probability and USD0.40 with 30%. Under no restructuring, it recovers USD0.90 with 90% probability and USD0.85 with 10%. The probability of restructuring is 0.40. What is the overall expected recovery?"

PIMCO Portfolio Manager "Under each scenario in your probability tree, the conditional standard deviations come out to USD0.0497 and USD0.0357. A colleague says you should combine these into an overall portfolio standard deviation by averaging them. Is your colleague correct, and why or why not?"

One-line to use in your interview

Interviewers listen for industry-specific language. It signals you understand the concept, not just the definition. Use the plain English version to adapt it in your own words.

I always build the probability tree structure before I calculate anything, because assigning conditional probabilities across scenarios instead of within them is the most common mistake in multi-scenario analysis, and I verify my tree is correct by checking that unconditional probabilities at the terminal nodes sum to 1.00.

In plain English

Before I multiply any numbers, I draw the tree and label which probabilities belong inside each scenario versus across all scenarios. That separation is where analysts go wrong. I then check that the probabilities at the ends of the tree add up to exactly 1.00 before I treat any calculation as done.

LO 3

Probability updating: Bayes' formula in credit risk, screening, and investment decision-making

How analysts use this at work

Credit risk officers at JPMorgan and Bank of America use Bayes' formula to update default probabilities as new information arrives about a borrower. They know the base rate of default across their loan portfolio. They observe that a borrower has filed financial statements late. Late filing is a signal. The question is not "how often do borrowers who default file late?" That is the likelihood, given in the problem. The question is "this borrower just filed late. What is the updated probability of default?" These are different questions. Bayes' formula connects them. Getting the direction wrong and reporting the likelihood as the answer would, in most realistic scenarios, dramatically overstate the default probability.
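The gap between the likelihood and the posterior is easy to see numerically. The sketch below borrows the figures from the JPMorgan interview question in this section (70% true positive rate, 15% false positive rate, 8% base rate) and treats the model flag as the signal, in the same role as the late filing. The function name is illustrative.

```python
def posterior_default(base_rate, p_signal_given_default, p_signal_given_no_default):
    """P(default | signal) for a two-state world, via Bayes' formula."""
    numerator = p_signal_given_default * base_rate
    # Denominator: total probability of seeing the signal at all,
    # from defaulters and non-defaulters combined.
    denominator = numerator + p_signal_given_no_default * (1.0 - base_rate)
    return numerator / denominator

likelihood = 0.70   # P(signal | default): NOT the answer to the question asked
p = posterior_default(base_rate=0.08,
                      p_signal_given_default=likelihood,
                      p_signal_given_no_default=0.15)

print(f"P(signal | default) = {likelihood:.0%}")
print(f"P(default | signal) = {p:.1%}")
```

The posterior comes out just under 29%, far below the 70% likelihood, because most borrowers do not default and a meaningful share of them still trigger the signal.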

Fund managers at Fidelity use Bayes' formula to update their beliefs about a company's earnings outcome after observing an announcement. A prior estimate of 45% probability of beating earnings gets revised to 82% after a factory expansion announcement, because expansion is far more likely under a beat scenario than across the full distribution. The tool is not just a calculation. It is a formal discipline for incorporating evidence without double-counting what you already knew.
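The revision from 45% to 82% can be reproduced with the priors and likelihoods stated in the Fidelity interview question in this section. The sketch below is one way to organise the calculation for more than two states; the function and variable names are illustrative.

```python
# Priors and likelihoods as stated in the Fidelity interview question.
priors = {"beat": 0.45, "meet": 0.30, "miss": 0.25}
likelihoods = {"beat": 0.75, "meet": 0.20, "miss": 0.05}  # P(expansion | state)

def bayes_update(priors, likelihoods):
    """Return (posteriors, evidence) after observing the signal."""
    joint = {state: priors[state] * likelihoods[state] for state in priors}
    evidence = sum(joint.values())  # unconditional P(expansion), the denominator
    posteriors = {state: j / evidence for state, j in joint.items()}
    return posteriors, evidence

posteriors, evidence = bayes_update(priors, likelihoods)

print(f"P(expansion) = {evidence:.2f}")
for state, prob in posteriors.items():
    print(f"P({state} | expansion) = {prob:.1%}")
print(f"Posteriors sum to {sum(posteriors.values()):.2f}")
```

The denominator is 0.41, so the posterior for a beat is 0.3375 / 0.41, or about 82%. Because the same denominator is shared by every state, the posteriors are forced to sum to 1.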

Interview questions

JPMorgan Credit Risk Officer "A model flags a borrower as high-risk. The model's true positive rate is 70% and its false positive rate is 15%. The base rate of default is 8%. What is the actual probability the flagged borrower defaults, and why is it lower than 70%?"

CFA Institute Investment Analyst "An analyst reads that 80% of companies eventually acquired had received analyst upgrades beforehand. She observes an upgrade on a stock in her coverage and concludes the acquisition probability is 80%. What error has she made, and how would you correct it?"

Fidelity Equity Research Analyst "You are estimating the probability a company beats earnings consensus. Your prior is 45% beat, 30% meet, 25% miss. An expansion announcement arrives. The likelihood of expansion given beat is 75%, given meet is 20%, given miss is 5%. You calculate the posterior for beat as 82%. Without calculating the other two posteriors, how can you verify your arithmetic is correct?"

One-line to use in your interview

Interviewers listen for industry-specific language. It signals you understand the concept, not just the definition. Use the plain English version to adapt it in your own words.

When I receive new information about a borrower or a company, I use Bayes' formula to update my probability estimates rather than simply substituting the new signal as if it were the answer, because the direction of the conditional probability matters and the base rate always pulls the posterior toward it.

In plain English

If I hear that delinquent borrowers often file late, that does not mean a late filer is usually delinquent. Those are opposite questions. Bayes' formula accounts for the fact that delinquency is relatively rare overall and that plenty of healthy borrowers also file late, so even a strong signal does not guarantee the outcome. I always compute the denominator, the overall probability of seeing the signal, before I report a posterior.