Updated: Mar 2026 · Probability and Statistics › Probability Theory
Random Variables and Distributions
Comprehensive study notes on Random Variables and Distributions for CMI Data Science preparation.
This chapter covers key concepts, formulas, and examples needed for your exam.
Welcome to the foundational chapter on Random Variables and Distributions, a cornerstone of your CMI Masters in Data Science curriculum. This chapter is absolutely critical, as it lays the theoretical and practical groundwork for understanding and applying nearly every statistical and machine learning concept you will encounter. Without a firm grasp of random variables and their distributions, topics like hypothesis testing, regression analysis, and even advanced deep learning architectures become abstract and difficult to interpret effectively.
In the CMI context, mastering this material is not just about theoretical understanding; it's about developing the intuition and analytical tools to tackle real-world data challenges. You'll learn how to mathematically model uncertainty, quantify variability, and make informed decisions based on probabilistic outcomes. This chapter directly addresses core competencies required for the CMI exams, ensuring you can correctly identify, apply, and interpret different probabilistic models crucial for data analysis and predictive modeling.
By the end of this chapter, you will possess the essential framework for reasoning about data generation processes, understanding the behavior of estimators, and interpreting the output of complex algorithms. This knowledge is indispensable for building robust data science solutions and effectively communicating insights, making it a high-yield area for your CMI success.
---
Chapter Contents
| # | Topic | What You'll Learn |
|---|-------|-------------------|
| 1 | Random Variables | Quantify outcomes of random experiments numerically. |
| 2 | Distribution Functions | Describe the probability of a variable taking values. |
| 3 | Expectation and Variance | Measure central tendency and data spread. |
| 4 | Standard Distributions | Explore common, well-understood probabilistic models. |
---
Learning Objectives
⭐ By the End of This Chapter
After studying this chapter, you will be able to:
Define and classify discrete and continuous random variables, and understand their role in modeling uncertainty.
Calculate and interpret Probability Mass Functions (PMF), Probability Density Functions (PDF), and Cumulative Distribution Functions (CDF).
Compute and explain the expectation, variance, and standard deviation of random variables.
Identify characteristics and apply properties of key standard distributions (e.g., Bernoulli, Binomial, Poisson, Uniform, Normal, Exponential).
---
Now let's begin with Random Variables.

Part 1: Random Variables
Introduction
In the realm of probability and statistics, a random variable serves as a fundamental concept, bridging the gap between abstract outcomes of a random experiment and numerical values. It is a function that assigns a real number to each outcome in the sample space of a random experiment. This transformation allows us to apply mathematical tools, such as algebra and calculus, to analyze the probabilities associated with these numerical outcomes.
Understanding random variables is crucial for CMI as it forms the bedrock for analyzing data, modeling uncertainty, and making predictions. In data science, almost every piece of data collected or generated can be viewed as a realization of one or more random variables, from the success rate of an algorithm to the error in a measurement. This unit will rigorously define random variables, explore their types, and detail methods for characterizing their behavior through probability distributions.
The set of all possible values that a random variable X can take is called its range or support, denoted by R_X.
---
Key Concepts
1. Types of Random Variables
Random variables are primarily classified into two types based on their range:
* Discrete Random Variables: A random variable is discrete if its range R_X is a finite or countably infinite set of real numbers. These variables typically arise from counting processes.
  * Examples: the number of heads in three coin flips (R_X = {0, 1, 2, 3}), the number of customers arriving at a store in an hour (R_X = {0, 1, 2, …}).
* Continuous Random Variables: A random variable is continuous if its range R_X is an uncountable set, typically an interval or a union of intervals on the real line. These variables usually arise from measurements.
  * Examples: the height of a student, the time a process takes to complete, the temperature of a room.
For the purpose of this chapter and the CMI exam, we will primarily focus on discrete random variables, as they are frequently encountered and directly relevant to the provided PYQ.
---
2. Probability Mass Function (PMF)
For a discrete random variable, its probability distribution is described by a Probability Mass Function (PMF). The PMF specifies the probability that the random variable takes on each of its possible values.
📖 Probability Mass Function (PMF)
For a discrete random variable X with range R_X = {x_1, x_2, …}, its Probability Mass Function (PMF), denoted by p_X(x) or P(X = x), is a function such that:
p_X(x) ≥ 0 for all x ∈ R_X.
Σ_{x ∈ R_X} p_X(x) = 1.
p_X(x) = 0 for x ∉ R_X.
The value p_X(x) represents the probability that the random variable X takes on the specific value x.
Worked Example:
Problem: A fair coin is flipped three times. Let X be the random variable representing the number of heads obtained. Determine the PMF of X.
Solution:
Step 1: Identify the sample space and the values of X. The sample space has 2³ = 8 equally likely outcomes, and X can take the values 0, 1, 2, 3.
Step 2: Count the outcomes for each value of X. Exactly one outcome has 0 heads (TTT), three have 1 head, three have 2 heads, and one has 3 heads (HHH).
Answer: The PMF of X is p_X(0) = 1/8, p_X(1) = 3/8, p_X(2) = 3/8, p_X(3) = 1/8, and p_X(x) = 0 for x ∉ {0, 1, 2, 3}.
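The tally above can be checked mechanically. A minimal Python sketch (not part of the original notes) that enumerates all eight outcomes and builds the PMF exactly with `fractions.Fraction`:

```python
from itertools import product
from fractions import Fraction

# Enumerate all 2^3 = 8 equally likely outcomes of three fair coin flips
outcomes = list(product("HT", repeat=3))

pmf = {}
for outcome in outcomes:
    x = outcome.count("H")  # the value of the random variable X
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, len(outcomes))

# p_X(0) = 1/8, p_X(1) = 3/8, p_X(2) = 3/8, p_X(3) = 1/8
print(sorted(pmf.items()))
```

Using exact fractions instead of floats makes the verification Σ p_X(x) = 1 exact rather than approximate.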
---
3. Cumulative Distribution Function (CDF)
The CDF provides a cumulative view of the probabilities, indicating the probability that a random variable X takes on a value less than or equal to a given value x.
📖 Cumulative Distribution Function (CDF)
The Cumulative Distribution Function (CDF) of a random variable X, denoted by F_X(x), is defined for any real number x as:
F_X(x) = P(X ≤ x)
For a discrete random variable, the CDF can be calculated by summing the PMF values:
F_X(x) = Σ_{x_i ≤ x} p_X(x_i)
Properties of a CDF:
0 ≤ F_X(x) ≤ 1 for all x ∈ ℝ.
F_X(x) is non-decreasing: if a < b, then F_X(a) ≤ F_X(b).
lim_{x → −∞} F_X(x) = 0.
lim_{x → +∞} F_X(x) = 1.
F_X(x) is right-continuous: lim_{t → x⁺} F_X(t) = F_X(x).
Worked Example:
Problem: Using the PMF from the previous example (X = number of heads in 3 coin flips), find the CDF of X.
Solution:
Step 1: Recall the PMF values: p_X(0) = 1/8, p_X(1) = 3/8, p_X(2) = 3/8, p_X(3) = 1/8.
Step 2: Calculate F_X(x) for different intervals of x.
For x < 0: F_X(x) = P(X ≤ x) = 0 (since X cannot be negative).
For 0 ≤ x < 1: F_X(x) = p_X(0) = 1/8.
For 1 ≤ x < 2: F_X(x) = p_X(0) + p_X(1) = 1/8 + 3/8 = 4/8 = 1/2.
For 2 ≤ x < 3: F_X(x) = p_X(0) + p_X(1) + p_X(2) = 1/8 + 3/8 + 3/8 = 7/8.
For x ≥ 3: F_X(x) = p_X(0) + p_X(1) + p_X(2) + p_X(3) = 8/8 = 1.
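The step function above is just a running sum of the PMF. A small Python sketch (illustrative, with a hypothetical `cdf` helper) that reproduces it:

```python
from fractions import Fraction

# PMF of X = number of heads in 3 fair coin flips
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def cdf(x, pmf=pmf):
    """F_X(x) = P(X <= x): sum the PMF over all values not exceeding x."""
    return sum(p for v, p in pmf.items() if v <= x)

# The CDF is constant between the jump points 0, 1, 2, 3
print(cdf(-1), cdf(0), cdf(1.5), cdf(2), cdf(10))
```

Evaluating at a non-integer point like 1.5 shows why the CDF is defined for every real x, not just the values in R_X.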
---
4. Functions of a Random Variable
Often, we are interested in a new random variable Y that is a function of an existing random variable X, i.e., Y = g(X). To find the PMF of Y, we identify the possible values of Y and sum the probabilities of all X values that map to each Y value. This concept is directly tested in the provided CMI PYQ.
⭐ Finding the PMF of Y = g(X)
Let X be a discrete random variable with PMF p_X(x) and range R_X. Let Y = g(X) be a new discrete random variable. The range of Y is R_Y = {y | y = g(x) for some x ∈ R_X}. The PMF of Y, p_Y(y), is given by:
p_Y(y) = P(Y = y) = Σ_{x ∈ R_X : g(x) = y} p_X(x)
This means that for each value y in the range of Y, we sum the probabilities of all x values in R_X that are mapped to y by the function g.
Worked Example:
Problem: Let X be a discrete random variable with PMF given by p_X(1) = 0.2, p_X(2) = 0.3, p_X(3) = 0.3, p_X(4) = 0.2. Let Y = (X − 2)². Find the PMF of Y.
Solution:
Step 1: Identify the range of X and its PMF: R_X = {1, 2, 3, 4} with p_X(1) = 0.2, p_X(2) = 0.3, p_X(3) = 0.3, p_X(4) = 0.2.
Step 2: Determine the possible values of Y = (X − 2)² by applying g(x) = (x − 2)² to each value in R_X.
For x = 1: y = (1 − 2)² = 1. For x = 2: y = (2 − 2)² = 0. For x = 3: y = (3 − 2)² = 1. For x = 4: y = (4 − 2)² = 4.
The range of Y is R_Y = {0, 1, 4}.
Step 3: Calculate the PMF of Y for each value in R_Y.
For Y = 0: only X = 2 maps to Y = 0, so p_Y(0) = P(X = 2) = 0.3.
For Y = 1: X = 1 and X = 3 map to Y = 1, so p_Y(1) = p_X(1) + p_X(3) = 0.2 + 0.3 = 0.5.
For Y = 4: only X = 4 maps to Y = 4, so p_Y(4) = p_X(4) = 0.2.
Step 4: Verify the PMF properties. All p_Y(y) ≥ 0, and Σ_{y ∈ R_Y} p_Y(y) = 0.3 + 0.5 + 0.2 = 1.0.
Answer: The PMF of Y is p_Y(0) = 0.3, p_Y(1) = 0.5, p_Y(4) = 0.2, and p_Y(y) = 0 for y ∉ {0, 1, 4}.
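The "group by g(x) and sum" recipe translates directly into code. A quick Python sketch of the worked example (illustrative, not from the original notes):

```python
from collections import defaultdict

# PMF of X from the worked example
pmf_x = {1: 0.2, 2: 0.3, 3: 0.3, 4: 0.2}

def g(x):
    return (x - 2) ** 2

# p_Y(y) = sum of p_X(x) over every x with g(x) = y
pmf_y = defaultdict(float)
for x, p in pmf_x.items():
    pmf_y[g(x)] += p

# p_Y(0) = 0.3, p_Y(1) = 0.5, p_Y(4) = 0.2
print(dict(pmf_y))
```

Because g is not one-to-one (both 1 and 3 map to y = 1), the accumulation step is essential; overwriting instead of adding is exactly the common mistake described later in the chapter.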
---
5. Expected Value of a Discrete Random Variable
The expected value (or mean) of a random variable is a measure of its central tendency, representing the average value we would expect to observe if the experiment were repeated many times.
📖 Expected Value (Mean)
For a discrete random variable X with PMF p_X(x) and range R_X, the Expected Value (or Mean), denoted by E[X] or μ_X, is:
E[X] = Σ_{x ∈ R_X} x · p_X(x)
Variables:
X = discrete random variable
x = a specific value in the range of X
p_X(x) = probability that X takes the value x
When to use: To find the long-run average of a random variable, or its central location.
📖 Expected Value of a Function of a Random Variable
If Y = g(X) is a function of a discrete random variable X, its expected value can be calculated directly from the PMF of X:
E[g(X)] = Σ_{x ∈ R_X} g(x) · p_X(x)
Variables:
g(X) = function of the random variable X
x = a specific value in the range of X
p_X(x) = probability that X takes the value x
When to use: To find the average value of a transformation of a random variable without first finding the PMF of Y=g(X).
Properties of Expected Value:
E[c] = c for any constant c.
E[aX + b] = aE[X] + b for constants a, b (linearity of expectation).
E[X + Y] = E[X] + E[Y] for any random variables X, Y, not necessarily independent.
Worked Example:
Problem: For the random variable X (number of heads in 3 coin flips) with PMF p_X(0) = 1/8, p_X(1) = 3/8, p_X(2) = 3/8, p_X(3) = 1/8, calculate E[X] and E[2X + 1].
Solution: E[X] = 0·(1/8) + 1·(3/8) + 2·(3/8) + 3·(1/8) = 12/8 = 1.5. By linearity of expectation, E[2X + 1] = 2E[X] + 1 = 2(1.5) + 1 = 4.
---
6. Variance
The variance measures the spread of a random variable's values around its mean.
📖 Variance
The Variance of a random variable X, denoted by Var(X) or σ_X², is defined as:
Var(X) = E[(X − μ_X)²]
An often more convenient computational formula for variance is:
Var(X) = E[X²] − (E[X])²
where E[X²] = Σ_{x ∈ R_X} x² p_X(x).
Variables:
X = discrete random variable
μ_X = expected value (mean) of X
p_X(x) = probability that X takes the value x
When to use: To quantify the spread or variability of a random variable's distribution.
📖 Standard Deviation
The Standard Deviation of a random variable X, denoted by σ_X, is the positive square root of its variance:
σ_X = √Var(X)
Properties of Variance:
Var(c) = 0 for any constant c.
Var(aX + b) = a²Var(X) for constants a, b. Note that the constant b does not affect the variance.
Var(X) ≥ 0.
Worked Example:
Problem: For the random variable X (number of heads in 3 coin flips) with PMF p_X(0) = 1/8, p_X(1) = 3/8, p_X(2) = 3/8, p_X(3) = 1/8, and E[X] = 1.5, calculate Var(X) and Var(2X + 1).
Solution:
Step 1: Calculate E[X²] = 0²·(1/8) + 1²·(3/8) + 2²·(3/8) + 3²·(1/8) = (0 + 3 + 12 + 9)/8 = 3.
Step 2: Calculate Var(X) using the computational formula.
Var(X) = E[X²] − (E[X])² = 3 − (1.5)² = 3 − 2.25 = 0.75
Step 3: Calculate Var(2X + 1) using the properties of variance.
Var(2X + 1) = 2²Var(X) = 4 · 0.75 = 3
Answer: Var(X) = 0.75 and Var(2X + 1) = 3.
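Both the direct computation of Var(2X + 1) and the shortcut a²Var(X) should agree, which is easy to sanity-check in code. A quick Python sketch (the `expectation` helper is a hypothetical name, not from the notes):

```python
# PMF of X = number of heads in 3 fair coin flips
pmf = {0: 1 / 8, 1: 3 / 8, 2: 3 / 8, 3: 1 / 8}

def expectation(g, pmf):
    """E[g(X)] = sum over x of g(x) * p_X(x)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(lambda x: x, pmf)                  # E[X] = 1.5
var = expectation(lambda x: x * x, pmf) - mean ** 2   # E[X^2] - E[X]^2 = 0.75

# Var(2X + 1) computed from the PMF of Y = 2X + 1 directly
pmf_y = {2 * x + 1: p for x, p in pmf.items()}
mean_y = expectation(lambda y: y, pmf_y)
var_y = expectation(lambda y: y * y, pmf_y) - mean_y ** 2

print(mean, var, var_y)  # 1.5 0.75 3.0, matching 2^2 * Var(X)
```

Note that g(x) = 2x + 1 is one-to-one here, so the PMF of Y is just a relabeling of the PMF of X.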
---
7. Joint Probability and Independence of Random Variables
When dealing with multiple random variables, we often need to understand their joint behavior.
📖 Joint Probability Mass Function (Joint PMF)
For two discrete random variables X and Y, their Joint Probability Mass Function (Joint PMF), denoted by p_{X,Y}(x, y) or P(X = x, Y = y), is a function such that:
p_{X,Y}(x, y) ≥ 0 for all (x, y) in the joint range.
Σ_x Σ_y p_{X,Y}(x, y) = 1.
The value p_{X,Y}(x, y) represents the probability that X takes value x AND Y takes value y simultaneously.
From a joint PMF, we can derive the marginal PMFs for X and Y:
p_X(x) = Σ_y p_{X,Y}(x, y)
p_Y(y) = Σ_x p_{X,Y}(x, y)
📖 Independence of Discrete Random Variables
Two discrete random variables X and Y are said to be independent if and only if their joint PMF equals the product of their marginal PMFs for all possible values x and y:
p_{X,Y}(x, y) = p_X(x) · p_Y(y) for all x, y
Equivalently, X and Y are independent if P(X = x, Y = y) = P(X = x)P(Y = y) for all x, y.
⚠️ Independence of X and g(X)
❌ A common mistake is assuming that if Y = g(X), then X and Y are independent.
✅ This is generally false. If Y is a non-trivial function of X, they are dependent: knowing the value of X directly tells you the value of Y, which is the definition of dependence. The only exception is when g(X) is a constant, in which case Y is not truly random (and independence holds vacuously for X and a constant). The PYQ explicitly tests this concept.
---
8. Uniform Discrete Distribution
A random variable follows a uniform discrete distribution if each value in its finite range has an equal probability of being observed. This is directly stated in the PYQ.
📖 Uniform Discrete Random Variable
A discrete random variable X has a uniform distribution over a finite set of N values {x_1, x_2, …, x_N} if its PMF is given by:
p_X(x_i) = 1/N for i = 1, 2, …, N
---
Problem-Solving Strategies
💡 CMI Strategy: PMF of a Function of a Random Variable
When asked about the distribution or probability of Y = g(X):
List R_X and p_X(x): Clearly write down the range and PMF of the original random variable X.
Determine R_Y: For each x ∈ R_X, calculate y = g(x). Collect these unique y values to form R_Y.
Map X to Y: For each y ∈ R_Y, identify all x ∈ R_X such that g(x) = y.
Sum probabilities: p_Y(y) = Σ_{x : g(x) = y} p_X(x).
Verify: Ensure Σ_{y ∈ R_Y} p_Y(y) = 1.
This systematic approach minimizes errors, especially when g(X) is not a one-to-one function.
💡 CMI Strategy: Independence Check
To verify if X and Y are independent:
Calculate marginal PMFs: Find p_X(x) and p_Y(y) from the joint PMF p_{X,Y}(x, y).
Check the condition: For all pairs (x, y), verify whether p_{X,Y}(x, y) = p_X(x) · p_Y(y).
One counterexample is enough: If even one pair (x, y) violates the equality, then X and Y are dependent.
---
Common Mistakes
⚠️ Avoid These Errors
❌ Assuming X and g(X) are independent: This is a very common trap. As discussed, knowing X usually determines g(X), making them dependent. For instance, if X is the number of heads and Y = X², they are clearly dependent.
✅ Correct approach: Always treat X and g(X) as dependent unless g(X) is a constant function or independence is specifically proven.
❌ Incorrectly calculating the PMF of Y = g(X): Forgetting to sum probabilities over all X values that map to the same Y value.
✅ Correct approach: Systematically list all x values, calculate their corresponding y values, group the x values that produce the same y, and sum their original p_X(x) values.
❌ Confusing PMF and CDF: Using P(X = x) when P(X ≤ x) is required, or vice versa.
✅ Correct approach: Remember that p_X(x) is for a single value and F_X(x) covers all values up to and including x. For discrete RVs, P(a < X ≤ b) = F_X(b) − F_X(a).
❌ Arithmetic errors with the modulo operator: Misunderstanding the range of values produced by a mod n.
✅ Correct approach: Recall that a mod n always yields a value in {0, 1, …, n − 1} for positive n. For example, 5 mod 3 = 2, and 0 mod 3 = 0.
---
Practice Questions
:::question type="MCQ" question="Let X be a discrete random variable with PMF p_X(1)=0.1, p_X(2)=0.3, p_X(3)=0.4, p_X(4)=0.2. Let Y=|X−2|. Which of the following is the correct PMF for Y?" options=["p_Y(0)=0.3, p_Y(1)=0.5, p_Y(2)=0.2","p_Y(0)=0.1, p_Y(1)=0.3, p_Y(2)=0.4, p_Y(3)=0.2","p_Y(0)=0.3, p_Y(1)=0.3, p_Y(2)=0.4","p_Y(0)=0.3, p_Y(1)=0.6, p_Y(2)=0.1"] answer="p_Y(0)=0.3, p_Y(1)=0.5, p_Y(2)=0.2" hint="Map each value of X to Y and sum probabilities for repeated Y values." solution="Step 1: Determine the values of Y=|X−2| for each x in R_X.
If X=1, Y=|1−2|=1.
If X=2, Y=|2−2|=0.
If X=3, Y=|3−2|=1.
If X=4, Y=|4−2|=2.
Step 2: Identify the range of Y, which is R_Y={0,1,2}.
Step 3: Sum the probabilities mapping to each value: p_Y(0)=p_X(2)=0.3; p_Y(1)=p_X(1)+p_X(3)=0.1+0.4=0.5; p_Y(2)=p_X(4)=0.2.
Step 4: Verify that the probabilities sum to 1: 0.3+0.5+0.2=1.0.
Answer: p_Y(0)=0.3, p_Y(1)=0.5, p_Y(2)=0.2" :::
:::question type="NAT" question="A discrete random variable X has the following PMF: p_X(x)=c(x+1) for x∈{0,1,2}, and 0 otherwise. Calculate the value of E[X²]. (Enter your answer as a decimal rounded to two decimal places.)" answer="2.33" hint="First find the constant c by ensuring the sum of probabilities is 1. Then calculate E[X²]." solution="Step 1: Find the constant c. The sum of probabilities must be 1:
Σ_{x=0}^{2} p_X(x) = 1
c(0+1) + c(1+1) + c(2+1) = 1
c(1 + 2 + 3) = 1
6c = 1
c = 1/6
Step 2: Write out the full PMF: p_X(0) = 1/6, p_X(1) = 2/6, p_X(2) = 3/6.
Step 3: Calculate E[X²].
E[X²] = Σ_{x ∈ R_X} x² p_X(x) = (0² · 1/6) + (1² · 2/6) + (2² · 3/6) = 0 + 2/6 + 12/6 = 14/6 = 7/3
Step 4: Convert to a decimal rounded to two places: 7/3 ≈ 2.3333…, so E[X²] = 2.33.
Answer: 2.33" :::
:::question type="MSQ" question="Let X be a random variable representing the outcome of rolling a fair six-sided die, so R_X={1,2,3,4,5,6}. Let Y=X mod 3. Which of the following statements is/are true?" options=["P(Y=0)=1/3","X and Y are independent","E[Y]=1","Var(Y)=2/3"] answer="A,C,D" hint="Calculate the PMF of Y first. Then evaluate independence, expected value, and variance." solution="Step 1: Determine the PMF of X. Since it is a fair die, p_X(x)=1/6 for x∈{1,2,3,4,5,6}.
Step 2: Determine the values of Y=X mod 3.
If X=1, Y=1. If X=2, Y=2. If X=3, Y=0. If X=4, Y=1. If X=5, Y=2. If X=6, Y=0.
The range of Y is R_Y={0,1,2}.
Step 3: Calculate p_Y(y). Each value of Y is produced by exactly two values of X, so p_Y(0)=p_Y(1)=p_Y(2)=2/6=1/3.
Statement A: P(Y=0)=P(X=3 or X=6)=1/6+1/6=1/3. TRUE.
Statement B: X and Y are independent. Since Y is a function of X, they are generally dependent. For example, if we know X=1, then Y must be 1 mod 3 = 1, so P(Y=1|X=1)=1 ≠ P(Y=1)=1/3. Thus X and Y are dependent. FALSE.
Statement C: E[Y]=0·(1/3)+1·(1/3)+2·(1/3)=1. TRUE.
Statement D: E[Y²]=0·(1/3)+1·(1/3)+4·(1/3)=5/3, so Var(Y)=E[Y²]−(E[Y])²=5/3−1=2/3. TRUE.
Answer: A, C, D" :::
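The die-and-modulo claims can be confirmed numerically. A quick Python sketch (not part of the original solution) that builds the PMF of Y = X mod 3 and checks E[Y] and Var(Y):

```python
from fractions import Fraction

# Fair six-sided die
pmf_x = {x: Fraction(1, 6) for x in range(1, 7)}

# PMF of Y = X mod 3: sum p_X(x) over every x mapping to the same residue
pmf_y = {}
for x, p in pmf_x.items():
    y = x % 3
    pmf_y[y] = pmf_y.get(y, Fraction(0)) + p

e_y = sum(y * p for y, p in pmf_y.items())
var_y = sum(y * y * p for y, p in pmf_y.items()) - e_y ** 2

# Each residue has probability 1/3; E[Y] = 1, Var(Y) = 2/3
print(pmf_y, e_y, var_y)
```

The same loop also makes the dependence of X and Y obvious: conditioning on any single die face pins Y down completely.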
:::question type="SUB" question="Let X be a discrete random variable with PMF p_X(x)=1/2^x for x∈{1,2,3,…}, and 0 otherwise. a) Prove that this is a valid PMF. b) Derive the expression for the CDF, F_X(x). c) Calculate E[X]." answer="a) Σ p_X(x)=1. b) F_X(x)=1−(1/2)^⌊x⌋ for x≥1, and 0 for x<1. c) E[X]=2." hint="For part a), use the sum of a geometric series. For part b), use the definition of the CDF. For part c), use the formula for E[X] and the sum Σ_{k=1}^{∞} k r^k = r/(1−r)² for |r|<1." solution="Part a) Prove that this is a valid PMF.
Step 1: Check non-negativity. For x∈{1,2,3,…}, p_X(x)=1/2^x>0; for other x, p_X(x)=0. So p_X(x)≥0 for all x.
Step 2: Check that the probabilities sum to 1. We evaluate Σ_{x=1}^{∞} 1/2^x = 1/2 + 1/4 + 1/8 + …
This is a geometric series with first term a=1/2 and common ratio r=1/2. The sum of an infinite geometric series is a/(1−r) for |r|<1, so the sum is (1/2)/(1−1/2) = 1.
Since both conditions are met, p_X(x) is a valid PMF.
Part b) Derive the expression for the CDF, F_X(x).
Step 1: By definition, F_X(x)=P(X≤x).
For x<1: F_X(x)=0 (since X cannot take values less than 1).
For x≥1: F_X(x)=Σ_{k=1}^{⌊x⌋} 1/2^k = 1 − (1/2)^⌊x⌋, a partial geometric sum.
Part c) Calculate E[X].
E[X] = Σ_{x=1}^{∞} x/2^x. Using Σ_{k=1}^{∞} k r^k = r/(1−r)² with r=1/2 gives E[X] = (1/2)/(1/2)² = 2.
Answer: a) valid PMF; b) F_X(x)=1−(1/2)^⌊x⌋ for x≥1 (0 for x<1); c) E[X]=2." :::
---
Key Takeaways
PMF for discrete RVs: The Probability Mass Function p_X(x)=P(X=x) describes the probability of a discrete random variable taking a specific value. It must satisfy p_X(x) ≥ 0 and Σ_x p_X(x) = 1.
CDF provides cumulative probabilities: The Cumulative Distribution Function F_X(x)=P(X≤x) gives the probability that X is less than or equal to x.
Functions of RVs are crucial: To find the PMF of Y=g(X), sum the probabilities p_X(x) over all x values that map to the same y value. This is a common exam concept.
Expected value and variance: E[X] measures central tendency and Var(X) measures spread. Remember their formulas and properties, especially linearity of expectation and Var(aX+b)=a²Var(X).
Independence of X and g(X) is rare: X and Y=g(X) are generally dependent. Do not assume independence unless g(X) is constant; check P(X=x, Y=y)=P(X=x)P(Y=y) for independence.
---
What's Next?
💡 Continue Learning
This topic connects to:
Common Discrete Distributions: Understanding specific PMFs (e.g., Bernoulli, Binomial, Poisson) that arise from specific random experiments. These distributions are built upon the fundamental concepts of random variables.
Joint Distributions of Multiple Random Variables: Extending the concepts of PMF, CDF, expectation, and variance to scenarios involving two or more random variables, exploring their relationships (e.g., covariance, correlation).
Continuous Random Variables: While this chapter focused on discrete RVs, the principles extend to continuous RVs using Probability Density Functions (PDFs) and integrals instead of sums.
Master these connections for comprehensive CMI preparation!
---
💡 Moving Forward
Now that you understand random variables, let's explore distribution functions, which build on these concepts.
---
Part 2: Distribution Functions
Introduction
Distribution functions are fundamental to probability theory and statistics, providing a comprehensive way to describe the behavior of random variables. In the context of the CMI Masters in Data Science, a deep understanding of these functions is crucial for modeling real-world phenomena, performing statistical inference, and building predictive models. This topic covers the essential concepts of how probabilities are distributed across the possible values of a random variable, whether discrete or continuous. Mastery of distribution functions allows us to quantify uncertainty, calculate probabilities of events, and characterize key aspects like the central tendency and spread of data, which are indispensable skills for any data scientist.
📖 Random Variable
A random variable is a function that maps the outcomes of a random experiment to real numbers. Random variables can be broadly classified into two types:
Discrete Random Variable: A random variable whose set of possible values is finite or countably infinite.
Continuous Random Variable: A random variable whose set of possible values is an interval (finite or infinite) on the real number line.
---
Key Concepts
1. Probability Mass Function (PMF)
The Probability Mass Function (PMF) is used to describe the probability distribution of a discrete random variable. It assigns a probability to each possible value that the random variable can take.
📖 Probability Mass Function (PMF)
For a discrete random variable X, its Probability Mass Function (PMF), denoted by p_X(x) or P(X=x), satisfies the following properties:
p_X(x) ≥ 0 for all possible values x.
Σ_x p_X(x) = 1, where the sum runs over all possible values of X.
Worked Example:
Problem: Let X be the number of heads in two coin tosses. Determine its PMF.
Solution:
Step 1: Identify the sample space and possible values of X.
The sample space for two coin tosses is S={HH, HT, TH, TT}. The possible values for X (number of heads) are 0, 1, 2.
Step 2: Calculate the probability for each value of X. Each of the four outcomes has probability 1/4; one outcome gives 0 heads, two give 1 head, and one gives 2 heads.
Answer: The PMF is p_X(0)=1/4, p_X(1)=1/2, p_X(2)=1/4.
---
2. Probability Density Function (PDF)
The Probability Density Function (PDF) is used to describe the probability distribution of a continuous random variable. Unlike the PMF, the PDF does not give the probability of a specific value, but rather the relative likelihood of the random variable taking on a given value. Probabilities for continuous random variables are calculated over intervals.
📖 Probability Density Function (PDF)
For a continuous random variable X, its Probability Density Function (PDF), denoted by f_X(x) or f(x), satisfies the following properties:
f(x) ≥ 0 for all x ∈ ℝ.
∫_{−∞}^{∞} f(x) dx = 1.
The probability that X falls into an interval [a, b] is given by P(a ≤ X ≤ b) = ∫_a^b f(x) dx.
⭐ Must Remember
For a continuous random variable X, the probability of X taking any single specific value is 0. That is, P(X = x₀) = 0 for any x₀. Consequently, P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b).
Worked Example:
Problem: Let X be a continuous random variable with PDF f(x) = cx(1 − x) for 0 ≤ x ≤ 1, and 0 otherwise. (a) Determine the value of c. (b) Find the probability P(X > 0.5).
Solution (a):
Step 1: Apply the normalization property of a PDF.
∫_{−∞}^{∞} f(x) dx = 1
Step 2: Substitute the given PDF and integrate over its non-zero range.
∫_0^1 cx(1 − x) dx = 1
Step 3: Simplify the integrand and perform the integration.
c ∫_0^1 (x − x²) dx = 1
c [x²/2 − x³/3]_0^1 = 1
c ((1/2 − 1/3) − 0) = 1
c ((3 − 2)/6) = 1
c (1/6) = 1
Step 4: Solve for c.
c = 6
Answer (a): c = 6.
Solution (b):
Step 1: Set up the integral for P(X > 0.5) using the determined PDF.
P(X > 0.5) = ∫_{0.5}^1 6x(1 − x) dx = [3x² − 2x³]_{0.5}^1 = (3 − 2) − (0.75 − 0.25) = 1 − 0.5 = 0.5
Answer (b): P(X > 0.5) = 0.5.
---
3. Cumulative Distribution Function (CDF)
The Cumulative Distribution Function (CDF) provides the probability that a random variable X takes a value less than or equal to a given value x. It is defined for both discrete and continuous random variables.
📖 Cumulative Distribution Function (CDF)
For any random variable X, its Cumulative Distribution Function (CDF), denoted by F_X(x) or F(x), is defined as:
F(x) = P(X ≤ x)
Properties of a CDF:
0 ≤ F(x) ≤ 1 for all x ∈ ℝ.
F(x) is non-decreasing: if a < b, then F(a) ≤ F(b).
lim_{x → −∞} F(x) = 0.
lim_{x → +∞} F(x) = 1.
F(x) is right-continuous: lim_{t → x⁺} F(t) = F(x).
For a discrete random variable X with PMF p_X(x):
F(x) = Σ_{t ≤ x} p_X(t)
For a continuous random variable X with PDF f_X(x):
F(x) = ∫_{−∞}^x f_X(t) dt
Conversely, wherever F(x) is differentiable, f_X(x) = d/dx F(x).
📖 Probability from CDF
For any random variable X:
P(a < X ≤ b) = F(b) − F(a)
For a continuous random variable:
P(X > a) = 1 − F(a)
Variables:
F(x) = Cumulative Distribution Function
P(Xβ€x) = Probability that X is less than or equal to x
When to use: Calculating probabilities over intervals for any type of random variable.
Worked Example:
Problem: For the continuous random variable X with PDF f(x) = 6x(1 − x) for 0 ≤ x ≤ 1, and 0 otherwise, find its CDF F(x). Then use the CDF to find P(X > 0.5).
Solution: For 0 ≤ x ≤ 1, F(x) = ∫_0^x 6t(1 − t) dt = [3t² − 2t³]_0^x = 3x² − 2x³, with F(x) = 0 for x < 0 and F(x) = 1 for x > 1. Then P(X > 0.5) = 1 − F(0.5) = 1 − (0.75 − 0.25) = 0.5.
Answer: The CDF is F(x) = 3x² − 2x³ for 0 ≤ x ≤ 1, and P(X > 0.5) = 0.5.
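The closed-form CDF can be cross-checked against a numerical integration of the PDF. A quick Python sketch (a midpoint-rule approximation, purely illustrative):

```python
def F(x):
    """CDF of f(x) = 6x(1 - x) on [0, 1]: F(x) = 3x^2 - 2x^3."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return 3 * x ** 2 - 2 * x ** 3

def F_numeric(x, n=100_000):
    """Midpoint Riemann sum of the PDF from 0 to x, for comparison."""
    h = x / n
    return sum(6 * ((i + 0.5) * h) * (1 - (i + 0.5) * h) * h for i in range(n))

print(1 - F(0.5))                    # P(X > 0.5) = 0.5
print(abs(F(0.7) - F_numeric(0.7)))  # near zero: closed form matches the integral
```

Agreement at an arbitrary interior point like x = 0.7 is a stronger check than agreement at the endpoints alone.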
---
4. Expected Value (Mean)
The expected value, or mean, of a random variable is a measure of its central tendency. It represents the average value one would expect if the experiment were repeated many times.
📖 Expected Value (Mean)
For a discrete random variable X with PMF p_X(x):
E[X] = Σ_x x · p_X(x)
For a continuous random variable X with PDF f_X(x):
E[X] = ∫_{−∞}^{∞} x · f_X(x) dx
📖 Expected Value of a Function
For a discrete random variable X and a function g(X):
E[g(X)] = Σ_x g(x) · p_X(x)
For a continuous random variable X and a function g(X):
E[g(X)] = ∫_{−∞}^{∞} g(x) · f_X(x) dx
Variables:
X = Random variable
p_X(x) = PMF of X
f_X(x) = PDF of X
g(X) = A function of X
When to use: To find the average value of a random variable or a function of a random variable.
Worked Example:
Problem: Find the expected value of X for the continuous random variable with PDF f(x) = 6x(1 − x) for 0 ≤ x ≤ 1.
Solution:
Step 1: Apply the formula for the expected value of a continuous random variable.
E[X] = ∫_{−∞}^{∞} x · f(x) dx
Step 2: Substitute the PDF and integrate over its non-zero range.
E[X] = ∫_0^1 x · 6x(1 − x) dx = 6 ∫_0^1 (x² − x³) dx
Step 3: Perform the integration.
E[X] = 6 [x³/3 − x⁴/4]_0^1 = 6 (1/3 − 1/4) = 6 · (4 − 3)/12 = 6/12 = 1/2
Answer: E[X] = 0.5.
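A numerical approximation of the same integral gives a quick sanity check on the calculus. A minimal midpoint-rule sketch in Python (illustrative only):

```python
# Midpoint-rule approximation of E[X] = integral of x * 6x(1 - x) over [0, 1]
n = 200_000
h = 1.0 / n
e_x = sum(
    ((i + 0.5) * h) * 6 * ((i + 0.5) * h) * (1 - (i + 0.5) * h) * h
    for i in range(n)
)
print(round(e_x, 6))  # ≈ 0.5, matching the exact answer
```

Here the symmetry of f(x) = 6x(1 − x) about x = 1/2 already predicts the mean, so the numerics confirm both the algebra and the intuition.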
---
5. Variance
The variance measures the spread or dispersion of a random variable's values around its expected value. A higher variance indicates greater variability.
📖 Variance
The variance of a random variable X, denoted by Var(X) or σ_X², is defined as:
Var(X) = E[(X − E[X])²]
An equivalent and often more convenient formula is:
Var(X) = E[X²] − (E[X])²
The standard deviation, σ_X, is the positive square root of the variance: σ_X = √Var(X).
Properties of variance: Var(c) = 0 for a constant c; Var(aX + b) = a²Var(X) for constants a, b; and for independent X and Y, Var(X + Y) = Var(X) + Var(Y).
When to use: To find the variance of linear transformations or sums of independent random variables.
⚠️ Common Mistake
❌ Assuming Var(X + Y) = Var(X) + Var(Y) for any random variables X, Y.
✅ The property Var(X + Y) = Var(X) + Var(Y) holds only if X and Y are independent. If they are not, the covariance term must be included: Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y).
---
7. Central Limit Theorem (CLT) and Normal Approximation
The Central Limit Theorem (CLT) is one of the most powerful theorems in statistics. It explains why many natural phenomena follow a normal distribution, even if the individual components contributing to them do not.
📖 Central Limit Theorem (CLT)
Let X₁, X₂, …, X_n be a sequence of independent and identically distributed (i.i.d.) random variables, each with finite mean E[X_i] = μ and finite variance Var(X_i) = σ². As n approaches infinity, the distribution of the sample mean X̄_n = (1/n) Σ_{i=1}^{n} X_i approaches a normal distribution with mean μ and variance σ²/n. That is, for large n:
X̄_n ~ N(μ, σ²/n)
Equivalently, the distribution of the sum S_n = Σ_{i=1}^{n} X_i approaches a normal distribution with mean nμ and variance nσ²:
S_n ~ N(nμ, nσ²)
The standardized random variable Z = (X̄_n − μ)/(σ/√n) (or Z = (S_n − nμ)/(σ√n)) approaches a standard normal distribution N(0, 1) as n → ∞.
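The theorem is easy to see empirically. A small Python simulation (illustrative; the sample size and trial count are arbitrary choices) averages i.i.d. Uniform(0,1) draws, which have μ = 0.5 and σ² = 1/12, and checks that the sample means concentrate as the CLT predicts:

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

# Averages of n i.i.d. Uniform(0,1) draws: mu = 0.5, sigma^2 = 1/12
n, trials = 50, 20_000
means = [
    statistics.fmean(random.random() for _ in range(n))
    for _ in range(trials)
]

# CLT prediction: mean ~ mu, std dev ~ sigma / sqrt(n) = sqrt(1/(12 n))
print(statistics.fmean(means))   # ≈ 0.5
print(statistics.stdev(means))   # ≈ sqrt(1/(12 * 50)) ≈ 0.0408
```

A histogram of `means` would look bell-shaped even though each underlying draw is uniform, which is exactly the content of the theorem.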
💡 Exam Shortcut
For problems involving sums or averages of a large number of i.i.d. random variables, immediately think of applying the Central Limit Theorem to approximate the distribution as normal. This allows you to use Z-scores and standard normal tables for probability calculations.
Worked Example:
Problem: The time taken (in minutes) for a data scientist to complete a specific task is a random variable with mean 15 minutes and standard deviation 4 minutes. If a data scientist completes 100 such tasks independently, what is the approximate probability that the total time taken for these 100 tasks is less than 1450 minutes?
Solution:
Step 1: Identify the given parameters for a single task X_i.
Mean E[X_i] = μ = 15 minutes. Standard deviation σ = 4 minutes. Number of tasks n = 100.
Step 2: Define the total time S₁₀₀ and apply the CLT.
The total time for 100 tasks is S₁₀₀ = Σ_{i=1}^{100} X_i. By the Central Limit Theorem, for large n, S_n is approximately normally distributed.
Mean of S₁₀₀: E[S₁₀₀] = nμ = 100 × 15 = 1500.
Variance of S₁₀₀: Var(S₁₀₀) = nσ² = 100 × 4² = 1600.
Standard deviation of S₁₀₀: σ_{S₁₀₀} = √1600 = 40.
So S₁₀₀ ~ N(1500, 1600) approximately.
Step 3: Standardize the random variable to use the Z-score.
We want P(S₁₀₀ < 1450).
Z = (S₁₀₀ − E[S₁₀₀]) / σ_{S₁₀₀} = (1450 − 1500) / 40 = −50/40 = −1.25
Step 4: Look up the probability using the standard normal CDF (or Z-table).
P(S₁₀₀ < 1450) ≈ P(Z < −1.25)
Using a standard normal table or calculator, P(Z < −1.25) ≈ 0.1056.
Answer: The approximate probability that the total time taken is less than 1450 minutes is 0.1056.
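In place of a printed Z-table, Python's standard library can evaluate the normal CDF directly. A short sketch of the worked example using `statistics.NormalDist`:

```python
from statistics import NormalDist

mu, sigma, n = 15, 4, 100       # per-task mean, std dev, number of tasks
total_mean = n * mu             # E[S_100] = 1500
total_sd = sigma * n ** 0.5     # sigma_{S_100} = 40

# Normal approximation from the CLT: P(S_100 < 1450)
p = NormalDist(total_mean, total_sd).cdf(1450)
print(round(p, 4))  # 0.1056
```

Note that `NormalDist` is parameterized by the standard deviation (40), not the variance (1600), a distinction that mirrors the Z-score formula itself.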
---
8. Standardization (Z-score)
Standardization transforms a random variable into a standard score (Z-score), which represents how many standard deviations an observation is from the mean. This is particularly useful for comparing values from different normal distributions or for using standard normal tables.
📖 Z-score
For a random variable X with mean μ and standard deviation σ:
Z = (X − μ) / σ
Variables:
Z = Standardized score (Z-score)
X = Value of the random variable
μ = mean of X
σ = standard deviation of X
When to use: To transform any normally distributed variable into a standard normal variable N(0,1), allowing for the use of standard normal tables to find probabilities. Also used in conjunction with the CLT.
---
Problem-Solving Strategies
💡 CMI Strategy
Identify the random variable type: First, determine whether the random variable is discrete or continuous. This dictates whether to use PMF/summation or PDF/integration.
Check PDF/PMF properties: For questions involving determining constants or verifying a function, always use Σ p_X(x) = 1 (discrete) or ∫ f(x) dx = 1 (continuous). Remember f(x) ≥ 0.
Probability from CDF/PDF: P(a < X ≤ b) = F(b) − F(a) for the CDF; for the PDF, it is ∫_a^b f(x) dx.
Expectation and variance: Remember the "shortcut" formula for variance: Var(X) = E[X²] − (E[X])².
CLT application: When dealing with sums or averages of a large number of i.i.d. random variables, the Central Limit Theorem is your go-to. This implies a normal approximation, and thus Z-scores.
Read carefully: Pay attention to "total number," "average number," "more than," "less than," "at least," etc., to set up the correct integral or sum limits and inequalities.
---
Common Mistakes
⚠️ Avoid These Errors
❌ Confusing PMF and PDF: Using integration for a discrete random variable or summation for a continuous one.
✅ Correct: PMF for discrete (summation), PDF for continuous (integration).
❌ Incorrect PDF properties: Forgetting to check f(x) ≥ 0 or not normalizing the integral to 1.
✅ Correct: Always ensure f(x) ≥ 0 and ∫ f(x) dx = 1.
❌ Probability of a single point for a continuous RV: Assuming P(X = x₀) is non-zero for a continuous random variable.
✅ Correct: For continuous RVs, P(X = x₀) = 0. Probabilities are over intervals.
❌ Ignoring independence for variance sums: Applying Var(X + Y) = Var(X) + Var(Y) when X and Y are not independent.
✅ Correct: This property requires independence. If not independent, use Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y).
❌ Misapplying the CLT: Using the CLT for small sample sizes or for variables that are not i.i.d.
✅ Correct: The CLT is for large n (typically n ≥ 30) and i.i.d. random variables.
❌ Calculation errors in integration/summation: Simple algebraic or calculus mistakes when evaluating integrals or sums.
✅ Correct: Double-check calculations, especially definite integrals and series summations.
---
Practice Questions
:::question type="MCQ" question="Let X be a continuous random variable with the probability density function given by:
f(x) = k e^(−2x) for x > 0, and f(x) = 0 otherwise.
What is the value of k that makes f(x) a valid PDF?" options=["1/2","1","2","e"] answer="2" hint="Use the property that the integral of a PDF over its entire range must equal 1." solution="Step 1: Apply the normalization condition for a PDF.
∫_{−∞}^{∞} f(x) dx = 1
Step 2: Substitute the given PDF into the integral.
∫_0^∞ k e^(−2x) dx = 1
Step 3: Evaluate the integral.
k [−(1/2) e^(−2x)]_0^∞ = 1
k ( lim_{b→∞} (−(1/2) e^(−2b)) − (−(1/2) e^0) ) = 1
k (0 − (−1/2)) = 1
k (1/2) = 1
Step 4: Solve for k.
k = 2
Answer: \boxed{2}" :::
:::question type="NAT" question="A discrete random variable Y has the following Probability Mass Function:
P(Y = y) = c/(y+1), for y = 0, 1, 2
What is the value of c (rounded to two decimal places)?" answer="0.55" hint="The sum of all probabilities in a PMF must equal 1." solution="Step 1: Apply the normalization condition for a PMF.
Σ_{y=0}^{2} P(Y = y) = 1
Step 2: Sum the probabilities for each possible value of Y.
P(Y = 0) = c/(0+1) = c
P(Y = 1) = c/(1+1) = c/2
P(Y = 2) = c/(2+1) = c/3
Step 3: Set the sum equal to 1 and solve for c.
c + c/2 + c/3 = 1
Find a common denominator (6).
6c/6 + 3c/6 + 2c/6 = 1
11c/6 = 1
c = 6/11
Step 4: Round to two decimal places.
c ≈ 0.5454... ≈ 0.55
Answer: \boxed{0.55}" :::
:::question type="MSQ" question="Let X be a continuous random variable with CDF given by F(x) = 0 for x < 0, F(x) = x² for 0 ≤ x < 1, and F(x) = 1 for x ≥ 1.
Which of the following statements is/are true?" options=["The PDF of X is f(x) = 2x for 0 ≤ x < 1.","P(X ≤ 0.5) = 0.25.","P(0.2 < X < 0.8) = 0.6.","E[X] = 2/3."] answer="A,B,C,D" hint="Remember that f(x) = dF(x)/dx for continuous random variables. Use the CDF to find probabilities. Calculate E[X] = ∫ x f(x) dx." solution="Statement A: The PDF of X is f(x) = 2x for 0 ≤ x < 1. To find the PDF from the CDF, differentiate the CDF:
f(x) = dF(x)/dx = d(x²)/dx = 2x
This is valid for 0 ≤ x < 1. For x < 0 and x ≥ 1, f(x) = 0. So, statement A is true.
Statement B: P(X ≤ 0.5) = 0.25. Using the CDF definition:
P(X ≤ 0.5) = F(0.5)
Since 0 ≤ 0.5 < 1, we use F(x) = x²:
F(0.5) = (0.5)² = 0.25
So, statement B is true.
Statement C: P(0.2 < X < 0.8) = 0.6. Using the CDF property P(a < X < b) = F(b) − F(a):
P(0.2 < X < 0.8) = F(0.8) − F(0.2) = (0.8)² − (0.2)² = 0.64 − 0.04 = 0.60
So, statement C is true.
Statement D: E[X] = 2/3. Using the PDF f(x) = 2x for 0 ≤ x < 1:
E[X] = ∫_0^1 x(2x) dx = ∫_0^1 2x² dx = [2x³/3]_0^1 = 2/3
So, statement D is true.
All options A, B, C, D are true. Answer: \boxed{A,B,C,D}" :::
:::question type="SUB" question="A manufacturing process produces items whose weights are independent random variables with a mean of 10 kg and a standard deviation of 2 kg. A sample of 64 items is taken from the production line. (a) What is the probability that the average weight of the items in the sample is less than 9.5 kg? (b) What is the total expected weight of the 64 items?" answer="0.0228, 640 kg" hint="For part (a), use the Central Limit Theorem to approximate the distribution of the sample mean. For part (b), use the linearity of expectation for a sum of random variables." solution="(a) Probability that the average weight is less than 9.5 kg:
Step 1: Identify parameters for a single item X_i. Mean E[X_i] = μ = 10 kg. Standard deviation σ = 2 kg. Sample size n = 64.
Step 2: Apply the Central Limit Theorem to the sample mean X̄_n. For large n, X̄_n is approximately normally distributed with mean E[X̄_n] = μ = 10 kg and standard error σ_X̄ = σ/√n = 2/√64 = 2/8 = 0.25 kg. So X̄_64 ~ N(10, (0.25)²) approximately.
Step 3: Standardize the value 9.5 kg. We want P(X̄_64 < 9.5).
Z = (X̄_64 − E[X̄_64]) / σ_X̄ = (9.5 − 10)/0.25 = −0.5/0.25 = −2
Step 4: Look up the probability using the standard normal CDF.
P(X̄_64 < 9.5) ≈ P(Z < −2) ≈ 0.0228 (from a standard normal table or calculator).
(b) Total expected weight of the 64 items:
Step 1: Define the total weight S_64 = Σ_{i=1}^{64} X_i.
Step 2: Apply linearity of expectation: E[S_64] = Σ_{i=1}^{64} E[X_i] = 64 × 10 = 640 kg.
Answer: \boxed{0.0228, 640 kg}" :::
---
Summary
✅ Key Takeaways for CMI
PMF vs. PDF: Discrete random variables use Probability Mass Functions (PMFs), which sum to 1. Continuous random variables use Probability Density Functions (PDFs), which integrate to 1.
CDF for All: The Cumulative Distribution Function (CDF) F(x) = P(X ≤ x) is defined for both discrete and continuous variables, is non-decreasing, and ranges from 0 to 1. P(a < X ≤ b) = F(b) − F(a).
Expected Value & Variance: These measure central tendency and spread. Remember E[aX + b] = aE[X] + b and Var(aX + b) = a²Var(X). For independent variables, Var(X+Y) = Var(X) + Var(Y).
Central Limit Theorem (CLT): For a large number of i.i.d. random variables, their sum or average is approximately normally distributed. This is critical for inferential statistics and often tested in CMI.
Standardization (Z-score): Use Z = (X − μ)/σ to convert any normal random variable to a standard normal N(0,1), which allows probability look-ups in Z-tables.
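The CLT statement can be checked numerically; a minimal sketch using only the standard library (sample size and trial count are illustrative, seeded for reproducibility):

```python
import random
import statistics

# Sample means of n i.i.d. Uniform(0, 1) draws; the CLT says they are roughly
# N(mu, sigma^2 / n) with mu = 0.5 and sigma^2 = 1/12.
random.seed(0)
n, trials = 64, 5000
means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(trials)]

print(round(statistics.fmean(means), 3))   # near 0.5
print(round(statistics.stdev(means), 3))   # near sqrt((1/12)/64), about 0.036
```

Plotting a histogram of `means` would show the familiar bell shape even though each underlying draw is uniform, not normal.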
---
What's Next?
💡 Continue Learning
This topic connects to:
Specific Probability Distributions: Understanding distribution functions is the foundation for studying named distributions like Bernoulli, Binomial, Poisson, Exponential, Uniform, and Normal distributions. Each has its own PMF/PDF and CDF.
Joint Distributions: Extending these concepts to multiple random variables, understanding their joint behavior, and concepts like covariance and correlation.
Statistical Inference: The Central Limit Theorem forms the bedrock for hypothesis testing and confidence intervals, allowing us to make inferences about population parameters from sample data.
Master these connections for comprehensive CMI preparation!
---
💡 Moving Forward
Now that you understand Distribution Functions, let's explore Expectation and Variance which builds on these concepts.
---
Part 3: Expectation and Variance
Introduction
Expectation and variance are fundamental concepts in probability theory, providing concise summaries of the central tendency and spread of a random variable's distribution. The expectation, or expected value, quantifies the "average" outcome of a random variable over a large number of trials. It represents the weighted average of all possible values a random variable can take, with weights given by their respective probabilities. The variance, on the other hand, measures the dispersion or spread of the random variable's values around its expected value. A low variance indicates that values tend to be close to the mean, while a high variance suggests that values are spread out over a wider range.
In the CMI examination, a deep understanding of expectation and variance is crucial. These concepts are extensively tested, often through complex scenarios involving multiple random variables, indicator functions, and various probability distributions. Mastery of linearity of expectation and the properties of variance is essential for efficiently solving problems that might otherwise appear intractable.
📘 Random Variable
A random variable is a function that maps the outcomes of a random experiment to real numbers. It can be discrete (taking on a finite or countably infinite number of values) or continuous (taking on any value within a given interval).
---
Key Concepts
## 1. Expectation of a Random Variable
The expectation, also known as the expected value or mean, of a random variable X is denoted by E[X] or ΞΌ. It represents the long-run average value of the variable.
### 1.1. Discrete Random Variables
For a discrete random variable X with probability mass function (PMF) P(X=x), the expectation is calculated by summing the products of each possible value of X and its corresponding probability.
📘 Expectation of a Discrete Random Variable
E[X] = Σ_x x P(X = x)
Variables:
X = discrete random variable
x = a possible value of X
P(X=x) = probability mass function (PMF) at x
When to use: To find the average value of a discrete random variable.
Worked Example:
Problem: A fair six-sided die is rolled. Let X be the number rolled. Calculate E[X].
Solution:
Step 1: Identify the possible values of X and their probabilities. The possible values are 1, 2, 3, 4, 5, 6. Since the die is fair, each outcome has a probability of 1/6.
P(X = x) = 1/6 for x ∈ {1, 2, 3, 4, 5, 6}
Step 2: Apply the formula for the expectation of a discrete random variable.
E[X] = Σ_x x P(X = x) = (1 + 2 + 3 + 4 + 5 + 6) × (1/6) = 21/6 = 3.5
Answer: \boxed{3.5}
---
### 1.2. Continuous Random Variables
For a continuous random variable X with probability density function (PDF) f(x), the expectation is calculated by integrating the product of x and its PDF over the entire range of possible values.
📘 Expectation of a Continuous Random Variable
E[X] = ∫_{−∞}^{∞} x f(x) dx
Variables:
X = continuous random variable
f(x) = probability density function (PDF) of X
When to use: To find the average value of a continuous random variable.
Worked Example:
Problem: Let X be a continuous random variable with PDF f(x) = 2x for 0 ≤ x ≤ 1, and f(x) = 0 otherwise. Calculate E[X].
Solution:
Step 1: Identify the PDF and its range.
f(x) = 2x for 0 ≤ x ≤ 1
Step 2: Apply the formula for the expectation of a continuous random variable.
E[X] = ∫_{−∞}^{∞} x f(x) dx
Since f(x) is non-zero only for 0 ≤ x ≤ 1, the integral limits change.
E[X] = ∫_0^1 x(2x) dx
Step 3: Evaluate the integral.
E[X] = ∫_0^1 2x² dx = [2x³/3]_0^1 = 2(1)³/3 − 2(0)³/3 = 2/3
Answer: \boxed{\frac{2}{3}}
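The integral above can be sanity-checked numerically; a minimal sketch using a midpoint Riemann sum (the step count is an arbitrary choice):

```python
def expectation(pdf, a, b, n=100_000):
    """Approximate E[X] = integral of x * f(x) over [a, b] with a midpoint Riemann sum."""
    h = (b - a) / n
    return sum((a + (i + 0.5) * h) * pdf(a + (i + 0.5) * h) * h for i in range(n))

pdf = lambda x: 2 * x   # the PDF from the worked example above
print(round(expectation(pdf, 0.0, 1.0), 4))   # ~ 0.6667, i.e. 2/3
```

The same helper works for any PDF supported on a finite interval, e.g. the uniform density later in this chapter.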
---
### 1.3. Properties of Expectation
Expectation has several important properties that simplify calculations, especially when dealing with sums or transformations of random variables.
📘 Properties of Expectation
Expectation of a constant: E[c] = c
Scalar multiplication: E[aX] = aE[X]
Addition of a constant: E[X + b] = E[X] + b
Linearity of Expectation: For any random variables X_1, X_2, …, X_n (whether independent or dependent) and constants a_1, a_2, …, a_n:
E[Σ_{i=1}^{n} a_i X_i] = Σ_{i=1}^{n} a_i E[X_i]
A special case is E[X + Y] = E[X] + E[Y].
Expectation of a function of a random variable:
For discrete X: E[g(X)] = Σ_x g(x) P(X = x). For continuous X: E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx.
Worked Example (Linearity of Expectation):
Problem: A box contains 10 balls: 3 red and 7 blue. Two balls are drawn without replacement. Let X be the number of red balls drawn. Find E[X].
Solution:
Step 1: Define indicator random variables. Let X_1 indicate that the first ball drawn is red and X_2 that the second ball drawn is red, so X = X_1 + X_2.
Step 2: Compute E[X_i]. By symmetry, each draw is equally likely to be any of the 10 balls, so E[X_1] = E[X_2] = P(red) = 3/10, even though X_1 and X_2 are dependent.
Step 3: Apply linearity of expectation.
E[X] = E[X_1] + E[X_2] = 3/10 + 3/10 = 3/5 = 0.6
Answer: \boxed{0.6}
---
## 2. Variance of a Random Variable
The variance of a random variable X, denoted V(X), Var(X), or σ², measures the spread or dispersion of its values around the mean. It is the expected value of the squared deviation from the mean.
📘 Variance of a Random Variable
V(X) = E[(X − E[X])²]
Alternative (Computational) Formula:
V(X) = E[X²] − (E[X])²
Variables:
X = random variable
E[X] = expected value of X
E[X²] = expected value of X²
When to use: To quantify the spread of a random variable's distribution. The alternative formula is often easier for calculation.
Derivation of V(X) = E[X²] − (E[X])²:
Step 1: Start with the definition of variance.
V(X) = E[(X − E[X])²]
Step 2: Expand the squared term inside the expectation. Let μ = E[X] for simplicity.
V(X) = E[(X − μ)²] = E[X² − 2μX + μ²]
Step 3: Apply linearity of expectation.
V(X) = E[X²] − E[2μX] + E[μ²]
Step 4: Use the properties E[aX] = aE[X] and E[c] = c.
V(X) = E[X²] − 2μE[X] + μ²
Step 5: Substitute μ = E[X] back into the expression.
V(X) = E[X²] − 2(E[X])² + (E[X])²
Step 6: Simplify.
V(X) = E[X²] − (E[X])²
---
Worked Example:
Problem: A fair six-sided die is rolled. Let X be the number rolled. Calculate V(X).
Solution:
Step 1: Recall E[X] from the previous example.
E[X] = 3.5
Step 2: Calculate E[X²] using the formula E[g(X)] = Σ_x g(x) P(X = x) with g(X) = X².
E[X²] = (1² + 2² + 3² + 4² + 5² + 6²) × (1/6) = (1 + 4 + 9 + 16 + 25 + 36)/6 = 91/6
Step 3: Apply the computational formula for variance.
V(X) = E[X²] − (E[X])² = 91/6 − (3.5)² = 91/6 − (7/2)² = 91/6 − 49/4
Step 4: Find a common denominator and simplify.
V(X) = 182/12 − 147/12 = 35/12
Answer: 35/12 ≈ 2.92
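The die calculation can be reproduced exactly with rational arithmetic; a minimal sketch:

```python
from fractions import Fraction

def moments(pmf):
    """(E[X], Var(X)) for a discrete PMF {value: probability}, via Var = E[X^2] - (E[X])^2."""
    ex = sum(x * p for x, p in pmf.items())
    ex2 = sum(x * x * p for x, p in pmf.items())
    return ex, ex2 - ex * ex

die = {x: Fraction(1, 6) for x in range(1, 7)}
print(moments(die))   # (Fraction(7, 2), Fraction(35, 12))
```

Using `Fraction` instead of floats keeps the answers in the same exact form (7/2 and 35/12) as the hand calculation.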
---
### 2.1. Properties of Variance
Variance also has several key properties.
📘 Properties of Variance
Non-negativity: V(X) ≥ 0
Variance of a constant: V(c) = 0
Scalar multiplication and addition of a constant:
V(aX + b) = a²V(X)
Variance of a sum of independent random variables: If X_1, X_2, …, X_n are independent random variables, then:
V(Σ_{i=1}^{n} X_i) = Σ_{i=1}^{n} V(X_i)
A special case is V(X+Y) = V(X) + V(Y) if X and Y are independent.
Variance of a sum of dependent random variables: If X and Y are dependent:
V(X+Y) = V(X) + V(Y) + 2Cov(X,Y)
where Cov(X,Y) = E[(X − E[X])(Y − E[Y])] is the covariance between X and Y.
📌 Independence for Variance
Unlike expectation, which is always linear (E[X+Y] = E[X] + E[Y] regardless of independence), the variance of a sum equals the sum of variances only if the random variables are independent. If they are dependent, the covariance term must be included.
---
## 3. Standard Deviation
The standard deviation is the square root of the variance and is denoted by Ο. It has the same units as the random variable itself, making it more interpretable than variance in many contexts.
📘 Standard Deviation
σ_X = √V(X)
Variables:
σ_X = standard deviation of X
V(X) = variance of X
When to use: To express the spread of data in the original units of the random variable.
---
## 4. Indicator Random Variables
An indicator random variable is a special type of discrete random variable that takes the value 1 if a particular event occurs and 0 otherwise. Indicators are incredibly powerful when combined with the linearity of expectation, especially in counting problems.
📘 Indicator Random Variable
For an event A, the indicator random variable I_A is defined as:
I_A = 1 if A occurs, and I_A = 0 otherwise.
Key property: E[I_A] = 1 · P(A) + 0 · P(not A) = P(A).
Worked Example (Using Indicator Variables for Expectation):
Problem: In a group of n people, what is the expected number of pairs of people who share the same birthday (ignoring leap years)?
Solution:
Step 1: Define indicator variables for each possible pair of people. There are N = C(n, 2) pairs in total. Let I_ij indicate that people i and j share a birthday, for 1 ≤ i < j ≤ n.
Step 2: Express the total number of shared-birthday pairs X as a sum of indicator variables.
X = Σ_{1≤i<j≤n} I_ij
Step 3: Calculate the expectation of a single indicator variable. Assuming each of the 365 days is equally likely for a birthday, the probability that two specific people share a birthday is P(I_ij = 1) = 1/365, so
E[I_ij] = P(I_ij = 1) = 1/365
Step 4: Apply linearity of expectation.
E[X] = Σ_{1≤i<j≤n} E[I_ij] = C(n, 2) · (1/365) = (n(n−1)/2) · (1/365)
Answer: n(n−1)/730
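A simulation sketch to check E[X] = n(n−1)/730 (the group size and trial count are illustrative; seeded for reproducibility):

```python
import random
from collections import Counter

def shared_pairs(n, rng):
    """Number of pairs sharing a birthday among n people with uniform birthdays."""
    counts = Counter(rng.randrange(365) for _ in range(n))
    # A day hit by c people contributes C(c, 2) = c(c-1)/2 shared pairs.
    return sum(c * (c - 1) // 2 for c in counts.values())

rng = random.Random(1)
n, trials = 30, 20_000
avg = sum(shared_pairs(n, rng) for _ in range(trials)) / trials
print(round(avg, 3), round(n * (n - 1) / 730, 3))   # both near 1.19
```

Note that the simulated average matches the formula even though the pair indicators I_ij are dependent, which is exactly the point of linearity of expectation.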
---
## 5. Chebyshev's Inequality
Chebyshev's Inequality provides a bound on the probability that a random variable deviates from its mean by a certain amount. It is a powerful tool because it applies to any probability distribution for which the mean and variance exist, without requiring knowledge of the specific distribution shape.
📘 Chebyshev's Inequality
For any random variable X with finite mean E[X] and finite variance V(X), and for any real number k > 0:
P(|X − E[X]| ≥ k) ≤ V(X)/k²
Alternative form: Let k = cσ, where σ = √V(X) is the standard deviation and c > 0.
P(|X − E[X]| ≥ cσ) ≤ 1/c²
Variables:
X = random variable
E[X] = mean of X
V(X) = variance of X
k = a positive constant representing the deviation from the mean
When to use: To provide a general upper bound on the probability of extreme deviations from the mean when the exact distribution is unknown or complex.
Worked Example:
Problem: The average height of students in a university is 170 cm with a standard deviation of 5 cm. What is the minimum percentage of students whose height is between 160 cm and 180 cm?
Solution:
Step 1: Identify the given values. E[X] = 170 cm, σ_X = 5 cm. We want P(160 ≤ X ≤ 180).
Step 2: Rephrase the probability in terms of deviation from the mean. The interval [160, 180] is 170 ± 10, so we are interested in P(|X − 170| ≤ 10).
Step 3: Apply Chebyshev's Inequality to the complementary event, with k = 10.
P(|X − 170| ≥ 10) ≤ V(X)/10²
First, calculate V(X) = σ_X² = 5² = 25.
P(|X − 170| ≥ 10) ≤ 25/100 = 1/4
Step 4: Find the probability for the desired interval.
P(160 ≤ X ≤ 180) = 1 − P(|X − 170| ≥ 10) ≥ 1 − 1/4 = 3/4
Step 5: Convert to percentage.
3/4 = 0.75 = 75%
Answer: At least 75% of students have heights between 160 cm and 180 cm.
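The calculation generalizes to a small helper (a sketch; `k` is the half-width of the interval around the mean):

```python
def chebyshev_lower_bound(sigma, k):
    """Minimum probability mass in [mean - k, mean + k], by Chebyshev's inequality."""
    if k <= 0:
        raise ValueError("k must be positive")
    # 1 - V(X)/k^2, clamped at 0 since the bound is vacuous when k <= sigma.
    return max(0.0, 1.0 - (sigma / k) ** 2)

# Heights example: sigma = 5 cm, interval 170 +/- 10 cm
print(chebyshev_lower_bound(5, 10))   # 0.75
```

Because the bound is distribution-free, the true probability can be much higher (for a normal distribution it would be about 0.95 at two standard deviations).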
---
Problem-Solving Strategies
💡 CMI Strategy: Linearity of Expectation with Indicators
Many CMI problems involving counting the expected number of "events" (e.g., matching items, shared birthdays, special points) are most efficiently solved using linearity of expectation with indicator random variables.
Define the overall random variable X as the quantity whose expectation you need.
Decompose X into a sum of simpler random variables X_i. Often, these X_i will be indicator variables. For example, if X is the number of items with property A, define X_i = 1 if item i has property A, and 0 otherwise.
Calculate E[X_i] for each individual X_i. For an indicator variable I_A, this is simply P(A).
Apply linearity of expectation:
E[X] = E[Σ X_i] = Σ E[X_i]
This works even if the X_i are dependent, which is a major advantage.
💡 CMI Strategy: Using V(X) = E[X²] − (E[X])²
This formula is almost always easier for calculating variance than the definition E[(X − E[X])²], especially for complex distributions or when E[X] is not an integer.
First, calculate E[X].
Then, calculate E[X²]. Remember E[X²] is not (E[X])². For discrete variables, it's Σ x² P(X = x); for continuous, ∫ x² f(x) dx.
Finally, subtract (E[X])² from E[X²].
💡 CMI Strategy: Handling Conditional Information
When a problem provides conditional probabilities or usage statistics (as in server-outage problems), use the Law of Total Probability to find unconditional probabilities, which can then be used in expectation calculations. For example, if you need the overall probability of an event A that depends on conditions B_i:
P(A) = Σ P(A|B_i) P(B_i)
Then, if X is a value associated with A, E[X] might involve these combined probabilities.
---
Common Mistakes
⚠️ Avoid These Errors
❌ Assuming independence for variance of a sum: Students often mistakenly write V(X+Y) = V(X) + V(Y) even when X and Y are dependent.
✅ Correct approach: Remember that
V(X+Y) = V(X) + V(Y) + 2Cov(X,Y)
If X and Y are independent, Cov(X,Y) = 0, so V(X+Y) = V(X) + V(Y). Always check for independence.
❌ Confusing E[X²] with (E[X])²: These are generally not equal (E[X²] ≥ (E[X])² always).
✅ Correct approach: E[X²] is the expectation of the squared random variable; (E[X])² is the square of the expected value. Calculate them separately.
❌ Incorrectly applying linearity of expectation to products: E[XY] ≠ E[X]E[Y] unless X and Y are independent.
✅ Correct approach: Linearity applies to sums. For products, use
E[XY] = E[X]E[Y] + Cov(X,Y)
If independent, then E[XY] = E[X]E[Y].
❌ Misinterpreting probability in indicator variable problems: For E[I_A], the probability P(A) must be calculated correctly, considering all conditions of the event A.
✅ Correct approach: Carefully define the event A for each indicator variable and calculate its probability precisely. This often involves basic combinatorial probability.
---
Practice Questions
:::question type="NAT" question="A company manufactures light bulbs. The lifespan of a bulb, X, in years, has probability density function f(x) = λe^(−λx) for x ≥ 0, where λ = 0.5. What is the expected lifespan of a bulb in years?" answer="2" hint="Recall the formula for the expectation of a continuous random variable and the properties of the exponential distribution." solution="Step 1: Identify the PDF and its parameter. The PDF is f(x) = 0.5e^(−0.5x) for x ≥ 0. This is an exponential distribution with rate parameter λ = 0.5.
Step 2: Apply the formula for the expectation of a continuous random variable.
E[X] = ∫_0^∞ x f(x) dx = ∫_0^∞ x (0.5e^(−0.5x)) dx
This is the mean of an exponential distribution, which is 1/λ.
Step 3: Calculate the expectation.
E[X] = 1/λ = 1/0.5 = 2
The expected lifespan is 2 years. Answer: \boxed{2}" :::
:::question type="MCQ" question="Let X be a discrete random variable with P(X=1)=0.2, P(X=2)=0.3, and P(X=3)=0.5. Which of the following statements about V(X) is correct?" options=["V(X)=0.61","V(X)=0.76","V(X)=1.69","V(X)=2.3"] answer="V(X)=0.61" hint="First calculate E[X] and E[X²], then use V(X) = E[X²] − (E[X])²." solution="Step 1: Calculate E[X].
E[X] = Σ x P(X=x) = (1)(0.2) + (2)(0.3) + (3)(0.5) = 0.2 + 0.6 + 1.5 = 2.3
Step 2: Calculate E[X²].
E[X²] = Σ x² P(X=x) = (1²)(0.2) + (2²)(0.3) + (3²)(0.5) = 0.2 + 1.2 + 4.5 = 5.9
Step 3: Calculate V(X).
V(X) = E[X²] − (E[X])² = 5.9 − (2.3)² = 5.9 − 5.29 = 0.61
Answer: \boxed{0.61}" :::
:::question type="SUB" question="A bag contains 5 red balls and 5 blue balls. Three balls are drawn without replacement. Let Y be the number of blue balls drawn. Calculate E[Y] using indicator random variables." answer="E[Y]=1.5" hint="Define an indicator variable for each draw, then use linearity of expectation." solution="Step 1: Define indicator random variables. Let Y_1, Y_2, Y_3 indicate that the first, second, and third ball drawn is blue, respectively, so Y = Y_1 + Y_2 + Y_3.
Step 2: Compute E[Y_i]. By symmetry, each draw is equally likely to be any of the 10 balls, so P(ball i is blue) = 5/10 = 1/2 and E[Y_i] = 1/2 for each i, even though the draws are dependent.
Step 3: Apply linearity of expectation.
E[Y] = E[Y_1] + E[Y_2] + E[Y_3] = 3 × (1/2) = 1.5
Answer: \boxed{1.5}" :::
:::question type="MSQ" question="Let X be a random variable with E[X]=5 and V(X)=4. Which of the following statements are correct?" options=["E[2X+3]=13","V(2X+3)=16","E[X²]=29","P(|X−5| ≥ 4) ≤ 1/4"] answer="A,B,C,D" hint="Apply the properties of expectation and variance, and Chebyshev's inequality." solution="Let's evaluate each option:
Option A: E[2X+3] = 13. Using linearity of expectation:
E[2X+3] = 2E[X] + 3 = 2(5) + 3 = 13
This statement is correct.
Option B: V(2X+3) = 16. Using V(aX+b) = a²V(X):
V(2X+3) = 2²V(X) = 4(4) = 16
This statement is correct.
Option C: E[X²] = 29. Using the computational formula V(X) = E[X²] − (E[X])²:
4 = E[X²] − (5)² = E[X²] − 25, so E[X²] = 29
This statement is correct.
Option D: P(|X−5| ≥ 4) ≤ 1/4. Using Chebyshev's Inequality with E[X] = 5, V(X) = 4, and k = 4:
P(|X − E[X]| ≥ k) ≤ V(X)/k²
P(|X−5| ≥ 4) ≤ 4/4² = 4/16 = 1/4
This statement is correct.
All statements are correct. Answer: \boxed{A,B,C,D}" :::
:::question type="NAT" question="A discrete random variable Y has P(Y=0)=0.4, P(Y=1)=0.3, and P(Y=2)=0.3. What is the standard deviation of Y (rounded to two decimal places)?" answer="0.83" hint="Calculate E[Y] and E[Y²], then V(Y), and finally σ_Y = √V(Y)." solution="Step 1: Calculate E[Y].
E[Y] = Σ y P(Y=y) = (0)(0.4) + (1)(0.3) + (2)(0.3) = 0 + 0.3 + 0.6 = 0.9
Step 2: Calculate E[Y²].
E[Y²] = Σ y² P(Y=y) = (0²)(0.4) + (1²)(0.3) + (2²)(0.3) = 0 + 0.3 + 1.2 = 1.5
Step 3: Calculate V(Y).
V(Y) = E[Y²] − (E[Y])² = 1.5 − (0.9)² = 1.5 − 0.81 = 0.69
Step 4: Calculate the standard deviation.
σ_Y = √V(Y) = √0.69 ≈ 0.83066...
Rounding to two decimal places, σ_Y = 0.83. Answer: \boxed{0.83}" :::
---
Summary
✅ Key Takeaways for CMI
Expectation (E[X]): Represents the long-run average. For discrete X,
E[X] = Σ x P(X = x)
For continuous X,
E[X] = ∫ x f(x) dx
Linearity of Expectation:
E[Σ a_i X_i] = Σ a_i E[X_i]
is a powerful tool. It holds always, regardless of whether the X_i are independent or dependent. This is crucial for problems involving sums of indicator variables.
Variance (V(X)): Measures the spread around the mean. The computational formula
V(X) = E[X²] − (E[X])²
is generally preferred.
Properties of Variance:
V(aX + b) = a²V(X)
For independent random variables,
V(Σ X_i) = Σ V(X_i)
For dependent variables, covariance terms must be included.
Indicator Random Variables: I_A = 1 if event A occurs, 0 otherwise.
E[I_A] = P(A)
They simplify complex counting problems when combined with linearity of expectation.
Chebyshev's Inequality:
P(|X − E[X]| ≥ k) ≤ V(X)/k²
provides a general bound on deviations from the mean for any distribution with finite mean and variance.
---
What's Next?
💡 Continue Learning
This topic connects to:
Covariance and Correlation: Understanding dependence between random variables, which is essential for calculating the variance of sums of dependent variables:
V(X+Y) = V(X) + V(Y) + 2Cov(X,Y)
Moment Generating Functions (MGFs): MGFs are powerful tools for finding expectations and variances of random variables, especially for sums of independent random variables. They provide an alternative, often simpler, method to derive these moments.
Common Probability Distributions: Knowing the specific formulas for E[X] and V(X) for distributions like Binomial, Poisson, Geometric, Uniform, Normal, and Exponential is critical for applying these concepts in specific scenarios.
Master these connections for comprehensive CMI preparation!
---
💡 Moving Forward
Now that you understand Expectation and Variance, let's explore Standard Distributions which builds on these concepts.
---
Part 4: Standard Distributions
Introduction
Standard distributions are fundamental building blocks in probability theory and statistics, providing models for a wide array of random phenomena encountered in data science. Each distribution describes the probabilities of different outcomes for a specific type of random variable, characterized by its parameters. Understanding these distributions is crucial for modeling real-world data, performing statistical inference, and making informed decisions.
In the CMI exam, a deep understanding of standard discrete and continuous distributions is essential. This includes knowing their probability mass/density functions, cumulative distribution functions, expected values, variances, and how to apply them to calculate probabilities and estimate parameters in various scenarios. Mastery of these concepts forms the bedrock for advanced topics like hypothesis testing, regression analysis, and machine learning algorithms.
📘 Random Variable
A random variable is a function that maps outcomes from a sample space to numerical values.
A discrete random variable can take on a finite or countably infinite number of values.
A continuous random variable can take on any value within a given range or interval.
---
Key Concepts
1. Discrete Distributions
Discrete distributions model scenarios where the outcomes are countable.
1.1 Bernoulli Distribution
The Bernoulli distribution models a single trial with two possible outcomes: "success" (usually denoted by 1) or "failure" (usually denoted by 0).
📘 Bernoulli PMF
P(X = x) = p^x (1−p)^(1−x) for x ∈ {0, 1}
Variables:
X = Bernoulli random variable
p = probability of success (0 ≤ p ≤ 1)
x = outcome (0 or 1)
When to use: For a single trial with a binary outcome.
Properties:
Mean: E[X] = p
Variance: Var(X) = p(1−p)
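A minimal check of E[X] = p and Var(X) = p(1−p), computed directly from the PMF (p = 0.3 is an arbitrary illustrative value):

```python
def bernoulli_moments(p):
    """(E[X], Var(X)) for Bernoulli(p), computed straight from the PMF."""
    pmf = {0: 1 - p, 1: p}
    ex = sum(x * q for x, q in pmf.items())
    ex2 = sum(x * x * q for x, q in pmf.items())
    return ex, ex2 - ex * ex   # Var = E[X^2] - (E[X])^2

mean, var = bernoulli_moments(0.3)
print(mean, round(var, 4))   # 0.3 0.21
```

Note that Var(X) = p(1−p) is maximized at p = 0.5, i.e. a fair coin is the "most uncertain" Bernoulli trial.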
---
1.2 Binomial Distribution
The Binomial distribution models the number of successes in a fixed number of independent Bernoulli trials.
📘 Binomial PMF
P(X = k) = C(n, k) p^k (1−p)^(n−k) for k ∈ {0, 1, …, n}
Variables:
X = Binomial random variable
n = number of trials
k = number of successes
p = probability of success in a single trial
C(n, k) = n! / (k!(n−k)!) is the binomial coefficient
When to use: When counting the number of successes in a fixed number of independent trials, each with the same probability of success.
Properties:
Mean: E[X] = np
Variance: Var(X) = np(1−p)
Worked Example:
Problem: A fair coin is tossed 10 times. What is the probability of getting exactly 7 heads?
Solution:
Step 1: Identify parameters for the Binomial distribution. Here, n = 10 (number of tosses), k = 7 (number of heads), and p = 0.5 (probability of heads for a fair coin).
Step 2: Apply the Binomial PMF.
P(X = 7) = C(10, 7) (0.5)⁷ (1 − 0.5)^(10−7)
Step 3: Calculate the binomial coefficient and simplify.
C(10, 7) = 10! / (7!·3!) = 120
P(X = 7) = 120 × (0.5)⁷ × (0.5)³ = 120 × (0.5)¹⁰ = 120/1024 ≈ 0.1172
Answer: ≈ 0.1172
---
1.3 Poisson Distribution
The Poisson distribution models the number of events occurring in a fixed interval of time or space, given a constant average rate (λ) of occurrence and that these events occur independently.
📘 Poisson PMF
P(X = k) = e^(−λ) λ^k / k! for k ∈ {0, 1, 2, …}
Variables:
X = Poisson random variable
k = number of events
λ = average rate of events in the interval (λ > 0)
When to use: For counts of rare events over a specified interval or region.
Properties:
Mean: E[X] = λ
Variance: Var(X) = λ
📌 Poisson Approximation to Binomial
When the number of trials n is large (n ≥ 20) and the probability of success p is small (p ≤ 0.05), the Binomial distribution B(n, p) can be approximated by a Poisson distribution with parameter λ = np. This approximation is useful for simplifying calculations when n is large and p is small.
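The quality of this approximation can be inspected directly; a minimal sketch comparing the two PMFs (n = 100, p = 0.02 are illustrative values, giving λ = np = 2):

```python
import math

def binom_pmf(k, n, p):
    """Binomial PMF: C(n, k) p^k (1-p)^(n-k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson PMF: e^(-lam) lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Illustrative comparison: B(100, 0.02) vs Poisson(2); rows are close for each k.
n, p = 100, 0.02
for k in range(4):
    print(k, round(binom_pmf(k, n, p), 4), round(poisson_pmf(k, n * p), 4))
```

The two columns agree to roughly two decimal places here; the agreement improves as n grows and p shrinks with np held fixed.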
Worked Example:
Problem: A call center receives an average of 5 calls per hour. What is the probability of receiving exactly 3 calls in the next hour?
Solution:
Step 1: Identify the parameter for the Poisson distribution. Here, λ = 5 (average calls per hour) and k = 3 (number of calls).
Step 2: Apply the Poisson PMF.
P(X = 3) = e^(−5) 5³ / 3!
Step 3: Calculate the terms and simplify.
P(X = 3) = (0.0067379 × 125) / 6 = 0.8422375 / 6 ≈ 0.1404
Answer: ≈ 0.1404
---
2. Continuous Distributions
Continuous distributions model scenarios where the outcomes can take any value within a range.
2.1 Uniform Distribution
The Uniform distribution assigns equal probability to all values within a specified interval [a,b].
📘 Uniform PDF
f(x) = 1/(b−a) for a ≤ x ≤ b, and f(x) = 0 otherwise
Variables:
X = Uniform random variable
a = minimum value
b = maximum value
When to use: When all outcomes within an interval are equally likely.
Worked Example:
Problem: A random variable X is uniformly distributed between 0 and 10. What is the probability that X is between 3 and 7?
Solution:
Step 1: Identify parameters. Here, a = 0, b = 10. We want P(3 < X < 7).
Step 2: Use the PDF.
P(3 < X < 7) = ∫_3^7 f(x) dx = ∫_3^7 1/(10−0) dx = ∫_3^7 (1/10) dx
Step 3: Evaluate the integral.
P(3 < X < 7) = [x/10]_3^7 = 7/10 − 3/10 = 4/10 = 0.4
Answer: 0.4
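For the uniform distribution this probability is just a length ratio; a minimal helper sketch (the clamping handles query intervals that extend past [a, b]):

```python
def uniform_prob(a, b, lo, hi):
    """P(lo < X < hi) for X ~ Uniform(a, b): overlap length over interval length."""
    lo, hi = max(lo, a), min(hi, b)   # clamp the query interval to the support
    return max(0.0, hi - lo) / (b - a)

print(uniform_prob(0, 10, 3, 7))   # 0.4
```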
---
2.2 Exponential Distribution
The Exponential distribution models the time until an event occurs in a Poisson process, where events occur continuously and independently at a constant average rate. It is memoryless.
📘 Exponential PDF
f(x) = λe^(−λx) for x ≥ 0
Variables:
X = Exponential random variable (time to event)
λ = rate parameter (average number of events per unit time, λ > 0)
When to use: For modeling waiting times or lifetimes when the rate of occurrence is constant.
📘 Exponential CDF
F(x) = P(X ≤ x) = 1 − e^(−λx) for x ≥ 0
Properties:
Mean: E[X] = 1/λ
Variance: Var(X) = 1/λ²
Memoryless Property: P(X > s+t | X > s) = P(X > t). The future waiting time does not depend on the past waiting time.
Worked Example:
Problem: The lifespan of a certain electronic component follows an exponential distribution with a mean lifespan of 5 years. What is the probability that a component will last less than 3 years?
Solution:
Step 1: Determine the rate parameter λ. Given mean E[X] = 5 years, and E[X] = 1/λ for the exponential distribution:
5 = 1/λ, so λ = 1/5 = 0.2
Step 2: Use the CDF to find P(X < 3).
P(X < 3) = F(3) = 1 − e^(−0.2 × 3) = 1 − e^(−0.6) = 1 − 0.5488 = 0.4512
Answer: 0.4512
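A minimal sketch reproducing this CDF calculation, plus a numerical check of the memoryless property (the s and t values are illustrative):

```python
import math

def exp_cdf(x, lam):
    """Exponential CDF: P(X <= x) = 1 - e^(-lam * x) for x >= 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

lam = 0.2   # mean lifespan 5 years  ->  lam = 1/5
print(round(exp_cdf(3, lam), 4))   # ~ 0.4512

# Memoryless property: P(X > s + t | X > s) equals P(X > t)
s, t = 2.0, 3.0
lhs = (1 - exp_cdf(s + t, lam)) / (1 - exp_cdf(s, lam))
rhs = 1 - exp_cdf(t, lam)
print(math.isclose(lhs, rhs))   # True
```

The memoryless check works for any s, t ≥ 0 because the survival function e^(−λx) factors: e^(−λ(s+t)) / e^(−λs) = e^(−λt).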
---
2.3 Normal (Gaussian) Distribution
The Normal distribution is arguably the most important distribution in statistics. It is symmetric, bell-shaped, and characterized by its mean (μ) and standard deviation (σ). Many natural phenomena follow this distribution, and it is central to the Central Limit Theorem.
📘 Normal PDF
f(x) = (1/(σ√(2π))) e^(−(x−μ)²/(2σ²)) for −∞ < x < ∞
Variables:
X = Normal random variable
μ = mean of the distribution
σ = standard deviation of the distribution (σ > 0)
When to use: For modeling continuous data that clusters around a central value and is symmetric, or when applying the Central Limit Theorem.
Properties:
Mean: E[X] = μ
Variance: Var(X) = σ²
Median = Mode = Mean = μ.
The curve is symmetric about μ.
The total area under the curve is 1.
📌 Standard Normal Distribution
A Standard Normal distribution is a Normal distribution with a mean of μ = 0 and a standard deviation of σ = 1. It is typically denoted by Z ~ N(0, 1). Its PDF is:
f(z) = (1/√(2π)) e^(-z²/2)
The Cumulative Distribution Function (CDF) for the Standard Normal distribution is denoted by Φ(z) = P(Z ≤ z). This value is typically found using a Z-table.
📌 Standardization (Z-score)
Z = (X - μ)/σ
Variables:
X = value from a Normal distribution
μ = mean of X
σ = standard deviation of X
Z = corresponding value in the Standard Normal distribution
When to use: To convert any Normal random variable X into a Standard Normal random variable Z, allowing the use of a standard Z-table to calculate probabilities.
Worked Example (Probability Calculation):
Problem: The height of adult males in a city is normally distributed with a mean of 175 cm and a standard deviation of 7 cm. What is the probability that a randomly selected male is between 168 cm and 182 cm tall? (Use Φ(1) = 0.8413, Φ(-1) = 0.1587)
Solution:
Step 1: Identify parameters and values. μ = 175, σ = 7. We want to find P(168 < X < 182).
Step 2: Standardize the values. For x₁ = 168:
z₁ = (168 - 175)/7 = -7/7 = -1
For x₂ = 182:
z₂ = (182 - 175)/7 = 7/7 = 1
Step 3: Use the Standard Normal CDF (Φ) to find the probability.
P(168 < X < 182) = P(-1 < Z < 1) = Φ(1) - Φ(-1) = 0.8413 - 0.1587 = 0.6826
Answer: 0.6826
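When no Z-table is at hand, Φ can be computed from the error function, since Φ(z) = (1 + erf(z/√2))/2. A quick check of the worked example (the helper name `phi` is ours):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 175, 7
z1 = (168 - mu) / sigma   # -1.0
z2 = (182 - mu) / sigma   #  1.0
p = phi(z2) - phi(z1)
print(round(p, 4))        # 0.6827 (the 4-digit table values give 0.6826)
```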
📌 Central Limit Theorem (CLT)
For a sufficiently large sample size n, the distribution of the sample mean X̄ of n independent and identically distributed (i.i.d.) random variables, each with mean μ and finite variance σ², will be approximately normally distributed, regardless of the original distribution of the individual variables. The sample mean X̄ will have:
Mean: E[X̄] = μ
Standard Deviation: SD(X̄) = σ/√n (also called the standard error of the mean). So X̄ ~ N(μ, σ²/n) approximately for large n.
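A small simulation makes the CLT concrete: sample means of Uniform(0, 1) draws (μ = 0.5, σ² = 1/12) should cluster around μ with spread σ/√n. This is an illustrative sketch, not part of the theorem statement; the sample sizes are arbitrary.

```python
import math
import random

random.seed(42)
n, trials = 30, 20000
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and sd of Uniform(0, 1)

# Draw many sample means of n i.i.d. Uniform(0, 1) variables
means = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]

avg = sum(means) / trials
sd = math.sqrt(sum((m - avg) ** 2 for m in means) / trials)
print(round(avg, 3))                 # close to mu = 0.5
print(round(sd, 4))                  # close to sigma/sqrt(n) ~ 0.0527
```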
Worked Example (Parameter Estimation):
Problem: Suppose the individual scores on an exam are normally distributed with an unknown mean μ and standard deviation σ. A candidate fails if they score below 35% and passes with distinction if they score above 80%. In a large group, 16% fail and 2% pass with distinction. Find μ and σ. (Use Φ(-1) = 0.16, Φ(2) = 0.98)
Solution:
Step 1: Set up equations based on the given probabilities and Z-scores. Let X be the score on the exam. We are given P(X < 35) = 0.16. Standardize X = 35: Z₁ = (35 - μ)/σ. From the Z-table, P(Z < -1) = 0.16, so Z₁ = -1.
(35 - μ)/σ = -1 (Equation 1)
We are given P(X > 80) = 0.02. This means P(X ≤ 80) = 1 - 0.02 = 0.98. Standardize X = 80: Z₂ = (80 - μ)/σ. From the Z-table, P(Z ≤ 2) = 0.98, so Z₂ = 2.
(80 - μ)/σ = 2 (Equation 2)
Step 2: Solve the system of linear equations. From Equation 1:
35 - μ = -σ, so μ - σ = 35 (Equation 3)
From Equation 2:
80 - μ = 2σ, so μ + 2σ = 80 (Equation 4)
Subtract Equation 3 from Equation 4:
(μ + 2σ) - (μ - σ) = 80 - 35
3σ = 45, so σ = 15
Substitute σ = 15 into Equation 3:
μ = 35 + 15 = 50
Answer: μ = 50 and σ = 15.
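The two linear equations from the z-scores can be solved in two lines, which is a handy way to double-check exam arithmetic:

```python
# From the standardization:  mu - sigma = 35   and   mu + 2*sigma = 80
# Subtracting the first from the second gives 3*sigma = 45.
sigma = (80 - 35) / 3
mu = 35 + sigma
print(mu, sigma)  # 50.0 15.0
```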
---
2.4 Gamma Distribution
The Gamma distribution is a versatile continuous distribution that generalizes the exponential distribution. It is often used to model waiting times for multiple events or the sum of independent exponentially distributed random variables.
📌 Gamma PDF
f(x) = (β^α / Γ(α)) x^(α-1) e^(-βx) for x > 0
Variables:
X = Gamma random variable
α = shape parameter (α > 0)
β = rate parameter (β > 0)
Γ(α) = Gamma function, Γ(z) = ∫₀^∞ t^(z-1) e^(-t) dt. For positive integers, Γ(n) = (n-1)!.
When to use: For modeling waiting times (e.g., in queuing theory), or when a variable is a sum of several independent exponential variables.
Properties:
Mean: E[X] = α/β
Variance: Var(X) = α/β²
If α = 1, the Gamma distribution reduces to the Exponential distribution with rate β.
Worked Example:
Problem: The lifespan of a device follows a Gamma distribution. Historical data suggests the mean lifespan is 6 years and the variance is 12 years². Find the parameters α and β of this distribution.
Solution:
Step 1: Write down the equations for mean and variance in terms of α and β: E[X] = α/β = 6 and Var(X) = α/β² = 12.
Step 2: Solve the system of equations for α and β. From the mean equation, α = 6β. Substitute this into the variance equation:
6β/β² = 12
6/β = 12
β = 6/12 = 0.5
Now substitute β = 0.5 back into the equation for α:
α = 6 × 0.5 = 3
Answer: α = 3 and β = 0.5.
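This "method of moments" inversion works for any Gamma mean/variance pair: dividing mean by variance gives β, and multiplying mean by β gives α. A small sketch (the helper name `gamma_from_moments` is ours):

```python
def gamma_from_moments(mean, var):
    """Recover Gamma(shape=alpha, rate=beta) from its mean and variance."""
    beta = mean / var        # since var = mean / beta  (= alpha / beta^2)
    alpha = mean * beta      # since mean = alpha / beta
    return alpha, beta

alpha, beta = gamma_from_moments(6, 12)
print(alpha, beta)  # 3.0 0.5
```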
---
2.5 Beta Distribution
The Beta distribution is a continuous probability distribution defined on the interval [0, 1]. It is particularly useful for modeling probabilities or proportions, as its values are naturally constrained within this range.
📌 Beta PDF
f(x) = x^(α-1)(1-x)^(β-1) / B(α, β) for 0 ≤ x ≤ 1, where B(α, β) = Γ(α)Γ(β)/Γ(α+β)
Variables:
X = Beta random variable (a proportion or probability)
α, β = shape parameters (α > 0, β > 0)
When to use: For modeling proportions, probabilities, or quantities constrained between 0 and 1 (e.g., success rates, market share).
Properties:
Mean: E[X] = α/(α+β)
Variance: Var(X) = αβ / ((α+β)²(α+β+1))
Mode: Mode(X) = (α-1)/(α+β-2) (for α > 1, β > 1)
Worked Example:
Problem: The proportion of defective items produced by a machine follows a Beta distribution. From historical data, the mean proportion of defective items is 0.25 and the mode is 0.20. Find the parameters α and β of this distribution.
Solution:
Step 1: Write down the equations for mean and mode in terms of α and β: E[X] = α/(α+β) = 0.25 and Mode(X) = (α-1)/(α+β-2) = 0.20.
Step 2: Solve the system of equations. From the mean equation:
α/(α+β) = 0.25
α = 0.25(α + β) = 0.25α + 0.25β
0.75α = 0.25β
β = 3α (Equation 1)
Substitute β = 3α into the mode equation:
(α - 1)/(α + 3α - 2) = 0.20
(α - 1)/(4α - 2) = 0.20
α - 1 = 0.20(4α - 2) = 0.8α - 0.4
0.2α = 0.6
α = 0.6/0.2 = 3
Substitute α = 3 back into Equation 1:
β = 3 × 3 = 9
Answer: α = 3 and β = 9.
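The same algebra can be packaged once and reused: the mean equation gives β = α(1-m)/m, and substituting into the mode equation leaves a linear equation in α. A sketch under the assumption α, β > 1 (the helper name `beta_from_mean_mode` is ours):

```python
def beta_from_mean_mode(mean, mode):
    """Solve alpha, beta from mean = a/(a+b) and mode = (a-1)/(a+b-2), a, b > 1."""
    r = (1 - mean) / mean                      # beta = r * alpha from the mean equation
    alpha = (2 * mode - 1) / (mode * (1 + r) - 1)  # linear equation from the mode
    return alpha, r * alpha

alpha, beta = beta_from_mean_mode(0.25, 0.20)
print(round(alpha, 6), round(beta, 6))  # 3.0 9.0
```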
---
Problem-Solving Strategies
💡 CMI Strategy
Identify the Distribution: Carefully read the problem statement to determine which standard distribution best models the scenario. Look for keywords (e.g., "number of successes in n trials" → Binomial; "average rate of events" → Poisson/Exponential; "mean and standard deviation" → Normal; "proportion" → Beta).
Extract Parameters: Identify all given parameters (μ, σ, n, p, λ, α, β) and what you need to find.
Standardize for Normal: If dealing with a Normal distribution, always standardize the variable to a Z-score to use the standard normal table/CDF.
Use CDF for Range Probabilities: For continuous distributions, P(a < X < b) = F(b) - F(a). For Normal, this becomes Φ(Z_b) - Φ(Z_a). Remember P(X > x) = 1 - P(X ≤ x).
Parameter Estimation: If mean/variance/mode are given, set up simultaneous equations to solve for the distribution's parameters (α, β, μ, σ).
Approximations: Recall when Poisson can approximate Binomial (n large, p small, λ = np).
---
Common Mistakes
⚠️ Avoid These Errors
❌ Confusing PMF and PDF: Using integration for discrete distributions or summing for continuous distributions.
✅ Correct: Use PMF for discrete (summation for multiple values), PDF for continuous (integration for ranges).
❌ Incorrect Z-score Calculation: Forgetting to subtract the mean or divide by the standard deviation when standardizing a normal variable.
✅ Correct: Always use Z = (X - μ)/σ. For a sample mean, use Z = (X̄ - μ)/(σ/√n).
❌ Misinterpreting Z-table Values: Directly using Φ(z) for P(Z > z).
✅ Correct: P(Z > z) = 1 - Φ(z). Use symmetry: P(Z < -z) = P(Z > z) = 1 - Φ(z).
❌ Ignoring Distribution Domain: Calculating probabilities outside the valid range (e.g., negative time for Exponential, values outside [0, 1] for Beta).
✅ Correct: Always respect the domain of the random variable.
❌ Parameter Estimation Errors: Incorrectly setting up equations for mean/variance/mode for a specific distribution.
✅ Correct: Memorize or correctly derive the formulas for mean, variance, and mode for each distribution.
❌ Forgetting n in CLT: When dealing with sample means, failing to divide σ by √n for the standard error.
✅ Correct: The standard deviation of the sample mean is σ_X̄ = σ/√n.
---
Practice Questions
:::question type="MCQ" question="A call center receives calls at an average rate of 20 calls per hour. What is the probability that exactly 15 calls are received in a 30-minute interval?" options=["e^(-10)10^15/15!","e^(-20)20^15/15!","e^(-10)15^10/10!","e^(-20)15^20/20!"] answer="A" hint="Adjust the average rate to match the given time interval before applying the Poisson PMF." solution="Step 1: Determine the average rate for the given interval. The average rate is 20 calls per hour. For a 30-minute interval (0.5 hours), the average rate λ will be:
λ = 20 calls/hour × 0.5 hours = 10 calls
Step 2: Apply the Poisson Probability Mass Function (PMF). The Poisson PMF is P(X = k) = e^(-λ)λ^k/k!. Here, λ = 10 and k = 15.
P(X = 15) = e^(-10)10^15/15!
The correct option is A."
:::
:::question type="NAT" question="The scores on a standardized test are normally distributed with a mean of 600 and a standard deviation of 100. If a student scores 750, what is their Z-score? Report to two decimal places." answer="1.50" hint="Use the Z-score formula Z = (X - μ)/σ." solution="Step 1: Identify the given values. X = 750 (student's score), μ = 600 (mean score), σ = 100 (standard deviation).
Step 2: Apply the Z-score formula.
Z = (X - μ)/σ = (750 - 600)/100 = 150/100 = 1.5
The Z-score is 1.50." :::
:::question type="MSQ" question="A quality control process inspects batches of 50 items. Each item has a 1% chance of being defective, independently. Which of the following statements are correct?" options=["The number of defective items in a batch follows a Binomial distribution.","The probability of finding exactly 1 defective item in a batch is C(50,1)(0.01)^1(0.99)^49.","The mean number of defective items in a batch is 0.5.","The Poisson approximation to this distribution would use λ = 50."] answer="A,B,C" hint="Identify the distribution type and its parameters. Check the conditions for Poisson approximation." solution="Statement A: The number of defective items in a fixed number of independent trials (50 items) with a constant probability of success (1% defect) follows a Binomial distribution. This is correct.
Statement B: For a Binomial distribution B(n, p), P(X = k) = C(n, k)p^k(1-p)^(n-k). Here n = 50, p = 0.01, k = 1. So P(X = 1) = C(50,1)(0.01)^1(0.99)^49. This is correct.
Statement C: The mean of a Binomial distribution is E[X] = np. Here E[X] = 50 × 0.01 = 0.5. This is correct.
Statement D: The Poisson approximation to a Binomial distribution uses λ = np = 50 × 0.01 = 0.5. The statement says λ = 50, which is incorrect.
Therefore, statements A, B, and C are correct." :::
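The Binomial quantities in this question are easy to verify with `math.comb`:

```python
import math

n, p = 50, 0.01
p1 = math.comb(n, 1) * p ** 1 * (1 - p) ** 49   # P(exactly 1 defective)
mean = n * p                                     # E[X] = np
lam = n * p                                      # rate for the Poisson approximation
print(round(p1, 4), mean, lam)                   # 0.3056 0.5 0.5
```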
:::question type="SUB" question="The time (in minutes) a customer spends waiting for a service representative follows an exponential distribution. If 80% of customers wait longer than 5 minutes, what is the average waiting time (in minutes)? Report to two decimal places." answer="22.41" hint="Use the CDF of the exponential distribution and its relationship with the mean." solution="Step 1: Set up the probability statement using the Exponential CDF. Let X be the waiting time, X ~ Exp(λ). We are given P(X > 5) = 0.80, and we know P(X > x) = e^(-λx).
So, e^(-5λ) = 0.80.
Step 2: Solve for λ. Take the natural logarithm of both sides:
-5λ = ln(0.80) ≈ -0.22314355
λ ≈ 0.22314355/5 ≈ 0.04462871
Step 3: Calculate the average waiting time (mean). For an Exponential distribution, the mean is E[X] = 1/λ.
E[X] = 1/0.04462871 ≈ 22.4070
Rounding to two decimal places, E[X] ≈ 22.41. The average waiting time is 22.41 minutes." :::
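The survival-probability algebra can be checked directly, since the mean is -5/ln(0.80):

```python
import math

# P(X > 5) = exp(-5*lam) = 0.80  ->  lam = -ln(0.80)/5, mean = 1/lam
lam = -math.log(0.80) / 5
mean = 1 / lam
print(round(mean, 3))  # 22.407
```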
---
Chapter Summary
📌 Random Variables and Distributions - Key Takeaways
To excel in CMI, a deep understanding of Random Variables and Distributions is fundamental. Here are the most crucial points you must internalize:
Random Variables (RVs): Understand the formal definition of a random variable as a function mapping outcomes from a sample space to real numbers. Differentiate clearly between discrete and continuous random variables and their respective characteristics.
Probability Mass Function (PMF), Probability Density Function (PDF), and Cumulative Distribution Function (CDF):
Know the definitions and properties of the PMF (for discrete RVs), PDF (for continuous RVs), and CDF (for both). Master how to calculate probabilities using these functions, including P(a ≤ X ≤ b) = F_X(b) - F_X(a) for CDFs, and using integration/summation for PDFs/PMFs. Understand the relationship between PDF/PMF and CDF: F_X(x) = Σ_{t≤x} P(X = t) or F_X(x) = ∫_{-∞}^x f_X(t) dt, and f_X(x) = F′_X(x).
Expectation and Variance:
Memorize the definitions of expectation E[X] and variance Var[X] for both discrete and continuous RVs. Crucially, understand and apply their properties: linearity of expectation (E[aX + bY] = aE[X] + bE[Y]) and the scaling property of variance (Var[aX + b] = a²Var[X]). Be proficient in calculating E[g(X)] using the Law of the Unconscious Statistician (LOTUS).
Standard Distributions: Be thoroughly familiar with the key properties (parameters, PMF/PDF, mean, variance, typical shape) of the most common distributions:
Discrete: Bernoulli, Binomial, Poisson, Geometric. Continuous: Uniform, Exponential, Normal (Gaussian), Gamma. Recognize scenarios where each distribution is applicable.
Moment Generating Functions (MGFs):
Understand the definition M_X(t) = E[e^(tX)]. Know how to use MGFs to find moments (E[X^n] = M_X^(n)(0)) and, more importantly, to uniquely identify the distribution of an RV. Be familiar with the MGFs of standard distributions.
Transformations of Random Variables: Master techniques for finding the PMF/PDF of a new random variable Y = g(X) given the distribution of X. This often involves the CDF method or the change-of-variables formula for continuous RVs.
---
Chapter Review Questions
:::question type="MCQ" question="Let X be a continuous random variable with probability density function (PDF) f_X(x) = c(1 - x²) for -1 ≤ x ≤ 1, and f_X(x) = 0 otherwise.
Which of the following statements is TRUE? (I) The constant c = 3/4. (II) P(X > 0) = 1/2. (III) E[X] = 0. (IV) Var[X] = 2/5." options=["A) (I) and (II) only","B) (I), (II) and (III) only","C) (I), (II), (III) and (IV)","D) (I), (III) and (IV) only"] answer="B" hint="Remember the properties of a PDF: it must integrate to 1. Also, leverage symmetry to simplify calculations for expectation and probability." solution="Let's analyze each statement:
(I) The constant c = 3/4: For f_X(x) to be a valid PDF, ∫_{-∞}^{∞} f_X(x) dx = 1.
∫₋₁¹ c(1 - x²) dx = 1
c[x - x³/3]₋₁¹ = 1
c[(1 - 1/3) - (-1 + 1/3)] = 1
c[2/3 + 2/3] = 1
c(4/3) = 1 ⟹ c = 3/4
So, statement (I) is TRUE.
(II) P(X > 0) = 1/2:
P(X > 0) = ∫₀¹ (3/4)(1 - x²) dx = (3/4)[x - x³/3]₀¹ = (3/4)(2/3) = 1/2
Alternatively, since f_X(x) is symmetric about x = 0 (f_X(x) = f_X(-x)), P(X > 0) = P(X < 0) = 1/2. So, statement (II) is TRUE.
(III) E[X] = 0: the integrand x·(3/4)(1 - x²) is an odd function over the symmetric interval [-1, 1], so E[X] = 0. Statement (III) is TRUE.
(IV) Var[X] = 2/5: since E[X] = 0, Var[X] = E[X²] = ∫₋₁¹ x²·(3/4)(1 - x²) dx. Since (x² - x⁴) is an even function, we can write:
E[X²] = (3/4)·2∫₀¹ (x² - x⁴) dx = (3/2)[x³/3 - x⁵/5]₀¹ = (3/2)(1/3 - 1/5) = (3/2)(2/15) = 1/5
So, Var[X] = 1/5. Therefore, statement (IV) Var[X] = 2/5 is FALSE.
Based on the analysis, statements (I), (II), and (III) are TRUE. The correct option is B." :::
:::question type="NAT" question="A fair six-sided die is rolled repeatedly. Let X be the number of rolls until a '6' appears for the first time. Let Y be the number of rolls until a '6' appears for the second time. Find E[Y | X = 1]. (Enter your answer as a plain number)." answer="7" hint="Consider the nature of the geometric distribution and the memoryless property. If the first '6' occurs on the 1st roll, how many additional rolls are needed for the second '6'?" solution="Let X be the number of rolls until the first '6' appears. X follows a Geometric distribution with p = 1/6. Let Y be the number of rolls until the second '6' appears.
We are asked to find E[Y | X = 1]. Given that the first '6' appeared on the 1st roll, we need the expected number of additional rolls, from roll 2 onwards, until the second '6' appears. Let Z be that number of additional rolls. Since die rolls are independent and the probability of rolling a '6' remains p = 1/6 for each subsequent roll, Z also follows a Geometric distribution with parameter p = 1/6. The expected value of a Geometric distribution (number of trials until the first success) is 1/p, so E[Z] = 1/(1/6) = 6.
The total number of rolls until the second '6' is Y = X + Z. Therefore E[Y | X = 1] = E[1 + Z | X = 1], and since Z is independent of the event X = 1 (memorylessness of the process), E[Y | X = 1] = 1 + E[Z] = 1 + 6 = 7.
The expected value is 7." :::
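A brute-force simulation supports the memoryless argument: condition on the first roll being a '6' and count rolls until the second '6'. This is an illustrative sketch with arbitrary seed and trial count.

```python
import random

random.seed(1)
p, trials = 1 / 6, 200_000

def rolls_until_second_six():
    """Roll 1 is a '6' by assumption; count total rolls until the second '6'."""
    rolls = 1
    while True:
        rolls += 1
        if random.random() < p:   # this roll shows a '6'
            return rolls

avg = sum(rolls_until_second_six() for _ in range(trials)) / trials
print(round(avg, 2))  # close to the exact answer, 7
```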
:::question type="MCQ" question="Let X be a random variable with Moment Generating Function (MGF) M_X(t) = e^(2t)/(1 - 3t) for t < 1/3. Which of the following is the variance of X, Var[X]?" options=["A) 3","B) 9","C) 11","D) 13"] answer="B" hint="Recall that M′_X(0) = E[X] and M″_X(0) = E[X²]. Then use Var[X] = E[X²] - (E[X])²." solution="The MGF is M_X(t) = e^(2t)(1 - 3t)^(-1).
First, find E[X] = M′_X(0). Using the product rule with u = e^(2t) and v = (1 - 3t)^(-1): u′ = 2e^(2t) and v′ = -1·(1 - 3t)^(-2)·(-3) = 3(1 - 3t)^(-2).
So M′_X(t) = 2e^(2t)(1 - 3t)^(-1) + 3e^(2t)(1 - 3t)^(-2). Evaluating at t = 0: M′_X(0) = 2 + 3 = 5, so E[X] = 5.
Next, find E[X²] = M″_X(0). Differentiate M′_X(t) = A(t) + B(t), where A(t) = 2e^(2t)(1 - 3t)^(-1) and B(t) = 3e^(2t)(1 - 3t)^(-2).
A′(t) = 4e^(2t)(1 - 3t)^(-1) + 6e^(2t)(1 - 3t)^(-2), so A′(0) = 4 + 6 = 10.
B′(t) = 6e^(2t)(1 - 3t)^(-2) + 18e^(2t)(1 - 3t)^(-3), so B′(0) = 6 + 18 = 24.
So M″_X(0) = 10 + 24 = 34, and E[X²] = 34. Therefore Var[X] = E[X²] - (E[X])² = 34 - 25 = 9.
Alternatively, recognize the MGF. The MGF of an Exponential distribution with rate λ is λ/(λ - t), so for X₁ ~ Exp(1/3), M_X₁(t) = (1/3)/((1/3) - t) = 1/(1 - 3t). The MGF of Y = X₁ + 2 is E[e^(t(X₁+2))] = e^(2t)E[e^(tX₁)] = e^(2t)/(1 - 3t). So X is an Exp(1/3) random variable shifted by 2, with E[X₁] = 1/λ = 3 and Var[X₁] = 1/λ² = 9. Then E[X] = 3 + 2 = 5 and Var[X] = Var[X₁ + 2] = Var[X₁] = 9. The variance of X is 9." :::
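The moment calculations can be cross-checked by differentiating the MGF numerically with central differences, a quick trick for verifying hand-computed derivatives (step size h is an arbitrary small choice):

```python
import math

def M(t):
    """MGF from the problem: M(t) = e^(2t) / (1 - 3t), valid for t < 1/3."""
    return math.exp(2 * t) / (1 - 3 * t)

h = 1e-5
EX  = (M(h) - M(-h)) / (2 * h)           # central difference for M'(0)  = E[X]
EX2 = (M(h) - 2 * M(0) + M(-h)) / h**2   # central difference for M''(0) = E[X^2]
var = EX2 - EX ** 2
print(round(EX, 3), round(EX2, 3), round(var, 3))  # approximately 5, 34, 9
```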
:::question type="NAT" question="Let X be a continuous random variable uniformly distributed on the interval (0, 2). Define a new random variable Y = X². Find E[Y]. (Enter your answer as a plain number in decimal form, rounded to two decimal places)." answer="1.33" hint="First, determine the PDF of X. Then, use the Law of the Unconscious Statistician (LOTUS) to compute E[Y] without explicitly finding the PDF of Y." solution="The random variable X is uniformly distributed on (0, 2), so its PDF is f_X(x) = 1/2 for 0 < x < 2, and 0 otherwise.
We want E[Y] where Y = X². Using the Law of the Unconscious Statistician (LOTUS), E[g(X)] = ∫_{-∞}^{∞} g(x)f_X(x) dx with g(x) = x².
E[Y] = E[X²] = ∫₀² x²·(1/2) dx = (1/2)[x³/3]₀² = (1/2)(8/3) = 4/3
Numerically, E[Y] = 4/3 ≈ 1.3333..., which rounds to 1.33. The expected value of Y is 1.33." :::
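The LOTUS integral can be approximated by a simple Riemann sum, which is a useful habit for checking any E[g(X)] computation (grid size N is an arbitrary choice):

```python
# Left Riemann sum of E[X^2] for X ~ Uniform(0, 2), where f(x) = 1/2
N = 100_000
dx = 2 / N
total = sum((i * dx) ** 2 * 0.5 * dx for i in range(N))
print(round(total, 4))  # 1.3333, matching 4/3
```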
---
What's Next?
💡 Continue Your CMI Journey
You've mastered Random Variables and Distributions! This chapter is the bedrock for much of advanced probability theory and statistics.
Key connections:
Building on Previous Learning: This chapter heavily relies on your understanding of basic probability (sample spaces, events, conditional probability, independence) and calculus (integration, differentiation) for continuous random variables. A solid grasp of set theory is also beneficial for defining events and sample spaces.
Paving the Way for Future Chapters: The concepts learned here are foundational for:
- Joint Distributions: Understanding how multiple random variables interact.
- Conditional Expectation: A deeper dive into expected values given certain conditions.
- Limit Theorems: The Law of Large Numbers (LLN) and the Central Limit Theorem (CLT), which are crucial for statistical inference, build directly on the properties of expectation and variance of random variables.
- Statistical Inference: Chapters on estimation (e.g., maximum likelihood estimation) and hypothesis testing rely on the distributions of sample statistics.
- Stochastic Processes: Many advanced topics in probability and applied mathematics begin with discrete-time or continuous-time random variables.
Keep practicing problems that combine these concepts, as CMI questions often integrate knowledge across multiple topics!
🎯 Key Points to Remember
✅ Master the core concepts in Random Variables and Distributions before moving to advanced topics
✅ Practice with previous year questions to understand exam patterns
✅ Review short notes regularly for quick revision before exams