Random Variables
Overview
In our preceding discussions, we established the foundational principles of probability theory by examining sample spaces and events. However, to perform a rigorous quantitative analysis of random phenomena, we must bridge the gap between abstract outcomes and numerical values. This chapter introduces the concept of a random variable, a fundamental construct that assigns a numerical value to every possible outcome of a random experiment. The formalization of this concept is a cornerstone of modern statistics and machine learning, allowing us to apply the powerful tools of mathematical analysis to uncertain events.
We shall begin by defining discrete and continuous random variables and exploring their associated probability distributions. Subsequently, we will investigate the essential characteristics of these distributions through measures of central tendency and dispersion. We will learn to compute and interpret quantities such as the expected value ($E[X]$) and variance ($\mathrm{Var}(X)$), which provide concise summaries of a variable's behavior. The chapter then progresses to the study of relationships between multiple random variables, introducing covariance and correlation as measures of linear dependence. A firm grasp of these concepts is indispensable for understanding feature interactions in data analysis and machine learning models, a topic of significant importance for the GATE examination.
Finally, we culminate our study with an examination of conditional expectation and variance. This advanced topic addresses how our knowledge or assumptions about one event or variable can alter our expectations about another. This principle forms the basis for many predictive models and inferential techniques encountered in the Data Science and AI syllabus. A thorough understanding of the material presented herein is therefore critical for solving complex problems and building a robust theoretical foundation for subsequent topics.
---
Chapter Contents
| # | Topic | What You'll Learn |
|---|-------|-------------------|
| 1 | Definition of Random Variables | Formalizing numerical outcomes of random experiments. |
| 2 | Measures of Central Tendency and Dispersion | Quantifying the center and spread of distributions. |
| 3 | Correlation and Covariance | Analyzing linear dependence between random variables. |
| 4 | Conditional Expectation and Variance | Updating expectations with new information. |
---
Learning Objectives
After completing this chapter, you will be able to:
- Define a random variable and differentiate between discrete and continuous types.
- Calculate and interpret the expected value, variance, and standard deviation of a random variable.
- Compute the covariance and correlation between two random variables to assess their linear relationship.
- Determine the conditional expectation and variance of a random variable given an event or another variable.
---
We now turn our attention to Definition of Random Variables...
## Part 1: Definition of Random Variables
Introduction
In our study of probability, we are often concerned not with the specific outcomes of an experiment, but rather with some numerical property associated with those outcomes. For instance, in an experiment involving the tossing of two coins, we might be more interested in the number of heads that appear than in the exact sequence of heads and tails. This need to associate a numerical value with each outcome of a random experiment leads us to the fundamental concept of a random variable.
A random variable provides a means of mapping the, often non-numerical, outcomes in a sample space to a set of real numbers. This transformation is crucial as it allows us to apply the powerful tools of calculus and mathematical analysis to the study of probability and statistics. We can analyze distributions, calculate expected values, and determine variances, all of which are central to data analysis and inference. Understanding the formal definition and classification of random variables is the first essential step in this direction.
A random variable, typically denoted by a capital letter such as $X$, is a function that assigns a real number to each outcome in the sample space of a random experiment. Formally, it is a mapping $X : S \to \mathbb{R}$.
We use a capital letter, $X$, to represent the random variable as a function, and a lowercase letter, $x$, to represent a specific value that the random variable can take. The set of all possible values of $X$ is called the range or support of the random variable.
---
Key Concepts
The most fundamental classification of random variables is based on the nature of the values they can assume. This leads to two primary types: discrete and continuous random variables.
## 1. Discrete Random Variables
A random variable is said to be discrete if its range is finite or countably infinite. This means that the variable can only take on a specific, separated set of values. There are "gaps" between the possible values.
Consider the experiment of rolling a standard six-sided die. The sample space is $S = \{1, 2, 3, 4, 5, 6\}$. If we define a random variable $X$ as the outcome of the roll, then $X$ can take values from the set $\{1, 2, 3, 4, 5, 6\}$. Since this set is finite, $X$ is a discrete random variable. Similarly, the number of defective items in a batch of 100 is a discrete random variable, as it can take integer values from 0 to 100.
Worked Example:
Problem: An experiment consists of tossing two fair coins. Let the random variable $X$ be defined as the number of heads observed. Determine the sample space and the set of possible values for $X$.
Solution:
Step 1: Define the sample space $S$.
The sample space consists of all possible outcomes of tossing two coins. Let H denote Heads and T denote Tails: $S = \{HH, HT, TH, TT\}$.
Step 2: Apply the function $X$ to each outcome in $S$.
The random variable $X$ counts the number of heads in each outcome.
For the outcome $HH$, the number of heads is 2. So, $X(HH) = 2$.
For the outcome $HT$, the number of heads is 1. So, $X(HT) = 1$.
For the outcome $TH$, the number of heads is 1. So, $X(TH) = 1$.
For the outcome $TT$, the number of heads is 0. So, $X(TT) = 0$.
Step 3: List the set of all possible values for $X$.
The range of the random variable is the set of all unique numerical values it can take: $\{0, 1, 2\}$.
Answer: The sample space is $S = \{HH, HT, TH, TT\}$ and the random variable $X$ can take values from the set $\{0, 1, 2\}$. Since this set is finite, $X$ is a discrete random variable.
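As a quick aside (an illustration, not part of the problem itself), the mapping in this worked example can be enumerated in a few lines of Python:

```python
from itertools import product

# Enumerate the sample space of tossing two fair coins.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=2)]

# The random variable X maps each outcome to its number of heads.
X = {outcome: outcome.count("H") for outcome in sample_space}

# The range (support) of X is the set of distinct values it can take.
support = sorted(set(X.values()))

print(sample_space)               # ['HH', 'HT', 'TH', 'TT']
print(X["HH"], X["HT"], X["TT"])  # 2 1 0
print(support)                    # [0, 1, 2]
```

Note that two distinct outcomes (HT and TH) map to the same value 1; a random variable need not be one-to-one.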
---
## 2. Continuous Random Variables
A random variable is said to be continuous if its range is an interval or a collection of intervals on the real number line. This means the variable can take on any value within a given range; there are uncountably many possible values.
For example, if we define a random variable $X$ as the height of a randomly selected student, $X$ can take any value within a certain range, say the interval $[150, 200]$ centimetres. It is not restricted to integer values; a height of $172.5$ cm is perfectly possible. Other examples include temperature, weight, and time. For a continuous random variable, the probability of it taking any single specific value is zero, i.e., $P(X = x) = 0$ for every individual value $x$. We instead focus on the probability that the variable falls within a certain interval.
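To make the interval idea concrete, here is a small Python sketch for a uniformly distributed "height" on a hypothetical interval of 150 cm to 200 cm (the interval and the uniformity are illustrative assumptions, not from a specific problem):

```python
# Hypothetical continuous uniform variable X ~ Uniform(150, 200), in cm.
a, b = 150.0, 200.0

def prob_interval(lo: float, hi: float) -> float:
    """P(lo <= X <= hi) for a uniform X: interval length / total length."""
    lo, hi = max(lo, a), min(hi, b)        # clip to the support [a, b]
    return max(hi - lo, 0.0) / (b - a)

print(prob_interval(160, 180))  # 0.4 -> a 20 cm window out of 50 cm
print(prob_interval(170, 170))  # 0.0 -> any single point has probability zero
```

The second call illustrates the key point of this section: shrinking the interval to a single point drives the probability to exactly zero.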
---
Problem-Solving Strategies
When presented with a problem, the first step is to determine whether the random variable is discrete or continuous.
- Ask: Can I count the possible outcomes? If the values are countable (e.g., number of successes, number of arrivals, results of a die roll), it is discrete.
- Ask: Is the variable measured? If the value is obtained by measurement (e.g., height, weight, time, temperature), it can take any value in an interval and is continuous.
This initial classification dictates the entire subsequent approach, including the type of probability distribution (PMF vs. PDF) to be used.
---
Common Mistakes
- β Confusing a random variable with an algebraic variable. A random variable is a function that maps outcomes to numbers. An algebraic variable is simply an unknown quantity.
- β Assuming all numerical variables are continuous. The number of cars passing a toll booth in an hour is numerical, but it is discrete (0, 1, 2, ...). It cannot be 2.5. Always check if the values are countable or if they fall within a continuous range.
- β Incorrectly defining the range. For an experiment of drawing 3 balls from a bag of 5 red and 5 blue balls, if $X$ is the number of red balls drawn, the range is $\{0, 1, 2, 3\}$, not all real numbers from 0 to 3.
---
Practice Questions
:::question type="MCQ" question="Which of the following is an example of a discrete random variable?" options=["The height of a building", "The time taken to complete a race", "The number of defective items in a shipment of 1000 items", "The temperature of a room in Celsius"] answer="The number of defective items in a shipment of 1000 items" hint="A discrete variable's values can be counted. Which of the options represents a counted quantity rather than a measured one?" solution="Let's analyze the options.
- The height of a building is a measurement and can take any value within a range, making it continuous.
- The time taken to complete a race is also a measurement and is continuous.
- The number of defective items is a count (0, 1, 2, ..., 1000). This set of values is finite and countable. Therefore, it is a discrete random variable.
- The temperature of a room is a measurement and is continuous.
:::
:::question type="NAT" question="A box contains 4 red and 3 green balls. An experiment consists of drawing 2 balls from the box without replacement. Let the random variable Y be the number of green balls drawn. What is the number of distinct values that Y can take?" answer="3" hint="Consider the minimum and maximum number of green balls you can possibly draw in this experiment." solution="Let Y be the number of green balls drawn.
The experiment involves drawing 2 balls from a total of 7.
- We could draw 0 green balls (meaning 2 red balls are drawn). This is possible since there are 4 red balls. So, Y can be 0.
- We could draw 1 green ball and 1 red ball. This is possible. So, Y can be 1.
- We could draw 2 green balls. This is possible since there are 3 green balls. So, Y can be 2.
- We cannot draw 3 green balls, as we are only drawing 2 balls in total.
The set of possible values for Y is $\{0, 1, 2\}$.
The number of distinct values is 3.
Answer: \boxed{3}"
:::
:::question type="MSQ" question="Let S be the sample space of a random experiment. Let X be a random variable defined as a function $X : S \to \mathbb{R}$. Which of the following statements are ALWAYS true?" options=["X is always a discrete random variable.", "The range of X is a subset of the real numbers.", "If the sample space S is finite, then X must be a discrete random variable.", "If X is a continuous random variable, the sample space S must be infinite."] answer="The range of X is a subset of the real numbers.,If the sample space S is finite, then X must be a discrete random variable." hint="Review the fundamental definition of a random variable and the properties of discrete vs. continuous variables." solution="Let's evaluate each statement:
- Statement 1 is not always true: a random variable may be continuous, for example one whose range is an interval of real numbers.
- Statement 2 is always true: by definition, X maps every outcome in S to a real number, so its range is a subset of the real numbers.
- Statement 3 is always true: if S is finite, the range of X is also finite, and a finite range means X is discrete.
Based on the fundamental definitions, statements 2 and 3 are the most direct and fundamental consequences of the definition of a random variable.
Answer: \boxed{\text{The range of X is a subset of the real numbers.,If the sample space S is finite, then X must be a discrete random variable.}}"
:::
---
Summary
- A Random Variable is a function that maps outcomes from a sample space to the set of real numbers $\mathbb{R}$.
- The primary classification is between Discrete Random Variables (countable values, e.g., number of defects) and Continuous Random Variables (values in an interval, e.g., height or weight).
- The first step in any random variable problem is to correctly identify its type, as this determines the entire analytical approach.
---
What's Next?
This topic is the foundation for understanding how we model random phenomena numerically. Your understanding will be deepened by studying:
- Probability Distributions: How probabilities are assigned to the values of a random variable. This involves learning about Probability Mass Functions (PMF) for discrete variables and Probability Density Functions (PDF) for continuous variables.
- Expectation and Variance: How to calculate the central tendency (mean) and spread (variance) of a random variable, which are crucial measures for summarizing its behavior.
Mastering the definition and classification of random variables is essential before proceeding to these more advanced concepts.
---
Now that you understand Definition of Random Variables, let's explore Measures of Central Tendency and Dispersion which builds on these concepts.
---
## Part 2: Measures of Central Tendency and Dispersion
Introduction
In the study of random variables, it is often insufficient to know only the full probability distribution. For many applications in data analysis and statistical inference, we require concise numerical summaries that describe the essential features of the distribution. These summaries are broadly categorized into two types: measures of central tendency and measures of dispersion.
Measures of central tendency aim to identify a single value that represents the "center" or "typical" value of a random variable. The most common of these is the mean, or expected value, which provides a long-run average of the outcomes. Measures of dispersion, conversely, quantify the variability or spread of the random variable's possible values around this central point. The primary measures here are the variance and its square root, the standard deviation. A thorough understanding of these measures is not merely a procedural exercise; it is fundamental to interpreting probabilistic models and making informed decisions based on data. In the context of the GATE examination, questions frequently test the direct calculation of these measures as well as their known properties for standard probability distributions.
A summary statistic is a single number that is computed from a probability distribution (or a sample of data) to summarize a specific characteristic of that distribution. Measures of central tendency and dispersion are the most fundamental types of summary statistics.
---
Measures of Central Tendency
These measures provide a single value that attempts to describe a set of data by identifying the central position within that set of data.
1. Mean (Expected Value)
The most important measure of central tendency is the mean, also known as the expected value. For a random variable $X$, its expected value, denoted as $E[X]$ or $\mu_X$, represents the weighted average of all possible values that $X$ can take, where the weights are the corresponding probabilities.
The expected value of a random variable is the long-run average value of repetitions of the experiment it represents. It is the center of mass of the probability distribution.
For a discrete random variable $X$ with a set of possible values $S$ and a probability mass function (PMF) $p(x)$, the expected value is the sum of each value multiplied by its probability:
$$E[X] = \sum_{x \in S} x \, p(x)$$
Variables:
- $x$: A possible value of the random variable $X$.
- $p(x)$: The probability that $X$ takes the value $x$, i.e., $P(X = x)$.
- $S$: The sample space or set of all possible values for $X$.
When to use: When given the PMF of a discrete random variable and asked to find its mean.
For a continuous random variable $X$ with a probability density function (PDF) $f(x)$, the expected value is found by integrating the product of $x$ and $f(x)$ over the entire range of $X$:
$$E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx$$
Variables:
- $x$: A variable representing the values of the random variable $X$.
- $f(x)$: The probability density function of $X$.
When to use: When given the PDF of a continuous random variable and asked to find its mean.
Worked Example:
Problem: A discrete random variable $X$ has a given probability mass function specifying a probability $p(x_i)$ for each of its values $x_i$. Calculate the mean of $X$.
Solution:
Step 1: Identify the values $x_i$ of $X$ and their corresponding probabilities $p(x_i)$ from the given PMF.
Step 2: Apply the formula for the expected value of a discrete random variable, $E[X] = \sum_i x_i \, p(x_i)$.
Step 3: Substitute each value and its probability into the sum.
Step 4: Calculate the final result.
Answer: The mean of $X$ is the weighted average computed in Step 4.
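The same computation can be carried out in Python. The PMF below is an illustrative assumption chosen for the demo (the values 1, 2, 3 and their probabilities are not from the text); `fractions.Fraction` keeps the arithmetic exact:

```python
from fractions import Fraction

# Hypothetical PMF for illustration: P(X=1)=1/5, P(X=2)=1/2, P(X=3)=3/10.
pmf = {1: Fraction(1, 5), 2: Fraction(1, 2), 3: Fraction(3, 10)}
assert sum(pmf.values()) == 1  # a valid PMF must sum to 1

# E[X] = sum over x of x * p(x)
mean = sum(x * p for x, p in pmf.items())
print(mean)  # 21/10, i.e. 2.1
```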
2. Median
The median is the value that separates the higher half from the lower half of a probability distribution. For a continuous random variable with cumulative distribution function (CDF) $F(x)$, the median is the value $m$ such that $F(m) = 0.5$.
The median of a random variable $X$ is any value $m$ such that $P(X \le m) \ge \frac{1}{2}$ and $P(X \ge m) \ge \frac{1}{2}$. For a continuous random variable with CDF $F(x)$, it is the value $m$ for which $F(m) = 0.5$.
We observe that the median is less sensitive to extreme values (outliers) than the mean, a property known as robustness.
3. Mode
The mode of a random variable is the value at which its PMF or PDF takes its maximum value. A distribution can have one mode (unimodal), two modes (bimodal), or more (multimodal).
The mode of a random variable is the value that is most likely to occur. For a discrete random variable, it is the value $x$ that maximizes the PMF $p(x)$. For a continuous random variable, it is the value $x$ that maximizes the PDF $f(x)$.
---
Measures of Dispersion
While central tendency tells us about the location of a distribution, measures of dispersion describe its spread or variability.
1. Variance
Variance is the most common measure of dispersion. It quantifies the spread of a random variable's values around its mean. Specifically, it is the expected value of the squared deviation from the mean.
The variance of a random variable $X$ with mean $\mu = E[X]$, denoted as $\mathrm{Var}(X)$ or $\sigma^2$, is defined as $\mathrm{Var}(X) = E[(X - \mu)^2]$. A small variance indicates that the data points tend to be very close to the mean, while a high variance indicates that the data points are spread out over a wider range.
While the definition is intuitive, a more practical formula, often called the computational formula, is used for calculations. We can derive it as follows:
$$\mathrm{Var}(X) = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2]$$
By linearity of expectation,
$$= E[X^2] - 2\mu E[X] + E[\mu^2]$$
Since $\mu$ is a constant, $E[\mu^2] = \mu^2$ and $E[X] = \mu$. Therefore,
$$\mathrm{Var}(X) = E[X^2] - 2\mu^2 + \mu^2 = E[X^2] - (E[X])^2$$
Variables:
- $E[X^2]$: The expected value of the square of the random variable.
- $(E[X])^2$: The square of the expected value of the random variable.
When to use: This is the preferred formula for most GATE problems as it simplifies calculation. It requires computing two expectations: $E[X]$ and $E[X^2]$.
2. Standard Deviation
The standard deviation is simply the positive square root of the variance. Its primary advantage is that it is expressed in the same units as the random variable, making it more interpretable than the variance.
The standard deviation of a random variable $X$, denoted by $\sigma_X$, is the positive square root of its variance: $\sigma_X = +\sqrt{\mathrm{Var}(X)}$.
Worked Example:
Problem: For the discrete random variable $X$ from the previous example, calculate the variance and standard deviation. We already found the mean $E[X]$.
Solution:
Step 1: Calculate $E[X^2]$ using its definition for a discrete random variable: $E[X^2] = \sum_i x_i^2 \, p(x_i)$.
Step 2: Substitute the values and probabilities from the PMF into the sum.
Step 3: Apply the computational formula for variance: $\mathrm{Var}(X) = E[X^2] - (E[X])^2$.
Step 4: Substitute the calculated values of $E[X^2]$ and $E[X]$.
Step 5: Calculate the standard deviation by taking the square root of the variance: $\sigma_X = \sqrt{\mathrm{Var}(X)}$.
Answer: The variance is $E[X^2] - (E[X])^2$, and the standard deviation is its positive square root.
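The computational route is easy to check in Python. The PMF below is again an illustrative assumption (not the example's data); with it, $E[X] = 21/10$, $E[X^2] = 49/10$, so $\mathrm{Var}(X) = 49/100$ and $\sigma_X = 0.7$:

```python
from fractions import Fraction
from math import sqrt

# Hypothetical PMF (illustrative values, chosen for the demo).
pmf = {1: Fraction(1, 5), 2: Fraction(1, 2), 3: Fraction(3, 10)}

mean = sum(x * p for x, p in pmf.items())         # E[X]
mean_sq = sum(x * x * p for x, p in pmf.items())  # E[X^2]

variance = mean_sq - mean**2                      # Var(X) = E[X^2] - (E[X])^2
std_dev = sqrt(variance)                          # sigma = sqrt(Var(X))

print(variance)  # 49/100
print(std_dev)   # ~0.7
```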
---
Mean and Variance of Standard Distributions
For the GATE exam, it is imperative to know the mean and variance of several standard probability distributions by heart. Direct application of these formulas can save significant time.
| Distribution | Parameters | Mean ($E[X]$) | Variance ($\mathrm{Var}(X)$) |
| :--- | :--- | :--- | :--- |
| Binomial | $n$ (trials), $p$ (success prob.) | $np$ | $np(1-p)$ |
| Poisson | $\lambda$ (rate) | $\lambda$ | $\lambda$ |
| Uniform (Continuous) | $a$ (min), $b$ (max) | $\frac{a+b}{2}$ | $\frac{(b-a)^2}{12}$ |
| Exponential | $\lambda$ (rate) | $\frac{1}{\lambda}$ | $\frac{1}{\lambda^2}$ |
| Normal | $\mu$ (mean), $\sigma^2$ (variance) | $\mu$ | $\sigma^2$ |
| Standard Normal | $\mu = 0$, $\sigma^2 = 1$ | $0$ | $1$ |
The properties of the Poisson and Standard Normal distributions are frequently tested.
- For a Poisson random variable, the mean and variance are identical: $E[X] = \mathrm{Var}(X) = \lambda$.
- For a Standard Normal random variable, the mean is $0$ and the variance is $1$.
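One way to internalise the table is to re-derive a row from first principles. The sketch below builds the Binomial$(n, p)$ PMF with `math.comb` and confirms that its mean and variance match $np$ and $np(1-p)$ exactly (the parameter values are arbitrary choices for the check):

```python
from fractions import Fraction
from math import comb

# Check the table's Binomial row from first principles.
n, p = 10, Fraction(1, 6)

# P(X = k) = C(n, k) * p^k * (1-p)^(n-k)
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
assert sum(pmf.values()) == 1  # sanity check: probabilities sum to 1

mean = sum(k * q for k, q in pmf.items())
variance = sum(k * k * q for k, q in pmf.items()) - mean**2

print(mean, n * p)                # both 5/3
print(variance, n * p * (1 - p))  # both 25/18
```

Because `Fraction` arithmetic is exact, the equalities hold with `==`, not just approximately.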
---
Problem-Solving Strategies
A strategic approach is essential for solving problems under time constraints.
Before starting a lengthy calculation from a PMF or PDF, always check if the problem describes a standard distribution (Binomial, Poisson, etc.). If it does, you can use the known formulas for mean and variance directly, which is significantly faster than calculating from first principles.
For a transformed random variable $Y = aX + b$, where $a$ and $b$ are constants:
$$E[aX + b] = aE[X] + b \qquad \mathrm{Var}(aX + b) = a^2 \, \mathrm{Var}(X)$$
These properties are extremely useful for simplifying problems. Note that the additive constant $b$ does not affect the variance, as it simply shifts the distribution without changing its spread.
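These transformation rules are easy to verify numerically. The following sketch uses an arbitrary illustrative PMF (its values are assumptions for the demo) and checks both identities exactly:

```python
from fractions import Fraction

# Illustrative PMF (values chosen for the demo, not from the text).
pmf = {-1: Fraction(1, 4), 0: Fraction(1, 4), 2: Fraction(1, 2)}
a, b = 3, 5

def mean(p):
    return sum(x * q for x, q in p.items())

def var(p):
    return sum(x * x * q for x, q in p.items()) - mean(p) ** 2

# PMF of Y = aX + b: same probabilities attached to transformed values.
pmf_y = {a * x + b: q for x, q in pmf.items()}

assert mean(pmf_y) == a * mean(pmf) + b   # E[aX + b] = aE[X] + b
assert var(pmf_y) == a * a * var(pmf)     # Var(aX + b) = a^2 Var(X); b drops out
print(mean(pmf), var(pmf), mean(pmf_y), var(pmf_y))
```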
---
Common Mistakes
Awareness of common pitfalls can prevent the loss of valuable marks.
- β Confusing Standard Deviation and Variance: Students often provide the variance when the standard deviation is asked, or vice-versa.
- β Incorrectly Applying Variance Properties: A frequent error is to write .
- β Error in Computational Formula: Forgetting to square the mean in the variance formula, i.e., using $E[X^2] - E[X]$ instead of $E[X^2] - (E[X])^2$.
---
Practice Questions
:::question type="MCQ" question="A random variable has the probability mass function , , and . What is the variance of the random variable ?" options=["17/36","17/9","34/9","5/6"] answer="17/9" hint="First, find the variance of using the computational formula . Then, use the property ." solution="
Step 1: Calculate the mean of $X$: $E[X] = \sum_i x_i \, p(x_i)$.
Step 2: Calculate $E[X^2] = \sum_i x_i^2 \, p(x_i)$.
Step 3: Calculate the variance of $X$: $\mathrm{Var}(X) = E[X^2] - (E[X])^2$.
Step 4: Calculate the variance of $Y$ using the property $\mathrm{Var}(aX + b) = a^2 \, \mathrm{Var}(X)$.
Result: The variance of $Y$ is $\frac{17}{9}$.
"
:::
:::question type="NAT" question="A continuous random variable has a probability density function given by for , and otherwise. Calculate the mean of ." answer="1.5" hint="Use the formula for the expected value of a continuous random variable: over the defined range." solution="
Step 1: Set up the integral for the expected value: $E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx$.
Step 2: Substitute the given PDF and adjust the integration limits to the interval where $f(x)$ is non-zero.
Step 3: Simplify the integrand.
Step 4: Evaluate the integral.
Step 5: Substitute the limits of integration.
Result: $E[X] = \frac{3}{2} = 1.5$"
:::
:::question type="MSQ" question="Let be a random variable with mean and variance . Let a new random variable be defined as . Which of the following statements is/are correct?" options=["The mean of Y is -15.","The variance of Y is -8.","The variance of Y is 16.","The standard deviation of Y is 4."] answer="The mean of Y is -15.,The variance of Y is 16.,The standard deviation of Y is 4." hint="Apply the properties of expectation and variance for a linear transformation. and ." solution="
Let us analyze each statement.
The transformation is linear, of the form $Y = aX + b$ for the given constants $a$ and $b$.
Statement 1: The mean of Y is -15.
Using the property of expectation: $E[Y] = aE[X] + b$.
Substituting the given values of $a$, $b$, and $E[X]$ gives $E[Y] = -15$.
Thus, this statement is correct.
Statement 2: The variance of Y is -8.
Variance can never be negative. Thus, this statement is incorrect.
Statement 3: The variance of Y is 16.
Using the property of variance: $\mathrm{Var}(Y) = a^2 \, \mathrm{Var}(X)$.
Substituting the given values of $a$ and $\mathrm{Var}(X)$ gives $\mathrm{Var}(Y) = 16$.
Thus, this statement is correct.
Statement 4: The standard deviation of Y is 4.
The standard deviation is the square root of the variance: $\sigma_Y = \sqrt{16} = 4$.
Thus, this statement is correct.
Therefore, the correct options are: "The mean of Y is -15.", "The variance of Y is 16.", and "The standard deviation of Y is 4."
"
:::
:::question type="NAT" question="The number of defects on a semiconductor wafer follows a Poisson distribution with a mean of 2 defects per wafer. What is the standard deviation of the number of defects per wafer?" answer="1.414" hint="Recall the key property of a Poisson distribution regarding its mean and variance." solution="
Step 1: Identify the distribution and its parameter.
The problem states that the number of defects follows a Poisson distribution. The mean is given as 2 defects per wafer. For a Poisson distribution, the parameter is equal to the mean.
So, $\lambda = 2$.
Step 2: Recall the formula for the variance of a Poisson distribution.
For a Poisson random variable with parameter $\lambda$, the variance is given by $\mathrm{Var}(X) = \lambda$.
Step 3: Calculate the variance: $\mathrm{Var}(X) = \lambda = 2$.
Step 4: Calculate the standard deviation.
The standard deviation is the square root of the variance: $\sigma = \sqrt{2}$.
Step 5: Compute the numerical value: $\sqrt{2} \approx 1.414$.
Result: The standard deviation, rounded to three decimal places, is 1.414.
"
:::
:::question type="MCQ" question="A fair six-sided die is rolled 10 times. Let be the number of times the number '4' appears. What is the variance of ?" options=["50/36","5/3","25/18","5/6"] answer="25/18" hint="Recognize that this scenario describes a Binomial experiment. Identify the parameters and , and then use the formula for the variance of a Binomial distribution." solution="
Step 1: Identify the type of random variable.
The experiment consists of a fixed number of independent trials ($n = 10$). Each trial has two outcomes: success (rolling a '4') or failure (not rolling a '4'). The probability of success is constant for each trial. This is the definition of a Binomial experiment.
Step 2: Determine the parameters of the Binomial distribution.
The number of trials is $n = 10$.
The probability of success, $p$, is the probability of rolling a '4' on a fair die, which is $p = \frac{1}{6}$.
The probability of failure is $q = 1 - p = \frac{5}{6}$.
Step 3: Apply the formula for the variance of a Binomial random variable.
The variance of a Binomial distribution is given by $\mathrm{Var}(X) = np(1-p)$, or $npq$.
Step 4: Substitute the parameters and calculate the variance.
$$\mathrm{Var}(X) = 10 \times \frac{1}{6} \times \frac{5}{6} = \frac{50}{36}$$
Step 5: Simplify the fraction: $\frac{50}{36} = \frac{25}{18}$.
Result: The variance of $X$ is $\frac{25}{18}$.
"
:::
---
Summary
A firm grasp of central tendency and dispersion is non-negotiable for success in the Probability and Statistics section of the GATE exam. These concepts form the bedrock upon which more complex statistical ideas are built.
- Mean and Variance Definitions: Be fluent in calculating the mean ($E[X]$) and variance ($\mathrm{Var}(X)$) for both discrete (using summation) and continuous (using integration) random variables.
- The Computational Formula is Key: For variance calculations, almost always use $\mathrm{Var}(X) = E[X^2] - (E[X])^2$. It is faster and less prone to error than the definitional formula.
- Memorize Standard Distributions: You must be able to instantly recall the mean and variance for Binomial, Poisson, Uniform, and Normal distributions. Questions often test these properties directly, and recognizing them saves critical time. The facts that $E[X] = \mathrm{Var}(X) = \lambda$ for Poisson and $\mu = 0$, $\sigma^2 = 1$ for the Standard Normal are particularly high-yield.
---
What's Next?
Mastery of these fundamental measures prepares you for more advanced topics in probability theory.
This topic connects to:
- Probability Distributions: The mean and variance are defining characteristics of any probability distribution. A deep understanding of PMFs and PDFs is required to derive these measures from first principles.
- Covariance and Correlation: These concepts extend the idea of variance to two random variables. Covariance measures how two variables change together, building directly on the concepts of expectation and deviation from the mean.
- Chebyshev's Inequality: This powerful theorem uses the mean and standard deviation to provide a bound on the probability that a random variable lies a certain distance from its mean, regardless of the underlying distribution.
Master these connections for comprehensive GATE preparation!
---
Now that you understand Measures of Central Tendency and Dispersion, let's explore Correlation and Covariance which builds on these concepts.
---
## Part 3: Correlation and Covariance
Introduction
In our study of random variables, we have thus far focused primarily on the properties of a single variable, such as its mean and variance. The variance, in particular, quantifies the spread or dispersion of a variable's distribution around its mean. However, in many practical applications, we are interested in understanding the relationship between two or more random variables. Do they tend to move in the same direction, in opposite directions, or is there no discernible pattern to their joint behavior?
This chapter introduces two fundamental statistical measures that address this question: covariance and correlation. Covariance provides a measure of the joint variability of two random variables, indicating the direction of their linear relationship. Correlation, a standardized version of covariance, goes a step further by also quantifying the strength of this linear association. A firm grasp of these concepts is indispensable, as they form the bedrock of more advanced topics such as regression analysis and portfolio theory, and are frequently tested in the GATE examination.
The covariance between two random variables, $X$ and $Y$, with expected values $E[X]$ and $E[Y]$, is defined as the expected value of the product of their deviations from their respective means:
$$\mathrm{Cov}(X, Y) = E\big[(X - E[X])(Y - E[Y])\big]$$
---
Key Concepts
1. Understanding and Calculating Covariance
The definition of covariance, $\mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])]$, provides significant intuition. Consider the term $(X - E[X])(Y - E[Y])$. If $X$ and $Y$ tend to be simultaneously above their means or simultaneously below their means, this product will be positive on average. This results in a positive covariance, suggesting a positive linear relationship. Conversely, if one variable tends to be above its mean when the other is below its mean, the product will be negative on average, yielding a negative covariance. If there is no consistent linear pattern, the positive and negative products will tend to cancel out, resulting in a covariance near zero.
While the definitional formula is useful for conceptual understanding, a more practical formula is often used for computation, especially with discrete random variables. We can expand the definition:
$$\mathrm{Cov}(X, Y) = E\big[XY - X E[Y] - Y E[X] + E[X]E[Y]\big] = E[XY] - E[X]E[Y]$$
This leads to the widely used computational formula: $\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y]$.
Variables:
- $E[XY]$ = The expected value of the product of the random variables $X$ and $Y$.
- $E[X]$ = The expected value of $X$.
- $E[Y]$ = The expected value of $Y$.
When to use: This formula is almost always more convenient for calculation, particularly for discrete random variables where a joint probability mass function is available.
Worked Example:
Problem: A fair six-sided die is rolled. Let $X$ be a random variable that is 1 if the outcome is even and 0 if it is odd. Let $Y$ be a random variable that is 1 if the outcome is greater than 3, and 0 otherwise. Calculate the covariance between $X$ and $Y$.
Solution:
Step 1: Define the sample space and probabilities.
The sample space is $S = \{1, 2, 3, 4, 5, 6\}$, with each outcome having a probability of $\frac{1}{6}$.
Step 2: Determine the values of $X$ and $Y$ for each outcome and calculate $E[X]$ and $E[Y]$.
- For $X$: $X = 1$ for outcomes $\{2, 4, 6\}$; $X = 0$ for outcomes $\{1, 3, 5\}$.
- For $Y$: $Y = 1$ for outcomes $\{4, 5, 6\}$; $Y = 0$ for outcomes $\{1, 2, 3\}$.
$P(X = 1) = \frac{3}{6} = \frac{1}{2}$, so $E[X] = 1 \cdot \frac{1}{2} + 0 \cdot \frac{1}{2} = \frac{1}{2}$.
$P(Y = 1) = \frac{3}{6} = \frac{1}{2}$, so $E[Y] = \frac{1}{2}$.
Step 3: Determine the value of the product $XY$ for each outcome and calculate $E[XY]$.
We need to find the outcomes where $XY = 1$. This occurs only when both $X = 1$ (even) and $Y = 1$ (greater than 3). The outcomes satisfying this are $\{4, 6\}$.
Thus, $P(XY = 1) = \frac{2}{6} = \frac{1}{3}$, so $E[XY] = \frac{1}{3}$. For all other outcomes, $XY = 0$.
Step 4: Apply the computational formula for covariance.
$$\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y] = \frac{1}{3} - \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$$
Answer: The covariance between $X$ and $Y$ is $\frac{1}{12}$.
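The entire worked example can be verified by brute-force enumeration over the six equally likely outcomes:

```python
from fractions import Fraction

# X = 1 if the roll is even, Y = 1 if the roll is greater than 3.
outcomes = range(1, 7)
p = Fraction(1, 6)  # each outcome is equally likely

E_X  = sum(p * (1 if s % 2 == 0 else 0) for s in outcomes)
E_Y  = sum(p * (1 if s > 3 else 0) for s in outcomes)
E_XY = sum(p * (1 if (s % 2 == 0 and s > 3) else 0) for s in outcomes)

cov = E_XY - E_X * E_Y  # Cov(X, Y) = E[XY] - E[X]E[Y]
print(E_X, E_Y, E_XY, cov)  # 1/2 1/2 1/3 1/12
```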
---
2. Properties of Covariance
Understanding the properties of covariance is critical for simplifying complex expressions, a common requirement in GATE problems.
The key property is $\mathrm{Cov}(aX + b, \, cY + d) = ac \, \mathrm{Cov}(X, Y)$. Notice that the additive constants $b$ and $d$ do not affect the covariance, as they do not change the spread of the variables.
Worked Example:
Problem: Let $X$ be a random variable with known variance $\mathrm{Var}(X)$, and let $Y$ be a linear function of $X$, say $Y = aX + b$. Calculate $\mathrm{Cov}(X, Y)$.
Solution:
Step 1: Identify the relationship between $X$ and $Y$.
We are given $Y$ as a linear transformation of $X$: $Y = aX + b$.
Step 2: Apply the property of covariance under linear transformations.
We need to find $\mathrm{Cov}(X, Y) = \mathrm{Cov}(X, aX + b)$.
The general property is $\mathrm{Cov}(X, aX + b) = a \, \mathrm{Cov}(X, X)$.
Step 3: Use the relationship between covariance and variance.
We know that $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$.
Step 4: Substitute the given value of $\mathrm{Var}(X)$.
Answer: The covariance between $X$ and $Y$ is $a \, \mathrm{Var}(X)$.
---
## 3. Correlation Coefficient
A significant limitation of covariance is that its magnitude is scale-dependent. If we change the units of from meters to centimeters, the covariance will increase by a factor of 100, even though the underlying relationship between the variables has not changed. To overcome this, we use the correlation coefficient, which is a normalized measure.
The Pearson correlation coefficient between two random variables, $X$ and $Y$, is their covariance divided by the product of their standard deviations:
$$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$
The correlation coefficient, $\rho_{X,Y}$, is a dimensionless quantity that always lies in the range $[-1, 1]$.
- $\rho_{X,Y} = +1$: Perfect positive linear relationship.
- $\rho_{X,Y} = -1$: Perfect negative linear relationship.
- $\rho_{X,Y} = 0$: No linear relationship.
- Values between 0 and 1 indicate the strength of a positive linear relationship.
- Values between -1 and 0 indicate the strength of a negative linear relationship.
Correlation Coefficient
$$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X) \, \mathrm{Var}(Y)}}$$
Variables:
- $\mathrm{Cov}(X, Y)$ = Covariance of $X$ and $Y$.
- $\mathrm{Var}(X)$ = Variance of $X$.
- $\mathrm{Var}(Y)$ = Variance of $Y$.
Application: Use this to find a standardized measure of linear association, which is independent of the units of the variables.
A non-zero correlation between two variables does not, by itself, imply that one variable causes the other. There could be a third, unobserved variable (a confounding variable) influencing both, or the relationship could be purely coincidental.
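A quick numerical check of scale invariance, using made-up paired data (an assumption for illustration): converting heights from meters to centimeters multiplies the covariance by 100 but leaves ρ unchanged.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: Cov(x, y) / (sigma_x * sigma_y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

# Hypothetical data: heights (meters) and weights (kg).
heights_m = [1.60, 1.65, 1.70, 1.75, 1.80]
weights = [55.0, 60.0, 63.0, 70.0, 74.0]

r_m = pearson(heights_m, weights)
r_cm = pearson([100 * h for h in heights_m], weights)  # rescaled units

print(abs(r_m - r_cm) < 1e-12)  # True: correlation is unit-free
```

The covariance itself would scale by 100 under the same conversion, which is exactly why the normalized measure is preferred.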
---
4. Variance of Sums of Random Variables
Covariance plays a crucial role in determining the variance of a sum or difference of random variables.
Let us derive the formula for Var(X + Y):
Var(X + Y) = E[(X + Y)^2] - (E[X + Y])^2
By linearity of expectation, E[X + Y] = E[X] + E[Y], so expanding both squares:
Var(X + Y) = E[X^2] + 2E[XY] + E[Y^2] - (E[X])^2 - 2E[X]E[Y] - (E[Y])^2
This simplifies to:
Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)
Similarly, for the difference:
Var(X - Y) = Var(X) + Var(Y) - 2Cov(X, Y)
A special and very important case arises when X and Y are independent. If two variables are independent, their covariance is zero. (The converse is not always true.) For independent variables, the formulas simplify significantly:
Var(X ± Y) = Var(X) + Var(Y)
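The identity Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y) can be verified exactly on a small joint distribution; the joint PMF below is an illustrative assumption:

```python
from fractions import Fraction

# Hypothetical joint PMF of two dependent 0/1 variables.
joint = {(0, 0): Fraction(2, 10), (0, 1): Fraction(3, 10),
         (1, 0): Fraction(1, 10), (1, 1): Fraction(4, 10)}

def E(f):
    """Expectation of f(X, Y) under the joint PMF."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

E_X, E_Y = E(lambda x, y: x), E(lambda x, y: y)
var_X = E(lambda x, y: (x - E_X) ** 2)
var_Y = E(lambda x, y: (y - E_Y) ** 2)
cov = E(lambda x, y: (x - E_X) * (y - E_Y))
var_sum = E(lambda x, y: (x + y - E_X - E_Y) ** 2)  # Var(X + Y) directly

# The decomposition holds exactly:
print(var_sum == var_X + var_Y + 2 * cov)  # True
```

Because the variables here are dependent (cov ≠ 0), dropping the covariance term would give the wrong variance for the sum.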
---
Problem-Solving Strategies
For problems involving discrete random variables derived from an experiment (like coin tosses or dice rolls), follow a systematic procedure:
- List Outcomes: Enumerate all possible outcomes of the experiment and their probabilities.
- Create a Joint Table: Construct a table listing each outcome, its probability, and the corresponding values of X, Y, and the product XY.
- Calculate Marginal PMFs: From the joint table, determine the probability mass functions (PMFs) for X and Y individually.
- Compute Expectations: Calculate E[X] and E[Y] using their PMFs.
- Compute E[XY]: Calculate the expected value of the product XY directly from the joint table created in Step 2.
- Apply Formula: Substitute the computed values into Cov(X, Y) = E[XY] - E[X]E[Y].
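The procedure above can be sketched end to end; the joint table here is a hypothetical example, not one from the text:

```python
from fractions import Fraction

# Hypothetical joint PMF table (assumption), indexed as joint[(x, y)].
joint = {(0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
         (1, 0): Fraction(3, 8), (1, 1): Fraction(1, 8)}

# Marginal PMFs from the joint table.
p_X, p_Y = {}, {}
for (x, y), p in joint.items():
    p_X[x] = p_X.get(x, 0) + p
    p_Y[y] = p_Y.get(y, 0) + p

# Expectations from the marginals.
E_X = sum(x * p for x, p in p_X.items())
E_Y = sum(y * p for y, p in p_Y.items())

# E[XY] directly from the joint table.
E_XY = sum(x * y * p for (x, y), p in joint.items())

# Apply the formula Cov(X, Y) = E[XY] - E[X]E[Y].
cov = E_XY - E_X * E_Y
print(cov)  # -1/8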
---
Common Mistakes
- ❌ Confusing Zero Correlation with Independence: ρ = 0 only rules out a linear relationship; the variables may still be dependent.
- ❌ Incorrectly Applying Constants in Formulas: in Cov(aX + b, cY + d) = ac · Cov(X, Y), the additive constants drop out but the multiplicative ones do not.
- ❌ Assuming Variance of a Sum is the Sum of Variances: Var(X + Y) = Var(X) + Var(Y) holds only when the variables are uncorrelated.
---
Practice Questions
:::question type="MCQ" question="Let
Step 1: Calculate the expected value of
Since
Step 2: Calculate the variance of
The given term
Let's calculate
Since
Step 3: Calculate the covariance term from its definition.
The given term
Let's calculate
Using properties,
Since
Answer: \boxed{The definition of
"
:::
:::question type="NAT" question="Two balls are drawn with replacement from an urn containing 3 red balls and 2 blue balls. Let the random variable X be the number of red balls drawn and Y be the number of blue balls drawn. Find Cov(X, Y).
Step 1: Establish the relationship between X and Y.
Since two balls are drawn in total, the number of red balls (X) and the number of blue balls (Y) must satisfy X + Y = 2.
Step 2: Use the property of variance of a constant.
The variance of a constant is zero, so Var(X + Y) = Var(2) = 0.
Step 3: Expand Var(X + Y) using the sum formula: Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y).
Step 4: Equate the expressions from Step 2 and Step 3: Var(X) + Var(Y) + 2Cov(X, Y) = 0.
This implies: Cov(X, Y) = -(Var(X) + Var(Y))/2.
Step 5: Calculate Var(X) and Var(Y).
The drawing of each ball is a Bernoulli trial. Let a "success" be drawing a red ball. The probability of success is p = 3/5 = 0.6.
The variance of a binomial distribution is npq, so Var(X) = 2(0.6)(0.4) = 0.48.
Similarly, Var(Y) = 2(0.4)(0.6) = 0.48.
Step 6: Substitute the variances into the equation for covariance: Cov(X, Y) = -(0.48 + 0.48)/2 = -0.48.
Answer: \boxed{-0.48}
"
:::
:::question type="MSQ" question="Let
Let's evaluate each option.
Option A:
The formula for correlation is
So, statement A is correct.
Option B:
The formula is
So, statement B is correct.
Option C:
The formula is
So, statement C is correct.
Option D:
Using the property
Here
So, statement D is incorrect, which is why it is excluded from the answer.
Answer: \boxed{A, B, C}
"
:::
---
Summary
- Covariance measures the direction of a linear relationship. For computations, always use the formula Cov(X, Y) = E[XY] - E[X]E[Y].
- Correlation is the normalized version of covariance, ρ(X, Y) = Cov(X, Y)/(σ_X σ_Y), which measures both the strength and direction of the linear relationship, bounded between -1 and 1.
- Properties of Linear Transformations are frequently tested. Remember that Var(aX + b) = a^2 Var(X) and Cov(aX + b, cY + d) = ac · Cov(X, Y).
- The Variance of a Sum depends on covariance: Var(X ± Y) = Var(X) + Var(Y) ± 2Cov(X, Y). This simplifies only if the variables are uncorrelated.
---
What's Next?
A solid understanding of covariance and correlation is a prerequisite for several advanced topics in data analysis and statistics.
- Linear Regression: Correlation is the foundation of simple linear regression, which seeks to model the linear relationship between a dependent and an independent variable. The slope of the regression line is directly related to the covariance and variance of the variables.
- Multivariate Distributions: In the study of distributions involving multiple random variables (e.g., the Multivariate Normal Distribution), the relationships between pairs of variables are described by a covariance matrix, a critical component of the distribution's parameterization.
---
Now that you understand Correlation and Covariance, let's explore Conditional Expectation and Variance which builds on these concepts.
---
Part 4: Conditional Expectation and Variance
Introduction
In our study of random variables, we often seek to understand the relationship between them. While concepts like covariance and correlation provide a measure of linear association, they do not capture the full picture. Conditional expectation offers a more powerful and nuanced tool. It allows us to determine the expected value of one random variable, given that we have observed the outcome of another. This concept is fundamental to prediction and estimation, forming the bedrock of modern statistical modeling and machine learning.
Consider a scenario where we wish to predict a student's final exam score (Y) from some observed quantity (X), such as hours of study. The conditional expectation E[Y|X] is precisely this kind of prediction.
This chapter will formally define conditional expectation and its counterpart, conditional variance. We will explore their properties, most notably the Law of Total Expectation and the Law of Total Variance, which provide elegant methods for decomposing complex problems. Furthermore, we shall see how these concepts can be applied to solve intricate problems involving sequences of random events, a common feature in GATE questions.
Let X and Y be two random variables. The conditional expectation of Y given that X takes the value x is written E[Y|X = x].
It is crucial to distinguish between E[Y|X = x], which is a specific number (a function of the value x), and E[Y|X], which is a random variable (a function of X).
---
Key Concepts
1. Conditional Expectation for Discrete Random Variables
When dealing with discrete random variables, the conditional expectation is computed as a weighted average, where the weights are given by the conditional probability mass function (PMF).
First, we must define the conditional PMF of Y given X = x:
p_{Y|X}(y|x) = p_{X,Y}(x, y) / p_X(x), provided p_X(x) > 0.
Here, p_X(x) is the marginal PMF of X evaluated at x.
With this conditional PMF, we can now define the conditional expectation:
E[Y|X = x] = Σ_y y · p_{Y|X}(y|x)
Variables:
- y = A possible value of the random variable Y.
- p_{Y|X}(y|x) = The conditional PMF of Y given X = x.
When to use: Use this formula when both X and Y are discrete and their joint PMF is available.
Worked Example:
Problem: The joint PMF of two discrete random variables X and Y is given by the table below.
| | Y=0 | Y=1 | Y=2 |
| :--- | :--- | :--- | :--- |
| X=0 | 0.1 | 0.2 | 0.1 |
| X=1 | 0.3 | 0.1 | 0.2 |
Calculate the conditional expectation E[Y|X = 1].
Solution:
Step 1: Find the marginal PMF of X at x = 1: p_X(1) = 0.3 + 0.1 + 0.2 = 0.6.
Step 2: Determine the conditional PMF of Y given X = 1.
For y = 0: p_{Y|X}(0|1) = 0.3/0.6 = 1/2.
For y = 1: p_{Y|X}(1|1) = 0.1/0.6 = 1/6.
For y = 2: p_{Y|X}(2|1) = 0.2/0.6 = 1/3.
(As a check, we note that 1/2 + 1/6 + 1/3 = 1.)
Step 3: Apply the formula for conditional expectation: E[Y|X = 1] = 0(1/2) + 1(1/6) + 2(1/3).
Step 4: Compute the final value: E[Y|X = 1] = 1/6 + 2/3 = 5/6 ≈ 0.833.
Answer: E[Y|X = 1] = 5/6.
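Using the joint PMF table from this example, the computation can be checked mechanically. Assuming, for illustration, that the target is E[Y|X = 1]:

```python
from fractions import Fraction

# Joint PMF from the table above, joint[(x, y)] = P(X=x, Y=y).
F = Fraction
joint = {(0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
         (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10)}

def cond_expectation(x):
    """E[Y | X = x] computed via the conditional PMF p(y|x)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)  # marginal P(X=x)
    return sum(y * p / p_x for (xi, y), p in joint.items() if xi == x)

print(cond_expectation(1))  # 5/6
```

The same helper evaluated at x = 0 gives the other conditional mean, so the full random variable E[Y|X] is recovered from the table.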
---
2. Conditional Expectation for Continuous Random Variables
The logic for continuous variables is analogous to the discrete case, with sums replaced by integrals and PMFs by probability density functions (PDFs).
The conditional PDF of Y given X = x is defined as:
f_{Y|X}(y|x) = f_{X,Y}(x, y) / f_X(x),
where f_X(x) is the marginal PDF of X, assumed positive at x.
The conditional expectation is then E[Y|X = x] = ∫ y · f_{Y|X}(y|x) dy.
Variables:
- y = A possible value of the random variable Y.
- f_{Y|X}(y|x) = The conditional PDF of Y given X = x.
When to use: Use this when X and Y are jointly continuous with a known joint PDF.
Worked Example:
Problem: Let the joint PDF of random variables
Calculate
Solution:
Step 1: Find the marginal PDF of
This is valid for
Step 2: Evaluate the marginal PDF at the given condition,
Step 3: Determine the conditional PDF
This conditional PDF is valid for
Step 4: Apply the formula for conditional expectation.
Step 5: Compute the integral.
Step 6: Simplify to find the final answer.
Answer:
---
3. Properties of Conditional Expectation
The true power of conditional expectation is revealed through its properties, which simplify complex calculations and provide deep theoretical insights.
The Law of Total Expectation (Tower Property)
This is arguably the most important property of conditional expectation and is frequently tested in GATE. It states that the expected value of the conditional expectation of Y given X equals the unconditional expectation of Y: E[E[Y|X]] = E[Y].
Variables:
- E[Y|X] is a random variable, as it is a function of the random variable X.
- E[E[Y|X]] denotes taking the expectation of this new random variable.
When to use: This law is used to find an unconditional expectation when it is easier to first compute the expectation by conditioning on another variable. It is also a fundamental identity tested directly.
Let us demonstrate this for the continuous case.
$$\begin{aligned}
E[E[Y|X]] & = \int_{-\infty}^{\infty} E[Y|X=x] \, f_X(x) \, dx \\
& = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} y \cdot f_{Y|X}(y|x) \, dy \right) f_X(x) \, dx \\
\text{Since } f_{Y|X}(y|x) \cdot f_X(x) & = f_{X,Y}(x,y), \text{ we have:} \\
E[E[Y|X]] & = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \cdot f_{X,Y}(x,y) \, dy \, dx \\
\text{By changing the order of integration:} \\
E[E[Y|X]] & = \int_{-\infty}^{\infty} y \left( \int_{-\infty}^{\infty} f_{X,Y}(x,y) \, dx \right) \, dy \\
\text{The inner integral is the marginal PDF of Y, } f_Y(y). \\
E[E[Y|X]] & = \int_{-\infty}^{\infty} y \cdot f_Y(y) \, dy = E[Y]
\end{aligned}$$
This elegant result confirms the property.
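The same identity can be checked discretely. Reusing the joint PMF table from the discrete worked example earlier in this section, averaging E[Y|X = x] over the distribution of X recovers E[Y]:

```python
from fractions import Fraction

# Joint PMF from the earlier discrete worked example.
F = Fraction
joint = {(0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
         (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10)}

p_X = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}

def E_Y_given(x):
    """Conditional mean E[Y | X = x] from the table."""
    return sum(y * p / p_X[x] for (xi, y), p in joint.items() if xi == x)

# Left side: E[E[Y|X]] -- average the conditional means over P(X = x).
lhs = sum(p_X[x] * E_Y_given(x) for x in (0, 1))
# Right side: E[Y] computed directly from the joint PMF.
rhs = sum(y * p for (_, y), p in joint.items())

print(lhs == rhs)  # True
```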
Other Key Properties
- Linearity: E[aY + bZ | X] = a E[Y|X] + b E[Z|X].
- Taking out what is known: E[g(X) Y | X] = g(X) E[Y|X], since g(X) behaves as a constant once X is given.
- Independence: if X and Y are independent, then E[Y|X] = E[Y].
---
4. Conditional Variance and The Law of Total Variance
Similar to expectation, we can define the variance of a random variable conditional on the value of another.
The conditional variance of Y given X = x is defined as Var(Y|X = x) = E[(Y - E[Y|X = x])^2 | X = x].
A more convenient computational form is: Var(Y|X = x) = E[Y^2|X = x] - (E[Y|X = x])^2.
Like conditional expectation, Var(Y|X) is itself a random variable, being a function of X.
This leads to a decomposition formula for variance, analogous to the Law of Total Expectation, called the Law of Total Variance: Var(Y) = E[Var(Y|X)] + Var(E[Y|X]).
Variables:
- Var(Y) is the total variance of Y.
- E[Var(Y|X)] is the expected conditional variance. It represents the average amount of variance remaining in Y even after we know X.
- Var(E[Y|X]) is the variance of the conditional expectation. It represents the portion of the variance in Y that is explained by the variability of X.
When to use: This formula is extremely useful in situations where a random variable's variance is influenced by another random process. It breaks down the total variance into components.
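A minimal numeric check of the Law of Total Variance, again on the joint PMF table from the discrete worked example earlier in this section: the within-group and between-group pieces sum exactly to Var(Y).

```python
from fractions import Fraction

# Joint PMF from the earlier discrete worked example.
F = Fraction
joint = {(0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
         (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10)}

p_X = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}

def cond_moment(x, k):
    """E[Y**k | X = x]."""
    return sum(y ** k * p / p_X[x] for (xi, y), p in joint.items() if xi == x)

cond_mean = {x: cond_moment(x, 1) for x in (0, 1)}
cond_var = {x: cond_moment(x, 2) - cond_mean[x] ** 2 for x in (0, 1)}

E_cond_var = sum(p_X[x] * cond_var[x] for x in (0, 1))      # E[Var(Y|X)]
mean_of_means = sum(p_X[x] * cond_mean[x] for x in (0, 1))  # equals E[Y]
var_cond_mean = sum(p_X[x] * (cond_mean[x] - mean_of_means) ** 2
                    for x in (0, 1))                        # Var(E[Y|X])

E_Y2 = sum(y ** 2 * p for (_, y), p in joint.items())
var_Y = E_Y2 - mean_of_means ** 2                           # total Var(Y)

print(var_Y == E_cond_var + var_cond_mean)  # True
```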
---
5. Application: Recurrence Relations for Expected Values
A powerful application of conditional expectation is in solving problems involving sequences of trials, such as finding the expected number of steps to reach a certain state. This technique was implicitly tested in GATE. The core idea is to condition on the outcome of the first step.
Let E denote the expected number of trials required to reach the goal; we obtain an equation for E by conditioning on the outcome of the first trial.
Worked Example:
Problem: A fair coin is flipped repeatedly. What is the expected number of flips required to see the pattern HT (Heads followed by Tails)?
Solution:
Step 1: Define the states and the expected values from each state.
Let E be the expected number of flips starting from scratch (no useful progress yet).
Let E_H be the expected number of additional flips given that the last flip was H.
Step 2: Set up the equation for E.
The first flip is either H (with probability 1/2) or T (with probability 1/2).
If the first flip is T, we have wasted one flip and are back to the start. The expected number of additional flips is E.
If the first flip is H, we have used one flip and are now in state H. The expected number of additional flips is E_H. Hence:
E = (1/2)(1 + E) + (1/2)(1 + E_H) = 1 + (1/2)E + (1/2)E_H
Step 3: Set up the equation for E_H.
From state H, the next flip is either T (probability 1/2) or H (probability 1/2).
If the next flip is T, we have achieved the pattern HT. The process stops. Total additional flips: 1.
If the next flip is H, we have wasted a flip but are still in a state where the last flip was H. The expected number of additional flips is still E_H. Hence:
E_H = (1/2)(1) + (1/2)(1 + E_H) = 1 + (1/2)E_H
Step 4: Solve the system of linear equations. First, solve for E_H: E_H = 1 + (1/2)E_H gives E_H = 2.
Step 5: Substitute the value of E_H into the equation for E: E = 1 + (1/2)E + (1/2)(2), so (1/2)E = 2 and E = 4.
Answer: The expected number of flips is 4.
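The two state equations solve in closed form to E = 1/p + 1/q, where p = P(H) and q = 1 - p; the sketch below encodes that solution (the fair-coin case gives 4).

```python
from fractions import Fraction

def expected_flips_HT(p):
    """Expected flips until the pattern HT first appears, with P(heads) = p.
    From E_H = 1 + p*E_H we get E_H = 1/q, and from
    E = 1 + p*E_H + q*E we get E = 1/p + 1/q."""
    q = 1 - p
    E_H = 1 / q
    return 1 / p + E_H

print(expected_flips_HT(Fraction(1, 2)))  # 4 (fair coin)
print(expected_flips_HT(Fraction(1, 3)))  # 9/2, i.e. 4.5
```

Passing a `Fraction` keeps the result exact; a biased coin with p = 1/3 (or 2/3) gives 4.5 flips on average.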
---
Problem-Solving Strategies
For any problem of the form "Find E[Y|X = x]", follow these steps:
- Identify the type: Are the variables discrete or continuous?
- Find the marginal: Calculate the marginal distribution of the conditioning variable (p_X(x) or f_X(x)).
- Find the conditional distribution: Use the formula p_{Y|X}(y|x) = p_{X,Y}(x, y)/p_X(x) (or its continuous counterpart). This is the most critical step.
- Integrate/Sum: Apply the definition of expectation using the conditional distribution you just found: E[g(Y)|X = x] = ∫ g(y) f_{Y|X}(y|x) dy.
This structured approach prevents errors in calculation, especially with complex integration bounds.
For problems asking for the "expected number of trials until...", the key is to define states based on the progress towards the goal.
- Let E_i be the expected number of additional steps from state i.
- The initial state's expectation is what you want to find (e.g., E_0).
- For each state, write an equation for E_i by conditioning on the outcome of the next trial. The equation will be of the form E_i = 1 + Σ_j P(transition to j) · E_j.
- The "success" state has an expected additional number of steps of 0.
- Solve the resulting system of linear equations.
---
Common Mistakes
- ❌ Confusing E[Y|X] and E[Y|X = x]: Students often forget that E[Y|X] is a random variable (a function of X), while E[Y|X = x] is a specific value (a function of the number x). This distinction is crucial for understanding the Law of Total Expectation, E[E[Y|X]] = E[Y].
- ❌ Incorrect Marginalization: A frequent error in continuous problems is using incorrect limits when integrating the joint PDF to find the marginal PDF. Always carefully check the support of the joint PDF. For example, if 0 < y < x, the integral for f_X(x) must be over y from 0 to x, not over the full range of y.
- ❌ Misinterpreting the Law of Total Variance: A common mistake is to write Var(Y) = Var(E[Y|X]) + Var(Var(Y|X)) instead of the correct Var(Y) = Var(E[Y|X]) + E[Var(Y|X)]. Remember, you take the expectation of the conditional variance, not its variance.
---
Practice Questions
:::question type="MCQ" question="Let X and Y be random variables and let g be a function. Which expression equals E[g(X) Y | X = x]?
By the 'taking out what is known' property, once X = x is given, g(X) is the fixed number g(x) and can be factored out of the conditional expectation.
Therefore, we have: E[g(X) Y | X = x] = g(x) E[Y|X = x].
In this specific case, the conditioning pins g(X) to the constant g(x).
Answer: \boxed{g(x) E[Y|X=x]}"
:::
:::question type="NAT" question="The joint PDF of two random variables
Step 1: Find the marginal PDF
The limits for
This is for
Step 2: Find the conditional PDF
This is for
Step 3: Calculate
First, find the conditional PDF for
The range for
Step 4: Integrate to find the conditional expectation.
Result:
The value is approximately 0.3333... Rounding to 2 decimal places gives 0.33.
Answer: \boxed{0.33}"
:::
:::question type="MSQ" question="Let
- Option A: This is the Law of Total Expectation (or Tower Property), which is a fundamental property and is always true.
- Option B: This is an incorrect statement of the Law of Total Variance. The correct law is Var(Y) = Var(E[Y|X]) + E[Var(Y|X)]. The second term should be the expectation of the conditional variance, not the variance of the conditional variance. So, this statement is false.
- Option C: Using the linearity property and the 'taking out what is known' property: E[X + Y|X] = E[X|X] + E[Y|X]. Since E[X|X] = X, the statement simplifies to X + E[Y|X]. This is correct.
- Option D: Using the 'taking out what is known' property, we get E[XY|X] = X E[Y|X]. If X and Y are independent, then E[Y|X] = E[Y]. Substituting this in, we get E[XY|X] = X E[Y]. This statement is correct.
Therefore, options A, C, and D are always true.
Answer: \boxed{A, C, D}"
:::
:::question type="NAT" question="A coin has a probability
Step 1: Define states.
Let E be the expected total number of flips from the start.
Let E_H be the expected number of additional flips given that the last flip was H.
Let q = 1 - p denote the probability of tails.
Step 2: Formulate the equation for E.
From the start, the first flip is either H (with probability p) or T (with probability q).
- If T: We use 1 flip and are back to the start. Total flips: 1 + E.
- If H: We use 1 flip and move to state H. Total flips: 1 + E_H.
Step 3: Formulate the equation for E_H.
From state H, the next flip is either T (with probability q) or H (with probability p).
- If T: We use 1 flip and the pattern HT is achieved. The process stops. Total additional flips: 1.
- If H: We use 1 flip and are still in state H (the last flip was H). Total additional flips: 1 + E_H.
Step 4: Solve for E_H.
Since E_H = q(1) + p(1 + E_H) = 1 + p·E_H, we obtain E_H = 1/q.
Step 5: Substitute E_H into the equation for E.
Since E = p(1 + E_H) + q(1 + E) = 1 + p·E_H + q·E, solving gives E = 1/p + E_H = 1/p + 1/q.
Step 6: Calculate the value using the given probability of heads.
Result: E = 1/p + 1/q = 4.5 for the given p.
The expected number of flips is 4.5.
Answer: \boxed{4.5}"
:::
---
Summary
- Law of Total Expectation: The most fundamental property is E[Y] = E[E[Y|X]]. This allows breaking down a complex expectation calculation by conditioning on a suitable random variable. It is frequently tested directly as a theoretical question.
- Calculation Procedure: For computational problems involving E[Y|X = x], always follow the three-step process: find the marginal distribution of X, then find the conditional distribution of Y given X, and finally compute the expectation using that conditional distribution. Be meticulous with the limits of integration or summation.
- Recurrence Relations: For problems asking for the expected time or trials to an event, the method of conditioning on the first step is extremely effective. Define states representing progress towards the goal and set up a system of linear equations for the expected values from each state.
---
What's Next?
This topic serves as a foundation for several advanced areas in data analysis and probability.
- Regression Analysis: The conditional expectation E[Y|X = x] is precisely the regression function of Y on X. It represents the best possible prediction of Y given X, in the sense of minimizing mean squared error.
- Markov Chains: The state-based approach we used for recurrence problems is the essence of analyzing discrete-time Markov chains. The concept of conditioning on the previous state is central to the Markov property.
- Bayesian Statistics: Conditional distributions are the heart of Bayesian inference. Bayes' theorem is used to update our belief about a parameter (a conditional distribution) after observing data.
---
Chapter Summary
In this chapter, we have introduced the fundamental concept of a random variable and the mathematical tools used to characterize its behavior. A thorough understanding of these principles is essential for subsequent topics in probability and statistics. The most critical concepts to retain are as follows:
- Fundamental Definition: A random variable is a function that assigns a numerical value to each outcome in the sample space of a random experiment. We distinguish between discrete random variables, which take on a countable number of values and are described by a Probability Mass Function (PMF), and continuous random variables, which take values in an interval and are described by a Probability Density Function (PDF).
- Expectation: The expected value, or mean, of a random variable X, denoted E[X], represents its long-term average. It is the center of mass of the probability distribution. For a discrete variable, E[X] = Σ_x x · p_X(x), and for a continuous variable, E[X] = ∫ x · f_X(x) dx.
- Variance and Standard Deviation: Variance, Var(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2, is the primary measure of the dispersion or spread of a distribution around its mean. The standard deviation, σ_X = √Var(X), expresses this spread in the original units of the variable.
- Covariance and Correlation: For two random variables X and Y, covariance, Cov(X, Y) = E[XY] - E[X]E[Y], measures their joint variability. The correlation coefficient, ρ(X, Y) = Cov(X, Y)/(σ_X σ_Y), normalizes this measure to the range [-1, 1].
- Properties of Expectation and Variance: The linearity of expectation, E[aX + bY] = aE[X] + bE[Y], is a universally applicable and powerful tool. The variance of a linear combination is given by Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab · Cov(X, Y). The covariance term vanishes if and only if the variables are uncorrelated.
- Conditional Expectation: The conditional expectation E[X|Y = y] is the expected value of X given that the random variable Y has taken the specific value y. A cornerstone result is the Law of Total Expectation, E[X] = E[E[X|Y]], which allows us to compute an expectation by conditioning on another related variable.
---
Chapter Review Questions
:::question type="MCQ" question="Let
We are asked to find the variance of
In our case,
We are given
where
Given
Given
Now, we can find the covariance:
Finally, we substitute all the values back into the variance formula for
Therefore, the variance of
"
:::
:::question type="NAT" question="A continuous random variable
Let
We want to find the conditional expectation
First, we calculate the probability of this event:
The conditional PDF of
For
And
Now, we can compute the conditional expectation:
The value of the conditional expectation is
"
:::
:::question type="MCQ" question="A fair six-sided die is rolled. Let
The random variable
First, we calculate the expected value of
The question asks for the value of
We recognize that
We can calculate
Now, we can find the variance:
As a decimal,
Rounding to three decimal places, the answer is
"
:::
:::question type="NAT" question="Two discrete random variables
To find the covariance
1. Calculate marginal PMFs and expectations
The marginal PMF of
The expected value of
The marginal PMF of
The expected value of
2. Calculate
The expectation of the product
The only terms that are non-zero are when both
3. Calculate Covariance:
Now we substitute the computed values into the covariance formula:
The covariance is
"
:::
---
What's Next?
Having completed this chapter on Random Variables, you have established a firm foundation in the language and mathematics used to describe and analyze random phenomena. These concepts are not an endpoint but rather a critical stepping stone in your preparation.
Key connections:
- Relation to Previous Chapters: This chapter builds directly upon the fundamentals of Set Theory and Probability. The sample spaces and events we studied previously are now mapped to numerical values, allowing us to use the tools of calculus and algebra. The axioms of probability provide the rigorous underpinning for the properties of PMFs and PDFs.
- Foundation for Future Chapters: The concepts mastered here are indispensable for the chapters that follow: