Random Variables
Overview
In our preceding discussions, we established the foundational principles of probability theory by examining sample spaces and events. However, to perform a rigorous quantitative analysis of random phenomena, we must bridge the gap between abstract outcomes and numerical values. This chapter introduces the concept of a random variable, a fundamental construct that assigns a numerical value to every possible outcome of a random experiment. The formalization of this concept is a cornerstone of modern statistics and machine learning, allowing us to apply the powerful tools of mathematical analysis to uncertain events.
We shall begin by defining discrete and continuous random variables and exploring their associated probability distributions. Subsequently, we will investigate the essential characteristics of these distributions through measures of central tendency and dispersion. We will learn to compute and interpret quantities such as the expected value ($E[X]$) and variance ($\mathrm{Var}(X)$), which provide concise summaries of a variable's behavior. The chapter then progresses to the study of relationships between multiple random variables, introducing covariance and correlation as measures of linear dependence. A firm grasp of these concepts is indispensable for understanding feature interactions in data analysis and machine learning models, a topic of significant importance for the GATE examination.
Finally, we culminate our study with an examination of conditional expectation and variance. This advanced topic addresses how our knowledge or assumptions about one event or variable can alter our expectations about another. This principle forms the basis for many predictive models and inferential techniques encountered in the Data Science and AI syllabus. A thorough understanding of the material presented herein is therefore critical for solving complex problems and building a robust theoretical foundation for subsequent topics.
---
Chapter Contents
| # | Topic | What You'll Learn |
|---|-------|-------------------|
| 1 | Definition of Random Variables | Formalizing numerical outcomes of random experiments. |
| 2 | Measures of Central Tendency and Dispersion | Quantifying the center and spread of distributions. |
| 3 | Correlation and Covariance | Analyzing linear dependence between random variables. |
| 4 | Conditional Expectation and Variance | Updating expectations with new information. |
---
Learning Objectives
After completing this chapter, you will be able to:
- Define a random variable and differentiate between discrete and continuous types.
- Calculate and interpret the expected value, variance, and standard deviation of a random variable.
- Compute the covariance and correlation between two random variables to assess their linear relationship.
- Determine the conditional expectation and variance of a random variable given an event or another variable.
---
We now turn our attention to Definition of Random Variables...
## Part 1: Definition of Random Variables
Introduction
In our study of probability, we are often concerned not with the specific outcomes of an experiment, but rather with some numerical property associated with those outcomes. For instance, in an experiment involving the tossing of two coins, we might be more interested in the number of heads that appear than in the exact sequence of heads and tails. This need to associate a numerical value with each outcome of a random experiment leads us to the fundamental concept of a random variable.
A random variable provides a means of mapping the, often non-numerical, outcomes in a sample space to a set of real numbers. This transformation is crucial as it allows us to apply the powerful tools of calculus and mathematical analysis to the study of probability and statistics. We can analyze distributions, calculate expected values, and determine variances, all of which are central to data analysis and inference. Understanding the formal definition and classification of random variables is the first essential step in this direction.
A random variable, typically denoted by a capital letter such as $X$, is a function that assigns a real number to each outcome in the sample space of a random experiment. Formally, it is a mapping $X : S \to \mathbb{R}$.
We use a capital letter, $X$, to represent the random variable as a function, and a lowercase letter, $x$, to represent a specific value that the random variable can take. The set of all possible values of $X$ is called the range or support of the random variable.
---
Key Concepts
The most fundamental classification of random variables is based on the nature of the values they can assume. This leads to two primary types: discrete and continuous random variables.
## 1. Discrete Random Variables
A random variable is said to be discrete if its range is finite or countably infinite. This means that the variable can only take on a specific, separated set of values. There are "gaps" between the possible values.
Consider the experiment of rolling a standard six-sided die. The sample space is $S = \{1, 2, 3, 4, 5, 6\}$. If we define a random variable $X$ as the outcome of the roll, then $X$ can take values from the set $\{1, 2, 3, 4, 5, 6\}$. Since this set is finite, $X$ is a discrete random variable. Similarly, the number of defective items in a batch of 100 is a discrete random variable, as it can take integer values from 0 to 100.
Worked Example:
Problem: An experiment consists of tossing two fair coins. Let the random variable $X$ be defined as the number of heads observed. Determine the sample space and the set of possible values for $X$.
Solution:
Step 1: Define the sample space $S$.
The sample space consists of all possible outcomes of tossing two coins. Let H denote Heads and T denote Tails: $S = \{HH, HT, TH, TT\}$.
Step 2: Apply the function $X$ to each outcome in $S$.
The random variable $X$ counts the number of heads in each outcome.
For the outcome $HH$, the number of heads is 2. So, $X(HH) = 2$.
For the outcome $HT$, the number of heads is 1. So, $X(HT) = 1$.
For the outcome $TH$, the number of heads is 1. So, $X(TH) = 1$.
For the outcome $TT$, the number of heads is 0. So, $X(TT) = 0$.
Step 3: List the set of all possible values for $X$.
The range of the random variable is the set of all unique numerical values it can take: $\{0, 1, 2\}$.
Answer: The sample space is $S = \{HH, HT, TH, TT\}$ and the random variable $X$ can take values from the set $\{0, 1, 2\}$. Since this set is finite, $X$ is a discrete random variable.
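As a quick aside (an illustration, not part of the problem itself), the mapping in this worked example can be enumerated in a few lines of Python:

```python
from itertools import product

# Enumerate the sample space of tossing two fair coins.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=2)]

# The random variable X maps each outcome to its number of heads.
X = {outcome: outcome.count("H") for outcome in sample_space}

# The range (support) of X is the set of distinct values it can take.
support = sorted(set(X.values()))

print(sample_space)               # ['HH', 'HT', 'TH', 'TT']
print(X["HH"], X["HT"], X["TT"])  # 2 1 0
print(support)                    # [0, 1, 2]
```

Note that two distinct outcomes (HT and TH) map to the same value 1; a random variable need not be one-to-one.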
---
## 2. Continuous Random Variables
A random variable is said to be continuous if its range is an interval or a collection of intervals on the real number line. This means the variable can take on any value within a given range; there are uncountably many possible values.
For example, if we define a random variable $X$ as the height of a randomly selected student, $X$ can take any value within a certain range, say the interval $[150, 200]$ centimetres. It is not restricted to integer values; a height of $172.5$ cm is perfectly possible. Other examples include temperature, weight, and time. For a continuous random variable, the probability of it taking any single specific value is zero, i.e., $P(X = x) = 0$ for every individual value $x$. We instead focus on the probability that the variable falls within a certain interval.
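To make the interval idea concrete, here is a small Python sketch for a uniformly distributed "height" on a hypothetical interval of 150 cm to 200 cm (the interval and the uniformity are illustrative assumptions, not from a specific problem):

```python
# Hypothetical continuous uniform variable X ~ Uniform(150, 200), in cm.
a, b = 150.0, 200.0

def prob_interval(lo: float, hi: float) -> float:
    """P(lo <= X <= hi) for a uniform X: interval length / total length."""
    lo, hi = max(lo, a), min(hi, b)        # clip to the support [a, b]
    return max(hi - lo, 0.0) / (b - a)

print(prob_interval(160, 180))  # 0.4 -> a 20 cm window out of 50 cm
print(prob_interval(170, 170))  # 0.0 -> any single point has probability zero
```

The second call illustrates the key point of this section: shrinking the interval to a single point drives the probability to exactly zero.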
---
Problem-Solving Strategies
When presented with a problem, the first step is to determine whether the random variable is discrete or continuous.
- Ask: Can I count the possible outcomes? If the values are countable (e.g., number of successes, number of arrivals, results of a die roll), it is discrete.
- Ask: Is the variable measured? If the value is obtained by measurement (e.g., height, weight, time, temperature), it can take any value in an interval and is continuous.
This initial classification dictates the entire subsequent approach, including the type of probability distribution (PMF vs. PDF) to be used.
---
Common Mistakes
- β Confusing a random variable with an algebraic variable. A random variable is a function that maps outcomes to numbers. An algebraic variable is simply an unknown quantity.
- β Assuming all numerical variables are continuous. The number of cars passing a toll booth in an hour is numerical, but it is discrete (0, 1, 2, ...). It cannot be 2.5. Always check if the values are countable or if they fall within a continuous range.
- β Incorrectly defining the range. For an experiment of drawing 3 balls from a bag of 5 red and 5 blue balls, if $X$ is the number of red balls drawn, the range is $\{0, 1, 2, 3\}$, not all real numbers from 0 to 3.
---
Practice Questions
:::question type="MCQ" question="Which of the following is an example of a discrete random variable?" options=["The height of a building", "The time taken to complete a race", "The number of defective items in a shipment of 1000 items", "The temperature of a room in Celsius"] answer="The number of defective items in a shipment of 1000 items" hint="A discrete variable's values can be counted. Which of the options represents a counted quantity rather than a measured one?" solution="Let's analyze the options.
- The height of a building is a measurement and can take any value within a range, making it continuous.
- The time taken to complete a race is also a measurement and is continuous.
- The number of defective items is a count (0, 1, 2, ..., 1000). This set of values is finite and countable. Therefore, it is a discrete random variable.
- The temperature of a room is a measurement and is continuous.
:::
:::question type="NAT" question="A box contains 4 red and 3 green balls. An experiment consists of drawing 2 balls from the box without replacement. Let the random variable Y be the number of green balls drawn. What is the number of distinct values that Y can take?" answer="3" hint="Consider the minimum and maximum number of green balls you can possibly draw in this experiment." solution="Let Y be the number of green balls drawn.
The experiment involves drawing 2 balls from a total of 7.
- We could draw 0 green balls (meaning 2 red balls are drawn). This is possible since there are 4 red balls. So, Y can be 0.
- We could draw 1 green ball and 1 red ball. This is possible. So, Y can be 1.
- We could draw 2 green balls. This is possible since there are 3 green balls. So, Y can be 2.
- We cannot draw 3 green balls, as we are only drawing 2 balls in total.
The set of possible values for Y is $\{0, 1, 2\}$.
The number of distinct values is 3.
Answer: \boxed{3}"
:::
:::question type="MSQ" question="Let S be the sample space of a random experiment. Let X be a random variable defined as a function $X : S \to \mathbb{R}$. Which of the following statements are ALWAYS true?" options=["X is always a discrete random variable.", "The range of X is a subset of the real numbers.", "If the sample space S is finite, then X must be a discrete random variable.", "If X is a continuous random variable, the sample space S must be infinite."] answer="The range of X is a subset of the real numbers.,If the sample space S is finite, then X must be a discrete random variable." hint="Review the fundamental definition of a random variable and the properties of discrete vs. continuous variables." solution="Let's evaluate each statement:
- Statement 1 is not always true: a random variable may be continuous, for example one whose range is an interval of real numbers.
- Statement 2 is always true: by definition, X maps every outcome in S to a real number, so its range is a subset of the real numbers.
- Statement 3 is always true: if S is finite, the range of X is also finite, and a finite range means X is discrete.
Based on the fundamental definitions, statements 2 and 3 are the most direct and fundamental consequences of the definition of a random variable.
Answer: \boxed{\text{The range of X is a subset of the real numbers.,If the sample space S is finite, then X must be a discrete random variable.}}"
:::
---
Summary
- A Random Variable is a function that maps outcomes from a sample space to the set of real numbers $\mathbb{R}$.
- The primary classification is between Discrete Random Variables (countable values, e.g., number of defects) and Continuous Random Variables (values in an interval, e.g., height or weight).
- The first step in any random variable problem is to correctly identify its type, as this determines the entire analytical approach.
---
What's Next?
This topic is the foundation for understanding how we model random phenomena numerically. Your understanding will be deepened by studying:
- Probability Distributions: How probabilities are assigned to the values of a random variable. This involves learning about Probability Mass Functions (PMF) for discrete variables and Probability Density Functions (PDF) for continuous variables.
- Expectation and Variance: How to calculate the central tendency (mean) and spread (variance) of a random variable, which are crucial measures for summarizing its behavior.
Mastering the definition and classification of random variables is essential before proceeding to these more advanced concepts.
---
Now that you understand Definition of Random Variables, let's explore Measures of Central Tendency and Dispersion which builds on these concepts.
---
## Part 2: Measures of Central Tendency and Dispersion
Introduction
In the study of random variables, it is often insufficient to know only the full probability distribution. For many applications in data analysis and statistical inference, we require concise numerical summaries that describe the essential features of the distribution. These summaries are broadly categorized into two types: measures of central tendency and measures of dispersion.
Measures of central tendency aim to identify a single value that represents the "center" or "typical" value of a random variable. The most common of these is the mean, or expected value, which provides a long-run average of the outcomes. Measures of dispersion, conversely, quantify the variability or spread of the random variable's possible values around this central point. The primary measures here are the variance and its square root, the standard deviation. A thorough understanding of these measures is not merely a procedural exercise; it is fundamental to interpreting probabilistic models and making informed decisions based on data. In the context of the GATE examination, questions frequently test the direct calculation of these measures as well as their known properties for standard probability distributions.
A summary statistic is a single number that is computed from a probability distribution (or a sample of data) to summarize a specific characteristic of that distribution. Measures of central tendency and dispersion are the most fundamental types of summary statistics.
---
Measures of Central Tendency
These measures provide a single value that attempts to describe a set of data by identifying the central position within that set of data.
1. Mean (Expected Value)
The most important measure of central tendency is the mean, also known as the expected value. For a random variable $X$, its expected value, denoted as $E[X]$ or $\mu_X$, represents the weighted average of all possible values that $X$ can take, where the weights are the corresponding probabilities.
The expected value of a random variable is the long-run average value of repetitions of the experiment it represents. It is the center of mass of the probability distribution.
For a discrete random variable $X$ with a set of possible values $S$ and a probability mass function (PMF) $p(x)$, the expected value is the sum of each value multiplied by its probability:
$$E[X] = \sum_{x \in S} x \, p(x)$$
Variables:
- $x$: A possible value of the random variable $X$.
- $p(x)$: The probability that $X$ takes the value $x$, i.e., $P(X = x)$.
- $S$: The sample space or set of all possible values for $X$.
When to use: When given the PMF of a discrete random variable and asked to find its mean.
For a continuous random variable $X$ with a probability density function (PDF) $f(x)$, the expected value is found by integrating the product of $x$ and $f(x)$ over the entire range of $X$:
$$E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx$$
Variables:
- $x$: A variable representing the values of the random variable $X$.
- $f(x)$: The probability density function of $X$.
When to use: When given the PDF of a continuous random variable and asked to find its mean.
Worked Example:
Problem: A discrete random variable $X$ has a given probability mass function specifying a probability $p(x_i)$ for each of its values $x_i$. Calculate the mean of $X$.
Solution:
Step 1: Identify the values $x_i$ of $X$ and their corresponding probabilities $p(x_i)$ from the given PMF.
Step 2: Apply the formula for the expected value of a discrete random variable, $E[X] = \sum_i x_i \, p(x_i)$.
Step 3: Substitute each value and its probability into the sum.
Step 4: Calculate the final result.
Answer: The mean of $X$ is the weighted average computed in Step 4.
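The same computation can be carried out in Python. The PMF below is an illustrative assumption chosen for the demo (the values 1, 2, 3 and their probabilities are not from the text); `fractions.Fraction` keeps the arithmetic exact:

```python
from fractions import Fraction

# Hypothetical PMF for illustration: P(X=1)=1/5, P(X=2)=1/2, P(X=3)=3/10.
pmf = {1: Fraction(1, 5), 2: Fraction(1, 2), 3: Fraction(3, 10)}
assert sum(pmf.values()) == 1  # a valid PMF must sum to 1

# E[X] = sum over x of x * p(x)
mean = sum(x * p for x, p in pmf.items())
print(mean)  # 21/10, i.e. 2.1
```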
2. Median
The median is the value that separates the higher half from the lower half of a probability distribution. For a continuous random variable with cumulative distribution function (CDF) $F(x)$, the median is the value $m$ such that $F(m) = 0.5$.
The median of a random variable $X$ is any value $m$ such that $P(X \le m) \ge \frac{1}{2}$ and $P(X \ge m) \ge \frac{1}{2}$. For a continuous random variable with CDF $F(x)$, it is the value $m$ for which $F(m) = 0.5$.
We observe that the median is less sensitive to extreme values (outliers) than the mean, a property known as robustness.
3. Mode
The mode of a random variable is the value at which its PMF or PDF takes its maximum value. A distribution can have one mode (unimodal), two modes (bimodal), or more (multimodal).
The mode of a random variable is the value that is most likely to occur. For a discrete random variable, it is the value $x$ that maximizes the PMF $p(x)$. For a continuous random variable, it is the value $x$ that maximizes the PDF $f(x)$.
---
Measures of Dispersion
While central tendency tells us about the location of a distribution, measures of dispersion describe its spread or variability.
1. Variance
Variance is the most common measure of dispersion. It quantifies the spread of a random variable's values around its mean. Specifically, it is the expected value of the squared deviation from the mean.
The variance of a random variable $X$ with mean $\mu = E[X]$, denoted as $\mathrm{Var}(X)$ or $\sigma^2$, is defined as $\mathrm{Var}(X) = E[(X - \mu)^2]$. A small variance indicates that the data points tend to be very close to the mean, while a high variance indicates that the data points are spread out over a wider range.
While the definition is intuitive, a more practical formula, often called the computational formula, is used for calculations. We can derive it as follows:
$$\mathrm{Var}(X) = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2]$$
By linearity of expectation,
$$= E[X^2] - 2\mu E[X] + E[\mu^2]$$
Since $\mu$ is a constant, $E[\mu^2] = \mu^2$ and $E[X] = \mu$. Therefore,
$$\mathrm{Var}(X) = E[X^2] - 2\mu^2 + \mu^2 = E[X^2] - (E[X])^2$$
Variables:
- $E[X^2]$: The expected value of the square of the random variable.
- $(E[X])^2$: The square of the expected value of the random variable.
When to use: This is the preferred formula for most GATE problems as it simplifies calculation. It requires computing two expectations: $E[X]$ and $E[X^2]$.
2. Standard Deviation
The standard deviation is simply the positive square root of the variance. Its primary advantage is that it is expressed in the same units as the random variable, making it more interpretable than the variance.
The standard deviation of a random variable $X$, denoted by $\sigma_X$, is the positive square root of its variance: $\sigma_X = +\sqrt{\mathrm{Var}(X)}$.
Worked Example:
Problem: For the discrete random variable $X$ from the previous example, calculate the variance and standard deviation. We already found the mean $E[X]$.
Solution:
Step 1: Calculate $E[X^2]$ using its definition for a discrete random variable: $E[X^2] = \sum_i x_i^2 \, p(x_i)$.
Step 2: Substitute the values and probabilities from the PMF into the sum.
Step 3: Apply the computational formula for variance: $\mathrm{Var}(X) = E[X^2] - (E[X])^2$.
Step 4: Substitute the calculated values of $E[X^2]$ and $E[X]$.
Step 5: Calculate the standard deviation by taking the square root of the variance: $\sigma_X = \sqrt{\mathrm{Var}(X)}$.
Answer: The variance is $E[X^2] - (E[X])^2$, and the standard deviation is its positive square root.
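The computational route is easy to check in Python. The PMF below is again an illustrative assumption (not the example's data); with it, $E[X] = 21/10$, $E[X^2] = 49/10$, so $\mathrm{Var}(X) = 49/100$ and $\sigma_X = 0.7$:

```python
from fractions import Fraction
from math import sqrt

# Hypothetical PMF (illustrative values, chosen for the demo).
pmf = {1: Fraction(1, 5), 2: Fraction(1, 2), 3: Fraction(3, 10)}

mean = sum(x * p for x, p in pmf.items())         # E[X]
mean_sq = sum(x * x * p for x, p in pmf.items())  # E[X^2]

variance = mean_sq - mean**2                      # Var(X) = E[X^2] - (E[X])^2
std_dev = sqrt(variance)                          # sigma = sqrt(Var(X))

print(variance)  # 49/100
print(std_dev)   # ~0.7
```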
---
Mean and Variance of Standard Distributions
For the GATE exam, it is imperative to know the mean and variance of several standard probability distributions by heart. Direct application of these formulas can save significant time.
| Distribution | Parameters | Mean ($E[X]$) | Variance ($\mathrm{Var}(X)$) |
| :--- | :--- | :--- | :--- |
| Binomial | $n$ (trials), $p$ (success prob.) | $np$ | $np(1-p)$ |
| Poisson | $\lambda$ (rate) | $\lambda$ | $\lambda$ |
| Uniform (Continuous) | $a$ (min), $b$ (max) | $\frac{a+b}{2}$ | $\frac{(b-a)^2}{12}$ |
| Exponential | $\lambda$ (rate) | $\frac{1}{\lambda}$ | $\frac{1}{\lambda^2}$ |
| Normal | $\mu$ (mean), $\sigma^2$ (variance) | $\mu$ | $\sigma^2$ |
| Standard Normal | $\mu = 0$, $\sigma^2 = 1$ | $0$ | $1$ |
The properties of the Poisson and Standard Normal distributions are frequently tested.
- For a Poisson random variable, the mean and variance are identical: $E[X] = \mathrm{Var}(X) = \lambda$.
- For a Standard Normal random variable, the mean is $0$ and the variance is $1$.
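One way to internalise the table is to re-derive a row from first principles. The sketch below builds the Binomial$(n, p)$ PMF with `math.comb` and confirms that its mean and variance match $np$ and $np(1-p)$ exactly (the parameter values are arbitrary choices for the check):

```python
from fractions import Fraction
from math import comb

# Check the table's Binomial row from first principles.
n, p = 10, Fraction(1, 6)

# P(X = k) = C(n, k) * p^k * (1-p)^(n-k)
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
assert sum(pmf.values()) == 1  # sanity check: probabilities sum to 1

mean = sum(k * q for k, q in pmf.items())
variance = sum(k * k * q for k, q in pmf.items()) - mean**2

print(mean, n * p)                # both 5/3
print(variance, n * p * (1 - p))  # both 25/18
```

Because `Fraction` arithmetic is exact, the equalities hold with `==`, not just approximately.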
---
Problem-Solving Strategies
A strategic approach is essential for solving problems under time constraints.
Before starting a lengthy calculation from a PMF or PDF, always check if the problem describes a standard distribution (Binomial, Poisson, etc.). If it does, you can use the known formulas for mean and variance directly, which is significantly faster than calculating from first principles.
For a transformed random variable $Y = aX + b$, where $a$ and $b$ are constants:
$$E[aX + b] = aE[X] + b \qquad \mathrm{Var}(aX + b) = a^2 \, \mathrm{Var}(X)$$
These properties are extremely useful for simplifying problems. Note that the additive constant $b$ does not affect the variance, as it simply shifts the distribution without changing its spread.
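These transformation rules are easy to verify numerically. The following sketch uses an arbitrary illustrative PMF (its values are assumptions for the demo) and checks both identities exactly:

```python
from fractions import Fraction

# Illustrative PMF (values chosen for the demo, not from the text).
pmf = {-1: Fraction(1, 4), 0: Fraction(1, 4), 2: Fraction(1, 2)}
a, b = 3, 5

def mean(p):
    return sum(x * q for x, q in p.items())

def var(p):
    return sum(x * x * q for x, q in p.items()) - mean(p) ** 2

# PMF of Y = aX + b: same probabilities attached to transformed values.
pmf_y = {a * x + b: q for x, q in pmf.items()}

assert mean(pmf_y) == a * mean(pmf) + b   # E[aX + b] = aE[X] + b
assert var(pmf_y) == a * a * var(pmf)     # Var(aX + b) = a^2 Var(X); b drops out
print(mean(pmf), var(pmf), mean(pmf_y), var(pmf_y))
```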
---
Common Mistakes
Awareness of common pitfalls can prevent the loss of valuable marks.
- β Confusing Standard Deviation and Variance: Students often provide the variance when the standard deviation is asked, or vice-versa.
- β Incorrectly Applying Variance Properties: A frequent error is to write .
- β Error in Computational Formula: Forgetting to square the mean in the variance formula, i.e., using $E[X^2] - E[X]$ instead of $E[X^2] - (E[X])^2$.
---
Practice Questions
:::question type="MCQ" question="A random variable has the probability mass function , , and . What is the variance of the random variable ?" options=["17/36","17/9","34/9","5/6"] answer="17/9" hint="First, find the variance of using the computational formula . Then, use the property ." solution="
Step 1: Calculate the mean of $X$: $E[X] = \sum_i x_i \, p(x_i)$.
Step 2: Calculate $E[X^2] = \sum_i x_i^2 \, p(x_i)$.
Step 3: Calculate the variance of $X$: $\mathrm{Var}(X) = E[X^2] - (E[X])^2$.
Step 4: Calculate the variance of $Y$ using the property $\mathrm{Var}(aX + b) = a^2 \, \mathrm{Var}(X)$.
Result: The variance of $Y$ is $\frac{17}{9}$.
"
:::
:::question type="NAT" question="A continuous random variable has a probability density function given by for , and otherwise. Calculate the mean of ." answer="1.5" hint="Use the formula for the expected value of a continuous random variable: over the defined range." solution="
Step 1: Set up the integral for the expected value: $E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx$.
Step 2: Substitute the given PDF and adjust the integration limits to the interval where $f(x)$ is non-zero.
Step 3: Simplify the integrand.
Step 4: Evaluate the integral.
Step 5: Substitute the limits of integration.
Result: $E[X] = \frac{3}{2} = 1.5$"
:::
:::question type="MSQ" question="Let be a random variable with mean and variance . Let a new random variable be defined as . Which of the following statements is/are correct?" options=["The mean of Y is -15.","The variance of Y is -8.","The variance of Y is 16.","The standard deviation of Y is 4."] answer="The mean of Y is -15.,The variance of Y is 16.,The standard deviation of Y is 4." hint="Apply the properties of expectation and variance for a linear transformation. and ." solution="
Let us analyze each statement.
The transformation is linear, of the form $Y = aX + b$ for the given constants $a$ and $b$.
Statement 1: The mean of Y is -15.
Using the property of expectation: $E[Y] = aE[X] + b$.
Substituting the given values of $a$, $b$, and $E[X]$ gives $E[Y] = -15$.
Thus, this statement is correct.
Statement 2: The variance of Y is -8.
Variance can never be negative. Thus, this statement is incorrect.
Statement 3: The variance of Y is 16.
Using the property of variance: $\mathrm{Var}(Y) = a^2 \, \mathrm{Var}(X)$.
Substituting the given values of $a$ and $\mathrm{Var}(X)$ gives $\mathrm{Var}(Y) = 16$.
Thus, this statement is correct.
Statement 4: The standard deviation of Y is 4.
The standard deviation is the square root of the variance: $\sigma_Y = \sqrt{16} = 4$.
Thus, this statement is correct.
Therefore, the correct options are: "The mean of Y is -15.", "The variance of Y is 16.", and "The standard deviation of Y is 4."
"
:::
:::question type="NAT" question="The number of defects on a semiconductor wafer follows a Poisson distribution with a mean of 2 defects per wafer. What is the standard deviation of the number of defects per wafer?" answer="1.414" hint="Recall the key property of a Poisson distribution regarding its mean and variance." solution="
Step 1: Identify the distribution and its parameter.
The problem states that the number of defects follows a Poisson distribution. The mean is given as 2 defects per wafer. For a Poisson distribution, the parameter is equal to the mean.
So, $\lambda = 2$.
Step 2: Recall the formula for the variance of a Poisson distribution.
For a Poisson random variable with parameter $\lambda$, the variance is given by $\mathrm{Var}(X) = \lambda$.
Step 3: Calculate the variance: $\mathrm{Var}(X) = \lambda = 2$.
Step 4: Calculate the standard deviation.
The standard deviation is the square root of the variance: $\sigma = \sqrt{2}$.
Step 5: Compute the numerical value: $\sqrt{2} \approx 1.414$.
Result: The standard deviation, rounded to three decimal places, is 1.414.
"
:::
:::question type="MCQ" question="A fair six-sided die is rolled 10 times. Let be the number of times the number '4' appears. What is the variance of ?" options=["50/36","5/3","25/18","5/6"] answer="25/18" hint="Recognize that this scenario describes a Binomial experiment. Identify the parameters and , and then use the formula for the variance of a Binomial distribution." solution="
Step 1: Identify the type of random variable.
The experiment consists of a fixed number of independent trials ($n = 10$). Each trial has two outcomes: success (rolling a '4') or failure (not rolling a '4'). The probability of success is constant for each trial. This is the definition of a Binomial experiment.
Step 2: Determine the parameters of the Binomial distribution.
The number of trials is $n = 10$.
The probability of success, $p$, is the probability of rolling a '4' on a fair die, which is $p = \frac{1}{6}$.
The probability of failure is $q = 1 - p = \frac{5}{6}$.
Step 3: Apply the formula for the variance of a Binomial random variable.
The variance of a Binomial distribution is given by $\mathrm{Var}(X) = np(1-p)$, or $npq$.
Step 4: Substitute the parameters and calculate the variance.
$$\mathrm{Var}(X) = 10 \times \frac{1}{6} \times \frac{5}{6} = \frac{50}{36}$$
Step 5: Simplify the fraction: $\frac{50}{36} = \frac{25}{18}$.
Result: The variance of $X$ is $\frac{25}{18}$.
"
:::
---
Summary
A firm grasp of central tendency and dispersion is non-negotiable for success in the Probability and Statistics section of the GATE exam. These concepts form the bedrock upon which more complex statistical ideas are built.
- Mean and Variance Definitions: Be fluent in calculating the mean ($E[X]$) and variance ($\mathrm{Var}(X)$) for both discrete (using summation) and continuous (using integration) random variables.
- The Computational Formula is Key: For variance calculations, almost always use $\mathrm{Var}(X) = E[X^2] - (E[X])^2$. It is faster and less prone to error than the definitional formula.
- Memorize Standard Distributions: You must be able to instantly recall the mean and variance for Binomial, Poisson, Uniform, and Normal distributions. Questions often test these properties directly, and recognizing them saves critical time. The facts that $E[X] = \mathrm{Var}(X) = \lambda$ for Poisson and $\mu = 0$, $\sigma^2 = 1$ for the Standard Normal are particularly high-yield.
---
What's Next?
Mastery of these fundamental measures prepares you for more advanced topics in probability theory.
This topic connects to:
- Probability Distributions: The mean and variance are defining characteristics of any probability distribution. A deep understanding of PMFs and PDFs is required to derive these measures from first principles.
- Covariance and Correlation: These concepts extend the idea of variance to two random variables. Covariance measures how two variables change together, building directly on the concepts of expectation and deviation from the mean.
- Chebyshev's Inequality: This powerful theorem uses the mean and standard deviation to provide a bound on the probability that a random variable lies a certain distance from its mean, regardless of the underlying distribution.
Master these connections for comprehensive GATE preparation!
---
Now that you understand Measures of Central Tendency and Dispersion, let's explore Correlation and Covariance which builds on these concepts.
---
## Part 3: Correlation and Covariance
Introduction
In our study of random variables, we have thus far focused primarily on the properties of a single variable, such as its mean and variance. The variance, in particular, quantifies the spread or dispersion of a variable's distribution around its mean. However, in many practical applications, we are interested in understanding the relationship between two or more random variables. Do they tend to move in the same direction, in opposite directions, or is there no discernible pattern to their joint behavior?
This chapter introduces two fundamental statistical measures that address this question: covariance and correlation. Covariance provides a measure of the joint variability of two random variables, indicating the direction of their linear relationship. Correlation, a standardized version of covariance, goes a step further by also quantifying the strength of this linear association. A firm grasp of these concepts is indispensable, as they form the bedrock of more advanced topics such as regression analysis and portfolio theory, and are frequently tested in the GATE examination.
The covariance between two random variables, $X$ and $Y$, with expected values $E[X]$ and $E[Y]$, is defined as the expected value of the product of their deviations from their respective means:
$$\mathrm{Cov}(X, Y) = E\big[(X - E[X])(Y - E[Y])\big]$$
---
Key Concepts
1. Understanding and Calculating Covariance
The definition of covariance, $\mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])]$, provides significant intuition. Consider the term $(X - E[X])(Y - E[Y])$. If $X$ and $Y$ tend to be simultaneously above their means or simultaneously below their means, this product will be positive on average. This results in a positive covariance, suggesting a positive linear relationship. Conversely, if one variable tends to be above its mean when the other is below its mean, the product will be negative on average, yielding a negative covariance. If there is no consistent linear pattern, the positive and negative products will tend to cancel out, resulting in a covariance near zero.
While the definitional formula is useful for conceptual understanding, a more practical formula is often used for computation, especially with discrete random variables. We can expand the definition:
$$\mathrm{Cov}(X, Y) = E\big[XY - X E[Y] - Y E[X] + E[X]E[Y]\big] = E[XY] - E[X]E[Y]$$
This leads to the widely used computational formula: $\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y]$.
Variables:
- $E[XY]$ = The expected value of the product of the random variables $X$ and $Y$.
- $E[X]$ = The expected value of $X$.
- $E[Y]$ = The expected value of $Y$.
When to use: This formula is almost always more convenient for calculation, particularly for discrete random variables where a joint probability mass function is available.
Worked Example:
Problem: A fair six-sided die is rolled. Let $X$ be a random variable that is 1 if the outcome is even and 0 if it is odd. Let $Y$ be a random variable that is 1 if the outcome is greater than 3, and 0 otherwise. Calculate the covariance between $X$ and $Y$.
Solution:
Step 1: Define the sample space and probabilities.
The sample space is $S = \{1, 2, 3, 4, 5, 6\}$, with each outcome having a probability of $\frac{1}{6}$.
Step 2: Determine the values of $X$ and $Y$ for each outcome and calculate $E[X]$ and $E[Y]$.
- For $X$: $X = 1$ for outcomes $\{2, 4, 6\}$; $X = 0$ for outcomes $\{1, 3, 5\}$.
- For $Y$: $Y = 1$ for outcomes $\{4, 5, 6\}$; $Y = 0$ for outcomes $\{1, 2, 3\}$.
$P(X = 1) = \frac{3}{6} = \frac{1}{2}$, so $E[X] = 1 \cdot \frac{1}{2} + 0 \cdot \frac{1}{2} = \frac{1}{2}$.
$P(Y = 1) = \frac{3}{6} = \frac{1}{2}$, so $E[Y] = \frac{1}{2}$.
Step 3: Determine the value of the product $XY$ for each outcome and calculate $E[XY]$.
We need to find the outcomes where $XY = 1$. This occurs only when both $X = 1$ (even) and $Y = 1$ (greater than 3). The outcomes satisfying this are $\{4, 6\}$.
Thus, $P(XY = 1) = \frac{2}{6} = \frac{1}{3}$, so $E[XY] = \frac{1}{3}$. For all other outcomes, $XY = 0$.
Step 4: Apply the computational formula for covariance.
$$\mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y] = \frac{1}{3} - \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$$
Answer: The covariance between $X$ and $Y$ is $\frac{1}{12}$.
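The entire worked example can be verified by brute-force enumeration over the six equally likely outcomes:

```python
from fractions import Fraction

# X = 1 if the roll is even, Y = 1 if the roll is greater than 3.
outcomes = range(1, 7)
p = Fraction(1, 6)  # each outcome is equally likely

E_X  = sum(p * (1 if s % 2 == 0 else 0) for s in outcomes)
E_Y  = sum(p * (1 if s > 3 else 0) for s in outcomes)
E_XY = sum(p * (1 if (s % 2 == 0 and s > 3) else 0) for s in outcomes)

cov = E_XY - E_X * E_Y  # Cov(X, Y) = E[XY] - E[X]E[Y]
print(E_X, E_Y, E_XY, cov)  # 1/2 1/2 1/3 1/12
```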
---
2. Properties of Covariance
Understanding the properties of covariance is critical for simplifying complex expressions, a common requirement in GATE problems.
The key property is $\mathrm{Cov}(aX + b, \, cY + d) = ac \, \mathrm{Cov}(X, Y)$. Notice that the additive constants $b$ and $d$ do not affect the covariance, as they do not change the spread of the variables.
Worked Example:
Problem: Let $X$ be a random variable with known variance $\mathrm{Var}(X)$, and let $Y$ be a linear function of $X$, say $Y = aX + b$. Calculate $\mathrm{Cov}(X, Y)$.
Solution:
Step 1: Identify the relationship between $X$ and $Y$.
We are given $Y$ as a linear transformation of $X$: $Y = aX + b$.
Step 2: Apply the property of covariance under linear transformations.
We need to find $\mathrm{Cov}(X, Y) = \mathrm{Cov}(X, aX + b)$.
The general property is $\mathrm{Cov}(X, aX + b) = a \, \mathrm{Cov}(X, X)$.
Step 3: Use the relationship between covariance and variance.
We know that $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$.
Step 4: Substitute the given value of $\mathrm{Var}(X)$.
Answer: The covariance between $X$ and $Y$ is $a \, \mathrm{Var}(X)$.
---
## 3. Correlation Coefficient
A significant limitation of covariance is that its magnitude is scale-dependent. If we change the units of from meters to centimeters, the covariance will increase by a factor of 100, even though the underlying relationship between the variables has not changed. To overcome this, we use the correlation coefficient, which is a normalized measure.
The Pearson correlation coefficient between two random variables, $X$ and $Y$, is their covariance divided by the product of their standard deviations:
$$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$
The correlation coefficient, $\rho_{X,Y}$, is a dimensionless quantity that always lies in the range $[-1, 1]$.
- $\rho_{X,Y} = +1$: Perfect positive linear relationship.
- $\rho_{X,Y} = -1$: Perfect negative linear relationship.
- $\rho_{X,Y} = 0$: No linear relationship.
- Values between 0 and 1 indicate the strength of a positive linear relationship.
- Values between -1 and 0 indicate the strength of a negative linear relationship.
Correlation Coefficient
$$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X) \, \mathrm{Var}(Y)}}$$
Variables:
- $\mathrm{Cov}(X, Y)$ = Covariance of $X$ and $Y$.
- $\mathrm{Var}(X)$ = Variance of $X$.
- $\mathrm{Var}(Y)$ = Variance of $Y$.
Application: Use this to find a standardized measure of linear association, which is independent of the units of the variables.
A non-zero correlation between two variables does not, by itself, imply that one variable causes the other. There could be a third, unobserved variable (a confounding variable) influencing both, or the relationship could be purely coincidental.
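A quick numerical check of scale invariance, using made-up paired data (an assumption for illustration): converting heights from meters to centimeters multiplies the covariance by 100 but leaves ρ unchanged.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: Cov(x, y) / (sigma_x * sigma_y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

# Hypothetical data: heights (meters) and weights (kg).
heights_m = [1.60, 1.65, 1.70, 1.75, 1.80]
weights = [55.0, 60.0, 63.0, 70.0, 74.0]

r_m = pearson(heights_m, weights)
r_cm = pearson([100 * h for h in heights_m], weights)  # rescaled units

print(abs(r_m - r_cm) < 1e-12)  # True: correlation is unit-free
```

The covariance itself would scale by 100 under the same conversion, which is exactly why the normalized measure is preferred.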
---
4. Variance of Sums of Random Variables
Covariance plays a crucial role in determining the variance of a sum or difference of random variables.
Let us derive the formula for Var(X + Y):
Var(X + Y) = E[(X + Y)^2] - (E[X + Y])^2
By linearity of expectation, E[X + Y] = E[X] + E[Y], so expanding both squares:
Var(X + Y) = E[X^2] + 2E[XY] + E[Y^2] - (E[X])^2 - 2E[X]E[Y] - (E[Y])^2
This simplifies to:
Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)
Similarly, for the difference:
Var(X - Y) = Var(X) + Var(Y) - 2Cov(X, Y)
A special and very important case arises when X and Y are independent. If two variables are independent, their covariance is zero. (The converse is not always true.) For independent variables, the formulas simplify significantly:
Var(X ± Y) = Var(X) + Var(Y)
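The identity Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y) can be verified exactly on a small joint distribution; the joint PMF below is an illustrative assumption:

```python
from fractions import Fraction

# Hypothetical joint PMF of two dependent 0/1 variables.
joint = {(0, 0): Fraction(2, 10), (0, 1): Fraction(3, 10),
         (1, 0): Fraction(1, 10), (1, 1): Fraction(4, 10)}

def E(f):
    """Expectation of f(X, Y) under the joint PMF."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

E_X, E_Y = E(lambda x, y: x), E(lambda x, y: y)
var_X = E(lambda x, y: (x - E_X) ** 2)
var_Y = E(lambda x, y: (y - E_Y) ** 2)
cov = E(lambda x, y: (x - E_X) * (y - E_Y))
var_sum = E(lambda x, y: (x + y - E_X - E_Y) ** 2)  # Var(X + Y) directly

# The decomposition holds exactly:
print(var_sum == var_X + var_Y + 2 * cov)  # True
```

Because the variables here are dependent (cov ≠ 0), dropping the covariance term would give the wrong variance for the sum.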
---
Problem-Solving Strategies
For problems involving discrete random variables derived from an experiment (like coin tosses or dice rolls), follow a systematic procedure:
- List Outcomes: Enumerate all possible outcomes of the experiment and their probabilities.
- Create a Joint Table: Construct a table listing each outcome, its probability, and the corresponding values of X, Y, and the product XY.
- Calculate Marginal PMFs: From the joint table, determine the probability mass functions (PMFs) for X and Y individually.
- Compute Expectations: Calculate E[X] and E[Y] using their PMFs.
- Compute E[XY]: Calculate the expected value of the product XY directly from the joint table created in Step 2.
- Apply Formula: Substitute the computed values into Cov(X, Y) = E[XY] - E[X]E[Y].
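The procedure above can be sketched end to end; the joint table here is a hypothetical example, not one from the text:

```python
from fractions import Fraction

# Hypothetical joint PMF table (assumption), indexed as joint[(x, y)].
joint = {(0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
         (1, 0): Fraction(3, 8), (1, 1): Fraction(1, 8)}

# Marginal PMFs from the joint table.
p_X, p_Y = {}, {}
for (x, y), p in joint.items():
    p_X[x] = p_X.get(x, 0) + p
    p_Y[y] = p_Y.get(y, 0) + p

# Expectations from the marginals.
E_X = sum(x * p for x, p in p_X.items())
E_Y = sum(y * p for y, p in p_Y.items())

# E[XY] directly from the joint table.
E_XY = sum(x * y * p for (x, y), p in joint.items())

# Apply the formula Cov(X, Y) = E[XY] - E[X]E[Y].
cov = E_XY - E_X * E_Y
print(cov)  # -1/8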
---
Common Mistakes
- ❌ Confusing Zero Correlation with Independence: ρ = 0 only rules out a linear relationship; the variables may still be dependent.
- ❌ Incorrectly Applying Constants in Formulas: in Cov(aX + b, cY + d) = ac · Cov(X, Y), the additive constants drop out but the multiplicative ones do not.
- ❌ Assuming Variance of a Sum is the Sum of Variances: Var(X + Y) = Var(X) + Var(Y) holds only when the variables are uncorrelated.
---
Practice Questions
:::question type="MCQ" question="Let
Step 1: Calculate the expected value of
Since
Step 2: Calculate the variance of
The given term
Let's calculate
Since
Step 3: Calculate the covariance term from its definition.
The given term
Let's calculate
Using properties,
Since
Answer: \boxed{The definition of
"
:::
:::question type="NAT" question="Two balls are drawn with replacement from an urn containing 3 red balls and 2 blue balls. Let the random variable X be the number of red balls drawn and Y be the number of blue balls drawn. Find Cov(X, Y).
Step 1: Establish the relationship between X and Y.
Since two balls are drawn in total, the number of red balls (X) and the number of blue balls (Y) must satisfy X + Y = 2.
Step 2: Use the property of variance of a constant.
The variance of a constant is zero, so Var(X + Y) = Var(2) = 0.
Step 3: Expand Var(X + Y) using the sum formula: Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y).
Step 4: Equate the expressions from Step 2 and Step 3: Var(X) + Var(Y) + 2Cov(X, Y) = 0.
This implies: Cov(X, Y) = -(Var(X) + Var(Y))/2.
Step 5: Calculate Var(X) and Var(Y).
The drawing of each ball is a Bernoulli trial. Let a "success" be drawing a red ball. The probability of success is p = 3/5 = 0.6.
The variance of a binomial distribution is npq, so Var(X) = 2(0.6)(0.4) = 0.48.
Similarly, Var(Y) = 2(0.4)(0.6) = 0.48.
Step 6: Substitute the variances into the equation for covariance: Cov(X, Y) = -(0.48 + 0.48)/2 = -0.48.
Answer: \boxed{-0.48}
"
:::
:::question type="MSQ" question="Let
Let's evaluate each option.
Option A:
The formula for correlation is
So, statement A is correct.
Option B:
The formula is
So, statement B is correct.
Option C:
The formula is
So, statement C is correct.
Option D:
Using the property
Here
So, statement D is incorrect, which is why it is excluded from the answer.
Answer: \boxed{A, B, C}
"
:::
---
Summary
- Covariance measures the direction of a linear relationship. For computations, always use the formula Cov(X, Y) = E[XY] - E[X]E[Y].
- Correlation is the normalized version of covariance, ρ(X, Y) = Cov(X, Y)/(σ_X σ_Y), which measures both the strength and direction of the linear relationship, bounded between -1 and 1.
- Properties of Linear Transformations are frequently tested. Remember that Var(aX + b) = a^2 Var(X) and Cov(aX + b, cY + d) = ac · Cov(X, Y).
- The Variance of a Sum depends on covariance: Var(X ± Y) = Var(X) + Var(Y) ± 2Cov(X, Y). This simplifies only if the variables are uncorrelated.
---
What's Next?
A solid understanding of covariance and correlation is a prerequisite for several advanced topics in data analysis and statistics.
- Linear Regression: Correlation is the foundation of simple linear regression, which seeks to model the linear relationship between a dependent and an independent variable. The slope of the regression line is directly related to the covariance and variance of the variables.
- Multivariate Distributions: In the study of distributions involving multiple random variables (e.g., the Multivariate Normal Distribution), the relationships between pairs of variables are described by a covariance matrix, a critical component of the distribution's parameterization.
---
Now that you understand Correlation and Covariance, let's explore Conditional Expectation and Variance which builds on these concepts.
---
Part 4: Conditional Expectation and Variance
Introduction
In our study of random variables, we often seek to understand the relationship between them. While concepts like covariance and correlation provide a measure of linear association, they do not capture the full picture. Conditional expectation offers a more powerful and nuanced tool. It allows us to determine the expected value of one random variable, given that we have observed the outcome of another. This concept is fundamental to prediction and estimation, forming the bedrock of modern statistical modeling and machine learning.
Consider a scenario where we wish to predict a student's final exam score (Y) from some observed quantity (X), such as hours of study. The conditional expectation E[Y|X] is precisely this kind of prediction.
This chapter will formally define conditional expectation and its counterpart, conditional variance. We will explore their properties, most notably the Law of Total Expectation and the Law of Total Variance, which provide elegant methods for decomposing complex problems. Furthermore, we shall see how these concepts can be applied to solve intricate problems involving sequences of random events, a common feature in GATE questions.
Let X and Y be two random variables. The conditional expectation of Y given that X takes the value x is written E[Y|X = x].
It is crucial to distinguish between E[Y|X = x], which is a specific number (a function of the value x), and E[Y|X], which is a random variable (a function of X).
---
Key Concepts
1. Conditional Expectation for Discrete Random Variables
When dealing with discrete random variables, the conditional expectation is computed as a weighted average, where the weights are given by the conditional probability mass function (PMF).
First, we must define the conditional PMF of Y given X = x:
p_{Y|X}(y|x) = p_{X,Y}(x, y) / p_X(x), provided p_X(x) > 0.
Here, p_X(x) is the marginal PMF of X evaluated at x.
With this conditional PMF, we can now define the conditional expectation:
E[Y|X = x] = Σ_y y · p_{Y|X}(y|x)
Variables:
- y = A possible value of the random variable Y.
- p_{Y|X}(y|x) = The conditional PMF of Y given X = x.
When to use: Use this formula when both X and Y are discrete and their joint PMF is available.
Worked Example:
Problem: The joint PMF of two discrete random variables X and Y is given by the table below.
| | Y=0 | Y=1 | Y=2 |
| :--- | :--- | :--- | :--- |
| X=0 | 0.1 | 0.2 | 0.1 |
| X=1 | 0.3 | 0.1 | 0.2 |
Calculate the conditional expectation E[Y|X = 1].
Solution:
Step 1: Find the marginal PMF of X at x = 1: p_X(1) = 0.3 + 0.1 + 0.2 = 0.6.
Step 2: Determine the conditional PMF of Y given X = 1.
For y = 0: p_{Y|X}(0|1) = 0.3/0.6 = 1/2.
For y = 1: p_{Y|X}(1|1) = 0.1/0.6 = 1/6.
For y = 2: p_{Y|X}(2|1) = 0.2/0.6 = 1/3.
(As a check, we note that 1/2 + 1/6 + 1/3 = 1.)
Step 3: Apply the formula for conditional expectation: E[Y|X = 1] = 0(1/2) + 1(1/6) + 2(1/3).
Step 4: Compute the final value: E[Y|X = 1] = 1/6 + 2/3 = 5/6 ≈ 0.833.
Answer: E[Y|X = 1] = 5/6.
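Using the joint PMF table from this example, the computation can be checked mechanically. Assuming, for illustration, that the target is E[Y|X = 1]:

```python
from fractions import Fraction

# Joint PMF from the table above, joint[(x, y)] = P(X=x, Y=y).
F = Fraction
joint = {(0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
         (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10)}

def cond_expectation(x):
    """E[Y | X = x] computed via the conditional PMF p(y|x)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)  # marginal P(X=x)
    return sum(y * p / p_x for (xi, y), p in joint.items() if xi == x)

print(cond_expectation(1))  # 5/6
```

The same helper evaluated at x = 0 gives the other conditional mean, so the full random variable E[Y|X] is recovered from the table.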
---
2. Conditional Expectation for Continuous Random Variables
The logic for continuous variables is analogous to the discrete case, with sums replaced by integrals and PMFs by probability density functions (PDFs).
The conditional PDF of Y given X = x is defined as:
f_{Y|X}(y|x) = f_{X,Y}(x, y) / f_X(x),
where f_X(x) is the marginal PDF of X, assumed positive at x.
The conditional expectation is then E[Y|X = x] = ∫ y · f_{Y|X}(y|x) dy.
Variables:
- y = A possible value of the random variable Y.
- f_{Y|X}(y|x) = The conditional PDF of Y given X = x.
When to use: Use this when X and Y are jointly continuous with a known joint PDF.
Worked Example:
Problem: Let the joint PDF of random variables
Calculate
Solution:
Step 1: Find the marginal PDF of
This is valid for
Step 2: Evaluate the marginal PDF at the given condition,
Step 3: Determine the conditional PDF
This conditional PDF is valid for
Step 4: Apply the formula for conditional expectation.
Step 5: Compute the integral.
Step 6: Simplify to find the final answer.
Answer:
---
3. Properties of Conditional Expectation
The true power of conditional expectation is revealed through its properties, which simplify complex calculations and provide deep theoretical insights.
The Law of Total Expectation (Tower Property)
This is arguably the most important property of conditional expectation and is frequently tested in GATE. It states that the expected value of the conditional expectation of Y given X equals the unconditional expectation of Y: E[E[Y|X]] = E[Y].
Variables:
- E[Y|X] is a random variable, as it is a function of the random variable X.
- E[E[Y|X]] denotes taking the expectation of this new random variable.
When to use: This law is used to find an unconditional expectation when it is easier to first compute the expectation by conditioning on another variable. It is also a fundamental identity tested directly.
Let us demonstrate this for the continuous case.
$$\begin{aligned}
E[E[Y|X]] & = \int_{-\infty}^{\infty} E[Y|X=x] \, f_X(x) \, dx \\
& = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} y \cdot f_{Y|X}(y|x) \, dy \right) f_X(x) \, dx \\
\text{Since } f_{Y|X}(y|x) \cdot f_X(x) & = f_{X,Y}(x,y), \text{ we have:} \\
E[E[Y|X]] & = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \cdot f_{X,Y}(x,y) \, dy \, dx \\
\text{By changing the order of integration:} \\
E[E[Y|X]] & = \int_{-\infty}^{\infty} y \left( \int_{-\infty}^{\infty} f_{X,Y}(x,y) \, dx \right) \, dy \\
\text{The inner integral is the marginal PDF of Y, } f_Y(y). \\
E[E[Y|X]] & = \int_{-\infty}^{\infty} y \cdot f_Y(y) \, dy = E[Y]
\end{aligned}$$
This elegant result confirms the property.
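The same identity can be checked discretely. Reusing the joint PMF table from the discrete worked example earlier in this section, averaging E[Y|X = x] over the distribution of X recovers E[Y]:

```python
from fractions import Fraction

# Joint PMF from the earlier discrete worked example.
F = Fraction
joint = {(0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
         (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10)}

p_X = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}

def E_Y_given(x):
    """Conditional mean E[Y | X = x] from the table."""
    return sum(y * p / p_X[x] for (xi, y), p in joint.items() if xi == x)

# Left side: E[E[Y|X]] -- average the conditional means over P(X = x).
lhs = sum(p_X[x] * E_Y_given(x) for x in (0, 1))
# Right side: E[Y] computed directly from the joint PMF.
rhs = sum(y * p for (_, y), p in joint.items())

print(lhs == rhs)  # True
```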
Other Key Properties
- Linearity: E[aY + bZ | X] = a E[Y|X] + b E[Z|X].
- Taking out what is known: E[g(X) Y | X] = g(X) E[Y|X], since g(X) behaves as a constant once X is given.
- Independence: if X and Y are independent, then E[Y|X] = E[Y].
---
4. Conditional Variance and The Law of Total Variance
Similar to expectation, we can define the variance of a random variable conditional on the value of another.
The conditional variance of Y given X = x is defined as Var(Y|X = x) = E[(Y - E[Y|X = x])^2 | X = x].
A more convenient computational form is: Var(Y|X = x) = E[Y^2|X = x] - (E[Y|X = x])^2.
Like conditional expectation, Var(Y|X) is itself a random variable, being a function of X.
This leads to a decomposition formula for variance, analogous to the Law of Total Expectation, called the Law of Total Variance: Var(Y) = E[Var(Y|X)] + Var(E[Y|X]).
Variables:
- Var(Y) is the total variance of Y.
- E[Var(Y|X)] is the expected conditional variance. It represents the average amount of variance remaining in Y even after we know X.
- Var(E[Y|X]) is the variance of the conditional expectation. It represents the portion of the variance in Y that is explained by the variability of X.
When to use: This formula is extremely useful in situations where a random variable's variance is influenced by another random process. It breaks down the total variance into components.
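A minimal numeric check of the Law of Total Variance, again on the joint PMF table from the discrete worked example earlier in this section: the within-group and between-group pieces sum exactly to Var(Y).

```python
from fractions import Fraction

# Joint PMF from the earlier discrete worked example.
F = Fraction
joint = {(0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
         (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10)}

p_X = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}

def cond_moment(x, k):
    """E[Y**k | X = x]."""
    return sum(y ** k * p / p_X[x] for (xi, y), p in joint.items() if xi == x)

cond_mean = {x: cond_moment(x, 1) for x in (0, 1)}
cond_var = {x: cond_moment(x, 2) - cond_mean[x] ** 2 for x in (0, 1)}

E_cond_var = sum(p_X[x] * cond_var[x] for x in (0, 1))      # E[Var(Y|X)]
mean_of_means = sum(p_X[x] * cond_mean[x] for x in (0, 1))  # equals E[Y]
var_cond_mean = sum(p_X[x] * (cond_mean[x] - mean_of_means) ** 2
                    for x in (0, 1))                        # Var(E[Y|X])

E_Y2 = sum(y ** 2 * p for (_, y), p in joint.items())
var_Y = E_Y2 - mean_of_means ** 2                           # total Var(Y)

print(var_Y == E_cond_var + var_cond_mean)  # True
```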
---
5. Application: Recurrence Relations for Expected Values
A powerful application of conditional expectation is in solving problems involving sequences of trials, such as finding the expected number of steps to reach a certain state. This technique was implicitly tested in GATE. The core idea is to condition on the outcome of the first step.
Let E denote the expected number of trials required to reach the goal; we obtain an equation for E by conditioning on the outcome of the first trial.
Worked Example:
Problem: A fair coin is flipped repeatedly. What is the expected number of flips required to see the pattern HT (Heads followed by Tails)?
Solution:
Step 1: Define the states and the expected values from each state.
Let E be the expected number of flips starting from scratch (no useful progress yet).
Let E_H be the expected number of additional flips given that the last flip was H.
Step 2: Set up the equation for E.
The first flip is either H (with probability 1/2) or T (with probability 1/2).
If the first flip is T, we have wasted one flip and are back to the start. The expected number of additional flips is E.
If the first flip is H, we have used one flip and are now in state H. The expected number of additional flips is E_H. Hence:
E = (1/2)(1 + E) + (1/2)(1 + E_H) = 1 + (1/2)E + (1/2)E_H
Step 3: Set up the equation for E_H.
From state H, the next flip is either T (probability 1/2) or H (probability 1/2).
If the next flip is T, we have achieved the pattern HT. The process stops. Total additional flips: 1.
If the next flip is H, we have wasted a flip but are still in a state where the last flip was H. The expected number of additional flips is still E_H. Hence:
E_H = (1/2)(1) + (1/2)(1 + E_H) = 1 + (1/2)E_H
Step 4: Solve the system of linear equations. First, solve for E_H: E_H = 1 + (1/2)E_H gives E_H = 2.
Step 5: Substitute the value of E_H into the equation for E: E = 1 + (1/2)E + (1/2)(2), so (1/2)E = 2 and E = 4.
Answer: The expected number of flips is 4.
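The two state equations solve in closed form to E = 1/p + 1/q, where p = P(H) and q = 1 - p; the sketch below encodes that solution (the fair-coin case gives 4).

```python
from fractions import Fraction

def expected_flips_HT(p):
    """Expected flips until the pattern HT first appears, with P(heads) = p.
    From E_H = 1 + p*E_H we get E_H = 1/q, and from
    E = 1 + p*E_H + q*E we get E = 1/p + 1/q."""
    q = 1 - p
    E_H = 1 / q
    return 1 / p + E_H

print(expected_flips_HT(Fraction(1, 2)))  # 4 (fair coin)
print(expected_flips_HT(Fraction(1, 3)))  # 9/2, i.e. 4.5
```

Passing a `Fraction` keeps the result exact; a biased coin with p = 1/3 (or 2/3) gives 4.5 flips on average.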
---
Problem-Solving Strategies
For any problem of the form "Find E[Y|X = x]", follow these steps:
- Identify the type: Are the variables discrete or continuous?
- Find the marginal: Calculate the marginal distribution of the conditioning variable (p_X(x) or f_X(x)).
- Find the conditional distribution: Use the formula p_{Y|X}(y|x) = p_{X,Y}(x, y)/p_X(x) (or its continuous counterpart). This is the most critical step.
- Integrate/Sum: Apply the definition of expectation using the conditional distribution you just found: E[g(Y)|X = x] = ∫ g(y) f_{Y|X}(y|x) dy.
This structured approach prevents errors in calculation, especially with complex integration bounds.
For problems asking for the "expected number of trials until...", the key is to define states based on the progress towards the goal.
- Let E_i be the expected number of additional steps from state i.
- The initial state's expectation is what you want to find (e.g., E_0).
- For each state, write an equation for E_i by conditioning on the outcome of the next trial. The equation will be of the form E_i = 1 + Σ_j P(transition to j) · E_j.
- The "success" state has an expected additional number of steps of 0.
- Solve the resulting system of linear equations.
---
Common Mistakes
- ❌ Confusing E[Y|X] and E[Y|X = x]: Students often forget that E[Y|X] is a random variable (a function of X), while E[Y|X = x] is a specific value (a function of the number x). This distinction is crucial for understanding the Law of Total Expectation, E[E[Y|X]] = E[Y].
- ❌ Incorrect Marginalization: A frequent error in continuous problems is using incorrect limits when integrating the joint PDF to find the marginal PDF. Always carefully check the support of the joint PDF. For example, if 0 < y < x, the integral for f_X(x) must be over y from 0 to x, not over the full range of y.
- ❌ Misinterpreting the Law of Total Variance: A common mistake is to write Var(Y) = Var(E[Y|X]) + Var(Var(Y|X)) instead of the correct Var(Y) = Var(E[Y|X]) + E[Var(Y|X)]. Remember, you take the expectation of the conditional variance, not its variance.
---
Practice Questions
:::question type="MCQ" question="Let X and Y be random variables and let g be a function. Which expression equals E[g(X) Y | X = x]?
By the 'taking out what is known' property, once X = x is given, g(X) is the fixed number g(x) and can be factored out of the conditional expectation.
Therefore, we have: E[g(X) Y | X = x] = g(x) E[Y|X = x].
In this specific case, the conditioning pins g(X) to the constant g(x).
Answer: \boxed{g(x) E[Y|X=x]}"
:::
:::question type="NAT" question="The joint PDF of two random variables
Step 1: Find the marginal PDF
The limits for
This is for
Step 2: Find the conditional PDF
This is for
Step 3: Calculate
First, find the conditional PDF for
The range for
Step 4: Integrate to find the conditional expectation.
Result:
The value is approximately 0.3333... Rounding to 2 decimal places gives 0.33.
Answer: \boxed{0.33}"
:::
:::question type="MSQ" question="Let
- Option A: This is the Law of Total Expectation (or Tower Property), which is a fundamental property and is always true.
- Option B: This is an incorrect statement of the Law of Total Variance. The correct law is Var(Y) = Var(E[Y|X]) + E[Var(Y|X)]. The second term should be the expectation of the conditional variance, not the variance of the conditional variance. So, this statement is false.
- Option C: Using the linearity property and the 'taking out what is known' property: E[X + Y|X] = E[X|X] + E[Y|X]. Since E[X|X] = X, the statement simplifies to X + E[Y|X]. This is correct.
- Option D: Using the 'taking out what is known' property, we get E[XY|X] = X E[Y|X]. If X and Y are independent, then E[Y|X] = E[Y]. Substituting this in, we get E[XY|X] = X E[Y]. This statement is correct.
Therefore, options A, C, and D are always true.
Answer: \boxed{A, C, D}"
:::
:::question type="NAT" question="A coin has a probability
Step 1: Define states.
Let E be the expected total number of flips from the start.
Let E_H be the expected number of additional flips given that the last flip was H.
Let q = 1 - p denote the probability of tails.
Step 2: Formulate the equation for E.
From the start, the first flip is either H (with probability p) or T (with probability q).
- If T: We use 1 flip and are back to the start. Total flips: 1 + E.
- If H: We use 1 flip and move to state H. Total flips: 1 + E_H.
Step 3: Formulate the equation for E_H.
From state H, the next flip is either T (with probability q) or H (with probability p).
- If T: We use 1 flip and the pattern HT is achieved. The process stops. Total additional flips: 1.
- If H: We use 1 flip and are still in state H (the last flip was H). Total additional flips: 1 + E_H.
Step 4: Solve for E_H.
Since E_H = q(1) + p(1 + E_H) = 1 + p·E_H, we obtain E_H = 1/q.
Step 5: Substitute E_H into the equation for E.
Since E = p(1 + E_H) + q(1 + E) = 1 + p·E_H + q·E, solving gives E = 1/p + E_H = 1/p + 1/q.
Step 6: Calculate the value using the given probability of heads.
Result: E = 1/p + 1/q = 4.5 for the given p.
The expected number of flips is 4.5.
Answer: \boxed{4.5}"
:::
---
Summary
- Law of Total Expectation: The most fundamental property is E[Y] = E[E[Y|X]]. This allows breaking down a complex expectation calculation by conditioning on a suitable random variable. It is frequently tested directly as a theoretical question.
- Calculation Procedure: For computational problems involving E[Y|X = x], always follow the three-step process: find the marginal distribution of X, then find the conditional distribution of Y given X, and finally compute the expectation using that conditional distribution. Be meticulous with the limits of integration or summation.
- Recurrence Relations: For problems asking for the expected time or trials to an event, the method of conditioning on the first step is extremely effective. Define states representing progress towards the goal and set up a system of linear equations for the expected values from each state.
---
What's Next?
This topic serves as a foundation for several advanced areas in data analysis and probability.
- Regression Analysis: The conditional expectation E[Y|X = x] is precisely the regression function of Y on X. It represents the best possible prediction of Y given X, in the sense of minimizing mean squared error.
- Markov Chains: The state-based approach we used for recurrence problems is the essence of analyzing discrete-time Markov chains. The concept of conditioning on the previous state is central to the Markov property.
- Bayesian Statistics: Conditional distributions are the heart of Bayesian inference. Bayes' theorem is used to update our belief about a parameter (a conditional distribution) after observing data.
---
Chapter Summary
In this chapter, we have introduced the fundamental concept of a random variable and the mathematical tools used to characterize its behavior. A thorough understanding of these principles is essential for subsequent topics in probability and statistics. The most critical concepts to retain are as follows:
- Fundamental Definition: A random variable is a function that assigns a numerical value to each outcome in the sample space of a random experiment. We distinguish between discrete random variables, which take on a countable number of values and are described by a Probability Mass Function (PMF), and continuous random variables, which take values in an interval and are described by a Probability Density Function (PDF).
- Expectation: The expected value, or mean, of a random variable X, denoted E[X], represents its long-term average. It is the center of mass of the probability distribution. For a discrete variable, E[X] = Σ_x x · p_X(x), and for a continuous variable, E[X] = ∫ x · f_X(x) dx.
- Variance and Standard Deviation: Variance, Var(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2, is the primary measure of the dispersion or spread of a distribution around its mean. The standard deviation, σ_X = √Var(X), expresses this spread in the original units of the variable.
- Covariance and Correlation: For two random variables X and Y, covariance, Cov(X, Y) = E[XY] - E[X]E[Y], measures their joint variability. The correlation coefficient, ρ(X, Y) = Cov(X, Y)/(σ_X σ_Y), normalizes this measure to the range [-1, 1].
- Properties of Expectation and Variance: The linearity of expectation, E[aX + bY] = aE[X] + bE[Y], is a universally applicable and powerful tool. The variance of a linear combination is given by Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab · Cov(X, Y). The covariance term vanishes if and only if the variables are uncorrelated.
- Conditional Expectation: The conditional expectation E[X|Y = y] is the expected value of X given that the random variable Y has taken the specific value y. A cornerstone result is the Law of Total Expectation, E[X] = E[E[X|Y]], which allows us to compute an expectation by conditioning on another related variable.
---
Chapter Review Questions
:::question type="MCQ" question="Let
We are asked to find the variance of
In our case,
We are given
where
Given
Given
Now, we can find the covariance:
Finally, we substitute all the values back into the variance formula for
Therefore, the variance of
"
:::
:::question type="NAT" question="A continuous random variable
Let
We want to find the conditional expectation
First, we calculate the probability of this event:
The conditional PDF of
For
And
Now, we can compute the conditional expectation:
The value of the conditional expectation is
"
:::
:::question type="MCQ" question="A fair six-sided die is rolled. Let
The random variable
First, we calculate the expected value of
The question asks for the value of
We recognize that
We can calculate
Now, we can find the variance:
As a decimal,
Rounding to three decimal places, the answer is
"
:::
:::question type="NAT" question="Two discrete random variables
To find the covariance
1. Calculate marginal PMFs and expectations
The marginal PMF of
The expected value of
The marginal PMF of
The expected value of
2. Calculate
The expectation of the product
The only terms that are non-zero are when both
3. Calculate Covariance:
Now we substitute the computed values into the covariance formula:
The covariance is
"
:::
---
What's Next?
Having completed this chapter on Random Variables, you have established a firm foundation in the language and mathematics used to describe and analyze random phenomena. These concepts are not an endpoint but rather a critical stepping stone in your preparation.
Key connections:
- Relation to Previous Chapters: This chapter builds directly upon the fundamentals of Set Theory and Probability. The sample spaces and events we studied previously are now mapped to numerical values, allowing us to use the tools of calculus and algebra. The axioms of probability provide the rigorous underpinning for the properties of PMFs and PDFs.
- Foundation for Future Chapters: The concepts mastered here are indispensable for the chapters that follow: