
Random Variables

Comprehensive study notes on Random Variables for GATE DA preparation. This chapter covers key concepts, formulas, and examples needed for your exam.


Overview

In our preceding discussions, we established the foundational principles of probability theory by examining sample spaces and events. However, to perform a rigorous quantitative analysis of random phenomena, we must bridge the gap between abstract outcomes and numerical values. This chapter introduces the concept of a random variable, a fundamental construct that assigns a numerical value to every possible outcome of a random experiment. The formalization of this concept is a cornerstone of modern statistics and machine learning, allowing us to apply the powerful tools of mathematical analysis to uncertain events.

We shall begin by defining discrete and continuous random variables and exploring their associated probability distributions. Subsequently, we will investigate the essential characteristics of these distributions through measures of central tendency and dispersion. We will learn to compute and interpret quantities such as the expected value (E[X]) and variance (Var(X)), which provide concise summaries of a variable's behavior. The chapter then progresses to the study of relationships between multiple random variables, introducing covariance and correlation as measures of linear dependence. A firm grasp of these concepts is indispensable for understanding feature interactions in data analysis and machine learning models, a topic of significant importance for the GATE examination.

Finally, we culminate our study with an examination of conditional expectation and variance. This advanced topic addresses how our knowledge or assumptions about one event or variable can alter our expectations about another. This principle forms the basis for many predictive models and inferential techniques encountered in the Data Science and AI syllabus. A thorough understanding of the material presented herein is therefore critical for solving complex problems and building a robust theoretical foundation for subsequent topics.

---

Chapter Contents

| # | Topic | What You'll Learn |
|---|-------|-------------------|
| 1 | Definition of Random Variables | Formalizing numerical outcomes of random experiments. |
| 2 | Measures of Central Tendency and Dispersion | Quantifying the center and spread of distributions. |
| 3 | Correlation and Covariance | Analyzing linear dependence between random variables. |
| 4 | Conditional Expectation and Variance | Updating expectations with new information provided. |

---

Learning Objectives

❗ By the End of This Chapter

After completing this chapter, you will be able to:

  • Define a random variable and differentiate between discrete and continuous types.

  • Calculate and interpret the expected value, variance, and standard deviation of a random variable.

  • Compute the covariance and correlation between two random variables to assess their linear relationship.

  • Determine the conditional expectation and variance of a random variable given an event or another variable.

---

## Part 1: Definition of Random Variables

Introduction

In our study of probability, we are often concerned not with the specific outcomes of an experiment, but rather with some numerical property associated with those outcomes. For instance, in an experiment involving the tossing of two coins, we might be more interested in the number of heads that appear than in the exact sequence of heads and tails. This need to associate a numerical value with each outcome of a random experiment leads us to the fundamental concept of a random variable.

A random variable provides a means of mapping the often non-numerical outcomes in a sample space to a set of real numbers. This transformation is crucial because it allows us to apply the powerful tools of calculus and mathematical analysis to the study of probability and statistics. We can analyze distributions, calculate expected values, and determine variances, all of which are central to data analysis and inference. Understanding the formal definition and classification of random variables is the first essential step in this direction.

πŸ“– Random Variable

A random variable, typically denoted by a capital letter such as X, is a function that assigns a real number to each outcome in the sample space S of a random experiment. Formally, it is a mapping X: S β†’ ℝ.

We use a capital letter, X, to represent the random variable as a function, and a lowercase letter, x, to represent a specific value that the random variable can take. The set of all possible values of X is called the range or support of the random variable.

---

Key Concepts

The most fundamental classification of random variables is based on the nature of the values they can assume. This leads to two primary types: discrete and continuous random variables.

## 1. Discrete Random Variables

A random variable is said to be discrete if its range is finite or countably infinite. This means that the variable can only take on a specific, separated set of values. There are "gaps" between the possible values.

Consider the experiment of rolling a standard six-sided die. The sample space is S = {1, 2, 3, 4, 5, 6}. If we define a random variable X as the outcome of the roll, then X can take values from the set {1, 2, 3, 4, 5, 6}. Since this set is finite, X is a discrete random variable. Similarly, the number of defective items in a batch of 100 is a discrete random variable, as it can take integer values from 0 to 100.



Figure: The random variable X = number of heads maps each outcome in the sample space S = {HH, HT, TH, TT} to a real number: HH ↦ 2, HT ↦ 1, TH ↦ 1, TT ↦ 0.


Worked Example:

Problem: An experiment consists of tossing two fair coins. Let the random variable X be defined as the number of heads observed. Determine the sample space S and the set of possible values for X.

Solution:

Step 1: Define the sample space S.
The sample space consists of all possible outcomes of tossing two coins. Let H denote Heads and T denote Tails.

S = {HH, HT, TH, TT}

Step 2: Apply the function X to each outcome in S.
The random variable X counts the number of heads in each outcome.

For the outcome HH, the number of heads is 2. So, X(HH) = 2.

For the outcome HT, the number of heads is 1. So, X(HT) = 1.

For the outcome TH, the number of heads is 1. So, X(TH) = 1.

For the outcome TT, the number of heads is 0. So, X(TT) = 0.

Step 3: List the set of all possible values for X.
The range of the random variable X is the set of all unique numerical values it can take.

Range(X) = {0, 1, 2}

Answer: The sample space is S = {HH, HT, TH, TT} and the random variable X can take values from the set {0, 1, 2}. Since this set is finite, X is a discrete random variable.
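The mapping in this worked example can be enumerated directly in code. A minimal Python sketch (variable names are illustrative):

```python
from itertools import product

# The worked example above, enumerated in code: tossing two fair coins and
# applying the random variable X = number of heads to each outcome.
sample_space = ["".join(toss) for toss in product("HT", repeat=2)]
X = {outcome: outcome.count("H") for outcome in sample_space}

print(sample_space)             # ['HH', 'HT', 'TH', 'TT']
print(X)                        # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
print(sorted(set(X.values())))  # range of X: [0, 1, 2]
```

Representing X as a dictionary makes the "random variable as a function" idea explicit: each outcome in S is a key, and its image under X is the value.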

---

## 2. Continuous Random Variables

A random variable is said to be continuous if its range is an interval or a collection of intervals on the real number line. This means the variable can take on any value within a given range, and there are uncountably many possible values.

For example, if we define a random variable Y as the height of a randomly selected student, Y can take any value within a certain range, say [150 cm, 190 cm]. It is not restricted to integer values; a height of 175.342 cm is perfectly possible. Other examples include temperature, weight, and time. For a continuous random variable, the probability of it taking any single specific value is zero, i.e., P(Y = y) = 0. We instead focus on the probability that the variable falls within a certain interval.
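A small simulation makes the "probability of an interval" idea concrete. The uniform model for heights on [150, 190] cm below is an assumption made purely for illustration:

```python
import random

# Simulation sketch: model height Y as uniform on [150, 190] cm
# (an illustrative assumption, not a claim about real heights).
random.seed(0)  # fixed seed so the estimate is reproducible
n = 100_000
samples = [random.uniform(150.0, 190.0) for _ in range(n)]

# P(Y = y) is 0 for any single value y, so we estimate an interval
# probability instead:
p_interval = sum(160.0 <= y <= 170.0 for y in samples) / n
print(round(p_interval, 3))  # close to the exact value (170-160)/(190-150) = 0.25
```

Every sampled value is a different real number, which is exactly why questions about continuous variables are phrased over intervals rather than single points.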

---

Problem-Solving Strategies

πŸ’‘ GATE Strategy: Identify the Variable Type

When presented with a problem, the first step is to determine whether the random variable is discrete or continuous.

  • Ask: Can I count the possible outcomes? If the values are countable (e.g., number of successes, number of arrivals, results of a die roll), it is discrete.

  • Ask: Is the variable measured? If the value is obtained by measurement (e.g., height, weight, time, temperature), it can take any value in an interval and is continuous.

This initial classification dictates the entire subsequent approach, including the type of probability distribution (PMF vs. PDF) to be used.

---


Common Mistakes

⚠️ Avoid These Errors
    • ❌ Confusing a random variable with an algebraic variable. A random variable X is a function that maps outcomes to numbers. An algebraic variable x is simply an unknown quantity.
    • ❌ Assuming all numerical variables are continuous. The number of cars passing a toll booth in an hour is numerical, but it is discrete (0, 1, 2, ...). It cannot be 2.5. Always check whether the values are countable or whether they fall within a continuous range.
    • ❌ Incorrectly defining the range. For an experiment of drawing 3 balls from a bag of 5 red and 5 blue balls, if X is the number of red balls drawn, the range is {0, 1, 2, 3}, not all real numbers from 0 to 3.

---

Practice Questions

:::question type="MCQ" question="Which of the following is an example of a discrete random variable?" options=["The height of a building", "The time taken to complete a race", "The number of defective items in a shipment of 1000 items", "The temperature of a room in Celsius"] answer="The number of defective items in a shipment of 1000 items" hint="A discrete variable's values can be counted. Which of the options represents a counted quantity rather than a measured one?" solution="Let's analyze the options.

  • The height of a building is a measurement and can take any value within a range, making it continuous.

  • The time taken to complete a race is also a measurement and is continuous.

  • The number of defective items is a count (0, 1, 2, ..., 1000). This set of values is finite and countable. Therefore, it is a discrete random variable.

  • The temperature of a room is a measurement and is continuous.

Answer: \boxed{\text{The number of defective items in a shipment of 1000 items}}"
:::

:::question type="NAT" question="A box contains 4 red and 3 green balls. An experiment consists of drawing 2 balls from the box without replacement. Let the random variable Y be the number of green balls drawn. What is the number of distinct values that Y can take?" answer="3" hint="Consider the minimum and maximum number of green balls you can possibly draw in this experiment." solution="Let Y be the number of green balls drawn.
The experiment involves drawing 2 balls from a total of 7.

  • We could draw 0 green balls (meaning 2 red balls are drawn). This is possible since there are 4 red balls. So, Y can be 0.

  • We could draw 1 green ball and 1 red ball. This is possible. So, Y can be 1.

  • We could draw 2 green balls. This is possible since there are 3 green balls. So, Y can be 2.

  • We cannot draw 3 green balls, as we are only drawing 2 balls in total.


The set of possible values for Y is {0, 1, 2}.
The number of distinct values is 3.
Answer: \boxed{3}"
:::

:::question type="MSQ" question="Let S be the sample space of a random experiment. Let X be a random variable defined as a function X: S β†’ ℝ. Which of the following statements are ALWAYS true?" options=["X is always a discrete random variable.", "The range of X is a subset of the real numbers.", "If the sample space S is finite, then X must be a discrete random variable.", "If X is a continuous random variable, the sample space S must be infinite."] answer="The range of X is a subset of the real numbers.,If the sample space S is finite, then X must be a discrete random variable.,If X is a continuous random variable, the sample space S must be infinite." hint="Review the fundamental definition of a random variable and the properties of discrete vs. continuous variables." solution="Let's evaluate each statement:

  • 'X is always a discrete random variable.' This is false, as continuous random variables (e.g., height, temperature) exist.

  • 'The range of X is a subset of the real numbers.' This is true by the formal definition of a random variable, which maps outcomes from the sample space S to the set of real numbers ℝ.

  • 'If the sample space S is finite, then X must be a discrete random variable.' This is true. A function defined on a finite set can only produce a finite number of distinct outputs. A random variable with a finite range is, by definition, discrete.

  • 'If X is a continuous random variable, the sample space S must be infinite.' This is true. A continuous random variable has an uncountable range (an interval). A function cannot map a finite or countably infinite domain onto an uncountable range. Therefore, the sample space S must be infinite.

Answer: \boxed{\text{The range of X is a subset of the real numbers.,If the sample space S is finite, then X must be a discrete random variable.,If X is a continuous random variable, the sample space S must be infinite.}}"
:::

---

Summary

❗ Key Takeaways for GATE

  • A Random Variable is a function that maps outcomes from a sample space S to the set of real numbers ℝ.

  • The primary classification is between Discrete Random Variables (countable values, e.g., number of defects) and Continuous Random Variables (values in an interval, e.g., height or weight).

  • The first step in any random variable problem is to correctly identify its type, as this determines the entire analytical approach.

---

What's Next?

πŸ’‘ Continue Learning

This topic is the foundation for understanding how we model random phenomena numerically. Your understanding will be deepened by studying:

  • Probability Distributions: How probabilities are assigned to the values of a random variable. This involves learning about Probability Mass Functions (PMF) for discrete variables and Probability Density Functions (PDF) for continuous variables.

  • Expectation and Variance: How to calculate the central tendency (mean) and spread (variance) of a random variable, which are crucial measures for summarizing its behavior.

Mastering the definition and classification of random variables is essential before proceeding to these more advanced concepts.

---

πŸ’‘ Moving Forward

Now that you understand Definition of Random Variables, let's explore Measures of Central Tendency and Dispersion, which builds on these concepts.

---

## Part 2: Measures of Central Tendency and Dispersion

Introduction

In the study of random variables, it is often insufficient to know only the full probability distribution. For many applications in data analysis and statistical inference, we require concise numerical summaries that describe the essential features of the distribution. These summaries are broadly categorized into two types: measures of central tendency and measures of dispersion.

Measures of central tendency aim to identify a single value that represents the "center" or "typical" value of a random variable. The most common of these is the mean, or expected value, which provides a long-run average of the outcomes. Measures of dispersion, conversely, quantify the variability or spread of the random variable's possible values around this central point. The primary measures here are the variance and its square root, the standard deviation. A thorough understanding of these measures is not merely a procedural exercise; it is fundamental to interpreting probabilistic models and making informed decisions based on data. In the context of the GATE examination, questions frequently test the direct calculation of these measures as well as their known properties for standard probability distributions.

πŸ“– Summary Statistic

A summary statistic is a single number that is computed from a probability distribution (or a sample of data) to summarize a specific characteristic of that distribution. Measures of central tendency and dispersion are the most fundamental types of summary statistics.

---

Measures of Central Tendency

These measures provide a single value that attempts to describe a set of data by identifying the central position within that set of data.

1. Mean (Expected Value)

The most important measure of central tendency is the mean, also known as the expected value. For a random variable X, its expected value, denoted as E[X] or \mu_X, represents the weighted average of all possible values that X can take, where the weights are the corresponding probabilities.

πŸ“– Expected Value (Mean)

The expected value of a random variable is the long-run average value over many repetitions of the experiment it represents. It is the center of mass of the probability distribution.

For a discrete random variable X with a set of possible values S and a probability mass function (PMF) p(x), the expected value is the sum of each value multiplied by its probability.

πŸ“ Expected Value of a Discrete Random Variable
E[X] = \mu_X = \sum_{x \in S} x \cdot p(x)

Variables:

  • x: A possible value of the random variable X.

  • p(x): The probability that X takes the value x, i.e., P(X = x).

  • S: The set of all possible values of X (its support).

When to use: When given the PMF of a discrete random variable and asked to find its mean.

For a continuous random variable X with a probability density function (PDF) f(x), the expected value is found by integrating the product of x and f(x) over the entire range of X.

πŸ“ Expected Value of a Continuous Random Variable
E[X] = \mu_X = \int_{-\infty}^{\infty} x \cdot f(x) \,dx

Variables:

  • x: A variable representing the values of the random variable X.

  • f(x): The probability density function of X.

When to use: When given the PDF of a continuous random variable and asked to find its mean.
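As a numerical cross-check of the integral formula, the sketch below approximates E[X] with a midpoint Riemann sum for a PDF of our own choosing: the uniform density f(x) = 1/2 on [0, 2], whose exact mean is (0 + 2)/2 = 1.

```python
# Numerical sketch of the continuous-mean formula E[X] = integral of x*f(x) dx.
# The PDF here is an illustrative choice: uniform density f(x) = 1/2 on [0, 2].
def f(x):
    return 0.5 if 0.0 <= x <= 2.0 else 0.0

n = 10_000
dx = 2.0 / n
# Midpoint Riemann sum: accumulate x_i * f(x_i) * dx at interval midpoints.
mean = sum(((i + 0.5) * dx) * f((i + 0.5) * dx) * dx for i in range(n))
print(round(mean, 6))  # 1.0
```

Replacing `f` with any other valid PDF (and the limits with its support) approximates the corresponding mean in the same way.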

Worked Example:

Problem: A discrete random variable X has the following probability mass function: P(X=1) = 0.2, P(X=2) = 0.5, and P(X=3) = 0.3. Calculate the mean of X.

Solution:

Step 1: Identify the values of x and their corresponding probabilities p(x).
The values are x_1 = 1, x_2 = 2, x_3 = 3.
The probabilities are p(1) = 0.2, p(2) = 0.5, p(3) = 0.3.

Step 2: Apply the formula for the expected value of a discrete random variable.

E[X] = \sum_{i=1}^{3} x_i \cdot p(x_i)

Step 3: Substitute the values and compute the sum.

E[X] = (1 \times 0.2) + (2 \times 0.5) + (3 \times 0.3)
E[X] = 0.2 + 1.0 + 0.9

Step 4: Calculate the final result.

E[X] = 2.1

Answer: The mean of the random variable X is 2.1.
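The same computation can be expressed as a short Python sketch over the PMF:

```python
# The worked example above, as code: E[X] = sum of x*p(x) for the PMF
# P(X=1)=0.2, P(X=2)=0.5, P(X=3)=0.3.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}
assert abs(sum(pmf.values()) - 1.0) < 1e-9  # a valid PMF must sum to 1
mean = sum(x * p for x, p in pmf.items())
print(round(mean, 6))  # 2.1
```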

2. Median

The median is the value that separates the higher half from the lower half of a probability distribution. For a continuous random variable X with cumulative distribution function (CDF) F_X(x), the median m is the value such that P(X \le m) = 0.5.

πŸ“– Median

The median m of a random variable X is any value such that P(X \le m) \ge 0.5 and P(X \ge m) \ge 0.5. For a continuous random variable with CDF F_X(x), it is the value m for which F_X(m) = 0.5.

We observe that the median is less sensitive to extreme values (outliers) than the mean, a property known as robustness.

3. Mode

The mode of a random variable is the value at which its PMF or PDF attains its maximum. A distribution can have one mode (unimodal), two modes (bimodal), or more (multimodal).

πŸ“– Mode

The mode of a random variable X is the value that is most likely to occur. For a discrete random variable, it is the value x that maximizes the PMF p(x). For a continuous random variable, it is the value x that maximizes the PDF f(x).

---

Measures of Dispersion

While central tendency tells us about the location of a distribution, measures of dispersion describe its spread or variability.

1. Variance

Variance is the most common measure of dispersion. It quantifies the spread of a random variable's values around its mean. Specifically, it is the expected value of the squared deviation from the mean.

πŸ“– Variance

The variance of a random variable X with mean \mu = E[X], denoted as Var(X) or \sigma^2, is defined as Var(X) = E[(X - \mu)^2]. A small variance indicates that the values tend to be very close to the mean, while a high variance indicates that the values are spread out over a wider range.

While the definition is intuitive, a more practical formula, often called the computational formula, is used for calculations. We can derive it as follows:

Var(X) = E[(X - \mu)^2]
= E[X^2 - 2\mu X + \mu^2]

By linearity of expectation,

= E[X^2] - E[2\mu X] + E[\mu^2]

Since \mu is a constant, E[2\mu X] = 2\mu E[X] = 2\mu^2 and E[\mu^2] = \mu^2.

= E[X^2] - 2\mu^2 + \mu^2
= E[X^2] - \mu^2
= E[X^2] - (E[X])^2

πŸ“ Variance (Computational Formula)
Var(X) = \sigma^2 = E[X^2] - (E[X])^2

Variables:

  • E[X^2]: The expected value of the square of the random variable.

  • (E[X])^2: The square of the expected value of the random variable.

When to use: This is the preferred formula for most GATE problems as it simplifies calculation. It requires computing two expectations: E[X] and E[X^2].

2. Standard Deviation

The standard deviation is simply the positive square root of the variance. Its primary advantage is that it is expressed in the same units as the random variable, making it more interpretable than the variance.

πŸ“– Standard Deviation

The standard deviation of a random variable X, denoted by \sigma, is the positive square root of its variance.

\sigma = \sqrt{Var(X)}

Worked Example:

Problem: For the discrete random variable X from the previous example with P(X=1) = 0.2, P(X=2) = 0.5, and P(X=3) = 0.3, calculate the variance and standard deviation. We already found that E[X] = 2.1.

Solution:

Step 1: Calculate E[X^2] using its definition for a discrete random variable.

E[X^2] = \sum_{i=1}^{3} x_i^2 \cdot p(x_i)

Step 2: Substitute the values.

E[X^2] = (1^2 \times 0.2) + (2^2 \times 0.5) + (3^2 \times 0.3)
E[X^2] = (1 \times 0.2) + (4 \times 0.5) + (9 \times 0.3)
E[X^2] = 0.2 + 2.0 + 2.7
E[X^2] = 4.9

Step 3: Apply the computational formula for variance.

Var(X) = E[X^2] - (E[X])^2

Step 4: Substitute the calculated values of E[X^2] and E[X].

Var(X) = 4.9 - (2.1)^2
Var(X) = 4.9 - 4.41
Var(X) = 0.49

Step 5: Calculate the standard deviation by taking the square root of the variance.

\sigma = \sqrt{Var(X)} = \sqrt{0.49}
\sigma = 0.7

Answer: The variance is 0.49 and the standard deviation is 0.7.
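The variance computation above can be mirrored in a few lines of Python using the computational formula:

```python
import math

# Variance and standard deviation for the same PMF, via the computational
# formula Var(X) = E[X^2] - (E[X])^2 used in the worked example above.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}
mean = sum(x * p for x, p in pmf.items())         # E[X]   = 2.1
mean_sq = sum(x * x * p for x, p in pmf.items())  # E[X^2] = 4.9
variance = mean_sq - mean ** 2                    # 4.9 - 4.41 = 0.49
std_dev = math.sqrt(variance)                     # sqrt(0.49) = 0.7
print(round(variance, 6), round(std_dev, 6))  # 0.49 0.7
```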

---

Mean and Variance of Standard Distributions

For the GATE exam, it is imperative to know the mean and variance of several standard probability distributions by heart. Direct application of these formulas can save significant time.

| Distribution | Parameters | Mean (E[X]) | Variance (Var(X)) |
| :--- | :--- | :--- | :--- |
| Binomial | n (trials), p (success prob.) | np | np(1-p) |
| Poisson | \lambda (rate) | \lambda | \lambda |
| Uniform (Continuous) | a (min), b (max) | \frac{a+b}{2} | \frac{(b-a)^2}{12} |
| Exponential | \lambda (rate) | \frac{1}{\lambda} | \frac{1}{\lambda^2} |
| Normal | \mu (mean), \sigma^2 (variance) | \mu | \sigma^2 |
| Standard Normal | \mu = 0, \sigma^2 = 1 | 0 | 1 |

❗ Must Remember

The properties of the Poisson and Standard Normal distributions are frequently tested.

  • For a Poisson random variable, the mean and variance are identical: E[X] = Var(X) = \lambda.

  • For a Standard Normal random variable, the mean is 0 and the variance is 1.
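The closed forms in the table can be sanity-checked from first principles. The sketch below computes the Binomial mean and variance directly from the PMF and compares them with np and np(1-p); the parameters n = 10, p = 0.3 are an arbitrary illustrative choice.

```python
from math import comb

# Sanity-checking the Binomial row of the table from first principles.
# Parameters n = 10, p = 0.3 are arbitrary illustrative choices.
n, p = 10, 0.3
# Binomial PMF: P(X = k) = C(n, k) * p^k * (1-p)^(n-k)
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

mean = sum(k * q for k, q in pmf.items())
var = sum(k * k * q for k, q in pmf.items()) - mean**2

print(round(mean, 6), round(var, 6))  # matches n*p = 3.0 and n*p*(1-p) = 2.1
```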

---

Problem-Solving Strategies

A strategic approach is essential for solving problems under time constraints.

πŸ’‘ GATE Strategy: Recognize the Distribution

Before starting a lengthy calculation from a PMF or PDF, always check whether the problem describes a standard distribution (Binomial, Poisson, etc.). If it does, you can use the known formulas for mean and variance directly, which is significantly faster than calculating from first principles.

πŸ’‘ GATE Strategy: Use Properties of Expectation and Variance

For a transformed random variable Y = aX + b, where a and b are constants:

  • E[Y] = E[aX + b] = aE[X] + b

  • Var(Y) = Var(aX + b) = a^2 Var(X)

These properties are extremely useful for simplifying problems. Note that the additive constant b does not affect the variance, as it simply shifts the distribution without changing its spread.
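These two transformation rules can be verified numerically; the PMF and constants below are arbitrary choices made for illustration:

```python
# Verifying the two rules above on a small PMF (chosen for illustration):
# for Y = aX + b, E[Y] = a*E[X] + b and Var(Y) = a^2 * Var(X).
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
a, b = 3, -2

ex = sum(x * p for x, p in pmf.items())                 # E[X]   = 1.0
var_x = sum(x * x * p for x, p in pmf.items()) - ex**2  # Var(X) = 0.5

# Push the PMF through Y = aX + b and recompute both moments from scratch.
pmf_y = {a * x + b: p for x, p in pmf.items()}
ey = sum(y * p for y, p in pmf_y.items())
var_y = sum(y * y * p for y, p in pmf_y.items()) - ey**2

print(ey, a * ex + b)       # 1.0 1.0
print(var_y, a**2 * var_x)  # 4.5 4.5
```

Recomputing the moments from the transformed PMF, rather than from the formulas, is what makes this an independent check of the properties.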

---

Common Mistakes

Awareness of common pitfalls can prevent the loss of valuable marks.

⚠️ Avoid These Errors
  • ❌ Confusing Standard Deviation and Variance: Students often provide the variance when the standard deviation is asked, or vice versa.
βœ… Always double-check what the question asks for. Remember \sigma = \sqrt{\sigma^2}.
  • ❌ Incorrectly Applying Variance Properties: A frequent error is to write Var(aX+b) = aVar(X) + b.
βœ… The correct property is Var(aX+b) = a^2 Var(X). The constant b shifts the mean but not the spread, and the scaling factor a is squared.
  • ❌ Error in the Computational Formula: Forgetting to square the mean in the variance formula, i.e., using E[X^2] - E[X] instead of E[X^2] - (E[X])^2.
βœ… Always be careful to square the entire term E[X].

---

Practice Questions

:::question type="MCQ" question="A random variable X has the probability mass function P(X=0) = 1/3, P(X=1) = 1/2, and P(X=2) = 1/6. What is the variance of the random variable Y = 2X - 3?" options=["17/36","17/9","34/9","5/6"] answer="17/9" hint="First, find the variance of X using the computational formula Var(X) = E[X^2] - (E[X])^2. Then, use the property Var(aX+b) = a^2 Var(X)." solution="
Step 1: Calculate the mean of X, E[X].

E[X] = \left(0 \times \frac{1}{3}\right) + \left(1 \times \frac{1}{2}\right) + \left(2 \times \frac{1}{6}\right)
E[X] = 0 + \frac{1}{2} + \frac{1}{3} = \frac{5}{6}

Step 2: Calculate E[X^2].

E[X^2] = \left(0^2 \times \frac{1}{3}\right) + \left(1^2 \times \frac{1}{2}\right) + \left(2^2 \times \frac{1}{6}\right)
E[X^2] = 0 + \frac{1}{2} + \frac{4}{6} = \frac{1}{2} + \frac{2}{3} = \frac{7}{6}

Step 3: Calculate the variance of X, Var(X).

Var(X) = E[X^2] - (E[X])^2
Var(X) = \frac{7}{6} - \left(\frac{5}{6}\right)^2 = \frac{7}{6} - \frac{25}{36}
Var(X) = \frac{42 - 25}{36} = \frac{17}{36}

Step 4: Calculate the variance of Y = 2X - 3.

Var(Y) = Var(2X - 3) = 2^2 Var(X)
Var(Y) = 4 \times \frac{17}{36} = \frac{17}{9}

Result: The variance of Y is 17/9.
"
:::

:::question type="NAT" question="A continuous random variable X has a probability density function given by f(x) = \frac{3}{8}x^2 for 0 \le x \le 2, and f(x) = 0 otherwise. Calculate the mean of X." answer="1.5" hint="Use the formula for the expected value of a continuous random variable: E[X] = \int x \cdot f(x) \,dx over the defined range." solution="
Step 1: Set up the integral for the expected value, E[X].

E[X] = \int_{-\infty}^{\infty} x \cdot f(x) \,dx

Step 2: Substitute the given PDF and adjust the integration limits.

E[X] = \int_{0}^{2} x \cdot \left(\frac{3}{8}x^2\right) \,dx

Step 3: Simplify the integrand.

E[X] = \frac{3}{8} \int_{0}^{2} x^3 \,dx

Step 4: Evaluate the integral.

E[X] = \frac{3}{8} \left[ \frac{x^4}{4} \right]_{0}^{2}

Step 5: Substitute the limits of integration.

E[X] = \frac{3}{8} \left( \frac{2^4}{4} - \frac{0^4}{4} \right)
E[X] = \frac{3}{8} \left( \frac{16}{4} - 0 \right)
E[X] = \frac{3}{8} \times 4

Result:

E[X] = \frac{12}{8} = \frac{3}{2} = 1.5
"
:::

:::question type="MSQ" question="Let X be a random variable with mean E[X] = 10 and variance Var(X) = 4. Let a new random variable be defined as Y = 5 - 2X. Which of the following statements is/are correct?" options=["The mean of Y is -15.","The variance of Y is -8.","The variance of Y is 16.","The standard deviation of Y is 4."] answer="The mean of Y is -15.,The variance of Y is 16.,The standard deviation of Y is 4." hint="Apply the properties of expectation and variance for a linear transformation: E[aX+b] = aE[X] + b and Var(aX+b) = a^2 Var(X)." solution="
Let us analyze each statement.
The transformation is Y = 5 - 2X. Here, a = -2 and b = 5.

Statement 1: The mean of Y is -15.
Using the property of expectation:

E[Y] = E[5 - 2X] = 5 - 2E[X]

Substituting E[X] = 10:
E[Y] = 5 - 2(10) = 5 - 20 = -15

Thus, this statement is correct.

Statement 2: The variance of Y is -8.
Variance can never be negative. Thus, this statement is incorrect.

Statement 3: The variance of Y is 16.
Using the property of variance:

Var(Y) = Var(5 - 2X) = (-2)^2 Var(X)

Substituting Var(X) = 4:
Var(Y) = 4 \times 4 = 16

Thus, this statement is correct.

Statement 4: The standard deviation of Y is 4.
The standard deviation is the square root of the variance.

SD(Y) = \sqrt{Var(Y)} = \sqrt{16} = 4

Thus, this statement is correct.

Therefore, the correct options are: "The mean of Y is -15.", "The variance of Y is 16.", and "The standard deviation of Y is 4."
"
:::

    :::question type="NAT" question="The number of defects on a semiconductor wafer follows a Poisson distribution with a mean of 2 defects per wafer. What is the standard deviation of the number of defects per wafer?" answer="1.414" hint="Recall the key property of a Poisson distribution regarding its mean and variance." solution="
    Step 1: Identify the distribution and its parameter.
    The problem states that the number of defects follows a Poisson distribution. The mean is given as 2 defects per wafer. For a Poisson distribution, the parameter \lambda is equal to the mean.
    So, \lambda = 2.

    Step 2: Recall the formula for the variance of a Poisson distribution.
    For a Poisson random variable X with parameter \lambda, the variance is given by:

    Var(X) = \lambda

    Step 3: Calculate the variance.

    Var(X) = 2

    Step 4: Calculate the standard deviation.
    The standard deviation \sigma is the square root of the variance.

    \sigma = \sqrt{Var(X)} = \sqrt{2}

    Step 5: Compute the numerical value.

    \sigma \approx 1.41421...

    Result: The standard deviation, rounded to three decimal places, is 1.414.
    "
    :::
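The mean-equals-variance property of the Poisson distribution can also be checked by simulation. This is an illustrative sketch using Knuth's multiplication algorithm for sampling; the sample size and seed are arbitrary choices, not part of the question:

```python
import math
import random

random.seed(42)

def poisson_sample(lam):
    """Draw one Poisson(lam) variate via Knuth's multiplication algorithm."""
    threshold = math.exp(-lam)
    k, prod = 0, 1.0
    while prod > threshold:
        k += 1
        prod *= random.random()
    return k - 1

samples = [poisson_sample(2.0) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
sample_std = math.sqrt(sample_var)

# Both sample_mean and sample_var should be close to 2,
# and sample_std close to sqrt(2) ~ 1.414.
```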

    :::question type="MCQ" question="A fair six-sided die is rolled 10 times. Let X be the number of times the number '4' appears. What is the variance of X?" options=["50/36","5/3","25/18","5/6"] answer="25/18" hint="Recognize that this scenario describes a Binomial experiment. Identify the parameters n and p, and then use the formula for the variance of a Binomial distribution." solution="
    Step 1: Identify the type of random variable.
    The experiment consists of a fixed number of independent trials (n = 10). Each trial has two outcomes: success (rolling a '4') or failure (not rolling a '4'). The probability of success is constant for each trial. This is the definition of a Binomial experiment.

    Step 2: Determine the parameters of the Binomial distribution.
    The number of trials is n = 10.
    The probability of success, p, is the probability of rolling a '4' on a fair die, which is p = 1/6.
    The probability of failure is q = 1 - p = 1 - 1/6 = 5/6.

    Step 3: Apply the formula for the variance of a Binomial random variable.
    The variance of a Binomial distribution is given by Var(X) = np(1-p), or npq.

    Step 4: Substitute the parameters and calculate the variance.

    Var(X) = 10 \times \frac{1}{6} \times \frac{5}{6} = \frac{50}{36}

    Step 5: Simplify the fraction.

    Var(X) = \frac{25}{18}

    Result: The variance of X is 25/18.
    "
    :::
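Rather than trusting the np(1-p) shortcut blindly, the variance can be computed exactly from the Binomial PMF itself. A small sketch with exact rational arithmetic (variable names are illustrative):

```python
from fractions import Fraction
from math import comb

n, p = 10, Fraction(1, 6)          # parameters from the question
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

mean = sum(k * q for k, q in pmf.items())
var = sum(k**2 * q for k, q in pmf.items()) - mean**2

# mean comes out to 5/3 = n*p, and var to 25/18 = n*p*(1-p), exactly.
```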

    ---

    Summary

    A firm grasp of central tendency and dispersion is non-negotiable for success in the Probability and Statistics section of the GATE exam. These concepts form the bedrock upon which more complex statistical ideas are built.

    ❗ Key Takeaways for GATE

    • Mean and Variance Definitions: Be fluent in calculating the mean (E[X]) and variance (Var(X)) for both discrete (using summation) and continuous (using integration) random variables.

    • The Computational Formula is Key: For variance calculations, almost always use

      Var(X) = E[X^2] - (E[X])^2

      It is faster and less prone to error than the definitional formula.
    • Memorize Standard Distributions: You must be able to instantly recall the mean and variance for the Binomial, Poisson, Uniform, and Normal distributions. Questions often test these properties directly, and recognizing them saves critical time. The facts that E[X] = Var(X) = \lambda for the Poisson distribution and E[X] = 0, Var(X) = 1 for the Standard Normal are particularly high-yield.

    ---

    What's Next?

    Mastery of these fundamental measures prepares you for more advanced topics in probability theory.

    πŸ’‘ Continue Learning

    This topic connects to:

      • Probability Distributions: The mean and variance are defining characteristics of any probability distribution. A deep understanding of PMFs and PDFs is required to derive these measures from first principles.

      • Covariance and Correlation: These concepts extend the idea of variance to two random variables. Covariance measures how two variables change together, building directly on the concepts of expectation and deviation from the mean.

      • Chebyshev's Inequality: This powerful theorem uses the mean and standard deviation to provide a bound on the probability that a random variable lies a certain distance from its mean, regardless of the underlying distribution.


    Master these connections for comprehensive GATE preparation!

    ---

    πŸ’‘ Moving Forward

    Now that you understand Measures of Central Tendency and Dispersion, let's explore Correlation and Covariance, which build on these concepts.

    ---

    Part 3: Correlation and Covariance

    Introduction

    In our study of random variables, we have thus far focused primarily on the properties of a single variable, such as its mean and variance. The variance, in particular, quantifies the spread or dispersion of a variable's distribution around its mean. However, in many practical applications, we are interested in understanding the relationship between two or more random variables. Do they tend to move in the same direction, in opposite directions, or is there no discernible pattern to their joint behavior?

    This chapter introduces two fundamental statistical measures that address this question: covariance and correlation. Covariance provides a measure of the joint variability of two random variables, indicating the direction of their linear relationship. Correlation, a standardized version of covariance, goes a step further by also quantifying the strength of this linear association. A firm grasp of these concepts is indispensable, as they form the bedrock of more advanced topics such as regression analysis and portfolio theory, and are frequently tested in the GATE examination.

    πŸ“– Covariance

    The covariance between two random variables, X and Y, with expected values E[X] = \mu_X and E[Y] = \mu_Y, is defined as the expected value of the product of their deviations from their respective means.

    Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]

    ---

    Key Concepts

    1. Understanding and Calculating Covariance

    The definition of covariance, E[(X - \mu_X)(Y - \mu_Y)], provides significant intuition. Consider the term (X - \mu_X)(Y - \mu_Y). If X and Y tend to be simultaneously above their means or simultaneously below their means, this product will be positive on average. This results in a positive covariance, suggesting a positive linear relationship. Conversely, if one variable tends to be above its mean when the other is below its mean, the product will be negative on average, yielding a negative covariance. If there is no consistent linear pattern, the positive and negative products will tend to cancel out, resulting in a covariance near zero.




    (Figure: scatter plots illustrating positive covariance, negative covariance, and zero covariance.)






    While the definitional formula is useful for conceptual understanding, a more practical formula is often used for computation, especially with discrete random variables. We can expand the definition:

    \begin{aligned}Cov(X, Y) & = E[XY - X\mu_Y - Y\mu_X + \mu_X\mu_Y] \\ & = E[XY] - E[X\mu_Y] - E[Y\mu_X] + E[\mu_X\mu_Y] \\ & = E[XY] - \mu_Y E[X] - \mu_X E[Y] + \mu_X\mu_Y \\ & = E[XY] - \mu_Y \mu_X - \mu_X \mu_Y + \mu_X\mu_Y \\ & = E[XY] - \mu_X \mu_Y\end{aligned}

    This leads to the widely used computational formula.

    πŸ“ Computational Formula for Covariance
    Cov(X, Y) = E[XY] - E[X]E[Y]

    Variables:

      • E[XY] = The expected value of the product of the random variables X and Y.

      • E[X] = The expected value of X.

      • E[Y] = The expected value of Y.


    When to use: This formula is almost always more convenient for calculation, particularly for discrete random variables where a joint probability mass function is available.

    Worked Example:

    Problem: A fair six-sided die is rolled. Let X be a random variable that is 1 if the outcome is even and 0 if it is odd. Let Y be a random variable that is 1 if the outcome is greater than 3, and 0 otherwise. Calculate the covariance between X and Y.

    Solution:

    Step 1: Define the sample space and probabilities.
    The sample space is S = \{1, 2, 3, 4, 5, 6\}, with each outcome having a probability of 1/6.

    Step 2: Determine the values of X and Y for each outcome and calculate E[X] and E[Y].

    • For X: X = 1 for \{2, 4, 6\}; X = 0 for \{1, 3, 5\}.

    • For Y: Y = 1 for \{4, 5, 6\}; Y = 0 for \{1, 2, 3\}.


    P(X=1) = 3/6 = 1/2, P(X=0) = 3/6 = 1/2.
    E[X] = \left(1 \cdot \frac{1}{2}\right) + \left(0 \cdot \frac{1}{2}\right) = \frac{1}{2}

    P(Y=1) = 3/6 = 1/2, P(Y=0) = 3/6 = 1/2.

    E[Y] = \left(1 \cdot \frac{1}{2}\right) + \left(0 \cdot \frac{1}{2}\right) = \frac{1}{2}

    Step 3: Determine the value of the product XY for each outcome and calculate E[XY].
    We need to find the outcomes where XY = 1. This occurs only when both X = 1 (even) and Y = 1 (greater than 3). The outcomes satisfying this are \{4, 6\}.
    Thus, P(XY=1) = P(\{4, 6\}) = 2/6 = 1/3. For all other outcomes, XY = 0.

    E[XY] = \left(1 \cdot P(XY=1)\right) + \left(0 \cdot P(XY=0)\right) = 1 \cdot \frac{1}{3} = \frac{1}{3}

    Step 4: Apply the computational formula for covariance.

    Cov(X, Y) = E[XY] - E[X]E[Y] = \frac{1}{3} - \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = \frac{1}{3} - \frac{1}{4} = \frac{4 - 3}{12} = \frac{1}{12}

    Answer: The covariance between X and Y is 1/12.
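This worked example is small enough to verify by brute-force enumeration of the six die outcomes. A quick sketch with exact fractions:

```python
from fractions import Fraction

p = Fraction(1, 6)
outcomes = range(1, 7)

X = lambda w: 1 if w % 2 == 0 else 0   # even-face indicator
Y = lambda w: 1 if w > 3 else 0        # greater-than-3 indicator

EX = sum(X(w) * p for w in outcomes)
EY = sum(Y(w) * p for w in outcomes)
EXY = sum(X(w) * Y(w) * p for w in outcomes)

cov = EXY - EX * EY                    # 1/3 - 1/4 = 1/12
```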

    ---

    2. Properties of Covariance

    Understanding the properties of covariance is critical for simplifying complex expressions, a common requirement in GATE problems.

  • Symmetry: Cov(X, Y) = Cov(Y, X)

  • Covariance with a Constant: The covariance of a random variable with a constant c is always zero.

    Cov(X, c) = 0

  • Relationship with Variance: The covariance of a random variable with itself is its variance.

    Cov(X, X) = E[X \cdot X] - E[X]E[X] = E[X^2] - (E[X])^2 = Var(X)

  • Effect of Scaling and Shifting (Linear Transformations): For constants a, b, c, d:

    Cov(aX + b, cY + d) = ac \cdot Cov(X, Y)

    Notice that the additive constants b and d do not affect the covariance, as they do not change the spread of the variables.

  • Bilinearity (Additivity):

    Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)

    Cov(X, Y + Z) = Cov(X, Y) + Cov(X, Z)

    Worked Example:

    Problem: Let X be a random variable with Var(X) = 9. Let Y = 5 - 2X. Calculate Cov(X, Y).

    Solution:

    Step 1: Identify the relationship between X and Y.
    We are given Y as a linear transformation of X: Y = -2X + 5.

    Step 2: Apply the property of covariance under linear transformations.
    We need to find Cov(X, Y) = Cov(X, -2X + 5).
    Let a = 1, b = 0, c = -2, d = 5. The general property is Cov(aX + b, cX + d) = ac \cdot Cov(X, X).

    Cov(X, -2X + 5) = (1)(-2) \cdot Cov(X, X)

    Step 3: Use the relationship between covariance and variance.
    We know that Cov(X, X) = Var(X).

    Cov(X, Y) = -2 \cdot Var(X)

    Step 4: Substitute the given value of Var(X).
    We are given Var(X) = 9.

    Cov(X, Y) = -2 \cdot 9 = -18

    Answer: The covariance between X and Y is -18.
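The identity Cov(X, 5 - 2X) = -2 Var(X) holds for any X, not only one with Var(X) = 9. As an illustration under that assumption, the sketch below checks it exactly for a fair die, where Var(X) = 35/12:

```python
from fractions import Fraction

p = Fraction(1, 6)
xs = range(1, 7)                     # a fair die as the test distribution

EX = sum(x * p for x in xs)          # 7/2
EX2 = sum(x * x * p for x in xs)     # 91/6
varX = EX2 - EX**2                   # 35/12

# Y = 5 - 2X; compute Cov(X, Y) = E[XY] - E[X]E[Y] from first principles.
EY = sum((5 - 2 * x) * p for x in xs)
EXY = sum(x * (5 - 2 * x) * p for x in xs)
cov = EXY - EX * EY                  # comes out to -2 * Var(X) = -35/6
```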

    ---

    3. Correlation Coefficient

    A significant limitation of covariance is that its magnitude is scale-dependent. If we change the units of X from meters to centimeters, the covariance will increase by a factor of 100, even though the underlying relationship between the variables has not changed. To overcome this, we use the correlation coefficient, which is a normalized measure.

    πŸ“– Pearson Correlation Coefficient (ρ)

    The Pearson correlation coefficient between two random variables, X and Y, is their covariance divided by the product of their standard deviations.

    \rho(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}

    The correlation coefficient, \rho, is a dimensionless quantity that always lies in the range [-1, 1].

    • \rho = +1: Perfect positive linear relationship.

    • \rho = -1: Perfect negative linear relationship.

    • \rho = 0: No linear relationship.

    • Values between 0 and 1 indicate the strength of a positive linear relationship.

    • Values between -1 and 0 indicate the strength of a negative linear relationship.




    πŸ“ Correlation Coefficient

    \rho(X, Y) = \frac{Cov(X, Y)}{\sqrt{Var(X)Var(Y)}}

    Variables:

      • Cov(X, Y) = Covariance of X and Y.

      • Var(X) = Variance of X.

      • Var(Y) = Variance of Y.


    Application: Use this to find a standardized measure of linear association, which is independent of the units of the variables.


    ❗ Correlation vs. Causation

    A non-zero correlation between two variables does not, by itself, imply that one variable causes the other. There could be a third, unobserved variable (a confounding variable) influencing both, or the relationship could be purely coincidental.

    ---

    4. Variance of Sums of Random Variables

    Covariance plays a crucial role in determining the variance of a sum or difference of random variables.

    Let us derive the formula for Var(X+Y):

    \begin{aligned}Var(X+Y) & = E[((X+Y) - E[X+Y])^2] \\
    & = E[((X - E[X]) + (Y - E[Y]))^2] \\
    & = E[(X - E[X])^2 + (Y - E[Y])^2 + 2(X - E[X])(Y - E[Y])]\end{aligned}

    By linearity of expectation:
    Var(X+Y) = E[(X - E[X])^2] + E[(Y - E[Y])^2] + 2E[(X - E[X])(Y - E[Y])]

    This simplifies to:
    Var(X+Y) = Var(X) + Var(Y) + 2Cov(X, Y)

    Similarly, for the difference:

    Var(X-Y) = Var(X) + Var(Y) - 2Cov(X, Y)

    A special and very important case arises when X and Y are independent. If two variables are independent, their covariance is zero (the converse is not always true). For independent variables, the formulas simplify significantly:

    Var(X \pm Y) = Var(X) + Var(Y) \quad \text{(if } X, Y \text{ are independent)}
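Because Var(X ± Y) = Var(X) + Var(Y) ± 2Cov(X, Y) is an algebraic identity, it can be confirmed exactly on any small joint distribution. A sketch using a fair die, with X the face value and Y the indicator of an even face (both defined on the same outcome, hence correlated):

```python
from fractions import Fraction

p = Fraction(1, 6)
outcomes = range(1, 7)
X = {w: w for w in outcomes}                        # the face value
Y = {w: 1 if w % 2 == 0 else 0 for w in outcomes}   # even-face indicator

def E(g):
    """Expectation of g(outcome) under the uniform die."""
    return sum(g(w) * p for w in outcomes)

varX = E(lambda w: X[w] ** 2) - E(lambda w: X[w]) ** 2
varY = E(lambda w: Y[w] ** 2) - E(lambda w: Y[w]) ** 2
cov = E(lambda w: X[w] * Y[w]) - E(lambda w: X[w]) * E(lambda w: Y[w])

# Compute Var(X+Y) and Var(X-Y) directly, without the identity.
var_sum = E(lambda w: (X[w] + Y[w]) ** 2) - E(lambda w: X[w] + Y[w]) ** 2
var_diff = E(lambda w: (X[w] - Y[w]) ** 2) - E(lambda w: X[w] - Y[w]) ** 2
```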

    ---

    Problem-Solving Strategies

    πŸ’‘ GATE Strategy: Discrete Covariance Calculation

    For problems involving discrete random variables derived from an experiment (like coin tosses or dice rolls), follow a systematic procedure:

    • List Outcomes: Enumerate all possible outcomes of the experiment and their probabilities.

    • Create a Joint Table: Construct a table listing each outcome, its probability, and the corresponding values of XX, YY, and the product XYXY.

    • Calculate Marginal PMFs: From the joint table, determine the probability mass functions (PMFs) for XX and YY individually.

    • Compute Expectations: Calculate E[X]E[X] and E[Y]E[Y] using their PMFs.

    • Compute E[XY]E[XY]: Calculate the expected value of the product XYXY directly from the joint table created in Step 2.

    • Apply Formula: Substitute the computed values into Cov(X,Y)=E[XY]βˆ’E[X]E[Y]Cov(X, Y) = E[XY] - E[X]E[Y].

    ---

    Common Mistakes

    ⚠️ Avoid These Errors
      • ❌ Confusing Zero Correlation with Independence.
    βœ… Zero correlation or covariance implies only the absence of a linear relationship. Variables can have a strong non-linear relationship and still have zero correlation. However, if two variables are independent, their covariance is always zero.
      • ❌ Incorrectly Applying Constants in Formulas.
    βœ… Remember that variance scales quadratically while covariance scales linearly: Var(aX) = a^2 Var(X), but Cov(aX, Y) = a \cdot Cov(X, Y).
      • ❌ Assuming Variance of a Sum is the Sum of Variances.
    βœ… The formula Var(X+Y) = Var(X) + Var(Y) is only valid if X and Y are uncorrelated (Cov(X, Y) = 0). For the general case, you must include the covariance term: Var(X+Y) = Var(X) + Var(Y) + 2Cov(X, Y).
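The first pitfall has a classic counterexample: take X uniform on {-1, 0, 1} and Y = X². The sketch below confirms that Cov(X, Y) = 0 even though Y is a deterministic function of X:

```python
from fractions import Fraction

p = Fraction(1, 3)
xs = [-1, 0, 1]                       # X is uniform on these values

EX = sum(x * p for x in xs)           # 0 by symmetry
EY = sum(x**2 * p for x in xs)        # E[X^2] = 2/3
EXY = sum(x * x**2 * p for x in xs)   # E[X^3] = 0 by symmetry

cov = EXY - EX * EY                   # exactly zero
# Yet Y = X^2 is completely determined by X:
# P(Y=1 | X=1) = 1, while P(Y=1) = 2/3, so X and Y are dependent.
```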

    ---

    Practice Questions

    :::question type="MCQ" question="Let W be a random variable with E[W] = 0 and Var(W) = 1. A new random variable V is defined as V = 3 - 4W. Given that E[(V - E[V])^2] = 16 and E[(V - E[V])W] = -4, which of the following statements is consistent?" options=["The definition of V is consistent with the given expected values.","The definition of V is inconsistent because the variance is incorrect.","The definition of V is inconsistent because the covariance term is incorrect.","The data is insufficient to determine consistency."] answer="The definition of V is consistent with the given expected values." hint="Use the properties of expectation, variance, and covariance on the linear transformation V = 3 - 4W and check if the results match the given values." solution="
    Step 1: Calculate the expected value of V from its definition.

    E[V] = E[3 - 4W] = 3 - 4E[W]

    Since E[W] = 0, we have E[V] = 3.

    Step 2: Calculate the variance of V from its definition.
    The given term E[(V - E[V])^2] is the definition of Var(V). So, we are given Var(V) = 16.
    Let's calculate Var(V) from the transformation:

    Var(V) = Var(3 - 4W) = (-4)^2 Var(W)

    Since Var(W) = 1, we have Var(V) = 16 \cdot 1 = 16. This matches the given information.

    Step 3: Calculate the covariance term from its definition.
    The given term E[(V - E[V])W] is a form of covariance. Since E[W] = 0, this is E[(V - E[V])(W - E[W])] = Cov(V, W). So, we are given Cov(V, W) = -4.
    Let's calculate Cov(V, W) from the transformation:

    Cov(V, W) = Cov(3 - 4W, W) = Cov(-4W, W)

    Using properties, Cov(-4W, W) = -4 \cdot Cov(W, W) = -4 \cdot Var(W).
    Since Var(W) = 1, we have Cov(V, W) = -4 \cdot 1 = -4. This also matches the given information.

    Answer: \boxed{The definition of V is consistent with the given expected values.}
    "
    :::

    :::question type="NAT" question="Two balls are drawn with replacement from an urn containing 3 red balls and 2 blue balls. Let the random variable X be the number of red balls drawn, and let the random variable Y be the number of blue balls drawn. The value of the covariance of X and Y, Cov(X, Y), is ______ (rounded off to two decimal places)." answer="-0.48" hint="Notice that X + Y = 2 is a constant. Use this relationship and the properties of variance and covariance." solution="
    Step 1: Establish the relationship between X and Y.
    Since two balls are drawn in total, the number of red balls (X) plus the number of blue balls (Y) must equal 2.

    X + Y = 2

    Step 2: Use the property of variance of a constant.
    The variance of a constant is zero.

    Var(X+Y) = Var(2) = 0

    Step 3: Expand Var(X+Y) using the formula involving covariance.

    Var(X+Y) = Var(X) + Var(Y) + 2Cov(X, Y)

    Step 4: Equate the expressions from Step 2 and Step 3.

    Var(X) + Var(Y) + 2Cov(X, Y) = 0

    This implies:
    Cov(X, Y) = -\frac{1}{2}(Var(X) + Var(Y))

    Step 5: Calculate Var(X) and Var(Y).
    The drawing of each ball is a Bernoulli trial. Let a "success" be drawing a red ball. The probability of success is p = 3/5 = 0.6. Since we draw n = 2 balls with replacement, X follows a binomial distribution B(n=2, p=0.6).
    The variance of a binomial distribution is np(1-p).

    Var(X) = 2 \cdot 0.6 \cdot 0.4 = 0.48

    Similarly, Y is the number of blue balls, so it follows B(n=2, p=0.4).

    Var(Y) = 2 \cdot 0.4 \cdot 0.6 = 0.48

    Step 6: Substitute the variances into the equation for covariance.

    Cov(X, Y) = -\frac{1}{2}(0.48 + 0.48) = -\frac{1}{2}(0.96) = -0.48

    Answer: \boxed{-0.48}
    "
    :::
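The answer can be cross-checked without the variance trick by enumerating all 25 equally likely ordered draws (treating the five balls as distinguishable). A brute-force sketch:

```python
from fractions import Fraction
from itertools import product

balls = ['R'] * 3 + ['B'] * 2                  # the urn's contents
draws = list(product(balls, repeat=2))         # 25 equally likely ordered draws
p = Fraction(1, len(draws))

EX = sum(d.count('R') * p for d in draws)      # expected number of red balls
EY = sum(d.count('B') * p for d in draws)      # expected number of blue balls
EXY = sum(d.count('R') * d.count('B') * p for d in draws)

cov = EXY - EX * EY                            # exact value: -12/25 = -0.48
```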

    :::question type="MSQ" question="Let X and Y be two random variables such that Var(X) = 16, Var(Y) = 25, and the correlation coefficient \rho(X, Y) = -0.5. Which of the following statements are correct?" options=["Cov(X, Y) = -10","Var(X+Y) = 21","Var(X-Y) = 61","Cov(3X, 2-Y) = 30"] answer="A,B,C,D" hint="Calculate each value using the fundamental formulas for covariance, correlation, and variance of sums/differences." solution="
    Let's evaluate each option.

    Option A: Cov(X, Y)
    The formula for correlation is \rho(X, Y) = \frac{Cov(X, Y)}{\sqrt{Var(X)Var(Y)}}.

    Cov(X, Y) = \rho(X, Y) \cdot \sqrt{Var(X)Var(Y)} = -0.5 \cdot \sqrt{16 \cdot 25} = -0.5 \cdot 20 = -10

    So, statement A is correct.

    Option B: Var(X+Y)
    The formula is Var(X+Y) = Var(X) + Var(Y) + 2Cov(X, Y).

    Var(X+Y) = 16 + 25 + 2(-10) = 41 - 20 = 21

    So, statement B is correct.

    Option C: Var(X-Y)
    The formula is Var(X-Y) = Var(X) + Var(Y) - 2Cov(X, Y).

    Var(X-Y) = 16 + 25 - 2(-10) = 41 + 20 = 61

    So, statement C is correct.

    Option D: Cov(3X, 2-Y)
    Using the property Cov(aX+b, cY+d) = ac \cdot Cov(X, Y), with a = 3, b = 0, c = -1, d = 2:

    Cov(3X, 2-Y) = (3)(-1) \cdot Cov(X, Y) = -3 \cdot (-10) = 30

    So, statement D is also correct.

    Answer: \boxed{A, B, C, D}
    "
    :::

    ---

    Summary

    ❗ Key Takeaways for GATE

    • Covariance measures the direction of a linear relationship. For computations, always use the formula Cov(X, Y) = E[XY] - E[X]E[Y].

    • Correlation is the normalized version of covariance, \rho(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}, which measures both the strength and direction of the linear relationship, bounded between -1 and 1.

    • Properties of Linear Transformations are frequently tested. Remember that Var(aX+b) = a^2 Var(X) and Cov(aX+b, cY+d) = ac \cdot Cov(X, Y).

    • The Variance of a Sum depends on covariance: Var(X \pm Y) = Var(X) + Var(Y) \pm 2Cov(X, Y). This simplifies only if the variables are uncorrelated.

    ---

    ---

    What's Next?

    πŸ’‘ Continue Learning

    A solid understanding of covariance and correlation is a prerequisite for several advanced topics in data analysis and statistics.

      • Linear Regression: Correlation is the foundation of simple linear regression, which seeks to model the linear relationship between a dependent and an independent variable. The slope of the regression line is directly related to the covariance and variance of the variables.
      • Multivariate Distributions: In the study of distributions involving multiple random variables (e.g., the Multivariate Normal Distribution), the relationships between pairs of variables are described by a covariance matrix, a critical component of the distribution's parameterization.

    ---

    πŸ’‘ Moving Forward

    Now that you understand Correlation and Covariance, let's explore Conditional Expectation and Variance, which builds on these concepts.

    ---

    Part 4: Conditional Expectation and Variance

    Introduction

    In our study of random variables, we often seek to understand the relationship between them. While concepts like covariance and correlation provide a measure of linear association, they do not capture the full picture. Conditional expectation offers a more powerful and nuanced tool. It allows us to determine the expected value of one random variable, given that we have observed the outcome of another. This concept is fundamental to prediction and estimation, forming the bedrock of modern statistical modeling and machine learning.

    Consider a scenario where we wish to predict a student's final exam score (Y) based on their mid-term score (X). The unconditional expectation, E[Y], gives us the average final score over all students. However, a much more accurate prediction can be made if we know the student's mid-term score, say X = x. The value we would then be interested in is the conditional expectation, E[Y|X=x], which represents the average final score specifically for students who scored x on the mid-term.

    This chapter will formally define conditional expectation and its counterpart, conditional variance. We will explore their properties, most notably the Law of Total Expectation and the Law of Total Variance, which provide elegant methods for decomposing complex problems. Furthermore, we shall see how these concepts can be applied to solve intricate problems involving sequences of random events, a common feature in GATE questions.

    πŸ“– Conditional Expectation

    Let X and Y be two random variables. The conditional expectation of Y given that X has taken the value x, denoted by E[Y|X=x], is the expected value of Y computed with respect to its conditional probability distribution given X = x.

    It is crucial to distinguish between E[Y|X=x], which is a function of x, and E[Y|X], which is a random variable because its value depends on the random outcome of X.

    ---

    Key Concepts

    1. Conditional Expectation for Discrete Random Variables

    When dealing with discrete random variables, the conditional expectation is computed as a weighted average, where the weights are given by the conditional probability mass function (PMF).

    First, we must define the conditional PMF of Y given X = x.

    p_{Y|X}(y|x) = P(Y=y | X=x) = \frac{P(X=x, Y=y)}{P(X=x)} = \frac{p_{X,Y}(x,y)}{p_X(x)}

    Here, p_{X,Y}(x,y) is the joint PMF of X and Y, and p_X(x) is the marginal PMF of X, provided that p_X(x) > 0.

    With this conditional PMF, we can now define the conditional expectation.

    πŸ“ Conditional Expectation (Discrete)
    E[Y|X=x] = \sum_{y} y \cdot p_{Y|X}(y|x)

    Variables:

      • y = A possible value of the random variable Y.

      • p_{Y|X}(y|x) = The conditional PMF of Y given X = x.


    When to use: Use this formula when both X and Y are discrete random variables and you are given their joint PMF.

    Worked Example:

    Problem: The joint PMF of two discrete random variables X and Y is given by the following table:

    | | Y=0 | Y=1 | Y=2 |
    | :--- | :--- | :--- | :--- |
    | X=0 | 0.1 | 0.2 | 0.1 |
    | X=1 | 0.3 | 0.1 | 0.2 |

    Calculate the conditional expectation E[Y|X=1].

    Solution:

    Step 1: Find the marginal PMF of X, p_X(x), specifically for x = 1. We sum the probabilities across the row for X=1.

    p_X(1) = P(X=1) = P(X=1, Y=0) + P(X=1, Y=1) + P(X=1, Y=2) = 0.3 + 0.1 + 0.2 = 0.6

    Step 2: Determine the conditional PMF of Y given X=1 for each possible value of Y.

    For Y=0:

    P(Y=0|X=1) = \frac{P(X=1, Y=0)}{P(X=1)} = \frac{0.3}{0.6} = 0.5

    For Y=1:

    P(Y=1|X=1) = \frac{P(X=1, Y=1)}{P(X=1)} = \frac{0.1}{0.6} = \frac{1}{6}

    For Y=2:

    P(Y=2|X=1) = \frac{P(X=1, Y=2)}{P(X=1)} = \frac{0.2}{0.6} = \frac{1}{3}

    (As a check, we note that 0.5 + 1/6 + 1/3 = 3/6 + 1/6 + 2/6 = 1, as required for a valid PMF.)

    Step 3: Apply the formula for conditional expectation.

    E[Y|X=1] = \sum_{y} y \cdot P(Y=y|X=1) = (0 \cdot 0.5) + \left(1 \cdot \frac{1}{6}\right) + \left(2 \cdot \frac{1}{3}\right)

    Step 4: Compute the final value.

    E[Y|X=1] = 0 + \frac{1}{6} + \frac{2}{3} = \frac{1}{6} + \frac{4}{6} = \frac{5}{6}

    Answer: E[Y|X=1] = \frac{5}{6} \approx 0.833
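The same computation can be scripted directly from the joint PMF table, using exact fractions (0.1 = 1/10, and so on):

```python
from fractions import Fraction

F = Fraction
joint = {  # (x, y) -> P(X=x, Y=y), transcribed from the table above
    (0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
    (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10),
}

def cond_exp_Y_given_X(x):
    """E[Y | X=x]: weight each y by the conditional PMF p(y|x)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / p_x

# cond_exp_Y_given_X(1) evaluates to 5/6, matching the hand calculation.
```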

    ---

    2. Conditional Expectation for Continuous Random Variables

    The logic for continuous variables is analogous to the discrete case, with sums replaced by integrals and PMFs by probability density functions (PDFs).

    The conditional PDF of Y given X = x is defined as:

    f_{Y|X}(y|x) = \frac{f_{X,Y}(x,y)}{f_X(x)}

    where f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y) \, dy is the marginal PDF of X, obtained by integrating the joint PDF over y, assuming f_X(x) > 0.

    πŸ“ Conditional Expectation (Continuous)
    E[Y|X=x] = \int_{-\infty}^{\infty} y \cdot f_{Y|X}(y|x) \, dy

    Variables:

      • y = A possible value of the random variable Y.

      • f_{Y|X}(y|x) = The conditional PDF of Y given X = x.


    When to use: Use this when X and Y are continuous random variables with a given joint PDF. This is a common pattern in GATE questions.

    Worked Example:

    Problem: Let the joint PDF of random variables X and Y be

    f_{X,Y}(x,y) = \begin{cases} \frac{x+y}{3}, & 0 < x < 1, \quad 0 < y < 2 \\ 0, & \text{otherwise} \end{cases}

    Calculate E[X|Y=1].

    Solution:

    Step 1: Find the marginal PDF of Y, f_Y(y), by integrating the joint PDF with respect to x.

    f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y) \, dx = \int_{0}^{1} \frac{x+y}{3} \, dx
    f_Y(y) = \frac{1}{3} \left[ \frac{x^2}{2} + yx \right]_0^1 = \frac{1}{3} \left( \frac{1}{2} + y \right)

    This is valid for 0 < y < 2.

    Step 2: Evaluate the marginal PDF at the given condition, y = 1.

    f_Y(1) = \frac{1}{3} \left( \frac{1}{2} + 1 \right) = \frac{1}{3} \left( \frac{3}{2} \right) = \frac{1}{2}

    Step 3: Determine the conditional PDF f_{X|Y}(x|y=1).

    f_{X|Y}(x|1) = \frac{f_{X,Y}(x,1)}{f_Y(1)}
    f_{X|Y}(x|1) = \frac{(x+1)/3}{1/2} = \frac{2(x+1)}{3}

    This conditional PDF is valid for 0 < x < 1.

    Step 4: Apply the formula for conditional expectation.

    E[X|Y=1] = \int_{-\infty}^{\infty} x \cdot f_{X|Y}(x|1) \, dx
    E[X|Y=1] = \int_{0}^{1} x \cdot \frac{2(x+1)}{3} \, dx = \frac{2}{3} \int_{0}^{1} (x^2+x) \, dx

    Step 5: Compute the integral.

    E[X|Y=1] = \frac{2}{3} \left[ \frac{x^3}{3} + \frac{x^2}{2} \right]_0^1
    E[X|Y=1] = \frac{2}{3} \left( \left( \frac{1}{3} + \frac{1}{2} \right) - 0 \right) = \frac{2}{3} \left( \frac{2+3}{6} \right) = \frac{2}{3} \cdot \frac{5}{6}

    Step 6: Simplify to find the final answer.

    E[X|Y=1] = \frac{10}{18} = \frac{5}{9}

    Answer: E[X|Y=1] = \frac{5}{9}
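    A result like this is easy to sanity-check numerically. The sketch below (illustrative, not part of the formal solution; the function name is ours) approximates f_Y(1) and the numerator integral with midpoint Riemann sums and takes their ratio:

```python
# Numerical sanity check: approximate E[X|Y=1] for the joint PDF
# f(x,y) = (x+y)/3 on 0 < x < 1, 0 < y < 2, using midpoint Riemann sums over x.

def conditional_expectation_x_given_y(y, n=10_000):
    """Approximate E[X | Y=y] for the joint PDF f(x,y) = (x+y)/3, 0 < x < 1."""
    h = 1.0 / n
    joint = lambda x: (x + y) / 3.0  # f_{X,Y}(x, y) as a function of x
    # Midpoint sums over x in (0, 1):
    f_y = sum(joint((i + 0.5) * h) for i in range(n)) * h                  # marginal f_Y(y)
    num = sum((i + 0.5) * h * joint((i + 0.5) * h) for i in range(n)) * h  # integral of x*f(x,y) dx
    return num / f_y

print(conditional_expectation_x_given_y(1.0))  # close to 5/9 ≈ 0.5556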

    ---

    3. Properties of Conditional Expectation

    The true power of conditional expectation is revealed through its properties, which simplify complex calculations and provide deep theoretical insights.

    The Law of Total Expectation (Tower Property)

    This is arguably the most important property of conditional expectation and is frequently tested in GATE. It states that the expected value of the conditional expectation of Y given X is simply the expected value of Y.

    πŸ“ Law of Total Expectation
    E[Y]=E[E[Y∣X]]E[Y] = E[E[Y|X]]

    Variables:

      • E[Y∣X]E[Y|X] is a random variable, as it is a function of the random variable XX.

      • E[E[Y∣X]]E[E[Y|X]] denotes taking the expectation of this new random variable.


    When to use: This law is used to find an unconditional expectation when it is easier to first compute the expectation by conditioning on another variable. It is also a fundamental identity tested directly.

    Let us demonstrate this for the continuous case.

    \begin{aligned}E[E[Y|X]] & = \int_{-\infty}^{\infty} E[Y|X=x] \cdot f_X(x) \, dx \\
    & = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} y \cdot f_{Y|X}(y|x) \, dy \right) f_X(x) \, dx \\
    \text{Since } f_{Y|X}(y|x) \cdot f_X(x) & = f_{X,Y}(x,y), \text{ we have:} \\
    E[E[Y|X]] & = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \cdot f_{X,Y}(x,y) \, dy \, dx \\
    \text{By changing the order of integration:} \\
    E[E[Y|X]] & = \int_{-\infty}^{\infty} y \left( \int_{-\infty}^{\infty} f_{X,Y}(x,y) \, dx \right) \, dy \\
    \text{The inner integral is the marginal PDF of Y, } f_Y(y). \\
    E[E[Y|X]] & = \int_{-\infty}^{\infty} y \cdot f_Y(y) \, dy = E[Y]\end{aligned}

    This elegant result confirms the property.
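    A quick simulation also makes the tower property concrete. The model below is a made-up example (not from the text): X is drawn uniformly from {1, 2, 3} and then Y | X = x ~ Uniform(0, x), so E[Y|X] = X/2 and the law predicts E[Y] = E[X]/2 = 1.

```python
import random

random.seed(42)

# Tower property check: X ~ Uniform{1,2,3}, Y | X=x ~ Uniform(0, x).
# Then E[Y|X] = X/2, so E[E[Y|X]] = E[X]/2 = 1, which should match E[Y].

N = 200_000
ys = []
for _ in range(N):
    x = random.choice([1, 2, 3])
    ys.append(random.uniform(0, x))   # draw Y from its conditional distribution

monte_carlo_EY = sum(ys) / N                        # direct estimate of E[Y]
tower_EY = sum(x / 2 * (1 / 3) for x in [1, 2, 3])  # E[E[Y|X]] computed exactly

print(monte_carlo_EY, tower_EY)  # both close to 1.0
```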

    Other Key Properties

  • Linearity: E[aY + bZ | X] = aE[Y|X] + bE[Z|X] for constants a, b.

  • Taking Out What Is Known: For any function g, E[g(X)Y | X] = g(X)E[Y|X]. This is because, given X, the value of g(X) is no longer random; it is a known constant. A special case is E[g(X)|X] = g(X).

  • Independence: If X and Y are independent, then E[Y|X] = E[Y]. Knowing the value of X provides no new information about Y.
    ---

    4. Conditional Variance and The Law of Total Variance

    Similar to expectation, we can define the variance of a random variable conditional on the value of another.

    πŸ“– Conditional Variance

    The conditional variance of Y given X = x is defined as:

    Var(Y|X=x) = E[(Y - E[Y|X=x])^2 | X=x]

    A more convenient computational form is:
    Var(Y|X=x) = E[Y^2|X=x] - (E[Y|X=x])^2

    Like conditional expectation, Var(Y|X) is a random variable whose value depends on X.

    This leads to a decomposition formula for variance, analogous to the Law of Total Expectation.

    πŸ“ Law of Total Variance
    Var(Y)=E[Var(Y∣X)]+Var(E[Y∣X])Var(Y) = E[Var(Y|X)] + Var(E[Y|X])

    Variables:

      • Var(Y)Var(Y) is the total variance of YY.

      • E[Var(Y∣X)]E[Var(Y|X)] is the expected conditional variance. It represents the average amount of variance remaining in YY even after we know XX.

      • Var(E[Y∣X])Var(E[Y|X]) is the variance of the conditional expectation. It represents the portion of the variance in YY that is explained by the variability of XX.


    When to use: This formula is extremely useful in situations where a random variable's variance is influenced by another random process. It breaks down the total variance into components.
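    The decomposition can be verified exactly on any small joint PMF. The sketch below uses a made-up distribution (the numbers are ours, chosen only so they sum to 1) and computes both sides of the law from first principles:

```python
# Exact check of Var(Y) = E[Var(Y|X)] + Var(E[Y|X]) on a small,
# made-up joint PMF p(x, y); the probabilities sum to 1.
pmf = {(0, 0): 0.1, (0, 1): 0.2, (0, 2): 0.1,
       (1, 0): 0.3, (1, 1): 0.1, (1, 2): 0.2}

xs = sorted({x for x, _ in pmf})
p_x = {x: sum(p for (a, _), p in pmf.items() if a == x) for x in xs}

def cond_moments(x):
    """Mean and variance of Y given X = x."""
    cond = {y: p / p_x[x] for (a, y), p in pmf.items() if a == x}
    m = sum(y * p for y, p in cond.items())
    v = sum((y - m) ** 2 * p for y, p in cond.items())
    return m, v

# Left side: Var(Y) computed directly from the joint PMF.
EY = sum(y * p for (_, y), p in pmf.items())
VarY = sum((y - EY) ** 2 * p for (_, y), p in pmf.items())

# Right side: E[Var(Y|X)] + Var(E[Y|X]).
# (By the tower property, the mean of the conditional means is EY.)
E_condvar = sum(cond_moments(x)[1] * p_x[x] for x in xs)
Var_condmean = sum((cond_moments(x)[0] - EY) ** 2 * p_x[x] for x in xs)

print(VarY, E_condvar + Var_condmean)  # the two numbers agree
```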

    ---

    5. Application: Recurrence Relations for Expected Values

    A powerful application of conditional expectation is in solving problems involving sequences of trials, such as finding the expected number of steps to reach a certain state. This technique was implicitly tested in GATE. The core idea is to condition on the outcome of the first step.

    Let E be the expected number of trials until a target event occurs. We can write:

    E = \sum_{\text{outcomes } i} E[\text{Trials} | \text{first outcome is } i] \cdot P(\text{first outcome is } i)

    Worked Example:

    Problem: A fair coin is flipped repeatedly. What is the expected number of flips required to see the pattern HT (Heads followed by Tails)?

    Solution:

    Step 1: Define the states and the expected values from each state.
    Let E be the expected number of flips from the start.
    Let E_H be the expected number of additional flips required, given that we have just seen a Head.

    Step 2: Set up the equation for E by conditioning on the first flip.
    The first flip is either H (with probability 1/2) or T (with probability 1/2).
    If the first flip is T, we have wasted one flip and are back to the start. The expected number of additional flips is E. Total flips: 1+E.
    If the first flip is H, we have used one flip and are now in state H. The expected number of additional flips is E_H. Total flips: 1+E_H.

    E = \frac{1}{2}(1 + E) + \frac{1}{2}(1 + E_H)

    Step 3: Set up the equation for E_H by conditioning on the next flip.
    From state H, the next flip is either T (probability 1/2) or H (probability 1/2).
    If the next flip is T, we have achieved the pattern HT. The process stops. Total additional flips: 1.
    If the next flip is H, we have wasted a flip but are still in a state where the last flip was H. The expected number of additional flips is still E_H. Total additional flips: 1+E_H.

    E_H = \frac{1}{2}(1) + \frac{1}{2}(1 + E_H)

    Step 4: Solve the system of linear equations. First, solve for E_H.

    E_H = \frac{1}{2} + \frac{1}{2} + \frac{1}{2}E_H
    E_H = 1 + \frac{1}{2}E_H
    \frac{1}{2}E_H = 1 \implies E_H = 2

    Step 5: Substitute the value of E_H back into the equation for E.

    E = \frac{1}{2}(1 + E) + \frac{1}{2}(1 + 2)
    E = \frac{1}{2} + \frac{1}{2}E + \frac{3}{2}
    E = 2 + \frac{1}{2}E
    \frac{1}{2}E = 2 \implies E = 4

    Answer: The expected number of flips is \boxed{4}.
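    The answer E = 4 is also easy to confirm by simulation. This is an illustrative sketch (the seed is arbitrary): flip a virtual fair coin until H is immediately followed by T, and average the flip counts over many runs.

```python
import random

random.seed(0)

def flips_until_ht(p_heads=0.5):
    """Count flips of a p-biased coin until the pattern HT first appears."""
    flips, prev = 0, None
    while True:
        flips += 1
        cur = 'H' if random.random() < p_heads else 'T'
        if prev == 'H' and cur == 'T':
            return flips
        prev = cur

N = 100_000
avg = sum(flips_until_ht() for _ in range(N)) / N
print(avg)  # close to the exact answer 4
```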

    ---

    Problem-Solving Strategies

    πŸ’‘ GATE Strategy: Step-by-Step Conditioning

    For any problem of the form "Find E[g(Y)|X=x]", follow this sequence:

    • Identify the type: Are the variables discrete or continuous?

    • Find the marginal: Calculate the marginal distribution of the conditioning variable (p_X(x) or f_X(x)).

    • Find the conditional distribution: Use the formula p_{Y|X}(y|x) = p_{X,Y}(x,y)/p_X(x) (or its continuous counterpart). This is the most critical step.

    • Integrate/Sum: Apply the definition of expectation using the conditional distribution you just found: E[g(Y)|X=x] = \int g(y) f_{Y|X}(y|x) \, dy.

    This structured approach prevents errors in calculation, especially with complex integration bounds.

    πŸ’‘ Recurrence Strategy: Define States

    For problems asking for the "expected number of trials until...", the key is to define states based on the progress towards the goal.

    • Let E_i be the expected number of additional steps from state i.

    • The initial state's expectation is what you want to find (e.g., E_0).

    • For each state, write an equation for E_i by conditioning on the outcome of the next trial. The equation will be of the form E_i = 1 + \sum_j P(\text{transition to } j) E_j.

    • The "success" state has an expected additional number of steps of 0.

    • Solve the resulting system of linear equations.
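    Once the equations are written down, solving them is plain linear algebra. As an illustration (using the fair-coin HT example from this chapter), rearrange E = 1 + Β½E + Β½E_H and E_H = 1 + Β½E_H into the form Ax = b and solve:

```python
import numpy as np

# Unknowns x = [E, E_H] for the fair-coin HT recurrence.
# E   = 1 + 0.5*E + 0.5*E_H   ->   0.5*E - 0.5*E_H = 1
# E_H = 1 + 0.5*E_H           ->   0.5*E_H         = 1
A = np.array([[0.5, -0.5],
              [0.0,  0.5]])
b = np.array([1.0, 1.0])

E, E_H = np.linalg.solve(A, b)
print(E, E_H)  # 4.0 2.0
```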

    ---

    Common Mistakes

    ⚠️ Avoid These Errors
      • ❌ Confusing E[Y|X] and E[Y|X=x]: Students often forget that E[Y|X] is a random variable (a function of X), while E[Y|X=x] is a specific value (a function of the number x). This distinction is crucial for understanding the Law of Total Expectation, E[E[Y|X]] = E[Y].
      • ❌ Incorrect Marginalization: A frequent error in continuous problems is using incorrect limits when integrating the joint PDF to find the marginal PDF. Always carefully check the support of the joint PDF. For example, if the support is 0 < y < x, the integral for f_X(x) must be over y from 0 to x, not over the full range of y.
      • ❌ Misinterpreting the Law of Total Variance: A common mistake is to write Var(Y) = Var(E[Y|X]) + Var(Var(Y|X)) instead of the correct Var(Y) = Var(E[Y|X]) + E[Var(Y|X)]. Remember, you take the expectation of the conditional variance, not its variance.
    βœ… The correct formula is Var(Y) = E[Var(Y|X)] + Var(E[Y|X]).
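    The marginalization pitfall is easy to demonstrate symbolically. A short sympy sketch (illustrative; the PDF 8xy on the triangle 0 < y < x < 1 is a standard textbook example): integrating over the correct support yields a valid marginal, while the wrong limits produce a function that does not even integrate to 1.

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
joint = 8 * x * y  # joint PDF on the triangle 0 < y < x < 1

# Correct: for f_X(x), y ranges over (0, x) because of the support 0 < y < x.
f_X = sp.integrate(joint, (y, 0, x))   # -> 4*x**3
total = sp.integrate(f_X, (x, 0, 1))   # -> 1, so f_X is a valid PDF

# Wrong: integrating y over (0, 1) ignores the support and gives 4*x,
# which integrates to 2 over (0, 1), so it is not a PDF.
wrong = sp.integrate(joint, (y, 0, 1))

print(f_X, total, wrong)  # 4*x**3 1 4*x
```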

    ---

    Practice Questions

    :::question type="MCQ" question="Let X be a random variable and g(X) = X^2. Which of the following is always equal to E[g(X)Y | X=x]?" options=["g(x) E[Y]","g(x) E[Y|X=x]","E[Y] E[g(X)|X=x]","g(E[X|Y=y]) E[Y]"] answer="g(x) E[Y|X=x]" hint="Recall the 'taking out what is known' property of conditional expectation. When we condition on X=x, any function of x is treated as a constant." solution="The property of conditional expectation states that for any function g, E[g(X)Y | X] = g(X)E[Y|X]. When we evaluate this at a specific point X=x, the random variable g(X) becomes the constant g(x).
    Therefore, we have:

    E[g(X)Y | X=x] = g(x)E[Y|X=x]

    In this specific case, g(x) = x^2. So, E[X^2 Y | X=x] = x^2 E[Y|X=x].
    Answer: \boxed{g(x) E[Y|X=x]}"
    :::

    :::question type="NAT" question="The joint PDF of two random variables X and Y is given by f(x,y) = 8xy for 0 < y < x < 1, and 0 otherwise. Calculate the value of E[Y|X=0.5]. (Round off to 2 decimal places)." answer="0.33" hint="First, find the marginal PDF f_X(x). Then find the conditional PDF f_{Y|X}(y|x). Finally, integrate y \cdot f_{Y|X}(y|0.5) over the appropriate range of y." solution="
    Step 1: Find the marginal PDF f_X(x).
    The limits for y are from 0 to x.

    f_X(x) = \int_0^x 8xy \, dy = 8x \left[ \frac{y^2}{2} \right]_0^x = 8x \left( \frac{x^2}{2} \right) = 4x^3

    This is for 0 < x < 1.

    Step 2: Find the conditional PDF f_{Y|X}(y|x).

    f_{Y|X}(y|x) = \frac{f(x,y)}{f_X(x)} = \frac{8xy}{4x^3} = \frac{2y}{x^2}

    This is for 0 < y < x.

    Step 3: Calculate E[Y|X=0.5].
    First, find the conditional PDF for x = 0.5.

    f_{Y|X}(y|0.5) = \frac{2y}{(0.5)^2} = \frac{2y}{0.25} = 8y

    The range for y is 0 < y < 0.5.

    Step 4: Integrate to find the conditional expectation.

    E[Y|X=0.5] = \int_0^{0.5} y \cdot f_{Y|X}(y|0.5) \, dy = \int_0^{0.5} y \cdot (8y) \, dy

    E[Y|X=0.5] = \int_0^{0.5} 8y^2 \, dy = 8 \left[ \frac{y^3}{3} \right]_0^{0.5}

    E[Y|X=0.5] = \frac{8}{3} (0.5)^3 = \frac{8}{3} \cdot \frac{1}{8} = \frac{1}{3}

    Result:
    The value is approximately 0.3333... Rounding to 2 decimal places gives 0.33.
    Answer: \boxed{0.33}"
    :::

    :::question type="MSQ" question="Let X and Y be random variables. Which of the following statements are always true?" options=["E[E[Y|X]] = E[Y]","Var(Y) = Var(E[Y|X]) + Var(Var(Y|X))","E[X+Y|X] = X + E[Y|X]","If X and Y are independent, E[XY|X] = X E[Y]"] answer="A,C,D" hint="Carefully check the standard properties of conditional expectation and variance. Pay close attention to the Law of Total Expectation and Law of Total Variance." solution="

    • Option A: This is the Law of Total Expectation (or Tower Property), which is a fundamental property and is always true.

    • Option B: This is an incorrect statement of the Law of Total Variance. The correct law is Var(Y) = Var(E[Y|X]) + E[Var(Y|X)]. The second term should be the expectation of the conditional variance, not the variance of the conditional variance. So, this statement is false.

    • Option C: Using the linearity property and the 'taking out what is known' property: E[X+Y|X] = E[X|X] + E[Y|X]. Since E[X|X] = X, the statement simplifies to X + E[Y|X]. This is correct.

    • Option D: Using the 'taking out what is known' property, we get E[XY|X] = X E[Y|X]. If X and Y are independent, then E[Y|X] = E[Y]. Substituting this in, we get E[XY|X] = X E[Y]. This statement is correct.


    Therefore, options A, C, and D are always true.
    Answer: \boxed{A, C, D}"
    :::

    :::question type="NAT" question="A coin has a probability p = 2/3 of landing Heads (H). It is flipped repeatedly until the pattern Heads-Tails (HT) appears for the first time. What is the expected number of flips required?" answer="4.5" hint="Set up recurrence relations. Let E be the expected flips from the start, and E_H be the expected additional flips after one Head. Solve the system of two linear equations in terms of p." solution="
    Step 1: Define states.
    Let E be the expected number of flips from the start.
    Let E_H be the expected additional flips, given the last flip was H.
    Let p = 2/3 be the probability of Heads (H) and q = 1-p = 1/3 be the probability of Tails (T).

    Step 2: Formulate the equation for E.
    From the start, the first flip is either H (with probability p) or T (with probability q).

    • If T: We use 1 flip and are back to the start. Total flips: 1+E.

    • If H: We use 1 flip and move to state H. Total flips: 1+E_H.

    E = q(1+E) + p(1+E_H) = 1 + qE + pE_H

    Step 3: Formulate the equation for E_H.
    From state H, the next flip is either T (with probability q) or H (with probability p).

    • If T: We use 1 flip and the pattern HT is achieved. The process stops. Total additional flips: 1.

    • If H: We use 1 flip and are still in state H (the last flip was H). Total additional flips: 1+E_H.

    E_H = q(1) + p(1+E_H) = 1 + pE_H

    Step 4: Solve for E_H.

    E_H - pE_H = 1

    E_H(1-p) = 1

    Since 1-p = q:
    qE_H = 1 \implies E_H = \frac{1}{q}

    Step 5: Substitute E_H into the equation for E.

    E = 1 + qE + p\left(\frac{1}{q}\right)

    E - qE = 1 + \frac{p}{q}

    E(1-q) = \frac{q+p}{q}

    Since 1-q = p and q+p = 1:
    pE = \frac{1}{q}

    E = \frac{1}{pq}

    Step 6: Calculate the value using p = 2/3 and q = 1/3.

    E = \frac{1}{(2/3)(1/3)} = \frac{1}{2/9} = \frac{9}{2} = 4.5

    Result:
    The expected number of flips is 4.5.
    Answer: \boxed{4.5}"
    :::

    ---

    Summary

    ❗ Key Takeaways for GATE

    • Law of Total Expectation: The most fundamental property is E[Y] = E[E[Y|X]]. This allows breaking down a complex expectation calculation by conditioning on a suitable random variable. It is frequently tested directly as a theoretical question.

    • Calculation Procedure: For computational problems involving E[Y|X=x], always follow the three-step process: find the marginal distribution of X, then find the conditional distribution of Y given X, and finally compute the expectation using that conditional distribution. Be meticulous with the limits of integration or summation.

    • Recurrence Relations: For problems asking for the expected time or trials to an event, the method of conditioning on the first step is extremely effective. Define states representing progress towards the goal and set up a system of linear equations for the expected values from each state.

    ---

    ---

    What's Next?

    πŸ’‘ Continue Learning

    This topic serves as a foundation for several advanced areas in data analysis and probability.

      • Regression Analysis: The conditional expectation E[Y|X=x] is precisely the regression function of Y on X. It represents the best possible prediction of Y given X, in the sense of minimizing mean squared error.
      • Markov Chains: The state-based approach we used for recurrence problems is the essence of analyzing discrete-time Markov chains. The concept of conditioning on the previous state is central to the Markov property.
      • Bayesian Statistics: Conditional distributions are the heart of Bayesian inference. Bayes' theorem is used to update our belief about a parameter (a conditional distribution) after observing data.

    ---

    Chapter Summary

    πŸ“– Random Variables - Key Takeaways

    In this chapter, we have introduced the fundamental concept of a random variable and the mathematical tools used to characterize its behavior. A thorough understanding of these principles is essential for subsequent topics in probability and statistics. The most critical concepts to retain are as follows:

    • Fundamental Definition: A random variable is a function that assigns a numerical value to each outcome in the sample space of a random experiment. We distinguish between discrete random variables, which take on a countable number of values and are described by a Probability Mass Function (PMF), and continuous random variables, which take values in an interval and are described by a Probability Density Function (PDF).

    • Expectation: The expected value, or mean, of a random variable X, denoted E[X], represents its long-term average. It is the center of mass of the probability distribution. For a discrete variable,

    E[X] = \sum_{i} x_i P(X=x_i)

    and for a continuous variable,
    E[X] = \int_{-\infty}^{\infty} x f_X(x) \, dx

    • Variance and Standard Deviation: Variance, \operatorname{Var}(X), is the primary measure of the dispersion or spread of a distribution around its mean.

    \operatorname{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2

    The standard deviation, \sigma_X = \sqrt{\operatorname{Var}(X)}, provides this measure in the same units as the random variable itself.

    • Covariance and Correlation: For two random variables X and Y, covariance, \operatorname{Cov}(X, Y), measures their joint variability.

    \operatorname{Cov}(X, Y) = E[XY] - E[X]E[Y]

    The correlation coefficient,
    \rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}

    normalizes this measure to the range [-1, 1], indicating the strength and direction of the linear relationship between them. It is crucial to remember that if X and Y are independent, then \operatorname{Cov}(X, Y) = 0, but the converse is not generally true.

    • Properties of Expectation and Variance: The linearity of expectation,

    E[aX + bY] = aE[X] + bE[Y]

    is a universally applicable and powerful tool. The variance of a linear combination is given by
    \operatorname{Var}(aX + bY) = a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y) + 2ab\operatorname{Cov}(X, Y)

    The covariance term vanishes if and only if the variables are uncorrelated.

    • Conditional Expectation: The conditional expectation E[X|Y=y] is the expected value of X given that the random variable Y has taken the specific value y. A cornerstone result is the Law of Total Expectation,

    E[X] = E[E[X|Y]]

    which allows us to compute an expectation by conditioning on another related variable.
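    These identities are mechanical to verify on any small discrete distribution. In the sketch below (a made-up joint PMF, not one of the chapter's examples), Cov(X,Y) = E[XY] - E[X]E[Y] is computed directly and Var(2X - Y) is checked against the linear-combination formula:

```python
# Verify Cov(X,Y) = E[XY] - E[X]E[Y] and the Var(aX + bY) identity on a
# small, made-up joint PMF (probabilities sum to 1).
pmf = {(0, 1): 0.2, (0, 2): 0.1, (1, 1): 0.3, (1, 2): 0.4}

E = lambda g: sum(g(x, y) * p for (x, y), p in pmf.items())  # generic expectation

EX, EY = E(lambda x, y: x), E(lambda x, y: y)
VarX = E(lambda x, y: (x - EX) ** 2)
VarY = E(lambda x, y: (y - EY) ** 2)
CovXY = E(lambda x, y: x * y) - EX * EY

a, b = 2, -1  # check Var(2X - Y) both ways
Z_mean = a * EX + b * EY
VarZ_direct = E(lambda x, y: (a * x + b * y - Z_mean) ** 2)
VarZ_formula = a**2 * VarX + b**2 * VarY + 2 * a * b * CovXY

print(CovXY, VarZ_direct, VarZ_formula)  # the two variances agree
```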

    ---

    Chapter Review Questions

    :::question type="MCQ" question="Let X and Y be two random variables with E[X]=2, E[Y]=3, \operatorname{Var}(X)=1, and \operatorname{Var}(Y)=4. If the correlation coefficient between them is \rho_{XY} = 0.5, what is the variance of the random variable Z = 2X - Y?" options=["4","8","12","16"] answer="A" hint="Recall the formula for the variance of a linear combination of two random variables: \operatorname{Var}(aX + bY) = a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y) + 2ab\operatorname{Cov}(X, Y). You will first need to find the covariance from the given correlation coefficient." solution="
    We are asked to find the variance of Z = 2X - Y. The general formula for the variance of a linear combination aX + bY is:

    \operatorname{Var}(aX + bY) = a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y) + 2ab\operatorname{Cov}(X, Y)

    In our case, a = 2 and b = -1. The formula becomes:
    \operatorname{Var}(2X - Y) = (2)^2\operatorname{Var}(X) + (-1)^2\operatorname{Var}(Y) + 2(2)(-1)\operatorname{Cov}(X, Y)

    \operatorname{Var}(Z) = 4\operatorname{Var}(X) + \operatorname{Var}(Y) - 4\operatorname{Cov}(X, Y)

    We are given \operatorname{Var}(X)=1 and \operatorname{Var}(Y)=4. To find \operatorname{Cov}(X, Y), we use the definition of the correlation coefficient:
    \rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}

    where \sigma_X = \sqrt{\operatorname{Var}(X)} and \sigma_Y = \sqrt{\operatorname{Var}(Y)}.
    Given \operatorname{Var}(X)=1, we have \sigma_X = \sqrt{1} = 1.
    Given \operatorname{Var}(Y)=4, we have \sigma_Y = \sqrt{4} = 2.
    Now, we can find the covariance:
    \operatorname{Cov}(X, Y) = \rho_{XY} \sigma_X \sigma_Y = (0.5)(1)(2) = 1

    Finally, we substitute all the values back into the variance formula for Z:
    \operatorname{Var}(Z) = 4(1) + 4 - 4(1)

    \operatorname{Var}(Z) = 4 + 4 - 4 = 4

    Therefore, the variance of Z is \boxed{4}.
    "
    :::

    :::question type="NAT" question="A continuous random variable X is uniformly distributed over the interval [1, 3]. The value of the conditional expectation E[X | X > 2] is ____." answer="2.5" hint="First, find the conditional PDF of X given the event A = \{X > 2\}. The formula is f_{X|A}(x) = f_X(x) / P(A) for x in the conditioned range." solution="
    Let X \sim U(1, 3). The probability density function (PDF) of X is:

    f_X(x) = \begin{cases} \frac{1}{3-1} = \frac{1}{2} & \text{for } 1 \le x \le 3 \\ 0 & \text{otherwise} \end{cases}

    We want to find the conditional expectation E[X | X > 2]. Let the event be A = \{X > 2\}.
    First, we calculate the probability of this event:
    P(A) = P(X > 2) = \int_{2}^{3} f_X(x) \, dx = \int_{2}^{3} \frac{1}{2} \, dx = \frac{1}{2} [x]_2^3 = \frac{1}{2}(3 - 2) = \frac{1}{2}

    The conditional PDF of X given A is defined as:
    f_{X|A}(x) = \frac{f_X(x)}{P(A)} \quad \text{for } x \in A

    For 2 < x \le 3, the conditional PDF is:
    f_{X|A}(x) = \frac{1/2}{1/2} = 1

    And f_{X|A}(x) = 0 otherwise. This means that given X > 2, X is uniformly distributed on the interval (2, 3].
    Now, we can compute the conditional expectation:
    E[X | X > 2] = \int_{-\infty}^{\infty} x f_{X|A}(x) \, dx = \int_{2}^{3} x \cdot 1 \, dx

    E[X | X > 2] = \left[ \frac{x^2}{2} \right]_2^3 = \frac{3^2}{2} - \frac{2^2}{2} = \frac{9}{2} - \frac{4}{2} = \frac{5}{2} = 2.5

    The value of the conditional expectation is \boxed{2.5}.
    "
    :::

    :::question type="MCQ" question="A fair six-sided die is rolled. Let X be the random variable for the outcome. What is the expected value of the random variable Y = (X-3.5)^2?" options=["2.917","3.5","1.708","2.5"] answer="A" hint="The value E[(X-\mu)^2] is, by definition, the variance of X. First, calculate the mean (expected value) of the die roll." solution="
    The random variable X represents the outcome of a fair six-sided die roll. The possible values for X are \{1, 2, 3, 4, 5, 6\}, each with probability P(X=x) = 1/6.

    First, we calculate the expected value of X, denoted \mu_X or E[X]:

    E[X] = \sum_{x=1}^{6} x \cdot P(X=x) = \frac{1}{6}(1+2+3+4+5+6) = \frac{21}{6} = 3.5

    The question asks for the value of E[Y] = E[(X-3.5)^2].
    We recognize that 3.5 is the mean of X. Therefore, we are being asked to calculate E[(X - E[X])^2]. This is the definition of the variance of X, \operatorname{Var}(X).
    \operatorname{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2

    We can calculate \operatorname{Var}(X) by first finding E[X^2]:
    E[X^2] = \sum_{x=1}^{6} x^2 \cdot P(X=x) = \frac{1}{6}(1^2+2^2+3^2+4^2+5^2+6^2)

    E[X^2] = \frac{1}{6}(1+4+9+16+25+36) = \frac{91}{6}

    Now, we can find the variance:
    \operatorname{Var}(X) = E[X^2] - (E[X])^2 = \frac{91}{6} - (3.5)^2 = \frac{91}{6} - \left(\frac{7}{2}\right)^2

    \operatorname{Var}(X) = \frac{91}{6} - \frac{49}{4} = \frac{2 \cdot 91 - 3 \cdot 49}{12} = \frac{182 - 147}{12} = \frac{35}{12}

    As a decimal, \frac{35}{12} \approx 2.9166...
    Rounding to three decimal places, the answer is \boxed{2.917}.
    "
    :::

    :::question type="NAT" question="Two discrete random variables X and Y have a joint PMF given by P(X=1, Y=0)=0.1, P(X=1, Y=1)=0.4, P(X=2, Y=0)=0.3, and P(X=2, Y=1)=0.2. The covariance, \operatorname{Cov}(X, Y), is ____." answer="-0.1" hint="Use the formula \operatorname{Cov}(X, Y) = E[XY] - E[X]E[Y]. You will need to calculate the marginal distributions to find E[X] and E[Y] first." solution="
    To find the covariance \operatorname{Cov}(X, Y), we use the formula \operatorname{Cov}(X, Y) = E[XY] - E[X]E[Y]. We must compute each term separately.

    1. Calculate marginal PMFs and expectations E[X] and E[Y]:
    The marginal PMF of X is found by summing over the values of Y:
    P(X=1) = P(X=1, Y=0) + P(X=1, Y=1) = 0.1 + 0.4 = 0.5
    P(X=2) = P(X=2, Y=0) + P(X=2, Y=1) = 0.3 + 0.2 = 0.5
    The expected value of X is:
    E[X] = 1 \cdot P(X=1) + 2 \cdot P(X=2) = 1(0.5) + 2(0.5) = 0.5 + 1.0 = 1.5

    The marginal PMF of Y is found by summing over the values of X:
    P(Y=0) = P(X=1, Y=0) + P(X=2, Y=0) = 0.1 + 0.3 = 0.4
    P(Y=1) = P(X=1, Y=1) + P(X=2, Y=1) = 0.4 + 0.2 = 0.6
    The expected value of Y is:
    E[Y] = 0 \cdot P(Y=0) + 1 \cdot P(Y=1) = 0(0.4) + 1(0.6) = 0.6

    2. Calculate E[XY]:
    The expectation of the product XY is calculated using the joint PMF:

    E[XY] = \sum_{x,y} xy \cdot P(X=x, Y=y)

    The only non-zero terms are those with both x \ne 0 and y \ne 0; here, that means y = 1.
    E[XY] = (1)(1)P(X=1, Y=1) + (2)(1)P(X=2, Y=1)

    E[XY] = (1)(0.4) + (2)(0.2) = 0.4 + 0.4 = 0.8

    3. Calculate Covariance:
    Now we substitute the computed values into the covariance formula:

    \operatorname{Cov}(X, Y) = E[XY] - E[X]E[Y]

    \operatorname{Cov}(X, Y) = 0.8 - (1.5)(0.6)

    \operatorname{Cov}(X, Y) = 0.8 - 0.9 = -0.1

    The covariance is \boxed{-0.1}.
    "
    :::

    ---

    What's Next?

    πŸ’‘ Continue Your GATE Journey

    Having completed this chapter on Random Variables, you have established a firm foundation in the language and mathematics used to describe and analyze random phenomena. These concepts are not an endpoint but rather a critical stepping stone in your preparation.

    Key connections:

      • Relation to Previous Chapters: This chapter builds directly upon the fundamentals of Set Theory and Probability. The sample spaces and events we studied previously are now mapped to numerical values, allowing us to use the tools of calculus and algebra. The axioms of probability provide the rigorous underpinning for the properties of PMFs and PDFs.
      • Foundation for Future Chapters: The concepts mastered here are indispensable for the chapters that follow:
    - Standard Probability Distributions: Our next step is to study specific, named families of random variables (such as Binomial, Poisson, Normal, and Exponential distributions). These are mathematical models for common real-world processes, and each is defined by its PMF or PDF, mean, and variance: the very concepts we have just explored.
    - Statistics and Estimation: In statistics, we often work with a sample of data to make inferences about a larger population. The sample mean and sample variance are themselves random variables. Understanding their expected values and variances is crucial for topics like parameter estimation and hypothesis testing.
    - Stochastic Processes: A stochastic process, such as a Markov Chain, is a sequence of random variables, typically indexed by time. A deep understanding of the properties of a single random variable is the essential prerequisite for analyzing systems that evolve randomly over time.

    🎯 Key Points to Remember

    • βœ“ Master the core concepts in Random Variables before moving to advanced topics
    • βœ“ Practice with previous year questions to understand exam patterns
    • βœ“ Review short notes regularly for quick revision before exams
