Discrete Probability Distributions
Overview
In our preceding discussions, we established the foundational axioms of probability theory and explored the concept of random variables. We now advance our study to a more structured examination of how probabilities are assigned to the outcomes of a discrete random variable. This chapter introduces a family of standard discrete probability distributions, which serve as indispensable mathematical models for a vast array of random phenomena encountered in engineering and data analysis. These distributions provide a formal framework for quantifying uncertainty in scenarios where the outcomes are countable, such as the number of defective items in a batch, the count of network packet arrivals in a given time interval, or the result of a simple success/fail experiment.
A thorough command of these distributions is paramount for success in the GATE examination. Questions frequently require not only the application of a specific formula but also the critical ability to identify which distribution best models the described scenario. By understanding the underlying assumptions and characteristic properties of each distribution (Bernoulli, Binomial, Poisson, and Uniform), we equip ourselves with the analytical tools necessary to deconstruct complex problems. Mastery of this material will enable us to calculate probabilities, determine expected values, and analyze variances with precision, skills that are fundamental to the quantitative reasoning assessed in the examination.
---
Chapter Contents
| # | Topic | What You'll Learn |
|---|-------|-------------------|
| 1 | Probability Mass Function (PMF) | Defining probability for discrete random variables. |
| 2 | Bernoulli Distribution | Modeling a single trial with two outcomes. |
| 3 | Binomial Distribution | Modeling multiple independent Bernoulli trials. |
| 4 | Poisson Distribution | Modeling event counts in a fixed interval. |
| 5 | Uniform Distribution | Modeling outcomes with equal probability. |
---
Learning Objectives
After completing this chapter, you will be able to:
- Define and interpret the Probability Mass Function (PMF) for a discrete random variable.
- Identify the appropriate discrete distribution (Bernoulli, Binomial, Poisson, Uniform) to model a given scenario.
- Calculate probabilities, expected values, and variances for each of the standard discrete distributions.
- Apply the properties of these distributions to solve quantitative problems typical of the GATE examination.
---
We now turn our attention to the Probability Mass Function (PMF)...
## Part 1: Probability Mass Function (PMF)
Introduction
In our study of probability, we are often concerned with numerical outcomes of a random experiment. A random variable provides a way to map these outcomes to numerical values. When the random variable can only take on a countable number of distinct values, we classify it as a discrete random variable. Consider, for instance, the number of heads in three coin tosses, which can be 0, 1, 2, or 3, but not 1.5.
To fully characterize a discrete random variable, we must specify the probability associated with each of its possible values. The function that provides this information is known as the Probability Mass Function, or PMF. It is the fundamental tool for describing the probability distribution of a discrete random variable, serving as the cornerstone for calculating probabilities of events and deriving other key statistical measures.
Let $X$ be a discrete random variable with a set of possible values (the support) $S$. The Probability Mass Function (PMF) of $X$ is a function, denoted by $p_X(x)$, that gives the probability that $X$ is exactly equal to some value $x$:
$$p_X(x) = P(X = x), \quad x \in S$$
For any value $x$ not in the support $S$, we have $p_X(x) = 0$.
---
Key Concepts and Properties
The PMF is not an arbitrary function; it must satisfy two fundamental properties that are a direct consequence of the axioms of probability. These properties are essential for a function to be considered a valid PMF.
## 1. Properties of a Valid PMF
Let us consider a discrete random variable $X$ with support $S$. Its PMF, $p_X(x)$, must adhere to the following two conditions:
Condition 1: Non-negativity
The probability of any event must be non-negative. Therefore, for every possible value $x$ in the support $S$, the PMF must be greater than or equal to zero:
$$p_X(x) \ge 0 \quad \text{for all } x \in S$$
Condition 2: Summation to Unity
The sum of the probabilities of all possible outcomes of a random experiment must be equal to 1. Consequently, the sum of the PMF over all values in the support must equal 1:
$$\sum_{x \in S} p_X(x) = 1$$
These two properties are the definitive test for a valid PMF. If a function describing a discrete random variable satisfies both, it is a valid PMF.
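These two conditions translate directly into code. Below is a minimal Python sketch (the function name `is_valid_pmf` and the dictionary representation are our own illustrative choices, not from the text) that tests whether a finite mapping of outcomes to probabilities is a valid PMF:

```python
import math

def is_valid_pmf(pmf):
    """Check the two defining conditions of a PMF.

    pmf: dict mapping each support value x to its probability p_X(x).
    """
    # Condition 1: non-negativity of every probability
    if any(p < 0 for p in pmf.values()):
        return False
    # Condition 2: probabilities must sum to 1 (allow float round-off)
    return math.isclose(sum(pmf.values()), 1.0)

print(is_valid_pmf({1: 0.2, 2: 0.5, 3: 0.3}))   # True: both conditions hold
print(is_valid_pmf({1: 0.7, 2: 0.5, 3: -0.2}))  # False: negative mass at x = 3
```

Note that the second mapping sums to 1 yet is still rejected, illustrating why both conditions must be checked.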
## 2. Visualizing a PMF
A PMF can be effectively visualized using a bar chart or a line graph, where the x-axis represents the possible values of the random variable $X$, and the y-axis represents the corresponding probabilities $p_X(x)$. This graphical representation provides an intuitive understanding of the distribution of probability mass across the different outcomes.
Consider the simple experiment of rolling a single fair six-sided die. The random variable $X$ is the number shown on the die. The support is $S = \{1, 2, 3, 4, 5, 6\}$, and since the die is fair, the probability of each outcome is $\frac{1}{6}$. The PMF is:
$$p_X(x) = \frac{1}{6}, \quad x \in \{1, 2, 3, 4, 5, 6\}$$
We can represent this visually.
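As a quick numerical companion to the fair-die example, the following Python sketch (the dictionary representation and event choice are ours) encodes the PMF, checks both validity conditions, and computes the probability of a simple event:

```python
from fractions import Fraction

# PMF of a fair six-sided die: p_X(x) = 1/6 for x in {1, ..., 6}
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Condition 1: non-negativity
assert all(p >= 0 for p in pmf.values())
# Condition 2: probabilities sum to unity
assert sum(pmf.values()) == 1

# P(X is even) = p(2) + p(4) + p(6)
p_even = sum(p for x, p in pmf.items() if x % 2 == 0)
print(p_even)  # 1/2
```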
## 3. Calculating Probabilities from a PMF
The primary utility of a PMF is in calculating the probability of an event. An event $A$ is a subset of the support $S$. The probability of the event is the sum of the probabilities of the individual outcomes that constitute $A$:
$$P(X \in A) = \sum_{x \in A} p_X(x)$$
Variables:
- $X$ = A discrete random variable
- $A$ = An event, which is a subset of the support of $X$
- $p_X(x)$ = The PMF of $X$
When to use: Use this formula to find the probability that the outcome of a random experiment falls within a specific range or set of values. For example, to find $P(a \le X \le b)$, we sum the PMF for all integer values from $a$ to $b$.
Worked Example:
Problem: A discrete random variable $X$ has a probability mass function $p_X(x)$ defined in terms of an unknown constant $k$.
(a) Find the value of the constant $k$.
(b) Calculate $P(X \ge 2)$.
Solution:
(a) Find the value of $k$
Step 1: Apply the property that the sum of all probabilities must equal 1: $\sum_{x \in S} p_X(x) = 1$.
Step 2: Substitute the given PMF into the summation.
Step 3: Simplify the expression.
Step 4: Solve for $k$.
Result: This equation determines the value of the constant $k$.
(b) Calculate $P(X \ge 2)$
Step 1: Identify the outcomes that satisfy the event $\{X \ge 2\}$.
Step 2: Use the PMF with the calculated value of $k$ to find the probability of each such outcome.
Step 3: Sum these probabilities to obtain $P(X \ge 2) = \sum_{x \ge 2} p_X(x)$.
Answer: The sum obtained in Step 3 is the required probability $P(X \ge 2)$.
---
Problem-Solving Strategies
In many GATE problems involving PMFs, a constant (like $k$ in the example) will be unknown. The very first step should always be to use the property $\sum_{x \in S} p_X(x) = 1$. This will almost always yield an equation that allows you to solve for the unknown constant. Once the constant is found, the PMF is fully defined, and any subsequent probability calculations become straightforward.
---
Common Mistakes
* ❌ Forgetting to check both conditions for a valid PMF. Students often check that the sum is 1 but forget to verify that $p_X(x) \ge 0$ for all $x$. A negative value for any probability makes the function invalid.
* ✅ Always check both: (1) $p_X(x) \ge 0$ for all $x \in S$ and (2) $\sum_{x \in S} p_X(x) = 1$.
* ❌ Incorrectly summing probabilities for range-based events. For a strict inequality such as $P(X < b)$, students might mistakenly include $P(X = b)$.
* ✅ Pay close attention to strict ($<$) versus inclusive ($\le$) inequalities. Write out the exact integer values included in the event before summing their probabilities. For $P(X < b)$, the sum is over integers $x$ such that $x \le b - 1$.
---
Practice Questions
:::question type="MCQ" question="Which of the following functions can be a valid Probability Mass Function for a random variable with support ?" options=["", "", "", ""] answer="P(x) = \frac{x+1}{6}" hint="Check the two conditions for a valid PMF for each option: non-negativity and sum to unity." solution="
We test each option against the two conditions for a valid PMF over the given support:
1. Non-negativity: $P(x) \ge 0$ for every $x$ in the support.
2. Summation to unity: $\sum_x P(x) = 1$.
For $P(x) = \frac{x+1}{6}$ on the support $\{0, 1, 2\}$: every value is non-negative, and $\frac{1}{6} + \frac{2}{6} + \frac{3}{6} = \frac{6}{6} = 1$, so both conditions are satisfied.
Each of the remaining options violates at least one of the two conditions: either some probability is negative, or the probabilities do not sum to 1.
Therefore, only $P(x) = \frac{x+1}{6}$ is a valid PMF.
"
:::
:::question type="NAT" question="A discrete random variable has a PMF given by for . What is the value of the constant ?" answer="0.692" hint="The sum of probabilities over the entire support must be equal to 1. Set up the equation and solve for c." solution="
Step 1: The sum of the PMF over its support must be 1: $\sum_x p_X(x) = 1$.
Step 2: Substitute the given PMF into the summation.
Step 3: Simplify the expression.
Step 4: Find a common denominator and solve for $c$.
Step 5: Convert the resulting fraction to a decimal rounded to three places.
Result: The value of $c$, rounded to three decimal places, is 0.692.
"
:::
:::question type="MSQ" question="The number of defects in a manufactured item has the following PMF: , , , . Which of the following statements are true?" options=["This is a valid PMF.", "The probability of at least one defect is 0.4.", "The probability of at most one defect is 0.7.", "P(X > 1) = P(X < 1)"] answer="A,B" hint="Verify the properties of a PMF first. Then calculate the probability for each event described in the options." solution="
Analysis of Statements:
- Option A: The given probabilities are all non-negative and sum to 1, so this is a valid PMF. True.
- Option B: $P(X \ge 1) = 1 - P(X = 0) = 0.4$. True.
- Option C: $P(X \le 1) = P(X = 0) + P(X = 1)$, which does not equal 0.7 for the given PMF. False.
- Option D: $P(X < 1) = P(X = 0) = 0.6$, while $P(X > 1) \le P(X \ge 1) = 0.4$, so the two cannot be equal. False.
Thus, only statements A and B are correct.
"
:::
---
Summary
- Definition: The Probability Mass Function (PMF), $p_X(x) = P(X = x)$, gives the probability that a discrete random variable $X$ takes on a specific value $x$.
- Two Core Properties: A function is a valid PMF if and only if it satisfies both:
  - Non-negativity: $p_X(x) \ge 0$ for all possible values of $x$.
  - Sum to Unity: $\sum_x p_X(x) = 1$, where the sum is over all possible values of $x$.
- Problem-Solving: When faced with a PMF containing an unknown constant, always start by using the "sum to unity" property to solve for it.
---
What's Next?
The Probability Mass Function is a foundational concept for discrete random variables. A solid understanding of PMF is essential for mastering related topics:
- Cumulative Distribution Function (CDF): The CDF for a discrete variable is calculated by summing the PMF. The CDF, $F_X(x) = P(X \le x) = \sum_{t \le x} p_X(t)$, represents the cumulative probability up to a value $x$.
- Expectation and Variance: The expected value (or mean) of a discrete random variable is a weighted average of its possible values, where the weights are the probabilities given by the PMF. The formula is $E[X] = \sum_x x \, p_X(x)$. Variance is also computed using the PMF.
---
Now that you understand Probability Mass Function (PMF), let's explore Bernoulli Distribution which builds on these concepts.
---
## Part 2: Bernoulli Distribution
Introduction
In the study of probability, we often begin with the simplest of random experiments: those with only two possible outcomes. These binary scenarios form the fundamental building blocks for more complex probabilistic models. The Bernoulli distribution is the discrete probability distribution that governs such an experiment, often termed a Bernoulli trial. Its importance lies not in its direct application to complex systems, but in its role as the foundation upon which other critical distributions, most notably the Binomial distribution, are constructed.
A thorough understanding of the Bernoulli distribution is essential for grasping the principles of random variables and their behavior. We will explore the mathematical formulation of this distribution, its primary characteristics such as mean and variance, and its application in modeling single events of success or failure. For the GATE examination, mastery of these fundamentals is a prerequisite for tackling more advanced topics in probability and statistics.
A Bernoulli trial is a random experiment that has exactly two possible outcomes. These outcomes are conventionally labeled "success" and "failure". The probability of success is denoted by $p$, and consequently, the probability of failure is $1 - p$, often denoted by $q$.
---
Key Concepts
The Bernoulli distribution describes the behavior of a random variable that represents the outcome of a single Bernoulli trial.
A random variable $X$ is said to be a Bernoulli random variable with parameter $p$ if it can take only two values: $1$ (representing success) and $0$ (representing failure). We write this as $X \sim \text{Bernoulli}(p)$.
## 1. Probability Mass Function (PMF)
The Probability Mass Function (PMF) of a discrete random variable gives the probability that the variable is equal to a specific value. For a Bernoulli random variable $X$, the PMF assigns a probability to the outcomes $x = 0$ and $x = 1$.
We can express this relationship concisely with a single formula:
$$p_X(x) = p^x (1 - p)^{1 - x}, \quad x \in \{0, 1\}$$
Variables:
- $X$ = The Bernoulli random variable
- $x$ = The outcome, which can be $0$ (failure) or $1$ (success)
- $p$ = The probability of success ($0 \le p \le 1$)
When to use: Use this formula to find the probability of a specific outcome (success or failure) in a single trial.
We can verify this formula for both possible values of $x$:
- If $x = 1$ (success): $p_X(1) = p^1 (1-p)^0 = p$.
- If $x = 0$ (failure): $p_X(0) = p^0 (1-p)^1 = 1 - p$.
The PMF can be visualized as a simple bar chart.
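Before moving on to the mean and variance, here is a small Python sketch of the Bernoulli PMF $p_X(x) = p^x (1-p)^{1-x}$; the function name and the illustrative value $p = 0.75$ are our own choices:

```python
def bernoulli_pmf(x, p):
    """PMF of X ~ Bernoulli(p): p**x * (1 - p)**(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        return 0.0  # no probability mass outside the support {0, 1}
    return p**x * (1 - p)**(1 - x)

p = 0.75
print(bernoulli_pmf(1, p))  # probability of success: 0.75
print(bernoulli_pmf(0, p))  # probability of failure: 0.25
```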
---
## 2. Mean and Variance
The mean, or expected value, and the variance are two of the most important measures describing a probability distribution. For the Bernoulli distribution, these are straightforward to derive.
### Mean (Expected Value)
The expected value, $E[X]$, is the long-run average value of the random variable:
$$E[X] = p$$
Variables:
- $E[X]$ = The expected value (mean) of the random variable
- $p$ = The probability of success
Application: This gives the average outcome over many repeated trials.
Derivation:
The expected value of a discrete random variable is defined as $E[X] = \sum_x x \, p_X(x)$.
Step 1: Apply the definition of expected value to the Bernoulli random variable: $E[X] = \sum_{x \in \{0,1\}} x \, p_X(x)$.
Step 2: Expand the summation over the two possible outcomes, $x = 0$ and $x = 1$: $E[X] = 0 \cdot p_X(0) + 1 \cdot p_X(1)$.
Step 3: Substitute the probabilities $p_X(0) = 1 - p$ and $p_X(1) = p$: $E[X] = 0 \cdot (1 - p) + 1 \cdot p$.
Step 4: Simplify the expression: $E[X] = p$.
Result:
The mean of a Bernoulli random variable is simply the probability of success, $p$.
### Variance
The variance, $\text{Var}(X)$, measures the spread or dispersion of the distribution. It is defined as $\text{Var}(X) = E[X^2] - (E[X])^2$:
$$\text{Var}(X) = p(1 - p)$$
Variables:
- $\text{Var}(X)$ = The variance of the random variable
- $p$ = The probability of success
Application: This quantifies the uncertainty of the outcome. The variance is maximized when $p = 0.5$.
Derivation:
To find the variance, we first need to compute $E[X^2]$.
Step 1: Apply the definition of expected value to the function $X^2$: $E[X^2] = \sum_x x^2 \, p_X(x)$.
Step 2: Expand the summation: $E[X^2] = 0^2 \cdot p_X(0) + 1^2 \cdot p_X(1)$.
Step 3: Substitute the probabilities and simplify: $E[X^2] = 0 \cdot (1 - p) + 1 \cdot p = p$.
Step 4: Now, apply the variance formula: $\text{Var}(X) = E[X^2] - (E[X])^2 = p - p^2$.
Step 5: Factor the expression: $\text{Var}(X) = p(1 - p)$.
Result:
The variance of a Bernoulli random variable is $p(1 - p)$, which is often written as $pq$ where $q = 1 - p$.
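The closed-form results $E[X] = p$ and $\text{Var}(X) = p(1-p)$ can be sanity-checked by simulation. The following Python sketch (the sample size, seed, and $p = 0.3$ are arbitrary illustrative choices) draws Bernoulli samples and compares the empirical moments with the formulas:

```python
import random

# Monte Carlo check of E[X] = p and Var(X) = p(1 - p) for X ~ Bernoulli(p)
random.seed(0)
p = 0.3
n = 200_000
samples = [1 if random.random() < p else 0 for _ in range(n)]

mean = sum(samples) / n
var = sum((x - mean) ** 2 for x in samples) / n

print(abs(mean - p) < 0.01)           # sample mean is close to p = 0.3
print(abs(var - p * (1 - p)) < 0.01)  # sample variance is close to p(1-p) = 0.21
```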
Worked Example:
Problem: A switch in a circuit is closed with probability $p$. Let $X$ be a random variable where $X = 1$ if the switch is closed and $X = 0$ otherwise. Find the mean and variance of $X$.
Solution:
Step 1: Identify the distribution and its parameter.
The experiment has two outcomes (closed or not closed), so it is a Bernoulli trial. The random variable $X$ follows a Bernoulli distribution, with the probability of success (switch is closed) equal to the given value $p$.
So, $X \sim \text{Bernoulli}(p)$.
Step 2: Calculate the mean using the formula $E[X] = p$.
Step 3: Calculate the variance using the formula $\text{Var}(X) = p(1 - p)$.
Answer: The mean of $X$ is $p$ and the variance of $X$ is $p(1 - p)$.
---
Problem-Solving Strategies
In a GATE problem, look for keywords that indicate a single trial with only two outcomes. Phrases like "a single coin toss," "one item is tested," "a component either fails or does not fail," or "a single bit is transmitted" are strong indicators. If the problem involves multiple such trials, it is likely a Binomial distribution problem, which is built upon the Bernoulli trial. Always start by identifying if the core experiment is a Bernoulli trial.
---
Common Mistakes
- ❌ Confusing $p$ and $q$: Students sometimes mistakenly use the probability of failure when the question asks for a calculation related to success. Always clearly define which outcome corresponds to "success" ($X = 1$) and assign the probability $p$ to it.
- ❌ Assuming $p = 0.5$: Do not assume the outcomes are equally likely unless the problem explicitly states it (e.g., "a fair coin"). The parameter $p$ can be any value between $0$ and $1$.
- ❌ Applying to Multiple Trials: The Bernoulli distribution applies only to a single trial. Using its formulas for an experiment with multiple trials (e.g., finding the probability of 3 heads in 5 tosses) is incorrect. That scenario requires the Binomial distribution.
---
Practice Questions
:::question type="MCQ" question="Which of the following random experiments is best described by a Bernoulli distribution?" options=["The number of defective items in a sample of 10 items.", "The outcome of a single attempt at a free throw in basketball (make or miss).", "The sum of the numbers appearing on two dice rolls.", "The number of emails arriving in an inbox in one hour."] answer="The outcome of a single attempt at a free throw in basketball (make or miss)." hint="A Bernoulli distribution models a single experiment with exactly two possible outcomes." solution="A Bernoulli distribution models a single trial with two mutually exclusive outcomes (success/failure).
- Option A involves 10 trials, which is a Binomial experiment.
- Option B involves a single trial (one free throw) with two outcomes (make or miss). This is a perfect fit for a Bernoulli distribution.
- Option C has multiple possible outcomes (sums from 2 to 12).
- Option D involves counting events over an interval, which is typically modeled by a Poisson distribution.
:::
:::question type="NAT" question="A random variable follows a Bernoulli distribution with a mean of . What is the variance of ?" answer="0.24" hint="Recall the formulas for the mean and variance of a Bernoulli distribution. The mean directly gives you the parameter ." solution="
Step 1: Identify the given information.
We are given that $X$ follows a Bernoulli distribution with the stated mean.
Step 2: Relate the mean to the parameter $p$.
For a Bernoulli distribution, the mean is given by $E[X] = p$, so the stated mean directly gives the value of $p$.
Step 3: Use the formula for the variance of a Bernoulli distribution.
The variance is given by $\text{Var}(X) = p(1 - p)$.
Step 4: Substitute the value of $p$ and calculate the variance: $p(1 - p) = 0.24$.
Result: The variance of $X$ is 0.24.
"
:::
:::question type="MSQ" question="Let be a Bernoulli random variable with parameter , where . Which of the following statements are ALWAYS true?" options=["The mean of is always greater than its variance.", "The variance of is maximized when .", ".", "The expected value of is equal to the expected value of ."] answer="A,B,C,D" hint="Analyze each statement using the properties and formulas of the Bernoulli distribution. For the first statement, compare and ." solution="
Let's evaluate each statement:
- A: The mean is $p$ and the variance is $p(1 - p)$. Their difference is $p - p(1 - p) = p^2 > 0$ for $0 < p < 1$, so the mean is always greater than the variance. True.
- B: The variance $p(1 - p)$ is a downward-opening parabola in $p$, maximized at $p = 0.5$ (maximum value $0.25$). True.
- C: True; this follows directly from the defining properties of the Bernoulli distribution.
- D: Since $X$ takes only the values $0$ and $1$, $X^2 = X$, so expectations such as $E[X^2]$ and $E[X]$ coincide. True.
All four statements are true.
"
:::
---
Summary
- Single Trial, Two Outcomes: The Bernoulli distribution models a single experiment with only two outcomes: success (value 1) and failure (value 0).
- Governed by Parameter $p$: The entire distribution is defined by a single parameter, $p$, which is the probability of success. The probability of failure is $q = 1 - p$.
- Core Formulas: You must memorize the mean and variance.
  - Mean: $E[X] = p$
  - Variance: $\text{Var}(X) = p(1 - p) = pq$
---
What's Next?
The Bernoulli distribution is the fundamental unit for several other important discrete distributions. Mastering it is crucial for understanding the following topics:
- Binomial Distribution: This distribution models the number of successes in a fixed number ($n$) of independent and identical Bernoulli trials. It is a direct and essential extension of the Bernoulli concept.
- Geometric Distribution: This distribution models the number of Bernoulli trials needed to get the first success. It also relies on the concept of repeated, independent Bernoulli trials.
---
Now that you understand Bernoulli Distribution, let's explore Binomial Distribution which builds on these concepts.
---
## Part 3: Binomial Distribution
Introduction
In our study of probability, we frequently encounter experiments that consist of a sequence of repeated, independent trials, where each trial has only two possible outcomes. Consider, for instance, the repeated tossing of a coin, where each toss results in either a head or a tail. Another example might be the inspection of items from a production line, where each item is classified as either defective or non-defective. The Binomial distribution provides a fundamental mathematical model for analyzing the number of "successes" that occur in a fixed number of such trials.
This distribution is a cornerstone of discrete probability theory and finds extensive application in fields ranging from quality control and engineering to genetics and finance. For the GATE examination, a firm grasp of the Binomial distribution's propertiesβits probability mass function, its mean, and its varianceβis essential for solving a significant class of problems involving discrete random variables. We shall explore the theoretical underpinnings of this distribution and its practical application through carefully selected examples.
Let a random experiment consist of $n$ independent and identical trials, where each trial can result in one of two mutually exclusive outcomes, termed 'success' (S) and 'failure' (F). Let the probability of a success in a single trial be $p$, and the probability of a failure be $q = 1 - p$.
If the random variable $X$ denotes the total number of successes in these $n$ trials, then $X$ is said to follow a Binomial distribution with parameters $n$ and $p$. We denote this as $X \sim B(n, p)$.
---
Key Concepts
The foundation of the Binomial distribution lies in the concept of a Bernoulli trial, which we shall examine first.
## 1. The Bernoulli Trial
A single trial that can result in only two possible outcomes is known as a Bernoulli trial. The key assumptions for a sequence of trials to be considered for a Binomial model are:
- The number of trials, $n$, is fixed in advance.
- Each trial has exactly two outcomes: success and failure.
- The probability of success, $p$, is the same for every trial.
- The trials are mutually independent.
## 2. Probability Mass Function (PMF)
The probability mass function (PMF) of a Binomial distribution gives the probability of observing exactly $k$ successes in $n$ trials. Let us consider how to derive this.
The probability of any specific sequence of $k$ successes and $n - k$ failures is, by the independence of trials, $p^k q^{n-k}$. However, there are multiple such sequences. The number of ways to arrange $k$ successes among $n$ trials is given by the binomial coefficient, $\binom{n}{k}$.
It follows that the probability of obtaining exactly $k$ successes, regardless of the order, is the product of the number of ways and the probability of any one specific sequence.
For a random variable $X \sim B(n, p)$, the probability of obtaining exactly $k$ successes in $n$ trials is given by:
$$P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, \dots, n$$
Variables:
- $n$ = total number of independent trials
- $k$ = number of successes ($0 \le k \le n$)
- $p$ = probability of success in a single trial
- $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ is the binomial coefficient.
When to use: To find the probability of an exact number of successes.
Worked Example:
Problem: A fair coin is tossed 6 times. What is the probability of getting exactly 4 heads?
Solution:
Step 1: Identify the parameters of the Binomial distribution.
The experiment consists of $n = 6$ independent trials. A "success" is getting a head. For a fair coin, the probability of success is $p = 1/2$. The number of successes we are interested in is $k = 4$.
Step 2: Apply the Binomial PMF formula.
We use the formula $P(X = 4) = \binom{6}{4} (1/2)^4 (1/2)^2$.
Step 3: Calculate the binomial coefficient and the powers.
The binomial coefficient is:
$$\binom{6}{4} = \frac{6!}{4! \, 2!} = 15$$
The powers are:
$$(1/2)^4 (1/2)^2 = (1/2)^6 = \frac{1}{64}$$
Step 4: Compute the final probability: $P(X = 4) = 15 \times \frac{1}{64} = \frac{15}{64} \approx 0.234$.
Answer: The probability of getting exactly 4 heads is $\frac{15}{64}$.
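The same computation can be scripted. This Python sketch (the helper name `binomial_pmf` is our own) evaluates the Binomial PMF with `math.comb` and reproduces the $15/64$ result:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): C(n, k) * p**k * (1 - p)**(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Worked example: 6 tosses of a fair coin, exactly 4 heads
prob = binomial_pmf(4, 6, 0.5)
print(prob)  # 0.234375, i.e. 15/64
```

As a consistency check, summing `binomial_pmf(k, 6, 0.5)` over $k = 0, \dots, 6$ gives 1, as any valid PMF must.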
---
## 3. Expectation and Variance
For any probability distribution, the measures of central tendency and dispersion are of paramount importance. For the Binomial distribution, these are the expectation (or mean) and the variance.
### Expectation (Mean)
The expectation of a Binomial random variable, $E[X]$, represents the average number of successes we would expect to see if we repeated the entire $n$-trial experiment many times. Its formula is remarkably simple and intuitive:
$$E[X] = np$$
Variables:
- $n$ = total number of trials
- $p$ = probability of success in a single trial
When to use: When asked for the mean, expected value, or average number of successes.
Worked Example:
Problem: An electronics factory produces microchips, with a 5% defect rate. If a batch of 200 microchips is selected for quality control, what is the expected number of defective microchips in the batch?
Solution:
Step 1: Identify the Binomial parameters.
This scenario can be modeled as a Binomial distribution. Each microchip is a trial.
Number of trials, $n = 200$.
A "success" is finding a defective microchip. The probability of success is $p = 0.05$.
Step 2: Apply the formula for expectation: $E[X] = np$.
Step 3: Substitute the values and calculate: $E[X] = 200 \times 0.05 = 10$.
Answer: The expected number of defective microchips is $10$.
### Variance and Standard Deviation
The variance measures the spread or dispersion of the distribution around its mean. A larger variance implies that the outcomes are more spread out:
$$\text{Var}(X) = npq = np(1 - p)$$
Variables:
- $n$ = total number of trials
- $p$ = probability of success
- $q = 1 - p$ = probability of failure
Note: The standard deviation is the square root of the variance, $\sigma = \sqrt{npq}$.
We observe that the variance is maximized when $p = q = 0.5$ for a fixed $n$. This corresponds to the case of maximum uncertainty in the outcome of a single trial.
Worked Example:
Problem: For the microchip example above ($n = 200$, $p = 0.05$), calculate the variance and standard deviation of the number of defective microchips.
Solution:
Step 1: Identify the parameters needed for variance.
We have $n = 200$ and $p = 0.05$. We first need to calculate $q = 1 - p = 0.95$.
Step 2: Apply the formula for variance: $\text{Var}(X) = npq$.
Step 3: Substitute the values and calculate: $\text{Var}(X) = 200 \times 0.05 \times 0.95 = 9.5$.
Step 4: Calculate the standard deviation: $\sigma = \sqrt{9.5} \approx 3.08$.
Answer: The variance is $9.5$ and the standard deviation is approximately $3.08$.
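Both microchip results can be verified in a few lines of Python (a sketch using the worked example's parameters $n = 200$, $p = 0.05$):

```python
from math import sqrt

# Binomial mean and variance for the microchip batch: n = 200 trials, p = 0.05
n, p = 200, 0.05
q = 1 - p

mean = n * p          # E[X] = np
variance = n * p * q  # Var(X) = npq
std_dev = sqrt(variance)

print(f"E[X] = {mean:.1f}, Var(X) = {variance:.1f}, sd = {std_dev:.2f}")
# E[X] = 10.0, Var(X) = 9.5, sd = 3.08
```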
---
Problem-Solving Strategies
A common type of question in GATE involves calculating probabilities for a range of outcomes, such as "at least one" or "at most k".
The probability of "at least one" success is most efficiently calculated using the complement rule. The complement of "at least one success" is "zero successes":
$$P(X \ge 1) = 1 - P(X = 0)$$
Since $P(X = 0) = \binom{n}{0} p^0 q^n = q^n$, the formula simplifies to:
$$P(X \ge 1) = 1 - (1 - p)^n$$
This is significantly faster than calculating $P(X=1) + P(X=2) + \dots + P(X=n)$.
Worked Example:
Problem: A communication system has a 2% chance of a bit error during transmission. If a message of 50 bits is sent, what is the probability that the message has at least one error?
Solution:
Step 1: Define the problem in terms of a Binomial distribution.
Let $X$ be the number of bit errors. This is a Binomial experiment with $n = 50$ trials.
The probability of "success" (an error) is $p = 0.02$.
We need to find $P(X \ge 1)$.
Step 2: Use the complement rule for "at least one": $P(X \ge 1) = 1 - P(X = 0)$.
Step 3: Calculate $P(X = 0)$.
First, find $q$: $q = 1 - p = 0.98$.
Now, calculate the probability of zero successes: $P(X = 0) = (0.98)^{50}$.
Step 4: Compute the final probability.
Using a calculator, $(0.98)^{50} \approx 0.364$, so $P(X \ge 1) \approx 1 - 0.364 = 0.636$.
Answer: The probability of at least one bit error is approximately $0.636$.
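The complement-rule shortcut is easy to check numerically. This Python sketch uses the worked example's parameters ($n = 50$, $p = 0.02$):

```python
# Complement rule for "at least one" success: P(X >= 1) = 1 - (1 - p)**n.
# Parameters from the worked example: 50 transmitted bits, 2% error rate.
n, p = 50, 0.02

p_zero_errors = (1 - p) ** n       # P(X = 0) = q**n
p_at_least_one = 1 - p_zero_errors

print(round(p_zero_errors, 3))     # 0.364
print(round(p_at_least_one, 3))    # 0.636
```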
---
Common Mistakes
- ❌ Using Binomial for Dependent Trials: Applying the Binomial formula to sampling without replacement from a small population, where the trials are not independent.
- ❌ Forgetting the Binomial Coefficient: Calculating $p^k q^{n-k}$ but forgetting to multiply by $\binom{n}{k}$. This calculates the probability of one specific sequence of outcomes, not the total probability of $k$ successes.
- ❌ Confusing $p$ and $q$: Incorrectly identifying the probability of success. For example, if a question asks for the probability of at most 2 failures, the "success" in this context is actually a failure.
---
Practice Questions
:::question type="NAT" question="A machine manufactures bolts. The probability that a bolt is defective is 0.1. In a random sample of 400 bolts, the variance of the number of defective bolts is _______." answer="36" hint="Use the formula for the variance of a Binomial distribution, Var(X) = npq." solution="
Step 1: Identify the parameters.
The number of trials is $n = 400$.
The probability of success (a defective bolt) is $p = 0.1$.
Step 2: Calculate the probability of failure: $q = 1 - p = 0.9$.
Step 3: Apply the variance formula: $\text{Var}(X) = npq$.
Step 4: Compute the final value: $\text{Var}(X) = 400 \times 0.1 \times 0.9 = 36$.
Result: The variance is 36.
"
:::
:::question type="MCQ" question="An unbiased six-sided die is rolled 5 times. What is the probability of getting at most one '6'?" options=["","","",""] answer="" hint="The probability of 'at most one' is P(X=0) + P(X=1). Calculate each term using the Binomial PMF." solution="
Step 1: Define the Binomial experiment.
Number of trials, $n = 5$.
A 'success' is rolling a '6'. Probability of success, $p = 1/6$.
Probability of failure, $q = 5/6$.
We need to find $P(X \le 1) = P(X = 0) + P(X = 1)$.
Step 2: Calculate $P(X = 0) = \binom{5}{0} \left(\frac{1}{6}\right)^0 \left(\frac{5}{6}\right)^5 = \frac{3125}{7776}$.
Step 3: Calculate $P(X = 1) = \binom{5}{1} \left(\frac{1}{6}\right)^1 \left(\frac{5}{6}\right)^4 = \frac{5 \times 625}{7776} = \frac{3125}{7776}$.
Step 4: Sum the probabilities: $P(X \le 1) = \frac{3125}{7776} + \frac{3125}{7776} = \frac{6250}{7776} = \frac{3125}{3888} \approx 0.804$.
Result: The probability of at most one '6' is $\frac{3125}{3888}$.
"
:::
:::question type="MSQ" question="Which of the following scenarios can be correctly modeled using a Binomial distribution? Assume all necessary conditions of randomness." options=["The number of heads obtained in 15 tosses of a fair coin.","The number of aces drawn when 5 cards are drawn one by one, without replacement, from a standard deck of 52 cards.","The number of students who pass an exam from a class of 50, where each student has an independent 80% chance of passing.","The number of times a person has to roll a die until they get a '6'."] answer="The number of heads obtained in 15 tosses of a fair coin.,The number of students who pass an exam from a class of 50, where each student has an independent 80% chance of passing." hint="Check the four conditions for a Binomial experiment for each option: fixed trials, two outcomes, constant probability, and independence." solution="
- Option A: This is a classic Binomial scenario. There is a fixed number of trials ($n = 15$), two outcomes (Heads/Tails), the probability of success is constant ($p = 0.5$), and the trials are independent. This is correct.
- Option B: This is incorrect. The trials are not independent because the cards are drawn without replacement. The probability of drawing an ace on the second draw depends on what was drawn on the first. This describes a Hypergeometric distribution.
- Option C: This is a Binomial scenario. There is a fixed number of trials ($n = 50$ students), two outcomes (Pass/Fail), the probability of success is constant ($p = 0.8$), and the students' outcomes are independent. This is correct.
- Option D: This is incorrect. The number of trials is not fixed; the experiment continues until a success occurs. This describes a Geometric distribution.
:::
:::question type="NAT" question="For a binomial distribution $B(n, p)$, the mean is 6 and the variance is 2. The value of $n$ is _______." answer="9" hint="You are given two equations: $np = 6$ and $npq = 2$. Solve this system of equations for $q$ and $p$ first." solution="
Step 1: Write down the given equations.
We are given the mean, $np = 6$.
We are given the variance, $npq = 2$, where $q = 1 - p$.
Step 2: Solve for $q$.
Substitute $np = 6$ from the first equation into the second: $6q = 2$, so $q = \frac{1}{3}$.
Step 3: Solve for $p$.
Since $p = 1 - q$: $p = 1 - \frac{1}{3} = \frac{2}{3}$.
Step 4: Solve for $n$.
Using the mean equation, $np = 6$: $n = \frac{6}{p} = 6 \times \frac{3}{2} = 9$.
Result: The value of $n$ is 9.
"
:::
---
Summary
- Conditions are Key: Always verify the four conditions for a Binomial experiment before applying its formulas: fixed number of trials (), two outcomes, constant probability of success (), and independence.
- Master the Core Formulas: You must be able to recall and apply the formulas for the PMF, mean, and variance without hesitation.
- Use the Complement Rule: For problems asking for the probability of "at least one" success, the fastest method is to calculate 1 - P(X = 0), which is 1 - (1 - p)^n.
- PMF: P(X = k) = C(n, k) p^k (1 - p)^(n - k)
- Mean: E[X] = np
- Variance: Var(X) = np(1 - p)
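The binomial PMF, mean, and variance can all be cross-checked numerically. Below is a minimal Python sketch; the values n = 15 and p = 0.5 (the coin-toss scenario from the practice questions) are used purely for illustration:

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 15, 0.5  # e.g. 15 tosses of a fair coin

# The PMF sums to 1 over k = 0..n, as any valid PMF must.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

# Mean and variance computed directly from the PMF
# should match the closed forms np and np(1 - p).
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
var = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))

print(round(total, 6), round(mean, 6), round(var, 6))
```

Computing the moments by brute-force summation and comparing against np and np(1 - p) is a quick way to verify you have the formulas memorized correctly.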
---
What's Next?
The Binomial distribution is closely related to other important distributions that you will encounter in your GATE preparation.
- Poisson Distribution: When the number of trials n is very large and the probability of success p is very small, the Binomial distribution can be approximated by the Poisson distribution with parameter λ = np. This is useful for modeling rare events.
- Normal Distribution: For a large number of trials n, the Binomial distribution can be approximated by a continuous Normal distribution. This is a result of the De Moivre-Laplace theorem and is fundamental to statistical inference.
---
Now that you understand Binomial Distribution, let's explore Poisson Distribution which builds on these concepts.
---
Part 4: Poisson Distribution
Introduction
In the study of discrete probability distributions, we often encounter scenarios involving the count of events occurring within a fixed interval of time or space. The Poisson distribution provides a mathematical model for such phenomena, particularly when the events are rare and occur independently of one another at a constant average rate. For instance, we might model the number of emails arriving in an inbox per hour, the number of defects per square meter of a material, or the number of radioactive decays in a given time interval.
Understanding the Poisson distribution is essential for modeling count data, a frequent task in data analysis and statistical inference. It serves as a foundational tool for analyzing random events that happen infrequently but with a known average rate. Its unique properties, such as the equality of its mean and variance, make it both elegant in theory and powerful in application.
A discrete random variable X is said to follow a Poisson distribution with parameter λ > 0 if its probability mass function (PMF) is given by:
P(X = k) = e^{-λ} λ^k / k!
where k is a non-negative integer (k = 0, 1, 2, ...). We denote this as X ~ Poisson(λ). The parameter λ represents the average number of events in the given interval.
---
Key Concepts
## 1. The Probability Mass Function (PMF)
The core of the Poisson distribution is its PMF, which assigns a probability to each possible number of occurrences, k = 0, 1, 2, .... The formula P(X = k) = e^{-λ} λ^k / k! encapsulates the trade-off between the average rate λ and the specific count k. Note that the sum of all probabilities over the entire range of k is unity, as expected for any valid probability distribution:
Σ_{k=0}^{∞} P(X = k) = e^{-λ} Σ_{k=0}^{∞} λ^k / k!
We recognize the summation as the Taylor series expansion of e^λ. It follows that:
Σ_{k=0}^{∞} P(X = k) = e^{-λ} · e^{λ} = 1
To illustrate the shape of the distribution, consider the PMF plotted for a fixed value of λ.
The distribution is unimodal, with the peak occurring at or near the mean λ. For non-integer λ, the mode is ⌊λ⌋. For integer λ, the probabilities at k = λ and k = λ - 1 are equal and maximal.
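Both the normalization and the location of the mode can be verified numerically. A short Python sketch follows; the rates λ = 2.5 and λ = 4 are illustrative choices, not values from the text:

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) = e^(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 2.5
# Truncated sum over k = 0..100 is effectively 1 (the tail is negligible).
total = sum(poisson_pmf(k, lam) for k in range(101))

# Non-integer lam: the mode is floor(lam).
mode = max(range(101), key=lambda k: poisson_pmf(k, lam))
assert mode == math.floor(lam)  # floor(2.5) = 2

# Integer lam: P(X = lam) equals P(X = lam - 1), both maximal.
lam_int = 4
assert math.isclose(poisson_pmf(4, lam_int), poisson_pmf(3, lam_int))
```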
---
## 2. Mean and Variance
A defining characteristic of the Poisson distribution is the equality of its mean (expected value) and variance. This property is both elegant and of practical importance:
E[X] = λ and Var(X) = λ
Variables:
- E[X] = Expected value or mean of the random variable
- Var(X) = Variance of the random variable
- λ = The average rate of event occurrence
When to use: These are fundamental properties used in almost any problem involving the Poisson distribution. If a problem states that count data follows a Poisson distribution with a mean of 5, we immediately know its variance is also 5.
Worked Example:
Problem: A customer support center receives an average of 4 calls per hour. Assuming the number of calls follows a Poisson distribution, what is the probability that the center receives exactly 2 calls in a given hour?
Solution:
Step 1: Identify the parameter λ and the desired count k.
The average rate of calls is given, so λ = 4. We need to find the probability of exactly 2 calls, so k = 2.
Step 2: Apply the Poisson PMF formula: P(X = k) = e^{-λ} λ^k / k!
Step 3: Substitute the values of λ and k: P(X = 2) = e^{-4} · 4² / 2! = 8 e^{-4}
Step 4: Compute the result.
Using the approximation e^{-4} ≈ 0.0183, we get: P(X = 2) ≈ 8 × 0.0183 ≈ 0.146
Answer: The probability of receiving exactly 2 calls in a given hour is approximately 0.146.
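The steps of the worked example can be reproduced in a few lines of Python:

```python
import math

lam, k = 4, 2  # average 4 calls per hour; want exactly 2 calls

# P(X = 2) = e^(-4) * 4^2 / 2! = 8 * e^(-4)
p = math.exp(-lam) * lam**k / math.factorial(k)

print(round(p, 4))
```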
---
## 3. Poisson Approximation to the Binomial Distribution
The Poisson distribution can be viewed as a limiting case of the Binomial distribution B(n, p). This approximation is particularly useful when the number of trials, n, is very large and the probability of success in each trial, p, is very small.
If X ~ B(n, p), and if n → ∞ and p → 0 such that the product λ = np remains constant, then the distribution of X can be approximated by a Poisson distribution with parameter λ = np.
As a rule of thumb for GATE problems, this approximation is considered reasonable if n ≥ 20 and p ≤ 0.05.
This connection is invaluable because the Poisson PMF is often computationally simpler than the Binomial PMF for large .
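The quality of the approximation is easy to see numerically. The sketch below compares the two PMFs for illustrative values n = 1000 and p = 0.003 (so λ = np = 3); these numbers are assumptions for the demo, not taken from the text:

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 1000, 0.003      # large n, small p
lam = n * p             # matching Poisson parameter, lam = 3

# The two PMFs agree to about three decimal places for small k.
for k in range(6):
    b = binomial_pmf(k, n, p)
    q = poisson_pmf(k, lam)
    assert abs(b - q) < 1e-3
```

Note how the Poisson side needs only λ, one exponential, and one factorial, while the exact Binomial value requires C(1000, k) and a 997th power; this is the computational saving the approximation buys.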
---
Problem-Solving Strategies
For questions asking for the probability of "at least one" or "at least two" events, it is almost always faster to use the complement rule.
- To find P(X ≥ 1): P(X ≥ 1) = 1 - P(X = 0) = 1 - e^{-λ}
- To find P(X ≥ 2): P(X ≥ 2) = 1 - P(X = 0) - P(X = 1) = 1 - e^{-λ}(1 + λ)
This avoids calculating an infinite sum and reduces the problem to one or two simple PMF calculations.
---
Common Mistakes
- ❌ Incorrect λ: Using a rate that does not match the interval specified in the question. For example, if the rate is 10 events per hour, the rate for a 30-minute interval is λ = 5, not 10. Always scale λ to match the question's time/space unit.
- ❌ Confusing Mean and Standard Deviation: For a Poisson distribution, the mean is λ and the variance is λ. Therefore, the standard deviation is √λ, not λ. ✅ Correct Approach: Remember that E[X] = λ, Var(X) = λ, and σ = √λ.
---
Practice Questions
:::question type="MCQ" question="The number of typos on a page of a book follows a Poisson distribution with a mean of 1.5. What is the probability that a randomly selected page has no typos?" options=["e^{-1.5}","1.5 e^{-1.5}","1 - e^{-1.5}","0.5 e^{-1.5}"] answer="e^{-1.5}" hint="Apply the Poisson PMF for the case where the number of events is zero." solution="
Step 1: Identify the parameters.
The random variable X is the number of typos on a page. We are given that X follows a Poisson distribution with mean λ = 1.5. We want to find the probability of having no typos, which corresponds to k = 0.
Step 2: Use the Poisson PMF formula: P(X = k) = e^{-λ} λ^k / k!
Step 3: Substitute λ = 1.5 and k = 0: P(X = 0) = e^{-1.5} (1.5)^0 / 0!
Step 4: Simplify the expression.
Recall that any number raised to the power of 0 is 1, and 0! = 1, so P(X = 0) = e^{-1.5}.
Result:
The probability of a page having no typos is e^{-1.5}.
"
:::
:::question type="NAT" question="On average, a data server fails 0.5 times per day. Assuming a Poisson distribution, what is the probability that the server will fail at least once in a 4-day period? (Round off to two decimal places). Use e^{-2} = 0.13." answer="0.87" hint="First, calculate the new average rate λ for the 4-day period. Then, use the complement rule for 'at least once'." solution="
Step 1: Calculate the rate parameter λ for the specified interval.
The average failure rate is 0.5 times per day. The interval of interest is 4 days.
Therefore, the new average rate is: λ = 0.5 × 4 = 2.
Step 2: Define the event of interest and its complement.
We need to find P(X ≥ 1), where X is the number of failures in 4 days.
The complement event is X = 0, which is the probability of zero failures.
Using the complement rule: P(X ≥ 1) = 1 - P(X = 0).
Step 3: Calculate P(X = 0) using the Poisson PMF with λ = 2: P(X = 0) = e^{-2} · 2^0 / 0! = e^{-2}.
Step 4: Substitute the given value of e^{-2} and compute the final probability: P(X ≥ 1) = 1 - 0.13 = 0.87.
Step 5: Round the result to two decimal places.
The probability is approximately 0.87.
Result:
The probability that the server will fail at least once in a 4-day period is 0.87.
"
:::
:::question type="MSQ" question="Let X be a random variable following a Poisson distribution with parameter λ = 4. Which of the following statements is/are correct?" options=["The mean of X is 4.","The standard deviation of X is 4.","The probability P(X = 3) is equal to the probability P(X = 5).","The variance of X is 4."] answer="The mean of X is 4.,The variance of X is 4." hint="Recall the fundamental properties of the Poisson distribution: mean, variance, and standard deviation. Also, check the PMF for specific values of k." solution="
Let us evaluate each option for λ = 4.
- Option A: The mean of X is E[X] = λ = 4, so this statement is correct.
- Option B: The standard deviation of X is σ = √λ = √4 = 2, not 4, so this statement is incorrect.
- Option C: P(X = 3) = e^{-4} · 4³/3! = (32/3) e^{-4}, while P(X = 5) = e^{-4} · 4⁵/5! = (128/15) e^{-4}. These are not equal, so this statement is incorrect. (For integer λ, the pair of equal, maximal probabilities is at k = λ and k = λ - 1.)
- Option D: The variance of X is Var(X) = λ = 4, so this statement is correct.
Therefore, the correct options are A and D.
"
:::
---
Summary
- The Poisson distribution models the number of rare, independent events occurring in a fixed interval with a constant average rate λ.
- The Probability Mass Function (PMF) is P(X = k) = e^{-λ} λ^k / k!. Be prepared to calculate this for small values of k.
- A defining property is that the mean and variance are both equal to the rate parameter: E[X] = Var(X) = λ. The standard deviation is √λ.
- For problems involving "at least one" event, always use the complement rule: P(X ≥ 1) = 1 - P(X = 0) = 1 - e^{-λ}.
---
What's Next?
This topic connects to:
- Binomial Distribution: Master the conditions (n large, p small) under which the Poisson distribution serves as an excellent approximation for the Binomial.
- Exponential Distribution: The Poisson distribution counts the number of events in an interval, while the Exponential distribution models the time between consecutive Poisson events. This is a crucial relationship between a discrete and a continuous distribution.
Master these connections for a comprehensive understanding of probability distributions for the GATE exam.
---
Now that you understand Poisson Distribution, let's explore Uniform Distribution which builds on these concepts.
---
Part 5: Uniform Distribution
Introduction
In the study of probability, we often encounter scenarios where all possible outcomes of a random experiment are equally likely. The simplest and most fundamental distribution that models such situations is the Uniform Distribution. It serves as a cornerstone for understanding more complex probability distributions and is foundational to concepts in statistical sampling and simulation.
When a random variable can assume a finite number of values, each with an equal probability of occurrence, we say it follows a Discrete Uniform Distribution. Consider, for example, the roll of a single fair six-sided die. The possible outcomes are the integers from 1 to 6, and if the die is fair, each outcome has a probability of 1/6. This is a classic instance of a discrete uniform distribution. We will explore the mathematical formulation of this distribution, its primary characteristics such as mean and variance, and its application in solving elementary probability problems.
A discrete random variable X is said to follow a Discrete Uniform Distribution if it can take on n distinct values, say x₁, x₂, ..., xₙ, with an equal probability for each value. The probability mass function (PMF) is given by:
P(X = xᵢ) = 1/n
for each i = 1, 2, ..., n, and P(X = x) = 0 for any other value of x.
---
Key Concepts
## 1. Probability Mass Function (PMF)
The defining characteristic of a discrete uniform distribution is its constant probability mass function over the set of possible outcomes. If the random variable X can take any integer value from a to b (inclusive), the total number of possible values is n = b - a + 1.
For a random variable X uniformly distributed on the integers {a, a+1, ..., b}:
P(X = k) = 1/(b - a + 1) = 1/n, for k = a, a+1, ..., b
Variables:
- a = The minimum value of the random variable
- b = The maximum value of the random variable
- n = b - a + 1 = The total number of possible outcomes
When to use: When a problem states that outcomes are "equally likely," "chosen at random," or describes a fair process like rolling a die or drawing from a well-shuffled deck.
The PMF can be visualized as a set of bars of equal height, representing the uniform probability assigned to each outcome.
---
## 2. Mean and Variance
The mean, or expected value, represents the central tendency of the distribution, while the variance measures its spread or dispersion. For a discrete uniform distribution over integers, these have simple, elegant formulas.
Mean: E[X] = (a + b)/2
Variables:
- a = The minimum value
- b = The maximum value
When to use: To find the average outcome of a uniformly distributed random variable over many trials.
Variance: Var(X) = (n² - 1)/12, where n = b - a + 1
Variables:
- a = The minimum value
- b = The maximum value
- n = The number of outcomes
When to use: To quantify the spread of outcomes around the mean.
Worked Example:
Problem: A fair six-sided die is rolled. Let the random variable X be the outcome of the roll. Calculate the mean and variance of X.
Solution:
Step 1: Identify the parameters of the distribution.
The outcomes are the integers {1, 2, 3, 4, 5, 6}. This is a discrete uniform distribution.
The minimum value is a = 1.
The maximum value is b = 6.
The number of outcomes is n = b - a + 1 = 6.
Step 2: Calculate the mean using the formula E[X] = (a + b)/2: E[X] = (1 + 6)/2 = 3.5.
Step 3: Calculate the variance using the formula Var(X) = (n² - 1)/12: Var(X) = (6² - 1)/12 = 35/12 ≈ 2.92.
Answer: The mean of the distribution is 3.5 and the variance is 35/12 ≈ 2.92.
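The closed-form answers for the die can be cross-checked against brute-force enumeration over the six equally likely outcomes:

```python
outcomes = range(1, 7)        # fair die: a = 1, b = 6
n = len(outcomes)             # n = b - a + 1 = 6

# Mean and variance by direct enumeration, weighting each outcome by 1/n.
mean = sum(outcomes) / n
var = sum((x - mean) ** 2 for x in outcomes) / n

# Closed forms: E[X] = (a + b)/2 and Var(X) = (n^2 - 1)/12.
a, b = 1, 6
assert mean == (a + b) / 2               # 3.5
assert abs(var - (n * n - 1) / 12) < 1e-12   # 35/12
```

The same enumeration pattern works for any finite set of equally likely outcomes, which makes it a handy sanity check when you are unsure whether to use the discrete or continuous variance formula.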
---
Problem-Solving Strategies
In GATE problems, the uniform distribution is often implied rather than explicitly stated. Look for keywords that suggest all outcomes have an equal chance:
- "A number is chosen at random from the set..."
- "A fair coin/die is used..."
- "Each of the items is equally likely to be selected..."
Once you identify these phrases, you can immediately apply the simple formulas for PMF, mean, and variance, saving significant calculation time compared to deriving them from first principles.
---
Common Mistakes
A frequent point of confusion is the calculation of n, the number of outcomes.
- ❌ Mistake: Calculating the number of outcomes as n = b - a. For integers from a to b, this misses one of the endpoints. For example, for integers from 5 to 10, b - a = 5, but the actual outcomes are {5, 6, 7, 8, 9, 10}, which is 6 values.
- ✅ Correct Approach: Always use n = b - a + 1 to count the number of integers from a to b, inclusive.
- ❌ Mistake: Confusing the variance formula with that of the continuous uniform distribution. The continuous formula is Var(X) = (b - a)²/12.
- ✅ Correct Approach: For the discrete case, the formula involves n² - 1, where n is the number of points. Remember: Var(X) = (n² - 1)/12.
---
Practice Questions
:::question type="MCQ" question="A random variable X is uniformly distributed over the set of integers {10, 11, ..., 30}. What is the probability of X taking any one particular value in this set?" options=["1/30", "1/20", "1/21", "25/30"] answer="1/21" hint="First, determine the total number of possible outcomes, n. The probability for any specific outcome is 1/n." solution="
Step 1: Identify the parameters of the distribution.
The minimum value is a = 10.
The maximum value is b = 30.
Step 2: Calculate the total number of outcomes: n = b - a + 1 = 30 - 10 + 1 = 21.
Step 3: The probability of any single outcome in a uniform distribution is 1/n = 1/21.
Result: The correct option is 1/21.
"
:::
:::question type="NAT" question="A computer generates a random integer X from 1 to 10, inclusive, with each integer being equally likely. What is the variance of the random variable Y = 2X + 5?" answer="33" hint="Recall the property of variance: Var(aX + b) = a² Var(X). First, find the variance of X." solution="
Step 1: Find the variance of the random variable X.
X is uniformly distributed on integers from a = 1 to b = 10.
The number of outcomes is n = 10 - 1 + 1 = 10.
Step 2: Apply the variance formula for a discrete uniform distribution: Var(X) = (n² - 1)/12 = (100 - 1)/12 = 99/12 = 8.25.
Step 3: Use the property of variance, Var(aX + b) = a² Var(X), to find the variance of Y.
Here, a = 2 and b = 5 (the additive constant does not affect the variance).
Step 4: Substitute the value of Var(X): Var(Y) = 2² × 8.25 = 4 × 8.25 = 33.
Result: The variance of Y is 33.
"
:::
:::question type="MSQ" question="Let a random variable X follow a discrete uniform distribution on the set {-2, -1, 0, 1, 2}. Which of the following statements is/are correct?" options=["The expected value E[X] is 0.", "The probability P(X > 0) is 0.4.", "The variance Var(X) is 2.", "The distribution is symmetric about its mean."] answer="The expected value E[X] is 0.,The probability P(X > 0) is 0.4.,The variance Var(X) is 2.,The distribution is symmetric about its mean." hint="Calculate the mean, variance, and relevant probabilities. Consider the shape of the PMF." solution="
Statement 1: The expected value E[X] is 0.
The parameters are a = -2 and b = 2, so E[X] = (a + b)/2 = 0.
This statement is correct.
Statement 2: The probability P(X > 0) is 0.4.
The total number of outcomes is n = b - a + 1 = 5.
The probability of any single outcome is 1/n = 1/5 = 0.2.
The event X > 0 corresponds to the outcomes {1, 2}, so P(X > 0) = 2/5 = 0.4.
This statement is correct.
Statement 3: The variance Var(X) is 2.
The number of outcomes is n = 5, so Var(X) = (n² - 1)/12 = (25 - 1)/12 = 2.
This statement is correct.
Statement 4: The distribution is symmetric about its mean.
The mean is 0. The set of outcomes is symmetric around 0. The probabilities are equal for all outcomes, so the PMF is symmetric.
This statement is correct.
Result: All four statements are correct.
"
:::
---
Summary
- Identification is Key: The Discrete Uniform Distribution applies to any scenario where a finite number of outcomes are all equally likely.
- Core Formulas: Memorize the formulas for the mean and variance, as they provide a direct path to the solution.
- Count Carefully: Always calculate the number of outcomes as n = b - a + 1 for integers from a to b inclusive. This is a common source of error.
- Mean: E[X] = (a + b)/2
- Variance: Var(X) = (n² - 1)/12, where n = b - a + 1.
---
What's Next?
The Uniform Distribution is a building block for more advanced topics. Understanding it well will aid your study of:
- Continuous Uniform Distribution: This is the continuous analogue where a random variable can take any value within a given range with uniform probability density.
- Expectation and Variance: The concepts of mean and variance are universal. The simple forms seen here provide intuition for how these are calculated for other, more complex distributions like the Binomial, Poisson, and Normal distributions.
- Random Number Generation: In computer science and data analysis, generating random numbers often starts with a uniform distribution, which is then transformed to create samples from other distributions.
Master these connections to build a comprehensive understanding of probability and statistics for the GATE examination.
---
Chapter Summary
In this chapter, we have explored the foundational discrete probability distributions that are essential for modeling random phenomena in engineering and computer science. Mastery of these concepts is critical for success in the GATE examination. The most salient points to be retained are as follows:
- Probability Mass Function (PMF): The PMF, denoted by p(x) = P(X = x), is the defining function for any discrete random variable X. We have established that it must satisfy two fundamental properties: p(x) ≥ 0 for all possible values x, and the sum over all possible values must be unity, i.e., Σₓ p(x) = 1.
- Bernoulli and Binomial Distributions: The Bernoulli distribution is the simplest model, representing a single trial with two outcomes (success or failure). The Binomial distribution is its natural extension, describing the number of successes in a fixed number, , of independent and identically distributed (i.i.d.) Bernoulli trials. The relationship between these two is fundamental.
- Poisson Distribution: We have seen that the Poisson distribution is used to model the number of events occurring within a fixed interval of time or space, given a constant mean rate λ. Its PMF is given by P(X = k) = e^{-λ} λ^k / k!.
- Poisson Approximation to Binomial: A crucial result for practical computation is the use of the Poisson distribution to approximate the Binomial distribution. This approximation is highly accurate when the number of trials n is large and the probability of success p is small, where we set the Poisson parameter λ = np.
- Expectation and Variance: The mean (Expectation) and variance are the primary measures of central tendency and dispersion for a distribution. It is imperative to remember the standard results for the distributions we have covered:
* Binomial B(n, p): Mean np; Variance np(1 - p).
* Poisson(λ): Mean λ; Variance λ.
* Uniform on {1, 2, ..., n}: Mean (n + 1)/2.
- Contextual Application: The ability to identify the appropriate distribution for a given problem scenario is a key skill. We have emphasized that Binomial problems involve a fixed number of trials, Poisson problems involve a rate over an interval, and Uniform problems involve outcomes with equal likelihood.
---
Chapter Review Questions
:::question type="MCQ" question="In a manufacturing process, the probability of a single component being defective is 0.002. A batch of 1500 components is selected for quality inspection. Using a suitable approximation, what is the probability that the batch contains exactly 2 defective components?" options=["4.5 e^{-3}","3 e^{-3}","e^{-3}","9 e^{-3}"] answer="A" hint="Consider the conditions under which a Binomial distribution can be approximated by a Poisson distribution. Calculate the appropriate rate parameter λ = np." solution="
The number of defective components, X, in a batch of n = 1500 follows a Binomial distribution with parameters n = 1500 and p = 0.002. The probability mass function is given by: P(X = k) = C(1500, k) (0.002)^k (0.998)^(1500 - k).
Calculating this directly is computationally intensive. However, since the number of trials is large (n = 1500) and the probability of a defect is small (p = 0.002), we can use the Poisson approximation.
Step 1: Calculate the Poisson parameter λ.
The parameter λ is the expected number of defective components, given by: λ = np = 1500 × 0.002 = 3.
Step 2: Apply the Poisson PMF.
The PMF for a Poisson distribution is P(X = k) = e^{-λ} λ^k / k!. We need to find the probability of finding exactly k = 2 defective components: P(X = 2) = e^{-3} · 3² / 2!.
Step 3: Simplify the expression: P(X = 2) = (9/2) e^{-3} = 4.5 e^{-3}.
Thus, the approximate probability is 4.5 e^{-3} ≈ 0.224.
"
:::
:::question type="NAT" question="Let be a binomial random variable with parameters and . If the mean of is 20 and its variance is 16, what is the value of the number of trials, ?" answer="100" hint="Recall the formulas for the mean and variance of a Binomial distribution and use them to form a system of equations to solve for and then ." solution="
For a Binomial distribution B(n, p), we are given the following:
Mean: np = 20; Variance: np(1 - p) = 16.
We have a system of two equations with two unknowns, n and p.
Step 1: Solve for p.
We can substitute the expression for the mean into the equation for the variance: 20(1 - p) = 16.
Now, we solve for (1 - p): 1 - p = 16/20 = 0.8.
Therefore, the probability of success is: p = 1 - 0.8 = 0.2.
Step 2: Solve for n.
Now that we have the value of p, we can use the mean equation to find n: n = 20 / 0.2 = 100.
The number of trials is n = 100.
"
:::
:::question type="MCQ" question="Which of the following statements regarding discrete probability distributions is FALSE?" options=["The mean of a Poisson distribution is always equal to its variance.","For a Binomial distribution B(n, p), the variance is always less than or equal to the mean.","The sum of two independent random variables following a Bernoulli distribution with parameter p results in a Binomial distribution B(2, p).","The rate parameter λ in a Poisson distribution must be an integer."] answer="D" hint="Consider the physical interpretation of the parameters for each distribution. What does λ represent in the real world?" solution="
Let us analyze each statement to determine its validity.
A) The mean of a Poisson distribution is always equal to its variance.
For a Poisson random variable with parameter λ, we have E[X] = λ and Var(X) = λ. This statement is TRUE.
B) For a Binomial distribution B(n, p), the variance is always less than or equal to the mean.
The mean is np and the variance is np(1 - p). Since the probability p must be in the interval [0, 1], the term (1 - p) must also be in [0, 1]. Therefore, np(1 - p) ≤ np. This statement is TRUE.
C) The sum of two independent random variables following a Bernoulli distribution with parameter p results in a Binomial distribution B(2, p).
This is the definition of a Binomial random variable. A Binomial distribution B(n, p) models the number of successes in n independent Bernoulli trials, each with success probability p. For n = 2, this statement is correct. This statement is TRUE.
D) The rate parameter λ in a Poisson distribution must be an integer.
The parameter λ represents the average rate of occurrence of an event over an interval. Averages need not be integers. For example, a customer service center might receive an average of 2.5 calls per minute. The number of calls received (the count k) must be an integer, but the average rate (λ) can be any non-negative real number. This statement is FALSE.
"
:::
:::question type="NAT" question="A discrete random variable X can take any integer value from 1 to n inclusive, with equal probability. If the variance of X is 10, what is the value of n?" answer="11" hint="Use the standard formula for the variance of a discrete uniform distribution over the first n integers." solution="
The random variable X follows a discrete uniform distribution on the set {1, 2, ..., n}.
Step 1: Recall the formula for the variance.
For a discrete uniform distribution on the integers from 1 to n, the variance is given by the formula: Var(X) = (n² - 1)/12.
Step 2: Set up the equation.
We are given that the variance of X is 10: (n² - 1)/12 = 10.
Step 3: Solve for n.
Multiply both sides by 12: n² - 1 = 120.
Add 1 to both sides: n² = 121.
Take the square root of both sides. Since n represents the number of outcomes, it must be positive: n = 11.
The value of n is 11.
"
:::
---
What's Next?
Having completed Discrete Probability Distributions, you have established a firm foundation for understanding and modeling random processes where outcomes are countable. This chapter is a cornerstone of probability theory and has direct applications in numerous GATE subjects.
Key connections:
- Relation to Previous Learning: This chapter provided concrete functional forms (PMFs) and properties for the abstract concept of a discrete random variable, which was introduced in the preceding chapter on basic probability. We have moved from calculating probabilities of events to analyzing the behavior of entire distributions.
- Building Blocks for Future Chapters: The concepts mastered here are indispensable for the topics that follow:
* Joint and Conditional Distributions: We will soon extend our analysis from single random variables to systems involving multiple random variables, exploring concepts of covariance and correlation.
* Statistics and Inference: The Binomial and Poisson distributions are fundamental models used in statistical hypothesis testing, confidence intervals, and parameter estimation, all of which are high-yield topics in the GATE syllabus.