Elementary Distributions
This chapter lays the foundation for understanding probabilistic phenomena by introducing elementary discrete and continuous distributions. Mastery of these distributions is critical for comprehending advanced probabilistic models and is frequently assessed in examinations through both direct application and theoretical analysis.
Chapter Contents
| # | Topic |
|---|-------|
| 1 | Discrete Distributions |
| 2 | Continuous Distributions |

We begin with Discrete Distributions.
Part 1: Discrete Distributions
Discrete distributions model random variables that can take on a finite or countably infinite number of values. We use them to analyze outcomes such as the number of successes in trials or the count of events in a specific interval.
---
Core Concepts
1. Bernoulli Distribution
The Bernoulli distribution describes the outcome of a single trial with exactly two possible results: "success" (with probability $p$) or "failure" (with probability $1-p$).
Let $X$ be a Bernoulli random variable. Its PMF is:
$$P(X=x) = p^x (1-p)^{1-x}, \quad x \in \{0, 1\}$$
Where:
- $p$ = probability of success ($0 \le p \le 1$)
- $x = 1$ for success, $x = 0$ for failure
Expected Value: $E[X] = p$
Variance: $\text{Var}(X) = p(1-p)$
Worked Example:
Consider a component that functions correctly with a probability of . We want to find the probability that a single randomly selected component functions correctly.
Step 1: Define the random variable and parameters.
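> Let $X = 1$ if the component functions correctly, and $X = 0$ otherwise. This is a Bernoulli trial.
> The probability of success is $p$, the stated probability that a component functions correctly.
Step 2: Apply the Bernoulli PMF for $x = 1$.
> $$P(X=1) = p^1 (1-p)^{0} = p$$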
Answer: The probability that a single component functions correctly is .
:::question type="MCQ" question="A quality control inspector checks a product for defects. The probability of a product being defective is $0.05$. What is the probability that a randomly selected product is not defective?" options=["0.05","0.95","0.50","0.10"] answer="0.95" hint="Identify success and failure, and their probabilities." solution="Let $X = 1$ if the product is defective (success) and $X = 0$ if it is not defective (failure).
The probability of success is $p = 0.05$.
We want to find the probability that the product is not defective, which corresponds to $P(X=0)$.
Step 1: Identify the probability of failure.
> $$P(X=0) = 1 - p$$
> $$P(X=0) = 1 - 0.05$$
> $$P(X=0) = 0.95$$
Answer: The probability that a randomly selected product is not defective is $0.95$."
:::
---
2. Binomial Distribution
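The Binomial distribution models the number of successes in a fixed number $n$ of independent Bernoulli trials. Each trial has the same probability of success, $p$.
Let $X$ be a Binomial random variable. Its PMF is:
$$P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \dots, n$$
Where:
- $n$ = number of trials
- $k$ = number of successes
- $p$ = probability of success on a single trial ($0 \le p \le 1$)
- $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ is the binomial coefficient.
Expected Value: $E[X] = np$
Variance: $\text{Var}(X) = np(1-p)$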
Worked Example:
A fair coin is tossed 10 times. We want to find the probability of getting exactly 7 heads.
Step 1: Define the random variable and parameters.
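> Let $X$ be the number of heads in 10 tosses. This follows a Binomial distribution.
> Number of trials, $n = 10$.
> Probability of success (getting a head), $p = 0.5$.
> Number of desired successes, $k = 7$.
Step 2: Apply the Binomial PMF.
> $$P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$$
> $$P(X=7) = \binom{10}{7} (0.5)^7 (0.5)^{3}$$
> $$\binom{10}{7} = \frac{10!}{7!\,3!} = 120$$
> $$P(X=7) = 120 \times (0.5)^{10}$$
> $$P(X=7) = \frac{120}{1024}$$
> $$P(X=7) = 0.1171875$$
Answer: The probability of getting exactly 7 heads in 10 tosses is approximately $0.1172$.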
:::question type="NAT" question="A manufacturing process produces 5% defective items. If a random sample of 20 items is selected, what is the expected number of defective items in the sample?" answer="1.0" hint="Recall the formula for the expected value of a Binomial distribution." solution="Let $X$ be the number of defective items in a sample of 20. This follows a Binomial distribution.
Step 1: Identify the parameters.
> Number of trials, $n = 20$.
> Probability of success (item being defective), $p = 0.05$.
Step 2: Apply the formula for the expected value of a Binomial distribution.
> $$E[X] = np$$
> $$E[X] = 20 \times 0.05$$
> $$E[X] = 1.0$$
Answer: The expected number of defective items is $1.0$."
:::
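The Binomial formulas above can be checked numerically. The following is a minimal Python sketch using only the standard library; the function names `binomial_pmf` and `binomial_mean_var` are illustrative choices, not part of the original notes.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_mean_var(n: int, p: float) -> tuple[float, float]:
    """Return (E[X], Var(X)) = (n*p, n*p*(1-p))."""
    return n * p, n * p * (1 - p)


if __name__ == "__main__":
    # Coin-toss example: exactly 7 heads in 10 fair tosses.
    print(binomial_pmf(7, 10, 0.5))
    # Defect example: 20 items sampled, 5% defective.
    print(binomial_mean_var(20, 0.05))
```

Summing `binomial_pmf(k, n, p)` over `k = 0, ..., n` should give 1, which is a quick sanity check on any hand-computed PMF value.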
---
3. Geometric Distribution
The Geometric distribution models the number of independent Bernoulli trials required to achieve the first success. The count includes the successful trial itself.
Let $X$ be a Geometric random variable. Its PMF is:
$$P(X=k) = (1-p)^{k-1} p, \quad k = 1, 2, 3, \dots$$
Where:
- $k$ = number of trials until the first success (including the success)
- $p$ = probability of success on a single trial ($0 < p \le 1$)
Expected Value: $E[X] = \frac{1}{p}$
Variance: $\text{Var}(X) = \frac{1-p}{p^2}$
Worked Example:
A quality control process involves testing items until a non-defective item is found. The probability of an item being non-defective is . We want to find the probability that the first non-defective item is found on the 3rd trial.
Step 1: Define the random variable and parameters.
> Let $X$ be the number of trials until the first non-defective item is found. This follows a Geometric distribution.
> Probability of success (non-defective item), $p$, as stated in the problem.
> Number of desired trials, $k = 3$.
Step 2: Apply the Geometric PMF.
> $$P(X=k) = (1-p)^{k-1} p$$
> $$P(X=3) = (1-p)^{2} p$$
Answer: The probability that the first non-defective item is found on the 3rd trial is .
:::question type="MCQ" question="A basketball player makes a free throw with a probability of $0.7$. What is the probability that the player's first successful free throw occurs on their 4th attempt?" options=["0.7","0.021","0.0063","0.0189"] answer="0.0189" hint="A first success on the $k$-th trial means $k-1$ failures followed by 1 success." solution="Let $X$ be the number of attempts until the first successful free throw. This follows a Geometric distribution.
Step 1: Identify the parameters.
> Probability of success, $p = 0.7$.
> Number of desired trials, $k = 4$.
Step 2: Apply the Geometric PMF.
> $$P(X=k) = (1-p)^{k-1} p$$
> $$P(X=4) = (1-0.7)^{3} (0.7)$$
> $$P(X=4) = (0.3)^3 (0.7)$$
> $$P(X=4) = 0.027 \times 0.7$$
> $$P(X=4) = 0.0189$$
Answer: The probability that the player's first successful free throw occurs on their 4th attempt is $0.0189$."
:::
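As a quick check of the Geometric PMF, here is a small Python sketch. The success probability `0.7` used in the demo is an assumption: it is the value consistent with the stated answer `0.0189` for a first success on the 4th attempt. The function names are illustrative.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k): first success on trial k (k - 1 failures, then one success)."""
    if k < 1:
        raise ValueError("k counts trials and must be at least 1")
    return (1 - p) ** (k - 1) * p


def geometric_mean_var(p: float) -> tuple[float, float]:
    """Return (E[X], Var(X)) = (1/p, (1-p)/p^2)."""
    return 1 / p, (1 - p) / p**2


if __name__ == "__main__":
    # Assumed p = 0.7; first success on the 4th attempt.
    print(round(geometric_pmf(4, 0.7), 4))
    print(geometric_mean_var(0.7))
```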
---
4. Negative Binomial Distribution
The Negative Binomial distribution generalizes the Geometric distribution. It models the number of independent Bernoulli trials required to achieve a specified number of successes, $r$.
Let $X$ be a Negative Binomial random variable. Its PMF is:
$$P(X=k) = \binom{k-1}{r-1} p^r (1-p)^{k-r}, \quad k = r, r+1, r+2, \dots$$
Where:
- $k$ = total number of trials until the $r$-th success (including the $r$-th success)
- $r$ = number of desired successes
- $p$ = probability of success on a single trial ($0 < p \le 1$)
Expected Value: $E[X] = \frac{r}{p}$
Variance: $\text{Var}(X) = \frac{r(1-p)}{p^2}$
Worked Example:
A software company is hiring developers, and each candidate has a probability of passing the coding interview, independently. We want to find the probability that the 5th successful hire occurs on the 10th interview.
Step 1: Define the random variable and parameters.
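> Let $X$ be the number of interviews until the 5th successful hire. This follows a Negative Binomial distribution.
> Number of desired successes, $r = 5$.
> Probability of success (passing interview), $p$, as stated in the problem.
> Total number of trials, $k = 10$.
Step 2: Apply the Negative Binomial PMF.
> $$P(X=k) = \binom{k-1}{r-1} p^r (1-p)^{k-r}$$
> $$P(X=10) = \binom{9}{4} p^5 (1-p)^{5}$$
> $$\binom{9}{4} = \frac{9!}{4!\,5!} = 126$$
> $$P(X=10) = 126\, p^5 (1-p)^5$$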
Answer: The probability that the 5th successful hire occurs on the 10th interview is approximately .
:::question type="MCQ" question="A machine produces items, and the probability of an item being non-defective is $0.9$. We are interested in finding the 3rd non-defective item. What is the expected number of items we need to examine until we find the 3rd non-defective item?" options=["3.33","2.7","3","10"] answer="3.33" hint="Use the expected value formula for the Negative Binomial distribution." solution="Let $X$ be the number of items examined until the 3rd non-defective item is found. This follows a Negative Binomial distribution.
Step 1: Identify the parameters.
> Number of desired successes, $r = 3$.
> Probability of success (non-defective item), $p = 0.9$.
Step 2: Apply the formula for the expected value of a Negative Binomial distribution.
> $$E[X] = \frac{r}{p}$$
> $$E[X] = \frac{3}{0.9}$$
> $$E[X] \approx 3.33$$
Answer: The expected number of items we need to examine is approximately $3.33$."
:::
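The Negative Binomial PMF and its mean can be sketched in Python as follows. The demo uses `r = 3` and an assumed success probability of `0.9`, the value consistent with the stated expected value of about `3.33`; the function names are illustrative.

```python
from math import comb


def negative_binomial_pmf(k: int, r: int, p: float) -> float:
    """P(X = k): the r-th success occurs exactly on trial k (k >= r)."""
    if k < r:
        return 0.0
    # The last trial is a success; the other r - 1 successes fall in the first k - 1 trials.
    return comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)


def negative_binomial_mean_var(r: int, p: float) -> tuple[float, float]:
    """Return (E[X], Var(X)) = (r/p, r(1-p)/p^2)."""
    return r / p, r * (1 - p) / p**2


if __name__ == "__main__":
    print(negative_binomial_mean_var(3, 0.9))
    print(negative_binomial_pmf(5, 3, 0.9))
```

With `r = 1` the PMF reduces to the Geometric PMF, which is one way to confirm the two definitions use the same "total trials" convention.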
---
5. Poisson Distribution
The Poisson distribution models the number of events occurring in a fixed interval of time or space, given that these events occur with a known constant mean rate and independently of the time since the last event. It is often used for rare events.
Let $X$ be a Poisson random variable. Its PMF is:
$$P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad k = 0, 1, 2, \dots$$
Where:
- $k$ = number of events
- $\lambda$ = average rate of events in the given interval ($\lambda > 0$)
- $e$ = Euler's number (approximately $2.71828$)
Expected Value: $E[X] = \lambda$
Variance: $\text{Var}(X) = \lambda$
Worked Example:
A call center receives an average of calls per hour. We want to find the probability that the call center receives exactly calls in a specific hour.
Step 1: Define the random variable and parameters.
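> Let $X$ be the number of calls received in an hour. This follows a Poisson distribution.
> The average rate of calls, $\lambda$ (calls per hour, as stated in the problem).
> Number of desired calls, $k = 2$.
Step 2: Apply the Poisson PMF.
> $$P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$$
> $$P(X=2) = \frac{e^{-\lambda} \lambda^2}{2!} = \frac{\lambda^2 e^{-\lambda}}{2}$$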
Answer: The probability that the call center receives exactly 2 calls in an hour is approximately .
:::question type="NAT" question="The number of typos on a page of a certain book follows a Poisson distribution with an average of $0.5$ typos per page. What is the probability that a randomly selected page has no typos? (Round to 4 decimal places)" answer="0.6065" hint="Use the Poisson PMF with $k = 0$." solution="Let $X$ be the number of typos on a page. This follows a Poisson distribution.
Step 1: Identify the parameters.
> Average rate of typos, $\lambda = 0.5$.
> Number of desired typos, $k = 0$.
Step 2: Apply the Poisson PMF.
> $$P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$$
> $$P(X=0) = \frac{e^{-0.5} (0.5)^0}{0!}$$
> $$P(X=0) = \frac{e^{-0.5} \times 1}{1}$$
> $$P(X=0) = e^{-0.5}$$
> $$P(X=0) \approx 0.60653$$
> $$P(X=0) \approx 0.6065$$
Answer: The probability that a randomly selected page has no typos is approximately $0.6065$."
:::
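A short Python sketch of the Poisson PMF follows. The rate `0.5` in the demo is an assumption consistent with the stated answer `0.6065` for a page with no typos. The helper `scaled_rate` is a hypothetical name showing how a per-unit rate is rescaled to the interval of interest.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def scaled_rate(rate_per_unit: float, units: float) -> float:
    """Rate for an interval spanning `units` base units (e.g. 3 days at a daily rate)."""
    return rate_per_unit * units


if __name__ == "__main__":
    # Assumed lambda = 0.5 typos per page; probability of zero typos.
    print(round(poisson_pmf(0, 0.5), 4))
    # Hypothetical: 2 events per hour observed over half an hour gives lambda = 1.
    print(poisson_pmf(0, scaled_rate(2.0, 0.5)))
```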
---
6. Hypergeometric Distribution
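The Hypergeometric distribution models the number of successes in a sample drawn without replacement from a finite population containing a known number of successes and failures.
Let $X$ be a Hypergeometric random variable. Its PMF is:
$$P(X=k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}}, \quad \max(0, n-(N-K)) \le k \le \min(n, K)$$
Where:
- $N$ = total number of items in the population
- $K$ = total number of "success" items in the population
- $n$ = number of items drawn in the sample
- $k$ = number of "success" items in the sample (number of desired successes)
Expected Value: $E[X] = n \frac{K}{N}$
Variance: $\text{Var}(X) = n \frac{K}{N} \left(1 - \frac{K}{N}\right) \frac{N-n}{N-1}$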
Worked Example:
A box contains items, of which are defective. If a sample of items is drawn randomly without replacement, we want to find the probability that exactly of the sampled items is defective.
Step 1: Define the random variable and parameters.
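> Let $X$ be the number of defective items in the sample. This follows a Hypergeometric distribution.
> Total population size, $N$.
> Total number of defective items in population, $K$.
> Sample size, $n$.
> Number of desired defective items in sample, $k = 1$.
Step 2: Apply the Hypergeometric PMF, with $N$, $K$, and $n$ as given in the problem.
> $$P(X=k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}}$$
> $$P(X=1) = \frac{\binom{K}{1} \binom{N-K}{n-1}}{\binom{N}{n}} = \frac{K \binom{N-K}{n-1}}{\binom{N}{n}}$$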
Answer: The probability that exactly 1 of the sampled items is defective is .
:::question type="MCQ" question="A deck of 52 cards contains 4 aces. If 5 cards are drawn randomly without replacement, what is the probability that exactly 2 of them are aces?" options=["","","",""] answer="" hint="Identify the total population, number of successes in population, sample size, and number of successes in sample." solution="Let $X$ be the number of aces in the sample of 5 cards. This follows a Hypergeometric distribution.
Step 1: Identify the parameters.
> Total population size, $N = 52$ (total cards).
> Total number of 'success' items (aces) in population, $K = 4$.
> Sample size, $n = 5$ (cards drawn).
> Number of desired 'success' items (aces) in sample, $k = 2$.
Step 2: Apply the Hypergeometric PMF.
> $$P(X=k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}}$$
> $$P(X=2) = \frac{\binom{4}{2} \binom{52-4}{5-2}}{\binom{52}{5}}$$
> $$P(X=2) = \frac{\binom{4}{2} \binom{48}{3}}{\binom{52}{5}}$$
Answer: The correct expression for the probability is $\frac{\binom{4}{2} \binom{48}{3}}{\binom{52}{5}}$."
:::
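The Hypergeometric PMF is easy to evaluate exactly with integer binomial coefficients. The sketch below applies it to the aces question (52 cards, 4 aces, 5 drawn, exactly 2 aces); the function name `hypergeometric_pmf` is an illustrative choice.

```python
from math import comb


def hypergeometric_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k): k successes in a sample of n drawn without replacement
    from a population of N items containing K successes."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


if __name__ == "__main__":
    # Exactly 2 aces among 5 cards drawn from a standard 52-card deck.
    print(hypergeometric_pmf(2, N=52, K=4, n=5))
```

Because sampling is without replacement, the draws are dependent; using `comb` on the whole population accounts for that, whereas a Binomial PMF with `p = K / N` would only approximate it.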
---
7. Discrete Uniform Distribution
The Discrete Uniform distribution assigns equal probability to each outcome in a finite set of possible values.
Let $X$ be a Discrete Uniform random variable. Its PMF is:
$$P(X=x) = \frac{1}{n}$$
Where:
- $n$ = total number of possible outcomes
- $x$ = each distinct outcome
If the outcomes are integers from $a$ to $b$ (inclusive), then $n = b - a + 1$.
Expected Value: $E[X] = \frac{a+b}{2}$
Variance: $\text{Var}(X) = \frac{(b-a+1)^2 - 1}{12} = \frac{n^2 - 1}{12}$
Worked Example:
A fair six-sided die is rolled. We want to find the probability of rolling a 4.
Step 1: Define the random variable and parameters.
> Let $X$ be the outcome of the die roll. This follows a Discrete Uniform distribution.
> The possible outcomes are $\{1, 2, 3, 4, 5, 6\}$.
> Number of possible outcomes, $n = 6$.
> Desired outcome, $x = 4$.
Step 2: Apply the Discrete Uniform PMF.
> $$P(X=x) = \frac{1}{n}$$
> $$P(X=4) = \frac{1}{6}$$
Answer: The probability of rolling a 4 is $\frac{1}{6}$.
:::question type="NAT" question="A random number generator produces integers from 1 to 100 (inclusive), with each integer having an equal probability of being generated. What is the expected value of the generated number?" answer="50.5" hint="Identify the range of outcomes and use the expected value formula for a Discrete Uniform distribution." solution="Let $X$ be the generated integer. This follows a Discrete Uniform distribution.
Step 1: Identify the parameters.
> The range of outcomes is from $a = 1$ to $b = 100$.
> Total number of outcomes, $n = b - a + 1 = 100$.
Step 2: Apply the formula for the expected value of a Discrete Uniform distribution.
> $$E[X] = \frac{a+b}{2}$$
> $$E[X] = \frac{1 + 100}{2}$$
> $$E[X] = \frac{101}{2}$$
> $$E[X] = 50.5$$
Answer: The expected value of the generated number is $50.5$."
:::
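The closed-form mean and variance of the Discrete Uniform distribution can be confirmed by brute-force enumeration. This is a minimal sketch with illustrative function names; it assumes the outcomes are consecutive integers.

```python
def discrete_uniform_mean_var(a: int, b: int) -> tuple[float, float]:
    """Closed-form (E[X], Var(X)) for integers a..b inclusive."""
    n = b - a + 1
    return (a + b) / 2, (n * n - 1) / 12


def enumerated_mean_var(a: int, b: int) -> tuple[float, float]:
    """Same quantities computed directly from the definition, for comparison."""
    values = range(a, b + 1)
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / n
    return mean, var


if __name__ == "__main__":
    print(discrete_uniform_mean_var(1, 100))  # random integer from 1 to 100
    print(enumerated_mean_var(1, 6))          # fair six-sided die
```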
---
Advanced Applications
Worked Example:
An online server experiences failures at an average rate of failures per day.
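Step 1: Identify the distribution and parameters for part 1.
> The number of failures in a fixed period follows a Poisson distribution.
> The average rate of failures per day is $\lambda$ (the rate stated above).
> For a 3-day period, the new average rate is $\lambda_3 = 3\lambda$.
> We want $k$ failures, where $k$ is the count asked for in part 1.
Step 2: Calculate the probability for part 1.
> $$P(X=k) = \frac{e^{-\lambda_3} \lambda_3^k}{k!} = \frac{e^{-3\lambda} (3\lambda)^k}{k!}$$
Step 3: Identify the distribution and parameters for part 2.
> For the server to run for 5 days without any failures, this means 0 failures in a 5-day period. This is still a Poisson distribution.
> The average rate of failures per day is $\lambda$.
> For a 5-day period, the new average rate is $\lambda_5 = 5\lambda$.
> We want $k = 0$ failures.
Step 4: Calculate the probability for part 2.
> $$P(X=0) = \frac{e^{-\lambda_5} \lambda_5^0}{0!}$$
> $$P(X=0) = e^{-5\lambda}$$
Answer: Part 1 is $\frac{e^{-3\lambda} (3\lambda)^k}{k!}$ and part 2 is $e^{-5\lambda}$, evaluated at the stated daily rate $\lambda$.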
:::question type="MSQ" question="A company sends out marketing emails, and historically, 20% of the recipients open the email. If 10 emails are sent to different recipients, which of the following statements are correct? (Select all that apply)" options=["The probability that exactly 3 emails are opened is $\binom{10}{3}(0.2)^3(0.8)^7$.","The expected number of opened emails is 2.","The probability that the first opened email is the 5th one sent is $(0.8)^4(0.2)$.","The variance of the number of opened emails is 2."] answer="The probability that exactly 3 emails are opened is $\binom{10}{3}(0.2)^3(0.8)^7$.,The expected number of opened emails is 2.,The probability that the first opened email is the 5th one sent is $(0.8)^4(0.2)$." hint="Analyze each statement based on Binomial and Geometric distributions." solution="Let $X$ be the number of opened emails out of 10. This is a Binomial distribution with $n = 10$ and $p = 0.2$.
Statement 1: The probability that exactly 3 emails are opened is $\binom{10}{3}(0.2)^3(0.8)^7$.
> This is directly from the Binomial PMF: $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$.
> For $k = 3$, $P(X=3) = \binom{10}{3}(0.2)^3(0.8)^7$.
> This statement is Correct.
Statement 2: The expected number of opened emails is 2.
> For a Binomial distribution, $E[X] = np$.
> $E[X] = 10 \times 0.2 = 2$.
> This statement is Correct.
Statement 3: The probability that the first opened email is the 5th one sent is $(0.8)^4(0.2)$.
> Let $Y$ be the number of emails sent until the first one is opened. This is a Geometric distribution with $p = 0.2$.
> $P(Y=k) = (1-p)^{k-1} p$.
> For $k = 5$, $P(Y=5) = (0.8)^4 (0.2)$.
> This statement is Correct.
Statement 4: The variance of the number of opened emails is 2.
> For a Binomial distribution, $\text{Var}(X) = np(1-p)$.
> $\text{Var}(X) = 10 \times 0.2 \times 0.8 = 1.6$.
> The statement says the variance is 2, which is incorrect.
> This statement is Incorrect.
Answer: The correct options are 'The probability that exactly 3 emails are opened is $\binom{10}{3}(0.2)^3(0.8)^7$.', 'The expected number of opened emails is 2.', 'The probability that the first opened email is the 5th one sent is $(0.8)^4(0.2)$.' "
:::
---
Problem-Solving Strategies
When faced with a problem involving discrete random variables, carefully analyze the problem context to select the appropriate distribution:
- Bernoulli: Single trial, two outcomes (success/failure).
- Binomial: Fixed number of independent trials, counting successes. Key phrases: "out of $n$ trials," "number of successes."
- Geometric: Number of trials until the first success. Key phrases: "first success on the $k$-th trial," "how many attempts until."
- Negative Binomial: Number of trials until the $r$-th success. Key phrases: "$r$-th success on the $k$-th trial."
- Poisson: Number of events in a fixed interval (time/space), average rate given. Key phrases: "average number of events per unit," "number of occurrences."
- Hypergeometric: Sampling without replacement from a finite population with two categories. Key phrases: "drawn from a batch," "without replacement."
- Discrete Uniform: Each outcome in a finite set is equally likely. Key phrases: "fair die," "random integer from $a$ to $b$."
---
Common Mistakes
❌ Confusing Binomial and Hypergeometric:
Students often use the Binomial distribution for sampling without replacement.
✅ Correct approach: If sampling is without replacement from a finite population, use the Hypergeometric distribution. If sampling is with replacement or from an infinite population (or one large enough relative to the sample that the approximation is acceptable), use the Binomial.
❌ Incorrectly identifying $k$ for Geometric vs. Negative Binomial:
For Geometric, $k$ is the total trials including the first success. For Negative Binomial, $k$ is the total trials including the $r$-th success.
✅ Correct approach: Read carefully whether the question asks for the number of trials before the $r$-th success or the total trials up to and including the $r$-th success. The standard formulas define $k$ as total trials.
❌ Misinterpreting $\lambda$ for Poisson distribution:
The parameter $\lambda$ must correspond to the given interval in the question. If the average is per hour and the question asks about a 3-hour period, $\lambda$ must be scaled.
✅ Correct approach: Always adjust $\lambda$ to match the time or space unit specified in the probability question. For example, if the average is 2 events/hour, then for a 0.5-hour interval, $\lambda = 2 \times 0.5 = 1$.
---
Practice Questions
:::question type="NAT" question="A particular type of integrated circuit has a failure rate of 1 in 100 during the first 1000 hours of operation. If a batch of 50 such circuits is tested, what is the variance of the number of circuits that fail within the first 1000 hours?" answer="0.495" hint="Identify the distribution and its parameters. Recall the variance formula." solution="Let $X$ be the number of circuits that fail in the batch of 50. This follows a Binomial distribution.
Step 1: Identify the parameters.
> Number of trials (circuits), $n = 50$.
> Probability of success (a circuit failing), $p = \frac{1}{100} = 0.01$.
Step 2: Apply the formula for the variance of a Binomial distribution.
> $$\text{Var}(X) = np(1-p)$$
> $$\text{Var}(X) = 50 \times 0.01 \times (1 - 0.01)$$
> $$\text{Var}(X) = 0.5 \times 0.99$$
> $$\text{Var}(X) = 0.495$$
Answer: The variance of the number of failing circuits is $0.495$."
:::
:::question type="MCQ" question="A biased coin has a probability of landing heads as . What is the probability that the 3rd head occurs on the 5th toss?" options=["","","",""] answer="" hint="This involves a specific number of successes on a specific trial number." solution="Let $X$ be the number of tosses until the 3rd head. This follows a Negative Binomial distribution.
Step 1: Identify the parameters.
> Number of desired successes, $r = 3$.
> Probability of success (heads), $p$, as stated in the question.
> Total number of trials, $k = 5$.
Step 2: Apply the Negative Binomial PMF.
> $$P(X=k) = \binom{k-1}{r-1} p^r (1-p)^{k-r}$$
> $$P(X=5) = \binom{4}{2} p^3 (1-p)^{2}$$
> $$P(X=5) = 6\, p^3 (1-p)^2$$
Answer: The correct option is ''."
:::
:::question type="MSQ" question="Which of the following scenarios can be appropriately modeled by a Poisson distribution? (Select all that apply)" options=["The number of defective items in a sample of 1000 taken from a large production line with a known defect rate.","The number of cars passing a specific point on a highway in a 5-minute interval, given the average traffic flow.","The number of customers arriving at a store between 9 AM and 10 AM, given an average arrival rate.","The number of heads obtained when flipping a coin 20 times."] answer="The number of cars passing a specific point on a highway in a 5-minute interval, given the average traffic flow.,The number of customers arriving at a store between 9 AM and 10 AM, given an average arrival rate." hint="Poisson models events in an interval with an average rate." solution="Option 1: The number of defective items in a sample of 1000 taken from a large production line with a known defect rate.
> This is a Binomial distribution ($n = 1000$, $p$ = defect rate). It can be approximated by Poisson if $n$ is large and $p$ is small, but it's fundamentally Binomial. So, not appropriately modeled as its primary distribution.
Option 2: The number of cars passing a specific point on a highway in a 5-minute interval, given the average traffic flow.
> This is a classic Poisson scenario: counting events (cars) in a fixed interval (5 minutes) with an average rate.
> This statement is Correct.
Option 3: The number of customers arriving at a store between 9 AM and 10 AM, given an average arrival rate.
> Another classic Poisson scenario: counting events (customer arrivals) in a fixed time interval (1 hour) with an average rate.
> This statement is Correct.
Option 4: The number of heads obtained when flipping a coin 20 times.
> This is a Binomial distribution ($n = 20$, $p$ = probability of heads on a single flip).
> This statement is Incorrect.
Answer: The correct options are 'The number of cars passing a specific point on a highway in a 5-minute interval, given the average traffic flow.', 'The number of customers arriving at a store between 9 AM and 10 AM, given an average arrival rate.' "
:::
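:::question type="NAT" question="A box contains 8 red balls and 4 blue balls. If 3 balls are drawn randomly without replacement, what is the probability that all 3 balls are red? (Round to 4 decimal places)" answer="0.2545" hint="This is sampling without replacement from a finite population." solution="Let $X$ be the number of red balls drawn. This follows a Hypergeometric distribution.
Step 1: Identify the parameters.
> Total population size, $N = 8 + 4 = 12$.
> Total number of 'success' items (red balls) in population, $K = 8$.
> Sample size, $n = 3$.
> Number of desired 'success' items (red balls) in sample, $k = 3$.
Step 2: Apply the Hypergeometric PMF.
> $$P(X=k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}}$$
> $$P(X=3) = \frac{\binom{8}{3} \binom{4}{0}}{\binom{12}{3}}$$
> $$\binom{8}{3} = \frac{8 \times 7 \times 6}{3 \times 2 \times 1} = 56$$
> $$\binom{4}{0} = 1$$
> $$\binom{12}{3} = \frac{12 \times 11 \times 10}{3 \times 2 \times 1} = 220$$
> $$P(X=3) = \frac{56 \times 1}{220}$$
> $$P(X=3) = \frac{14}{55}$$
> $$P(X=3) \approx 0.2545$$
Answer: The probability that all 3 balls are red is approximately $0.2545$."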
:::
:::question type="MCQ" question="A computer program generates a random integer $X$ between 1 and 10 (inclusive), such that each integer has an equal probability of being chosen. What is the variance of $X$?" options=["2.5","8.25","10","9.1667"] answer="8.25" hint="This is a Discrete Uniform distribution. Use the variance formula for integers $a$ to $b$." solution="Let $X$ be the random integer generated. This follows a Discrete Uniform distribution.
Step 1: Identify the parameters.
> The range of outcomes is from $a = 1$ to $b = 10$.
> Total number of outcomes, $n = b - a + 1 = 10$.
Step 2: Apply the formula for the variance of a Discrete Uniform distribution.
> $$\text{Var}(X) = \frac{(b-a+1)^2 - 1}{12}$$
> $$\text{Var}(X) = \frac{(10-1+1)^2 - 1}{12}$$
> $$\text{Var}(X) = \frac{10^2 - 1}{12}$$
> $$\text{Var}(X) = \frac{100 - 1}{12}$$
> $$\text{Var}(X) = \frac{99}{12}$$
> $$\text{Var}(X) = 8.25$$
Answer: The variance of $X$ is $8.25$."
:::
---
Summary
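| # | Formula/Concept | Expression | Expected Value | Variance |
|---|----------------|------------|----------------|----------|
| 1 | Bernoulli | $P(X=x) = p^x (1-p)^{1-x}$ | $p$ | $p(1-p)$ |
| 2 | Binomial | $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$ | $np$ | $np(1-p)$ |
| 3 | Geometric | $P(X=k) = (1-p)^{k-1} p$ | $\frac{1}{p}$ | $\frac{1-p}{p^2}$ |
| 4 | Negative Binomial | $P(X=k) = \binom{k-1}{r-1} p^r (1-p)^{k-r}$ | $\frac{r}{p}$ | $\frac{r(1-p)}{p^2}$ |
| 5 | Poisson | $P(X=k) = \frac{e^{-\lambda} \lambda^k}{k!}$ | $\lambda$ | $\lambda$ |
| 6 | Hypergeometric | $P(X=k) = \frac{\binom{K}{k} \binom{N-K}{n-k}}{\binom{N}{n}}$ | $n \frac{K}{N}$ | $n \frac{K}{N} \left(1 - \frac{K}{N}\right) \frac{N-n}{N-1}$ |
| 7 | Discrete Uniform | $P(X=x) = \frac{1}{n}$ | $\frac{a+b}{2}$ | $\frac{(b-a+1)^2 - 1}{12}$ |

---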
What's Next?
This topic connects to:
- Continuous Distributions: Understanding discrete distributions is foundational for studying continuous counterparts like the Exponential, Normal, and Uniform distributions.
- Central Limit Theorem: The Binomial distribution, under certain conditions, can be approximated by the Normal distribution, which is a key concept in the Central Limit Theorem.
- Moment Generating Functions (MGFs): MGFs provide a powerful tool to derive expected values and variances for these distributions, and to prove their properties.
- Stochastic Processes: Discrete distributions are building blocks for modeling discrete-time stochastic processes, such as Markov chains.
---
Proceeding to Continuous Distributions.
---
Part 2: Continuous Distributions
Continuous distributions model random variables that can take any value within a given range, providing a framework for analyzing probabilities of events over continuous scales. We apply these distributions to solve problems involving quantities such as time, distance, or measurement errors.
---
Core Concepts
1. Probability Density Function (PDF) and Cumulative Distribution Function (CDF)
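The Probability Density Function (PDF), $f(x)$, describes the relative likelihood that a continuous random variable $X$ takes a value near $x$; the probability of any single exact value is zero. The Cumulative Distribution Function (CDF), $F(x)$, gives the probability that $X$ will take a value less than or equal to $x$.
For a continuous random variable $X$:
- $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
- $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx = F(b) - F(a)$
- $f(x) = \frac{d}{dx} F(x)$ (where $F(x)$ is differentiable)
Where:
- $f(x) \ge 0$ for all $x$
- $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (total probability is 1)
- $0 \le F(x) \le 1$ for all $x$
- $F(x)$ is non-decreasing
- $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$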
When to use: To define the probability characteristics of continuous random variables.
Worked Example:
Consider a random variable $X$ with CDF given by:
Find the PDF, $f(x)$.
Step 1: Differentiate $F(x)$ with respect to $x$ for the interval where it is non-constant.
>
Step 2: Combine with the piecewise definition.
>
Answer: The PDF is for and otherwise.
:::question type="MCQ" question="A continuous random variable has a PDF for and otherwise. What is ?" options=["","","",""] answer="" hint="Use the CDF or direct integration of the PDF." solution="Step 1: Calculate the probability by integrating the PDF from to .
>
Step 2: Perform the integration.
>
>
>
>
"
:::
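The relationship between a PDF and its CDF can be illustrated numerically. The sketch below assumes a hypothetical density $f(x) = 2x$ on $[0, 1]$ (so $F(x) = x^2$), chosen only for illustration and not taken from the examples above. It integrates the density with a simple midpoint rule; all function names are illustrative.

```python
def midpoint_integral(f, lo: float, hi: float, steps: int = 10_000) -> float:
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def pdf(x: float) -> float:
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def cdf(x: float) -> float:
    """F(x) = P(X <= x), obtained by integrating the density up to x."""
    if x <= 0.0:
        return 0.0
    return midpoint_integral(pdf, 0.0, min(x, 1.0))


def interval_probability(a: float, b: float) -> float:
    """P(a <= X <= b) = F(b) - F(a)."""
    return cdf(b) - cdf(a)


if __name__ == "__main__":
    print(cdf(0.5))                          # close to 0.5**2
    print(interval_probability(0.25, 0.75))  # close to 0.75**2 - 0.25**2
```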
---
2. Expectation and Variance for Continuous Random Variables
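The expectation (mean) $E[X]$ represents the average value of a continuous random variable $X$. The variance $\text{Var}(X)$ measures the spread or dispersion of the values of $X$ around its mean.
For a continuous random variable $X$ with PDF $f(x)$:
- Expectation: $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$
- Expectation of a function $g(X)$: $E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx$
- Variance: $\text{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2$
- Second Moment: $E[X^2] = \int_{-\infty}^{\infty} x^2 f(x)\,dx$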
When to use: To characterize the central tendency and variability of a continuous distribution.
Worked Example:
A continuous random variable has the PDF for and otherwise. Find and .
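Step 1: Calculate $E[X]$, integrating over the interval where $f(x)$ is nonzero.
> $$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$$
Step 2: Calculate $E[X^2]$.
> $$E[X^2] = \int_{-\infty}^{\infty} x^2 f(x)\,dx$$
Step 3: Calculate $\text{Var}(X)$.
> $$\text{Var}(X) = E[X^2] - (E[X])^2$$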
Answer: and .
:::question type="NAT" question="Let be a continuous random variable with PDF for and otherwise. Find . Round your answer to two decimal places." answer="0.38" hint="First, find the constant by integrating the PDF over its domain and setting it to 1. Then calculate using the formula." solution="Step 1: Find the constant . The total probability must be 1.
>
>
>
>
Step 2: Calculate using the determined PDF .
>
>
>
>
>
>
Step 3: Convert to two decimal places.
>
"
:::
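The steps of finding a normalizing constant, then the mean and variance, can be sketched numerically. The density shape $g(x) = x$ on $[0, 1]$ used here is a hypothetical choice for illustration (it gives $c = 2$, $E[X] = 2/3$, $\text{Var}(X) = 1/18$) and is not the density from the question above. All function names are illustrative.

```python
def midpoint_integral(f, lo: float, hi: float, steps: int = 10_000) -> float:
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def normalizing_constant(shape, lo: float, hi: float) -> float:
    """Constant c such that c * shape(x) integrates to 1 over [lo, hi]."""
    return 1.0 / midpoint_integral(shape, lo, hi)


def moment(pdf, order: int, lo: float, hi: float) -> float:
    """E[X^order] for a density supported on [lo, hi]."""
    return midpoint_integral(lambda x: x**order * pdf(x), lo, hi)


def mean_and_variance(pdf, lo: float, hi: float) -> tuple[float, float]:
    """Return (E[X], Var(X)) using Var(X) = E[X^2] - (E[X])^2."""
    mean = moment(pdf, 1, lo, hi)
    return mean, moment(pdf, 2, lo, hi) - mean**2


if __name__ == "__main__":
    c = normalizing_constant(lambda x: x, 0.0, 1.0)  # hypothetical shape g(x) = x
    print(c)
    print(mean_and_variance(lambda x: c * x, 0.0, 1.0))
```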
---
3. Uniform Distribution
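A continuous random variable $X$ has a Uniform distribution over the interval $[a, b]$ if its PDF is constant within this interval and zero elsewhere. This implies that all subintervals of equal length within $[a, b]$ are equally likely.
PDF: $f(x) = \frac{1}{b-a}$ for $a \le x \le b$, and $f(x) = 0$ otherwise
CDF: $F(x) = 0$ for $x < a$; $F(x) = \frac{x-a}{b-a}$ for $a \le x \le b$; $F(x) = 1$ for $x > b$
Mean: $E[X] = \frac{a+b}{2}$
Variance: $\text{Var}(X) = \frac{(b-a)^2}{12}$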
When to use: When all outcomes within a specified range are equally probable.
Worked Example:
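A bus arrives at a stop at a random time between 10:00 AM and 10:15 AM. Let $X$ be the arrival time in minutes past 10:00 AM. Find the probability that the bus arrives between 10:05 AM and 10:10 AM.
Step 1: Define the distribution parameters.
The arrival time $X$ is uniformly distributed over the interval $[0, 15]$ minutes. So, $a = 0$ and $b = 15$.
Step 2: Write the PDF.
> $$f(x) = \frac{1}{15 - 0} = \frac{1}{15}, \quad 0 \le x \le 15$$
Step 3: Calculate the probability $P(5 \le X \le 10)$.
> $$P(5 \le X \le 10) = \int_{5}^{10} \frac{1}{15}\,dx$$
> $$P(5 \le X \le 10) = \frac{10 - 5}{15}$$
> $$P(5 \le X \le 10) = \frac{1}{3}$$
Answer: The probability that the bus arrives between 10:05 AM and 10:10 AM is $\frac{1}{3}$.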
:::question type="MCQ" question="The lifespan of a certain electronic component, in hours, is uniformly distributed between 500 and 1500. What is the expected lifespan of the component?" options=["750 hours","1000 hours","1250 hours","1500 hours"] answer="1000 hours" hint="For a uniform distribution $U[a, b]$, the expected value is simply the midpoint of the interval." solution="Step 1: Identify the parameters of the uniform distribution.
The lifespan $X$ is uniformly distributed over $[500, 1500]$. So, $a = 500$ and $b = 1500$.
Step 2: Apply the formula for the mean of a uniform distribution.
> $$E[X] = \frac{a+b}{2}$$
> $$E[X] = \frac{500 + 1500}{2}$$
> $$E[X] = 1000$$
Answer: The expected lifespan of the component is 1000 hours."
:::
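A compact Python sketch of the continuous Uniform CDF follows, applied to the bus example (arrival between 0 and 15 minutes past 10:00 AM) and the component-lifespan question (500 to 1500 hours). The function names are illustrative.

```python
def uniform_cdf(x: float, a: float, b: float) -> float:
    """F(x) for X ~ Uniform[a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def uniform_interval_probability(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo <= X <= hi) = F(hi) - F(lo)."""
    return uniform_cdf(hi, a, b) - uniform_cdf(lo, a, b)


def uniform_mean_var(a: float, b: float) -> tuple[float, float]:
    """Return (E[X], Var(X)) = ((a+b)/2, (b-a)^2/12)."""
    return (a + b) / 2, (b - a) ** 2 / 12


if __name__ == "__main__":
    print(uniform_interval_probability(5, 10, 0, 15))  # bus arrives 10:05 to 10:10
    print(uniform_mean_var(500, 1500))                 # component lifespan
```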
---
4. Exponential Distribution
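The Exponential distribution models the time until the next event occurs in a Poisson process, where events occur continuously and independently at a constant average rate $\lambda$. It exhibits the memoryless property.
PDF: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $f(x) = 0$ otherwise
CDF: $F(x) = 1 - e^{-\lambda x}$ for $x \ge 0$, and $F(x) = 0$ otherwise
Mean: $E[X] = \frac{1}{\lambda}$
Variance: $\text{Var}(X) = \frac{1}{\lambda^2}$
Memoryless Property: $P(X > s + t \mid X > s) = P(X > t)$ for $s, t \ge 0$.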
When to use: For modeling waiting times, lifetimes of components, or inter-arrival times.
Worked Example:
The time (in minutes) a customer spends waiting for service at a bank follows an exponential distribution with an average waiting time of 5 minutes. What is the probability that a customer waits more than 10 minutes?
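Step 1: Determine the rate parameter $\lambda$.
The average waiting time is $E[X] = 5$ minutes. For an exponential distribution, $E[X] = \frac{1}{\lambda}$.
> $$\lambda = \frac{1}{5} = 0.2$$
Step 2: Use the CDF or direct integration to find $P(X > 10)$.
> $$P(X > 10) = 1 - F(10) = e^{-\lambda \cdot 10}$$
> $$P(X > 10) = e^{-0.2 \times 10} = e^{-2} \approx 0.1353$$
Answer: The probability that a customer waits more than 10 minutes is $e^{-2} \approx 0.1353$.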
:::question type="MCQ" question="The lifetime of a light bulb follows an exponential distribution with a mean of 800 hours. Given that a light bulb has already lasted 600 hours, what is the probability that it will last for at least another 200 hours?" options=["","","",""] answer="" hint="This question tests the memoryless property of the exponential distribution." solution="Step 1: Determine the rate parameter $\lambda$.
The mean lifetime is $E[X] = 800$ hours. So, $\lambda = \frac{1}{800}$.
Step 2: Apply the memoryless property.
The memoryless property states that $P(X > s + t \mid X > s) = P(X > t)$.
Here, $s = 600$ hours (already lasted) and $t = 200$ hours (another 200 hours).
We need to find $P(X > 800 \mid X > 600) = P(X > 200)$.
Step 3: Calculate $P(X > 200)$.
> $$P(X > 200) = e^{-\lambda \cdot 200} = e^{-200/800}$$
> $$P(X > 200) = e^{-0.25} \approx 0.7788$$
Answer: The probability that it will last for at least another 200 hours is $e^{-0.25} \approx 0.7788$.
"
:::
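The memoryless property can be verified directly from the survival function $P(X > t) = e^{-\lambda t}$. The sketch below uses the bank example (mean wait of 5 minutes) and the light-bulb question (mean of 800 hours, 600 hours already elapsed, 200 more required); the function names are illustrative.

```python
from math import exp


def exponential_survival(t: float, rate: float) -> float:
    """P(X > t) = exp(-rate * t) for X ~ Exponential(rate), t >= 0."""
    return exp(-rate * t)


def conditional_survival(s: float, t: float, rate: float) -> float:
    """P(X > s + t | X > s), computed from the definition of conditional probability."""
    return exponential_survival(s + t, rate) / exponential_survival(s, rate)


if __name__ == "__main__":
    # Bank example: mean wait 5 minutes, so rate = 1/5; wait longer than 10 minutes.
    print(exponential_survival(10, 1 / 5))
    # Light bulb: mean 800 hours; already lasted 600, needs 200 more.
    print(conditional_survival(600, 200, 1 / 800))
    print(exponential_survival(200, 1 / 800))  # equal, by memorylessness
```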
---
5. Normal (Gaussian) Distribution
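The Normal distribution is a symmetric, bell-shaped distribution characterized by its mean $\mu$ and variance $\sigma^2$. It is fundamental in statistics due to the Central Limit Theorem.
PDF: $f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ for $-\infty < x < \infty$
CDF: $F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right)$, where $\Phi$ is the CDF of the Standard Normal Distribution $N(0, 1)$. The CDF has no closed-form expression and is typically found using tables or software.
Mean: $E[X] = \mu$
Variance: $\text{Var}(X) = \sigma^2$
Standardization: If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$.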
When to use: For modeling natural phenomena, measurement errors, or sums of many independent random variables.
Worked Example:
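Suppose the scores on a standardized test are normally distributed with a mean of 70 and a standard deviation of 10. What is the probability that a randomly selected student scores between 60 and 85? (Use $\Phi(1) = 0.8413$ and $\Phi(1.5) = 0.9332$).
Step 1: Define the distribution parameters.
$X \sim N(70, 10^2)$, so $\mu = 70$ and $\sigma = 10$.
Step 2: Standardize the values $x_1 = 60$ and $x_2 = 85$ to $Z$-scores.
> $$z_1 = \frac{60 - 70}{10} = -1$$
> $$z_2 = \frac{85 - 70}{10} = 1.5$$
Step 3: Calculate the probability using the standard normal CDF.
> $$P(60 \le X \le 85) = P(-1 \le Z \le 1.5) = \Phi(1.5) - \Phi(-1)$$
> $$\Phi(-1) = 1 - \Phi(1) = 1 - 0.8413 = 0.1587$$
> $$P(60 \le X \le 85) = 0.9332 - 0.1587$$
> $$P(60 \le X \le 85) = 0.7745$$
Answer: The probability that a student scores between 60 and 85 is approximately $0.7745$.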
:::question type="MCQ" question="The weights of adult males in a city are normally distributed with a mean of 75 kg and a standard deviation of 5 kg. What percentage of adult males weigh between 70 kg and 80 kg? (Use the empirical rule or $\Phi(1) = 0.8413$)" options=["68.26%","95.45%","99.73%","50%"] answer="68.26%" hint="Recognize that the interval is within one standard deviation of the mean. Recall the empirical rule for normal distributions." solution="Step 1: Identify the distribution parameters.
$X \sim N(75, 5^2)$, so $\mu = 75$ and $\sigma = 5$.
Step 2: Observe the given interval.
The interval is $[70, 80] = [\mu - \sigma, \mu + \sigma]$.
We are looking for the probability $P(70 \le X \le 80)$.
Step 3: Apply the empirical rule.
For a normal distribution, approximately 68.26% of data falls within one standard deviation of the mean.
Alternatively, using Z-scores: $P(70 \le X \le 80) = P(-1 \le Z \le 1) = \Phi(1) - \Phi(-1)$.
Using symmetry, $\Phi(-1) = 1 - \Phi(1)$, so $P(-1 \le Z \le 1) = 2\Phi(1) - 1$.
Given $\Phi(1) = 0.8413$:
$P(70 \le X \le 80) = 2(0.8413) - 1 = 0.6826$.
Answer: Approximately 68.26% of adult males weigh between 70 kg and 80 kg."
:::
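When a table of $\Phi$ values is not at hand, the standard normal CDF can be computed from the error function, using the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. The sketch below applies it to the test-score example (mean 70, standard deviation 10) and the weight question (mean 75 kg, standard deviation 5 kg); function names are illustrative.

```python
from math import erf, sqrt


def standard_normal_cdf(z: float) -> float:
    """Phi(z), the CDF of N(0, 1), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_interval_probability(lo: float, hi: float, mu: float, sigma: float) -> float:
    """P(lo <= X <= hi) for X ~ N(mu, sigma^2), by standardizing both endpoints."""
    z_lo = (lo - mu) / sigma
    z_hi = (hi - mu) / sigma
    return standard_normal_cdf(z_hi) - standard_normal_cdf(z_lo)


if __name__ == "__main__":
    print(normal_interval_probability(60, 85, mu=70, sigma=10))  # test scores
    print(normal_interval_probability(70, 80, mu=75, sigma=5))   # within one sigma
```

The results differ from table-based answers only in the later decimal places, since tables round $\Phi$ to four digits.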
---
6. Gamma Distribution
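The Gamma distribution is a versatile distribution used to model waiting times or the sum of independent exponentially distributed random variables. It is characterized by two parameters: shape ($\alpha$) and rate ($\lambda$, or equivalently scale $\theta = 1/\lambda$).
PDF: $f(x) = \frac{\lambda^{\alpha} x^{\alpha - 1} e^{-\lambda x}}{\Gamma(\alpha)}$ for $x > 0$, and $f(x) = 0$ otherwise
Where:
- $\Gamma(\alpha) = \int_{0}^{\infty} t^{\alpha - 1} e^{-t}\,dt$ is the Gamma function.
- For integer $\alpha$, $\Gamma(\alpha) = (\alpha - 1)!$.
Mean: $E[X] = \frac{\alpha}{\lambda}$
Variance: $\text{Var}(X) = \frac{\alpha}{\lambda^2}$
Relationship to Exponential: If $X_1, X_2, \dots, X_n$ are i.i.d. $\text{Exp}(\lambda)$, then $\sum_{i=1}^{n} X_i \sim \text{Gamma}(n, \lambda)$. (Here $\alpha = n$.)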
When to use: For modeling waiting times until events occur, or in reliability analysis.
Worked Example:
Suppose the waiting time for the first customer at a store is exponentially distributed with a mean of 2 minutes. What is the mean waiting time for the 3rd customer, assuming waiting times are independent?
Step 1: Identify the parameters for a single exponential waiting time.
The mean waiting time for the first customer (an exponential random variable) is 2 minutes. So for $X \sim \text{Exp}(\lambda)$, $E[X] = \frac{1}{\lambda} = 2$, which gives $\lambda = 0.5$.
Step 2: Relate to the Gamma distribution.
The waiting time for the 3rd customer is the sum of 3 independent exponential random variables, each with rate $\lambda = 0.5$. This sum follows a Gamma distribution with shape parameter $\alpha = 3$ and rate parameter $\lambda = 0.5$.
So, $T \sim \text{Gamma}(3, 0.5)$.
Step 3: Calculate the mean of the Gamma distribution.
> $$E[T] = \frac{\alpha}{\lambda}$$
> $$E[T] = \frac{3}{0.5} = 6$$
Answer: The mean waiting time for the 3rd customer is 6 minutes.
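:::question type="MCQ" question="A system has three identical components, each with a lifetime that follows an exponential distribution with a mean of 100 hours. The components are used one at a time in standby: when the active component fails, the next one takes over immediately, and the system fails when the third component fails. What is the expected time until system failure? Assume component lifetimes are independent." options=["100 hours","200 hours","300 hours","Not enough information"] answer="300 hours" hint="The sum of independent exponential random variables follows a Gamma distribution. Identify the parameters for the Gamma distribution." solution="Step 1: Determine the rate parameter for each component's exponential lifetime.
Mean lifetime $E[X] = 100$ hours. For $X \sim \text{Exp}(\lambda)$, $E[X] = \frac{1}{\lambda}$.
So, $\lambda = \frac{1}{100} = 0.01$.
Step 2: Recognize that the system failure time is the sum of three independent exponential random variables.
Let $X_1, X_2, X_3$ be the lifetimes of the three components. Because the components run one after another, the system lifetime is $T = X_1 + X_2 + X_3$. (If all three ran simultaneously in parallel, the lifetime would instead be $\max(X_1, X_2, X_3)$, whose mean is $100\left(1 + \frac{1}{2} + \frac{1}{3}\right) \approx 183.3$ hours, not 300.)
Since $X_i \sim \text{Exp}(0.01)$ independently, their sum follows a Gamma distribution with shape parameter $\alpha = 3$ and rate parameter $\lambda = 0.01$.
So, $T \sim \text{Gamma}(3, 0.01)$.
Step 3: Calculate the expected value of the Gamma distribution.
> $$E[T] = \frac{\alpha}{\lambda}$$
> $$E[T] = \frac{3}{0.01} = 300$$
Answer: The expected time until system failure is 300 hours."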
:::
---
7. Beta Distribution
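The Beta distribution is defined on the interval $[0, 1]$ and is widely used to model probabilities or proportions. It is characterized by two positive shape parameters, $\alpha$ and $\beta$.
PDF: $f(x; \alpha, \beta) = \dfrac{x^{\alpha - 1}(1 - x)^{\beta - 1}}{B(\alpha, \beta)}$ for $0 \le x \le 1$
Where:
* $B(\alpha, \beta) = \dfrac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)}$ is the Beta function.
Mean: $E[X] = \dfrac{\alpha}{\alpha + \beta}$
Variance: $\text{Var}(X) = \dfrac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$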
When to use: For modeling probabilities, proportions, or values constrained between 0 and 1.
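As a rough illustration of the Beta formulas, the sketch below evaluates the Beta function through `math.gamma` and computes the PDF, mean, and variance. The parameter values $\alpha = 2$, $\beta = 3$ are hypothetical and chosen only for the demonstration, and the PDF helper assumes both shape parameters are at least 1 so that the endpoints are well behaved.

```python
import math

def beta_function(a, b):
    """B(a, b) expressed through the Gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_pdf(x, a, b):
    """Beta PDF on [0, 1]; this sketch assumes a >= 1 and b >= 1."""
    if not 0.0 <= x <= 1.0:
        return 0.0
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_function(a, b)

def beta_mean(a, b):
    return a / (a + b)

def beta_variance(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1))

# Hypothetical parameters, used only for illustration.
a, b = 2, 3
print(beta_mean(a, b))        # 0.4
print(beta_variance(a, b))    # approximately 0.04
print(beta_pdf(0.5, a, b))    # approximately 1.5
print(beta_pdf(0.3, 1, 1))    # 1.0, since Beta(1, 1) is the Uniform(0, 1) density
```

The last line reflects the special case $\alpha = \beta = 1$, where the Beta PDF is flat on $[0, 1]$.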
Worked Example:
A random variable representing the proportion of time a machine is operational follows a Beta distribution with parameters and . What is the expected proportion of time the machine is operational?
Step 1: Identify the parameters of the Beta distribution.
, so and .
Step 2: Apply the formula for the mean of a Beta distribution.
>
>
Answer: The expected proportion of time the machine is operational is or .
:::question type="MCQ" question="The proportion of defective items produced by a manufacturing process is modeled by a Beta distribution with parameters and . What is the PDF of this proportion?" options=[" for "," for "," for "," for "] answer=" for " hint="Recall the definition of the Beta function and simplify the PDF for the given parameters. What distribution is this equivalent to?" solution="Step 1: Identify the parameters of the Beta distribution.
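$\alpha = 1$ and $\beta = 1$.
Step 2: Write down the PDF formula for $\text{Beta}(\alpha, \beta)$.
> $$f(x; \alpha, \beta) = \frac{x^{\alpha - 1}(1 - x)^{\beta - 1}}{B(\alpha, \beta)}$$
Step 3: Substitute $\alpha = 1$ and $\beta = 1$ into the PDF.
> $$f(x; 1, 1) = \frac{x^{0}(1 - x)^{0}}{B(1, 1)} = \frac{1}{B(1, 1)}$$
Step 4: Calculate the Beta function $B(1, 1)$.
$B(1, 1) = \dfrac{\Gamma(1)\,\Gamma(1)}{\Gamma(2)} = \dfrac{0!\,0!}{1!} = 1$.
Step 5: Substitute $B(1, 1) = 1$ back into the PDF.
> $$f(x) = \frac{1}{1} = 1$$
This is the PDF of a Uniform distribution $U(0, 1)$.
Answer: The PDF is $f(x) = 1$ for $0 \le x \le 1$."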
:::
---
8. Lognormal Distribution
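A random variable $X$ is lognormally distributed if its logarithm, $\ln X$, is normally distributed. This distribution is suitable for modeling variables that are positively skewed and bounded below by zero, such as financial asset prices or income.
If $Y = \ln X$ is normally distributed with mean $\mu$ and variance $\sigma^2$ (i.e., $Y \sim N(\mu, \sigma^2)$), then $X = e^{Y}$ is lognormally distributed.
PDF: $f(x; \mu, \sigma) = \dfrac{1}{x \sigma \sqrt{2\pi}} \exp\!\left(-\dfrac{(\ln x - \mu)^2}{2\sigma^2}\right)$ for $x > 0$
Mean: $E[X] = e^{\mu + \sigma^2/2}$
Variance: $\text{Var}(X) = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}$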
When to use: For modeling variables that are products of many independent positive random variables, or naturally positive, skewed data.
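The lognormal moments are easy to mix up with those of the underlying normal, so a small numerical sketch may help. The helper names below are illustrative, and the parameters $\mu = 0$, $\sigma = 1$ are hypothetical values chosen for the demonstration, not taken from the examples in this section.

```python
import math

def lognormal_mean(mu, sigma):
    """E[X] when ln X ~ N(mu, sigma^2)."""
    return math.exp(mu + sigma ** 2 / 2)

def lognormal_median(mu):
    """Median of X when ln X ~ N(mu, sigma^2); it does not depend on sigma."""
    return math.exp(mu)

def lognormal_variance(mu, sigma):
    """Var(X) when ln X ~ N(mu, sigma^2)."""
    return (math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)

# Hypothetical parameters of the underlying normal distribution.
mu, sigma = 0.0, 1.0
print(lognormal_mean(mu, sigma))      # approximately 1.6487
print(lognormal_median(mu))           # 1.0
print(lognormal_variance(mu, sigma))  # approximately 4.6708
```

Note that the mean exceeds the median whenever $\sigma > 0$, which is the positive skew described above.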
Worked Example:
The price of a stock, , is lognormally distributed. If is normally distributed with mean and standard deviation , what is the expected price of the stock?
Step 1: Identify the parameters of the underlying normal distribution.
For , we have and .
Thus, .
Step 2: Apply the formula for the mean of a Lognormal distribution.
>
>
>
>
Answer: The expected price of the stock is . (Numerical value is approx. ).
:::question type="MCQ" question="A random variable follows a Lognormal distribution such that . Which of the following statements is true regarding ?" options=["","The median of is ","The variance of is "," can take negative values"] answer="The median of is " hint="Recall the properties of the Lognormal distribution, particularly how its median relates to the mean of the underlying normal distribution." solution="Step 1: Identify the parameters of the underlying normal distribution.
For , we have and .
Step 2: Evaluate each option.
* : The mean of a Lognormal distribution is . Here, . So, this statement is false.
* The median of is : For a Lognormal distribution, the median is . Since , the median is . This statement is true.
* The variance of is : The variance of is . This is not . The variance of is . So, this statement is false.
* can take negative values: The Lognormal distribution is defined for . So, this statement is false.
Answer: The median of is "
:::
---
9. Weibull Distribution
The Weibull distribution is a flexible distribution commonly used in reliability engineering to model lifetimes of components. It can model decreasing, constant, or increasing failure rates depending on its shape parameter.
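PDF: $f(x; k, \lambda) = \dfrac{k}{\lambda}\left(\dfrac{x}{\lambda}\right)^{k - 1} e^{-(x/\lambda)^{k}}$ for $x \ge 0$
Where:
$k > 0$ is the shape parameter.
$\lambda > 0$ is the scale parameter.
Mean: $E[X] = \lambda\,\Gamma\!\left(1 + \dfrac{1}{k}\right)$
Variance: $\text{Var}(X) = \lambda^2\left[\Gamma\!\left(1 + \dfrac{2}{k}\right) - \Gamma\!\left(1 + \dfrac{1}{k}\right)^2\right]$
Relationship to Exponential: If $k = 1$, the Weibull distribution reduces to the Exponential distribution with rate $1/\lambda$.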
When to use: For modeling material fatigue, wind speed, or reliability of systems.
Worked Example:
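The lifetime of a certain type of bearing (in years) follows a Weibull distribution with shape parameter $k = 1$ and scale parameter $\lambda = 5$. What is the mean lifetime of these bearings?
Step 1: Identify the parameters of the Weibull distribution.
$k = 1$ and $\lambda = 5$.
Step 2: Apply the formula for the mean of a Weibull distribution.
> $$E[X] = \lambda\,\Gamma\!\left(1 + \frac{1}{k}\right)$$
> $$E[X] = 5\,\Gamma\!\left(1 + \frac{1}{1}\right)$$
> $$E[X] = 5\,\Gamma(2)$$
Step 3: Evaluate the Gamma function.
Recall that for a positive integer $n$, $\Gamma(n) = (n - 1)!$. So, $\Gamma(2) = 1! = 1$.
> $$E[X] = 5 \times 1 = 5$$
Answer: The mean lifetime of the bearings is 5 years. (Note: Since $k = 1$, this is an exponential distribution with mean $\lambda = 5$.)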
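The following sketch evaluates the Weibull mean with `math.gamma` and also shows how the shape parameter drives the failure-rate trend. The hazard function $h(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}$ is not stated in this section; it is the standard ratio of the Weibull PDF to its survival function and is introduced here as an assumption of the example. Function names are illustrative.

```python
import math

def weibull_mean(k, lam):
    """Mean of a Weibull distribution with shape k and scale lam."""
    return lam * math.gamma(1 + 1 / k)

def weibull_hazard(t, k, lam):
    """Failure rate h(t) = f(t) / S(t) for a Weibull with shape k and scale lam."""
    return (k / lam) * (t / lam) ** (k - 1)

print(weibull_mean(1, 5))  # 5.0, matching the worked example

# Compare the hazard at two times for three illustrative shape values.
for k in (0.5, 1, 2):
    early = weibull_hazard(1.0, k, 5)
    late = weibull_hazard(2.0, k, 5)
    trend = "decreasing" if late < early else "constant" if late == early else "increasing"
    print(k, trend)
```

With shape below 1 the hazard falls over time, at exactly 1 it stays flat, and above 1 it rises.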
:::question type="MCQ" question="A device's time to failure (in months) is modeled by a Weibull distribution with parameters and . What is the initial failure rate trend of this device?" options=["Constant failure rate","Increasing failure rate","Decreasing failure rate","Cannot be determined without more information"] answer="Increasing failure rate" hint="The shape parameter determines the failure rate trend for a Weibull distribution. Consider the cases , , and ." solution="Step 1: Identify the shape parameter .
Here, .
Step 2: Relate the shape parameter to the failure rate trend.
* If $k < 1$, the failure rate is decreasing over time. This indicates 'infant mortality' or devices that improve with age.
* If $k = 1$, the failure rate is constant (equivalent to an exponential distribution). This implies random failures, independent of age.
* If $k > 1$, the failure rate is increasing over time. This indicates 'wear-out' or devices that are more likely to fail as they age.
Step 3: Conclude based on .
Since , the device exhibits an increasing failure rate.
Answer: Increasing failure rate"
:::
---
10. Cauchy Distribution
The Cauchy distribution is a peculiar continuous distribution known for its heavy tails and the fact that its mean and variance are undefined. It is a classic example of a distribution for which the Law of Large Numbers does not apply.
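PDF: $f(x; x_0, \gamma) = \dfrac{1}{\pi\gamma\left[1 + \left(\dfrac{x - x_0}{\gamma}\right)^2\right]}$ for $-\infty < x < \infty$
Where:
$x_0$ is the location parameter (median and mode).
$\gamma > 0$ is the scale parameter.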
Mean: Undefined
Variance: Undefined
When to use: In physics (e.g., resonance phenomena), or as a counterexample in probability theory due to its undefined moments.
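Although the moments are undefined, the Cauchy PDF and CDF are simple closed forms. The sketch below assumes the standard CDF $F(x) = \frac{1}{2} + \frac{1}{\pi}\arctan\!\left(\frac{x - x_0}{\gamma}\right)$, which is not given in this section, and uses hypothetical parameters $x_0 = 0$, $\gamma = 1$ for illustration.

```python
import math

def cauchy_pdf(x, x0, gamma):
    """Cauchy density with location x0 and scale gamma."""
    return 1.0 / (math.pi * gamma * (1.0 + ((x - x0) / gamma) ** 2))

def cauchy_cdf(x, x0, gamma):
    """Cauchy CDF; it equals 0.5 at the location parameter."""
    return 0.5 + math.atan((x - x0) / gamma) / math.pi

# Hypothetical parameters, used only for illustration.
x0, gamma = 0.0, 1.0
print(cauchy_cdf(x0, x0, gamma))          # 0.5, so x0 is the median
print(cauchy_cdf(x0 + gamma, x0, gamma))  # approximately 0.75
print(cauchy_pdf(x0 + 1.5, x0, gamma) == cauchy_pdf(x0 - 1.5, x0, gamma))  # True
```

The median and quartiles are well defined even though the mean and variance are not, which is why the location and scale parameters are described through quantiles rather than moments.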
Worked Example:
A random variable follows a Cauchy distribution with location parameter and scale parameter . What is the median of this distribution?
Step 1: Identify the location parameter.
For a Cauchy distribution, the location parameter $x_0$ is also its median. Here, $x_0 = 0$.
Step 2: State the median.
The median of the Cauchy distribution is $x_0$.
> $$\text{Median} = x_0 = 0$$
Answer: The median of the distribution is 0.
:::question type="MCQ" question="Which of the following statements is true about a random variable following a Cauchy distribution?" options=["Its mean is always zero.","Its variance is finite.","The Central Limit Theorem applies to sums of Cauchy random variables.","Its PDF is symmetric around its location parameter."] answer="Its PDF is symmetric around its location parameter." hint="Recall the unique properties of the Cauchy distribution, especially regarding its moments and symmetry." solution="Step 1: Evaluate each statement based on the properties of the Cauchy distribution.
* Its mean is always zero. False. The mean of a Cauchy distribution is undefined, not necessarily zero.
* Its variance is finite. False. The variance of a Cauchy distribution is undefined.
* The Central Limit Theorem applies to sums of Cauchy random variables. False. The Central Limit Theorem requires finite variance for the sum to converge to a normal distribution. For Cauchy random variables, the average of i.i.d. Cauchy variables is itself a Cauchy variable, not a normal one.
* Its PDF is symmetric around its location parameter. True. The PDF is symmetric around $x_0$, as replacing $x - x_0$ with $-(x - x_0)$ does not change the value of the PDF.
Answer: Its PDF is symmetric around its location parameter."
:::
---
Advanced Applications
We apply the concepts of continuous distributions to solve problems involving transformations of random variables and finding probabilities in more complex scenarios.
Worked Example:
Let be a continuous random variable with PDF for and otherwise. Find the PDF of .
Step 1: Find the CDF of .
For :
>
So, for .
Step 2: Find the CDF of .
Since , for , we have .
The CDF of , , is .
>
>
And for , for .
Step 3: Find the PDF of by differentiating .
>
>
Answer: The PDF of is for and otherwise, which is a Uniform distribution .
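:::question type="NAT" question="Let $X$ be an exponential random variable with parameter $\lambda = 1$. Find the median of $X$. Round your answer to two decimal places." answer="0.69" hint="The median $m$ is the value such that $P(X \le m) = 0.5$. Use the CDF of the exponential distribution." solution="Step 1: Write the CDF for an exponential distribution with $\lambda = 1$.
> $$F(x) = 1 - e^{-x}, \quad x \ge 0$$
Step 2: Set $F(m) = 0.5$ and solve for $m$.
> $$1 - e^{-m} = 0.5$$
> $$e^{-m} = 0.5$$
> $$-m = \ln(0.5)$$
> $$m = \ln 2$$
Step 3: Calculate the numerical value and round to two decimal places.
> $$m = \ln 2 \approx 0.6931$$
> $$m \approx 0.69$$
Answer: 0.69"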
:::
---
Problem-Solving Strategies
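When finding the PDF of a transformed variable $Y = g(X)$:
- Find the CDF of $Y$: $F_Y(y) = P(Y \le y) = P(g(X) \le y)$.
- Express $F_Y(y)$ in terms of the CDF of $X$, $F_X$. This often involves solving the inequality $g(X) \le y$ for $X$ in terms of $y$.
- Differentiate $F_Y(y)$ with respect to $y$ to get $f_Y(y)$.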
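The steps above can be sketched on one concrete case. The choice below is an illustrative assumption rather than an example from the text: take $X \sim \text{Exp}(1)$ and $Y = 1 - e^{-X}$. Solving $1 - e^{-X} \le y$ gives $X \le -\ln(1 - y)$, so $F_Y(y) = F_X(-\ln(1 - y)) = y$ on $(0, 1)$, and differentiating gives $f_Y(y) = 1$. The code compares that derived CDF with a seeded simulation.

```python
import math
import random

def derived_cdf_of_y(y):
    """F_Y(y) obtained by the CDF method for Y = 1 - exp(-X), X ~ Exp(1)."""
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    x_bound = -math.log(1.0 - y)     # the event Y <= y is the event X <= x_bound
    return 1.0 - math.exp(-x_bound)  # F_X evaluated at x_bound, which simplifies to y

def empirical_cdf_of_y(y, n_samples=20000, seed=1):
    """Fraction of simulated Y values that fall at or below y."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    count = 0
    for _ in range(n_samples):
        x = rng.expovariate(1.0)
        if 1.0 - math.exp(-x) <= y:
            count += 1
    return count / n_samples

for y in (0.1, 0.3, 0.8):
    print(y, derived_cdf_of_y(y), empirical_cdf_of_y(y))
```

The derived and simulated values should agree closely at every $y$, consistent with $Y$ being uniform on $(0, 1)$.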
For any problem involving probabilities or percentiles with a Normal distribution $N(\mu, \sigma^2)$, always standardize the values to $Z$-scores using $Z = \dfrac{X - \mu}{\sigma}$. This allows the use of standard normal tables or values.
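When a table is not at hand, the standardization step can be carried out with the error function from Python's `math` module. This is a minimal sketch with illustrative function names; the values $\mu = 75$ and $\sigma = 5$ are hypothetical and chosen so that the interval endpoints standardize to $Z = \pm 1$.

```python
import math

def standard_normal_cdf(z):
    """Phi(z), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_interval_probability(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), computed by standardizing both endpoints."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return standard_normal_cdf(z_b) - standard_normal_cdf(z_a)

# Hypothetical parameters: the endpoints 70 and 80 map to Z = -1 and Z = +1.
print(normal_interval_probability(70, 80, mu=75, sigma=5))  # approximately 0.6827
```

Table-based answers can differ in the fourth decimal place because tabulated values of $\Phi$ are rounded.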
---
Common Mistakes
❌ Thinking $f(x)$ directly gives $P(X = x)$.
✅ For continuous random variables, $P(X = x) = 0$. The PDF $f(x)$ represents the probability density at $x$, not the probability. Probabilities are found by integrating the PDF over an interval.
❌ Applying the memoryless property to non-exponential distributions.
✅ Among continuous distributions, the memoryless property is unique to the exponential distribution (the geometric distribution is its discrete counterpart). Do not assume it for other continuous distributions.
❌ Attempting to calculate the mean or variance of a Cauchy distributed variable.
✅ The mean and variance of a Cauchy distribution are undefined. This is a key characteristic.
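The memoryless pitfall can be made concrete with survival functions. The sketch below compares $P(X > s + t \mid X > s)$ with $P(X > t)$ for an exponential lifetime with mean 200 hours (the setting of a chapter review question, with $s = 150$ and $t = 100$) and for a Weibull lifetime with shape 2 and scale 200, which is a hypothetical comparison case. The Weibull survival function $S(t) = e^{-(t/\lambda)^k}$ is assumed here.

```python
import math

def exponential_survival(t, rate):
    """P(X > t) for X ~ Exp(rate)."""
    return math.exp(-rate * t)

def weibull_survival(t, k, lam):
    """P(X > t) for a Weibull with shape k and scale lam."""
    return math.exp(-((t / lam) ** k))

def conditional_survival(survival, s, t):
    """P(X > s + t | X > s) computed from a survival function."""
    return survival(s + t) / survival(s)

s, t = 150.0, 100.0

exp_surv = lambda u: exponential_survival(u, 1 / 200)
print(conditional_survival(exp_surv, s, t))  # approximately 0.6065
print(exp_surv(t))                           # approximately 0.6065, the same value

wei_surv = lambda u: weibull_survival(u, 2, 200)
print(conditional_survival(wei_surv, s, t))  # approximately 0.3679
print(wei_surv(t))                           # approximately 0.7788, clearly different
```

Only the exponential case gives matching values, so past survival time cannot be ignored for the Weibull lifetime.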
---
Practice Questions
:::question type="MCQ" question="The amount of time (in hours) a student spends studying for an exam is a continuous random variable with PDF for and otherwise. What is the value of ?" options=["","","",""] answer="" hint="The total probability over the domain of the PDF must integrate to 1." solution="Step 1: Set the integral of the PDF over its domain equal to 1.
>
>
Step 2: Evaluate the integral.
>
>
>
>
>
Step 3: Solve for .
>
"
:::
:::question type="NAT" question="The lifetime of a product (in years) is exponentially distributed with a mean of 4 years. What is the probability that the product lasts exactly 5 years? Round your answer to two decimal places." answer="0.00" hint="For a continuous random variable, the probability of it taking an exact value is zero." solution="Step 1: Understand the nature of continuous random variables.
For any continuous random variable $X$ and any specific value $c$, the probability $P(X = c)$ is always 0. This is because the probability is defined as the area under the PDF curve, and the area over a single point is zero.
Step 2: Apply this principle to the question.
The question asks for the probability that the product lasts exactly 5 years. Since lifetime is a continuous variable, this probability is 0.
> $$P(X = 5) = 0$$
Answer: 0.00"
:::
:::question type="MSQ" question="Let be a random variable with PDF for and otherwise. Select ALL correct statements." options=[" for "," is finite."," is finite.",""] answer=" for ," hint="Calculate the CDF, expectation, and variance. Remember to integrate carefully over the domain." solution="Step 1: Calculate the CDF .
For :
>
>
So, for . This statement is correct.
Step 2: Calculate .
>
>
Since is infinite, it is not finite. This statement is incorrect.
Step 3: Calculate .
Since is infinite, must also be infinite (or undefined).
. If is infinite, will also be infinite.
>
So, is infinite. This statement is incorrect.
Step 4: Calculate .
Using the CDF:
>
Alternatively, by integrating the PDF:
>
>
This statement is correct.
Answer: for ,"
:::
:::question type="MCQ" question="If , what is the distribution of ?" options=["","","",""] answer="" hint="Recall the properties of linear transformations of normal random variables: and ." solution="Step 1: Identify the mean and variance of .
, so and .
Step 2: Calculate the mean of .
>
>
Step 3: Calculate the variance of .
>
>
Step 4: State the distribution of .
A linear transformation of a normal random variable is also a normal random variable.
So, .
Answer: "
:::
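:::question type="NAT" question="A random variable $X$ has a PDF $f(x) = \frac{1}{2} e^{-|x|}$ for $-\infty < x < \infty$. What is $E[|X|]$? Round your answer to two decimal places." answer="1.00" hint="Split the integral into two parts due to the absolute value in the PDF and the expectation formula." solution="Step 1: Write the formula for $E[|X|]$.
> $$E[|X|] = \int_{-\infty}^{\infty} |x|\, f(x)\,dx$$
> $$E[|X|] = \int_{-\infty}^{\infty} |x| \cdot \frac{1}{2} e^{-|x|}\,dx$$
Step 2: Split the integral due to the absolute value.
Since the integrand $|x| f(x)$ is symmetric around 0, we can simplify the integral.
> $$E[|X|] = 2 \int_{0}^{\infty} x \cdot \frac{1}{2} e^{-x}\,dx$$
> $$E[|X|] = \int_{0}^{\infty} x e^{-x}\,dx$$
Step 3: Evaluate the integral using integration by parts, or recognize it as the mean of an exponential distribution.
The integral $\int_{0}^{\infty} x e^{-x}\,dx$ is the mean of an exponential distribution with $\lambda = 1$. The mean of $\text{Exp}(\lambda)$ is $1/\lambda$.
Here, $\lambda = 1$, so the integral value is $1/1 = 1$.
Alternatively, using integration by parts:
Let $u = x$, $dv = e^{-x}\,dx$. Then $du = dx$, $v = -e^{-x}$.
> $$\int_{0}^{\infty} x e^{-x}\,dx = \left[-x e^{-x}\right]_{0}^{\infty} + \int_{0}^{\infty} e^{-x}\,dx$$
> $$= 0 + \left[-e^{-x}\right]_{0}^{\infty}$$
> $$= 0 + \left(0 - (-1)\right)$$
> $$= 1$$
Answer: 1.00"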
:::
---
Summary
| Formula/Concept | Expression |
|---|------------|
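| PDF properties | $f(x) \ge 0$, $\int_{-\infty}^{\infty} f(x)\,dx = 1$ |
| CDF from PDF | $F(x) = \int_{-\infty}^{x} f(t)\,dt$ |
| Expectation | $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$ |
| Variance | $\text{Var}(X) = E[X^2] - (E[X])^2$ |
| Uniform Mean | $E[X] = \dfrac{a + b}{2}$ |
| Uniform Variance | $\text{Var}(X) = \dfrac{(b - a)^2}{12}$ |
| Exponential Mean | $E[X] = \dfrac{1}{\lambda}$ |
| Exponential Memoryless | $P(X > s + t \mid X > s) = P(X > t)$ |
| Normal Standardization | $Z = \dfrac{X - \mu}{\sigma}$ |
| Gamma Mean | $E[X] = \dfrac{\alpha}{\beta}$ |
| Beta Mean | $E[X] = \dfrac{\alpha}{\alpha + \beta}$ |
| Lognormal Mean | $E[X] = e^{\mu + \sigma^2/2}$ |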
| Cauchy Distribution | Mean and Variance undefined |
---
What's Next?
This topic connects to:
- Joint Distributions: Understanding how multiple continuous random variables interact.
- Central Limit Theorem: The Normal distribution's role in the convergence of sums of random variables.
- Statistical Inference: Using continuous distributions to construct confidence intervals and perform hypothesis tests.
- Stochastic Processes: Applying exponential and gamma distributions in continuous-time models (e.g., queuing theory).
---
Chapter Summary
Differentiate between discrete and continuous random variables, understanding their respective probability mass functions (PMF) and probability density functions (PDF), and cumulative distribution functions (CDF).
Master the properties, parameters, and applications of common discrete distributions: Bernoulli, Binomial, Poisson, Geometric, and Hypergeometric.
Understand the characteristics, parameters, and use cases of fundamental continuous distributions: Uniform, Exponential, and Normal (Gaussian).
Calculate and interpret the expectation, variance, and higher moments for these elementary distributions.
Recognize the conditions under which specific distributions (e.g., Poisson approximation to Binomial, Normal approximation to Binomial) can be applied.
Grasp the significance of distribution parameters in determining shape, spread, and central tendency.
---
Chapter Review Questions
:::question type="MCQ" question="A call center receives calls at an average rate of 5 calls per hour. Assuming the number of calls follows a Poisson distribution, what is the probability of receiving exactly 3 calls in a 30-minute period?" options=["0.1404","0.2138","0.0821","0.2565"] answer="0.2138" hint="Adjust the Poisson rate parameter ($\lambda$) to match the specified time period before applying the PMF." solution="For a 30-minute period, the average rate of calls is $\lambda = 5 \times 0.5 = 2.5$.
The probability of receiving exactly $k$ calls in a Poisson distribution is given by $P(X = k) = \dfrac{e^{-\lambda} \lambda^{k}}{k!}$.
For $k = 3$ and $\lambda = 2.5$: $P(X = 3) = \dfrac{e^{-2.5} (2.5)^3}{3!} \approx \dfrac{0.0821 \times 15.625}{6} \approx 0.2138$."
:::
:::question type="NAT" question="The lifetime of a certain electronic component (in hours) is exponentially distributed with a mean of 200 hours. If a component has already functioned for 150 hours, what is the probability that it will last for at least another 100 hours? (Provide answer to 4 decimal places.)" answer="0.6065" hint="Recall the memoryless property of the exponential distribution." solution="For an exponentially distributed random variable $X$ with mean $\mu$, the rate parameter is $\lambda = 1/\mu$. Here, $\mu = 200$, so $\lambda = 1/200 = 0.005$.
The memoryless property states that $P(X > s + t \mid X > s) = P(X > t)$.
In this case, $s = 150$ hours (time already functioned) and $t = 100$ hours (additional time needed).
So, $P(X > 250 \mid X > 150) = P(X > 100)$.
The CDF of an exponential distribution is $F(x) = 1 - e^{-\lambda x}$, so $P(X > x) = e^{-\lambda x}$.
$P(X > 100) = e^{-0.005 \times 100} = e^{-0.5} \approx 0.60653$.
Rounding to 4 decimal places, the answer is 0.6065."
:::
:::question type="MCQ" question="Which of the following distributions describes the number of trials required to achieve the first success in a sequence of independent Bernoulli trials, where the probability of success remains constant for each trial?" options=["Binomial distribution","Poisson distribution","Geometric distribution","Hypergeometric distribution"] answer="Geometric distribution" hint="Consider the stopping condition for each distribution type." solution="The Geometric distribution models the number of independent Bernoulli trials needed to get the first success. The Binomial distribution models the number of successes in a fixed number of trials. The Poisson distribution models the number of events in a fixed interval of time or space. The Hypergeometric distribution models the number of successes in draws without replacement."
:::
:::question type="NAT" question="A discrete random variable has the following probability mass function: , , . What is the variance of ?" answer="0.49" hint="The variance can be calculated as ." solution="First, calculate the expected value :
.
Next, calculate :
.
Finally, calculate the variance:
."
:::
---
What's Next?
Building on the foundation of elementary distributions, subsequent chapters will delve into multivariate distributions, exploring the relationships between multiple random variables. Understanding these basic distributions is also crucial for comprehending advanced topics such as transformations of random variables and the fundamental limit theorems of probability theory.