Jul 24 2007

## Chapter 3 Continuous Random Variables


A continuous random variable is a random variable which can take values measured on a continuous scale e.g. weights, strengths, times or lengths.

Suppose X = time between messages arriving at a telecommunications system.

For any pre-determined time x, $P(X = x) = 0$, since, no matter how accurately we measure, X will never hit the value x exactly.

However, $P(X \le x)$ need not be zero, so we can use this to describe the probability distribution of X.

Distribution function: $F(x) = P(X \le x)$.

Note: since P(X = x) = 0, it follows that F(x) = P(X < x).

Probability density function (pdf): $f(x) = \dfrac{dF(x)}{dx}$

$F(x) = \int_{-\infty}^{x} f(t)\,dt$

For any constants a and b (a < b),

$P(a \le X \le b) = P(X \le b) - P(X \le a)$

$= F(b) - F(a)$

$= \int_a^b f(x)\,dx$

If we choose $a = -\infty$, $b = \infty$, we get

$\int_{-\infty}^{\infty} f(x)\,dx = P(-\infty < X < \infty) = 1$

Also, for a pdf, $f(x) \ge 0$ for all x.

Mean of X: $\mu = \int_{-\infty}^{\infty} x f(x)\,dx$

Variance of X: $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2$
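As a quick numerical check of these definitions, the sketch below integrates a simple density with the midpoint rule. The pdf $f(x) = 2x$ on (0, 1) is a hypothetical example chosen for illustration, not one from the notes.

```python
# A hypothetical pdf for illustration: f(x) = 2x on (0, 1), zero elsewhere.
def f(x):
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, n=100_000):
    # Midpoint rule: accurate enough here to check the pdf identities.
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 0.0, 1.0)                             # integral of f = 1
mu = integrate(lambda x: x * f(x), 0.0, 1.0)               # mean = 2/3
var = integrate(lambda x: (x - mu) ** 2 * f(x), 0.0, 1.0)  # variance = 1/18

print(total, mu, var)
```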

We are going to look at three continuous distributions in detail.

Uniform distribution

The continuous random variable X has the Uniform distribution between $\theta_1$ and $\theta_2$

($\theta_1 < \theta_2$) if: $f(x) = \dfrac{1}{\theta_2 - \theta_1}$ for $\theta_1 \le x \le \theta_2$, and $f(x) = 0$ otherwise.

$X \sim U(\theta_1, \theta_2)$, for short.

Roughly speaking, $X \sim U(\theta_1, \theta_2)$ if X can only take values between $\theta_1$ and $\theta_2$, and any value of X within this range is as likely as any other.

Result: for $U(\theta_1, \theta_2)$, $\mu = \dfrac{\theta_1 + \theta_2}{2}$ and $\sigma^2 = \dfrac{(\theta_2 - \theta_1)^2}{12}$.

Proof:

$\mu = \int_{\theta_1}^{\theta_2} \frac{x}{\theta_2 - \theta_1}\,dx = \left[\frac{x^2}{2(\theta_2 - \theta_1)}\right]_{\theta_1}^{\theta_2} = \frac{\theta_2^2 - \theta_1^2}{2(\theta_2 - \theta_1)} = \frac{\theta_1 + \theta_2}{2}$.

$\sigma^2 = \int_{\theta_1}^{\theta_2} \frac{x^2}{\theta_2 - \theta_1}\,dx - \mu^2 = \frac{\theta_2^3 - \theta_1^3}{3(\theta_2 - \theta_1)} - \left(\frac{\theta_1 + \theta_2}{2}\right)^2$

$= \frac{\theta_1^2 + \theta_1\theta_2 + \theta_2^2}{3} - \frac{\theta_1^2 + 2\theta_1\theta_2 + \theta_2^2}{4} = \frac{(\theta_2 - \theta_1)^2}{12}$.

Example 3.1

Solution

Rotation time = 0.02 seconds. The wait time can be anything between 0 and 0.02 sec, and each time in this range is as likely as any other. Therefore the distribution of the wait time is U(0, 0.02) (i.e. $\theta_1 = 0$ and $\theta_2 = 0.02$).

$\mu = \frac{0 + 0.02}{2} = 0.01$ second.

$\sigma^2 = \frac{(0.02 - 0)^2}{12} = 3.333 \times 10^{-5}$; $\sigma = 0.0058$ seconds.
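These values can be checked by simulation; a sketch using Python's standard library (the seed is an arbitrary choice):

```python
import random
import statistics

random.seed(1)
# Draw a large sample of wait times from U(0, 0.02).
waits = [random.uniform(0.0, 0.02) for _ in range(200_000)]

print(statistics.mean(waits))    # should be close to 0.01
print(statistics.pstdev(waits))  # should be close to 0.0058
```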

Occurrence of the Uniform distribution

1) Waiting times: as above.

2) Engineering tolerances: e.g. if a diameter is quoted "±0.1mm", it is sometimes assumed that the error has a U(-0.1, 0.1) distribution.

3) Simulation: programming languages often have a standard routine for simulating the U(0, 1) distribution. This can be used to simulate other probability distributions.
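Point (3) can be illustrated with the inversion method: if $U \sim U(0, 1)$, then $Y = -\ln(1 - U)/\lambda$ has the Exponential($\lambda$) distribution, because solving $u = F(y)$ for the Exponential CDF $F(y) = 1 - e^{-\lambda y}$ gives exactly this formula. A sketch ($\lambda = 2$ is an arbitrary illustrative choice):

```python
import math
import random
import statistics

random.seed(0)
lam = 2.0  # arbitrary rate for illustration

# Inversion method: transform U(0, 1) samples into Exponential(lam) samples.
ys = [-math.log(1.0 - random.random()) / lam for _ in range(100_000)]

print(statistics.mean(ys))  # should be close to 1/lam = 0.5
```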

Exponential distribution

The continuous random variable Y has the Exponential distribution, parameter $\lambda$, if:

$f(y) = \lambda e^{-\lambda y}$ for $y \ge 0$, and $f(y) = 0$ for $y < 0$.

Result: in a Poisson process with rate $\lambda$, the times between occurrences (and the time until the first occurrence) have the Exponential distribution, parameter $\lambda$.

Occurrence

1) Time until the failure of a part.

2) Times between messages arriving in a telecommunications network.

Result: for the Exponential distribution, parameter $\lambda$, mean $= 1/\lambda$, variance $= 1/\lambda^2$.

Proof

$\mu = \int_0^\infty y\,\lambda e^{-\lambda y}\,dy$

$= \left[-y e^{-\lambda y}\right]_0^\infty + \int_0^\infty e^{-\lambda y}\,dy = 0 + \frac{1}{\lambda} = \frac{1}{\lambda}$ (integrating by parts).

$\sigma^2 = \int_0^\infty y^2\,\lambda e^{-\lambda y}\,dy - \mu^2$

$= \left[-y^2 e^{-\lambda y}\right]_0^\infty + 2\int_0^\infty y e^{-\lambda y}\,dy - \frac{1}{\lambda^2}$

$= 0 + \frac{2}{\lambda}\cdot\frac{1}{\lambda} - \frac{1}{\lambda^2} = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$.

Example 3.2

(a) Probability still working after 5000 hours?

(b) Mean and standard deviation of the time till failure.

Solution

(a) Let Y = time till failure; $f(y) = \lambda e^{-\lambda y}$.

$0.1 = P(Y \le 1000) = \int_0^{1000} \lambda e^{-\lambda y}\,dy = \left[-e^{-\lambda y}\right]_0^{1000} = 1 - e^{-1000\lambda}$

$e^{-1000\lambda} = 0.9$

$-1000\lambda = \ln(0.9) = -0.10536$

$\lambda = 1.0536 \times 10^{-4}$

$P(Y > 5000) = \int_{5000}^\infty \lambda e^{-\lambda y}\,dy = e^{-5000\lambda}$

$= e^{-0.5268} \approx 0.59$.

(b) Mean $= 1/\lambda = 9491$ hours.

Standard deviation $= \sqrt{\text{Variance}} = \sqrt{1/\lambda^2} = 1/\lambda = 9491$ hours.
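The numbers in Example 3.2 can be reproduced directly; a sketch of the same calculation:

```python
import math

# lam is fixed by the condition P(Y <= 1000) = 0.1, i.e. 1 - exp(-1000*lam) = 0.1.
lam = -math.log(0.9) / 1000

p_survive = math.exp(-5000 * lam)  # P(Y > 5000); equal to 0.9**5
mean_life = 1.0 / lam              # mean time till failure, in hours

print(lam)        # about 1.0536e-4
print(p_survive)  # about 0.59
print(mean_life)  # about 9491
```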

Normal distribution

The continuous random variable X has the Normal distribution with mean $\mu$ and variance $\sigma^2$ if:

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right), \quad -\infty < x < \infty$

The pdf is symmetric about $\mu$. X lies between $\mu - 1.96\sigma$ and $\mu + 1.96\sigma$ with probability 0.95, i.e. X lies within 2 standard deviations of the mean approximately 95% of the time.

Occurrence of the Normal distribution

1) Many naturally occurring variables, e.g. measurement errors (bell-shaped histogram).

2) Sample means and totals – see below, Central Limit Theorem.

3) Approximation to several other distributions – see below.

There is no simple formula for $F(x)$, so tables must be used. The following result means that it is only necessary to have tables for one value of $\mu$ and $\sigma^2$.

If $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$.

Z is the standardised value of X; N(0, 1) is the standard Normal distribution. The Normal tables give values of $P(Z \le z)$, also called $\Phi(z)$, for z between 0 and 3.59.

Example 3.3

(a) $P(Z \le 1.0) = \Phi(1.0) = 0.8413$.

(b) $P(Z \le -1.0) = P(Z \ge 1.0)$ (by symmetry)

$= 1 - P(Z < 1.0)$

$= 1 - 0.8413$

$= 0.1587$

(c) $P(Z > -0.5) = 1 - P(Z < -0.5)$

$= 1 - P(Z > 0.5)$

$= 1 - (1 - P(Z \le 0.5))$

$= P(Z \le 0.5) = \Phi(0.5)$

$= 0.6915$.

(d) $P(0.5 < Z < 1.5) = P(Z < 1.5) - P(Z < 0.5)$

$= \Phi(1.5) - \Phi(0.5)$

$= 0.9332 - 0.6915$

$= 0.2417$.

(e) $0.8 = P(Z \le c) = \Phi(c)$

Using tables "in reverse", $c \approx 0.842$.

(f) There is no unique answer; suppose we want an interval which is symmetric about zero, i.e. between $-d$ and $d$.

Tail area = 0.025

$P(Z \le d) = \Phi(d) = 0.975$

Using the tables "in reverse", $d = 1.96$.

The range is $-1.96$ to $1.96$.
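Although $\Phi(z)$ has no simple closed form, it can be written via the error function as $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$, which reproduces the table lookups above. A sketch:

```python
import math

def phi(z):
    # Standard Normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(round(phi(1.0), 4))             # 0.8413, as in part (a)
print(round(phi(-1.0), 4))            # 0.1587, as in part (b)
print(round(phi(1.5) - phi(0.5), 4))  # 0.2417, as in part (d)
```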

Example 3.4 (i) and (ii)

Solution

(i) $X \sim N(15.0, 0.02^2)$

$P(X > 14.99) = P\left(Z > \frac{14.99 - 15.0}{0.02}\right)$

$= P(Z > -0.5) = 0.6915$ (from Example 3.3(c))

(ii) From Example 3.3(f), $Z = \frac{X - 15.0}{0.02}$ lies in $(-1.96, 1.96)$ with probability 0.95.

i.e. $P\left(-1.96 < \frac{X - 15.0}{0.02} < 1.96\right) = 0.95$

$P(15.0 - 0.02 \times 1.96 < X < 15.0 + 0.02 \times 1.96) = 0.95$

$P(14.961 < X < 15.039) = 0.95$

i.e. the required range is 14.96mm to 15.04mm.

Linear function of Normals

Result: If $X_1, X_2, \dots, X_n$ are independent, $X_i \sim N(\mu_i, \sigma_i^2)$ and $c_1, c_2, \dots, c_n$ are constants, then:

$\sum_{i=1}^n c_i X_i \sim N\left(\sum_{i=1}^n c_i \mu_i,\; \sum_{i=1}^n c_i^2 \sigma_i^2\right)$

Special cases: (i) $c_1 = c_2 = 1$: $X_1 + X_2 \sim N(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2)$

(ii) $c_1 = 1$, $c_2 = -1$: $X_1 - X_2 \sim N(\mu_1 - \mu_2,\; \sigma_1^2 + \sigma_2^2)$

If all the X's have the same distribution, i.e. $\mu_1 = \mu_2 = \dots = \mu_n = \mu$, say, and $\sigma_1^2 = \sigma_2^2 = \dots = \sigma_n^2 = \sigma^2$, say, then:

(iii) All $c_i = 1$: $X_1 + X_2 + \dots + X_n \sim N(n\mu, n\sigma^2)$

(iv) All $c_i = 1/n$: $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i \sim N(\mu, \sigma^2/n)$

Example 3.4(iii)

Solution

$X \sim N(15.0, 0.02^2)$, $Y \sim N(15.07, 0.022^2)$; X and Y independent (randomly chosen).

$V = Y - X \sim N(15.07 - 15.0,\; 0.022^2 + 0.02^2) = N(0.07, 0.000884)$

P(Pipe fits) = $P(X < Y) = P(Y - X > 0)$

$= P(V > 0)$

$= P\left(Z > \frac{0 - 0.07}{\sqrt{0.000884}}\right)$

$= P(Z > -2.354) = P(Z < 2.354)$

$= 0.9907$
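The same probability can be computed without tables, again writing $\Phi$ via the error function; a sketch of the calculation above:

```python
import math

def phi(z):
    # Standard Normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean_v = 15.07 - 15.0
var_v = 0.022**2 + 0.02**2             # 0.000884
z = (0.0 - mean_v) / math.sqrt(var_v)  # about -2.354

p_fit = 1.0 - phi(z)                   # P(V > 0)
print(round(p_fit, 4))                 # about 0.9907
```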

Normal approximations

Sample means and totals

Central Limit Theorem: If $X_1, X_2, \dots$ are independent random variables with the same distribution, which has mean $\mu$ and variance $\sigma^2$ (both finite), $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ and $Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}$, then for any z, $P(Z_n < z) \to \Phi(z)$ as $n \to \infty$.

Interpretation: $\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}$ is approximately N(0, 1) for large n

$\bar{X}_n$ is approximately $N(\mu, \sigma^2/n)$ for large n

$\sum_{i=1}^n X_i$ is approximately $N(n\mu, n\sigma^2)$ for large n

[See slide of distributions of sample means for various sample sizes.]

For the approximation to be good, n may need to be 30 or more for skewed distributions, but it can be quite small for symmetric distributions.

Example 3.1 (c)

Suppose that a particular job requires 2000 seeks. Within what range will the total wait time lie with probability 50%?

Solution

Let $X_i$ = wait time, in seconds, for the ith seek (i = 1, 2, ..., 2000).

Recall that we found that $\mu = 0.01$ and $\sigma^2 = 3.333 \times 10^{-5}$.

By the above result, the total wait time $T = \sum_{i=1}^{2000} X_i$ is approximately $N(2000 \times 0.01,\; 2000 \times 3.333 \times 10^{-5})$, i.e. N(20, 0.06667).

From Normal tables,

$P(-0.676 < Z < 0.676) = 0.5$

$P\left(-0.676 < \frac{T - 20}{\sqrt{0.06667}} < 0.676\right) = 0.5$

$P(19.83 < T < 20.17) = 0.5$

i.e. the total wait time lies between roughly 19.8 and 20.2 seconds with probability 0.5.
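A simulation sketch of this result: sum 2000 independent U(0, 0.02) wait times repeatedly and count how often the total lands in the central interval $20 \pm 0.676\sqrt{0.06667}$.

```python
import math
import random

random.seed(2)
sd_t = math.sqrt(0.06667)
lo, hi = 20.0 - 0.676 * sd_t, 20.0 + 0.676 * sd_t  # about (19.83, 20.17)

runs = 2_000
# Each run simulates the total wait time for one job of 2000 seeks.
hits = sum(
    lo < sum(random.uniform(0.0, 0.02) for _ in range(2000)) < hi
    for _ in range(runs)
)
frac = hits / runs
print(frac)  # should be close to 0.5
```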

Normal approximation to the Binomial

If $X \sim B(n, p)$ and n is large and p is not too near 0 or 1, then X is approximately $N(np, np(1-p))$. See OHP diagram.

Using a continuity correction:

$P(k_1 \le X \le k_2) = P(k_1 - 0.5 \le X \le k_2 + 0.5)$

$\approx P\left(Z \le \frac{k_2 + 0.5 - np}{\sqrt{np(1-p)}}\right) - P\left(Z < \frac{k_1 - 0.5 - np}{\sqrt{np(1-p)}}\right)$

Example 3.5

Solution

Let X = number of defective chips; $X \sim B(200, 0.1)$.

Using the above approximation, with $np = 20$ and $\sqrt{np(1-p)} = \sqrt{18} = 4.243$:

$P(20 \le X \le 30) \approx P\left(Z \le \frac{30.5 - 20}{4.243}\right) - P\left(Z < \frac{19.5 - 20}{4.243}\right)$

$= P(Z \le 2.475) - P(Z < -0.1179) = 0.9934 - 0.4530 \approx 0.54$.
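The approximation can be compared with the exact Binomial answer; a sketch where the exact tail is a plain sum over the B(200, 0.1) pmf:

```python
import math

n, p = 200, 0.1

def binom_pmf(k):
    # B(n, p) probability mass function.
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

exact = sum(binom_pmf(k) for k in range(20, 31))  # exact P(20 <= X <= 30)

def phi(z):
    # Standard Normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

sd = math.sqrt(n * p * (1 - p))  # sqrt(18)
approx = phi((30.5 - n * p) / sd) - phi((19.5 - n * p) / sd)

print(round(exact, 4), round(approx, 4))
```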

Normal approximation to the Poisson

If $Y \sim$ Poisson with parameter $\lambda$, and $\lambda$ is large (> 7, say), then Y has approximately a $N(\lambda, \lambda)$ distribution.

Using a continuity correction:

$P(k_1 \le Y \le k_2) \approx P\left(Z \le \frac{k_2 + 0.5 - \lambda}{\sqrt{\lambda}}\right) - P\left(Z < \frac{k_1 - 0.5 - \lambda}{\sqrt{\lambda}}\right)$

Example 3.6

Solution

Times at which filters are needed form a Poisson process, rate 0.2/day.

Therefore Y, the number of filters needed in 100 days, is Poisson, $\lambda = 100 \times 0.2 = 20$.

Suppose that n filters are ordered. Using the Normal approximation:

$0.005 \ge P(Y > n) = P(Y \ge n + 1) \approx P\left(Z \ge \frac{n + 0.5 - 20}{\sqrt{20}}\right)$

$= 1 - \Phi\left(\frac{n + 0.5 - 20}{\sqrt{20}}\right)$

$\Rightarrow \frac{n + 0.5 - 20}{\sqrt{20}} \ge 2.5758$ (Normal tables)

$\Rightarrow n \ge 19.5 + 2.5758\sqrt{20} \approx 31$.

Therefore, we need to order at least 31 filters.