Jul 24 2007

## Chapter 3 Continuous Random Variables


A continuous random variable is a random variable which can take values measured on a continuous scale e.g. weights, strengths, times or lengths.

Suppose *X* = time between messages arriving at a telecommunications system.

For any pre-determined time *x*, P(*X* = *x*) = 0, since if we measured *X* accurately enough, we are never going to hit the value *x* exactly.

However, P(*X* ≤ *x*) need not be zero, so we can use this to describe the probability distribution of *X*.

**Distribution function:** F(*x*) = P(*X* ≤ *x*).

Note: since P(*X* = *x*) = 0, it follows that F(*x*) = P(*X* < *x*).

**Probability density function (pdf):** f(*x*) = $\dfrac{dF(x)}{dx}$

F(*x*) = $\int_{-\infty}^{x} f(u)\,du$

For any constants *a* and *b* (*a* < *b*),

P(*a* ≤ *X* ≤ *b*) = P(*X* ≤ *b*) – P(*X* ≤ *a*)

= F(*b*) – F(*a*)

= $\int_{a}^{b} f(x)\,dx$

If we choose *a* = −∞, *b* = ∞, we get

$\int_{-\infty}^{\infty} f(x)\,dx$ = P(−∞ < *X* < ∞) = 1

Also, for a pdf, f(*x*) ≥ 0 for all *x*.

Mean of *X*: μ = $\int_{-\infty}^{\infty} x\,f(x)\,dx$

Variance of *X*: σ² = $\int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
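These definitions can be checked numerically. A minimal sketch, using the U(0, 1) density f(*x*) = 1 on (0, 1) as an assumed illustrative choice, with a midpoint Riemann sum standing in for the exact integral:

```python
def pdf(x):
    # Density of U(0, 1): f(x) = 1 on (0, 1), 0 elsewhere (illustrative choice).
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, n=100_000):
    # Midpoint Riemann sum approximating the integral of g over [a, b].
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(pdf, 0.0, 1.0)                             # total probability, should be 1
mu = integrate(lambda x: x * pdf(x), 0.0, 1.0)               # mean, should be 0.5
var = integrate(lambda x: (x - mu) ** 2 * pdf(x), 0.0, 1.0)  # variance, should be 1/12
print(total, mu, var)
```

The same three integrals work for any pdf; only `pdf` changes.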

We are going to look at three continuous distributions in detail.

**Uniform distribution**

The continuous random variable *X* has the Uniform distribution between θ₁ and θ₂ (θ₁ < θ₂) if:

$f(x) = \begin{cases} \dfrac{1}{\theta_2 - \theta_1}, & \theta_1 < x < \theta_2 \\ 0, & \text{otherwise} \end{cases}$

*X* ~ U(θ₁, θ₂), for short.

Roughly speaking, *X* ~ U(θ₁, θ₂) if *X* can only take values between θ₁ and θ₂, and any value of *X* in this range is as likely as any other.

**Result:** for U(θ₁, θ₂), μ = (θ₁ + θ₂)/2 and σ² = (θ₂ − θ₁)²/12.

**Proof:**

$\mu = \int_{\theta_1}^{\theta_2} \frac{x}{\theta_2 - \theta_1}\,dx = \left[\frac{x^2}{2(\theta_2 - \theta_1)}\right]_{\theta_1}^{\theta_2} = \frac{\theta_2^2 - \theta_1^2}{2(\theta_2 - \theta_1)} = \frac{\theta_1 + \theta_2}{2}.$

$E(X^2) = \int_{\theta_1}^{\theta_2} \frac{x^2}{\theta_2 - \theta_1}\,dx = \left[\frac{x^3}{3(\theta_2 - \theta_1)}\right]_{\theta_1}^{\theta_2} = \frac{\theta_1^2 + \theta_1\theta_2 + \theta_2^2}{3}.$

$\sigma^2 = E(X^2) - \mu^2 = \frac{\theta_1^2 + \theta_1\theta_2 + \theta_2^2}{3} - \frac{(\theta_1 + \theta_2)^2}{4}$

$= \frac{4(\theta_1^2 + \theta_1\theta_2 + \theta_2^2) - 3(\theta_1^2 + 2\theta_1\theta_2 + \theta_2^2)}{12} = \frac{\theta_1^2 - 2\theta_1\theta_2 + \theta_2^2}{12} = \frac{(\theta_2 - \theta_1)^2}{12}.$

**Example 3.1**

*Solution*

Rotation time = 0.02 seconds. The wait time can be anything between 0 and 0.02 sec, and each time in this range is as likely as any other. Therefore, the distribution of the wait time is U(0, 0.02) (i.e. θ₁ = 0 and θ₂ = 0.02).

μ = (0 + 0.02)/2 = 0.01 second.

σ² = (0.02 − 0)²/12 = 3.333 × 10⁻⁵; σ = 0.0058 seconds.
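The mean and variance formulas can also be checked by simulation; a small sketch using the Example 3.1 parameters (the seed is an arbitrary choice for reproducibility):

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
n = 200_000
# Simulate n wait times from U(0, 0.02), as in Example 3.1.
xs = [random.uniform(0.0, 0.02) for _ in range(n)]
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
print(mean, var)  # close to 0.01 and 3.333e-5
```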

**Occurrence of the Uniform distribution**

1) Waiting times: as above.

2) Engineering tolerances: e.g. if a diameter is quoted as “±0.1mm”, it is sometimes assumed that the error has a U(-0.1, 0.1) distribution.

3) Simulation: programming languages often have a standard routine for simulating the U(0, 1) distribution. This can be used to simulate other probability distributions.
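Point 3 can be illustrated with inverse-transform sampling: if *U* ~ U(0, 1), then −ln(1 − *U*)/λ has the Exponential(λ) distribution of the next section. A sketch (λ = 2 is an arbitrary illustrative value):

```python
import math
import random

random.seed(1)  # arbitrary fixed seed

def exponential_from_uniform(lam):
    # Inverse-transform method: solve F(y) = 1 - e^{-lam*y} = u for y.
    u = random.random()          # a U(0, 1) draw
    return -math.log(1.0 - u) / lam

lam = 2.0
n = 200_000
samples = [exponential_from_uniform(lam) for _ in range(n)]
mean = sum(samples) / n
print(mean)  # close to 1/lam = 0.5
```

Any distribution with an invertible cdf can be simulated the same way from U(0, 1).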

**Exponential distribution**

The continuous random variable *Y* has the Exponential distribution, parameter λ, if:

$f(y) = \begin{cases} \lambda e^{-\lambda y}, & y > 0 \\ 0, & \text{otherwise} \end{cases}$

**Result:** in a Poisson process with rate λ, the times between occurrences (and the time until the first occurrence) have the Exponential distribution with parameter λ.

**Occurrence**

1) Time until the failure of a part.

2) Times between messages arriving in a telecommunications network.

**Result:** for the Exponential distribution with parameter λ, mean = 1/λ, variance = 1/λ².

**Proof**

Integrating by parts,

$\mu = \int_0^{\infty} y\,\lambda e^{-\lambda y}\,dy = \left[-y e^{-\lambda y}\right]_0^{\infty} + \int_0^{\infty} e^{-\lambda y}\,dy = 0 + \left[-\frac{e^{-\lambda y}}{\lambda}\right]_0^{\infty} = \frac{1}{\lambda}.$

$\sigma^2 = E(Y^2) - \mu^2 = \int_0^{\infty} y^2\,\lambda e^{-\lambda y}\,dy - \frac{1}{\lambda^2}$

$= \left[-y^2 e^{-\lambda y}\right]_0^{\infty} + 2\int_0^{\infty} y e^{-\lambda y}\,dy - \frac{1}{\lambda^2}$

$= 0 + \frac{2}{\lambda}\int_0^{\infty} y\,\lambda e^{-\lambda y}\,dy - \frac{1}{\lambda^2} = \frac{2}{\lambda}\cdot\frac{1}{\lambda} - \frac{1}{\lambda^2} = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}.$

**Example 3.2**

The time till failure of a part has an Exponential distribution, and 10% of parts fail within 1000 hours.

(a) What is the probability that a part is still working after 5000 hours?

(b) Find the mean and standard deviation of the time till failure.

*Solution*

(a) Let *Y* = time till failure; f(*y*) = $\lambda e^{-\lambda y}$, *y* > 0.

0.1 = P(*Y* ≤ 1000) = $\int_0^{1000} \lambda e^{-\lambda y}\,dy = \left[-e^{-\lambda y}\right]_0^{1000} = 1 - e^{-1000\lambda}$

$e^{-1000\lambda}$ = 0.9

−1000λ = ln(0.9) = −0.10536

λ = 1.0536 × 10⁻⁴

P(*Y* > 5000) = $\int_{5000}^{\infty} \lambda e^{-\lambda y}\,dy = \left[-e^{-\lambda y}\right]_{5000}^{\infty} = e^{-5000\lambda}$

= $e^{-0.5268}$ ≈ 0.59.

(b) Mean = 1/λ = 9491 hours.

Standard deviation = √(Variance) = √(1/λ²) = 1/λ = 9491 hours.
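The arithmetic of Example 3.2 can be reproduced directly:

```python
import math

# From 1 - e^{-1000*lam} = 0.1, i.e. e^{-1000*lam} = 0.9:
lam = -math.log(0.9) / 1000.0
p_survive_5000 = math.exp(-5000.0 * lam)  # P(Y > 5000)
mean_life = 1.0 / lam                     # mean = 1/lambda
print(lam, p_survive_5000, mean_life)
```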

**Normal distribution**

The continuous random variable *X* has the Normal distribution with mean μ and variance σ² if:

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \quad -\infty < x < \infty$

The pdf is symmetric about μ. *X* lies between μ − 1.96σ and μ + 1.96σ with probability 0.95, i.e. *X* lies within 2 standard deviations of the mean approximately 95% of the time.

**Occurrence of the Normal distribution**

1) Quite a few variables, e.g. measurement errors. (Bell-shaped histogram).

2) Sample means and totals – see below, Central Limit Theorem.

3) Approximation to several other distributions – see below.

There is no simple formula for F(*x*), so tables must be used. The following result means that it is only necessary to have tables for one value of μ and σ².

If *X* ~ N(μ, σ²), then *Z* = $\dfrac{X - \mu}{\sigma}$ ~ N(0, 1)

*Z* is the *standardised* value of *X*; N(0, 1) is the *standard Normal distribution*. The Normal tables give values of P(*Z* ≤ *z*), also called Φ(*z*), for *z* between 0 and 3.59.

**Example 3.3**

(a) P(*Z* ≤ 1.0) = Φ(1.0) = 0.8413.

(b) P(*Z* ≤ −1.0) = P(*Z* ≥ 1.0) (by symmetry)

= 1 – P(*Z* < 1.0)

= 1 – 0.8413

= 0.1587

(c) P(*Z* > −0.5) = 1 – P(*Z* < −0.5)

= 1 – P(*Z* > 0.5)

= 1 – (1 – P(*Z* ≤ 0.5))

= P(*Z* ≤ 0.5) = Φ(0.5)

= 0.6915.

(d) P(0.5 < *Z* < 1.5) = P(*Z* < 1.5) – P(*Z* < 0.5)

= Φ(1.5) – Φ(0.5)

= 0.9332 – 0.6915

= 0.2417.

(e) 0.8 = P(*Z* ≤ *c*) = Φ(*c*)

Using tables “in reverse”, *c* ≈ 0.842.

(f) There is no unique answer; suppose we want an interval which is symmetric about zero, i.e. between –*d* and *d*.

Tail area = 0.025

P(*Z* ≤ *d*) = Φ(*d*) = 0.975

Using the tables “in reverse”, *d* = 1.96.

The range is −1.96 to 1.96.
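The table look-ups in Example 3.3 can be reproduced without tables, since Φ(*z*) = (1 + erf(*z*/√2))/2:

```python
import math

def phi(z):
    # Standard Normal cdf, expressed via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(round(phi(1.0), 4))   # 0.8413, as in (a)
print(round(phi(0.5), 4))   # 0.6915, as in (c)
print(round(phi(1.96), 4))  # 0.975, as in (f)
```

Working "in reverse" as in (e) and (f) would need the inverse of `phi`, e.g. by bisection.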

**Example 3.4 (i) and (ii)**

*Solution*

(i) *X* ~ N(15.0, 0.02²)

P(*X* > 14.99) = $P\!\left(Z > \frac{14.99 - 15.0}{0.02}\right)$

= P(*Z* > −0.5) = 0.6915 (from Example 3.3(c))

(ii) From Example 3.3(f), *Z* = $\dfrac{X - 15.0}{0.02}$ lies in (−1.96, 1.96) with probability 0.95,

i.e. $P\!\left(-1.96 < \dfrac{X - 15.0}{0.02} < 1.96\right) = 0.95.$

P(15.0 − 0.02 × 1.96 < *X* < 15.0 + 0.02 × 1.96) = 0.95

P(14.961 < *X* < 15.039) = 0.95

i.e. the required range is 14.96mm to 15.04mm.

**Linear function of Normals**

**Result:** If *X*₁, *X*₂, …, *X*ₙ are independent, *X*ᵢ ~ N(μᵢ, σᵢ²), and *c*₁, *c*₂, …, *c*ₙ are constants, then:

$\sum_{i=1}^{n} c_i X_i \sim N\!\left(\sum_{i=1}^{n} c_i \mu_i,\ \sum_{i=1}^{n} c_i^2 \sigma_i^2\right)$

Special cases: (i) *c*₁ = *c*₂ = 1: *X*₁ + *X*₂ ~ N(μ₁ + μ₂, σ₁² + σ₂²)

(ii) *c*₁ = 1, *c*₂ = −1: *X*₁ − *X*₂ ~ N(μ₁ − μ₂, σ₁² + σ₂²)

If all the *X*’s have the same distribution, i.e. μ₁ = μ₂ = … = μₙ = μ, say, and σ₁² = σ₂² = … = σₙ² = σ², say, then:

(iii) All *c*ᵢ = 1: *X*₁ + *X*₂ + … + *X*ₙ ~ N(nμ, nσ²)

(iv) All *c*ᵢ = 1/*n*: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \sim N(\mu,\ \sigma^2/n)$
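The variance-addition rule in case (ii) can be checked by simulation; a sketch with arbitrary illustrative parameters (μ₁ = 1, σ₁ = 2, μ₂ = 3, σ₂ = 1):

```python
import random

random.seed(2)  # arbitrary fixed seed
n = 200_000
# X1 ~ N(1, 2^2), X2 ~ N(3, 1^2); study D = X1 - X2.
d = [random.gauss(1.0, 2.0) - random.gauss(3.0, 1.0) for _ in range(n)]
mean = sum(d) / n                          # theory: 1 - 3 = -2
var = sum((x - mean) ** 2 for x in d) / n  # theory: 2^2 + 1^2 = 5
print(mean, var)
```

Note that the variances *add* even though the variables are subtracted.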

**Example 3.4(iii)**

*Solution*

*X* ~ N(15.0, 0.02²), *Y* ~ N(15.07, 0.022²); *X* and *Y* independent (randomly chosen).

*V* = *Y* − *X* ~ N(15.07 − 15.0, 0.022² + 0.02²) = N(0.07, 0.000884)

P(Pipe fits) = P(*X* < *Y*) = P(*Y* − *X* > 0)

= P(*V* > 0)

= $P\!\left(Z > \frac{0 - 0.07}{\sqrt{0.000884}}\right)$

= P(*Z* > −2.354) = P(*Z* < 2.354)

= 0.9907
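The same answer follows numerically from the erf form of Φ:

```python
import math

def phi(z):
    # Standard Normal cdf via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu_v = 15.07 - 15.0
var_v = 0.022 ** 2 + 0.02 ** 2                       # 0.000884
p_fit = 1.0 - phi((0.0 - mu_v) / math.sqrt(var_v))   # P(V > 0)
print(p_fit)  # about 0.9907
```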

**Normal approximations**

**Sample means and totals**

**Central Limit Theorem:** If *X*₁, *X*₂, … are independent random variables with the same distribution, which has mean μ and variance σ² (both finite), and if

$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{and} \quad Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}},$

then for any *z*, P(*Z*ₙ < *z*) → Φ(*z*) as *n* → ∞.

*Interpretation:* $Z_n = \dfrac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}$ is approximately N(0, 1) for large *n*

$\bar{X}_n$ is approximately N(μ, σ²/*n*) for large *n*

$\sum_{i=1}^{n} X_i$ is approximately N(*n*μ, *n*σ²) for large *n*

[See slide of distributions of sample means for various sample sizes.]

For the approximation to be good, *n* may need to be 30 or more for skewed distributions, but can be quite small for symmetric distributions.
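The theorem can be seen numerically: means of *n* = 50 draws from U(0, 1) (an arbitrary illustrative choice) have mean ≈ 0.5 and variance ≈ (1/12)/50, as the interpretation above predicts:

```python
import random

random.seed(3)  # arbitrary fixed seed
n, reps = 50, 20_000
# Generate many sample means, each based on n U(0, 1) observations.
means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]
grand_mean = sum(means) / reps
var_of_means = sum((m - grand_mean) ** 2 for m in means) / reps
print(grand_mean, var_of_means)  # close to 0.5 and (1/12)/50 ≈ 0.001667
```

A histogram of `means` would show the bell shape referred to in the slide.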

**Example 3.1 (c)**

Suppose that a particular job requires 2000 seeks. Within what range will the total wait time lie with probability 50%?

*Solution*

Let *X*ᵢ = wait time, in seconds, for the *i*th seek (*i* = 1, 2, …, 2000).

Recall that we found that μ = 0.01 and σ² = 3.333 × 10⁻⁵.

By the above result, the total wait time *T* = $\sum_{i=1}^{2000} X_i$ is approximately N(2000 × 0.01, 2000 × 3.333 × 10⁻⁵), i.e. N(20, 0.06667).

From Normal tables,

P(−0.676 < *Z* < 0.676) = 0.5

$P\!\left(-0.676 < \frac{T - 20}{\sqrt{0.06667}} < 0.676\right) = 0.5$

P(19.8 < *T* < 20.2) = 0.5
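The interval can be computed directly:

```python
import math

mu_t = 2000 * 0.01                     # 20
var_t = 2000 * 3.333e-5                # approx 0.06667
half_width = 0.676 * math.sqrt(var_t)  # z = 0.676 gives central probability 0.5
lo, hi = mu_t - half_width, mu_t + half_width
print(lo, hi)  # about 19.8 and 20.2
```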

**Normal approximation to the Binomial**

If *X* ~ B(*n*, *p*), *n* is large, and *p* is not too near 0 or 1, then *X* is approximately N(*np*, *np*(1 − *p*)). See OHP diagram.

With the continuity correction,

P(*k*₁ ≤ *X* ≤ *k*₂) = P(*k*₁ − 0.5 ≤ *X* ≤ *k*₂ + 0.5)

= P(*X* ≤ *k*₂ + 0.5) − P(*X* < *k*₁ − 0.5)

$\approx \Phi\!\left(\frac{k_2 + 0.5 - np}{\sqrt{np(1-p)}}\right) - \Phi\!\left(\frac{k_1 - 0.5 - np}{\sqrt{np(1-p)}}\right)$

**Example 3.5**

*Solution*

Let *X* = number of defective chips; *X *~ B(200, 0.1).

Using the above approximation,

$P(20 \le X \le 30) \approx \Phi\!\left(\frac{30.5 - 20}{\sqrt{18}}\right) - \Phi\!\left(\frac{19.5 - 20}{\sqrt{18}}\right)$

(here *np* = 20 and *np*(1 − *p*) = 18)

= P(*Z* ≤ 2.475) – P(*Z* < −0.1179) = 0.9934 – 0.4530 ≈ 0.54.
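The approximation can be compared with the exact Binomial sum:

```python
import math

def phi(z):
    # Standard Normal cdf via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 200, 0.1
mu, sd = n * p, math.sqrt(n * p * (1 - p))  # 20 and sqrt(18)
# Normal approximation with continuity correction:
approx = phi((30.5 - mu) / sd) - phi((19.5 - mu) / sd)
# Exact Binomial probability P(20 <= X <= 30):
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(20, 31))
print(approx, exact)
```

The two values should agree to about two decimal places at this sample size.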

**Normal approximation to the Poisson**

If *Y* ~ Poisson with parameter λ, and λ is large (> 7, say), then *Y* has approximately a N(λ, λ) distribution.

$P(k_1 \le Y \le k_2) \approx \Phi\!\left(\frac{k_2 + 0.5 - \lambda}{\sqrt{\lambda}}\right) - \Phi\!\left(\frac{k_1 - 0.5 - \lambda}{\sqrt{\lambda}}\right)$

**Example 3.6**

*Solution*

Times at which filters are needed form a Poisson process, rate 0.2 / day.

Therefore *Y*, the number of filters needed in 100 days, ~ Poisson, λ = 100 × 0.2 = 20.

Suppose that *n* filters are ordered. Using the Normal approximation:

0.005 ≥ P(*Y* > *n*) = P(*Y* ≥ *n* + 1) = P(*n* + 1 ≤ *Y* < ∞)

$\approx 1 - \Phi\!\left(\frac{n + 1 - 0.5 - 20}{\sqrt{20}}\right) = 1 - \Phi\!\left(\frac{n - 19.5}{\sqrt{20}}\right)$

$\frac{n - 19.5}{\sqrt{20}} \ge 2.5758$ (Normal tables)

*n* ≥ 19.5 + 2.5758√20 ≈ 31.

Therefore, we need to order at least 31 filters.
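As a check, the approximate tail probability at *n* = 31 sits right at the 0.005 threshold:

```python
import math

def phi(z):
    # Standard Normal cdf via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

lam = 20.0
n = 31
# Normal approximation with continuity correction: P(Y > n) = P(Y >= n + 1).
tail = 1.0 - phi((n + 1 - 0.5 - lam) / math.sqrt(lam))
print(tail)  # about 0.005
```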