CHAPTER 9
EXERCISE SOLUTIONS
9.1 a. Statistics. The given means are summaries of samples taken from a larger population.
b. The sample means alone do not provide enough information to say whether the mean salary for males in the company is higher than the mean salary for females. It might be that the observed difference between sample means is consistent with the type of difference that could occur for any two randomly selected samples of n = 100 taken from the same population. We might ask how likely it is that the difference in means would be as large as $1,500 (the observed difference) if there were actually no difference between males and females in the population. To answer this, we need to know the sampling distribution for possible differences between the means of two different samples taken from the same population. Later in the text, it will be seen that the relevant sampling distribution can be determined, as long as the sample standard deviations are also provided.
c. No, the sample means most likely would not be the same for two new samples. A sample mean is a random variable and varies from sample to sample.
9.2 a. Parameters. Here, the given means are population values.
b. Yes, because the population means are known exactly, the shareholders can be certain that the mean salary for men in the company is higher than the mean salary for women.
9.9 The standard error of $\hat{p}$ is calculated using an observed value of a sample proportion, and it (the standard error) estimates the "true" standard deviation of the sampling distribution of $\hat{p}$. The standard deviation is found using the known (or assumed to be known) value of the population proportion.
The formula for the standard error is $\text{s.e.}(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n}$.
The formula for the standard deviation is $\text{s.d.}(\hat{p}) = \sqrt{p(1-p)/n}$.
In practice, the standard error will be used more often because the value of the population proportion usually will not be known.
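The distinction in 9.9 can be made concrete with a short sketch. The function names and the values of n, p, and the sample proportion below are hypothetical and chosen only for illustration; they do not come from the exercise.

```python
import math

def standard_deviation_of_p_hat(p, n):
    """Exact s.d. of the sampling distribution; requires the population proportion p."""
    return math.sqrt(p * (1 - p) / n)

def standard_error_of_p_hat(p_hat, n):
    """Estimated s.d.; plugs the observed sample proportion in place of p."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical values for illustration only (not taken from the exercise).
n = 100
p = 0.20       # population proportion, assumed known
p_hat = 0.25   # proportion observed in one sample

sd = standard_deviation_of_p_hat(p, n)
se = standard_error_of_p_hat(p_hat, n)
print(f"s.d.(p-hat) = {sd:.4f}, s.e.(p-hat) = {se:.4f}")
```

The two results differ only because the standard error substitutes the sample proportion for the unknown population proportion.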
9.11 a. p = .20 (20% expressed as a proportion)
b.
c.
. Notice that the standard error calculation uses the sample proportion $\hat{p}$.
d. The mean of the sampling distribution of $\hat{p}$ equals the population proportion, which is .20.
e.
.
9.12 About 95% of the samples will have a sample proportion in the interval .20 ± (2)(.04), which is .12 to .28. The sampling distribution of $\hat{p}$ is approximately a normal distribution, so about 95% of sample proportions will fall in the interval $p \pm 2\,\text{s.d.}(\hat{p})$.
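As a quick check of the interval in 9.12, the sketch below computes .20 ± (2)(.04) and the normal-curve area inside it. It takes p = .20 and a standard deviation of .04 from the solution; the variable names are illustrative.

```python
from statistics import NormalDist

p = 0.20    # population proportion from the exercise
sd = 0.04   # standard deviation of p-hat used in the solution

lower = p - 2 * sd
upper = p + 2 * sd

# Area under the approximating normal curve between the two endpoints.
sampling_dist = NormalDist(mu=p, sigma=sd)
coverage = sampling_dist.cdf(upper) - sampling_dist.cdf(lower)

print(f"interval: {lower:.2f} to {upper:.2f}, coverage about {coverage:.3f}")
```

The exact normal area within two standard deviations is slightly above 95%, which is why the rule is stated as "about 95%."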
9.14 a. [Figure for Exercise 9.14a]
b. The sampling distribution of possible sample means for random samples of n = 40 is approximately a normal distribution. The mean is $\mu$ = 210 pounds. The standard deviation of the sampling distribution is $\sigma/\sqrt{n} = 25/\sqrt{40} \approx 3.95$ pounds.
c. The sampling distribution of the mean has a much smaller standard deviation than the distribution of individual weights. Remember that about 99.7% of the distribution is within three standard deviations of the mean. For individual weights (including luggage) about 99.7% of the distribution is in the interval 210 ± 75 pounds. For possible sample means, about 99.7% of the distribution is in the interval 210 ± 11.85 pounds.
[Figure for Exercise 9.14c]
d. If the total weight is 8800, the mean weight is 8800/40 = 220 pounds. The question asked is equivalent to asking what is the probability that the mean weight is greater than 220 pounds for 40 passengers. Because the question is about a sample mean, use the sampling distribution described in part (b) to find the answer for $P(\bar{x} > 220)$.
For $\bar{x}$ = 220 pounds, $z = (220 - 210)/3.95 = 2.53$.
Use Table A.1 to find that P(Z ≤ 2.53) = .9943.
$P(\bar{x} > 220)$ = P(Z > 2.53) = 1 - P(Z ≤ 2.53) = 1 - .9943 = .0057
Notice that this probability is 57 in 10,000, which is equivalent to 1 in 175 (divide 10,000 by 57). In the long run, about 1 of every 175 sold-out flights will exceed the total weight limit.
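The calculation in part (d) can be reproduced with a minimal sketch using Python's standard library. It assumes a population standard deviation of 25 pounds, inferred from the interval 210 ± 75 (three standard deviations) quoted in part (c); the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu = 210             # mean weight per passenger, pounds
sigma = 25           # assumed: inferred from 210 +/- 75 being three s.d.
n = 40               # passengers on a sold-out flight
total_limit = 8800   # total weight limit, pounds

sd_mean = sigma / math.sqrt(n)   # s.d. of the sample mean
threshold = total_limit / n      # mean weight that hits the limit
z = (threshold - mu) / sd_mean
prob_exceed = 1 - NormalDist(mu, sd_mean).cdf(threshold)

print(f"s.d. of mean = {sd_mean:.2f}, z = {z:.2f}, P(mean > {threshold:.0f}) = {prob_exceed:.4f}")
```

Working with unrounded values gives a probability that agrees with the table-based answer to four decimal places.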
9.18 a. In a normal curve distribution, about 95% of the values are within two standard deviations of the mean. With the mean and standard deviation given for this problem, the interval $\mu \pm 2\sigma$ is -20 to 180 miles. Assuming a normal curve, about 99.7% of the values should be in the interval $\mu \pm 3\sigma$, which is the interval -70 to 230 miles for this problem. But the two intervals just given don't make sense because it's not possible for the miles to be negative. A drawing of a normal curve with mean = 80 and standard deviation = 50 shows clearly that this normal curve gives relatively high probability to negative daily miles (an impossible event). Either there are outliers that inflate the standard deviation or the distribution actually is skewed.
[Figure for Exercise 9.18a]
b. The distribution of the mean miles per day over n = 365 days will be approximately a normal distribution with mean $\mu$ = 80 miles and standard deviation $\sigma/\sqrt{n} = 50/\sqrt{365} \approx 2.62$ miles. This distribution describes the distribution of mean daily miles in a year for many different cars.
c. The distribution of the total number of miles will also be approximately a normal curve.
The mean total in a year = (365 days)(mean miles per day) = (365)(80) = 29,200 miles, and the standard deviation of the total is 955.25 miles.
The standard deviation of the total can be found using methods described in Section 8.8 about sums, differences, and linear combinations. If we assume that daily miles for the 365 days are independent, then the variance of the total = sum of the variances for the 365 days. The variance is $50^2 = 2500$ (the squared standard deviation) each day, so the sum of variances for 365 days = $(365)(2500) = 912{,}500$, and the standard deviation = $\sqrt{912{,}500} \approx 955.25$ miles.
d. The sample size (n=365) is large so the rule should hold even though the distribution of daily miles is not normal. For the rule to hold, the observations must be independent (see p.268 in the text), and that might be debated here. If different renters drive the car each day, observations probably are independent. If each driver keeps the car for a few days, there may be some dependencies between the days.
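The arithmetic for the mean and the total in 9.18 can be checked with the sketch below. It assumes independent days, as the solution to part (c) does; the variable names are illustrative.

```python
import math

mu_day = 80      # mean miles per day
sigma_day = 50   # standard deviation of miles per day
n_days = 365

# Sample mean over the year: the s.d. shrinks by a factor of sqrt(n).
sd_mean = sigma_day / math.sqrt(n_days)

# Yearly total: means add, and (assuming independent days) variances add.
mean_total = n_days * mu_day
var_total = n_days * sigma_day ** 2
sd_total = math.sqrt(var_total)

print(f"s.d. of mean daily miles: {sd_mean:.2f}")
print(f"mean total: {mean_total}, variance of total: {var_total}, s.d. of total: {sd_total:.2f}")
```

Note that the standard deviation of the total equals n times the standard deviation of the mean, since the total is just n times the mean.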
9.26 The population mean $\mu$ and the population standard deviation $\sigma$ must also be known. The relevant formula is $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$.
9.27 a. For $\bar{x}$ = 97, $z = (97 - 100)/2.5 = -1.2$.
For $\bar{x}$ = 105, $z = (105 - 100)/2.5 = 2$.
b. [Figure for Exercise 9.27b]
c. [Figure for Exercise 9.27c]
d. In essence, the drawings for parts (b) and (c) are the same. In part (c) the horizontal axis shows standardized statistics that correspond to the values along the horizontal axis in part (b). The area (probability) between $\bar{x}$ = 97 and $\bar{x}$ = 105 equals the area (probability) between z = -1.2 and z = 2.
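The equality of the two areas in part (d) can be verified numerically. The sketch assumes a mean of 100 and a standard deviation of the sample mean of 2.5; these values are not stated in the surviving text and are inferred from the fact that 97 and 105 standardize to -1.2 and 2.

```python
from statistics import NormalDist

mu = 100        # assumed: inferred from the two z-scores
sd_mean = 2.5   # assumed: inferred from the two z-scores

# Area between the two sample means on the original scale.
x_dist = NormalDist(mu, sd_mean)
area_x = x_dist.cdf(105) - x_dist.cdf(97)

# Area between the corresponding z-scores on the standardized scale.
z_low = (97 - mu) / sd_mean
z_high = (105 - mu) / sd_mean
z_dist = NormalDist()   # standard normal: mean 0, s.d. 1
area_z = z_dist.cdf(z_high) - z_dist.cdf(z_low)

print(f"z-scores: {z_low:.1f} and {z_high:.1f}")
print(f"area on x-bar scale: {area_x:.4f}, area on z scale: {area_z:.4f}")
```

Standardizing only relabels the horizontal axis, so the two areas agree.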
9.33 a.
. Notice that the sample standard deviation is used, so it is appropriate to use the symbol t to denote the standardized score.
b.
. As in part (a), the symbol t is used because the sample standard deviation (rather than the population standard deviation) is used.
9.34 a. The drawing will be similar to one for a standard normal distribution. To find the probability below t = 2.5, use software (like Excel or Minitab) or a statistical calculator. The desired probability is .9902. For this problem, degrees of freedom are df = n - 1 = 24. In Excel, the command 1-TDIST(2.5,24,1) gives the necessary probability, because TDIST(2.5,24,1) gives the probability above 2.5 (see p. 277 of the text). In Minitab, use Calc>Probability Distributions>t to find the cumulative probability for t = 2.5.
[Figure for Exercise 9.34a]
b. The degrees of freedom are df = n - 1 = 99, and the probability below t = -5 is approximately .000001 (1 in a million). If Excel is used to find the probability, it's necessary to find the probability above t = +5 (which equals the probability below -5 due to symmetry). The command is TDIST(5,99,1).
[Figure for Exercise 9.34b]
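For readers without Excel or Minitab, the two t probabilities in 9.34 can be approximated with the sketch below. It is not the method used in the text: it integrates the t density numerically with Simpson's rule, using only the Python standard library. The function names t_pdf and t_cdf are hypothetical helpers written for this illustration.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), via Simpson's rule on [0, |t|] plus symmetry about 0.

    steps must be even for Simpson's rule.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area_0_to_a = total * h / 3   # area between 0 and |t|
    return 0.5 + area_0_to_a if t >= 0 else 0.5 - area_0_to_a

prob_a = t_cdf(2.5, 24)    # part (a): area below t = 2.5 with df = 24
prob_b = t_cdf(-5.0, 99)   # part (b): area below t = -5 with df = 99

print(f"P(T <= 2.5), df = 24: {prob_a:.4f}")
print(f"P(T <= -5),  df = 99: {prob_b:.7f}")
```

Part (b) relies on the same symmetry argument the solution describes: the area below -5 equals the area above +5.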
9.41 The distribution of sample means is approximately normal with mean $\mu$ = 25 mpg and standard deviation $\sigma/\sqrt{n} = 1/\sqrt{9} \approx 0.333$ mpg.
About 68% of possible sample means will be in the range 25 ± 0.333.
About 95% of possible sample means will be in the range 25 ± (2)(0.333).
About 99.7% of possible sample means will be in the range 25 ± (3)(0.333).
[Figure for Exercise 9.41]
9.42 The mean still is 25 mpg, but the standard deviation is now $\sigma/\sqrt{n} = 1/\sqrt{100} = 0.1$ mpg.
The normal curve will be much more tightly bunched around 25. The main idea is that a sample mean based on n=100 is likely to be closer to the true mean than a sample mean based on n=9.
[Figure for Exercise 9.42]
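The comparison between 9.41 and 9.42 can be sketched as follows. The code assumes a population standard deviation of 1 mpg, which is inferred from the value 0.333 for n = 9 rather than stated in the surviving text; the helper name empirical_rule_intervals is hypothetical.

```python
import math

def empirical_rule_intervals(mu, sd):
    """Return (label, lower, upper) for 1, 2, and 3 standard deviations."""
    labels = {1: "68%", 2: "95%", 3: "99.7%"}
    return [(labels[k], mu - k * sd, mu + k * sd) for k in (1, 2, 3)]

mu = 25.0     # population mean, mpg
sigma = 1.0   # assumed: inferred from s.d. of the mean being 0.333 when n = 9

results = {}
for n in (9, 100):
    sd_mean = sigma / math.sqrt(n)
    results[n] = empirical_rule_intervals(mu, sd_mean)
    print(f"n = {n}: s.d. of sample mean = {sd_mean:.3f} mpg")
    for label, lower, upper in results[n]:
        print(f"  about {label} of sample means in {lower:.3f} to {upper:.3f}")
```

With n = 100, even the three-standard-deviation interval is narrower than the one-standard-deviation interval for n = 9.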
9.47 a. The distribution of possible sample proportions who watch the program is approximately a normal curve with a mean of .20 and a standard deviation of $\sqrt{(.20)(.80)/2500} = .008$.
b. If 20% of all viewers in the population watch this show, it is unlikely that a random sample of 2500 households would produce a sample percentage of less than 17%. We expect virtually all samples to have between 17.6% and 22.4% of viewers. Also, the z-score for 0.17 (which is 17%) is $z = (.17 - .20)/.008 = -3.75$. The "In the Extreme" part of Table A.1 indicates that the probability of a proportion this low or lower is only about .0001 (1 in 10,000).
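The numbers in 9.47 can be reproduced with the sketch below, which uses the normal approximation described in part (a). The variable names are illustrative.

```python
import math
from statistics import NormalDist

p = 0.20          # population proportion watching the show
n = 2500          # households in the random sample
observed = 0.17   # sample proportion in question

sd_p_hat = math.sqrt(p * (1 - p) / n)

# "Virtually all" samples fall within three standard deviations of p.
lower = p - 3 * sd_p_hat
upper = p + 3 * sd_p_hat

z = (observed - p) / sd_p_hat
prob_this_low = NormalDist().cdf(z)   # P(Z <= z) under the standard normal

print(f"s.d. = {sd_p_hat:.3f}, range = {lower:.3f} to {upper:.3f}")
print(f"z = {z:.2f}, P(p-hat <= {observed}) = {prob_this_low:.5f}")
```

The computed probability is slightly below .0001, consistent with the rounded table value.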
9.50 a. No, the conditions are not met. The sample size is not large enough. If p = .10 and n = 30, then np = 3, which is less than 5.
b. Yes, assuming that the proportion of interest is a long-run relative frequency over all weekdays and seasons, and the days on which they do the survey are representative of all days and seasons. The fixed probability is that someone will be home during those hours at a randomly selected residence on a randomly selected day. The sample size is large enough.
c. No, the conditions are not met. A random sample of days of the year was not taken, since the weather was recorded only for days in January and February. Clearly, snow or rain (depending on the area) will occur more frequently in those months than for the year as a whole.
d. Yes, the conditions are met. The population consists of all employees of the company, and a fixed proportion p of those employees is currently interested in on-site day care. A random sample was taken, and the sample size of 100 is large enough unless p is very close to 0 or 1, which is not likely.