3 Sampling distributions and confidence intervals
Normal distribution: one of the most important distributions in Statistics. It is also known as the Gaussian distribution and has the following properties:
- it is bell shaped
- it is symmetrical
- any point on the horizontal axis can be expressed in terms of the number of standard deviations from the mean; for example, 95% of the values lie within 1.96 standard deviations either side of the mean. This interval $$(\mbox{mean}\pm 1.96 \times \mbox{standard deviation})$$ is often called the reference range or normal range.
Figure 3: The Normal distribution
The Normal distribution with mean of 0 and standard deviation of 1 is called the standard Normal distribution.
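The 95% figure quoted above can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import NormalDist

# Standard Normal distribution: mean 0, standard deviation 1
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability of a value lying within 1.96 standard deviations of the mean
coverage = std_normal.cdf(1.96) - std_normal.cdf(-1.96)

# The z-value that leaves 2.5% in the upper tail (and so 95% in the middle)
z_975 = std_normal.inv_cdf(0.975)

print(round(coverage, 4), round(z_975, 2))  # prints: 0.95 1.96
```

Because any Normal distribution can be rescaled to the standard Normal, the same 1.96 multiplier applies whatever the mean and standard deviation are.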
Sampling distributions
The observed value of a statistic will in general vary from sample to sample. For example, the proportion of individuals classified as obese will vary from one random sample to another. If we take many samples from the population of interest, the values of all these individual sample statistics form a sampling distribution. For many statistics, such as means and proportions, the sampling distribution will be approximately Normal provided the sample size is large enough. We can use this property of the sampling distribution to help us draw conclusions about the population of interest.
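To make the idea of a sampling distribution concrete, the sketch below simulates repeated random samples and records the sample proportion each time. The population proportion (0.25), the sample size (200) and the number of samples (2000) are made-up values chosen purely for illustration; they do not come from any real survey.

```python
import random
import statistics

random.seed(1)  # fixed seed so the simulation is repeatable

TRUE_PROPORTION = 0.25  # hypothetical population proportion classified as obese
SAMPLE_SIZE = 200       # individuals in each random sample (assumed)
NUM_SAMPLES = 2000      # number of repeated samples (assumed)

def sample_proportion(p, n):
    """Draw one random sample of size n and return the observed proportion."""
    return sum(random.random() < p for _ in range(n)) / n

# One sample proportion per simulated sample: together they form
# an empirical sampling distribution.
proportions = [sample_proportion(TRUE_PROPORTION, SAMPLE_SIZE)
               for _ in range(NUM_SAMPLES)]

centre = statistics.mean(proportions)
spread = statistics.stdev(proportions)

# For a proportion, theory gives a spread of sqrt(p(1-p)/n)
theoretical_spread = (TRUE_PROPORTION * (1 - TRUE_PROPORTION) / SAMPLE_SIZE) ** 0.5

# Share of simulated proportions within 1.96 spreads of the centre
within = sum(abs(p - centre) <= 1.96 * spread for p in proportions) / NUM_SAMPLES

print(f"centre={centre:.3f} spread={spread:.4f} "
      f"theory={theoretical_spread:.4f} within={within:.3f}")
```

If the sampling distribution is close to Normal, the centre should sit near the true proportion, the spread near the theoretical value, and roughly 95% of the simulated proportions should fall within 1.96 spreads of the centre.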
Standard Error
The standard error is the standard deviation of the sampling distribution of a sample statistic. It gives an estimate of the precision of the statistic and is a measure of the uncertainty associated with that statistic. For example, given a single sample, the standard error of the sample mean is estimated as $$\frac{s}{\sqrt{n}}$$, where $$s$$ is the standard deviation of the sample data and $$n$$ is the number of observations in the sample.
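As a small illustration of the formula $$\frac{s}{\sqrt{n}}$$, the sketch below estimates the standard error of the mean from a single sample. The function name `standard_error_of_mean` and the height values are hypothetical and used only for demonstration.

```python
import math
import statistics

def standard_error_of_mean(data):
    """Estimate the standard error of the sample mean as s / sqrt(n)."""
    n = len(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)

# Made-up heights in cm, for illustration only
heights_cm = [172.1, 176.4, 169.8, 181.0, 174.3, 177.9, 170.5, 175.2]

se = standard_error_of_mean(heights_cm)
print(f"mean = {statistics.mean(heights_cm):.1f} cm, standard error = {se:.2f} cm")
```

Because the standard error is divided by $$\sqrt{n}$$, quadrupling the sample size halves it, all else being equal.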
Confidence intervals for population parameters
For large samples, the sampling distribution of a statistic is approximately Normal. This fact can be used to construct a range of values that indicates how precisely the corresponding population parameter has been estimated. This range is known as a confidence interval (CI). A 95% confidence interval takes the form:
$$\mbox{sample statistic} \pm 1.96 \times \mbox{standard error}$$
Example of mean, standard error and confidence interval
The mean height of a randomly selected group of 100 men was 174.8cm with a standard deviation of 5.07cm. Thus the 95% confidence interval for the population mean is given by
$$174.8 \pm 1.96 \times \frac{5.07}{\sqrt{100}} \mbox{ cm} \approx 174.8 \pm 1.0 \mbox{ cm}$$
that is, from 173.8cm to 175.8cm. This is usually taken to mean that we are 95% confident that, if we use 174.8cm as the mean height of men, the worst mistake we are likely to make is about 1cm.
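The arithmetic in this example can be reproduced with a few lines of code. This is a minimal sketch; the helper name `confidence_interval_95` is an illustrative choice, and it assumes the Normal approximation (multiplier 1.96) is appropriate.

```python
import math

def confidence_interval_95(estimate, standard_error, z=1.96):
    """Return (lower, upper) limits of estimate +/- z * standard_error."""
    margin = z * standard_error
    return estimate - margin, estimate + margin

mean_height = 174.8  # sample mean in cm (from the example)
sd_height = 5.07     # sample standard deviation in cm (from the example)
n = 100              # number of men in the sample

se = sd_height / math.sqrt(n)  # standard error of the mean: 0.507 cm
lower, upper = confidence_interval_95(mean_height, se)

print(f"{lower:.1f} to {upper:.1f} cm")  # prints: 173.8 to 175.8 cm
```

The margin is $$1.96 \times 0.507 \approx 0.99$$cm, which rounds to the 1.0cm quoted in the example.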