Guide to Statistics: "Supporting Statistics in Medicine"

4 Hypothesis tests

Statistics enables us to answer questions of interest, for example whether a new treatment is better than the current treatment. We start with a null hypothesis and then examine the evidence to see whether it can be sustained. A hypothesis test involves testing a claim, the null hypothesis $$H_{0}$$, against an alternative, $$H_{1}$$. The decision to reject or not reject $$H_{0}$$ uses sample evidence to calculate a test statistic, from which a p-value is obtained. The p-value is a useful reformulation of the test statistic: it is the probability of obtaining the observed results, or results more extreme, if the null hypothesis is true. $$H_{0}$$ is maintained unless the sample evidence makes it untenable, i.e. unless the p-value is less than or equal to $$(\leq)$$ some pre-specified critical value. On the basis of the sample evidence the null hypothesis is therefore either rejected or not rejected. Table 3 shows the different situations that can arise.
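As an illustration of how a test statistic is turned into a p-value and a decision, the following minimal sketch runs a two-sided one-sample z-test. It assumes the population standard deviation is known, and the function name, the blood pressure figures and the 0.05 critical value are hypothetical choices made for this example only.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, null_mean, sigma, n, alpha=0.05):
    """Return (z, p_value, reject) for a two-sided one-sample z-test.

    Assumes the population standard deviation sigma is known.
    """
    # Test statistic: distance of the sample mean from the null value,
    # measured in standard errors.
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # p-value: probability under H0 of a statistic at least this extreme
    # in either direction.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Reject H0 when the p-value is at or below the pre-specified value.
    return z, p_value, p_value <= alpha


# Hypothetical data: 36 patients with mean systolic blood pressure 124 mmHg,
# H0: population mean is 120 mmHg, known sigma of 12 mmHg.
z, p, reject = two_sided_z_test(124, 120, 12, 36)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
# z = 2.00, p = 0.0455, reject H0: True
```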
Rejecting $$H_{0}$$ when it is in fact true is a Type I error. The probability of making a Type I error is called the significance level, $$\alpha$$. Not rejecting $$H_{0}$$ when it is in fact false is a Type II error, which has probability $$\beta$$. The acceptable probabilities of committing a Type I error and a Type II error are specified before an analysis is conducted, and the acceptable Type I error rate provides the critical value mentioned above. The power of a hypothesis test is the probability of rejecting the null hypothesis when it is actually false (power $$= 1-\beta$$).
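The meaning of $$\alpha$$ and of power can be checked by simulation. The sketch below repeatedly draws samples and applies a two-sided z-test: when $$H_{0}$$ is true the proportion of rejections estimates the Type I error rate, and when $$H_{0}$$ is false it estimates the power. The sample size, effect size, number of trials and seed are arbitrary assumptions for illustration, and a known standard deviation is assumed.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_mean, null_mean=0.0, sigma=1.0, n=25,
                   alpha=0.05, n_trials=4000, seed=1):
    """Estimate how often a two-sided z-test rejects H0: mean == null_mean."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(n_trials):
        # Draw one sample of size n from the true population.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - null_mean) / (sigma / sqrt(n))
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value <= alpha:
            rejections += 1
    return rejections / n_trials


# H0 true: the rejection rate estimates the Type I error rate (close to alpha).
type_one_rate = rejection_rate(true_mean=0.0)
# H0 false (true mean 0.5): the rejection rate estimates the power, 1 - beta.
power = rejection_rate(true_mean=0.5)
print(f"Type I error rate ~ {type_one_rate:.3f}, power ~ {power:.3f}")
```

With these settings the first estimate should lie near 0.05 and the second near 0.7; the exact figures depend on the seed.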


Table 3: Testing hypotheses

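| Decision | $$H_{0}$$ true | $$H_{0}$$ false |
|---|---|---|
| Reject $$H_{0}$$ | Type I error (probability $$\alpha$$) | Correct decision (probability $$1-\beta$$, the power) |
| Do not reject $$H_{0}$$ | Correct decision (probability $$1-\alpha$$) | Type II error (probability $$\beta$$) |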

One-sided vs two-sided testing
A two-sided test is one in which the alternative hypothesis does not state a particular direction for the effect or difference. Conversely, a one-sided test is one in which the alternative hypothesis states that the effect or difference is in a particular direction (e.g. greater than zero). A one-sided test is appropriate only when an effect in the other direction is theoretically implausible or when interest lies in one direction only. For example, suppose that a new technology or treatment has been developed that is much cheaper than the existing treatment, so that interest lies only in showing that it is no worse. Provided the new treatment is at least as good as the old treatment it will be used, because it is much cheaper. This is one of the few occasions when a one-sided test is justifiable in medicine. If a one-sided test is to be used, this should be stated at the design stage.
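To see why the direction must be fixed in advance, the following sketch computes two-sided and one-sided p-values from the same z statistic. The value 1.8 is a hypothetical observed statistic and the function name is an assumption for this example.

```python
from statistics import NormalDist


def z_test_p_values(z):
    """Return two-sided and one-sided p-values for a z statistic."""
    std_normal = NormalDist()
    return {
        "two_sided": 2 * (1 - std_normal.cdf(abs(z))),  # H1: effect != 0
        "greater": 1 - std_normal.cdf(z),               # H1: effect > 0
        "less": std_normal.cdf(z),                      # H1: effect < 0
    }


p_values = z_test_p_values(1.8)  # hypothetical observed statistic
for alternative, value in p_values.items():
    print(f"{alternative:>9}: p = {value:.4f}")
```

For a positive statistic, the one-sided p-value in the "greater" direction is half the two-sided one, so here it falls below 0.05 while the two-sided p-value does not. Choosing the direction after seeing the data would therefore make it easier to reject $$H_{0}$$ than the stated significance level implies.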

Simple statistical tests
When comparing two groups it is important to distinguish between independent groups and paired groups. Two groups are considered to be independent when subjects are either randomly sampled from two distinct populations or randomly assigned to one of two groups. Two groups are considered to be paired when they consist of observations made on the same individual or on individuals who are explicitly paired.
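The distinction matters because the two designs lead to different test statistics. As a rough sketch, the code below computes a pooled-variance independent samples t statistic and a paired t statistic on the same numbers. The measurements are invented for illustration, the function names are assumptions, and the independent version assumes equal variances in the two groups.

```python
from math import sqrt
from statistics import fmean, stdev, variance


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom, independent groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (fmean(a) - fmean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


def paired_t(before, after):
    """t statistic and degrees of freedom based on within-pair differences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [y - x for x, y in zip(before, after)]
    t = fmean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical measurements on the same six patients before and after treatment.
before = [140, 152, 138, 147, 160, 155]
after = [135, 147, 136, 140, 154, 151]

t_paired, df_paired = paired_t(before, after)
t_indep, df_indep = independent_t(after, before)
print(f"paired:      t = {t_paired:.2f} on {df_paired} df")
print(f"independent: t = {t_indep:.2f} on {df_indep} df")
```

Because each patient acts as their own control, the paired statistic removes between-patient variation and is much larger in magnitude here. Treating paired data as independent would discard that information.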


Table 4: Simple statistical methods for comparing two groups

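| Comparison | Data type | Assumptions | Method |
|---|---|---|---|
| Difference between two independent groups | Numerical: measurable | Normally distributed | Independent samples t-test |
| Difference between two independent groups | Numerical: measurable | Not Normally distributed | Mann-Whitney U test |
| Difference between two independent groups | Numerical: count | | Mann-Whitney U test |
| Difference between two independent groups | Categorical: binary | Large sample, most expected frequencies > 5 | Chi-squared test |
| Difference between two independent groups | Categorical: binary | Small sample, at least 1 expected frequency < 5 | Fisher’s exact test |
| Difference between two independent groups | Categorical: nominal | More than two categories, most expected frequencies > 5 | Chi-squared test |
| Difference between two independent groups | Categorical: ordinal | | Mann-Whitney U test |
| Difference between paired groups | Numerical: measurable | Differences Normally distributed | Paired t-test |
| Difference between paired groups | Numerical: measurable | Differences not Normally distributed | Wilcoxon matched pairs test |
| Difference between paired groups | Numerical: count | | Wilcoxon matched pairs test |
| Difference between paired groups | Categorical: binary | | McNemar’s test |
| Difference between paired groups | Categorical: nominal | More than two categories | No simple test available, consult a statistician |
| Difference between paired groups | Categorical: ordinal | | Wilcoxon matched pairs test or sign test |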
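The choices in Table 4 can be written as a simple lookup. The function below is only a sketch of that decision process: its name, argument names and string labels are assumptions made for this example, and it does not check the underlying assumptions for you.

```python
def choose_test(design, data_type, normal=None, small_expected=None):
    """Look up the Table 4 method for comparing two groups.

    design: "independent" or "paired"
    data_type: "measurable", "count", "binary", "nominal" or "ordinal"
    normal: for measurable data, True if the data (or, for paired groups,
        the differences) are Normally distributed
    small_expected: for independent binary data, True if at least one
        expected frequency is below 5
    """
    if design not in ("independent", "paired"):
        raise ValueError(f"unknown design: {design!r}")

    if data_type == "measurable":
        if normal is None:
            raise ValueError("normal must be True or False for measurable data")
        if design == "independent":
            return "Independent samples t-test" if normal else "Mann-Whitney U test"
        return "Paired t-test" if normal else "Wilcoxon matched pairs test"

    if design == "independent":
        if data_type in ("count", "ordinal"):
            return "Mann-Whitney U test"
        if data_type == "binary":
            if small_expected is None:
                raise ValueError("small_expected must be True or False for binary data")
            return "Fisher's exact test" if small_expected else "Chi-squared test"
        if data_type == "nominal":
            # Table 4 lists this for most expected frequencies > 5.
            return "Chi-squared test"
    else:
        paired_methods = {
            "count": "Wilcoxon matched pairs test",
            "binary": "McNemar's test",
            "nominal": "No simple test available, consult a statistician",
            "ordinal": "Wilcoxon matched pairs test or sign test",
        }
        if data_type in paired_methods:
            return paired_methods[data_type]

    raise ValueError(f"unknown data type: {data_type!r}")


print(choose_test("independent", "measurable", normal=False))
print(choose_test("paired", "binary"))
```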
