Chapter 16 More tests for relationships (Diagnostics)

We can use hypothesis tests to test the assumptions of other hypothesis tests!


16.1 Test normality for 1+ populations: histogram, QQ-plot, Shapiro-Wilk Test

Here we illustrate using the car weights (mtcars$wt).

  1. Check the histogram.
hist(mtcars$wt, main = "Histogram of Car Weights", freq = FALSE)
  • Note: The histogram is roughly normal (roughly balanced around one mode towards the centre).
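To make the comparison with a normal shape easier, a normal density curve can be overlaid on the histogram. The following is a minimal sketch, assuming the sample mean and standard deviation are used as the parameters of the curve; the names wt and h and the colour are illustrative choices.

```r
# Car weights from the built-in mtcars data
wt <- mtcars$wt

# freq = FALSE puts the histogram on the density scale (total bar area = 1),
# so a density curve can be drawn on the same axes
h <- hist(wt, main = "Histogram of Car Weights", freq = FALSE)

# Overlay the normal density with the sample mean and standard deviation
curve(dnorm(x, mean = mean(wt), sd = sd(wt)), add = TRUE, col = "blue")
```

If the bars follow the curve reasonably well, the histogram supports the normality assumption.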


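  2. Check the QQ-plot.
  • A QQ-plot (quantile-quantile plot) is another graphical summary for quantitative variables: it plots the sorted data against the quantiles expected from a normal distribution. Further information can be found in the R help page for qqnorm.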
help(qqnorm)
qqnorm(mtcars$wt, main = "QQ-plot for Car weights")
qqline(mtcars$wt, col = "blue")
  • Note: The points are reasonably evenly scattered around the theoretical quantile-quantile line, which suggests that the car weights are approximately normally distributed.
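To see what the QQ-plot is actually plotting, the coordinates can be built by hand. This is a sketch rather than the definitive method: it assumes the theoretical quantiles are taken at the probabilities returned by ppoints, and the variable names are illustrative.

```r
wt <- mtcars$wt
n <- length(wt)

# Theoretical quantiles: standard normal quantiles at evenly spread probabilities
theoretical <- qnorm(ppoints(n))

# Sample quantiles: the data sorted from smallest to largest
observed <- sort(wt)

# Plotting one against the other reproduces the QQ-plot
plot(theoretical, observed,
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles",
     main = "QQ-plot for Car weights (by hand)")
qqline(wt, col = "blue")

# Compare with the coordinates computed by qqnorm itself (without plotting)
qq <- qqnorm(wt, plot.it = FALSE)
all.equal(sort(qq$x), theoretical)
all.equal(sort(qq$y), observed)
```

If the data were exactly normal, the points would fall on a straight line; curvature or points drifting away at the ends indicate skewness or heavy tails.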


  3. Test using the Shapiro-Wilk Test
  • Warning! The Shapiro-Wilk normality test is sensitive to sample size. Small samples will nearly always pass normality tests, because the test has little power to detect departures from normality, while very large samples can fail for trivial departures. Therefore, it is recommended to use both visual inspection and a significance test to decide whether a dataset is normally distributed.

  • H:

    • \(H_{0}\) : The data is normally distributed.
    • \(H_{1}\) : The data is not normally distributed.
    • We hope to retain \(H_{0}\)!
  • T,P&C:

shapiro.test(mtcars$wt)

Note: The p-value is greater than 0.05, so we retain \(H_{0}\): the data is consistent with a normal distribution. This is not proof of normality, only a lack of evidence against it.
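The sensitivity of the test to sample size can be illustrated with a small simulation. The sketch below is hypothetical and not part of the original analysis: it draws samples from a uniform distribution (which is not normal) and records how often the test rejects normality at the 5% level. The helper reject_rate, the seed, the sample sizes and the number of repetitions are all arbitrary choices.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Proportion of simulated samples of size n for which the test rejects
# normality at the 5% level, when the data is actually uniform (not normal)
reject_rate <- function(n, reps = 500) {
  p_values <- replicate(reps, shapiro.test(runif(n))$p.value)
  mean(p_values < 0.05)
}

rate_small <- reject_rate(10)   # small samples
rate_large <- reject_rate(200)  # large samples

c(n10 = rate_small, n200 = rate_large)
```

The rejection rate is far lower for the small samples, even though none of the simulated data is normal. This is why a non-significant result from a small sample should be read together with the histogram and QQ-plot.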


16.2 Test equal variance of 2 populations (Boxplots; F-test; variance ratio)

  1. Compare the 2 boxplots.
boxplot(extra~group, data = sleep)
  • Note: The boxplots suggest that the variation in group 1 is smaller than in group 2 (the box for group 1 is shorter).

  • Boxplots are a very useful tool as they provide an indication of whether the sample centres (medians) are similar or different, whether the samples are roughly symmetric (as normal data would be), and whether the spreads are similar between the 2 samples.
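The visual impression from the boxplots can be backed up with numerical summaries of spread for each group. This is a minimal sketch; the choice of summaries (standard deviation, interquartile range and variance) and the name spread are illustrative.

```r
# Spread of extra sleep within each group (columns are groups 1 and 2)
spread <- sapply(split(sleep$extra, sleep$group),
                 function(x) c(sd = sd(x), IQR = IQR(x), var = var(x)))
spread
```

The interquartile range corresponds to the height of each box, so it is the summary most directly comparable with the plot.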

  2. Test using the Variance Test (F-test)
  • H:

    • Let \(\sigma_1^2\) = population variance of extra sleep for the group 1 drug.

    • Let \(\sigma_2^2\) = population variance of extra sleep for the group 2 drug.

    • \(H_{0}\) : There is no difference: \(\sigma_1^2=\sigma_2^2\)

    • \(H_{1}\) : There is a difference: \(\sigma_1^2 \neq \sigma_2^2\)

    • We hope to retain \(H_{0}\)!

  • T,P&C:

var.test(extra~group, data = sleep)

Note: From the variance test, the p-value is much greater than 0.05. Therefore we retain the null hypothesis: the data is consistent with equal variances in the two groups.
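The test statistic used by var.test is the ratio of the two sample variances, compared against an F distribution. The following sketch reproduces the calculation by hand; the variable names are illustrative, and the two-sided p-value is computed as twice the smaller tail probability.

```r
x1 <- sleep$extra[sleep$group == 1]
x2 <- sleep$extra[sleep$group == 2]

# Test statistic: ratio of the two sample variances
f_stat <- var(x1) / var(x2)

# Degrees of freedom: sample size minus 1 for each group
df1 <- length(x1) - 1
df2 <- length(x2) - 1

# Two-sided p-value from the F distribution
p_lower <- pf(f_stat, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

# Compare with var.test
res <- var.test(extra ~ group, data = sleep)
all.equal(f_stat, unname(res$statistic))
all.equal(p_value, res$p.value)
```

A ratio close to 1 means the sample variances are similar. Note that this F-test itself assumes both samples come from normal populations, so the normality checks in 16.1 are worth running on each group first.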