When testing hypotheses about the mean of normally distributed data, the Z-test is commonly used when the sample size is large (n > 30) or when the population variance is known. If the sample size is small and the population variance is unknown, the t-test is preferred.
We can build a simple simulation to compare the type I error rates of the Z-test and the t-test, and check whether the t-test performs better when the sample size is small and the variance is unknown. Recall that a type I error is the rejection of a true null hypothesis.
The steps for this simulation are as follows:
- Generate 25 observations from a normal distribution with mean 160 and standard deviation 20.
- Select several values for the level of significance, alpha (the nominal probability of a type I error).
- Calculate the test statistic and the corresponding p-value for a two-sided t-test and Z-test. The p-value is the probability of obtaining results at least as extreme as those actually observed, assuming that the null hypothesis is true.
- Calculate the type I error rate for each value of alpha, that is, the proportion of simulated samples in which the true null hypothesis is rejected because the p-value falls below alpha.
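For reference, the test statistic and p-values from the steps above can be written out as follows. This is a sketch under two assumptions that the text does not state explicitly: the null hypothesis is $H_0: \mu = \mu_0$ with $\mu_0 = 160$, and the Z-test substitutes the sample standard deviation $s$ for the unknown $\sigma$.

$$
T = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad
p_Z = 2\,\bigl(1 - \Phi(|T|)\bigr), \qquad
p_t = 2\,\bigl(1 - F_{n-1}(|T|)\bigr)
$$

Here $\Phi$ is the standard normal CDF and $F_{n-1}$ is the CDF of Student's t distribution with $n - 1 = 24$ degrees of freedom. Because the t distribution has heavier tails than the standard normal, $p_Z \le p_t$ for every sample, so under these assumptions the Z-test rejects at least as often as the t-test.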
The R code is as follows:
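A minimal sketch of such a simulation is given here, not a definitive implementation. The number of replications (`n_sim`), the seed, the grid of significance levels (`alphas`), and the helper `one_run` are all assumptions made for illustration, as is the choice to have the Z-test plug in the sample standard deviation because the variance is treated as unknown.

```r
# Sketch only: n_sim, the seed, and the alpha grid are assumed values.
set.seed(123)                      # arbitrary seed for reproducibility

n      <- 25                       # sample size, from the steps above
mu0    <- 160                      # true mean, also the null-hypothesis mean
sigma  <- 20                       # true SD, used only to generate the data
n_sim  <- 10000                    # number of simulated samples (assumed)
alphas <- c(0.01, 0.05, 0.10)      # significance levels to compare (assumed)

# One replication: draw a sample under H0 and return both two-sided p-values.
one_run <- function() {
  x    <- rnorm(n, mean = mu0, sd = sigma)
  stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))   # sigma treated as unknown
  c(z = 2 * pnorm(-abs(stat)),                  # standard normal reference
    t = 2 * pt(-abs(stat), df = n - 1))         # Student t with n - 1 df
}

p_values <- replicate(n_sim, one_run())         # 2 x n_sim matrix, rows "z" and "t"

# Empirical type I error rate: share of replications with p-value <= alpha.
error_rates <- sapply(alphas, function(a) rowMeans(p_values <= a))
colnames(error_rates) <- paste0("alpha=", alphas)
print(round(error_rates, 4))
```

Each column of `error_rates` corresponds to one significance level and each row to one test. Since the null hypothesis is true in every replication, a well-calibrated test should show a rejection rate close to the column's alpha.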
Hence, we can see that for normal data with a small sample size and unknown variance, the t-test outperforms the Z-test with respect to the type I error rate.