An ANOVA test is a statistical test used to determine if there is a statistically significant difference between two or more categorical groups by testing for differences of means using a variance.
Another Key part of ANOVA is that it splits the independent variable into two or more groups.
For example, one or more groups might be expected to influence the dependent variable, while the other group is used as a control group and is not expected to influence the dependent variable.
In This Article
Assumptions of ANOVA
The assumptions of the ANOVA test are the same as the general assumptions for any parametric test:
- An ANOVA can only be conducted if there is no relationship between the subjects in each sample. This means that subjects in the first group cannot also be in the second group (e.g., independent samples/between groups).
- The different groups/levels must have equal sample sizes.
- An ANOVA can only be conducted if the dependent variable is normally distributed so that the middle scores are the most frequent and the extreme scores are the least frequent.
- Population variances must be equal (i.e., homoscedastic). Homogeneity of variance means that the deviation of scores (measured by the range or standard deviation, for example) is similar between populations.
Types of ANOVA Tests
There are different types of ANOVA tests. The two most common are a “One-Way” and a “Two-Way.”
The difference between these two types depends on the number of independent variables in your test.
A one-way ANOVA (analysis of variance) has one categorical independent variable (also known as a factor) and a normally distributed continuous (i.e., interval or ratio level) dependent variable.
The independent variable divides cases into two or more mutually exclusive levels, categories, or groups.
The one-way ANOVA test for differences in the means of the dependent variable is broken down by the levels of the independent variable.
An example of a one-way ANOVA includes testing a therapeutic intervention (CBT, medication, placebo) on the incidence of depression in a clinical sample.
Note: Both the One-Way ANOVA and the Independent Samples t-Test can compare the means for two groups. However, only the One-Way ANOVA can compare the means across three or more groups.
Two-way (factorial) ANOVA
A two-way ANOVA (analysis of variance) has two or more categorical independent variables (also known as a factor) and a normally distributed continuous (i.e., interval or ratio level) dependent variable.
The independent variables divide cases into two or more mutually exclusive levels, categories, or groups. A two-way ANOVA is also called a factorial ANOVA.
An example of factorial ANOVAs include testing the effects of social contact (high, medium, low), job status (employed, self-employed, unemployed, retired), and family history (no family history, some family history) on the incidence of depression in a population.
What are “Groups” or “Levels”?
Levels are different groupings within the same independent variable.
For example, if the independent variable is “eggs,” the levels might be Non-Organic, Organic, and Free Range Organic. The dependent variable could then be the price per dozen eggs.
ANOVA F -value
The test statistic for an ANOVA is denoted as F. The formula for ANOVA is F = variance caused by treatment/variance due to random chance.
The ANOVA F value can tell you if there is a significant difference between the levels of the independent variable, when p < .05. So, a higher F value indicates that the treatment variables are significant.
Note that the ANOVA alone does not tell us specifically which means were different from one another. To determine that, we would need to follow up with multiple comparisons (or post-hoc) tests.
When the initial F test indicates that significant differences exist between group means, post hoc tests are useful for determining which specific means are significantly different when you do not have specific hypotheses that you wish to test.
Post hoc tests compare each pair of means (like t-tests), but unlike t-tests, they correct the significance estimate to account for the multiple comparisons.
What Does “Replication” Mean?
Replication requires a study to be repeated with different subjects and experimenters. This would enable a statistical analyzer to confirm a prior study by testing the same hypothesis with a new sample.
How to run an ANOVA?
For large datasets, it is best to run an ANOVA in statistical software such as R or Stata. Let’s refer to our Egg example above.
Non-Organic, Organic, and Free-Range Organic Eggs would be assigned quantitative values (1,2,3). They would serve as our independent treatment variable, while the price per dozen eggs would serve as the dependent variable. Other erroneous variables may include “Brand Name” or “Laid Egg Date.”
Using data and the aov() command in R, we could then determine the impact Egg Type has on the price per dozen eggs.
ANOVA vs. t-test?
T-tests and ANOVA tests are both statistical techniques used to compare differences in means and spreads of the distributions across populations.
The t-test determines whether two populations are statistically different from each other, whereas ANOVA tests are used when an individual wants to test more than two levels within an independent variable.
Referring back to our egg example, testing Non-Organic vs. Organic would require a t-test while adding in Free Range as a third option demands ANOVA. Rather than generate a t-statistic, ANOVA results in an f-statistic to determine statistical significance.