Skip to article frontmatterSkip to article content

What is Analysis of Variance

Definition: Analysis of Variance, aka ANOVA, is a statistical method used to compare the means of more than two groups to determine whether there is a significant difference between them.

The primary purpose of ANOVA is to test the hypothesis that the means of several populations are equal. It helps determine whether any observed differences between group means are real or occurred by chance.

When to use ANOVA: ANOVA is used when comparing more than two group means. It is particularly useful in experimental settings where a single factor (independent variable, one-way ANOVA) is manipulated at multiple levels (e.g., low, medium, high), and the researcher wants to investigate the effect of this manipulation on the outcome (dependent variable).

Assumptions of ANOVA

ANOVA is based on the following assumptions:

Note: It is essential to verify that these assumptions are met before conducting ANOVA. If assumptions are not satisfied, the validity of the F-test may be compromised. In such cases, alternative methods such as repeated measure ANOVA, Welch’s ANOVA, non-parametric tests or more robust techniques may be considered.

Steps of ANOVA

The analysis process for t-test, ANOVA, and post-hoc test.

The analysis process for t-test, ANOVA, and post-hoc test.

Types of ANOVA

Here only some commonly used types are introduced. Other types, e.g,. two-ways, three-ways, nested, Welch’s ANOVA... are not covered here since those data is less common, and the concept is similar (anyway).

One-way ANOVA

Concept One-way ANOVA is a statistical technique used to compare the target variable’s means of three or more independent groups with a single independent variable (or factor). It helps determine whether any observed differences between group means are real or occurred by chance.

Example To compare the average test scores of students who received different teaching methods.

Hypothesis Tested in One-way ANOVA

The One-way ANOVA comparing the four target variables between the three species.

The One-way ANOVA comparing the four target variables between the three species.

Table 1:The ANOVA result for the four target variables. Each row display the result from a single ANOVA. All were significant.

varietyddof1ddof2Fp-uncnp2
petal.length21471180.160.0000.941
petal.width2147960.0070.0000.929
sepal.length2147119.2650.0000.619
sepal.width214749.160.0000.401

Repeated Measures ANOVA

Concept Repeated measures ANOVA, also known as within-subjects ANOVA, is a statistical technique used to compare means of three or more dependent groups when the same subjects are observed under various conditions or at different points in time.

Example To compare the reaction times of a group of participants before, during, and after receiving a treatment.

Hypothesis Tested in Repeated Measures ANOVA

The physiological responses before, during and after two types of VR treatments.

The physiological responses before, during and after two types of VR treatments.

The tables about the repeated measures and post-hoc results: (left) low decibel, (right) high decibel.

The tables about the repeated measures and post-hoc results: (left) low decibel, (right) high decibel.

Mixed-Design ANOVA

Concept Mixed-design ANOVA, also known as split-plot ANOVA, is a statistical method used for analyzing data from studies with both between-subjects factors and within-subjects factors. It combines features of one-way ANOVA and repeated measures ANOVA, allowing researchers to examine the effects of multiple independent variables on a dependent variable.

Key Points

Assumptions The assumptions of mixed-design ANOVA are similar to those of one-way and repeated measures ANOVA, including normality, homogeneity of variances, and independence. Additionally, the assumption of sphericity must be met for the within-subjects factor.

Use case By using mixed-design ANOVA, researchers can gain a deeper understanding of the complex relationships between multiple factors and their effects on the dependent variable. It’s a powerful tool for analyzing data from various experimental designs, particularly in fields such as psychology, medicine, and education.

The mixed ANOVA result between classes, time (pre-test vs. post-test), and the interaction between class and time.

The mixed ANOVA result between classes, time (pre-test vs. post-test), and the interaction between class and time.

The post-hoc test after mixed ANOVA, to observe the time effects in different condition.

The post-hoc test after mixed ANOVA, to observe the time effects in different condition.

F-test, F-distribution, and F-statistic

The F-test is a statistical test used in ANOVA to compare variances and determine whether there is a significant difference between group means. The F-statistic (or F-ratio) is calculated as the ratio of the between-group variance to the within-group variance:

F=(Between-group variance)/(Within-group variance)F = (\text{Between-group variance}) / (\text{Within-group variance})

The F-statistic reflects the extent to which the variation between groups exceeds the variation within groups. A higher F-value indicates a greater difference between group means relative to the variability within each group.

Under the null hypothesis that all group means are equal, the F-statistic follows an F-distribution, which is a family of continuous probability distributions characterized by two parameters: degrees of freedom for the numerator (ddof1\text{ddof}_1) and degrees of freedom for the denominator (ddof2\text{ddof}_2).

The F-distribution is used to determine the critical F-value for a given significance level (α\alpha) and degrees of freedom (ddof1\text{ddof}_1 and ddof2\text{ddof}_2). This critical value is compared with the calculated F-statistic to make a decision about the null hypothesis.

Making decisions in ANOVA using the F-value

Post-hoc Tests

Purpose of Post-hoc Tests

Post-hoc tests are follow-up analyses conducted after a significant overall F-test in ANOVA. Their purpose is to determine which specific group means are significantly different from one another. Since ANOVA only tells us that at least two means are different, post-hoc tests help pinpoint where those differences lie.

Common post-hoc tests

Tukey’s HSD Test:

Pairwise t-tests:

Roadmap for choosing ANOVA and post-hoc tests

The roadmap for choosing ANOVA and post-hoc tests approaches. by Pingouin

The roadmap for choosing ANOVA and post-hoc tests approaches. by Pingouin