From learning about the one-way ANOVA, we know that ANOVA is used to identify the mean difference between more than two groups. A one-way ANOVA is used when we have one grouping variable and a continuous outcome. But what should we do if we have two grouping variables? As you’ve probably guessed, we can conduct a two-way ANOVA. Because this situation is fairly common, I created the page below to provide a step-by-step guide to calculating a two-way ANOVA in Excel. As always, if you have any questions, please email me a MHoward@SouthAlabama.edu!
As mentioned, an ANOVA is used to identify the mean difference between more than two groups, and a two-way ANOVA is used to identify the mean difference between more than two groups when you have a two grouping variables and a continuous outcome. So, a two-factor ANOVA is used to answer questions that are similar to the following:
- What is the mean difference of test grades between left- and right-handed students, students in Dr. Howard’s and Dr. Smith’s classes, and the combinations of these groups?
- What is the mean difference in total output of factories defined by location as well as industry?
- What is the mean difference in performance for four different training programs, each performed at four different locations, and the combination of training program and location.
Also, in testing these effects, a two-way ANOVA can determine whether Variable 1 has an effect, whether Variable 2 has an effect, and whether there is an interaction between Variable 1 and Variable 2. An interaction indicates that the effect of Variable 1 depends on Variable 2 and the effect of Variable 2 depends on Variable 1. One way to think about it is: Variable 1 may have an effect, Variable 2 may have an effect, but an interaction occurs when something special happens when Variable 1 and Variable 2 are studied together. For instance, the effects may be multiplicative when studied together.
Now that we know what a two-way ANOVA is used for, we can now calculate a two-way ANOVA in Excel. To begin, open your data in Excel. If you don’t have a dataset, download the example dataset here. In the example dataset, we are simply comparing the means two different grouping variables, each with three different groups, on a single continuous outcome. The variables are Variable 1 (Group A, B, and C) and Variable 2 (Groups 1, 2, and 3). You can imagine that the groups and the outcome are anything that you want.
Before continuing, I must note that Excel is pretty bad at calculating a two-way ANOVA, so I recommend using SPSS or R instead. Nevertheless, if you need to use Excel, it can calculate a result for you.
With that noted, your dataset should look like the image below.
Like most other analyses, we want to begin by going to the Data tab and clicking on Data Analysis. If you don’t have the Data Analysis button, read my guide on how to activate Data Analysis in Excel.
Click on “Anova: Two-Factor With Replication”, then click on “OK”.
Click on the button highlighted below, which tells Excel where your data is located.
Highlight all your data AND your labels. Then press the other button highlighted below.
Now, we need to tell Excel how many participants are in our groups. As you can see above, there are two grouping variables each with three groups. This makes a total of nine groups: 1A, 2A, 3A, 1B, 2B, 3B, 1C, 2C, and 3C. In each of these groups, we have three numbers. For instance, Group 1A has values of 10, 7, and 10. This means that we have three participants per group. So, if you are using the example dataset, enter “3” in the box highlighted below.
Now your screen should look like the following. Press “OK”.
Your output should appear as the following:
Woah! That is a lot of output! Don’t worry, we will walk through it step-by-step.
First, we are going to determine whether the variable that defined our rows has a statistically significant effect. In the example dataset, this was Variable 1 (Group A, B, and C). To determine whether it was statistically signifiant, let’s first look at the F-statistic, which is our test statistic.
The F-statistic for Variable 1 was 5.127. While we would want to report it, the F-statistic doesn’t tell us a lot on its own. So, we should look at the p-value.
The p-value was .017. Because this is less than .05, we consider the effect of Variable 1 to be statistically significant. So, there is a significant difference among Groups A, B, and C.
Now, let’s look at Variable 2’s results (Groups 1, 2, and 3.).
The F-statistic is 1.901. Again, we should report this value, but also look at the p-value to determine how to interpret our results.
The p-value is .178. Because this is greater than .05, our result is not statistically significant. Therefore, we would say that Variable 2 does not have a significant effect, and there is not a notable difference between Groups 1, 2, and 3.
Lastly, let’s look at the interaction term.
The F-statistic is 1.660. Report this, but let’s also look at the p-value.
The p-value is .203. This result is not statistically significant, and we would therefore say that there is no significant interaction between Variable 1 and Variable 2.
Together, we know that there is a significant effect for Variable 1, but there was not a significant effect for Variable 2 or the interaction. Because Variable 1 was significant, we know that there is some type of difference between Groups A, B, and C. But how are they different?
If we were using a more advanced program, such as SPSS or R, we could perform post hoc tests or planned comparisons. Unfortunately, Excel does not have this feature easily implemented. So, we are just going to look at the group means.
The highlighted means are for Groups A, B, and C. From looking at the means, we can tell that Group A has the highest value, followed by Group C, followed by Group B. While we don’t know whether these specific comparisons are significant, we can get a general feel for how the groups differ.
But what if Variable 2 was statistically significant? Well, we can look at the values highlighted below.
From these means, we can see that Group 3 had a slightly higher mean than Groups 1 and 2; however, because Variable 2 did not have a significant effect, these differences are not statistically significant.
Finally, if we had a significant interaction effect, we would need to look at the means of all groups. They are highlighted in the picture below.
From this image, we can see that Group C3 had a value of 10, Group A1 had a value of 9, and the groups descended from there. Because there was not a significant interaction effect, these individual group comparisons are not that meaningful.
Phew! That was a lot, but we are done now. If you have any questions or comments, please email me at MHoward@SouthAlabama.edu.