# One-Way ANOVA in R

T-tests are used to identify the mean difference between two groups.  But what do you do if you want to compare the mean difference of more than two groups?  Well, as you’ve probably guessed, you can perform an ANOVA.  Because ANOVA is a commonly-used statistical tool, I created the page below to provide a step-by-step guide to calculating an ANOVA in R.  This page is for a one-way ANOVA, which is when you have a single grouping variable and a continuous outcome.  As always, if you have any questions, please email me a MHoward@SouthAlabama.edu!

As mentioned, an ANOVA is used to identify the mean difference between more than two groups, and a one-way ANOVA is used to identify the mean difference between more than two groups when you have a single grouping variable and a continuous outcome.  So, a one-way ANOVA is used to answer questions that are similar to the following:

• What is the mean difference of test grades between Dr. Howard’s class, Dr. Smith’s class, and Dr. Kim’s class?
• What is the mean difference in total output of five different factories?
• What is the mean difference in performance of four different training groups?

Now that we know what an one-way ANOVA is used for, we can now calculate an one-way ANOVA in R.  To begin, open your data in R.  If you don’t have a dataset, download the example dataset here. In the example dataset, we are simply comparing the means of three different groups on a single continuous outcome. You can imagine that the groups and the outcome are anything that you want.

Also, this dataset is in the .xlsx format, and the current guide requires the file to be in .csv format.  For this reason, you must convert this file from .xlsx format to .csv format before you can follow along using this dataset.  If you do not know how to do this, please visit my page on converting a file to .csv format.  After converting the file, you can continue with this guide.

First, you must import your data to R. For the current examples, we are going to label our data as: MyData.  For a more in-depth review of opening data in R, please visit my guide on the topic.

Before conducting our ANOVA, we need to first indicate that our grouping variable within our dataset is indeed a grouping variable.  To do this, we first refer to our grouping variable in our dataset, which is MyData\$Groups, followed by an arrow, <- .  So, we would type MyData\$Groups <- .

Now, we use the command as.factor() to indicate that our grouping variable is a grouping variable.  Type: as.factor( .

Finally, type your grouping variable again and close your bracket by typing: MyData\$Groups) .

We are going to assign our output to the following term: fit .  So, please type: fit <- .

For our analysis, we are going to use the aov() command.  Type: aov( .

In the aov() command, the first variable that we enter is our outcome variable.  For our dataset, out outcome variable is labeled Outcome.  We also need to identify that this is our outcome variable within the command by typing a ~.  So, we would type: Outcome ~ .

Next, we identify our grouping variable.  In the current dataset, our grouping variable is labeled Groups.  We should thereby type: Groups, .

Now, we just need to identify our dataset and close our command.  Type: data=MyData).

Because we are assigning our output to “fit,” we need to tell R to provide a summary of fit.  On the following line, we should therefore type: summary(fit) .