R can be used to generate plots. The following example uses the data set PlantGrowth, which comes as an example data set along with R.
Type all of the following lines that do not start with ## into the R prompt. Lines starting with ## document the result that R returns.
data(PlantGrowth)
str(PlantGrowth)
## 'data.frame': 30 obs. of 2 variables:
## $ weight: num 4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
## $ group : Factor w/ 3 levels "ctrl","trt1",..: 1 1 1 1 1 1 1 1 1 1 ...
anova(lm(weight ~ group, data = PlantGrowth))
## Analysis of Variance Table
##
## Response: weight
##           Df  Sum Sq Mean Sq F value  Pr(>F)
## group      2  3.7663  1.8832  4.8461 0.01591 *
## Residuals 27 10.4921  0.3886
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The following plot is created:
data(PlantGrowth) loads the example data set PlantGrowth, which records the dry masses of plants that were subjected to one of two different treatment conditions or to no treatment at all (control group). The data set is made available under the name PlantGrowth. Such a name is also called a variable.
To load your own data, the following two documentation pages might be helpful:
- Reading and writing tabular data in plain-text files (CSV, TSV, etc.)
- I/O for foreign tables (Excel, SAS, SPSS, Stata)
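As a minimal sketch of the plain-text case, the following lines write a small CSV file to a temporary location and read it back with read.csv(). The file name and its three rows are made up for illustration only; with real data, the path to your own file would be passed to read.csv() instead.

```r
# Hypothetical example: this file and its contents are invented for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("weight,group", "4.17,ctrl", "4.81,trt1", "6.31,trt2"), csv_path)

# read.csv() returns a data.frame.
# stringsAsFactors = TRUE converts text columns such as group into factors.
my_data <- read.csv(csv_path, stringsAsFactors = TRUE)
str(my_data)
```

The resulting table has the same two-column layout as PlantGrowth, so the commands in this example can be applied to it in the same way.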
str(PlantGrowth) shows information about the data set that was loaded. The output indicates that PlantGrowth is a data.frame, which is R's name for a table. The data.frame consists of two columns and 30 rows. In this case, each row corresponds to one plant. Details of the two columns are shown in the lines starting with $: The first column is called weight and contains numbers (num, the dry weight of the respective plant). The second column, group, contains the treatment that the plant was subjected to. This is categorical data, which is called a factor in R. Read more information about data frames.
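The facts reported by str() can also be queried one at a time. The following sketch uses only base R functions; which of them are useful depends on what you want to know about the table.

```r
data(PlantGrowth)

nrow(PlantGrowth)           # number of rows, one per plant
ncol(PlantGrowth)           # number of columns
names(PlantGrowth)          # column names
class(PlantGrowth$weight)   # type of the weight column
levels(PlantGrowth$group)   # categories of the factor column
table(PlantGrowth$group)    # number of plants in each group
head(PlantGrowth, 3)        # first three rows of the table
```

A single column is accessed with the $ operator, as in PlantGrowth$weight.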
To compare the dry masses of the three different groups, a one-way ANOVA is performed using anova(lm( ... )). weight ~ group means "Compare the values of the column weight, grouping by the values of the column group". This is called a formula in R. data = ... specifies the name of the table where the data can be found.
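The nested call can also be written step by step, which may make the role of the formula easier to see. This is only a sketch; the variable names f, fit, tab and p_value are chosen here for illustration.

```r
data(PlantGrowth)

# A formula is an ordinary R object and can be stored in a variable.
f <- weight ~ group
class(f)      # "formula"
all.vars(f)   # names of the columns the formula refers to

# Fit the linear model first, then pass the fitted model to anova().
fit <- lm(f, data = PlantGrowth)
tab <- anova(fit)

# The ANOVA table behaves like a data.frame, so single cells can be extracted
# by row name and column name.
p_value <- tab["group", "Pr(>F)"]
p_value
```

Extracting the p-value this way gives the same number that is printed in the Pr(>F) column of the table.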
The result shows, among other things, that there is a significant difference (column Pr(>F), p = 0.01591) between some of the three groups. Post-hoc tests, such as Tukey's test, must be performed to determine which groups' means differ significantly.
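One possible way to run such a post-hoc test in base R is TukeyHSD(). Note that it expects a model fitted with aov() rather than lm(), so the sketch below refits the same formula. The variable names fit and tukey are illustrative.

```r
data(PlantGrowth)

# TukeyHSD() works on aov fits, so the model is fitted with aov() here.
fit <- aov(weight ~ group, data = PlantGrowth)
tukey <- TukeyHSD(fit)

# Prints, for each pair of groups, the difference in means, a confidence
# interval and an adjusted p-value.
tukey

# The adjusted p-values alone, one per pairwise comparison.
tukey$group[, "p adj"]
```

Pairs with an adjusted p-value below the chosen significance level are the ones whose means differ significantly.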
boxplot(...) creates a box plot of the data, reading the values to be plotted from the data set. weight ~ group means: "Plot the values of the column weight versus the values of the column group." ylab = ... specifies the label of the y axis. More information: Base plotting
Type q() or press Ctrl-D to exit the R session.