Tutorial 5.2 - The Grammar of Graphics in R (ggplot2)

24 Jun 2017

This Tutorial has been thrown together a little hastily and is therefore not very well organized - sorry! Graphical features are demonstrated either via tables of properties or as clickable graphics that reveal the required R code. Click on a graphic to reveal/toggle the source code or to navigate to an expanded section.

This tutorial is intended to be viewed sequentially. It begins with the basic ggplot framework and then progressively builds up more and more features as default elements are gradually replaced to yield more customized graphics.

Having said that, I am going to start with a sort of showcase of graphics which should act as quick navigation to entire sections devoted to the broad series of graphs related to each of the featured graphics. I have intentionally titled each graph according to the main feature it encapsulates rather than any specific functions that are used to produce the features as often a single graphic requires a combination of features and thus functions. Furthermore, the grammar of graphics specifications are sufficiently unfamiliar to many that the relationships between the types of graphical features a researcher wishes to produce and the specific syntax required to achieve the desired result can be difficult to recognize.

Each graphic is intended to encapsulate a broad series of related graph types.

Basic plot types

Boxplots	Histograms	Density plots	Scatterplots	Line graphs	Smoothers
Trendlines	Bar charts	Stacked bar charts	Bar graphs	Interaction plots	Scatterplot matrix
Heat maps	Contour maps

Additions

Segments	Confidence bands	Error bars	Horizontal lines	Vertical lines	Range bars
Axes rugs	Text plots

Customizations

Axes scales

Colours

Line types

Plotting symbols

Transparency

Sizes

Layouts

facet_wrap

facet_grid

grids and viewports

The Grammar of Graphics

The Grammar of Graphics was first introduced by Wilkinson and Wills (2006) as a new graphics philosophy that laid down a series of rules to govern the production of quantitative graphics. Essentially, the proposed graphics infrastructure considers a graphic as comprising a plot (defined by a coordinate system, scales and panelling) over which one or more data layers are applied.

Each layer is defined as:

the data - a data frame
mapping specifications that establish the visual aesthetics (colour, line type and thickness, shapes etc) of each variable
statistical methods that determine how the data rows should be summarised (stat)
geometric instructions (geom) on how each summary should be represented (bar, line, point etc)
positional mechanism for dealing with overlapping data (position)

The visual aspects of all the graphical features are then governed by themes.

Following a very short example, the next section will largely concentrate on describing each of the above graphical components. Having then established the workings of these components, we can then put them together to yield specific graphics.

Hadley Wickham's interpretation of these principles in an R context is implemented via the ggplot2 package. In addition, the following packages are commonly used alongside ggplot2 to extend its flexibility.

grid
gridExtra
scales

library(ggplot2)
library(tidyverse)
library(grid)
library(gridExtra)
library(scales)

The following very simple graphic will be used to illustrate the above specification by explicitly stating many of the default specifications. It uses a cartesian coordinate system, continuous axes scales and a single facet (panel), and then defines two layers that share a data frame (BOD), an aesthetic mapping and an identity (no summarisation) statistic: the first layer is visualised with a point geometric and the second with a line geometric.

 p <- ggplot() +
  coord_cartesian() + #cartesian coordinates
  scale_x_continuous() + #continuous x axis
  scale_y_continuous() + #continuous y axis
  #single layer
  layer( data=BOD, #data.frame
    mapping=aes(y=demand,x=Time),
    stat="identity", #use original data
    geom="point", #plot data as points
    position="identity" # how to handle overlapping data
  )+
  layer( data=BOD, #data.frame
    mapping=aes(y=demand,x=Time),
    stat="identity", #use original data
    geom="line", #plot data as a line
    position="identity" # how to handle overlapping data
  )
p #print the plot

OR, by leaving out all the default stuff

p <- ggplot(data=BOD, mapping=aes(y=demand,x=Time)) + geom_point()+
  geom_line()
p

Note, the following important features of the grammar of graphics as implemented in R:

the order in which each of the above components in the first code snippet were added is unimportant. They each add additional information to the overall graphical object. The object itself is evaluated as a whole when it is printed.
multiple layers are laid down in the order that they appear in the statement
in the second code snippet (the shorter version), a layer is created for each of the two geoms
the data and mapping used by both geom_point() and geom_line() are inherited from the main ggplot() function.
since layers are ordered, the points are drawn first and the line over the top
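The inheritance described in the points above can be checked directly. The following is a minimal sketch that uses ggplot2's layer_data() helper to pull out the data each layer will draw; the object names (pts, lns) are just illustrative.

```r
library(ggplot2)

# Data and mapping are declared once in ggplot() and inherited by both layers
p <- ggplot(data = BOD, mapping = aes(y = demand, x = Time)) +
  geom_point() +  # layer 1: laid down first
  geom_line()     # layer 2: drawn over the top of the points

# layer_data() returns the data that a given layer will draw
pts <- layer_data(p, 1)
lns <- layer_data(p, 2)

# Both layers received the same inherited x and y values
all.equal(pts$x, lns$x)
all.equal(pts$y, BOD$demand)
```

Printing p draws the plot; neither geom needed its own data or mapping argument.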

In an attempt to illustrate the use of ggplot for elegant graphics, we will drill down into each of the plot and layer specifications. Although the geoms and thus layers are amongst the last features to be constructed by the system, the data and aesthetic features of the data impact on how the coordinate system, scales and panelling work. Therefore, we will explore the geoms first.

Geometric objects - `geom_` and `stat_`

Geometric objects (geoms) are visual representations of observations. For example, there is a geom to represent points based on a set of x,y coordinates. All graphics need at least one geom and each geom is mapped to its own layer. Multiple geoms can be added to a graphic and the order that they are added to the expression determines the order that their respective layer is constructed.

When a ggplot expression is being evaluated, geoms are coupled together with a stat_ function. This function is responsible for generating data appropriate for the geom. For example, the stat_boxplot is responsible for generating the quantiles, whiskers and outliers for the geom_boxplot function.

In addition to certain specific stat_ functions, all geoms can be coupled to a stat_identity function. In mathematical contexts, an identity function maps each element to itself - this essentially means that each element passes straight through the identity function unaltered. Coupling a geom to an identity function is useful when the characteristics of the data that you wish to represent are already present in the data frame. For example, your data frame may contain the x,y coordinates for a series of points and you wish for them to be used unaltered as the x,y coordinates on the graph. Similarly, your data frame may contain pre-calculated information about the quantiles, whiskers and outliers and you wish these to be used in the construction of a boxplot (rather than have the internals of ggplot perform the calculations on raw data).
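As a rough sketch of the pre-calculated boxplot case, the following builds the box statistics by hand and passes them straight through stat_identity. The data frame name (box.df) is hypothetical, and for simplicity the whiskers are assumed to run to the minimum and maximum, which is not what stat_boxplot itself would calculate.

```r
library(ggplot2)

# Pre-calculate the boxplot statistics ourselves.
# Assumption for illustration: whiskers extend to the minimum and maximum,
# whereas stat_boxplot limits them to 1.5 x IQR and flags outliers.
qs <- quantile(CO2$uptake, probs = c(0, 0.25, 0.5, 0.75, 1))
box.df <- data.frame(
  x = "uptake",
  ymin = qs[[1]], lower = qs[[2]], middle = qs[[3]],
  upper = qs[[4]], ymax = qs[[5]]
)

# stat = "identity" tells the geom to use these values unaltered
p <- ggplot(box.df) +
  geom_boxplot(
    aes(x = x, ymin = ymin, lower = lower, middle = middle,
        upper = upper, ymax = ymax),
    stat = "identity"
  )

# The middle line is exactly the value that was supplied
layer_data(p, 1)$middle
```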

Since geom_ and stat_ functions are coupled together, a geometric representation can be expressed from either a geom_ function OR a stat_ function. That is, you either:

specify a geom_ function that itself calls a stat_ function to provide the data for the geom function..
ggplot(CO2)+geom_smooth(aes(x=conc,y=uptake), stat="smooth")
specify a stat_ function that itself calls a geom_ function to visually represent the data..
ggplot(CO2)+stat_smooth(aes(x=conc,y=uptake), geom="smooth")

It does not really make any difference which way around you do this. For the remainder of this tutorial, we will directly engage the geom_ function for all examples.
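One way to convince yourself that the two routes are equivalent is to compare the data each one computes. This is a small sketch only; a linear smoother with an explicit formula is assumed here so that the comparison is deterministic and free of messages.

```r
library(ggplot2)

# Route 1: a geom_ function that calls a stat_
p.geom <- ggplot(CO2) +
  geom_smooth(aes(x = conc, y = uptake), stat = "smooth",
              method = "lm", formula = y ~ x)

# Route 2: a stat_ function that calls a geom_
p.stat <- ggplot(CO2) +
  stat_smooth(aes(x = conc, y = uptake), geom = "smooth",
              method = "lm", formula = y ~ x)

# Compare the fitted line and confidence limits computed by each route
cols <- c("x", "y", "ymin", "ymax")
all.equal(layer_data(p.geom, 1)[, cols], layer_data(p.stat, 1)[, cols])
```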

The geom_ functions all have numerous arguments, many of which are common to all geom_ functions.

data - the data frame containing the data. Typically this is inherited from the ggplot function.
mapping - the aesthetic mapping instructions. Through the aesthetic mapping, the visual characteristics of the geometric features can be controlled (such as colour, point sizes, shapes etc). The aesthetic mapping can be inherited from the ggplot function. Common aesthetic features (mapped via an aes function) include:
- alpha - transparency
- colour - colour of the geometric features
- fill - fill colour of geometric features
- linetype - type of lines used in geometric features (dotted, dashed, etc)
- size - size of geometric features such as points or text
- shape - shape of geometric features such as points
- weight - weightings of values
stat - the stat_ function coupled to the geom_ function
position - the position adjustment for overlapping objects
- identity - leave objects where they are
- dodge - shift objects to the side to prevent overlapping
- stack - stack objects on top of each other
- fill - stack objects on top of each other and standardize each group to equal height
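The effect of the position adjustments listed above can be seen by comparing the bar tops that each one produces. The sketch below assumes the diamonds data that ships with ggplot2 and inspects the computed ymax values rather than the drawn plot.

```r
library(ggplot2)

# The same data and mapping, with three different position adjustments
base <- ggplot(diamonds, aes(x = cut, fill = clarity))
p.stack <- base + geom_bar(position = "stack")
p.dodge <- base + geom_bar(position = "dodge")
p.fill  <- base + geom_bar(position = "fill")

# stack: the tallest stack is the size of the largest level of cut
max(layer_data(p.stack, 1)$ymax)
# dodge: the tallest bar is the largest single cut-by-clarity count
max(layer_data(p.dodge, 1)$ymax)
# fill: every stack is standardized to a height of 1
max(layer_data(p.fill, 1)$ymax)
```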

Currently, there are a large number of available geom_ and stat_ functions within the ggplot system. This tutorial is still a work in progress and therefore does not include all of them - I have focused on the more commonly used ones.

In an attempt to break up the set of geom_ and stat_ functions, I have somewhat arbitrarily divided them up into primary and secondary geometric features. Primary geometric features are those that could be viewed as graphics in their own right, whereas secondary geometric features are those that are added to other geometric features to provide additional information (but would rarely be considered a graphic in their own right).

Primary geometric objects

The following icon matrix provides navigation and an overview to the geometric features described in this section.

geom_bar	geom_bar	geom_bar	geom_bar	geom_boxplot	geom_density
geom_point	geom_line	geom_smooth	geom_smooth	geom_tile	geom_contour

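`geom_bar`, `stat_count` and `stat_bin`

geom_bar constructs bar charts and, when coupled with stat_bin, histograms. In current ggplot2 releases (2.0.0 onwards) the default stat for geom_bar is stat_count, which counts the number of rows at each distinct value of x. stat_bin instead bins a continuous vector and is the default stat of geom_histogram. The following lists describe the mapping aesthetic properties associated with geom_bar and its stats. Compulsory entries are marked.

`geom_bar`
- x - x axis value (categorical) (compulsory)
- y - bar height (continuous); only supplied with stats such as identity or summary
- alpha - transparency
- colour - colour of the lines
- fill - colour of the bar
- linetype - type of lines used to construct bar
- size - thickness of the lines used to border the bar
- weight - weightings of values

`stat_count` and `stat_bin`
- x - a vector that is to be counted (stat_count) or binned (stat_bin) (compulsory)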

The following table illustrates the first six rows of the diamonds dataset (comes with ggplot2) that will be used for the following examples.

	carat	cut	color	clarity	depth	table	price	x	y	z
1	0.23	Ideal	E	SI2	61.50	55.00	326	3.95	3.98	2.43
2	0.21	Premium	E	SI1	59.80	61.00	326	3.89	3.84	2.31
3	0.23	Good	E	VS1	56.90	65.00	327	4.05	4.07	2.31
4	0.29	Premium	I	VS2	62.40	58.00	334	4.20	4.23	2.63
5	0.31	Good	J	SI2	63.30	58.00	335	4.34	4.35	2.75
6	0.24	Very Good	J	VVS2	62.80	57.00	336	3.94	3.96	2.48

Feature	geom	stat	position	Aesthetic parameters / Notes
barchart	`_bar`	`_count`	stack	x,size,linetype,colour,fill,alpha,weight. Bar heights represent the number of items in each level of the categorical vector
ggplot(diamonds) + geom_bar(aes(x = cut))
barchart	`_bar`	`_count`	stack	Bar heights represent the number of items in each level of a categorical vector and are stacked according to another categorical vector
ggplot(diamonds) + geom_bar(aes(x = cut, fill = clarity))
barchart	`_bar`	`_count`	dodge	Bar heights represent the number of items in each combination of levels of multiple categorical vectors displayed side by side
ggplot(diamonds) + geom_bar(aes(x = cut, fill = clarity), position = "dodge")
barchart	`_bar`	`_identity`	stack	Bar heights represent value of y for each x
diamonds1 <- as.data.frame(table(diamonds$cut))
ggplot(diamonds1) + geom_bar(aes(x = Var1, y = Freq), stat = "identity")
bargraph	`_bar`	`_summary`	stack	Bar heights represent mean y within each level of categorical vector (from ggplot2 3.3.0 the `fun.y` argument is named `fun`)
ggplot(diamonds) + geom_bar(aes(x = cut, y = carat), stat = "summary", fun.y = mean)
histogram	`_histogram`	`_bin`	stack	Bar heights represent counts within a binned continuous vector
ggplot(diamonds) + geom_histogram(aes(x = carat))
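The difference between counting and binning a continuous vector is easiest to see in the number of bars each stat produces. The following is a hedged sketch; the binwidth of 0.1 is an arbitrary choice made only for illustration.

```r
library(ggplot2)

# stat_count (the geom_bar default): one bar per distinct value of carat
p.count <- ggplot(diamonds) + geom_bar(aes(x = carat))

# stat_bin (the geom_histogram default): one bar per bin
# (binwidth = 0.1 is an arbitrary value chosen for this example)
p.bin <- ggplot(diamonds) + geom_histogram(aes(x = carat), binwidth = 0.1)

# Number of bars under each stat
nrow(layer_data(p.count, 1))
nrow(layer_data(p.bin, 1))

# Either way, every diamond is counted exactly once
sum(layer_data(p.bin, 1)$count)
```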

`geom_boxplot` and `stat_boxplot`

geom_boxplot constructs boxplots. The values of the various elements of the boxplot (quantiles, whiskers etc) are calculated by its main pairing function (stat_boxplot). The following lists describe the mapping aesthetic properties associated with geom_boxplot and stat_boxplot. Compulsory entries are marked. Note that boxplots are usually specified via the geom_boxplot function, which engages stat_boxplot to calculate the quantiles, whiskers and outliers. Therefore, confusingly, when calling geom_boxplot, the compulsory parameters are actually those required by stat_boxplot (unless you indicate to use stat_identity).

`geom_boxplot`
- x - x axis value (categorical) (compulsory)
- lower - value of the lower box line (25th percentile) (compulsory)
- middle - value of the middle box line (50th percentile - median) (compulsory)
- upper - value of the upper box line (75th percentile) (compulsory)
- ymax - value of the upper whisker (compulsory)
- ymin - value of the lower whisker (compulsory)
- alpha - transparency
- colour - colour of the lines
- fill - colour of the boxplot
- linetype - type of lines used to construct boxplot
- shape - symbol shape for outliers
- size - symbol size for outliers
- weight - weightings of values

`stat_boxplot`
- x - x axis value (categorical) (compulsory in older ggplot2 releases)
- y - y axis value (continuous) (compulsory)

Feature	geom	stat	position	Notes, additional parameters	Example
boxplot	`_boxplot`	`_boxplot`	dodge	Plot of quantiles, whiskers and outliers `outlier.colour` `outlier.shape` `outlier.size` `notch` - whether to include a notch or not `notchwidth` - width of notch (fraction of box width)
ggplot(diamonds) + geom_boxplot(aes(x = "carat", y = carat))

`geom_density` and `stat_density`

geom_density constructs smooth density distributions from continuous vectors. The actual smoothed densities are calculated by its main pairing function (stat_density). The following lists describe the mapping aesthetic properties associated with geom_density and stat_density. Compulsory entries are marked. Note that density plots are usually specified via the geom_density function, which engages stat_density. Therefore, confusingly, when calling geom_density, the compulsory parameters are actually those required by stat_density (unless you indicate to use stat_identity).

`geom_density`
- x - x axis value (continuous) (compulsory)
- y - y axis value (densities) (compulsory)
- alpha - transparency
- colour - colour of the lines
- fill - colour of the density
- linetype - type of lines used to construct density
- size - thickness of the lines
- weight - weightings of values

`stat_density`
- x - a continuous vector from which to create density distribution (compulsory)
- fill - fill colour
- y

Feature	geom	stat	position	Notes, additional parameters	Example
density	`_density`	`_density`	identity	Density plot of a distribution of a vector `adjust` - smoothness `kernel` - kernel used for the density estimate `trim` - whether to trim densities to data range
ggplot(diamonds) + geom_density(aes(x = carat))

`geom_point`

geom_point draws points (scatterplot). Typically the stat used is stat_identity as we wish to use the values in two continuous vectors as the coordinates of each point. The following list describes the mapping aesthetic properties associated with geom_point. Compulsory entries are marked.

`geom_point`
- x - x axis value (continuous) (compulsory)
- y - y axis value (continuous) (compulsory)
- alpha - transparency
- colour - colour of the points
- fill - fill colour of the points (for shapes that have a fill)
- shape - symbol shape
- size - symbol size

Note, it is possible to combine geom_point with other stats (such as stat_summary), so as to plot summaries of the data rather than raw data.

Feature	geom	stat	position	Notes, additional parameters
point	`_point`	`_identity`	identity	Scatterplot
ggplot(BOD) + geom_point(aes(x = Time, y = demand))
means point	`_point`	`_summary`	identity	Means plot
ggplot(CO2) + geom_point(aes(x = conc, y = uptake), stat = "summary", fun.y = mean)
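To confirm what the summary stat in the means plot actually computes, the summarised values can be compared against a manual calculation. This sketch assumes a small helper (mean.df, a hypothetical name) passed via fun.data, since that argument is accepted by both older and newer ggplot2 releases.

```r
library(ggplot2)

# A summary function that returns a one-row data frame with a y column
# (mean.df is a helper defined here purely for illustration)
mean.df <- function(v) data.frame(y = mean(v))

p <- ggplot(CO2) +
  geom_point(aes(x = conc, y = uptake), stat = "summary", fun.data = mean.df)

# Summarised data: one row per distinct value of conc
d <- layer_data(p, 1)
d <- d[order(d$x), ]

# The same means calculated by hand
manual <- tapply(CO2$uptake, CO2$conc, mean)
all.equal(d$y, as.numeric(manual))
```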

`geom_line`

geom_line draws lines joining coordinates. Typically the stat used is stat_identity as we wish to use the values in two continuous vectors as the coordinates of each line segment. The following list describes the mapping aesthetic properties associated with geom_line. Compulsory entries are marked.

`geom_line`
- x - x axis value (continuous) (compulsory)
- y - y axis value (continuous) (compulsory)
- alpha - transparency
- colour - colour of the lines
- linetype - type of lines used to construct line
- size - thickness of the line

Note, it is possible to combine geom_line with other stats (such as stat_summary), so as to plot summaries of the data rather than raw data.

Feature	geom	stat	position	Notes, additional parameters
line	`_line`	`_identity`	identity	Line plot
ggplot(BOD) + geom_line(aes(x = Time, y = demand))
means line	`_line`	`_summary`	identity	Means line plot
ggplot(CO2) + geom_line(aes(x = conc, y = uptake), stat = "summary", fun.y = mean)

`geom_smooth` and `stat_smooth`

geom_smooth draws smooth lines (and 95% confidence intervals) through data clouds. Typically the stat used is stat_smooth, which in turn engages one of the available smoothing methods (e.g. lm, glm, gam, loess or rlm). The following lists describe the mapping aesthetic properties associated with geom_smooth and stat_smooth. Compulsory entries are marked.

`geom_smooth`
- x - x axis value (continuous) (compulsory)
- y - y axis value (continuous) (compulsory)
- alpha - transparency
- colour - colour of the smooth line
- fill - colour of the confidence band
- linetype - type of line used to construct smooth
- size - thickness of the smooth line
- weight - for weighting data

`stat_smooth`
- x - x axis value (continuous) (compulsory)
- y - y axis value (continuous) (compulsory)

stat_smooth also has the following optional arguments:

method - the smoothing method (function). One of "lm", "glm", "gam", "loess" or "rlm"
formula - the formula for the smoothing function, expressed relative to x and y. E.g. "y~x", "y~s(x)"
se - whether to display confidence intervals
fullrange - whether the fit should span the full range of the plot rather than just the range of the data
level - confidence level (e.g. 0.95)
n - number of points to evaluate smoother at
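As a quick illustration of how these arguments fit together, the sketch below fits a linear smoother without a confidence band and checks that the line ggplot draws is the same one lm() would predict. The choice of n = 20 is arbitrary.

```r
library(ggplot2)

p <- ggplot(CO2, aes(x = conc, y = uptake)) +
  geom_point() +
  # linear fit, no confidence band, evaluated at 20 points
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, n = 20)

# Data computed for the smoother (layer 2)
sm <- layer_data(p, 2)

# The same model fitted directly, predicted at the same x values
fit <- lm(uptake ~ conc, data = CO2)
manual <- predict(fit, newdata = data.frame(conc = sm$x))

all.equal(as.numeric(sm$y), as.numeric(manual))
```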

Feature	geom	stat	position	Notes, additional parameters
smooth	`_smooth`	`_smooth`	identity	Linear smoother
ggplot(CO2) + geom_smooth(aes(x = conc, y = uptake), method = "lm")
Loess smoother	`_smooth`	`_smooth`	identity	Loess smoother
ggplot(CO2) + geom_smooth(aes(x = conc, y = uptake), method = "loess")
Gam smoother	`_smooth`	`_smooth`	identity	Cubic regression spline smoother
library(mgcv)
ggplot(CO2) + geom_smooth(aes(x = conc, y = uptake), method = "gam", formula = y ~ s(x, bs = "cr", k = 4))

`geom_tile` and `geom_raster`

geom_tile constructs heat maps given x,y coordinates and a z value to associate with the fill of each tile. Similarly, geom_raster generates heat maps; it is a special case of geom_tile for when all the tiles are the same size, which makes it considerably faster on large grids, and it can optionally interpolate (smooth) between cells. Both are typically coupled with stat_identity. The following list describes the mapping aesthetic properties associated with geom_tile and geom_raster. Compulsory entries are marked.

`geom_tile` and `geom_raster`
- x - x axis value (continuous) (compulsory)
- y - y axis value (continuous) (compulsory)
- alpha - transparency
- colour - colour of the borders around tiles (`geom_tile` only)
- fill - colour of the fill of tiles
- linetype - type of lines to use as borders to each tile (`geom_tile` only)
- size - line thickness (`geom_tile` only)
- interpolate - perform interpolation (an argument rather than an aesthetic, only for `geom_raster`)

Feature	geom	stat	position	Notes, additional parameters
tile	`_tile`	`_identity`	identity	Heat map
volcano.df <- reshape2::melt(volcano, varnames = c("X", "Y"))
ggplot(volcano.df) + geom_tile(aes(x = X, y = Y, fill = value))
raster	`_raster`	`_identity`	identity	Heat map
volcano.df <- reshape2::melt(volcano, varnames = c("X", "Y"))
ggplot(volcano.df) + geom_raster(aes(x = X, y = Y, fill = value))
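If reshape2 is not available, the matrix can be converted to long format with base R alone. The following is one possible sketch of that conversion, not the approach used for the graphics above.

```r
library(ggplot2)

# Base R alternative to reshape2::melt(): one row per cell of the matrix.
# as.vector() unrolls the matrix column by column, which matches expand.grid(),
# whose first factor (X, the row index) varies fastest.
volcano.df <- expand.grid(X = seq_len(nrow(volcano)), Y = seq_len(ncol(volcano)))
volcano.df$value <- as.vector(volcano)

# Spot check: a cell in the long format matches the original matrix
volcano.df$value[volcano.df$X == 3 & volcano.df$Y == 5] == volcano[3, 5]

p <- ggplot(volcano.df) + geom_raster(aes(x = X, y = Y, fill = value))
```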

`geom_contour` and `stat_contour`

geom_contour constructs contour maps given x,y coordinates and a z value from which to calculate each contour. The following lists describe the mapping aesthetic properties associated with geom_contour and stat_contour. Compulsory entries are marked.

`geom_contour`
- x - x axis value (continuous) (compulsory)
- y - y axis value (continuous) (compulsory)
- alpha - transparency
- colour - colour of the contour lines
- linetype - line type of the contour lines
- size - line thickness
- weight

`stat_contour`
- x - x axis value (continuous) (compulsory)
- y - y axis value (continuous) (compulsory)
- z - z axis value (continuous) (compulsory)
- order

Feature	geom	stat	position	Notes, additional parameters	Example
contour	`_contour`	`_contour`	identity	Contour map
volcano.df <- reshape2::melt(volcano, varnames = c("X", "Y"))
ggplot(volcano.df) + geom_contour(aes(x = X, y = Y, z = value))

Secondary geometric objects

geom_segment	geom_ribbon	geom_errorbar	geom_hline	geom_vline	geom_pointrange
geom_rug	geom_text

`geom_segment`

geom_segment draws segments joining coordinates. The following list describes the mapping aesthetic properties associated with geom_segment. Compulsory entries are marked.

`geom_segment`
- x - x coordinates for the start of lines (compulsory)
- xend - x coordinates for the end of lines (compulsory)
- y - y coordinates for the start of lines (compulsory)
- yend - y coordinates for the end of lines (compulsory)
- alpha - transparency
- colour - colour of the segments
- linetype - type of lines used to construct segment
- size - thickness of the segments

geom_segment also has the following optional arguments:

arrow - specification of how arrows should be constructed
lineend - style of the line end

Feature	geom	stat	position	Notes, additional parameters
segment	`_segment`	`_identity`	identity	Segments on a plot - useful for drawing lots of lines or arrows
BOD.lm <- lm(demand ~ Time, data = BOD)
BOD$fitted <- fitted(BOD.lm)
BOD$resid <- resid(BOD.lm)
ggplot(BOD) + geom_segment(aes(x = Time, y = demand, xend = Time, yend = fitted))
segment	`_segment`	`_identity`	identity	Segments with arrow heads
BOD.lm <- lm(demand ~ Time, data = BOD)
BOD$fitted <- fitted(BOD.lm)
BOD$resid <- resid(BOD.lm)
ggplot(BOD) + geom_segment(aes(x = Time, y = demand, xend = Time, yend = fitted), arrow = arrow(length = unit(0.5, "cm")))

`geom_ribbon`

geom_ribbon draws ribbons based on upper and lower levels of y associated with each level of x. The following list describes the mapping aesthetic properties associated with geom_ribbon. Compulsory entries are marked.

`geom_ribbon`
- x - x coordinates (compulsory)
- ymin - y coordinates of the lower limits (compulsory)
- ymax - y coordinates of the upper limits (compulsory)
- alpha - transparency
- colour - colour of the lines used to border the ribbon
- fill - colour of the ribbon
- linetype - type of lines used to construct the borders of the ribbon
- size - thickness of the lines used to border the ribbon

Feature	geom	stat	position	Notes, additional parameters
ribbon	`_ribbon`	`_identity`	identity	Ribbons on a plot - useful for depicting confidence envelopes
BOD.lm <- lm(demand ~ Time, data = BOD)
xs <- seq(min(BOD$Time), max(BOD$Time), length.out = 100)
pred <- data.frame(predict(BOD.lm, newdata = data.frame(Time = xs),
    interval = "confidence"))
pred$x <- xs
ggplot(pred) + geom_ribbon(aes(x = x, ymin = lwr, ymax = upr))

`geom_errorbar`

geom_errorbar draws error bars based on upper and lower levels of y associated with each level of x. The following list describes the mapping aesthetic properties associated with geom_errorbar. Compulsory entries are marked.

`geom_errorbar`
- x - x coordinates (compulsory)
- ymin - y coordinates of the lower limits (compulsory)
- ymax - y coordinates of the upper limits (compulsory)
- alpha - transparency
- colour - colour of the error bars
- linetype - type of lines used to construct the error bars
- size - thickness of the lines used to construct the error bars
- width - width of the error bars

Feature	geom	stat	position	Notes, additional parameters
errorbar	`_errorbar`	`_identity`	identity	Error bars from pre-calculated limits - useful for adding to means plots etc (mean_cl_boot requires the Hmisc package to be installed)
warpbreaks.df <- warpbreaks %>% group_by(wool) %>% do({ mean_cl_boot(.$breaks) })
ggplot(warpbreaks.df) + geom_errorbar(aes(x = wool, ymin = ymin, ymax = ymax))
errorbar	`_errorbar`	`_summary`	identity	Error bars calculated by the summary stat - useful for adding to means plots etc
ggplot(warpbreaks) + geom_errorbar(aes(x = wool, y = breaks), stat = "summary", fun.data = "mean_cl_boot")
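Where Hmisc is not installed, the limits can be pre-calculated with base R and passed to geom_errorbar through stat_identity. The sketch below uses mean plus or minus one standard error, which is a simpler interval than the bootstrapped limits used above; the names se and warp.sum are hypothetical.

```r
library(ggplot2)

# Standard error of the mean (a simple choice for illustration;
# the examples above use bootstrapped limits from mean_cl_boot instead)
se <- function(v) sd(v) / sqrt(length(v))

# One row per wool type with the pre-calculated mean and limits
warp.sum <- data.frame(
  wool = levels(warpbreaks$wool),
  mean = as.numeric(tapply(warpbreaks$breaks, warpbreaks$wool, mean)),
  se   = as.numeric(tapply(warpbreaks$breaks, warpbreaks$wool, se))
)
warp.sum$ymin <- warp.sum$mean - warp.sum$se
warp.sum$ymax <- warp.sum$mean + warp.sum$se

# Error bars (layer 1) with the means added as points (layer 2)
p <- ggplot(warp.sum, aes(x = wool)) +
  geom_errorbar(aes(ymin = ymin, ymax = ymax), width = 0.2) +
  geom_point(aes(y = mean))
```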

`geom_hline` and `geom_vline`

geom_hline draws a horizontal line based on a yintercept, and geom_vline draws a vertical line based on an xintercept. The following list describes the mapping aesthetic properties associated with geom_hline and geom_vline. Compulsory entries are marked.

`geom_hline` and `geom_vline`
- yintercept or xintercept - y/x coordinate(s) (compulsory)
- alpha - transparency
- colour - colour of the line(s)
- linetype - type of line(s)
- size - thickness of the line(s)

Feature	geom	stat	position	Notes, additional parameters
hline	`_hline`	`_identity`	identity	Horizontal line(s) on a plot
ggplot(warpbreaks) + geom_point(aes(y = breaks, x = wool)) + geom_hline(yintercept = 30)
vline	`_vline`	`_identity`	identity	Vertical line(s) on a plot
ggplot(BOD) + geom_point(aes(y = demand, x = Time)) + geom_vline(xintercept = 4, linetype = "dashed")

`geom_pointrange` and `geom_linerange`

geom_linerange draws vertical lines representing intervals similar to error bars, yet without the horizontal ends. In addition to the interval, geom_pointrange draws a point. The following list describes the mapping aesthetic properties associated with geom_pointrange and geom_linerange. Compulsory entries are marked.

`geom_pointrange` and `geom_linerange`
- x - x coordinate(s) (compulsory)
- y - y coordinate(s) of the points (compulsory for `geom_pointrange`)
- ymin - minimum values for the intervals (compulsory)
- ymax - maximum values for the intervals (compulsory)
- alpha - transparency
- colour - colour of the line(s)
- linetype - type of line(s)
- size - thickness of the line(s)

Feature	geom	stat	position	Notes, additional parameters
pointrange	`_pointrange`	`_identity`	identity	Points with intervals from pre-calculated limits (mean_cl_boot requires the Hmisc package to be installed)
warpbreaks.df <- warpbreaks %>% group_by(wool) %>% do({ mean_cl_boot(.$breaks) })
ggplot(warpbreaks.df) + geom_pointrange(aes(x = wool, y = y, ymin = ymin, ymax = ymax))
pointrange	`_pointrange`	`_summary`	identity	Points with intervals calculated by the summary stat - useful for means plots etc
ggplot(warpbreaks) + geom_pointrange(aes(x = wool, y = breaks), stat = "summary", fun.data = "mean_cl_boot")

`geom_rug`

geom_rug draws small marks along an axis to mark the presence of observations. The following list describes the mapping aesthetic properties associated with geom_rug. Both x and y are optional.

`geom_rug`
- x - positions of the marks along the x axis
- y - positions of the marks along the y axis
- alpha - transparency
- colour - colour of the marks
- linetype - type of lines used to construct the marks
- size - thickness of the marks

geom_rug also has the following optional arguments:

sides - indicating which axes the rug should be drawn on ('t','b','l','r','bl','tblr',etc)

Feature	geom	stat	position	Notes, additional parameters
rug	`_rug`	`_identity`	identity	Rug on a plot - useful for depicting the location of observations
ggplot(BOD, aes(x = Time, y = demand)) + geom_point() + geom_rug(sides = "tblr")

`geom_text`

geom_text adds text at given coordinates. The following list describes the mapping aesthetic properties associated with geom_text. Compulsory entries are marked.

`geom_text`
- label - string of text annotations (compulsory)
- x,y - coordinates (compulsory)
- alpha - transparency
- angle - rotation angle
- colour - colour of the text
- family - font family
- fontface - bold, italic etc
- hjust - horizontal justification
- lineheight - spacing between lines of text
- size - font size
- vjust - vertical justification

Feature	geom	stat	position	Notes, additional parameters
text	`_text`	`_identity`	identity	Text on a plot - useful for labelling observations at their locations
ggplot(CO2, aes(x = conc, y = uptake)) + geom_text(aes(label = Treatment))
text	`_text`	`_identity`	identity	Text on a plot - labels abbreviated to their first (capitalised) letter
ggplot(CO2, aes(x = conc, y = uptake)) + geom_text(aes(label = toupper(substr(Treatment, 1, 1))))

`geom_label`

geom_label adds text surrounded by a box at given coordinates. The following list describes the mapping aesthetic properties associated with geom_label. Compulsory entries are marked.

`geom_label`
- label - string of text annotations (compulsory)
- x,y - coordinates (compulsory)
- alpha - transparency
- angle - rotation angle
- colour - colour of the text and box border
- fill - background colour of the box
- family - font family
- fontface - bold, italic etc
- hjust - horizontal justification
- lineheight - spacing between lines of text
- size - font size
- vjust - vertical justification

Feature	geom	stat	position	Notes, additional parameters
label	`_label`	`_identity`	identity	Label on a plot - useful for labelling observations at their locations
ggplot(CO2, aes(x = conc, y = uptake)) + geom_label(aes(label = Treatment))
label	`_label`	`_identity`	identity	Labels for group means, combined with points and arrows
iris.sum <- iris %>%
  group_by(Species) %>%
  summarize_at(vars(Sepal.Length, Sepal.Width), mean) %>%
  ungroup() %>%
  mutate_at(vars(Sepal.Length, Sepal.Width), list(m = mean))
ggplot(iris, aes(y = Sepal.Length, x = Sepal.Width)) +
  geom_point(aes(color = Species)) +
  geom_segment(data = iris.sum, aes(yend = Sepal.Length_m, xend = Sepal.Width_m),
    arrow = arrow(ends = "first", length = unit(0.5, "lines"))) +
  geom_label(data = iris.sum, aes(label = Species, fill = Species),
    vjust = "outward", hjust = "outward", alpha = 0.5)

Coordinate system - coord

The coordinate system controls the nature and scale of the axes.

System	Parameters	Example
Regular cartesian coordinate system `coord_cartesian`	`xlim` - x limits `ylim` - y limits
ggplot(BOD) + coord_cartesian()+geom_line(aes(y=demand,x=Time))
Polar coordinate system `coord_polar`	`theta="x"` - angle variable `start=0` - initial angle from 12 o'clock `direction=1` - 1=clockwise, -1=anticlockwise
ggplot(BOD) + coord_polar()+geom_line(aes(y=demand,x=Time))
Flipped axes `coord_flip`	`xlim=NULL` - y limits `ylim=NULL` - x limits
ggplot(BOD) + coord_flip()+geom_line(aes(y=demand,x=Time))
Fix the ratio of axes dimensions `coord_fixed`	`ratio=1` - y/x ratio `xlim=NULL` - x limits `ylim=NULL` - y limits
ggplot(BOD) + coord_fixed(ratio=0.25)+geom_line(aes(y=demand,x=Time))
1:1 (equal) ratio of axes dimensions, same as `coord_fixed(ratio=1)` `coord_equal`	`xlim=NULL` - x limits `ylim=NULL` - y limits
ggplot(BOD) + coord_equal()+geom_line(aes(y=demand,x=Time))
Map projection coordinate system `coord_map`	`projection="mercator"` - mapping projection `orientation=c(90,0,mean(range(x)))` - map orientation
#get high resolution map of Australia (and islands) data
library(maps)
library(mapdata)
aus <- map_data("worldHires", region="Australia")
#Orthographic coordinates
ggplot(aus, aes(x=long, y=lat, group=group)) +
  coord_map("ortho", orientation=c(-20,125,23.5))+geom_polygon()

Altering the axes scales via the coordinate system

Modifying scales with coords affects the zoom on the graph. That is, it defines the extent and nature of the axes coordinates. By contrast, altering limits via scale_ routines will alter the scope of data included in a manner analogous to operating on a subset of the data.

Default scale

Scale via coord_ (Zoom)

Scale via scale_

# Default scales
ggplot(BOD,aes(y=demand,x=Time)) +
geom_point()+geom_smooth(method="lm")

# Zoom on x-axis
ggplot(BOD,aes(y=demand,x=Time)) +
coord_cartesian(xlim=c(2,6))+
geom_point()+geom_smooth(method="lm")

# Scale (subset) the x data
ggplot(BOD,aes(y=demand,x=Time)) +
scale_x_continuous(limits=c(2,6))+
geom_point()+geom_smooth(method="lm")
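
The "subset" analogy can be checked outside of ggplot. The following is a minimal sketch in base R (it is not part of the plotting code above, and the object names are arbitrary). The trend fitted to all of the BOD data is the one that remains visible when zooming with coord_cartesian, whereas the trend fitted only to observations with Time between 2 and 6 mirrors what is drawn when the limits are set via scale_x_continuous.

```r
# Trend fitted to all observations: zooming with coord_cartesian()
# shows this same line through a narrower window
fit_all <- lm(demand ~ Time, data = BOD)

# Trend fitted only to the observations inside the limits: this mirrors
# setting the limits via scale_x_continuous()
BOD_sub <- subset(BOD, Time >= 2 & Time <= 6)
fit_sub <- lm(demand ~ Time, data = BOD_sub)

# The coefficients differ because the data entering the fit differ
coef(fit_all)
coef(fit_sub)

# Number of observations used in each case
nrow(BOD)
nrow(BOD_sub)
```

Comparing the two sets of coefficients makes it clear why the smoother changes shape in the scale_ version of the plot but not in the coord_ version.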

In addition to altering the zoom of the axes, axes (coordinate system) scales can be transformed to other scales via the coord_trans function. Transformations of the coordinate system take place after statistics have been calculated and geoms derived. Therefore the shape of the geoms is altered.

The coord_trans() function has the following arguments:

x: a transformer that will operate on the x scale
y: a transformer that will operate on the y scale
limx: limits of the x axis
limy: limits of the y axis

A transformer is a function that defines a transformation along with its inverse and rules on how to generate pretty breaks (tick marks) and their labels.
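
As a small illustration, the components of an existing transformer can be inspected directly. This sketch assumes the scales package is installed and uses its log10_trans() constructor. The transform and inverse elements are ordinary functions that undo one another.

```r
library(scales)

# Construct a ready-made transformer and look inside it
tr <- log10_trans()
tr$name

# The forward transformation and its inverse
raw <- c(1, 10, 100)
transformed <- tr$transform(raw)
transformed
tr$inverse(transformed)
```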

To illustrate the distinction between coord_trans and scale_, we will generate some curvilinear data.

set.seed(1)
n<-50
dat <- data.frame(x = exp((1:n+rnorm(n,sd=2))/10), y = 1:n+rnorm(n,sd=2))

Linear scales	`coord_trans`	`scale_`
Linear spacing of axis ticks	Log₁₀ spacing of axis ticks on a linear scale	Linear spacing of axis ticks on a log₁₀ scale
ggplot(dat, aes(y=y,x=x)) + geom_point()
ggplot(dat, aes(y=y,x=x)) + geom_point()+coord_trans(x=log10_trans())
ggplot(dat, aes(y=y,x=x)) + geom_point()+scale_x_continuous(trans=log10_trans())
Linear trend applied to curved data	Linear trend applied to curved data, then bent by coordinates rescaling	Linear trend applied to scaled (linear) data
ggplot(dat, aes(y=y,x=x)) + geom_point()+geom_smooth(method="lm")
ggplot(dat, aes(y=y,x=x)) + geom_point()+geom_smooth(method="lm")+coord_trans(x=log10_trans())
ggplot(dat, aes(y=y,x=x)) + geom_point()+geom_smooth(method="lm")+scale_x_continuous(trans=log10_trans())
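
The difference between the two approaches comes down to which values the smoother is fitted to. The following base R sketch (not ggplot code; fit_raw and fit_log are arbitrary names) regenerates the same artificial data and fits the two linear models that correspond, as I understand it, to the coord_trans and scale_ versions of the trend.

```r
set.seed(1)
n <- 50
dat <- data.frame(x = exp((1:n + rnorm(n, sd = 2))/10), y = 1:n + rnorm(n, sd = 2))

# coord_trans: the trend is fitted on the raw x values and the
# resulting straight line is only bent afterwards
fit_raw <- lm(y ~ x, data = dat)

# scale_x_continuous(trans=...): x is transformed first and the
# trend is then fitted to the transformed values
fit_log <- lm(y ~ log10(x), data = dat)

# Proportion of variance explained by each fit
summary(fit_raw)$r.squared
summary(fit_log)$r.squared
```

The model fitted to the log₁₀ values describes these data better, which is why the trend line in the scale_ panel follows the points more closely.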

More details of transformers

Transformers

`trans_new`

The trans_new function itself defines and returns a list structure comprising:

name
transform
inverse
breaks
format
domain

To illustrate the trans_new function, let's define a natural log (ln) transformer to apply to our artificial data.

ggplot(dat, aes(y=y,x=x)) + geom_point() +
  geom_smooth(method="lm") +
  scale_x_continuous(trans=trans_new(name="ln",
                                transform=function(x) log(x),
                                inverse=function(x) exp(x),
                                breaks=function(x) pretty(x),
                                domain=c(1e-100,Inf)))

ln_trans <- function() {
    name <- "ln"
    trans <- function(x) log(x)
    inv <- function(x) exp(x)
    breaks <- function(x) pretty(x)
    format <- function(x) x
    domain <- c(1e-100,Inf)
    trans_new(name,transform=trans,inverse=inv,
              breaks=breaks, format=format, domain=domain)
}

ggplot(dat, aes(y=y,x=x)) + geom_point()+
  geom_smooth(method="lm")+
  scale_x_continuous(trans=ln_trans())

coord_trans(x = "identity", y = "identity", limx = NULL, limy = NULL)

Finer control of the transformation can be exercised. Consider the following examples using the same dataset.

In order to be able to use trans_new effectively, it is necessary to understand the data passed to each of the functions within the transformer. In the following demonstration, I have placed print statements within each of the functions so as to illustrate the sequence in which the functions are called (relative to each other and other external functions) as well as the input data of each function. Note that for this demonstration I have omitted the smoother, as it would also result in calls to these functions and therefore compound the sequencing.

p <- ggplot(dat, aes(y=y,x=x)) + geom_point() + # geom_smooth(method="lm") +
  scale_x_continuous(trans=trans_new(name="",
      transform=function(x) {
          cat("**Transform begin**\n")
          print(x)
          log10(x)
      },
      inverse=function(x) {
          cat("**Inverse**\n")
          print(x)
          10^(x)
      },
      breaks=function(x) {
          cat("**Breaks**\n")
          print(x)
          pretty(x)
      },
      format=function(x) {cat("**Format**\n"); print(x); x},
      domain=c(1e-100,Inf)))
p + theme(plot.background = element_rect(fill = "transparent",colour = NA))

**Transform begin**
 [1]   0.9750264   1.2670973   1.1421064
 [4]   2.0524951   1.7610347   1.5463639
 [7]   2.2199525   2.5796789   2.7597946
[10]   2.5572241   4.0647662   3.5893411
[13]   3.2405786   2.6040294   5.6124435
[16]   4.9087203   5.4562511   7.3065210
[19]   7.8793407   8.3209819   9.8138581
[22]  10.5531803  10.1240421   7.4048091
[25]  13.7902959  13.3134427  14.4232415
[28]  12.2539646  16.5166669  21.8366186
[31]  29.1290486  24.0333501  29.2984415
[34]  29.6433848  25.1432152  33.6832637
[37]  37.3802320  44.1740415  61.5595546
[40]  63.6013788  58.3871179  63.3913657
[43]  84.7234714  91.0430653  78.4333219
[46]  86.3579587 118.2636427 141.6992119
[49] 131.3060368 177.0127113
**Inverse**
[1] -0.123933  2.360954
**Breaks**
[1]   0.7517388 229.5904709
**Transform begin**
[1]   0.7517388 229.5904709
**Transform begin**
[1]   0  50 100 150 200 250
**Inverse**
[1]       NA 1.698970 2.000000 2.176091 2.301030
[6]       NA
**Format**
[1]  NA  50 100 150 200  NA

The sequence is as follows:

The first call to the **Transform** function is passed the raw data
The first call to the **Inverse** function is passed the computed axes limits (on the log₁₀ scale). These originate in another part of the ggplot engine. Following the action of the transformation function, other functions determine the limits of the axes based on the transformed data as well as the nominated axis expansion factor (which places a buffer beyond the data such that geoms do not overlap the axes). The inverse function then converts these limits into limits on the original raw data scale.
The first call to the **Breaks** function is passed the axes limits on the scale of the raw data and defines the spacing of axes tick marks
The second call to the **Transform** function takes the axis limits and rescales them into the log₁₀ scale
The third call to the **Transform** function takes the axis tick mark spacing from **Breaks** and rescales it into the log₁₀ scale
The second call to the **Inverse** function takes the axis tick mark spacing on the log₁₀ scale and rescales it into the scale of the raw data
Finally, the **Format** function is used to define the labels to be applied to the tick marks

Axes in the scale of observations

p<-ggplot(dat, aes(y=y,x=x)) + geom_point()+geom_smooth(method="lm")+
  scale_x_continuous(trans=log10_trans(),
      breaks=trans_breaks(function(x) log10(x), function(x) 10^(x)))
p

p <-ggplot(dat, aes(y=y,x=x)) + geom_point()+geom_smooth(method="lm")+
  scale_x_continuous(trans=log10_trans(),
      breaks=trans_breaks(function(x) x, function(x) x))
p

p<-ggplot(dat, aes(y=y,x=x)) + geom_point()+geom_smooth(method="lm")+
  scale_x_continuous(trans=log10_trans(),breaks=as.vector(c(1,2,5) %o% 10^(-1:2)))
p

Axes in the scale of logarithms

p<-ggplot(dat, aes(y=y,x=x)) + geom_point()+geom_smooth(method="lm")+
  scale_x_continuous("x log",trans=log10_trans(),
      breaks=trans_breaks(function(x) log10(x), function(x) 10^(x)),
      labels=trans_format(function(x) log10(x), format=format_format()))
p

p<-ggplot(dat, aes(y=y,x=x)) + geom_point()+geom_smooth(method="lm")+
  scale_x_continuous(trans=log10_trans(),
      breaks=trans_breaks(function(x) log10(x), function(x) 10^(x)),
      labels=trans_format(function(x) log10(x), format=scientific_format()))
p

p<-ggplot(dat, aes(y=y,x=x)) + geom_point()+geom_smooth(method="lm")+
  scale_x_continuous(trans=log10_trans(),
      breaks=trans_breaks(function(x) log10(x), function(x) 10^(x)),
      labels=trans_format(function(x) log10(x), format=math_format(10^.x)))
p

`*_trans` transformers

The _trans family of transformers are convenient wrappers for the trans_new function.

Transformer	Description
`asn_trans()`	Arc-sin square-root transformation (of proportions/percentages).
`atanh_trans()`	Inverse hyperbolic tangent (atanh) transformation
`boxcox_trans(p)`	Box-Cox power transformation. When the power exponent (`p`) is equal to 0, values are logged. For exponents other than zero, values are raised to the power of the exponent, 1 is subtracted, and the result is divided by the exponent.
`date_trans`
`exp_trans`
`identity_trans`
`log10_trans`
`log1p_trans`
`log2_trans`
`log_trans`
`logit_trans`
`probability_trans`
`probit_trans`
`reciprocal_trans`
`reverse_trans`
`sqrt_trans`
`time_trans`
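
The Box-Cox rule in the table above can be written out in a few lines of base R. This is only a sketch of the arithmetic (boxcox_sketch is a hypothetical helper, not the scales implementation), intended to make the two cases explicit.

```r
# Box-Cox power transformation; p is the power exponent
boxcox_sketch <- function(x, p) {
    if (p == 0) log(x) else (x^p - 1)/p
}

x <- c(1, 4, 9)
boxcox_sketch(x, p = 0)      # identical to log(x)
boxcox_sketch(x, p = 0.5)    # (sqrt(x) - 1)/0.5
boxcox_sketch(x, p = 1)      # x - 1
```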

Transform axes scale (logs)

# log10 axes scales
ggplot(BOD) +
    coord_trans(x="log10",y="log10")+geom_line(aes(y=demand,x=Time))

Scales

The idea of scales is that you present the plotting engine with data or characteristics in one scale and use the various scale_ functions to convert those data into another scale. In the grammar of graphics, scales are synonymous with the units of data, colours, shapes, sizes etc of plotting features, and the axes and guides (legends) provide a visual cue for what the scales are. For example:

you might include data that ranges from 10 to 20 units, yet you wish to produce a plot that zooms in on the range 12-16.
you have presented grouped data (data with multiple trends) and instructed the graphing engine to assign different colour codes to each trend. You can then define a colour scale to adjust the exact colours rendered.
similarly, you might have indicated how plotting symbol shape and size are to be distinguished in your data set. You can then assign scales that define the exact shapes and symbol sizes rendered.

Technically, scales determine how attributes of the data are mapped into aesthetic geom properties. The majority of geoms (geometric objects) have the following aesthetic properties:

x - the x position (coordinates) of the geom
y - the y position (coordinates) of the geom
size - the size of the geom (e.g. the size of a point)
shape - the shape of the geom
linetype - the type of line associated with the geom's outline (solid, dashed etc)
colour - the colour of the geom's outline (note the English spelling of the word colour)
fill - the colour of the geom's fill
alpha - the transparency of the geom (0=transparent, through to 1=opaque)

In turn, each of these properties is mapped to a scale, the default of which is automatically selected according to what is appropriate for the sort of data. For example, data can be on a continuous or discrete (categorical) scale. Most data types have the following possible scales for each of the above properties:

_continuous - when you want the scale increments (such as the different point sizes, colours etc) to be determined from a continuous vector in your data frame.
_discrete - when you want the scale increments (such as the different point sizes colours etc) to be determined from a categorical vector in your data frame.
_manual - is a variation on _discrete and is used when you wish to manually indicate the characteristic of each increment. You need to provide as many values as there are levels of your discrete vector.
_identity - is another variation on _discrete and is used when you wish for the values in your categorical vector to be used un-scaled as the characteristics of the data. For example, your data frame might contain a vector of colour names or point sizes.
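
As a brief illustration of the _manual variant, the sketch below supplies one colour per level of the Treatment factor in the CO2 data. The colours are arbitrary choices, and naming the values by level is simply one way of making the pairing explicit.

```r
library(ggplot2)

# One value per level of the discrete vector, named by level
treatment_cols <- c(nonchilled = "orange", chilled = "blue")

p <- ggplot(CO2, aes(y = uptake, x = conc)) +
    geom_point(aes(colour = Treatment)) +
    scale_colour_manual(name = "Treatment", values = treatment_cols)
p
```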

Some properties, such as colour, also have additional scales that are specific to the characteristic. The scales affect not only the characteristics of the geoms, they also affect the guides (legends) that accompany the geoms.

Scaling functions comprise the prefix scale_, followed by the name of an aesthetic property and suffixed by the type of scale. Hence a function to manually define a colour scale would be scale_colour_manual.

All scales have the following arguments available:

name - a title applied to the scale. In the case of scales for x and y (the x,y coordinates of geoms), the name is the axis title. For all other scales, the name is the title of the guide (legend).
breaks - the increments on the guide. For scale_x_ and scale_y_, breaks are the axis tick locations. For all other scales, the breaks indicate the increments of the characteristic in the legend (e.g. how many point shapes are featured in the legend).
labels - the labels given to the increments on the guide. For scale_x_ and scale_y_, labels are the axis tick labels. For all other scales, the labels are the labels given to items in the legend.
limits - the span/range of data represented in the scale. Note, if the range is inside the range of the data, the data are sub-setted.
trans - scale transformations applied - obviously this is only relevant to scales that are associated with continuous data.
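
The common arguments can be seen working together in a single call. The following sketch applies them to the x axis of the CO2 data; the break positions and labels are arbitrary values chosen only for illustration.

```r
library(ggplot2)

x_breaks <- c(250, 500, 750)
x_labels <- c("low", "mid", "high")

p <- ggplot(CO2, aes(y = uptake, x = conc)) +
    geom_point() +
    scale_x_continuous(name = "CO2 concentration",  # axis title
                       breaks = x_breaks,            # tick locations
                       labels = x_labels,            # tick labels
                       limits = c(0, 1000))          # range of the scale
p
```

Because the limits span the full range of conc, no observations are excluded in this case.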

Scaling the x and y values (`scale_x_`)

The scale_x_ and scale_y_ scales control the x and y axes and, in addition to the common arguments listed above, the following optional arguments are available for specific scales:

expand - a vector of length two that indicates multiplicative and additive constants used to expand the axes away from the data thereby ensuring that geoms do not intersect with the axes.
minor_breaks - the increments for the minor breaks along the axis. The minor breaks have a grid line yet no tick marks or labels.
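
The arithmetic behind expand can be sketched in base R. The helper below (expanded_limits is a hypothetical name) pads each end of the axis by the multiplicative constant times the data range plus the additive constant, which is my understanding of how the two constants are combined.

```r
# Pad a pair of axis limits by a multiplicative and an additive constant
expanded_limits <- function(limits, mult = 0, add = 0) {
    pad <- mult * diff(limits) + add
    c(limits[1] - pad, limits[2] + pad)
}

conc_range <- range(CO2$conc)
conc_range

# Purely additive expansion, as in expand=c(0,200)
expanded_limits(conc_range, mult = 0, add = 200)

# Purely multiplicative expansion (5% of the range at each end)
expanded_limits(conc_range, mult = 0.05, add = 0)
```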

`scale_x_continuous`	`scale_x_continuous`	`scale_x_continuous`
linear scaling	linear with nice title	linear with more space

#Linear axes scales with altered axis title
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point()+
    scale_x_continuous(name="CO2 conc")
#Linear axes scales with more complex title
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point()+
    scale_x_continuous(name=expression(paste("Ambient ",CO[2]," concentration (mg/l)", sep="")))
#Linear axes scales with more space along the x axis
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point()+
    scale_x_continuous(name="CO2 conc", expand=c(0,200))
`scale_x_log10`	`scale_x_sqrt`	`scale_x_reverse`
Log10 scale	Square-root scale	Reverse scale
Shortcut for `scale_x_continuous(trans= log10_trans())`	Shortcut for `scale_x_continuous(trans= sqrt_trans())`	Shortcut for `scale_x_continuous(trans= reverse_trans())`
# log10 axes scales
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point()+
    scale_x_log10(name="CO2 conc", breaks=as.vector(c(1,2,5,10) %o% 10^(-1:2)))
# square-root transformation
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point()+
    scale_x_sqrt(name="CO2 conc")
# reverse the data
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point()+
    scale_x_reverse(name="CO2 conc")
`scale_x_date`	`scale_x_datetime`	`scale_x_discrete`
For more info on date breaks see `date_breaks` For more date formats see `strptime`	For more info on date breaks see `date_breaks` For more date formats see `strptime`
# Date format
library(scales)
CO2$Date <- as.Date(paste(2000+as.numeric(as.factor(CO2$conc)), "-01-01", sep=""))
ggplot(CO2, aes(y=uptake,x=Date)) + geom_point()+
    scale_x_date(name="Year", date_breaks="2 years", date_minor_breaks="6 month",
                 labels=date_format("%Y"))
# POSIX format
library(scales)
CO2$DateTime <- as.POSIXct(paste(2000, "-0",as.numeric(as.factor(CO2$conc)),"-01 09:00:00", sep=""))
ggplot(CO2, aes(y=uptake,x=DateTime)) + geom_point()+
    scale_x_datetime(name="Time (days)", date_breaks="2 months", date_minor_breaks="1 months",
                     labels=date_format("%b"))
# categorical axis
ggplot(CO2, aes(y=uptake,x=Treatment)) + geom_point()+
    scale_x_discrete(name="Treatment")

Scaling the size of geoms (`scale_size_`)

The scale_size_ scales control the size of geoms (such as the size of points) and, in addition to the common scale arguments, the following optional arguments are available:

range - the minimum and maximum size
values - the specific sizes to use (for _manual scale)
guide - whether to include a guide and what sort of guide to include (e.g. "legend")

`scale_size_continuous`	`scale_size_discrete`	`scale_size_manual`
Scale the geoms according to a continuous vector	Scale the geoms according to a categorical vector	Manually determine the size of geoms
`range` - minimum and maximum geom size		`values` - a set of values to use for sizes
# size determined by continuous covariate
set.seed(123)
CO2$cv<- runif(nrow(CO2),10,50)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(size=cv))+
    scale_size_continuous(name="Temperature")
# Discrete sizes ranging in size from 2 to 4
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(size=Type))+
    scale_size_discrete(name="Type", range=c(2,4))
# Manual sizes of exactly 2 and 4
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(size=Type))+
    scale_size_manual(name="Type", values=c(2,4))
`scale_size_identity`	`scale_size`	`scale_size_area`
Size the geoms according to the values of a continuous vector (don't scale)	Size geoms according to the values of a continuous vector (with legend)	Size geoms (area) according to the values of a continuous vector (with legend)
`guide` - whether to include a guide (legend)	`guide` - whether to include a guide (legend)	`guide` - whether to include a guide (legend)
# Sizes provided by a covariate
set.seed(123)
CO2$Count <- runif(nrow(CO2),0,10)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(size=Count))+
    scale_size_identity(name="Type")
# Sizes provided by a covariate
set.seed(123)
CO2$Count <- runif(nrow(CO2),0,10)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(size=Count))+
    scale_size(name="Type", guide='legend')
# Sizes provided by a covariate
set.seed(123)
CO2$Count <- runif(nrow(CO2),0,10)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(size=Count))+
    scale_size_area(name="Type", guide='legend')

Scaling the shape of geoms (`scale_shape_`)

The scale_shape_ scales control the shape of geoms (such as the shape of the plotting point) and, in addition to all of the regular arguments, the following optional arguments are available:

solid - whether the shapes should be solid (TRUE) or outlined (FALSE)

`scale_shape_discrete`	`scale_shape_manual`	`scale_shape_identity`
Geom shapes determined (scaled) by categorical variable	Geom shapes determined (scaled) manually	Geom shapes determined by categorical variable (no scaling)
	`values` - a set of values (or shape names) to use for shapes

# Discrete shapes determined by the combination of Type and Treatment
# The items in the guide are then rearranged and re-labelled
CO2$Comb <- interaction(CO2$Type,CO2$Treatment)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(shape=Comb))+
  scale_shape_discrete(name="Type", breaks=c("Quebec.nonchilled","Quebec.chilled","Mississippi.nonchilled","Mississippi.chilled"),
  labels=c("Quebec non-chilled","Quebec chilled","Miss. non-chilled","Miss. chilled"))

# Manual shapes
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(shape=Treatment), size=2)+
  scale_shape_manual(name="Treatment", values=c(16,21))

# Identity shapes
set.seed(123)
CO2$Count <- cut(runif(nrow(CO2),0,10), breaks=5)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(shape=Count))+
  scale_shape(name="Species", guide="legend")

Scaling the linetype associated with geoms (`scale_linetype_`)

The scale_linetype_ scales control the type of lines used in geoms and have the following additional optional arguments available:

values - values supplied to manually determine the line types

`scale_linetype_discrete`	`scale_linetype_manual`	`scale_linetype_identity`
Geom linetypes determined (scaled) by categorical variable	Geom linetypes determined (scaled) manually	Geom linetypes determined by categorical variable (no scaling)
	`values` - a set of values (or linetype names) to use for linetypes

# Discrete linetypes determined by the combination of Type and Treatment
# The items in the guide are then rearranged and re-labelled
CO2$Comb <- interaction(CO2$Type,CO2$Treatment)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_smooth(aes(linetype=Comb))+
  scale_linetype_discrete(name="Type", breaks=c("Quebec.nonchilled","Quebec.chilled","Mississippi.nonchilled","Mississippi.chilled"),
  labels=c("Quebec non-chilled","Quebec chilled","Miss. non-chilled","Miss. chilled"))

# Manual linetypes
ggplot(CO2, aes(y=uptake,x=conc)) + geom_smooth(aes(linetype=Treatment))+
  scale_linetype_manual(name="Treatment", values=c("dashed","dotted"))

# Identity linetypes
CO2$Lines <- factor(CO2$Treatment, levels=c("nonchilled","chilled"), labels=c("dotted","dashed"))
ggplot(CO2, aes(y=uptake,x=conc)) + geom_smooth(aes(linetype=Lines))+
  scale_linetype_identity(name="Temperature", guide="legend",breaks=c("dotted","dashed"), labels=c("Low","High"))

Scaling the colour (or fill) associated with geoms (`scale_colour_` & `scale_fill_`)

The scale_colour_ and scale_fill_ scales control the colour of geoms and have the following additional optional arguments available:

low - colour of low end of the colour spectrum
high - colour of high end of the colour spectrum
guide - what sort of legend (e.g. colorbar)

`scale_colour_continuous`	`scale_colour_gradient`	`scale_colour_gradient2`
Geom colours determined (scaled) by continuous variable	Geom colours scaled along a two-colour (low to high) gradient	Geom colours scaled along a diverging (low, mid, high) gradient

# colour determined by continuous covariate
set.seed(123)
CO2$cv<- runif(nrow(CO2),10,50)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(colour=cv))+
    scale_colour_continuous(name="Temperature", low="blue", high="red")
# colour determined by continuous covariate
set.seed(123)
CO2$cv<- runif(nrow(CO2),10,50)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(colour=cv))+
    scale_colour_gradient(name="Temperature")
# colour determined by continuous covariate
set.seed(123)
CO2$cv<- runif(nrow(CO2),10,50)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(colour=cv))+
    scale_colour_gradient2(name="Temperature")
`scale_colour_gradientn`	`scale_colour_gradientn` (own palette)
Geom colours determined (scaled) by a specific palette	Geom colours determined (scaled) by a user defined palette

# colour determined by continuous covariate
# use a predefined gradient based colour palette
set.seed(123)
CO2$cv<- runif(nrow(CO2),10,50)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(colour=cv))+
    scale_colour_gradientn(name="Temperature", colours=terrain.colors(5))
# colour determined by continuous covariate
# use our own gradient based colour palette
my_palette = colorRampPalette(colors=c('red','green','blue'))
set.seed(123)
CO2$cv<- runif(nrow(CO2),10,50)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(colour=cv))+
    scale_colour_gradientn(name="Temperature", colours=my_palette(5))

`scale_colour_hue`	`scale_colour_grey`	`scale_colour_brewer`
Evenly spaced geom colours determined (scaled) by hue	Geom colours determined (scaled) by a greyscale palette	Geom colours determined (scaled) by a ColorBrewer palette
		See the color brewer site for more info
# Discrete colours for hue
set.seed(123)
CO2$cv <- runif(nrow(CO2),0,100)
CO2$Temp <- cut(CO2$cv,breaks=c(0,33,66,100), labels=c("Low","Medium","High"))
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(colour=Temp))+
    scale_colour_hue(name="Temperature", l=80,c=130)
# Discrete colours
set.seed(123)
CO2$cv <- runif(nrow(CO2),0,100)
CO2$Temp <- cut(CO2$cv,breaks=c(0,33,66,100), labels=c("Low","Medium","High"))
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(colour=Temp))+
    scale_colour_grey(name="Temperature", start=0.2, end=0.8)
# Discrete colours selected from a colour brewer palette
# it automatically knows how many colours are required
set.seed(123)
CO2$cv <- runif(nrow(CO2),0,100)
CO2$Temp <- cut(CO2$cv,breaks=c(0,33,66,100), labels=c("Low","Medium","High"))
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(colour=Temp))+
    scale_colour_brewer(name="Temperature", type="seq", palette="Reds")
`scale_colour_manual`	`scale_colour_identity`
Geom colours determined (scaled) manually	Geom colours taken directly from the values of a vector (no scaling)

# Manual colours
set.seed(123)
CO2$cv <- runif(nrow(CO2),0,100)
CO2$Temp <- cut(CO2$cv,breaks=c(0,33,66,100), labels=c("Low","Medium","High"))
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(colour=Temp))+
    scale_colour_manual(name="Temperature", values=c("red","#00AA00",1))
#identity colours
set.seed(123)
CO2$cv <- runif(nrow(CO2),0,100)
CO2$Temp <- cut(CO2$cv,breaks=c(0,33,66,100), labels=c("red","#00AA00",1))
ggplot(CO2, aes(y=uptake,x=conc)) + geom_smooth(aes(colour=Temp))+
    scale_colour_identity(name="Temperature", guide="legend",labels=c("Low","Medium","High"))

Scaling the alpha level of colour associated with geoms (`scale_alpha_`)

The scale_alpha_ scales control the transparency of geoms and have the following additional optional arguments available:

range - the alpha range (0,1)
values - alpha values between 0 and 1
guide - whether to include a guide and what sort of guide to include (e.g. "legend")

`scale_alpha_continuous`	`scale_alpha_discrete`	`scale_alpha_manual`
Geom alphas determined (scaled) by continuous variable	Geom alphas determined (scaled) by categorical variable	Geom alphas determined (scaled) manually

# alpha determined by continuous covariate
set.seed(123)
CO2$cv<- runif(nrow(CO2),10,50)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(alpha=cv))+
    scale_alpha_continuous(name="Temperature", range=c(0.3,1))
# Discrete alphas
set.seed(123)
CO2$cv <- runif(nrow(CO2),0,100)
CO2$Temp <- cut(CO2$cv,breaks=c(0,33,66,100), labels=c("Low","Medium","High"))
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(alpha=Temp))+
    scale_alpha_discrete(name="Temperature")
# Manual alphas
set.seed(123)
CO2$cv <- runif(nrow(CO2),0,100)
CO2$Temp <- cut(CO2$cv,breaks=c(0,33,66,100), labels=c("Low","Medium","High"))
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(alpha=Temp))+
    scale_alpha_manual(name="Temperature", values=c(0.3,0.6,0.95))
`scale_alpha_identity`
Geom alphas taken directly from the values of a vector (no scaling)

# Identity alphas
set.seed(123)
CO2$Alpha <- runif(nrow(CO2),0,1)
ggplot(CO2, aes(y=uptake,x=conc)) + geom_point(aes(alpha=Alpha))+
    scale_alpha_identity(name="Temperature")

Faceting splits the data up into a matrix of panels on the basis of one or more categorical vectors. Since facets display subsets of the data, they are very useful for examining trends in hierarchical designs.

There are two faceting functions that reflect two alternative approaches:

facet_wrap(~cell) - creates a set of panels based on a factor and wraps the panels into a 2-d matrix. cell represents a categorical vector or set of categorical vectors
facet_grid(row~column) - creates a grid of panels in which row and column represent the categorical vectors used to define the rows and columns of the matrix respectively

The following lists describe the arguments of the facet_wrap and facet_grid functions. Only the facets formula is compulsory.

facet_wrap

facets - formula specifying the variables to use in faceting
nrow - number of rows
ncol - number of columns
scales - should all scales be fixed or free
- "fixed" - (default) all scales the same
- "free" - all scales free
- "free_x" - all x-axis scales free
- "free_y" - all y-axis scales free
as.table - if TRUE, laid out from top left to bottom right, if FALSE: bottom left to top right
drop - drop factor combinations that lack data

facet_grid

facets - formula specifying the variables to use in faceting
margins - whether to include marginal trends
scales - should all scales be fixed or free
- "fixed" - (default) all scales the same
- "free" - all scales free
- "free_x" - all x-axis scales free
- "free_y" - all y-axis scales free
space - should all panels take up the same space
- "fixed" - (default) all panels take up the same space
- "free" - panel heights and widths vary
- "free_x" - panel widths vary
- "free_y" - panel heights vary
labeller - a function used to label the panel strips
as.table - if TRUE, laid out from top left to bottom right, if FALSE: bottom left to top right
drop - drop factor combinations that lack data

Facet	Notes, additional parameters	Example
`_wrap`	Matrix of panels split by a single categorical vector
ggplot(CO2, aes(y=uptake, x=conc)) + geom_smooth() + geom_point() + facet_wrap(~Plant)
`_wrap`	Matrix of panels split by a single categorical vector with different y-axis scale range for each panel
ggplot(CO2, aes(y=uptake, x=conc)) + geom_smooth() + geom_point() + facet_wrap(~Plant, scales="free_y")
`_grid`	Matrix of panels with rows and columns defined by two categorical vectors
ggplot(CO2, aes(y=uptake, x=conc)) + geom_smooth() + geom_point() + facet_grid(Type~Treatment)
`_grid`	Matrix of panels with rows and columns defined by two categorical vectors, with a different y-axis scale range for each row of panels
ggplot(CO2, aes(y=uptake, x=conc)) + geom_smooth() + geom_point() + facet_grid(Type~Treatment, scales="free_y")
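
Two of the facet_grid arguments listed earlier, margins and labeller, are not exercised in the table above. The following sketch shows one way they might be combined. label_both is a labeller supplied with ggplot2 that prints both the variable name and its value in each strip.

```r
library(ggplot2)

p <- ggplot(CO2, aes(y = uptake, x = conc)) +
    geom_smooth() + geom_point() +
    facet_grid(Type ~ Treatment,
               margins = TRUE,         # add panels that pool over each factor
               labeller = label_both)  # strips read "variable: value"
p
```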

More complex arrangements of multiple panels and figures are discussed in the section on arranging multiple figures on a page.

Themes

Themes govern the overall style of the graphic. In particular, they control:

the look and positioning of the axes (and their ticks, titles and labels)
the look and positioning of the legends (size,alignment, font, direction)
the look of plots (spacing and titles)
the look of panels (background, grid lines)
the look of panel strips (background, alignment, font)

Theme	Notes, additional parameters	Example
`_bw`	Black and white theme
ggplot(CO2, aes(y = uptake, x = conc)) + geom_smooth() + geom_point() + theme_bw()
`_classic`	Classic theme
ggplot(CO2, aes(y = uptake, x = conc)) + geom_smooth() + geom_point() + theme_classic()
`_grey`	Grey theme
ggplot(CO2, aes(y = uptake, x = conc)) + geom_smooth() + geom_point() + theme_grey()
`_minimal`	Minimal theme
ggplot(CO2, aes(y = uptake, x = conc)) + geom_smooth() + geom_point() + theme_minimal()

Along with these pre-fabricated themes, it is possible to create your own theme. This is done via the theme() function. Each themable element is either a line, a rectangle or text. Therefore, they can all be modified via one of the following functions:

element_blank() - remove the element
element_line() - set the properties of a line
element_rect() - set the properties of a rectangle
element_text() - set the properties of text
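
To make the idea of creating your own theme a little more concrete, here is a minimal sketch that stores a set of theme modifications in an object and then adds it to a plot. The name my_theme and the particular settings are assumptions chosen for illustration only.

```r
library(ggplot2)

# Start from a pre-fabricated theme and override selected elements
my_theme <- theme_bw() +
    theme(panel.grid.minor = element_blank(),                  # remove an element
          panel.grid.major = element_line(colour = "grey80"),  # a line element
          panel.background = element_rect(fill = "white"),     # a rectangle element
          axis.title = element_text(size = 15))                # a text element

# The stored theme can be added to any plot
ggplot(CO2, aes(y = uptake, x = conc)) + geom_point() + my_theme
```

Keeping the modifications in a single object means the same look can be reused across several figures without repeating the theme() call.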

library(grid)  # provides unit(), used for the plot margins
ggplot(CO2, aes(y=uptake, x=conc)) + geom_smooth(aes(colour=Type)) + geom_point() +
theme(panel.grid.major = element_blank(), # no major grid lines
  panel.grid.minor = element_blank(), # no minor grid lines
  panel.background = element_blank(), # no background
  panel.border = element_blank(), # no plot border
  axis.title.y=element_text(size=15, vjust=0,angle=90), # y-axis title
  axis.text.y=element_text(size=12), # y-axis labels
  axis.title.x=element_text(size=15, vjust=-2), # x-axis title
  axis.text.x=element_text(size=12), # x-axis labels
  axis.line = element_line(),
  legend.position=c(1,0),
  legend.justification=c(1,0),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")) # plot margins

Exporting graphics

By default, all graphics are sent to the screen graphics device (X11, quartz or windows depending on your operating system and configuration). This is fine for developing a figure. However, in order to share figures or incorporate them into other documents, it is necessary to output them to one of a number of graphics formats.

The available formats differ according to which graphics devices are available on your system. Nevertheless, most systems support the generation of PDFs (a scalable vector graphics format) and PNGs (a bitmap format), so we will focus on these two formats.

Although it is possible to export graphics instructions to a graphics device using the traditional method:

pdf(file = "test.pdf", width = 6, height = 6)
...
dev.off()

the ggsave() function arguably provides a more convenient interface. The ggsave() function automatically determines the output format from the supplied filename. Moreover, if a ggplot object is not supplied to the function, ggsave() assumes that you want to base the export on the last graphic created (in the screen device) and will attempt to match that graphics device's dimensions.

Portable Document Format (PDF)

## Generate the ggplot plotting instructions
p <- ggplot(data = BOD, mapping = aes(y = demand, x = Time)) +
    geom_point() + geom_line()
## Export to 6x6 inch pdf
ggsave(filename = "test.pdf", plot = p, width = 6, height = 6,
    units = "in")

Portable Network Graphics (PNG)

## Generate the ggplot plotting instructions
p <- ggplot(data = BOD, mapping = aes(y = demand, x = Time)) +
    geom_point() + geom_line()
## Export to 6x6 inch png (at 300dpi)
ggsave(filename = "test.png", plot = p, width = 6, height = 6,
    units = "in", dpi = 300)

Arranging multiple figures on a page

Whilst faceting does provide a way to arrange multiple graphs together on a single page, there are numerous restrictions imposed to ensure consistency of style etc.

More complex graphical manipulations require a more thorough understanding of the grid framework on which ggplot is built. This framework comprises:
Viewports: these describe (by location and size) a rectangular region on the graphical device in which objects can be drawn. Note, the viewport() function only describes the context for the graphics instructions. Before it can be used, it must be 'pushed to the tree' with the pushViewport() function. The tree can be flushed clean for a new graphic by issuing the grid.newpage() function.

library(grid)
grid.newpage()
vp = viewport()
pushViewport(vp)

Here are two different viewports that are visualized by the grid.show.viewport() function.

library(grid)
vp = viewport()
grid.show.viewport(vp)

library(grid)
vp = viewport(x = unit(0.6, "npc"), y = unit(0.5, "npc"),
    width = unit(2, "in"), height = unit(3, "in"))
grid.show.viewport(vp)

It is also possible to partition the graphical device up into a grid containing multiple viewports.
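
As a rough sketch of such partitioning, the following uses grid.layout() to divide a parent viewport into a 2 x 2 grid and then pushes a child viewport into each cell in turn. The object name lay, the fill colour and the text labels are arbitrary choices made for illustration.

```r
library(grid)

grid.newpage()
# Parent viewport partitioned into a 2 x 2 layout
lay <- grid.layout(nrow = 2, ncol = 2)
pushViewport(viewport(layout = lay))

# Visit each cell, draw a labelled rectangle, then return to the parent
for (i in 1:2) {
    for (j in 1:2) {
        pushViewport(viewport(layout.pos.row = i, layout.pos.col = j))
        grid.rect(gp = gpar(fill = "grey90"))
        grid.text(paste0("row ", i, ", col ", j))
        popViewport()
    }
}
popViewport()
```

Each child viewport is popped before the next one is pushed, so the four cells are siblings within the parent rather than nested inside one another.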
Shapes and grobs: all shapes drawn on a viewport are graphical objects (grobs) and each grob contains a description of the shape including its col, cex, lwd etc. (think base graphics parameters; see get.gpar() for a list of graphical object parameters). There are many primitive grobs that can be generated using either the grid.*() or the *Grob() functions.

library(grid)
grid.newpage()
vp = viewport()
pushViewport(vp)
grid.rect(width = unit(0.5, "npc"), height = unit(0.4,
    "npc"), gp = gpar(fill = "red"))

There are numerous other functions for generating primitive shapes (including circles, polygons, lines, points and text) as well as features such as axes and legends. With these functions, it is possible to construct a graph.

library(grid)
grid.newpage()
vp = viewport(x = 0.5, y = 0.5, width = 0.8, height = 0.8,
    xscale = c(0, 10))
pushViewport(vp)
grid.xaxis()
grid.yaxis()
set.seed(1)
grid.points(x = 1:9, y = 0.2 + 0.05 * 1:9 + rnorm(9,
    0, 0.3))

The same thing can be achieved by packaging all the grobs together into a grob tree, defining the viewport and finally issuing the grid.draw() function to trigger the drawing of all the shapes from the grob tree.

library(grid)
xaxis = xaxisGrob()
yaxis = yaxisGrob()
set.seed(1)  # same seed as before so that the points match
pts = pointsGrob(x = 1:9, y = 0.2 + 0.05 * 1:9 + rnorm(9,
    0, 0.3))
g = grobTree(xaxis, yaxis, pts)
grid.newpage()
vp = viewport(x = 0.5, y = 0.5, width = 0.8, height = 0.8,
    xscale = c(0, 10))
pushViewport(vp)
grid.draw(g)

Although constructing graphics this way does give extraordinary flexibility and power, it does require you to control all aspects of the graphing, including the scaling of the coordinate system and aesthetics. The ggplot framework sits on top of this grid framework and looks after all these aspects in a manner consistent with the grammar of graphics. As a minimum, all we need to do is map data to specific features and scales and the ggplot framework will take care of the rest.

Now in order to demonstrate arranging separate figures together into a multifigure plot, I will use colored squares to represent separate figures. Multiple figures are arranged together using the grid.arrange() (or similarly, arrangeGrob()) function. There are numerous arguments that can be supplied to the grid.arrange() function (see below) and many of these will seem daunting. The following examples will help illustrate the most common of these.

...: grobs, gtables or ggplot objects
grobs: a list of grobs to be arranged together
layout_matrix: an optional layout matrix that defines the layout
vp: the viewport in which to place the grobs (defaults to main viewport)
as.table: whether to arrange the grid from bottom-left to top-right (default) or top-left to bottom-right (FALSE)
respect, clip: passed on to the underlying gtable layout
nrow, ncol: the number of rows and columns in the grid table
widths, heights: relative widths and heights of cells in the grid table
top,bottom,left,right: strings to add to the respective outer margins
padding: amount of padding to add around the margin texts
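
To give a feel for a few of these arguments before working through the examples, here is a small sketch that combines ncol, widths, heights and top in a single call. It uses arrangeGrob() so that the layout is built first and drawn separately; the names gs_demo and ag, and the title string, are introduced here only for illustration.

```r
library(grid)
library(gridExtra)

# Four placeholder "figures" (coloured rectangles with a number)
gs_demo <- lapply(1:4, function(ii)
    grobTree(rectGrob(gp = gpar(fill = ii, alpha = 0.5)), textGrob(ii)))

# arrangeGrob() builds the layout without drawing it
ag <- arrangeGrob(grobs = gs_demo, ncol = 2,
    widths = c(2, 1),    # first column twice as wide as the second
    heights = c(1, 3),   # second row three times as tall as the first
    top = "Title in the top margin")

grid.newpage()
grid.draw(ag)
```

Because widths and heights are relative, only their ratios matter: c(2, 1) and c(4, 2) would produce the same column proportions.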

library(grid)
gs <- lapply(1:7, function(ii) grobTree(rectGrob(gp = gpar(fill = ii,
    alpha = 0.5)), textGrob(ii)))
library(gridExtra)
grid.arrange(grobs = gs, ncol = 3)

library(gridExtra)
grid.arrange(grobs = gs, ncol = 3, widths = c(2, 1,
    1))

library(gridExtra)
grid.arrange(grobs = gs, layout_matrix = rbind(c(1,
    1, 2, 3), c(4, 5, 6, NA), c(7, 7, 7, 7)))

So what if we wanted the grobs of the middle row to be spread evenly across the entire row rather than leaving a blank space at the end? To do this, we would first create three separate arrangements (one for each row) with arrangeGrob(), and then arrange these together. I will also make the bottom row twice as tall as the other rows.

library(gridExtra)
g1 = arrangeGrob(grobs = gs[1:3], widths = c(2, 1,
    1))
g2 = arrangeGrob(grobs = gs[4:6], widths = c(1, 1,
    1))
g3 = arrangeGrob(grobs = gs[7])
grid.arrange(g1, g2, g3, nrow = 3, heights = c(1, 1,
    2))

The examples so far have arranged a list of grobs. Complete ggplot figures can be arranged in a grid with the grid.arrange() function in exactly the same way.

p1 <- ggplot(CO2, aes(y = uptake, x = Treatment, fill = Treatment)) +
    geom_boxplot()
p2 <- ggplot(CO2, aes(x = uptake, fill = Type)) + geom_density(alpha = 0.4)

library(gridExtra)
grid.arrange(p1, p2, nrow = 2)

Yes, we have arranged two separate figures on a single page. However, in this case the result is not all that pleasing (and certainly not publication quality). Firstly, the y-axes of the top and bottom figures do not align and secondly, do we really need a separate legend for each figure?

To rectify the first of these issues, we need to work at a lower level with the grobs themselves. The following procedure works by setting the widths of all grobs to be the same (the maximum of all corresponding grobs across the two figures).

library(gtable)
g1 <- ggplotGrob(p1)
g2 <- ggplotGrob(p2)
g <- rbind(g1, g2, size = "first")
g$widths <- unit.pmax(g1$widths, g2$widths)
grid.newpage()
grid.draw(g)
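
The key step in the code above is unit.pmax(), which takes the element-wise maximum of two unit vectors. The following sketch shows this in isolation with two made-up sets of column widths (w1 and w2 are hypothetical values, not widths taken from the figures above).

```r
library(grid)

# Two hypothetical sets of column widths (e.g. axis title, axis labels, panel)
w1 <- unit(c(0.5, 1.0, 5), "cm")
w2 <- unit(c(0.5, 1.5, 5), "cm")

# Element-wise maximum: each column becomes as wide as the wider of the two
w_max <- unit.pmax(w1, w2)

# Convert back to plain numbers (in cm) to inspect the result
pdf(NULL)  # unit conversion needs an open graphics device; this one writes no file
w_cm <- convertWidth(w_max, "cm", valueOnly = TRUE)
invisible(dev.off())
w_cm
```

Applied to the widths of two gtables, the same operation ensures that corresponding columns (axis titles, axis labels, legends) are given identical widths, which is what brings the panels into alignment.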

It can sometimes be difficult to work out dimensions and spacing within ggplot, gtable or grobs. It is therefore useful to be able to visualize the names of all constituent grobs and also their dimensions.

library(gtable)
g1 <- ggplotGrob(p1)
showGrob(grid.force(g1))

Since this is a bit messy, we can focus on individual items.

library(gtable)
g1 <- ggplotGrob(p1)
g1

TableGrob (10 x 9) "layout": 18 grobs
    z         cells       name
1   0 ( 1-10, 1- 9) background
2   5 ( 5- 5, 3- 3)     spacer
3   7 ( 6- 6, 3- 3)     axis-l
4   3 ( 7- 7, 3- 3)     spacer
5   6 ( 5- 5, 4- 4)     axis-t
6   1 ( 6- 6, 4- 4)      panel
7   9 ( 7- 7, 4- 4)     axis-b
8   4 ( 5- 5, 5- 5)     spacer
9   8 ( 6- 6, 5- 5)     axis-r
10  2 ( 7- 7, 5- 5)     spacer
11 10 ( 4- 4, 4- 4)     xlab-t
12 11 ( 8- 8, 4- 4)     xlab-b
13 12 ( 6- 6, 2- 2)     ylab-l
14 13 ( 6- 6, 6- 6)     ylab-r
15 14 ( 6- 6, 8- 8)  guide-box
16 15 ( 3- 3, 4- 4)   subtitle
17 16 ( 2- 2, 4- 4)      title
18 17 ( 9- 9, 4- 4)    caption
                                      grob
1         rect[plot.background..rect.6143]
2                           zeroGrob[NULL]
3     absoluteGrob[GRID.absoluteGrob.6116]
4                           zeroGrob[NULL]
5                           zeroGrob[NULL]
6                gTree[panel-1.gTree.6096]
7     absoluteGrob[GRID.absoluteGrob.6109]
8                           zeroGrob[NULL]
9                           zeroGrob[NULL]
10                          zeroGrob[NULL]
11                          zeroGrob[NULL]
12 titleGrob[axis.title.x..titleGrob.6099]
13 titleGrob[axis.title.y..titleGrob.6102]
14                          zeroGrob[NULL]
15                       gtable[guide-box]
16  zeroGrob[plot.subtitle..zeroGrob.6140]
17     zeroGrob[plot.title..zeroGrob.6139]
18   zeroGrob[plot.caption..zeroGrob.6141]
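
The layout table printed above can also be queried programmatically. As a small sketch (p_demo, g_demo and axis_b are names introduced here for illustration), the names of the constituent grobs can be read from the layout component, and gtable_filter() from the gtable package can be used to keep only those entries whose names match a pattern.

```r
library(ggplot2)
library(gtable)

p_demo <- ggplot(CO2, aes(y = uptake, x = Treatment, fill = Treatment)) +
    geom_boxplot()
g_demo <- ggplotGrob(p_demo)

# Names of all the grobs in the layout table
g_demo$layout$name

# Keep only those layout entries whose name matches "axis-b"
axis_b <- gtable_filter(g_demo, "axis-b")
axis_b$layout$name
```

The exact set of names may differ between ggplot2 versions, so it is worth printing them before relying on a particular pattern.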

showGrob(grid.force(g1), "axis-b", grep = TRUE)

p1 <- ggplot(CO2, aes(y = uptake, x = Treatment, fill = Type)) +
    geom_boxplot()
p2 <- ggplot(CO2, aes(y = uptake, x = conc, colour = Type)) +
    geom_smooth()
p3 <- ggplot(CO2, aes(x = uptake, fill = Type)) + geom_density(alpha = 0.4) +
    facet_grid(~Treatment)
grid.arrange(p1, p2, p3, nrow = 2, layout_matrix = rbind(c(1,
    2), c(3, 3)))

To align the panels of these three figures, we again need to work at a lower level with the grobs themselves. The gtable_frame() function that follows is Baptiste's code.

gtable_frame <- function(g, width = unit(1, "null"),
    height = unit(1, "null")) {
    panels <- g[["layout"]][grepl("panel", g[["layout"]][["name"]]),
        ]
    ll <- unique(panels$l)
    tt <- unique(panels$t)

    fixed_ar <- g$respect
    if (fixed_ar) {
        # there lies madness, we want to align with aspect
        # ratio constraints
        ar <- as.numeric(g$heights[tt[1]])/as.numeric(g$widths[ll[1]])
        print(ar)
        height <- width * ar
        g$respect <- FALSE
    }

    core <- g[seq(min(tt), max(tt)), seq(min(ll), max(ll))]
    top <- g[seq(1, min(tt) - 1), ]
    bottom <- g[seq(max(tt) + 1, nrow(g)), ]
    left <- g[, seq(1, min(ll) - 1)]
    right <- g[, seq(max(ll) + 1, ncol(g))]

    fg <- nullGrob()
    lg <- if (length(left))
        g[seq(min(tt), max(tt)), seq(1, min(ll) - 1)] else fg
    rg <- if (length(right))
        g[seq(min(tt), max(tt)), seq(max(ll) + 1, ncol(g))] else fg
    grobs = list(fg, g[seq(1, min(tt) - 1), seq(min(ll),
        max(ll))], fg, lg, g[seq(min(tt), max(tt)),
        seq(min(ll), max(ll))], rg, fg, g[seq(max(tt) +
        1, nrow(g)), seq(min(ll), max(ll))], fg)
    widths <- unit.c(sum(left$widths), width, sum(right$widths))
    heights <- unit.c(sum(top$heights), height, sum(bottom$heights))
    all <- gtable_matrix("all", grobs = matrix(grobs,
        ncol = 3, nrow = 3, byrow = TRUE), widths = widths,
        heights = heights)
    all[["layout"]][5, "name"] <- "panel"  # make sure knows where the panel is
    if (fixed_ar)
        all$respect <- TRUE
    all
}

library(gtable)
g1 = ggplotGrob(p1)
g2 = ggplotGrob(p2)
g3 = ggplotGrob(p3)
fg1 = gtable_frame(g1)
fg2 = gtable_frame(g2)
fg12 = gtable_frame(cbind(fg1, fg2))
fg3 = gtable_frame(g3)
fg123 = gtable_frame(rbind(fg12, fg3))
grid.newpage()
grid.draw(fg123)

Let's start by generating a number of single panel and multipanel plots that can be used to illustrate different options for arranging graphics.

Controlling panel dimensions

The following two plots represent time series of a few different metrics (pce: personal consumption expenditures in billions $US, pop: total US population in thousands, unemploy: number of unemployed in thousands, psavert: personal savings rate and uempmed: median duration of unemployment in weeks) compiled in the ggplot2 package from http://research.stlouisfed.org/fred2.

library(dplyr)  # provides %>% and filter()
g1 = ggplot(economics_long %>% filter(variable %in%
    c("pce", "pop", "unemploy")), aes(y = value, x = date)) +
    geom_point() + facet_grid(variable ~ ., scales = "free_y") +
    theme_bw()
g1

g2 = ggplot(economics_long %>% filter(variable %in%
    c("psavert", "uempmed")), aes(y = value, x = date)) +
    geom_point() + facet_grid(variable ~ ., scales = "free_y") +
    theme_bw()
g2

What if we wished to produce two separate figures (one for g1 and one for g2), yet wanted the panels within each figure to be the same size? Currently, g1 has three panels and g2 has only two, and both figures are the same height, so g1's panels are shorter. We might instead want the panels of the two figures to be a consistent size and the overall figure to be sized accordingly. Unfortunately, this is not as simple as just making g2 2/3 the height of g1, since this does not take into account the vertical space occupied by the x-axis (which only appears once in each figure).
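
A quick piece of arithmetic shows why the 2/3 rule falls short. The dimensions used here are hypothetical and chosen purely for illustration: each panel is assumed to be 1 inch tall and the x-axis (plus any other once-per-figure elements) is assumed to occupy 0.5 inches.

```r
# Hypothetical dimensions (inches), chosen purely for illustration
panel_height  <- 1.0   # height of each panel
axis_overhead <- 0.5   # space taken once per figure by the x-axis etc.

height_g1 <- 3 * panel_height + axis_overhead   # three panels
height_g2 <- 2 * panel_height + axis_overhead   # two panels

# Ratio actually required versus the naive 2/3
height_g2 / height_g1
2 / 3
```

Under these assumptions g2 needs to be 2.5/3.5 of the height of g1, which is noticeably more than 2/3, and the discrepancy grows as the fixed overhead gets larger relative to the panels.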

Baptiste Auguié provides a solution on https://github.com/baptiste/gridextra/wiki/arranging-ggplot. The following function fixes the panels at a given width and height (and margin) and either returns a ggplotTable or exports the graphics to file.

set_panel_size <- function(p = NULL, g = ggplotGrob(p),
    file = NULL, margin = unit(1, "mm"), width = unit(4,
        "cm"), height = unit(4, "cm")) {

    panels <- grep("panel", g$layout$name)
    panel_index_w <- unique(g$layout$l[panels])
    panel_index_h <- unique(g$layout$t[panels])
    nw <- length(panel_index_w)
    nh <- length(panel_index_h)

    if (getRversion() < "3.3.0") {

        # the following conversion is necessary because
        # there is no `[<-`.unit method so promoting to
        # unit.list allows standard list indexing
        g$widths <- grid:::unit.list(g$widths)
        g$heights <- grid:::unit.list(g$heights)

        g$widths[panel_index_w] <- rep(list(width),
            nw)
        g$heights[panel_index_h] <- rep(list(height),
            nh)

    } else {

        g$widths[panel_index_w] <- rep(width, nw)
        g$heights[panel_index_h] <- rep(height, nh)

    }

    if (!is.null(file))
        ggsave(file, g, width = convertWidth(sum(g$widths) +
            margin, unitTo = "in", valueOnly = TRUE),
            height = convertHeight(sum(g$heights) +
                margin, unitTo = "in", valueOnly = TRUE))

    g
}

Now we can indicate a panel height and width. Note, the new function (set_panel_size) always returns a ggplot grob object. In this case, we are only interested in the side effect (producing the plot). Therefore I have directed the output to a throw away object (a).

a = set_panel_size(p = g1, file = "images/g1sized.png",
    margin = unit(1, "mm"), width = unit(2, "in"),
    height = unit(1, "in"))

a = set_panel_size(p = g2, file = "images/g2sized.png",
    margin = unit(1, "mm"), width = unit(2, "in"),
    height = unit(1, "in"))

Notice that both the height and width of these figures vary to ensure that all panels are identical in size.

Examples

Exploring distributions

Boxplots - geom_boxplot & stat_boxplot

Univariate boxplots

Basic boxplot

Plain boxplot

# Univariate boxplot
ggplot(BOD) + geom_boxplot(aes(y=demand,x="Demand"))

# Plain boxplot
p <- ggplot(BOD) +
 geom_boxplot(aes(y=demand,x=1)) +
 scale_y_continuous("Biochemical oxygen demand (mg/l)") +
 scale_x_continuous(limits=c(0,2),breaks=NULL)

p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_blank(),
  axis.text.x=element_blank(),
  axis.line = element_line(),
  plot.margin=unit(c(0.5,0.5,0.5,2),"lines")
 )

Conditional (factorial) boxplots

Basic factorial boxplot	Plain factorial boxplot

# Conditional boxplot
ggplot(warpbreaks) + geom_boxplot(aes(y=breaks,x=wool))

# Plain conditional boxplot
p <- ggplot(warpbreaks) +
  geom_boxplot(aes(y=breaks,x=wool)) +
  scale_y_continuous("Number of wool breaks")
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15, vjust=-1),
  axis.text.x=element_text(size=12),
  axis.line = element_line(),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Basic factorial boxplot	Plain factorial boxplot

ggplot(warpbreaks) + geom_boxplot(aes(y=breaks,x=wool, fill=tension))

p <- ggplot(warpbreaks) +
  geom_boxplot(aes(y=breaks,x=wool, fill=tension)) +
  scale_y_continuous("Number of wool breaks") +
  scale_x_discrete("Type of wool") +
  # the name of this scale function was lost from the source; scale_fill_grey()
  # is assumed because start and end are its arguments
  scale_fill_grey(labels=c("Low","Medium","High"),start=0.5,end=1)
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15, vjust=-1),
  axis.text.x=element_text(size=12),
  axis.line = element_line(),
  legend.position=c(1,1),legend.justification=c(1,1),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Violin Plot - geom_violin

Violin plot

Plain violin plot

ggplot(warpbreaks, aes(y=breaks, x=wool))+geom_violin()

library(grid)
library(scales)
p<-ggplot(warpbreaks, aes(y=breaks, x=wool))+
  geom_violin()+
  scale_x_discrete("Wool type")+
  scale_y_continuous("Number of breaks", expand=c(0.05,0), labels=comma)
p +       theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,0.2),legend.justification=c(1,0),
        plot.margin=unit(c(0.5,0.5,2,2),"lines"),
    legend.key=element_blank()
  )

Histograms - geom_histogram, geom_bar & stat_bin

Univariate histograms

Basic histogram	Plain histogram

# geom_histogram() bins continuous data; geom_bar() defaults to stat="count"
ggplot(data=data.frame(rivers)) + geom_histogram(aes(x=rivers))
#OR
ggplot(data=data.frame(rivers)) + geom_bar(aes(x=rivers), stat="bin")

p <- ggplot(data=data.frame(rivers)) +
  geom_histogram(aes(x=rivers),colour='black',fill='gray')+
  scale_x_continuous("Length of rivers (miles)")+
  scale_y_continuous("Frequency", expand=c(0,0))+
  coord_cartesian(xlim=c(0,4000))
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15,vjust=-1),
  axis.text.x=element_text(size=12),
  axis.line = element_line(),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Number of bins	Plain bin width

# Histogram with customized bin widths
ggplot(data=data.frame(rivers)) + geom_histogram(aes(x=rivers),binwidth=50)
# For comparison, the default binning
ggplot(data=data.frame(rivers)) + geom_histogram(aes(x=rivers))

# Plain histogram with custom bin widths
# use expand=c(0,0) so that the axes start at zero
p <- ggplot(data=data.frame(rivers)) +
  geom_histogram(aes(x=rivers),binwidth=50,colour='black',fill='gray')+
  scale_x_continuous("Length of rivers (miles)", expand=c(0,0))+
  scale_y_continuous("Frequency", expand=c(0,0))
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15,vjust=-1),
  axis.text.x=element_text(size=12),
  axis.line = element_line(),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Scaled x-values	Plain transformed x-values

# Histogram on log transformed data
ggplot(data=data.frame(rivers)) + geom_histogram(aes(x=rivers)) +
  scale_x_continuous(trans="log10")
#OR
ggplot(data=data.frame(rivers)) + geom_histogram(aes(x=rivers)) +
  scale_x_log10()

# Plain histogram of log transformed data
p <- ggplot(data=data.frame(rivers)) +
  geom_histogram(aes(x=rivers),colour='black',fill='gray')+
  scale_x_continuous("Length of rivers (miles)", expand=c(0,0),trans="log10")+
  scale_y_continuous("Frequency", expand=c(0,0))
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15,vjust=-1),
  axis.text.x=element_text(size=12),
  axis.line = element_line(),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Scaled x-axis	Plain transformed x- coordinates

# Histogram of linear data on log transformed axis
ggplot(data=data.frame(rivers)) + geom_histogram(aes(x=rivers)) +
  coord_trans(x="log1p")

# Plain histogram of linear data on log transformed axis
p <- ggplot(data=data.frame(rivers)) +
  geom_histogram(aes(x=rivers),colour='black',fill='gray')+
  scale_x_continuous("Length of rivers (miles)", expand=c(0,0))+
  coord_trans(x="log1p")+
  scale_y_continuous("Frequency", expand=c(0,0))
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15,vjust=-1),
  axis.text.x=element_text(size=12),
  axis.line = element_line(),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Gradient fill	Plain scaled grey gradient fill

ggplot(data=data.frame(rivers)) + geom_histogram(aes(x=rivers, fill=..count..))

# Plain histogram with gradient fill
p <- ggplot(data=data.frame(rivers)) +
  geom_histogram(aes(x=rivers,fill=..count..))+
  # second layer adds black outlines; show.legend=FALSE avoids a duplicate legend
  geom_histogram(aes(x=rivers, fill=..count..),colour="black",show.legend=FALSE)+
  scale_x_continuous("Length of rivers (miles)", expand=c(0,0))+
  scale_y_continuous("Frequency", expand=c(0,0))+
  scale_fill_gradient(low="grey90", high="grey40")
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15,vjust=-1),
  axis.text.x=element_text(size=12),
  axis.line = element_line(),
  legend.position=c(1,1),legend.justification=c(1,1),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Conditional (factorial) histograms

Basic histogram	Plain histogram

ggplot(data=iris) + geom_histogram(aes(x=Sepal.Length, fill=Species), position="identity")
#OR
ggplot(data=iris) + geom_bar(aes(x=Sepal.Length, fill=Species), stat="bin", position="identity")

#Conditional histogram
p <- ggplot(data=iris) +
  geom_histogram(aes(x=Sepal.Length, fill=Species), position="identity")+
  scale_x_continuous("Sepal length (cm)", expand=c(0,0))+
  scale_y_continuous("Frequency", expand=c(0,0))+
  scale_fill_grey()
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15,vjust=-1),
  axis.text.x=element_text(size=12),
  axis.line = element_line(),
  legend.position=c(1,1),legend.justification=c(1,1),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Basic histogram	Plain histogram

ggplot(data=iris) + geom_histogram(aes(x=Sepal.Length, fill=Species), position="dodge")
#OR
ggplot(data=iris) + geom_bar(aes(x=Sepal.Length, fill=Species), stat="bin", position="dodge")

#Transparent Conditional Histogram
p <- ggplot(data=iris)+
  geom_bar(aes(x=Sepal.Length,fill=Species), alpha=0.5, stat="bin", position="identity")+
  # geom_step() has no fill aesthetic, so only colour is mapped here
  geom_step(aes(x=Sepal.Length, colour=Species), stat="bin", position=position_identity())+
  scale_x_continuous("Sepal length (cm)", expand=c(0,0))+
  scale_y_continuous("Frequency", expand=c(0,0))
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15,vjust=-1),
  axis.text.x=element_text(size=12),
  axis.line = element_line(),
  legend.position=c(1,1),legend.justification=c(1,1),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Density plots - geom_density & stat_density

Univariate density plots

Basic density plot	Plain density plot

ggplot(data=data.frame(rivers)) + geom_density(aes(x=rivers))

p <- ggplot(data=data.frame(rivers)) +
  geom_density(aes(x=rivers),colour='black',fill='grey90')+
  scale_x_continuous("Length of rivers (miles)", expand=c(0,0))+
  # densities are multiplied by 10000 for display, hence the x 10^-4 in the title
  scale_y_continuous(expression(paste("Density (",phantom() %*% 10^-4,")")), expand=c(0,0),
    labels=function(x){format(x*10000,nsmall=1,scientific=FALSE)})
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15,vjust=-1),
  axis.text.x=element_text(size=12),
  axis.line = element_line(),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Basic smoother density plot	Plain smoother density plot

ggplot(data=data.frame(rivers)) + geom_density(aes(x=rivers), adjust=5)

# Axis label formatter: rescale densities by 10000
myF <- function(x) {
  format(x * 10000, nsmall = 1, scientific = FALSE)
}
p <- ggplot(data = data.frame(rivers)) +
  geom_density(aes(x = rivers), adjust = 5, colour = "black", fill = "grey90") +
  scale_x_continuous("Length of rivers (miles)", expand = c(0, 0)) +
  scale_y_continuous("Density (/10000)", expand = c(0, 0), labels = myF)
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y = element_text(size = 15, vjust = 0, angle = 90),
  axis.text.y = element_text(size = 12),
  axis.title.x = element_text(size = 15, vjust = -1),
  axis.text.x = element_text(size = 12),
  axis.line = element_line(),
  plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))

Basic density plot (log-scaled x-axis)

Plain density plot (log-scaled x-axis)

ggplot(data=data.frame(rivers)) + geom_density(aes(x=rivers)) +scale_x_continuous(trans="log10")

p <- ggplot(data=data.frame(rivers)) +
  geom_density(aes(x=rivers),colour='black',fill='grey90')+
  scale_x_continuous("Length of rivers (miles)", expand=c(0,0),
    trans="log10", breaks=c(250,500,1000,2000,3000),label=c(250,500,1000,2000,3000))+
  scale_y_continuous("Density", expand=c(0,0))
p + theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
        )

Factorial density

Basic factorial density plot	Plain factorial density plot

ggplot(data=iris) + geom_density(aes(x=Sepal.Length, colour=Species))

# Plain conditional density plot
p <- ggplot(data=iris) +
  geom_density(aes(x=Sepal.Length, colour=Species))+
  scale_x_continuous("Sepal length (cm)", expand=c(0,0))+
  scale_y_continuous("Density", expand=c(0,0))
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15,vjust=-1),
  axis.text.x=element_text(size=10),
  axis.line = element_line(),
  legend.position=c(1,1),legend.justification=c(1,1),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Basic factorial density plot	Plain factorial density plot

# Conditional density plot
ggplot(data=iris) + geom_density(aes(x=Sepal.Length, fill=Species))

# Plain conditional density plot
p <- ggplot(data=iris) +
  geom_density(aes(x=Sepal.Length,fill=Species), alpha=0.4, colour=NA)+
  # second layer draws the outlines only; show.legend replaces the old show_guide
  geom_density(aes(x=Sepal.Length,fill=Species, colour=Species), alpha=0.0, show.legend=FALSE)+
  scale_x_continuous("Sepal length (cm)", expand=c(0,0))+
  scale_y_continuous("Density", expand=c(0,0))
p + theme(panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y=element_text(size=15, vjust=0,angle=90),
  axis.text.y=element_text(size=12),
  axis.title.x=element_text(size=15,vjust=-1),
  axis.text.x=element_text(size=10),
  axis.line = element_line(),
  legend.position=c(1,1),legend.justification=c(1,1),
  plot.margin=unit(c(0.5,0.5,2,2),"lines")
 )

Line graphs - geom_line

Basic line graph

Plain line graph

ggplot(BOD) + geom_line(aes(y=demand,x=Time))

# Plain line plot
p <- ggplot(data=BOD) +
  geom_line(aes(y=demand,x=Time),size=2)+
  scale_x_continuous("Time (days)", expand=c(0.05,0), limits=c(0,8))+
  scale_y_continuous("Demand (mg/l)", expand=c(0.05,0), limits=c(8,20))
p + theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,1),legend.justification=c(1,1),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )
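The theme() settings above recur, almost unchanged, in most of the plain examples that follow. A minimal sketch of storing them once and reusing them is given below. The name plain_theme is hypothetical (it is not used elsewhere in this tutorial), the legend settings are left out for brevity, and the sketch assumes the ggplot2 and grid packages are available.

```r
library(ggplot2)
library(grid)  # for unit()

# Hypothetical helper: the "plain" settings collected into one theme object
plain_theme <- theme(
  panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.title.y = element_text(size = 15, vjust = 0, angle = 90),
  axis.text.y = element_text(size = 12),
  axis.title.x = element_text(size = 15, vjust = -1),
  axis.text.x = element_text(size = 10),
  axis.line = element_line(),
  plot.margin = unit(c(0.5, 0.5, 2, 2), "lines")
)

# Reuse the stored theme on a line graph of the BOD data
p <- ggplot(data = BOD) +
  geom_line(aes(y = demand, x = Time)) +
  scale_x_continuous("Time (days)") +
  scale_y_continuous("Demand (mg/l)")
p_plain <- p + plain_theme
```

Printing p_plain draws the graph. Any setting that differs for a particular graph can still be overridden by adding a further theme() call after plain_theme.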

Basic line graph

Plain line graph

ggplot(BOD,aes(y=demand,x=Time)) + geom_line()+geom_point()

# Plain line plot with points
p <- ggplot(data=BOD) +
  geom_line(aes(y=demand,x=Time),size=2)+
  geom_point(aes(y=demand,x=Time),size=4)+
  scale_x_continuous("Time (days)", expand=c(0.05,0), limits=c(0,8))+
  scale_y_continuous("Demand (mg/l)", expand=c(0.05,0), limits=c(8,20))
p + theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,1),legend.justification=c(1,1),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )

Scatterplots - geom_point, geom_line, geom_smooth, stat_smooth & stat_summary

Simple scatterplots

Basic scatterplot

Plain scatterplot

ggplot(BOD) + geom_point(aes(y=demand,x=Time))

# Plain scatterplot
p <- ggplot(data=BOD) +
  geom_point(aes(y=demand,x=Time),size=3)+
  scale_x_continuous("Time (days)", expand=c(0,0), limits=c(0,8))+
  scale_y_continuous("Demand (mg/l)", expand=c(0,0), limits=c(8,20))
p + theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,1),legend.justification=c(1,1),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )

Trends and smoothers

Linear trend

Plain linear trend with 95% confidence bands

ggplot(BOD) + geom_point(aes(y=demand,x=Time)) +geom_smooth(aes(y=demand,x=Time), method="lm")

# fit linear model (in order to get confidence bands)
BOD.lm <- lm(demand~Time, data=BOD)
xs <- seq(min(BOD$Time), max(BOD$Time), l=1000)
BOD.predict <- predict(BOD.lm,
  newdata=data.frame(Time=xs),interval='confidence', se.fit=TRUE)
BOD.predict <- data.frame(BOD.predict$fit, se=BOD.predict$se.fit,Time=xs)
# Create a plain scatterplot with smoother and confidence bands
p <- ggplot(data=BOD) +
  geom_point(aes(y=demand, x=Time),colour='grey',size=2)+
  geom_line(aes(y=demand,x=Time),stat="smooth", method="lm")+
  scale_x_continuous("Time (days)", limits=c(1,7))+
  scale_y_continuous("Demand (mg/l)")
p <- p + theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,1),legend.justification=c(1,1),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )
# add the confidence bands
p+geom_line(data=BOD.predict, aes(y=upr,x=Time), linetype=2)+
  geom_line(data=BOD.predict, aes(y=lwr,x=Time), linetype=2)
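The same model predictions can also be drawn as a shaded band rather than as two dashed lines. The following is only a sketch of that alternative: the object names BOD.ci and p_band are hypothetical, and the grey fill is an arbitrary choice.

```r
library(ggplot2)

# Fit the linear model and predict over a fine grid of Time values
BOD.lm <- lm(demand ~ Time, data = BOD)
xs <- seq(min(BOD$Time), max(BOD$Time), length.out = 100)

# With interval = "confidence", predict() returns a matrix with
# columns fit, lwr and upr (95% limits by default)
BOD.ci <- data.frame(
  predict(BOD.lm, newdata = data.frame(Time = xs), interval = "confidence"),
  Time = xs
)

# Shaded confidence band, with the trendline and raw points drawn on top
p_band <- ggplot() +
  geom_ribbon(data = BOD.ci, aes(x = Time, ymin = lwr, ymax = upr),
              fill = "grey80") +
  geom_line(data = BOD.ci, aes(x = Time, y = fit)) +
  geom_point(data = BOD, aes(x = Time, y = demand))
```

The ribbon layer is added first so that it sits underneath the line and the points.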

Loess smoother

Plain loess smoother with standard error

ggplot(BOD) + geom_point(aes(y=demand,x=Time)) +geom_smooth(aes(y=demand,x=Time),
                method="loess", method.args=list(degree=1),se=TRUE)

# Fit a loess smoother
BOD.loess <- loess(demand~Time, data=BOD, degree=1)
xs <- seq(min(BOD$Time), max(BOD$Time), l=1000)
BOD.predict <- predict(BOD.loess,newdata=data.frame(Time=xs), se=TRUE)
BOD.predict <- with(BOD.predict,data.frame(fit,lwr=fit-se.fit,upr=fit+se.fit,Time=xs))
# Plain scatterplot with loess smoother and standard error bands (fit +/- 1 SE)
p <- ggplot(data=BOD) +
  geom_point(aes(y=demand, x=Time),colour='grey',size=2)+
  geom_line(data=BOD.predict,aes(y=fit,x=Time))+
  scale_x_continuous("Time (days)", limits=c(1,7))+
  scale_y_continuous("Demand (mg/l)")
p <- p + theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    panel.border = element_blank(),
    axis.title.y=element_text(size=15, vjust=0,angle=90),
    axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,1),legend.justification=c(1,1),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )
# add the standard error bands
p+geom_line(data=BOD.predict, aes(y=upr,x=Time), linetype=2)+
  geom_line(data=BOD.predict, aes(y=lwr,x=Time), linetype=2)

Generalized additive model (GAM)

Plain GAM with 95% CI based on
1.96 (dashed; ggplot default) and qt(0.975,df) (dotted)

library(mgcv)
ggplot(CO2, aes(y=uptake, x=conc)) +geom_point()+stat_smooth(geom="smooth",method="gam",formula=y~s(x,k=6))

#manually fit a GAM
library(mgcv)
CO2.gam <- gam(uptake~s(conc,k=6), data=CO2)
xs <- seq(min(CO2$conc), max(CO2$conc), l=1000)
CO2.predict <- predict(CO2.gam,newdata=data.frame(conc=xs),se.fit=TRUE)
df <- sum(CO2.gam$edf[-1])
# generate 95% CI predictions based on 1.96*SE and on qt(0.975,df)*SE
CO2.predict1 <- with(CO2.predict,data.frame(fit,lwr=fit-(1.96*se.fit),
                                            upr=fit+(1.96*se.fit),conc=xs))
CO2.predict2 <- with(CO2.predict,data.frame(fit,lwr=fit-(qt(0.975,df)*se.fit),
                                            upr=fit+(qt(0.975,df)*se.fit),conc=xs))
p <- ggplot(data=CO2) +
  geom_point(aes(y=uptake, x=conc),colour='grey',size=2)+
  #stat_smooth(aes(y=uptake, x=conc),geom="smooth",method="gam",formula=y~s(x,k=6))+
  scale_x_continuous(expression(paste("Ambient ",CO[2]," concentration (mg/l)", sep="")))+
  scale_y_continuous(expression(paste(CO[2]," uptake rate (",mu*mol/m^2/sec,")", sep="")))
p <- p + theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,1),legend.justification=c(1,1),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )
# add the trendline and confidence bands
p+geom_line(data=CO2.predict1, aes(y=fit,x=conc))+
  geom_line(data=CO2.predict1, aes(y=upr,x=conc), linetype=2)+
  geom_line(data=CO2.predict1, aes(y=lwr,x=conc), linetype=2)+
  geom_line(data=CO2.predict2, aes(y=upr,x=conc), linetype=3)+
  geom_line(data=CO2.predict2, aes(y=lwr,x=conc), linetype=3)
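The two sets of bands differ only in the multiplier applied to the standard error. The sketch below compares the normal-based multiplier with t-based multipliers in base R. The degrees of freedom listed are arbitrary illustrative values (they are not taken from the CO2 model), and the object names are hypothetical.

```r
# Normal-based multiplier for a two-sided 95% interval (approximately 1.96)
z_mult <- qnorm(0.975)

# t-based multipliers for a few illustrative degrees of freedom
# (arbitrary values, not taken from the CO2 GAM)
dfs <- c(3, 5, 10, 30, 100)
t_mult <- qt(0.975, df = dfs)

# Ratio of band half-widths: t-based relative to normal-based
multipliers <- data.frame(df = dfs, t = t_mult, ratio = t_mult / z_mult)
print(multipliers)
```

The t-based multiplier is always the larger of the two and approaches the normal-based value as the degrees of freedom increase, so the t-based bands are the wider pair and the gap matters most when the degrees of freedom are small.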

Means plot

Plain means plot

p<-ggplot(CO2, aes(y=uptake, x=conc)) +geom_pointrange(stat="summary", fun.data="mean_cl_boot")
p

p <- ggplot(data=CO2, aes(y=uptake, x=conc)) +
  scale_x_continuous(expression(paste("Ambient ",CO[2]," concentration (mg/l)", sep="")))+
  scale_y_continuous(expression(paste(CO[2]," uptake rate (",mu*mol/m^2/sec,")", sep="")))
p <- p + theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,1),legend.justification=c(1,1),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )
# add the errorbars
p+geom_pointrange(stat="summary", fun.data="mean_cl_normal")
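The mean_cl_normal summary used above returns a mean with t-based confidence limits (it wraps a function from the Hmisc package). To make explicit what is being plotted, here is a minimal base R sketch of the same calculation, assuming the usual interval of mean plus or minus qt(0.975, n-1) standard errors. The function name ci_normal and the object CO2.means are hypothetical.

```r
# Hypothetical helper: mean and t-based confidence limits of a numeric vector
ci_normal <- function(x, conf = 0.95) {
  n <- length(x)
  half <- qt((1 + conf) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(y = mean(x), ymin = mean(x) - half, ymax = mean(x) + half)
}

# Apply the helper to the uptake values at each ambient CO2 concentration
CO2.means <- do.call(rbind, lapply(split(CO2$uptake, CO2$conc), ci_normal))
CO2.means <- data.frame(conc = as.numeric(rownames(CO2.means)), CO2.means)
```

The resulting data frame has one row per concentration and could be plotted directly with geom_pointrange using aes(x=conc, y=y, ymin=ymin, ymax=ymax), without any stat.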

Means plot

Plain means plot

ggplot(CO2, aes(y=uptake, x=conc)) + geom_errorbar(stat="summary", fun.data="mean_cl_boot")+geom_point(stat="summary",fun.y="mean")

p <- ggplot(data=CO2, aes(y=uptake, x=conc)) +
  stat_smooth(geom="smooth",method="gam",formula=y~s(x,k=6),se=FALSE, colour="gray")+
  geom_errorbar(stat="summary", fun.data="mean_cl_boot") +geom_point(stat="summary",fun.y="mean")+
  scale_x_continuous(expression(paste("Ambient ",CO[2]," concentration (mg/l)", sep="")))+
  scale_y_continuous(expression(paste(CO[2]," uptake rate (",mu*mol/m^2/sec,")", sep="")))
p +
  theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,1),legend.justification=c(1,1),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )

Bargraphs (dynamite plots) - geom_bar & stat_summary

Simple bargraphs

Bargraph

Plain bargraph

ggplot(warpbreaks, aes(y=breaks,x=tension))+geom_bar(stat="summary", fun.y=mean)+ geom_errorbar(stat="summary", fun.data="mean_cl_normal", width=0.1)

p <- ggplot(data=warpbreaks, aes(y=breaks, x=tension)) +
  geom_bar(stat="summary", fun.y=mean,color="black",fill="grey80")+
  geom_errorbar(stat="summary", fun.data="mean_cl_normal", width=0.1)+
  scale_x_discrete("Tension")+
  scale_y_continuous("Number of breaks", expand=c(0,0))
p + theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,1),legend.justification=c(1,1),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )

Conditional (factorial) bargraphs

Conditional bargraph

Plain conditional bargraph

ggplot(warpbreaks, aes(y=breaks,x=tension,group=wool))+
geom_bar(aes(fill=wool),position=position_dodge(0.9),stat="summary", fun.y=mean)+
geom_errorbar(position=position_dodge(0.9),stat="summary", fun.data="mean_cl_normal", width=0.1)

p <- ggplot(data = warpbreaks, aes(y = breaks, x = tension,
    group = wool)) + geom_bar(aes(fill = wool), position = "dodge",
    stat = "summary", fun.y = mean) + geom_bar(aes(fill = wool),
    position = "dodge", stat = "summary", fun.y = mean,
    color = "black", show.legend = FALSE) + scale_fill_grey("Wool type") +
    geom_errorbar(position = position_dodge(0.9), stat = "summary",
        fun.data = "mean_cl_normal", width = 0.1) +
    scale_x_discrete("Tension") + scale_y_continuous("Number of wool breaks",
    expand = c(0, 0))
p + theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
    panel.background = element_blank(), panel.border = element_blank(),
    axis.title.y = element_text(size = 15, vjust = 0,
        angle = 90), axis.text.y = element_text(size = 12),
    axis.title.x = element_text(size = 15, vjust = -1),
    axis.text.x = element_text(size = 10), axis.line = element_line(),
    legend.position = c(1, 1), legend.justification = c(1,
        1), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))

Bar charts - geom_bar

Stacked barchart

Plain stacked barchart

#based on pre-calculated counts
library(plyr)
warpbreaks.c<-ddply(warpbreaks,~wool+tension, function(x) data.frame(count=sum(x$breaks)))
ggplot(warpbreaks.c, aes(x=tension,y=count,fill=wool))+geom_bar(stat='identity')+ylab("Number of breaks")

#based on pre-calculated counts
library(plyr)
warpbreaks.c<-ddply(warpbreaks,~wool+tension, function(x) data.frame(count=sum(x$breaks)))
p <- ggplot(warpbreaks.c, aes(x=tension,y=count,fill=wool))+
  geom_bar(aes(fill=wool), stat='identity')+
  geom_bar(aes(fill=wool), stat='identity', colour="black",show.legend=FALSE)+
  scale_fill_grey("Wool type")+
  scale_x_discrete("Tension")+
  scale_y_continuous("Number of wool breaks", expand=c(0,0))
p +
  theme(panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    panel.border = element_blank(),
    axis.title.y=element_text(size=15, vjust=0,angle=90),
    axis.text.y=element_text(size=12),
    axis.title.x=element_text(size=15,vjust=-1),
    axis.text.x=element_text(size=10),
    axis.line = element_line(),
    legend.position=c(1,1),legend.justification=c(1,1),
    plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )
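The pre-calculated totals do not have to come from ddply. As a sketch of an alternative that needs no additional packages, the same table can be built with aggregate() from base R. The column is renamed to count so that it matches the name used in the plotting code above.

```r
# Total number of breaks for each wool x tension combination (base R only)
warpbreaks.c <- aggregate(breaks ~ wool + tension, data = warpbreaks, FUN = sum)

# Rename the summed column to match the name used in the plotting code
names(warpbreaks.c)[names(warpbreaks.c) == "breaks"] <- "count"
```

The result has one row per combination of wool and tension and can be passed to ggplot() with stat='identity' exactly as before.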

Horizontal stacked barchart

Plain horizontal stacked barchart

#based on pre-calculated counts
library(plyr)
warpbreaks.c<-ddply(warpbreaks,~wool+tension, function(x) data.frame(count=sum(x$breaks)))
ggplot(warpbreaks.c, aes(x=tension,y=count,fill=wool))+geom_bar(stat='identity')+ylab("Number of breaks") + coord_flip()

#based on pre-calculated counts
library(plyr)
warpbreaks.c<-ddply(warpbreaks,~wool+tension, function(x) data.frame(count=sum(x$breaks)))
p <- ggplot(warpbreaks.c, aes(x=tension,y=count,fill=wool))+ coord_flip()+
  geom_bar(aes(fill=wool), stat='identity')+
  geom_bar(aes(fill=wool), stat='identity', colour="black",show.legend=FALSE)+
  scale_fill_grey("Wool type")+
  scale_x_discrete("Tension")+
  scale_y_continuous("Number of wool breaks", expand=c(0.05,0))
p +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_blank(),
        axis.title.y=element_text(size=15, vjust=0,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        axis.line = element_line(),
        legend.position=c(1,1),legend.justification=c(1,1),
        plot.margin=unit(c(0.5,0.5,2,2),"lines")
  )

Interaction plots - geom_point, geom_line, geom_smooth

Interaction plot

Plain interaction plot

ggplot(ToothGrowth, aes(y=len, x=dose, colour=supp))+geom_point()+geom_smooth(method="lm")

library(scales)  # provides the comma label formatter
p <- ggplot(ToothGrowth, aes(y = len, x = dose, linetype = supp)) +
    geom_point(aes(shape = supp)) + geom_smooth(method = "lm") +
    scale_linetype_manual(name = "Supplement type", values = c(1, 2),
        breaks = c("OJ", "VC"), labels = c("Orange juice", "Vitamin C")) +
    scale_shape_manual(name = "Supplement type", values = c(21, 16),
        breaks = c("OJ", "VC"), labels = c("Orange juice", "Vitamin C")) +
    scale_x_continuous("Dose (mg)", labels = comma) +
    scale_y_continuous("Tooth length (mm)", expand = c(0.05, 0), labels = comma)
p + theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
    panel.background = element_blank(), panel.border = element_blank(),
    axis.title.y = element_text(size = 15, vjust = 0,
        angle = 90), axis.text.y = element_text(size = 12),
    axis.title.x = element_text(size = 15, vjust = -1),
    axis.text.x = element_text(size = 10), axis.line = element_line(),
    legend.position = c(1, 0), legend.justification = c(1,
        0), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))

Interaction plot with errorbars

Plain interaction plot with errorbars

ggplot(ToothGrowth, aes(y=len, x=dose, colour=supp))+geom_errorbar(stat="summary", fun.data="mean_cl_boot")+geom_point(stat="summary", fun.y="mean")+geom_line(stat="summary", fun.y="mean")

library(scales)  # provides the comma label formatter
p <- ggplot(ToothGrowth, aes(y = len, x = dose, group = supp)) +
    geom_errorbar(stat = "summary", fun.data = "mean_cl_boot", width = 0.05) +
    geom_line(aes(linetype = supp), stat = "summary", fun.y = "mean") +
    geom_point(aes(shape = supp, fill = supp), stat = "summary", fun.y = "mean") +
    scale_shape_manual(name = "Supplement type", values = c(21, 16),
        breaks = c("OJ", "VC"), labels = c("Orange juice", "Vitamin C")) +
    scale_fill_manual(name = "Supplement type", values = c("white", "black"),
        breaks = c("OJ", "VC"), labels = c("Orange juice", "Vitamin C")) +
    scale_linetype_manual(name = "Supplement type", values = c(1, 2),
        breaks = c("OJ", "VC"), labels = c("Orange juice", "Vitamin C")) +
    scale_x_continuous("Dose (mg)", labels = comma) +
    scale_y_continuous("Tooth length (mm)", expand = c(0.05, 0), labels = comma)
p + theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
    panel.background = element_blank(), panel.border = element_blank(),
    axis.title.y = element_text(size = 15, vjust = 0,
        angle = 90), axis.text.y = element_text(size = 12),
    axis.title.x = element_text(size = 15, vjust = -1),
    axis.text.x = element_text(size = 10), axis.line = element_line(),
    legend.position = c(1, 0), legend.justification = c(1,
        0), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"),
    legend.key = element_blank())
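The points joined by the lines in this plot are simply the cell means of the design. As a rough check on what the summary stat is calculating, the following base R sketch tabulates those means directly. The object names cell_means and supp_diff are hypothetical.

```r
# Mean tooth length for every dose x supplement combination (ToothGrowth data)
cell_means <- with(ToothGrowth,
                   tapply(len, list(dose = dose, supp = supp), mean))

# Difference between supplements (OJ minus VC) at each dose; differences that
# change across doses correspond to non-parallel lines in the interaction plot
supp_diff <- cell_means[, "OJ"] - cell_means[, "VC"]
print(cell_means)
print(supp_diff)
```

Each row of cell_means is a dose and each column a supplement type, so the two columns trace out the two lines of the plot.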

Interaction plot with dodge

Plain interaction plot with dodge

ggplot(ToothGrowth, aes(y=len, x=dose, colour=supp))+geom_errorbar(stat="summary", fun.data="mean_cl_boot", position=position_dodge(0.2))+geom_point(stat="summary", fun.y="mean", position=position_dodge(0.2))+geom_line(stat="summary", fun.y="mean", position=position_dodge(0.2))

library(scales)  # provides the comma label formatter
p <- ggplot(ToothGrowth, aes(y = len, x = dose, group = supp)) +
    geom_errorbar(stat = "summary", fun.data = "mean_cl_boot",
        width = 0.05, position = position_dodge(0.2)) +
    geom_line(aes(linetype = supp), stat = "summary",
        fun.y = "mean", position = position_dodge(0.2)) +
    geom_point(aes(shape = supp, fill = supp), size = 3,
        stat = "summary", fun.y = "mean", position = position_dodge(0.2)) +
    scale_shape_manual(name = "Supplement type", values = c(21, 16),
        breaks = c("OJ", "VC"), labels = c("Orange juice", "Vitamin C")) +
    scale_fill_manual(name = "Supplement type", values = c("white", "black"),
        breaks = c("OJ", "VC"), labels = c("Orange juice", "Vitamin C")) +
    scale_linetype_manual(name = "Supplement type", values = c(1, 2),
        breaks = c("OJ", "VC"), labels = c("Orange juice", "Vitamin C")) +
    scale_x_continuous("Dose (mg)", labels = comma) +
    scale_y_continuous("Tooth length (mm)", expand = c(0.05, 0), labels = comma)
p + theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
    panel.background = element_blank(), panel.border = element_blank(),
    axis.title.y = element_text(size = 15, vjust = 0,
        angle = 90), axis.text.y = element_text(size = 12),
    axis.title.x = element_text(size = 15, vjust = -1),
    axis.text.x = element_text(size = 10), axis.line = element_line(),
    legend.position = c(1, 0), legend.justification = c(1,
        0), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"),
    legend.key = element_blank())

Scatterplot matrix - ggpairs

Scatterplot matrix

Plain scatterplot matrix

library(GGally)
ggpairs(iris, mapping=aes(colour=Species))

library(GGally)
ggpairs(iris, mapping = aes(colour = Species), upper = list(continuous = "density",
    combo = "box"), diag = list(continuous = "densityDiag"),
    lower = list(continuous = "smooth"), axisLabels = "show")

2D grid of panels - facet_grid

Grid of panels

Plain grid of panels

ggplot(CO2,aes(y=uptake,x=conc))+geom_smooth()+geom_point()+facet_grid(Type~Treatment)

library(grid)
library(scales)
# Create a new instance of the dataset to facilitate more informative panel titles
CO2.a <- CO2
# Re-define the factor labels
CO2.a$Treatment <- factor(CO2.a$Treatment, labels=c(expression(paste("Non-Chilled ",
   (symbol("\076")*15 * degree * C), sep="")),
   expression(paste("Chilled ",(symbol("\074")*5 * degree * C), sep=""))))
CO2.a$Type <- factor(CO2.a$Type, labels=c("Origin:Quebec", "Origin:Mississippi"))
p<-ggplot(CO2.a,aes(y=uptake,x=conc))+
  geom_ribbon(aes(ymin=..ymin.., ymax=..ymax..),linetype=2,fill="transparent",colour="black",
    stat='smooth',method='loess')+
  geom_smooth(se=FALSE)+
  geom_point()+
  facet_grid(Type~Treatment, labeller=label_parsed)+
  scale_x_continuous(expression(paste("Ambient ",CO[2]," concentration (mg/l)", sep="")))+
  scale_y_continuous(expression(paste(CO[2]," uptake rate (",mu*mol/m^2/sec,")", sep="")))
p +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        panel.border = element_rect(fill="transparent",colour="black"),
        axis.title.y=element_text(size=15, vjust=1,angle=90),
        axis.text.y=element_text(size=12),
        axis.title.x=element_text(size=15,vjust=-1),
        axis.text.x=element_text(size=10),
        #axis.line = element_line(),
    strip.background=element_rect(fill="transparent", colour="black"),
        #legend.position=c(1,0.2),legend.justification=c(1,0),
        plot.margin=unit(c(0.5,0.5,2,2),"lines"),
    legend.key=element_blank()
  )

Multiple graphs per graphic

grid.arrange

Grid of panels

Plain grid of panels

p1 <- ggplot(CO2, aes(y = uptake, x = Treatment, fill = Type)) + geom_boxplot()
p2 <- ggplot(CO2, aes(y = uptake, x = conc, colour = Type)) + geom_smooth()
p3 <- ggplot(CO2, aes(x = uptake, fill = Type)) + geom_density(alpha = 0.4) + facet_grid(~Treatment)

library(gridExtra)
grid.arrange(p1, p2, p3, nrow = 2)

p1 <- ggplot(CO2, aes(y = uptake, x = Treatment, fill = Type)) + geom_boxplot()
p2 <- ggplot(CO2, aes(y = uptake, x = conc, colour = Type)) + geom_smooth()
p3 <- ggplot(CO2, aes(x = uptake, fill = Type)) + geom_density(alpha = 0.4) + facet_grid(~Treatment) +
    scale_x_continuous("Uptake", expand = c(0, 0)) + scale_y_continuous("Density", expand = c(0, 0))

library(gridExtra)

myTheme <- theme_classic() + theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
    panel.background = element_blank(), panel.border = element_blank(), axis.title.y = element_text(size = rel(1.2),
        vjust = 2, angle = 90), axis.text.y = element_text(size = rel(1)), axis.title.x = element_text(size = rel(1.2),
        vjust = -1), axis.text.x = element_text(size = rel(0.8)), axis.line = element_line(), legend.position = c(0.5,
        1.4), legend.justification = c(0.5), legend.direction = "horizontal", legend.text = element_text(size = rel(0.25)),
    legend.title = element_text(size = rel(0.5)), legend.key.size = unit(1, "lines"), plot.margin = unit(c(0.5,
        0.5, 2, 2), "lines"), legend.key = element_blank(), strip.background = element_blank())

noLegend <- theme(legend.position = "none")
p1 <- p1 + myTheme + noLegend
p2 <- p2 + myTheme + noLegend

grid.newpage()
grid.draw(arrangeGrob(arrangeGrob(p1, p2, nrow = 1), p3 + myTheme))
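Nesting one arrangeGrob inside another is one way to let the third graph span the full width. A sketch of an alternative is to describe the arrangement with the layout_matrix argument of arrangeGrob (also accepted by grid.arrange). The object names lay and g are hypothetical, and the graphs are the unstyled versions for brevity.

```r
library(ggplot2)
library(gridExtra)

p1 <- ggplot(CO2, aes(y = uptake, x = Treatment, fill = Type)) + geom_boxplot()
p2 <- ggplot(CO2, aes(y = uptake, x = conc, colour = Type)) + geom_smooth()
p3 <- ggplot(CO2, aes(x = uptake, fill = Type)) + geom_density(alpha = 0.4) +
  facet_grid(~Treatment)

# Each number is the index of a plot; repeating 3 lets p3 span the bottom row
lay <- rbind(c(1, 2),
             c(3, 3))
g <- arrangeGrob(p1, p2, p3, layout_matrix = lay)

# Draw the arranged graphic on a fresh page
grid::grid.newpage()
grid::grid.draw(g)
```

The matrix reads like a picture of the page, which I find easier to adjust than nested calls once there are more than three graphs.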

Viewports

Grid of panels

Plain grid of panels

p1 <- ggplot(CO2, aes(y = uptake, x = Treatment, fill = Type)) + geom_boxplot()
p2 <- ggplot(CO2, aes(y = uptake, x = conc, colour = Type)) + geom_smooth()
p3 <- ggplot(CO2, aes(x = uptake, fill = Type)) + geom_density(alpha = 0.4) + facet_grid(~Treatment)

grid.newpage()
pushViewport(viewport(layout = grid.layout(4, 5)))

pushViewport(viewport(layout.pos.col = 1:2, layout.pos.row = 1:2))
print(p1 + theme(legend.position = "none"), newpage = FALSE)
popViewport(1)

pushViewport(viewport(layout.pos.col = 3:4, layout.pos.row = 1:2))
print(p2 + theme(legend.position = "none"), newpage = FALSE)
popViewport(1)

pushViewport(viewport(layout.pos.col = 1:4, layout.pos.row = 3:4))
print(p3 + theme(legend.position = "none"), newpage = FALSE)
popViewport(1)

library(gridExtra)
tmp <- ggplot_gtable(ggplot_build(p3))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
# legend <- gTree(legend, vp=viewport(layout.pos.col=5, layout.pos.row=1:4))
# Using grid.arrange for convenience (could also manually push viewports):
# grid.arrange(arrangeGrob(p3 + theme(legend.position='none'),
#     p2 + theme(legend.position='none'), main='this is a title',
#     left='This is my global Y-axis title'), legend,
#     widths=unit.c(unit(1, 'npc') - legend$width, legend$width), nrow=1)

pushViewport(viewport(layout.pos.col = 5, layout.pos.row = 1:4))
# print(p3+theme(keep='legend_box'), newpage=FALSE)
grid.draw(legend)
popViewport(0)
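The push, print and pop steps above are repeated for every graph. A minimal sketch of wrapping them in a helper is given below. The function name print_in_cell is hypothetical, and a 2 x 2 layout is used here only to keep the example short.

```r
library(ggplot2)
library(grid)

# Hypothetical helper: print a ggplot into the given rows and columns of the
# layout of the current viewport, then return to the parent viewport
print_in_cell <- function(plot, rows, cols) {
  pushViewport(viewport(layout.pos.row = rows, layout.pos.col = cols))
  print(plot, newpage = FALSE)
  popViewport(1)
  invisible(NULL)
}

p1 <- ggplot(CO2, aes(y = uptake, x = Treatment, fill = Type)) + geom_boxplot()
p2 <- ggplot(CO2, aes(y = uptake, x = conc, colour = Type)) + geom_point()
p3 <- ggplot(CO2, aes(x = uptake, fill = Type)) + geom_density(alpha = 0.4)

# Two graphs on the top row and one graph spanning the bottom row
grid.newpage()
pushViewport(viewport(layout = grid.layout(2, 2)))
print_in_cell(p1, rows = 1, cols = 1)
print_in_cell(p2, rows = 1, cols = 2)
print_in_cell(p3, rows = 2, cols = 1:2)
popViewport(0)  # pop all remaining viewports, back to the top level
```

Because the helper always pops exactly the one viewport it pushed, the layout viewport stays current between calls and the cells can be filled in any order.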

#Create a new plain element
element_plain <- function (base_size = 12,base_family=""){ #
  structure(list(
    axis.line = element_line(),
    #axis.text.x = element_text(family = base_family, size = base_size * 0.8, vjust = 1, lineheight = 0.9), 
    #axis.text.y = element_text(family = base_family, size = base_size * 0.8, hjust = 1, lineheight = 0.9),
    #axis.ticks = element_line(colour = "black", size = 0.2), 
    #axis.title.x = element_text(family = base_family, size = base_size, vjust = 0.5), 
    #axis.title.y = element_text(family = base_family, size = base_size, vjust = 0.5, angle = 90), 
    #axis.ticks.length = unit(0.15, "cm"), 
    #axis.ticks.margin = unit(0.1, "cm"), 
    #legend.background=element_blank(),
    #legend.margin=unit(0.2,"cm"),
    #legend.key = element_rect(colour = "grey80"),
    #legend.key.size=unit(1.2,"lines"),
    #legend.key.height=NULL,
    #legend.key.width=NULL,
    #legend.text=element_text(family = base_family, size = base_size * 0.8),
    #legend.text.align=NULL,
    #legend.title=element_text(family = base_family, face = "bold", size = base_size * 0.8, hjust = 0),
    #legend.title.align=NULL,
    #legend.position = "right",
    #legend.direction=NULL,
    #legend.justification="center",
    #legend.box=NULL,
    #panel.background = element_blank(), 
    #panel.border = element_blank(), 
    #panel.grid.major = element_blank(), 
    #panel.grid.minor = element_blank(), 
    #panel.margin = unit(0.25, "lines"),
    #strip.background=element_rect(fill="transparent", colour="black"),
    #strip.text.x=element_text(family = base_family, size = base_size * 0.8),
    #strip.text.y=element_text(family = base_family, size = base_size * 0.8, angle = -90),
    #plot.background = element_blank(),
    plot.margin=unit(c(0.5,0.5,1,1),"lines"),
    plot.title=element_blank()
  ), class = "theme")
}

# Redefine element_plain as a theme object (this overwrites the function above)
element_plain <- theme(text=element_text(size = 12,family=""),
    axis.line = element_line(),
    plot.margin=unit(c(0.5,0.5,1,1),"lines"),
    plot.title=element_blank()
)

#Construct the boxplots
p1 <- ggplot(CO2, aes(y=uptake,x=Treatment,fill=Type))+
  geom_boxplot(alpha=0.4)+scale_fill_manual(values=c("white","grey"))+
  element_plain

p2 <- ggplot(CO2, aes(y=uptake, x=conc, linetype=Type))+
 geom_smooth(color="black",se=FALSE)+
 geom_smooth(color="black", show.legend=FALSE)+
 scale_linetype()+
 element_plain

p3<-ggplot(CO2, aes(x=uptake, fill=Type)) +
 geom_density(alpha=0.4, colour=NA)+
 geom_density(alpha=0, show.legend=FALSE)+
 facet_grid(~Treatment)+
 scale_fill_manual(values=c("white","grey"))+
 element_plain

library(gridExtra)
tmp <- ggplot_gtable(ggplot_build(p3))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]

tmp <- ggplot_gtable(ggplot_build(p2))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend1 <- tmp$grobs[[leg]]
##legend <- gTree(legend, vp=viewport(layout.pos.col=5, layout.pos.row=1:4))

 grid.newpage()
  pushViewport(viewport(layout=grid.layout(4,5)))

  pushViewport(viewport(layout.pos.col=1:2, layout.pos.row=1:2))
  print(p1+theme(legend.position="none"), newpage=FALSE)
  popViewport(1)

  pushViewport(viewport(layout.pos.col=3:4, layout.pos.row=1:2))
  print(p2+theme(legend.position="none"), newpage=FALSE)
  popViewport(1)

  pushViewport(viewport(layout.pos.col=1:4, layout.pos.row=3:4))
  print(p3+theme(legend.position="none"), newpage=FALSE)
  popViewport(1)

pushViewport(viewport(layout.pos.col=5, layout.pos.row=1:4))
grid.draw(legend)
grid.draw(legend1)
  popViewport(0)

Welcome to the end of this tutorial
