Workshop 9.9a - Split-plot and complex repeated measures designs (mixed effects models)

4 August 2011

Topics

Randomized block
- Exercise 2

Basic statistics references

Logan (2010) - Chpt 12-14
Quinn & Keough (2002) - Chpt 9-11

Very basic overview of Nested ANOVA

Very basic overview of Randomized block

Split-plot

In an attempt to understand the effects on marine animals of short-term exposure to toxic substances, such as might occur following a spill, or a major increase in storm water flows, a it was decided to examine the toxicant in question, Copper, as part of a field experiment in Honk Kong. The experiment consisted of small sources of Cu (small, hemispherical plaster blocks, impregnated with copper), which released the metal into sea water over 4 or 5 days. The organism whose response to Cu was being measured was a small, polychaete worm, Hydroides, that attaches to hard surfaces in the sea, and is one of the first species to colonize any surface that is submerged. The biological questions focused on whether the timing of exposure to Cu affects the overall abundance of these worms. The time period of interest was the first or second week after a surface being available.

The experimental setup consisted of sheets of black perspex (settlement plates), which provided good surfaces for these worms. Each plate had a plaster block bolted to its centre, and the dissolving block would create a gradient of [Cu] across the plate. Over the two weeks of the experiment, a given plate would have pl ain plaster blocks (Control) or a block containing copper in the first week, followed by a plain block, or a plain block in the first week, followed by a dose of copper in the second week. After two weeks in the water, plates were removed and counted back in the laboratory. Without a clear idea of how sensitive these worms are to copper, an effect of the treatments might show up as an overall difference in the density of worms across a plate, or it could show up as a gradient in abundance across the plate, with a different gradient in different treatments. Therefore, on each plate, the density of worms (#/cm²) was recorded at each of four distances from the center of the plate.

Download Copper data set

Format of copper.csv data file

COPPER	PLATE	DIST	WORMS
..	..	..	..

COPPER	Categorical listing of the copper treatment (control = no copper applied, week 2 = copper treatment applied in second week and week 1= copper treatment applied in first week) applied to whole plates. Factor A (between plot factor).
PLATE	Substrate provided for polychaete worm colonization on which copper treatment applied. These are the plots (Factor B). Numbers in this column represent numerical labels given to each plate.
DIST	Categorical listing for the four concentric distances from the center of the plate (source of copper treatment) with 1 being the closest and 4 the furthest. Factor C (within plot factor)
WORMS	Density (#/cm₂) of worms measured. Response variable.

Open the Copper data set

Show code

> copper <- read.table("../downloads/data/copper.csv", header = T,
+     sep = ",", strip.white = T)
> head(copper)

   COPPER PLATE DIST WORMS
1 control   200    4 11.50
2 control   200    3 13.00
3 control   200    2 13.50
4 control   200    1 12.00
5 control    39    4 17.75
6 control    39    3 13.75

Notice that both the PLATE variable and the DIST variable contain only numbers. Make sure that you define both of these as factors (HINT)

Show code

> copper$PLATE <- factor(copper$PLATE)
> copper$DIST <- factor(copper$DIST)

Q3-1. What are the main hypotheses being tested?

H₀ Main Effect 1 (Factor A):
H₀ Main Effect 2 (Factor C):
H₀ Main Effect 3 (A*C):

Q3-2. The usual ANOVA assumptions apply to split-plot designs, and these can be tested by constructing boxplots for each of the main effects. However, it is important to consider what data the boxplots should be based upon. For each of the main hypothesis tests, describe what data should be used to construct boxplots (remember that the assumptions of normality and homogeneity of variance apply to the residuals of each hypothesis test, and therefore the data used in the construction of boxplots for each hypothesis test should represent the respective residuals, or sources of unexplained variation).

H₀ Main Effect 1 (Factor A):
H₀ Main Effect 2 (Factor C):
H₀ Main Effect 3 (A*C):

Q3-3. For each of the hypothesis tests, indicate which Mean Square term should be used as the residual (denominator) in the F-ratio calculation. Note, COPPER and DIST are fixed factors and PLATE is a random factor.

H₀ Main Effect 1 (Factor A): F-ratio = MS_COPPER/MS
H₀ Main Effect 2 (Factor C): F-ratio = MS_DIST/MS
H₀ Main Effect 3 (A*C): F-ratio = MS_COPPER:DIST/MS

Q3-4. Construct a boxplot to investigate the assumptions as they apply to the test of H₀ Main Effect 1 (Factor A): This is done in two steps

Aggregate the data set by the mean number of WORMS within each plate
(HINT)

Show code

> library(plyr)
> cu.ag <- ddply(copper, ~COPPER + PLATE, function(df) data.frame(WORMS = mean(df$WORMS)))
> #OR
> library(plyr)
> cu.ag <- ddply(copper, ~COPPER + PLATE, summarise, WORMS = mean(WORMS,
+     na.rm = TRUE))
> #OR
> cu.ag <- aggregate(copper, list(COPPER = copper$COPPER, PLATE = copper$PLATE),
+     mean)
> #OR
> library(nlme)
> cu.ag <- gsummary(copper, form = ~COPPER/PLATE, base:::mean)

Construct a boxplot of aggregated mean number of WORMS against COPPER treatment
(HINT)

Show code

> boxplot(WORMS ~ COPPER, data = cu.ag)

Any evidence of violations of the assumptions (y or n)?

Q3-5. Construct a boxplot to investigate the assumptions as they apply to the test of H₀ Main Effect 2 (Factor C): Since Factor C is tested against the overal residual in this case, this is a relatively straight forward procedure. (HINT)

Show code

> boxplot(WORMS ~ DIST, data = copper)

Any evidence of violations of the assumptions (y or n)?

Q3-6. Construct a boxplot to investigate the assumptions as they apply to the test of H₀ the main interaction effect (A:C): Since A:C is tested against the overal residual, this is a relatively straight forward procedure.(HINT)

Show code

> boxplot(WORMS ~ COPPER:DIST, data = copper)

Any evidence of violations of the assumptions (y or n)?

Q3-7. In addition to the above assumptions, the test of PLATE assumes that there is no PLATE by DIST interaction as this is the overal residual (the replicates). That is, the test assumes that the effect of DIST is consistent in all PLATES. Construct an interaction plot to examine whether there is any evidence of an interaction between PLATE and DISTANCE (HINT)

Show code

> library(car)
> with(copper, interaction.plot(PLATE, DIST, WORMS))

> library(alr3)
> tukey.nonadd.test(lm(WORMS ~ PLATE + DIST, data = copper))

      Test     Pvalue 
-4.122e+00  3.749e-05

Any evidence of an interaction (y or n)?

Is the design (model) unbalanced ?

(HINT) (Yor N)

Show code

> !is.list(replications(WORMS ~ COPPER + Error(PLATE) + DIST + COPPER:DIST,
+     data = copper))

[1] TRUE

Is there any evidence of issues with sphericity
?(HINT) (Y or N)

Show code

> library(biology)
> epsi.GG.HF(aov(WORMS ~ COPPER + Error(PLATE) + DIST + COPPER:DIST,
+     data = copper))

$GG.eps
[1] 0.5203

$HF.eps
[1] 0.5841

$sigma
[1] 1.807

Q3-8. Write out the linear model

Q3-9. Perform a split-plot ANOVA

(HINT), and complete the following table (HINT). To obtain the hypothesis test for the random factor (Factor B: PLATE), run the model as if all factors were fixed and thus all terms are tested against the overall residuals, HINT)

Show code

> copper.aov <- aov(WORMS ~ COPPER + DIST + COPPER:DIST + Error(PLATE),
+     data = copper)
> AnovaM(copper.aov, RM = T)

   Sphericity Epsilon Values 
-------------------------------
 Greenhouse.Geisser Huynh.Feldt
             0.5203      0.5841


Anova Table (Type I tests)
Response: WORMS
Error: PLATE
          Df Sum Sq Mean Sq F value  Pr(>F)    
COPPER     2    784     392     128 8.1e-09 ***
Residuals 12     37       3                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
            Df Sum Sq Mean Sq F value  Pr(>F)    
DIST         3  153.8    51.3   28.38 1.4e-09 ***
COPPER:DIST  6   53.4     8.9    4.93   9e-04 ***
Residuals   36   65.0     1.8                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 


Greenhouse-Geisser corrected ANOVA table
Response: WORMS
Error: PLATE
             Df Sum Sq Mean Sq F value Pr(>F)    
COPPER     1.04    784     392     128  8e-08 ***
Residuals 12.00     37       3                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
               Df Sum Sq Mean Sq F value  Pr(>F)    
DIST         1.56  153.8    51.3   28.38 2.7e-07 ***
COPPER:DIST  3.12   53.4     8.9    4.93  0.0052 ** 
Residuals   36.00   65.0     1.8                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 


Huynh-Feldt corrected ANOVA table
Response: WORMS
Error: PLATE
             Df Sum Sq Mean Sq F value  Pr(>F)    
COPPER     1.17    784     392     128 5.3e-08 ***
Residuals 12.00     37       3                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
               Df Sum Sq Mean Sq F value  Pr(>F)    
DIST         1.75  153.8    51.3   28.38 1.1e-07 ***
COPPER:DIST  3.50   53.4     8.9    4.93   0.004 ** 
Residuals   36.00   65.0     1.8                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Source of variation	df	Mean Sq	F-ratio	P-value
COPPER
PLATE
DIST
COPPER:DIST
Residuals

Q3-10.Construct an interaction plot

showing the density of worms against distance from Cu source, with each treatment as different lines (or different bars). HINT

Show code

> interaction.plot(copper$DIST, copper$COPPER, copper$WORMS)

Q3-11. What conclusions would you draw from the analysis (and graph)?

Q3-12. In order to tease out the interaction, we could analyse the effects of the main factor (COPPER) separately for each Distance and/or investigate the effects of Distance separately for each of the three copper treatments. Recall that when performing such simple main effects, it is necessary to use the residual terms from the global analyses as these are likely to be better estimates (as they are based on a larger amount of data).

Investigate the effects of copper separately for each of the four distances HINT

Show code

> AnovaM(mainEffects(copper.aov, at = DIST == 1), type = "III")

Error: PLATE
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        2    784     392     128 8.1e-09 ***
Residuals 12     37       3                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        9    207   23.02    12.7 8.4e-09 ***
Residuals 36     65    1.81                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> AnovaM(mainEffects(copper.aov, at = DIST == 2), type = "III")

Error: PLATE
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        2    784     392     128 8.1e-09 ***
Residuals 12     37       3                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        9    207   23.02    12.7 8.4e-09 ***
Residuals 36     65    1.81                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> AnovaM(mainEffects(copper.aov, at = DIST == 3), type = "III")

Error: PLATE
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        2    784     392     128 8.1e-09 ***
Residuals 12     37       3                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        9    207   23.02    12.7 8.4e-09 ***
Residuals 36     65    1.81                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What conclusions would you draw from these four analyses?

Investigate the effects of distance separately for each of the three copper treatments HINT

Show code

> AnovaM(mainEffects(copper.aov, at = COPPER == "control"), type = "III")

Error: PLATE
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        2    784     392     128 8.1e-09 ***
Residuals 12     37       3                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        6  188.6   31.43   17.40 2.5e-09 ***
DIST       3   18.6    6.21    3.44   0.027 *  
Residuals 36   65.0    1.81                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> AnovaM(mainEffects(copper.aov, at = COPPER == "Week 1"), type = "III")

Error: PLATE
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        2    784     392     128 8.1e-09 ***
Residuals 12     37       3                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        6  188.1   31.34   17.35 2.6e-09 ***
DIST       3   19.2    6.39    3.54   0.024 *  
Residuals 36   65.0    1.81                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> AnovaM(mainEffects(copper.aov, at = COPPER == "Week 2"), type = "III")

Error: PLATE
          Df Sum Sq Mean Sq F value  Pr(>F)    
INT        2    784     392     128 8.1e-09 ***
Residuals 12     37       3                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
          Df Sum Sq Mean Sq F value Pr(>F)    
INT        6   37.8     6.3    3.49 0.0081 ** 
DIST       3  169.4    56.5   31.26  4e-10 ***
Residuals 36   65.0     1.8                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What conclusions would you draw from these four analyses?

Mixed models

Show code

> copper.lme <- lme(WORMS ~ COPPER * DIST, random = ~1 | PLATE, data = copper,
+     method = "ML")
> copper.lme1 <- lme(WORMS ~ COPPER * DIST, random = ~1 | PLATE, data = copper,
+     method = "ML", correlation = corCompSymm(form = ~1 | PLATE))
> copper.lme2 <- lme(WORMS ~ COPPER * DIST, random = ~1 | PLATE, data = copper,
+     method = "ML", correlation = corAR1(form = ~1 | PLATE))
> anova(copper.lme, copper.lme1)

            Model df   AIC   BIC logLik   Test   L.Ratio p-value
copper.lme      1 14 228.3 257.6 -100.1                         
copper.lme1     2 15 230.3 261.7 -100.1 1 vs 2 1.025e-10       1

> #Check whether a first order autoregressive correlation structure is more
> #   appropriate
> copper.lme2 <- update(copper.lme, correlation = corAR1(form = ~1 |
+     PLATE))
> anova(copper.lme, copper.lme2)

            Model df   AIC   BIC  logLik   Test L.Ratio p-value
copper.lme      1 14 228.3 257.6 -100.14                       
copper.lme2     2 15 222.9 254.3  -96.47 1 vs 2   7.352  0.0067

> # Now fit the 'best'' model with REML
> copper.lme3 <- update(copper.lme2, method = "REML")
> anova.lme(copper.lme3)

            numDF denDF F-value p-value
(Intercept)     1    36   980.6  <.0001
COPPER          2    12    93.1  <.0001
DIST            3    36    25.1  <.0001
COPPER:DIST     6    36     4.3  0.0025

Show code

> detach(package:nlme)
> copper.lmer <- lmer(WORMS ~ COPPER * DIST + (1 | PLATE), data = copper)
> #Examine the fixed effects coefficients
> summary(copper.lmer)@coefs

                   Estimate Std. Error   t value
(Intercept)           10.85     0.6512  16.66170
COPPERWeek 1          -3.60     0.9209  -3.90910
COPPERWeek 2         -10.60     0.9209 -11.51013
DIST2                  1.15     0.8501   1.35275
DIST3                  1.55     0.8501   1.82327
DIST4                  2.70     0.8501   3.17601
COPPERWeek 1:DIST2    -0.05     1.2023  -0.04159
COPPERWeek 2:DIST2     0.05     1.2023   0.04159
COPPERWeek 1:DIST3    -0.30     1.2023  -0.24953
COPPERWeek 2:DIST3     2.20     1.2023   1.82990
COPPERWeek 1:DIST4     0.05     1.2023   0.04159
COPPERWeek 2:DIST4     4.90     1.2023   4.07568

> #OR
> #library(gmodels)
> #ci(copper.lmer)
> library(languageR)
> pvals.fnc(copper.lmer, withMCMC = FALSE, addPlot = F)

$fixed
                   Estimate MCMCmean HPD95lower HPD95upper  pMCMC Pr(>|t|)
(Intercept)           10.85  10.8445     9.5503     12.152 0.0001   0.0000
COPPERWeek 1          -3.60  -3.5913    -5.4902     -1.794 0.0006   0.0003
COPPERWeek 2         -10.60 -10.5975   -12.4019     -8.745 0.0001   0.0000
DIST2                  1.15   1.1658    -0.5512      2.997 0.1948   0.1825
DIST3                  1.55   1.5584    -0.3118      3.309 0.0950   0.0745
DIST4                  2.70   2.7097     0.8658      4.457 0.0048   0.0026
COPPERWeek 1:DIST2    -0.05  -0.0723    -2.5778      2.417 0.9516   0.9670
COPPERWeek 2:DIST2     0.05   0.0402    -2.5969      2.468 0.9698   0.9670
COPPERWeek 1:DIST3    -0.30  -0.3208    -2.8322      2.186 0.7888   0.8040
COPPERWeek 2:DIST3     2.20   2.1924    -0.3846      4.715 0.0902   0.0735
COPPERWeek 1:DIST4     0.05   0.0372    -2.5428      2.598 0.9712   0.9670
COPPERWeek 2:DIST4     4.90   4.8815     2.2577      7.404 0.0002   0.0002

$random
    Groups        Name Std.Dev. MCMCmedian MCMCmean HPD95lower HPD95upper
1    PLATE (Intercept)   0.5599     0.2525   0.2804      0.000     0.7233
2 Residual               1.3442     1.4191   1.4341      1.149     1.7584

Broad interpretation: In general, compared to the number of worms settling on the control plates, the number of worms settling was less in the Week 1 treatment and even lower in the Week 2 treatment. The increases in number of settled worms in distances progressively further from the source (Dist 1) get progressively larger and the trend between DIST1 (base level) and DIST 4 is significantly greater for the Week 2 treatment compared to the control copper treatment.

Examine the model fit measures

Show code

> summary(copper.lmer)@AICtab

   AIC   BIC logLik deviance REMLdev
 218.3 247.6 -95.13    200.3   190.3

Examine the variance components

Show code

> data.frame(summary(copper.lmer)@REmat)

    Groups        Name Variance Std.Dev.
1    PLATE (Intercept)    0.314     0.56
2 Residual                1.807     1.34

Examine the predicted means and confidence intervals

Show code

> newdata <- expand.grid(COPPER = levels(copper$COPPER), DIST = levels(copper$DIST),
+     WORMS = 0)
> mod.mat <- model.matrix(terms(copper.lmer), newdata)
> newdata$WORMS = mod.mat %*% fixef(copper.lmer)

Error: no applicable method for 'fixef' applied to an object of class "mer"

> pvar1 <- diag(mod.mat %*% tcrossprod(vcov(copper.lmer), mod.mat))
> tvar1 <- pvar1 + VarCorr(copper.lmer)$PLATE[1]

Error: no applicable method for 'VarCorr' applied to an object of class "mer"

> newdata <- data.frame(newdata, plo = newdata$WORMS - 2 * sqrt(pvar1),
+     phi = newdata$WORMS + 2 * sqrt(pvar1), tlo = newdata$WORMS - 2 * sqrt(tvar1),
+     thi = newdata$WORMS + 2 * sqrt(tvar1))

Error: object 'tvar1' not found

> par(mar = c(5, 5, 1, 1))
> plot(WORMS ~ as.numeric(DIST), newdata, type = "n", axes = F, ann = F,
+     ylim = range(newdata$plo, newdata$phi), xlim = c(0.5, 4.5))

Error: need finite 'ylim' values

> ##Control
> with(subset(newdata, COPPER == "control"), arrows(as.numeric(DIST),
+     plo, as.numeric(DIST), phi, ang = 90, length = 0.05, code = 3))

Error: object 'plo' not found

> with(subset(newdata, COPPER == "control"), lines(as.numeric(DIST),
+     WORMS, lty = 1))
> with(subset(newdata, COPPER == "control"), points(as.numeric(DIST),
+     WORMS, pch = 21, bg = "black", col = "black"))
> ##Week 1
> with(subset(newdata, COPPER == "Week 1"), arrows(as.numeric(DIST),
+     plo, as.numeric(DIST), phi, ang = 90, length = 0.05, code = 3))

Error: object 'plo' not found

> with(subset(newdata, COPPER == "Week 1"), lines(as.numeric(DIST),
+     WORMS, lty = 2))
> with(subset(newdata, COPPER == "Week 1"), points(as.numeric(DIST),
+     WORMS, pch = 21, bg = "grey", col = "black"))
> ##Week 2
> with(subset(newdata, COPPER == "Week 2"), arrows(as.numeric(DIST),
+     plo, as.numeric(DIST), phi, ang = 90, length = 0.05, code = 3))

Error: object 'plo' not found

> with(subset(newdata, COPPER == "Week 2"), lines(as.numeric(DIST),
+     WORMS, lty = 3))
> with(subset(newdata, COPPER == "Week 2"), points(as.numeric(DIST),
+     WORMS, pch = 21, bg = "white", col = "black"))
> 
> axis(1, at = as.numeric(newdata$DIST), lab = newdata$DIST)
> axis(2, las = 1)
> mtext("Number of settled worms", 2, line = 3, cex = 1.25)
> mtext("Distance from Copper", 1, line = 3, cex = 1.25)
> legend("bottomright", legend = c("Control", "Week 1", "Week 2"),
+     pch = c(21, 21, 21), pt.bg = c("black", "grey", "white"), lty = c(1, 2,
+         3), bty = "n", title = "Treatment")
> box(bty = "l")

Show code

> library(AICcmodavg)
> data.frame(newdata, predictSE.mer(copper.lmer, newdata, print.matrix = T))

Error: no applicable method for 'fixef' applied to an object of class "mer"

Now modeled against a Poisson distribution with polynomial contrasts for the distance variable.

Show code

> contrasts(copper$DIST) <- contr.poly
> copper.lmer <- lmer(WORMS ~ COPPER * DIST + (1 | PLATE), data = copper,
+     family = "poisson")

Examine the fixed effects coefficients

Show code

> summary(copper.lmer)@coefs

                     Estimate Std. Error  z value  Pr(>|z|)
(Intercept)          2.498289    0.06422 38.90153 0.000e+00
COPPERWeek 1        -0.361811    0.10033 -3.60622 3.107e-04
COPPERWeek 2        -1.890271    0.25974 -7.27768 3.396e-13
DIST.L               0.156402    0.12875  1.21476 2.245e-01
DIST.Q              -0.006025    0.12844 -0.04691 9.626e-01
DIST.C               0.027693    0.12813  0.21613 8.289e-01
COPPERWeek 1:DIST.L  0.063304    0.20091  0.31509 7.527e-01
COPPERWeek 2:DIST.L  2.382689    0.63043  3.77948 1.572e-04
COPPERWeek 1:DIST.Q  0.016657    0.20066  0.08301 9.338e-01
COPPERWeek 2:DIST.Q -0.535793    0.51947 -1.03142 3.023e-01
COPPERWeek 1:DIST.C  0.032273    0.20041  0.16103 8.721e-01
COPPERWeek 2:DIST.C  0.062334    0.37717  0.16527 8.687e-01

> #OR
> #library(gmodels)
> #ci(copper.lmer)

Interpretation: Compared to the number of worms settling on control plates, the number of settling worms is less in Week 1 and even lower in the Week 2 copper treatment. Settlement increases with increasing distance from the copper source and this rate of increase is over twice the rate (significantly greater) in the Week 2 treatment.

Show code - ezAnova

> library(ez)
> ezANOVA(dv = .(WORMS), between = .(COPPER), within = .(DIST), wid = .(PLATE),
+     data = copper, type = 3)

$ANOVA
       Effect DFn DFd       F         p p<.05    ges
2      COPPER   2  12 128.021 8.051e-09     * 0.8851
3        DIST   3  36  28.375 1.356e-09     * 0.6018
4 COPPER:DIST   6  36   4.928 9.028e-04     * 0.3442

$`Mauchly's Test for Sphericity`
       Effect      W        p p<.05
3        DIST 0.2311 0.008005     *
4 COPPER:DIST 0.2311 0.008005     *

$`Sphericity Corrections`
       Effect    GGe     p[GG] p[GG]<.05    HFe     p[HF] p[HF]<.05
3        DIST 0.5203 6.356e-06         * 0.5841 2.048e-06         *
4 COPPER:DIST 0.5203 1.022e-02         * 0.5841 7.346e-03         *

Repeated Measures

In an honours thesis from (1992), Mullens was investigating the ways that cane toads ( Bufo marinus ) respond to conditions of hypoxia. Toads show two different kinds of breathing patterns, lung or buccal, requiring them to be treated separately in the experiment. Her aim was to expose toads to a range of O₂ concentrations, and record their breathing patterns, including parameters such as the expired volume for individual breaths. It was desirable to have around 8 replicates to compare the responses of the two breathing types, and the complication is that animals are expensive, and different individuals are likely to have different O₂ profiles (leading to possibly reduced power). There are two main design options for this experiment;

One animal per O₂ treatment, 8 concentrations, 2 breathing types. With 8 replicates the experiment would require 128 animals, but that this could be analysed as a completely randomized design
One O₂ profile per animal, so that each animal would be used 8 times and only 16 animals are required (8 lung and 8 buccal breathers)

Mullens decided to use the second option so as to reduce the number of animals required (on financial and ethical grounds). By selecting this option, she did not have a set of independent measurements for each oxygen concentration, by repeated measurements on each animal across the 8 oxygen concentrations.

Download Mullens data set

Format of mullens.csv data file

BREATH	TOAD	O2LEVEL	FREQBUC	SFREQBUC
lung	a	0	10.6	3.256
lung	a	5	18.8	4.336
lung	a	10	17.4	4.171
lung	a	15	16.6	4.074
...	...	...	...	...

BREATH	Categorical listing of the breathing type treatment (buccal = buccal breathing toads, lung = lung breathing toads). This is the between subjects (plots) effect and applies to the whole toads (since a single toad can only be one breathing type - either lung or buccal). Equivalent to Factor A (between plots effect) in a split-plot design
TOAD	These are the subjects (equivalent to the plots in a split-plot design: Factor B). The letters in this variable represent the labels given to each individual toad.
O2LEVEL	0 through to 50 represent the the different oxygen concentrations (0% to 50%). The different oxygen concentrations are equivalent to the within plot effects in a split-plot (Factor C).
FREQBUC	The frequency of buccal breathing - the response variable
SFREQBUC	Square root transformed frequency of buccal breathing - the response variable

Open the mullens data file. HINT.

Show code

> mullens <- read.table("../downloads/data/mullens.csv", header = T,
+     sep = ",", strip.white = T)
> head(mullens)

  BREATH TOAD O2LEVEL FREQBUC SFREQBUC
1   lung    a       0    10.6    3.256
2   lung    a       5    18.8    4.336
3   lung    a      10    17.4    4.171
4   lung    a      15    16.6    4.074
5   lung    a      20     9.4    3.066
6   lung    a      30    11.4    3.376

Notice that both the O2LEVEL variable contains only numbers. Make sure that you define both of this as a factors (HINT)

Show code

> mullens$O2LEVEL <- factor(mullens$O2LEVEL)

Q4-1. What are the main hypotheses being tested?

H₀ Main Effect 1 (Factor A):
H₀ Main Effect 2 (Factor C):
H₀ Main Effect 3 (A*C):

Q4-2. We will now address all the assumptions.
Construct a boxplot to investigate the assumptions as they apply to the test of H₀ Main Effect 1 (Factor A): This is done in two steps

Construct a boxplot to investigate the assumptions as they apply to the test of H₀ Main Effect 1 (Factor A): Start by aggregating the data set by TOAD
(since the toads are the replicates for the between subject effect - BREATH) (HINT). Then Construct a boxplot of aggregated mean FREQBUC against BREATH treatment
(HINT). Any evidence of violations of the assumptions (y or n)?
Show code
> library(plyr) > mullens.ag <- ddply(mullens, ~BREATH + TOAD, function(df) data.frame(FREQBUC = mean(df$FREQBUC))) > boxplot(FREQBUC ~ BREATH, mullens.ag)
Try a square-root transformation!

Show code

> library(plyr)
> mullens.ag <- ddply(mullens, ~BREATH + TOAD, function(df) data.frame(SFREQBUC = mean(sqrt(df$FREQBUC))))
> boxplot(SFREQBUC ~ BREATH, mullens.ag)

Construct a boxplot to investigate the assumptions as they apply to the test of H₀ Main Effect 2 (Factor C): Since Factor C is tested against the overall residual in this case, this is a relatively straight forward procedure.(HINT). Any evidence of violations of the assumptions (y or n)?

Show code

> boxplot(SFREQBUC ~ O2LEVEL, mullens)

Construct a boxplot to investigate the assumptions as they apply to the test of H₀ the main interaction effect (A:C): Since A:C is tested against the overal residual, this is a relatively straight forward procedure.(HINT). Any evidence of violations of the assumptions (y or n)?

Show code

> boxplot(SFREQBUC ~ BREATH * O2LEVEL, mullens)

In addition to the above assumptions, the test of TOAD assumes that there is no TOAD by O2LEVEL interaction as this is the overal residual (the replicates). That is, the test assumes that the effect of O2LEVEL is consistent in all TOADS. Construct an interaction plot to examine whether there is any evidence of an interaction between TOAD and O2LEVEL (HINT). Any evidence of an interaction (y or n)?

Show code

> with(mullens, interaction.plot(TOAD, O2LEVEL, SFREQBUC))

> library(alr3)
> tukey.nonadd.test(lm(SFREQBUC ~ TOAD + O2LEVEL, data = mullens))

    Test   Pvalue 
3.079699 0.002072

You must also check to see whether the proposed model balanced.

Show code

> !is.list(replications(SFREQBUC ~ BREATH + Error(TOAD) + O2LEVEL +
+     BREATH:O2LEVEL, data = mullens))

[1] FALSE

Is it? (Yor N)

Finally, since the design is a repeated measures, there is a good chance that the assumption of sphericity

has been violated balanced.

Show code

> library(biology)
> epsi.GG.HF(aov(SFREQBUC ~ BREATH + O2LEVEL + BREATH:O2LEVEL + Error(TOAD),
+     data = mullens))

$GG.eps
[1] 0.4282

$HF.eps
[1] 0.5173

$sigma
[1] 0.7531

Has it (HINT)? (Y or N)

Q4-3. Assume that the assumption of compound symmetry/sphericity will be violated and perform a split-plot ANOVA (repeated measures)

(HINT), and complete the following table with corrected p-values (HINT).

Show code

> mullens.aov <- aov(SFREQBUC ~ BREATH * O2LEVEL + Error(TOAD), data = mullens)
> library(biology)
> AnovaM(mullens.aov, RM = T)

   Sphericity Epsilon Values 
-------------------------------
 Greenhouse.Geisser Huynh.Feldt
             0.4282      0.5173


Anova Table (Type I tests)
Response: SFREQBUC
Error: TOAD
          Df Sum Sq Mean Sq F value Pr(>F)  
BREATH     1   39.9    39.9    5.76  0.027 *
Residuals 19  131.6     6.9                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
                Df Sum Sq Mean Sq F value  Pr(>F)    
O2LEVEL          7   39.0    5.57    7.39 1.7e-07 ***
BREATH:O2LEVEL   7   56.4    8.05   10.69 1.2e-10 ***
Residuals      133  100.2    0.75                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 


Greenhouse-Geisser corrected ANOVA table
Response: SFREQBUC
Error: TOAD
              Df Sum Sq Mean Sq F value Pr(>F)  
BREATH     0.428   39.9    39.9    5.76  0.048 *
Residuals 19.000  131.6     6.9                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
                Df Sum Sq Mean Sq F value  Pr(>F)    
O2LEVEL          3   39.0    5.57    7.39 0.00013 ***
BREATH:O2LEVEL   3   56.4    8.05   10.69 2.4e-06 ***
Residuals      133  100.2    0.75                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 


Huynh-Feldt corrected ANOVA table
Response: SFREQBUC
Error: TOAD
              Df Sum Sq Mean Sq F value Pr(>F)  
BREATH     0.517   39.9    39.9    5.76  0.044 *
Residuals 19.000  131.6     6.9                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Error: Within
                   Df Sum Sq Mean Sq F value  Pr(>F)    
O2LEVEL          3.62   39.0    5.57    7.39 4.1e-05 ***
BREATH:O2LEVEL   3.62   56.4    8.05   10.69 4.2e-07 ***
Residuals      133.00  100.2    0.75                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Source of variation	df	Mean Sq	F-ratio	P-value	GG.P	HF.P
BREATH
TOAD
O2LEVEL
BREATH:O2LEVEL
Residuals

Q4-4. Construct an interaction plot

showing the frequency of buccal breathing against oxygen level, with each breathing type as different lines (or different bars). HINT

Show code

> with(mullens, interaction.plot(O2LEVEL, BREATH, FREQBUC))

Q4-5. What conclusions would you draw from the analysis (and graph)?

Mixed models

Show code

> mullens.lmer <- lmer(SFREQBUC ~ BREATH * O2LEVEL + (1 | TOAD), data = mullens)

Examine the fixed effects coefficients

Show code

> summary(mullens.lmer)@coefs

                     Estimate Std. Error t value
(Intercept)            4.9410     0.3425 14.4264
BREATHlung            -3.3914     0.5549 -6.1117
O2LEVEL5              -0.2174     0.3404 -0.6386
O2LEVEL10             -0.5476     0.3404 -1.6087
O2LEVEL15             -0.8687     0.3404 -2.5521
O2LEVEL20             -1.5444     0.3404 -4.5371
O2LEVEL30             -1.4644     0.3404 -4.3021
O2LEVEL40             -2.2457     0.3404 -6.5973
O2LEVEL50             -2.4600     0.3404 -7.2271
BREATHlung:O2LEVEL5    1.0362     0.5515  1.8789
BREATHlung:O2LEVEL10   2.3327     0.5515  4.2298
BREATHlung:O2LEVEL15   2.3729     0.5515  4.3027
BREATHlung:O2LEVEL20   3.2985     0.5515  5.9809
BREATHlung:O2LEVEL30   2.9767     0.5515  5.3975
BREATHlung:O2LEVEL40   3.6168     0.5515  6.5582
BREATHlung:O2LEVEL50   3.4669     0.5515  6.2864

> #library(gmodels)
> #ci(mullens.lmer)
> library(languageR)
> pvals.fnc(mullens.lmer, withMCMC = FALSE, addPlot = F)

$fixed
                     Estimate MCMCmean HPD95lower HPD95upper  pMCMC Pr(>|t|)
(Intercept)            4.9410   4.9453     4.3055     5.5567 0.0001   0.0000
BREATHlung            -3.3914  -3.3902    -4.3822    -2.3809 0.0001   0.0000
O2LEVEL5              -0.2174  -0.2215    -0.9507     0.4798 0.5432   0.5240
O2LEVEL10             -0.5476  -0.5560    -1.3045     0.1490 0.1288   0.1097
O2LEVEL15             -0.8687  -0.8698    -1.5810    -0.1290 0.0160   0.0117
O2LEVEL20             -1.5444  -1.5478    -2.2561    -0.8058 0.0001   0.0000
O2LEVEL30             -1.4644  -1.4645    -2.1969    -0.7506 0.0002   0.0000
O2LEVEL40             -2.2457  -2.2462    -3.0119    -1.5349 0.0001   0.0000
O2LEVEL50             -2.4600  -2.4671    -3.1994    -1.7526 0.0001   0.0000
BREATHlung:O2LEVEL5    1.0362   1.0344    -0.1435     2.1923 0.0814   0.0622
BREATHlung:O2LEVEL10   2.3327   2.3378     1.1807     3.5274 0.0002   0.0000
BREATHlung:O2LEVEL15   2.3729   2.3715     1.1653     3.5353 0.0001   0.0000
BREATHlung:O2LEVEL20   3.2985   3.2962     2.1323     4.4427 0.0001   0.0000
BREATHlung:O2LEVEL30   2.9767   2.9748     1.8024     4.1244 0.0001   0.0000
BREATHlung:O2LEVEL40   3.6168   3.6087     2.4778     4.7871 0.0001   0.0000
BREATHlung:O2LEVEL50   3.4669   3.4722     2.2994     4.6385 0.0001   0.0000

$random
    Groups        Name Std.Dev. MCMCmedian MCMCmean HPD95lower HPD95upper
1     TOAD (Intercept)   0.8786     0.5998   0.6079     0.4390     0.7956
2 Residual               0.8678     0.9309   0.9342     0.8194     1.0525

Source	A,C (Fix); B (Rand)
A	MS_A/MS_B
B	MS_B/MS_Residual
C	MS_C/MS_B:C
A:C	MS_A:C/MS_B:C
B:C	MS_B:C/MS_Residual