Tutorial 7.5b - Analysis of Covariance (Bayesian)
12 Jan 2018
Overview
Previous tutorials have concentrated on designs with either continuous (regression) or categorical (ANOVA) predictor variables. Analysis of covariance (ANCOVA) models are essentially ANOVA models that incorporate one or more continuous predictors (covariates) alongside the categorical predictors. Although the relationship between the response variable and a covariate may itself be of substantial biological interest, typically covariates are incorporated to reduce the amount of unexplained variability in the model (analogous to blocking - see the relevant tutorials) and thereby increase the power of tests of the treatment effects.
Linear model
The generic linear model is presented here purely for revision purposes. If it is unfamiliar to you, or you are unsure about which parameters are to be estimated and tested, you are strongly advised to review the tutorial on frequentist analysis of covariance.
The linear model for a single-factor ANCOVA design is: $$y_{ij}=\mu+\alpha_i + \beta x_{ij} + \varepsilon_{ij}$$ where $\mu$ is the overall mean, $\alpha_i$ is the effect of the $i$th level of Factor A, $\beta$ is the slope associated with the covariate (B), $x_{ij}$ is the covariate value for the $j$th observation in the $i$th group, and $\varepsilon_{ij}$ is the random unexplained (residual) component.
ANCOVA in R
Data and scenario
Consider an experimental design aimed at exploring the effects of a categorical variable with three levels (Group A, Group B and Group C) on a response. From previous studies, we know that the response is influenced by another variable (a covariate). Unfortunately, it was not possible to ensure that all sampling units experienced the same level of the covariate. Therefore, in an attempt to account for this anticipated extra source of variability, we measured the level of the covariate for each sampling unit. Moreover, in allocating sampling units to the various treatment groups, we tried to ensure a similar mean and range of the covariate within each group.
- the sample size per treatment is 10
- the categorical variable (A) has 3 levels
- the first treatment group (Group A) has a population mean of 40
- the other two treatments reduce the mean by 15 and 20 units respectively
- the effect (slope) of the covariate on the response is -0.45
- the residuals are drawn from a normal distribution with a mean of 0 and a standard deviation of 4 ($\sigma^2=16$)
- the covariate (B) is a continuous variable with a mean of 20 and a standard deviation of 15
set.seed(3)
n <- 10
p <- 3
A.eff <- c(40, -15, -20)
beta <- -0.45
sigma <- 4
B <- rnorm(n * p, 0, 15)
A <- gl(p, n, lab = paste("Group", LETTERS[1:3]))
mm <- model.matrix(~A + B)
data <- data.frame(A = A, B = B, Y = as.numeric(c(A.eff, beta) %*% t(mm)) + rnorm(n * p, 0, 4))
data$B <- data$B + 20
head(data)
        A         B        Y
1 Group A  5.570999 50.09555
2 Group A 15.612114 45.38163
3 Group A 23.881823 41.16404
4 Group A  2.718022 50.72290
5 Group A 22.936742 37.26995
6 Group A 20.451859 42.61873
Assumptions
As ANCOVA designs are essentially regular ANOVA designs that are first adjusted (centered) for the covariate(s), they inherit all of the underlying assumptions of the corresponding ANOVA design. Readers should also eventually consult Tutorial 7.6a, Tutorial 9.2a, Tutorial 9.3a and Tutorial 9.4a. Specifically, hypothesis tests assume that:
- the appropriate residuals are normally distributed. Boxplots using the appropriate scale of replication (reflecting the appropriate residuals/F-ratio denominator) should be used to explore normality. Scale transformations are often useful.
- the appropriate residuals are equally varied. Boxplots and plots of means against variance (using the appropriate scale of replication) should be used to explore the spread of values. Residual plots should reveal no patterns. Scale transformations are often useful.
- the appropriate residuals are independent of one another.
- the relationship between the response variable and the covariate should be linear. Linearity can be explored using scatterplots and residual plots should reveal no patterns.
- for repeated measures and other designs in which treatment levels within blocks cannot be randomly ordered, the variance/covariance matrix is assumed to display sphericity.
- for designs that utilize blocking, it is assumed that there are no block by within block interactions
- homogeneity of slopes. Since a single slope is estimated for the covariate (rather than a separate slope per treatment), it is important that the trends between the response and the covariate are similar for each group.
Exploratory data analysis
library(car) scatterplot(Y ~ B | A, data = data)
boxplot(Y ~ A, data)
# OR via ggplot
library(ggplot2)
ggplot(data, aes(y = Y, x = B, group = A)) + geom_point() + geom_smooth(method = "lm")
ggplot(data, aes(y = Y, x = A)) + geom_boxplot()
- there is no evidence of obvious non-normality
- the assumption of linearity seems reasonable
- the variability of the three groups seems approximately equal
- the slopes (Y vs B trends) appear broadly similar for each treatment group
Homogeneity of slopes
We can explore inferential evidence of unequal slopes by examining the estimated effects of the interaction between the categorical variable and the covariate. Note, pay no attention to the main effects - only the interaction. Even though I intend to illustrate Bayesian analyses here, for such a simple model it is considerably simpler to use traditional OLS for testing for the presence of an interaction.
anova(lm(Y ~ B * A, data = data))
Analysis of Variance Table

Response: Y
          Df  Sum Sq Mean Sq F value    Pr(>F)    
B          1  354.80  354.80 23.9691 5.414e-05 ***
A          2 2772.56 1386.28 93.6531 4.609e-12 ***
B:A        2   55.08   27.54  1.8606    0.1773    
Residuals 24  355.26   14.80                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model fitting or statistical analysis
Consistent with Tutorial 7.2b we will explore Bayesian modelling of ANCOVA using a variety of tools (such as MCMCpack, JAGS, RSTAN, RSTANARM and BRMS). Whilst JAGS and RSTAN are extremely flexible and thus allow models to be formulated that contain not only the simple model, but also additional derivatives, the other approaches are more restrictive. Consequently, I will mostly restrict models to just the minimum necessary and all derivatives will instead be calculated in R itself from the returned posteriors.
The observed response ($y_i$) are assumed to be drawn from a normal distribution with a given mean ($\mu$) and standard deviation ($\sigma$). The expected values ($\mu$) are themselves determined by the linear predictor ($\mathbf{X}\boldsymbol{\beta}$). In this case, $\boldsymbol{\beta}$ represents the vector of $\beta$'s - the intercept associated with the first group, the (effects) differences between this intercept and the intercepts for each other group as well as the slope associated with the continuous covariate. $\mathbf{X}$ is the model matrix.
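To make the structure of $\mathbf{X}$ concrete, we can peek at the model matrix that R constructs for this design (the first column is the intercept for Group A, the next two columns are the Group B and Group C contrasts, and the final column is the covariate):

head(model.matrix(~A + B, data))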
MCMC sampling requires priors on all parameters. We will employ weakly informative priors. Specifying 'uninformative' priors is always a bit of a balancing act. If the priors are too vague (wide), the MCMC sampler can wander off into nonsense regions of low likelihood rather than concentrate around areas of highest likelihood (desired when we want the outcomes to be largely driven by the data). On the other hand, if the priors are too strong, they may have an undue influence on the parameter estimates. In such a simple model, this balance is very forgiving - it is for more complex models that prior choice becomes more important.
For this simple model, we will go with zero-centered Gaussian (normal) priors with relatively large standard deviations (100) for both the intercept and the treatment effects and a wide half-Cauchy (scale=5) for the standard deviation. $$ \begin{align} y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em] \beta_0 &\sim{} N(0,100)\\ \beta &\sim{} N(0,100)\\ \sigma &\sim{} cauchy(0,5)\\ \end{align} $$ Note, exploratory data analysis suggests that while the intercept (the intercept of Group A) and the categorical predictor effects (differences between each other Group's intercept and Group A's intercept) could be drawn from a similar distribution (with means in the 10's and variances in the 100's), the slope (the effect of the covariate B) is likely to be an order of magnitude smaller. We might therefore be tempted to provide different priors for the intercept, the categorical effects and the slope effect. For a simple model such as this, that is unlikely to be necessary. However, for more complex models, where prior specification becomes more critical, separate priors would probably be warranted.
library(MCMCpack)
data.mcmcpack <- MCMCregress(Y ~ A + B, data = data)
Define the model
modelString = "
model {
  #Likelihood
  for (i in 1:n) {
    y[i]~dnorm(mean[i],tau)
    mean[i] <- inprod(beta[],X[i,])
  }
  #Priors
  for (i in 1:ngroups) {
    beta[i] ~ dnorm(0, 1.0E-6)
  }
  sigma ~ dunif(0, 100)
  tau <- 1 / (sigma * sigma)
}
"
Define the data
Arrange the data as a list (as required by BUGS). As input, JAGS will need to be supplied with:
- the response variable (y)
- the predictor model matrix (X)
- the total number of observed items (n)
- the number of predictor terms (nX)
X <- model.matrix(~A + B, data)
data.list <- with(data, list(y = Y, X = X, n = nrow(data), ngroups = ncol(X)))
Define the MCMC chain parameters
Next we should define the behavioural parameters of the MCMC sampling chains. Include the following:
- the nodes (estimated parameters) to monitor (return samples for)
- the number of MCMC chains (3)
- the number of burnin steps (3000)
- the thinning factor (10)
- the number of MCMC iterations - determined by the number of samples to save, the rate of thinning and the number of chains
params <- c("beta", "sigma")
nChains = 3
burnInSteps = 3000
thinSteps = 10
numSavedSteps = 15000  #across all chains
nIter = ceiling(burnInSteps + (numSavedSteps * thinSteps)/nChains)
nIter
[1] 53000
Fit the model
Now run the JAGS code via the R2jags interface. Note that the first time jags is run after the R2jags package is loaded, it is often necessary to run any kind of randomization function just to initiate the .Random.seed variable.
library(R2jags)
data.r2jags <- jags(data = data.list, inits = NULL, parameters.to.save = params, model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter, n.burnin = burnInSteps, n.thin = thinSteps)
Compiling model graph Resolving undeclared variables Allocating nodes Graph information: Observed stochastic nodes: 30 Unobserved stochastic nodes: 5 Total graph size: 232 Initializing model
print(data.r2jags)
Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   50.201   1.803  46.617  49.021  50.207  51.404  53.741 1.001  8400
beta[2]  -17.141   1.879 -20.832 -18.393 -17.149 -15.910 -13.402 1.001  8400
beta[3]  -22.917   1.882 -26.640 -24.161 -22.922 -21.682 -19.181 1.001  5900
beta[4]   -0.414   0.065  -0.543  -0.457  -0.414  -0.372  -0.285 1.001 15000
sigma      4.175   0.619   3.178   3.735   4.108   4.531   5.587 1.001  7200
deviance 169.232   3.548 164.517 166.636 168.502 171.075 178.013 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 6.3 and DIC = 175.5
DIC is an estimate of expected predictive error (lower deviance is better).
data.mcmc.list <- as.mcmc(data.r2jags)
Whilst Gibbs sampling provides an elegantly simple MCMC sampling routine, very complex hierarchical models can take enormous numbers of iterations (often prohibitory large) to converge on a stable posterior distribution.
To address this, Andrew Gelman (and other collaborators) have implemented a variation on Hamiltonian Monte Carlo (HMC: a sampler that selects subsequent samples in a way that reduces the correlation between samples, thereby speeding up convergence) called the No-U-Turn (NUTS) sampler. All of these developments are brought together into a tool called Stan ("Sampling Through Adaptive Neighborhoods").
By design (to appeal to the vast BUGS users), Stan models are defined in a manner reminiscent of BUGS. Stan first converts these models into C++ code which is then compiled to allow very rapid computation.
Consistent with the use of C++, the model must be accompanied by variable declarations for all inputs and parameters.
One important difference between Stan and JAGS is that whereas BUGS (and thus JAGS) parameterizes the normal distribution in terms of a precision (the inverse of the variance), Stan parameterizes it in terms of the standard deviation.
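To make this difference concrete, the same normal likelihood line would be written as follows in the two languages (a sketch only):

y[i] ~ dnorm(mean[i], tau)    # JAGS/BUGS: tau is a precision (1/sigma^2)
y[i] ~ normal(mu[i], sigma);  // Stan: sigma is a standard deviation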
Stan itself is a stand-alone command line application. However, conveniently, the authors of Stan have also developed an R interface to Stan called Rstan which can be used much like R2jags.
Model matrix formulation
The minimum model in Stan required to fit the above ANCOVA follows. Note the following modifications from the model defined in JAGS:
- the normal distribution is defined by the standard deviation rather than the precision
- rather than using a uniform prior for sigma, I am using a half-Cauchy
We now translate the likelihood model into STAN code.
$$\begin{align}
y_{ij}&\sim{}N(\mu_{ij}, \sigma)\\
\mu_{ij} &= \mathbf{X}\boldsymbol{\beta}\\
\beta_0&\sim{}N(0,100)\\
\beta&\sim{}N(0,100)\\
\sigma&\sim{}Cauchy(0,5)\\
\end{align}
$$
Define the model
modelString = "
data {
  int<lower=1> n;
  int<lower=1> nX;
  vector [n] y;
  matrix [n,nX] X;
}
parameters {
  vector[nX] beta;
  real<lower=0> sigma;
}
transformed parameters {
  vector[n] mu;
  mu = X*beta;
}
model {
  #Likelihood
  y~normal(mu,sigma);

  #Priors
  beta ~ normal(0,100);
  sigma~cauchy(0,5);
}
generated quantities {
  vector[n] log_lik;
  for (i in 1:n) {
    log_lik[i] = normal_lpdf(y[i] | mu[i], sigma);
  }
}
"
Define the data
Arrange the data as a list (as required by Stan). As input, Stan will need to be supplied with:
- the response variable (y)
- the predictor model matrix (X)
- the total number of observed items (n)
- the number of predictor terms (nX)
Xmat <- model.matrix(~A + B, data)
data.list <- with(data, list(y = Y, X = Xmat, nX = ncol(Xmat), n = nrow(data)))
Fit the model
Now compile and run the Stan code via the rstan interface.
## load the rstan package
library(rstan)
data.rstan <- stan(data = data.list, model_code = modelString, chains = 3, iter = 2000, warmup = 500, thin = 3)
In file included from /usr/local/lib/R/site-library/BH/include/boost/config.hpp:39:0, from /usr/local/lib/R/site-library/BH/include/boost/math/tools/config.hpp:13, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/var.hpp:7, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core.hpp:12, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/mat.hpp:4, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math.hpp:4, from /usr/local/lib/R/site-library/StanHeaders/include/src/stan/model/model_header.hpp:4, from file4be61ee4e612.cpp:8: /usr/local/lib/R/site-library/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined # define BOOST_NO_CXX11_RVALUE_REFERENCES ^ <command-line>:0:0: note: this is the location of the previous definition SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 1). Gradient evaluation took 1.5e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.062651 seconds (Warm-up) 0.061975 seconds (Sampling) 0.124626 seconds (Total) SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 2). Gradient evaluation took 7e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.053706 seconds (Warm-up) 0.057327 seconds (Sampling) 0.111033 seconds (Total) SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 3). Gradient evaluation took 7e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.054613 seconds (Warm-up) 0.064711 seconds (Sampling) 0.119324 seconds (Total)
print(data.rstan, par = c("beta", "sigma"))
Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a.
3 chains, each with iter=2000; warmup=500; thin=3;
post-warmup draws per chain=500, total post-warmup draws=1500.

          mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
beta[1]  50.22    0.06 1.79  46.60  49.03  50.24  51.40  53.83  1027    1
beta[2] -17.12    0.06 1.92 -20.88 -18.31 -17.14 -15.96 -13.19  1210    1
beta[3] -22.88    0.05 1.93 -26.78 -24.11 -22.86 -21.58 -19.24  1407    1
beta[4]  -0.42    0.00 0.06  -0.54  -0.46  -0.42  -0.37  -0.29  1282    1
sigma     4.11    0.02 0.58   3.12   3.70   4.05   4.45   5.37  1353    1

Samples were drawn using NUTS(diag_e) at Fri Nov 3 15:05:11 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
The Stan team has put together pre-compiled modules (functions), available via the rstanarm package, to make specifying and applying Stan models much simpler. Each function offers a consistent interface that is also reminiscent of the major frequentist linear modelling routines in R.
Whilst it is not necessary to specify priors when using rstanarm functions (as defaults will be generated), there is no guarantee that the routines for determining these defaults will persist over time. Furthermore, it is always better to define your own priors, if for no other reason than it forces you to think about what you are doing. Consistent with the pure Stan version, we will employ the following priors:
- weakly informative Gaussian prior for the intercept $\beta_0 \sim{} N(0, 100)$
- weakly informative Gaussian prior for the treatment effect $\beta_1 \sim{} N(0, 100)$
- half-Cauchy prior for the standard deviation $\sigma \sim{} Cauchy(0, 5)$
Note, I am using the refresh=0 option so as to suppress the larger regular output in the interest of keeping output to what is necessary for this tutorial. When running outside of a tutorial context, the regular verbose output is useful as it provides a way to gauge progress.
library(rstanarm)
library(broom)
library(coda)
data.rstanarm = stan_glm(Y ~ A + B, data = data, iter = 2000, warmup = 500,
    chains = 3, thin = 2, refresh = 0, prior_intercept = normal(0, 100),
    prior = normal(0, 100), prior_aux = cauchy(0, 2))
Gradient evaluation took 5.5e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.55 seconds. Adjust your expectations accordingly! Elapsed Time: 0.149605 seconds (Warm-up) 0.181861 seconds (Sampling) 0.331466 seconds (Total) Gradient evaluation took 1.2e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds. Adjust your expectations accordingly! Elapsed Time: 0.120154 seconds (Warm-up) 0.198657 seconds (Sampling) 0.318811 seconds (Total) Gradient evaluation took 1.4e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds. Adjust your expectations accordingly! Elapsed Time: 0.182133 seconds (Warm-up) 0.174642 seconds (Sampling) 0.356775 seconds (Total)
print(data.rstanarm)
stan_glm
 family:  gaussian [identity]
 formula: Y ~ A + B
------

Estimates:
            Median MAD_SD
(Intercept)  50.2    1.8
AGroup B    -17.2    1.9
AGroup C    -22.9    1.9
B            -0.4    0.1
sigma         4.1    0.6

Sample avg. posterior predictive
distribution of y (X = xbar):
         Median MAD_SD
mean_PPD 30.0   1.1

------
For info on the priors used see help('prior_summary.stanreg').
tidyMCMC(data.rstanarm, conf.int = TRUE, conf.method = "HPDinterval")
         term    estimate std.error    conf.low   conf.high
1 (Intercept)  50.2286902 1.9365801  46.4428738  54.1263008
2    AGroup B -17.2114440 1.9386491 -21.0066484 -13.4338089
3    AGroup C -22.9037572 1.9679361 -26.6757211 -19.0189639
4           B  -0.4138748 0.0681432  -0.5451839  -0.2852376
5       sigma   4.1990238 0.6317351   3.0581470   5.4328442
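As a quick check, the priors that rstanarm actually applied (including any internal rescaling of the defaults) can be confirmed from the fitted object (output not shown):

prior_summary(data.rstanarm)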
The brms package serves a similar goal to the rstanarm package - to provide a simple user interface to STAN. However, unlike the rstanarm implementation, brms simply converts the formula, data, priors and family into STAN model code and data before executing stan with those elements.
Whilst it is not necessary to specify priors when using brms functions (as defaults will be generated), there is no guarantee that the routines for determining these defaults will persist over time. Furthermore, it is always better to define your own priors, if for no other reason than it forces you to think about what you are doing. Consistent with the pure Stan version, we will employ the following priors:
- weakly informative Gaussian prior for the intercept $\beta_0 \sim{} N(0, 100)$
- weakly informative Gaussian prior for the treatment effect $\beta_1 \sim{} N(0, 100)$
- half-Cauchy prior for the standard deviation $\sigma \sim{} Cauchy(0, 5)$
Note, I am using the refresh=0 option so as to suppress the larger regular output in the interest of keeping output to what is necessary for this tutorial. When running outside of a tutorial context, the regular verbose output is useful as it provides a way to gauge progress.
library(brms)
library(broom)
library(coda)
data.brms = brm(Y ~ A + B, data = data, iter = 2000, warmup = 500,
    chains = 3, thin = 2, refresh = 0,
    prior = c(prior(normal(0, 100), class = "Intercept"),
        prior(normal(0, 100), class = "b"),
        prior(cauchy(0, 5), class = "sigma")))
Gradient evaluation took 1.8e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds. Adjust your expectations accordingly! Elapsed Time: 0.036403 seconds (Warm-up) 0.043672 seconds (Sampling) 0.080075 seconds (Total) Gradient evaluation took 8e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds. Adjust your expectations accordingly! Elapsed Time: 0.033241 seconds (Warm-up) 0.041044 seconds (Sampling) 0.074285 seconds (Total) Gradient evaluation took 7e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds. Adjust your expectations accordingly! Elapsed Time: 0.036154 seconds (Warm-up) 0.041079 seconds (Sampling) 0.077233 seconds (Total)
print(data.brms)
 Family: gaussian(identity)
Formula: Y ~ A + B
   Data: data (Number of observations: 30)
Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2;
         total post-warmup samples = 2250
    ICs: LOO = NA; WAIC = NA; R2 = NA

Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    50.20      1.76    46.79    53.75       2177    1
AGroupB     -17.17      1.83   -20.81   -13.68       2158    1
AGroupC     -22.89      1.85   -26.43   -19.25       2127    1
B            -0.41      0.06    -0.54    -0.29       2129    1

Family Specific Parameters:
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     4.11      0.59     3.14     5.41       1854    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
tidyMCMC(data.brms, conf.int = TRUE, conf.method = "HPDinterval")
         term    estimate  std.error    conf.low   conf.high
1 b_Intercept  50.1970335 1.75540856  46.6017008  53.5440980
2   b_AGroupB -17.1728328 1.83135058 -20.6803108 -13.5582289
3   b_AGroupC -22.8915861 1.85449337 -26.3846325 -19.2178870
4         b_B  -0.4122714 0.06370931  -0.5416812  -0.2961189
5       sigma   4.1081576 0.58644550   3.1075609   5.3375706
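Since brms works by generating Stan code from the formula, data and priors, it can be instructive to inspect the code it produced for this model (output not shown):

stancode(data.brms)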
MCMC diagnostics
In addition to the regular model diagnostic checks (such as residual plots), for Bayesian analyses, it is necessary to explore the characteristics of the MCMC chains and the sampler in general. Recall that the purpose of MCMC sampling is to replicate the posterior distribution of the model likelihood and priors by drawing a known number of samples from this posterior (thereby formulating a probability distribution). This is only reliable if the MCMC samples accurately reflect the posterior.
Unfortunately, since we only know the posterior in the most trivial of circumstances, it is necessary to rely on indirect measures of how accurately the MCMC samples are likely to reflect the posterior. I will briefly outline the most important diagnostics (a quick numerical sketch of some of them follows the list below); please refer to Tutorial 4.3, Section 3.1: Markov Chain Monte Carlo sampling for a fuller discussion of these diagnostics.
- Traceplots for each parameter illustrate the MCMC sample values after each successive iteration along the chain. Bad chain mixing (characterized by any sort of pattern) suggests that the MCMC sampling chains may not have completely traversed all features of the posterior distribution and that more iterations are required to ensure the distribution has been accurately represented.
- Autocorrelation plots for each parameter illustrate the degree of correlation between MCMC samples separated by different lags. For example, a lag of 0 represents the degree of correlation between each MCMC sample and itself (obviously this will be a correlation of 1). A lag of 1 represents the degree of correlation between each MCMC sample and the next sample along the chain, and so on. In order to be able to generate unbiased estimates of parameters, the MCMC samples should be independent (uncorrelated).
- Rhat statistics for each parameter provide a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been; more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
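These diagnostics can also be extracted numerically rather than graphically. As a quick sketch, using the coda package on the multi-chain JAGS samples generated above (equivalent calls work for any mcmc or mcmc.list object):

library(coda)
# Rhat (potential scale reduction factor) calculated across the three chains
gelman.diag(as.mcmc(data.r2jags))
# effective number of (approximately independent) samples per parameter
effectiveSize(as.mcmc(data.r2jags))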
Prior to inspecting any summaries of the parameter estimates, it is prudent to inspect a range of chain convergence diagnostics.
- Trace plots
View trace plots
library(MCMCpack) plot(data.mcmcpack)
- Raftery diagnostic
View Raftery diagnostic
library(MCMCpack) raftery.diag(data.mcmcpack)
Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) (Intercept) 2 3834 3746 1.020 AGroup B 2 3771 3746 1.010 AGroup C 2 3962 3746 1.060 B 2 3741 3746 0.999 sigma2 2 3771 3746 1.010
- Autocorrelation diagnostic
View autocorrelations
library(MCMCpack) autocorr.diag(data.mcmcpack)
(Intercept) AGroup B AGroup C B sigma2 Lag 0 1.000000000 1.000000e+00 1.000000000 1.000000000 1.000000000 Lag 1 -0.003346420 -4.064355e-03 -0.002480710 0.002480390 0.132659644 Lag 5 0.016500991 1.299476e-02 -0.003939996 0.001939800 -0.018030548 Lag 10 -0.005233676 -6.125242e-06 -0.013564685 -0.006005238 0.006407515 Lag 50 0.006330549 -9.343650e-03 -0.011855223 -0.002673821 0.007502186
Again, prior to examining the summaries, we should have explored the convergence diagnostics.
library(coda)
data.mcmc = as.mcmc(data.r2jags)
- Trace plots
plot(data.mcmc)
When there are a lot of parameters, this can result in a very large number of traceplots. To focus on just certain parameters (such as $\beta$s)
preds <- c("beta[1]", "beta[2]", "beta[3]", "beta[4]")
plot(as.mcmc(data.r2jags)[, preds])
- Raftery diagnostic
raftery.diag(data.mcmc)
[[1]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta[1] 20 36200 3746 9.66 beta[2] 20 38660 3746 10.30 beta[3] 20 38030 3746 10.20 beta[4] 20 36200 3746 9.66 deviance 20 37410 3746 9.99 sigma 20 38030 3746 10.20 [[2]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta[1] 20 39300 3746 10.50 beta[2] 20 38660 3746 10.30 beta[3] 20 39950 3746 10.70 beta[4] 20 37410 3746 9.99 deviance 20 39300 3746 10.50 sigma 20 36800 3746 9.82 [[3]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta[1] 20 38030 3746 10.20 beta[2] 20 36800 3746 9.82 beta[3] 20 37410 3746 9.99 beta[4] 20 38030 3746 10.20 deviance 20 36800 3746 9.82 sigma 20 36200 3746 9.66
- Autocorrelation diagnostic
autocorr.diag(data.mcmc)
beta[1] beta[2] beta[3] beta[4] deviance sigma Lag 0 1.0000000000 1.0000000000 1.000000000 1.000000000 1.0000000000 1.0000000000 Lag 10 -0.0086789407 0.0002409218 -0.012834845 -0.014876564 -0.0155077394 -0.0009823887 Lag 50 -0.0133637002 -0.0146591155 -0.022266022 0.007955582 -0.0131845968 -0.0066436459 Lag 100 0.0007452402 0.0128511117 -0.001800657 -0.007853232 0.0008253937 -0.0101001901 Lag 500 -0.0029682531 0.0011976990 0.001848702 -0.005150388 -0.0016034261 -0.0118212907
Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).
- extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
- use the numerous routines that come with the rstan package
- use the routines that come with the bayesplot package
- explore the diagnostics interactively via shinystan
- via coda
- Traceplots
- Autocorrelation
library(coda)
s = as.array(data.rstan)
wch = grep("beta", dimnames(s)$parameters)
s = s[, , wch]
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
plot(mcmc)
library(coda)
s = as.array(data.rstan)
wch = grep("beta", dimnames(s)$parameters)
s = s[, , wch]
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
autocorr.diag(mcmc)
beta[1] beta[2] beta[3] Lag 0 1.000000000 1.0000000000 1.000000000 Lag 1 0.064013692 0.0605531343 0.026397084 Lag 5 0.007321517 0.0009337118 0.006726571 Lag 10 -0.016376224 -0.0098527765 -0.009376572 Lag 50 -0.017683148 -0.0290390559 0.010863075
- via rstan
- Traceplots
stan_trace(data.rstan)
- Raftery diagnostic
raftery.diag(data.rstan)
Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
- Autocorrelation diagnostic
stan_ac(data.rstan)
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been; more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
stan_rhat(data.rstan)
- Another measure of sampling efficiency is the effective sample size (ESS). ESS indicates the number (or proportion) of samples that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria and, when it does so, the previous sample value is used. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
stan_ess(data.rstan)
- via bayesplot
- Trace plots and density plots
library(bayesplot) mcmc_trace(as.matrix(data.rstan), regex_pars = "beta|sigma")
library(bayesplot) mcmc_combo(as.matrix(data.rstan), regex_pars = "beta|sigma")
- Density plots
library(bayesplot) mcmc_dens(as.matrix(data.rstan), regex_pars = "beta|sigma")
- via shinystan
library(shinystan) launch_shinystan(data.rstan)
- It is worth exploring the influence of our priors.
Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STANARM model fits (for exploring diagnostics and summarization).
- extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
- use the numerous routines that come with the rstan package
- use the routines that come with the bayesplot package
- explore the diagnostics interactively via shinystan
- via coda
- Traceplots
- Autocorrelation
library(coda)
s = as.array(data.rstanarm)
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
plot(mcmc)
library(coda)
s = as.array(data.rstanarm)
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
autocorr.diag(mcmc)
(Intercept) AGroup B AGroup C B Lag 0 1.0000000000 1.000000000 1.000000000 1.000000000 Lag 1 0.0290559273 0.001379705 -0.023247646 -0.018452103 Lag 5 -0.0035230620 -0.003667234 0.004820692 0.008991159 Lag 10 0.0180539117 -0.016292304 0.028751812 0.024889288 Lag 50 0.0008045354 -0.013900306 0.008315425 0.009651236
- via rstan
- Traceplots
stan_trace(data.rstanarm)
- Raftery diagnostic
raftery.diag(data.rstanarm)
Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
- Autocorrelation diagnostic
stan_ac(data.rstanarm)
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been; more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
stan_rhat(data.rstanarm)
- Another measure of sampling efficiency is the effective sample size (ESS). ESS indicates the number (or proportion) of samples that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria and, when it does so, the previous sample value is used. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
stan_ess(data.rstanarm)
- via bayesplot
- Trace plots and density plots
library(bayesplot)
mcmc_trace(as.array(data.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
mcmc_combo(as.array(data.rstanarm))
- Density plots
mcmc_dens(as.array(data.rstanarm))
- via rstanarm
The rstanarm package provides additional posterior checks.
- Posterior vs Prior - this compares the posterior estimate for each parameter against the associated prior. If the spread of the priors is small relative to the posterior, then it is likely that the priors are too influential. On the other hand, overly wide priors can lead to computational issues.
library(rstanarm) posterior_vs_prior(data.rstanarm, color_by = "vs", group_by = TRUE, facet_args = list(scales = "free_y"))
Gradient evaluation took 3.6e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.36 seconds. Adjust your expectations accordingly! Elapsed Time: 0.318686 seconds (Warm-up) 0.056968 seconds (Sampling) 0.375654 seconds (Total) Gradient evaluation took 1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds. Adjust your expectations accordingly! Elapsed Time: 0.371641 seconds (Warm-up) 0.056263 seconds (Sampling) 0.427904 seconds (Total)
- via shinystan
library(shinystan)
launch_shinystan(data.rstanarm)
Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with brms model fits (for exploring diagnostics and summarization).
- extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
- use the numerous routines that come with the rstan package
- use the routines that come with the bayesplot package
- explore the diagnostics interactively via shinystan
- via coda
- Traceplots
- Autocorrelation
library(coda)
mcmc = as.mcmc(data.brms)
plot(mcmc)
library(coda)
mcmc = as.mcmc(data.brms)
autocorr.diag(mcmc)
Error in ts(x, start = start(x), end = end(x), deltat = thin(x)): invalid time series parameters specified
- via rstan
- Traceplots
stan_trace(data.brms$fit)
- Raftery diagnostic
raftery.diag(data.brms)
Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
- Autocorrelation diagnostic
stan_ac(data.brms$fit)
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been; more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
stan_rhat(data.brms$fit)
- Another measure of sampling efficiency is the effective sample size (ESS). ESS indicates the number (or proportion) of samples that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria and, when it does so, the previous sample value is used. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
stan_ess(data.brms$fit)
Model validation
Model validation involves exploring the model diagnostics and fit to ensure that the model is broadly appropriate for the data. As such, exploration of the residuals should be routine.
For more complex models (those that contain multiple effects), it is also advisable to plot the residuals against each of the individual predictors. For sampling designs that involve sample collection over space or time, it is also a good idea to explore whether there are any temporal or spatial patterns in the residuals.
There are numerous situations (e.g. when applying specific variance-covariance structures to a model) where raw residuals do not reflect the interior workings of the model. Typically, this is because they do not take into account the variance-covariance matrix or assume a very simple variance-covariance matrix. Since the purpose of exploring residuals is to evaluate the model, for these cases, it is arguably better to draw conclusions based on standardized (or studentized) residuals.
Unfortunately the definitions of standardized and studentized residuals appears to vary and the two terms get used interchangeably. I will adopt the following definitions:
- Standardized residuals: the raw residuals divided by the true standard deviation of the residuals (which of course is rarely known).
- Studentized residuals: the raw residuals divided by the standard deviation of the residuals. Note that externally studentized residuals are calculated by dividing the raw residuals by a unique standard deviation for each observation that is calculated from regressions having left each successive observation out.
- Pearson residuals: the raw residuals divided by the standard deviation of the response variable.
The mark of a good model is being able to predict well. In an ideal world, we would have sufficiently large sample size as to permit us to hold a fraction (such as 25%) back thereby allowing us to train the model on 75% of the data and then see how well the model can predict the withheld 25%. Unfortunately, such a luxury is still rare in ecology.
The next best option is to see how well the model can predict the observed data. Models tend to struggle most with the extremes of trends and have particular issues when the extremes approach logical boundaries (such as zero for count data and standard deviations). We can use the fitted model to generate random predicted observations and then explore some properties of these compared to the actual observed data.
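As an aside, the bayesplot package provides convenient wrappers for exactly this sort of posterior predictive check. The following is a minimal sketch (assuming the rstanarm fit from above; any matrix of posterior predictions could be substituted):

library(bayesplot)
# overlay the densities of 50 simulated data sets on the density of the observed data
ppc_dens_overlay(y = data$Y, yrep = posterior_predict(data.rstanarm, draws = 50))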
Residuals are not computed directly within MCMCpack. However, we can calculate them manually from the posteriors.
mcmc = as.data.frame(data.mcmcpack)
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
head(mcmc)
  (Intercept)  AGroup B  AGroup C          B   sigma2
1    50.18839 -18.23379 -22.87414 -0.4394640 14.11917
2    49.39556 -15.64877 -20.98086 -0.4528605 13.22004
3    49.44854 -16.36367 -23.66886 -0.3887300 15.57976
4    51.46891 -16.26642 -24.26313 -0.4709924 12.01597
5    50.55617 -17.54613 -22.44262 -0.4252965 16.35805
6    50.73340 -17.54078 -23.75846 -0.4598987 10.84731
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Residuals against predictors
mcmc = as.data.frame(data.mcmcpack)
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
newdata = newdata %>% cbind(fit, resid)
ggplot(newdata) + geom_point(aes(y = resid, x = A))
ggplot(newdata) + geom_point(aes(y = resid, x = B))
And now for studentized residuals
mcmc = as.data.frame(data.mcmcpack)
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).
Let's see how well data simulated from the model reflects the raw data.
library(dplyr)   # for the pipe operator
library(tidyr)   # for gather()
mcmc = as.matrix(data.mcmcpack)
# generate a model matrix
Xmat = model.matrix(~A + B, data)
## get median parameter estimates
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data), fit[i, ], sqrt(mcmc[i, "sigma2"])))
newdata = data.frame(A = data$A, B = data$B, yRep) %>% gather(key = Sample, value = Value, -A, -B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"), alpha = 0.5) +
    geom_violin(data = data, aes(y = Y, x = A, fill = "Obs"), alpha = 0.5) +
    geom_point(data = data, aes(y = Y, x = A), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model", group = B, color = A), alpha = 0.5) + geom_point(data = data, aes(y = Y, x = B, group = B, color = A))
Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(data.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")
mcmc_areas(as.matrix(data.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")
Residuals are not computed directly within JAGS. However, we can calculate them manually from the posteriors.
mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Residuals against predictors
mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
newdata = newdata %>% cbind(fit, resid)
ggplot(newdata) + geom_point(aes(y = resid, x = A))
ggplot(newdata) + geom_point(aes(y = resid, x = B))
And now for studentized residuals
mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).
Let's see how well data simulated from the model reflects the raw data.
mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
Xmat = model.matrix(~A + B, data)
## get median parameter estimates
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data), fit[i, ], mcmc[i, "sigma"]))
newdata = data.frame(A = data$A, B = data$B, yRep) %>% gather(key = Sample, value = Value, -A, -B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"), alpha = 0.5) +
    geom_violin(data = data, aes(y = Y, x = A, fill = "Obs"), alpha = 0.5) +
    geom_point(data = data, aes(y = Y, x = A), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model", group = B, color = A), alpha = 0.5) + geom_point(data = data, aes(y = Y, x = B, group = B, color = A))
Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(data.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
mcmc_areas(data.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
Residuals are not computed directly within RSTAN. However, we can calculate them manually from the posteriors.
mcmc = as.data.frame(data.rstan) %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Residuals against predictors
mcmc = as.data.frame(data.rstan) %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
newdata = newdata %>% cbind(fit, resid)
ggplot(newdata) + geom_point(aes(y = resid, x = A))
ggplot(newdata) + geom_point(aes(y = resid, x = B))
And now for studentized residuals
mcmc = as.data.frame(data.rstan) %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).
Let's see how well data simulated from the model reflects the raw data.
mcmc = as.data.frame(data.rstan) %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
Xmat = model.matrix(~A + B, data)
## get median parameter estimates
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data), fit[i, ], mcmc[i, "sigma"]))
newdata = data.frame(A = data$A, B = data$B, yRep) %>% gather(key = Sample, value = Value, -A, -B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"), alpha = 0.5) +
    geom_violin(data = data, aes(y = Y, x = A, fill = "Obs"), alpha = 0.5) +
    geom_point(data = data, aes(y = Y, x = A), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model", group = B, color = A), alpha = 0.5) + geom_point(data = data, aes(y = Y, x = B, group = B, color = A))
Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(data.rstan), regex_pars = "beta|sigma")
mcmc_areas(as.matrix(data.rstan), regex_pars = "beta|sigma")
Residuals are not computed directly within RSTANARM. However, we can calculate them manually from the posteriors.
resid = resid(data.rstanarm)
fit = fitted(data.rstanarm)
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Residuals against predictors
resid = resid(data.rstanarm)
dat = data %>% mutate(resid = resid)
ggplot(dat) + geom_point(aes(y = resid, x = A))
ggplot(dat) + geom_point(aes(y = resid, x = B))
And now for studentized residuals
resid = resid(data.rstanarm)
sresid = resid/sd(resid)
fit = fitted(data.rstanarm)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).
Let's see how well data simulated from the model reflects the raw data.
y_pred = posterior_predict(data.rstanarm)
newdata = data %>% cbind(t(y_pred)) %>% gather(key = "Rep", value = "Value", -Y:-B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"), alpha = 0.5) +
    geom_violin(data = data, aes(y = Y, x = A, fill = "Obs"), alpha = 0.5) +
    geom_point(data = data, aes(y = Y, x = A), position = position_jitter(width = 0.1, height = 0), color = "black")
Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill
ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model", group = B, color = A), alpha = 0.5) + geom_point(data = data, aes(y = Y, x = B, group = B, color = A))
Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill, group, colour
Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(data.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
mcmc_areas(as.matrix(data.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
Residuals are not computed directly within BRMS. However, we can calculate them manually from the posteriors.
resid = resid(data.brms)[, "Estimate"]
fit = fitted(data.brms)[, "Estimate"]
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Residuals against predictors
resid = resid(data.brms)[, "Estimate"]
dat = data %>% mutate(resid = resid)
ggplot(dat) + geom_point(aes(y = resid, x = A))
ggplot(dat) + geom_point(aes(y = resid, x = B))
And now for studentized residuals
resid = resid(data.brms)[, "Estimate"]
sresid = resid/sd(resid)
fit = fitted(data.brms)[, "Estimate"]
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).
Let's see how well data simulated from the model reflects the raw data.
y_pred = posterior_predict(data.brms)
newdata = data %>% cbind(t(y_pred)) %>% gather(key = "Rep", value = "Value", -Y:-B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"), alpha = 0.5) +
    geom_violin(data = data, aes(y = Y, x = A, fill = "Obs"), alpha = 0.5) +
    geom_point(data = data, aes(y = Y, x = A), position = position_jitter(width = 0.1, height = 0), color = "black")
Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill
ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model", group = B, color = A), alpha = 0.5) + geom_point(data = data, aes(y = Y, x = B, group = B, color = A))
Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill, group, colour
Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(data.brms), regex_pars = "b_|sigma")
mcmc_areas(as.matrix(data.brms), regex_pars = "b_|sigma")
Parameter estimates (posterior summaries)
Although all parameters in a Bayesian analysis are considered random and are represented by a posterior distribution, it is rarely useful to present tables of all the samples from each distribution. On the other hand, plots of the posterior distributions do have some use. Nevertheless, most workers prefer to present simple statistical summaries of the posteriors. Popular choices include the median (or mean) and 95% credibility intervals.
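These summaries are straightforward to compute directly from the posterior samples. As a quick sketch using the coda routines on the MCMCpack samples (the tidyMCMC calls below package up the same information):

library(coda)
# posterior medians
apply(data.mcmcpack, 2, median)
# 95% highest posterior density intervals
HPDinterval(data.mcmcpack, prob = 0.95)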
library(coda)
mcmcpvalue <- function(samp) {
    ## elementary version that creates an empirical p-value for the
    ## hypothesis that the columns of samp have mean zero versus a general
    ## multivariate distribution with elliptical contours.

    ## differences from the mean standardized by the observed
    ## variance-covariance factor

    ## Note, I put in the bit for single terms
    if (length(dim(samp)) == 0) {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - mean(samp), transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/length(samp)
    } else {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp), transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/nrow(samp)
    }
}
Matrix model (MCMCpack)
summary(data.mcmcpack)
Iterations = 1001:11000 Thinning interval = 1 Number of chains = 1 Sample size per chain = 10000 1. Empirical mean and standard deviation for each variable, plus standard error of the mean: Mean SD Naive SE Time-series SE (Intercept) 50.2015 1.7846 0.017846 0.0178458 AGroup B -17.1433 1.8782 0.018782 0.0187821 AGroup C -22.9133 1.8682 0.018682 0.0186817 B -0.4138 0.0637 0.000637 0.0006543 sigma2 17.0079 5.0889 0.050889 0.0581569 2. Quantiles for each variable: 2.5% 25% 50% 75% 97.5% (Intercept) 46.7714 49.0274 50.2019 51.3437 53.7977 AGroup B -20.8797 -18.3726 -17.1445 -15.8884 -13.4552 AGroup C -26.6231 -24.1304 -22.9030 -21.6955 -19.1900 B -0.5397 -0.4556 -0.4134 -0.3714 -0.2885 sigma2 9.7749 13.4022 16.1334 19.6823 29.3438
# OR library(broom) tidyMCMC(data.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 (Intercept) 50.2014506 1.78458442 46.7660798 53.7766054 2 AGroup B -17.1433463 1.87821127 -20.7503649 -13.3485818 3 AGroup C -22.9132573 1.86817189 -26.6020778 -19.1852142 4 B -0.4137879 0.06369625 -0.5369301 -0.2860994 5 sigma2 17.0078626 5.08890812 8.9411967 27.1577316
- the intercept of the first group (Group A) is 50.2014506
- the mean of the second group (Group B) is -17.1433463 units greater than (A)
- the mean of the third group (Group C) is -22.9132573 units greater than (A)
- a one unit increase in B in Group A is associated with a -0.4137879 unit change in Y
While workers attempt to become comfortable with a new statistical framework, it is only natural that they would like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to the frequentist p-values used to investigate the hypothesis that a parameter is equal to zero.
## since values are less than zero mcmcpvalue(data.mcmcpack[, 2]) # effect of (B-A = 0)
[1] 0
mcmcpvalue(data.mcmcpack[, 3]) # effect of (C-A = 0)
[1] 0
mcmcpvalue(data.mcmcpack[, 4]) # effect of (slope = 0)
[1] 0
mcmcpvalue(data.mcmcpack[, 2:4]) # effect of (model)
[1] 0
There is evidence that the response differs between the groups.
Matrix model (JAGS)
print(data.r2jags)
Inference for Bugs model at "5", fit using jags, 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10 n.sims = 15000 iterations saved mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff beta[1] 50.201 1.803 46.617 49.021 50.207 51.404 53.741 1.001 8400 beta[2] -17.141 1.879 -20.832 -18.393 -17.149 -15.910 -13.402 1.001 8400 beta[3] -22.917 1.882 -26.640 -24.161 -22.922 -21.682 -19.181 1.001 5900 beta[4] -0.414 0.065 -0.543 -0.457 -0.414 -0.372 -0.285 1.001 15000 sigma 4.175 0.619 3.178 3.735 4.108 4.531 5.587 1.001 7200 deviance 169.232 3.548 164.517 166.636 168.502 171.075 178.013 1.001 15000 For each parameter, n.eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor (at convergence, Rhat=1). DIC info (using the rule, pD = var(deviance)/2) pD = 6.3 and DIC = 175.5 DIC is an estimate of expected predictive error (lower deviance is better).
# OR library(broom) tidyMCMC(as.mcmc(data.r2jags), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 beta[1] 50.2007407 1.80285405 46.5994039 53.6923687 2 beta[2] -17.1407725 1.87934256 -20.8989661 -13.5038133 3 beta[3] -22.9165431 1.88243441 -26.5907610 -19.1480417 4 beta[4] -0.4139048 0.06529536 -0.5434197 -0.2853424 5 deviance 169.2319386 3.54753309 163.9376050 176.1387130 6 sigma 4.1749569 0.61875758 3.0490746 5.3897463
- the intercept of the first group (Group A) is 50.2007407
- the mean of the second group (Group B) is -17.1407725 units greater than (A)
- the mean of the third group (Group C) is -22.9165431 units greater than (A)
- a one unit increase in B in Group A is associated with a -0.4139048 unit change in Y
While workers attempt to become comfortable with a new statistical framework, it is only natural that they would like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to the frequentist p-values used to investigate the hypothesis that a parameter is equal to zero.
## since values are less than zero mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, "beta[2]"]) # effect of (B-A = 0)
[1] 0
mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, "beta[3]"]) # effect of (C-A = 0)
[1] 0
mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, "beta[4]"]) # effect of (slope = 0)
[1] 0
mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, 2:4]) # effect of (model)
[1] 0
There is evidence that the response differs between the groups.
Matrix model (RSTAN)
print(data.rstan, pars = c("beta", "sigma"))
Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a. 3 chains, each with iter=2000; warmup=500; thin=3; post-warmup draws per chain=500, total post-warmup draws=1500. mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat beta[1] 50.22 0.06 1.79 46.60 49.03 50.24 51.40 53.83 1027 1 beta[2] -17.12 0.06 1.92 -20.88 -18.31 -17.14 -15.96 -13.19 1210 1 beta[3] -22.88 0.05 1.93 -26.78 -24.11 -22.86 -21.58 -19.24 1407 1 beta[4] -0.42 0.00 0.06 -0.54 -0.46 -0.42 -0.37 -0.29 1282 1 sigma 4.11 0.02 0.58 3.12 3.70 4.05 4.45 5.37 1353 1 Samples were drawn using NUTS(diag_e) at Fri Nov 3 15:05:11 2017. For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1).
# OR library(broom) tidyMCMC(data.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta", "sigma"))
term estimate std.error conf.low conf.high 1 beta[1] 50.2161150 1.78767398 46.4168478 53.5737947 2 beta[2] -17.1206004 1.91600172 -20.7216779 -13.0233814 3 beta[3] -22.8794576 1.92793878 -26.8184381 -19.3130173 4 beta[4] -0.4157079 0.06337683 -0.5361929 -0.2891039 5 sigma 4.1094504 0.58022288 3.0779992 5.3039276
- the intercept of the first group (Group A) is 50.216115
- the mean of the second group (Group B) is -17.1206004 units greater than (A)
- the mean of the third group (Group C) is -22.8794576 units greater than (A)
- a one unit increase in B in Group A is associated with a -0.4157079 unit change in Y
While workers attempt to become comfortable with a new statistical framework, it is only natural that they would like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to the frequentist p-values used to investigate the hypothesis that a parameter is equal to zero.
## since values are less than zero mcmcpvalue(as.matrix(data.rstan)[, "beta[2]"]) # effect of (B-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstan)[, "beta[3]"]) # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstan)[, "beta[4]"]) # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstan)[, 2:4]) # effect of (model)
[1] 0
There is evidence that the response differs between the groups.
library(loo) (full = loo(extract_log_lik(data.rstan)))
Computed from 1500 by 30 log-likelihood matrix Estimate SE elpd_loo -87.3 3.5 p_loo 4.6 1.0 looic 174.5 7.1 Pareto k diagnostic values: Count Pct (-Inf, 0.5] (good) 29 96.7% (0.5, 0.7] (ok) 1 3.3% (0.7, 1] (bad) 0 0.0% (1, Inf) (very bad) 0 0.0% All Pareto k estimates are ok (k < 0.7) See help('pareto-k-diagnostic') for details.
# now fit a model without main factor
modelString = "
  data {
  int<lower=1> n;
  int<lower=1> nX;
  vector [n] y;
  matrix [n,nX] X;
  }
  parameters {
  vector[nX] beta;
  real<lower=0> sigma;
  }
  transformed parameters {
  vector[n] mu;
  mu = X*beta;
  }
  model {
  #Likelihood
  y~normal(mu,sigma);
  #Priors
  beta ~ normal(0,1000);
  sigma~cauchy(0,5);
  }
  generated quantities {
  vector[n] log_lik;
  for (i in 1:n) {
  log_lik[i] = normal_lpdf(y[i] | mu[i], sigma);
  }
  }
  "
Xmat <- model.matrix(~1, data)
data.list <- with(data, list(y = Y, X = Xmat, n = nrow(data), nX = ncol(Xmat)))
data.rstan.red <- stan(data = data.list, model_code = modelString, chains = 3, iter = 2000, warmup = 500, thin = 3)
In file included from /usr/local/lib/R/site-library/BH/include/boost/config.hpp:39:0, from /usr/local/lib/R/site-library/BH/include/boost/math/tools/config.hpp:13, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/var.hpp:7, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core.hpp:12, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/mat.hpp:4, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math.hpp:4, from /usr/local/lib/R/site-library/StanHeaders/include/src/stan/model/model_header.hpp:4, from file12361ebf982d.cpp:8: /usr/local/lib/R/site-library/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined # define BOOST_NO_CXX11_RVALUE_REFERENCES ^ <command-line>:0:0: note: this is the location of the previous definition SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 1). Gradient evaluation took 1.2e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.018509 seconds (Warm-up) 0.041875 seconds (Sampling) 0.060384 seconds (Total) SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 2). Gradient evaluation took 7e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.017436 seconds (Warm-up) 0.046029 seconds (Sampling) 0.063465 seconds (Total) SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 3). Gradient evaluation took 6e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.012316 seconds (Warm-up) 0.026582 seconds (Sampling) 0.038898 seconds (Total)
(reduced = loo(extract_log_lik(data.rstan.red)))
Computed from 1500 by 30 log-likelihood matrix Estimate SE elpd_loo -116.0 3.2 p_loo 1.6 0.4 looic 232.0 6.4 All Pareto k estimates are good (k < 0.5) See help('pareto-k-diagnostic') for details.
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3) plot(full, label_points = TRUE) plot(reduced, label_points = TRUE)
compare_models(full, reduced)
Error in discrete == discrete[1]: comparison of these types is not implemented
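The compare_models() helper belongs to rstanarm and expects loo objects attached to rstanarm fits, hence the error when it is handed loo objects computed from a raw rstan log-likelihood matrix. As a hedged alternative sketch (assuming the loo package version in use still provides the compare() function), the two loo objects can be compared directly with the loo package itself:

# a negative elpd_diff indicates the first (full) model has the better
# expected predictive accuracy
loo::compare(full, reduced)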
Matrix model (RSTANARM)
summary(data.rstanarm)
Model Info: function: stan_glm family: gaussian [identity] formula: Y ~ A + B algorithm: sampling priors: see help('prior_summary') sample: 2250 (posterior sample size) num obs: 30 Estimates: mean sd 2.5% 25% 50% 75% 97.5% (Intercept) 50.2 1.9 46.4 49.0 50.2 51.5 54.1 AGroup B -17.2 1.9 -21.0 -18.5 -17.2 -15.9 -13.3 AGroup C -22.9 2.0 -26.6 -24.2 -22.9 -21.6 -19.0 B -0.4 0.1 -0.5 -0.5 -0.4 -0.4 -0.3 sigma 4.2 0.6 3.2 3.8 4.1 4.5 5.6 mean_PPD 30.0 1.1 27.8 29.3 30.0 30.7 32.2 log-posterior -98.3 1.8 -103.1 -99.1 -97.9 -96.9 -95.9 Diagnostics: mcse Rhat n_eff (Intercept) 0.0 1.0 1980 AGroup B 0.0 1.0 2145 AGroup C 0.0 1.0 2250 B 0.0 1.0 2250 sigma 0.0 1.0 1200 mean_PPD 0.0 1.0 1720 log-posterior 0.1 1.0 960 For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
# OR library(broom) tidyMCMC(data.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 (Intercept) 50.2286902 1.9365801 46.4428738 54.1263008 2 AGroup B -17.2114440 1.9386491 -21.0066484 -13.4338089 3 AGroup C -22.9037572 1.9679361 -26.6757211 -19.0189639 4 B -0.4138748 0.0681432 -0.5451839 -0.2852376 5 sigma 4.1990238 0.6317351 3.0581470 5.4328442 6 mean_PPD 30.0266842 1.1171901 27.8860300 32.2422092 7 log-posterior -98.2605466 1.8364590 -101.7802647 -95.7415132
- the intercept of the first group (Group A) is 50.2286902
- the mean of the second group (Group B) is -17.211444 units greater than (A)
- the mean of the third group (Group C) is -22.9037572 units greater than (A)
- a one unit increase in B in Group A is associated with a -0.4138748 unit change in Y
While workers attempt to become comfortable with a new statistical framework, it is only natural that they would like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to the frequentist p-values used to investigate the hypothesis that a parameter is equal to zero.
## since values are less than zero head(as.matrix(data.rstanarm))
parameters iterations (Intercept) AGroup B AGroup C B sigma [1,] 49.25161 -18.13674 -20.71374 -0.3963549 4.384061 [2,] 50.28149 -20.18446 -23.60224 -0.3823662 4.601748 [3,] 47.22865 -13.74348 -20.42610 -0.3439257 3.559008 [4,] 50.36383 -18.72387 -23.75338 -0.3776768 4.211066 [5,] 49.33028 -18.88190 -20.94786 -0.3634085 3.830714 [6,] 49.08087 -14.40159 -22.87480 -0.4144611 3.793997
mcmcpvalue(as.matrix(data.rstanarm)[, "AGroup B"]) # effect of (B-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstanarm)[, "AGroup C"]) # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstanarm)[, "B"]) # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstanarm)[, 2:4]) # effect of (model)
[1] 0
There is evidence that the response differs between the groups.
library(loo) (full = loo(data.rstanarm))
Computed from 2250 by 30 log-likelihood matrix Estimate SE elpd_loo -87.2 3.3 p_loo 4.3 0.9 looic 174.4 6.7 All Pareto k estimates are good (k < 0.5) See help('pareto-k-diagnostic') for details.
data.rstanarm.red = update(data.rstanarm, . ~ 1)
Gradient evaluation took 3.8e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.38 seconds. Adjust your expectations accordingly! Elapsed Time: 0.027383 seconds (Warm-up) 0.039991 seconds (Sampling) 0.067374 seconds (Total) Gradient evaluation took 1.1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds. Adjust your expectations accordingly! Elapsed Time: 0.0218 seconds (Warm-up) 0.039113 seconds (Sampling) 0.060913 seconds (Total) Gradient evaluation took 1.2e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds. Adjust your expectations accordingly! Elapsed Time: 0.02501 seconds (Warm-up) 0.043676 seconds (Sampling) 0.068686 seconds (Total)
(reduced = loo(data.rstanarm.red))
Computed from 2250 by 30 log-likelihood matrix Estimate SE elpd_loo -116.0 3.1 p_loo 1.5 0.4 looic 232.0 6.1 All Pareto k estimates are good (k < 0.5) See help('pareto-k-diagnostic') for details.
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3) plot(full, label_points = TRUE) plot(reduced, label_points = TRUE)
compare_models(full, reduced)
elpd_diff se -28.8 4.5
Matrix model (BRMS)
summary(data.brms)
Family: gaussian(identity) Formula: Y ~ A + B Data: data (Number of observations: 30) Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; total post-warmup samples = 2250 ICs: LOO = NA; WAIC = NA; R2 = NA Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept 50.20 1.76 46.79 53.75 2177 1 AGroupB -17.17 1.83 -20.81 -13.68 2158 1 AGroupC -22.89 1.85 -26.43 -19.25 2127 1 B -0.41 0.06 -0.54 -0.29 2129 1 Family Specific Parameters: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat sigma 4.11 0.59 3.14 5.41 1854 1 Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat = 1).
# OR library(broom) tidyMCMC(data.brms$fit, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 b_Intercept 50.1970335 1.75540856 46.6017008 53.5440980 2 b_AGroupB -17.1728328 1.83135058 -20.6803108 -13.5582289 3 b_AGroupC -22.8915861 1.85449337 -26.3846325 -19.2178870 4 b_B -0.4122714 0.06370931 -0.5416812 -0.2961189 5 sigma 4.1081576 0.58644550 3.1075609 5.3375706
- the intercept of the first group (Group A) is 50.1970335
- the mean of the second group (Group B) is -17.1728328 units greater than (A)
- the mean of the third group (Group C) is -22.8915861 units greater than (A)
- a one unit increase in B in Group A is associated with a -0.4122714 unit change in Y
While workers attempt to become comfortable with a new statistical framework, it is only natural that they would like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to the frequentist p-values used to investigate the hypothesis that a parameter is equal to zero.
## since values are less than zero head(as.matrix(data.brms))
parameters iterations b_Intercept b_AGroupB b_AGroupC b_B sigma lp__ [1,] 46.14429 -17.89799 -23.20626 -0.1747196 4.234569 -112.8789 [2,] 50.46466 -17.00736 -21.94628 -0.4189975 3.964478 -105.7635 [3,] 49.51423 -16.74688 -22.42555 -0.4566537 3.784797 -106.7988 [4,] 52.03476 -20.63035 -26.45616 -0.3795796 4.637355 -108.6522 [5,] 49.58570 -18.21573 -23.70766 -0.4119922 4.153369 -106.9716 [6,] 51.87136 -19.76110 -26.12697 -0.4639165 3.485067 -109.3362
mcmcpvalue(as.matrix(data.brms)[, "b_AGroupB"]) # effect of (B-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.brms)[, "b_AGroupC"]) # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.brms)[, "b_B"]) # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data.brms)[, 2:4]) # effect of (model)
[1] 0
There is evidence that the response differs between the groups.
library(loo) (full = loo(data.brms))
LOOIC SE 174.34 7.05
data.brms.red = update(data.brms, . ~ 1)
SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1). Gradient evaluation took 1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 600 / 2000 [ 30%] (Warmup) Iteration: 800 / 2000 [ 40%] (Warmup) Iteration: 1000 / 2000 [ 50%] (Warmup) Iteration: 1001 / 2000 [ 50%] (Sampling) Iteration: 1200 / 2000 [ 60%] (Sampling) Iteration: 1400 / 2000 [ 70%] (Sampling) Iteration: 1600 / 2000 [ 80%] (Sampling) Iteration: 1800 / 2000 [ 90%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.019211 seconds (Warm-up) 0.014546 seconds (Sampling) 0.033757 seconds (Total) SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2). Gradient evaluation took 4e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 600 / 2000 [ 30%] (Warmup) Iteration: 800 / 2000 [ 40%] (Warmup) Iteration: 1000 / 2000 [ 50%] (Warmup) Iteration: 1001 / 2000 [ 50%] (Sampling) Iteration: 1200 / 2000 [ 60%] (Sampling) Iteration: 1400 / 2000 [ 70%] (Sampling) Iteration: 1600 / 2000 [ 80%] (Sampling) Iteration: 1800 / 2000 [ 90%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.015666 seconds (Warm-up) 0.017803 seconds (Sampling) 0.033469 seconds (Total) SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3). Gradient evaluation took 4e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 600 / 2000 [ 30%] (Warmup) Iteration: 800 / 2000 [ 40%] (Warmup) Iteration: 1000 / 2000 [ 50%] (Warmup) Iteration: 1001 / 2000 [ 50%] (Sampling) Iteration: 1200 / 2000 [ 60%] (Sampling) Iteration: 1400 / 2000 [ 70%] (Sampling) Iteration: 1600 / 2000 [ 80%] (Sampling) Iteration: 1800 / 2000 [ 90%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.016449 seconds (Warm-up) 0.016509 seconds (Sampling) 0.032958 seconds (Total)
(reduced = loo(data.brms.red))
LOOIC SE 232.15 6.38
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3) plot(full, label_points = TRUE) plot(reduced, label_points = TRUE)
compare_models(full, reduced)
Error in discrete == discrete[1]: comparison of these types is not implemented
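As above, compare_models() is an rstanarm helper and will not accept the LOO objects produced by brms. A hedged alternative sketch (assuming the brms version in use supports passing multiple fits) is to let brms compute and compare the LOO estimates itself:

# LOO for each model plus the difference in expected predictive accuracy
loo(data.brms, data.brms.red)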
Graphical summaries
A nice graphic is often a great accompaniment to a statistical analysis. Although there are no fixed assumptions associated with graphing (in contrast to statistical analyses), we often want the graphical summaries to reflect the associated statistical analyses. After all, the sample is just one perspective on the population(s). What we are more interested in is being able to estimate and depict likely population parameters/trends.
Thus, whilst we could easily provide a plot displaying the raw data along with simple measures of location and spread, arguably we should use estimates that reflect the fitted model. In this case, it would be appropriate to plot the modelled trend (and credibility interval) for each group across the range of the covariate.
Matrix model (MCMCpack)
mcmc = data.mcmcpack ## Calculate the fitted values newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B), len = 100)) Xmat = model.matrix(~A + B, newdata) coefs = mcmc[, 1:4] fit = coefs %*% t(Xmat) newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval")) ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
As this is a relatively simple model, we could simply add the raw data to this figure. However, for more complex designs with additional predictors it is necessary to plot partial residuals, so that more general approach is illustrated here.
## Calculate partial residuals fitted values fdata = rdata = data fMat = rMat = model.matrix(~A + B, fdata) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(data$Y - apply(coefs, 2, median) %*% t(rMat)) rdata = rdata %>% mutate(partial.resid = resid + fit) ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata, aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
Matrix model (JAGS)
mcmc = data.r2jags$BUGSoutput$sims.matrix ## Calculate the fitted values newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B), len = 100)) Xmat = model.matrix(~A + B, newdata) coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]")] fit = coefs %*% t(Xmat) newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval")) ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
As this is a relatively simple model, we could simply add the raw data to this figure. However, for more complex designs with additional predictors it is necessary to plot partial residuals, so that more general approach is illustrated here.
## Calculate partial residuals fitted values fdata = rdata = data fMat = rMat = model.matrix(~A + B, fdata) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(data$Y - apply(coefs, 2, median) %*% t(rMat)) rdata = rdata %>% mutate(partial.resid = resid + fit) ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata, aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
Matrix model (RSTAN)
mcmc = as.matrix(data.rstan) ## Calculate the fitted values newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B), len = 100)) Xmat = model.matrix(~A + B, newdata) coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]")] fit = coefs %*% t(Xmat) newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval")) ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
As this is a relatively simple model, we could simply add the raw data to this figure. However, for more complex designs with additional predictors it is necessary to plot partial residuals, so that more general approach is illustrated here.
## Calculate partial residuals fitted values fdata = rdata = data fMat = rMat = model.matrix(~A + B, fdata) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(data$Y - apply(coefs, 2, median) %*% t(rMat)) rdata = rdata %>% mutate(partial.resid = resid + fit) ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata, aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
Matrix model (RSTANARM)
## Calculate the fitted values newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B), len = 100)) fit = posterior_linpred(data.rstanarm, newdata = newdata) newdata = newdata %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE, conf.method = "HPDinterval")) ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
As this is a relatively simple model, we could simply add the raw data to this figure. However, for more complex designs with additional predictors it is necessary to plot partial residuals, so that more general approach is illustrated here.
## Calculate partial residuals rdata = data pp = posterior_linpred(data.rstanarm, newdata = rdata) fit = as.vector(apply(pp, 2, median)) resid = resid(data.rstanarm) rdata = rdata %>% mutate(partial.resid = resid + fit) ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata, aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
Matrix model (BRMS)
Although we could calculate the fitted values via matrix multiplication of the coefficients and the model matrix (as for MCMCpack, RJAGS and RSTAN), for more complex models it is more convenient to use the marginal_effects function that comes with brms.
plot(marginal_effects(data.brms), points = TRUE)
# OR eff = plot(marginal_effects(data.brms), points = TRUE, plot = FALSE) eff
$A
$B
## Calculate the fitted values newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B), len = 100)) fit = fitted(data.brms, newdata = newdata, summary = FALSE) newdata = newdata %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE, conf.method = "HPDinterval")) ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
As this is a relatively simple model, we could simply add the raw data to this figure. However, for more complex designs with additional predictors it is necessary to plot partial residuals, so that more general approach is illustrated here.
## Calculate partial residuals rdata = data fit = fitted(data.brms, summary = TRUE)[, "Estimate"] resid = resid(data.brms)[, "Estimate"] rdata = rdata %>% mutate(partial.resid = resid + fit) ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata, aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
Posteriors
In frequentist statistics, when we have more than two groups, we are typically not only interested in whether there is evidence for an overall "effect" of a factor - we are also interested in how various groups compare to one another.
To explore these trends, we either compare each group to each other in a pairwise manner (controlling for family-wise Type I error rates) or we explore an independent subset of the possible comparisons. Although these alternative approaches can adequately address a specific research agenda, they often impose severe limitations and compromises on the scope and breadth of questions that can be asked of your data. The reason for these limitations is that in a frequentist framework, any single hypothesis carries with it a (nominally) 5% chance of a false rejection (since it is based on long-run frequency). Thus, performing multiple tests is likely to compound this error rate. The point is that each comparison is evaluated against its own probability distribution (and each carries a 5% error rate).
By contrast, in Bayesian statistics, all comparisons (contrasts) are drawn from the one (hopefully stable and convergent) posterior distribution and this posterior is invariant to the type and number of comparisons drawn. Hence, the theory clearly indicates that having generated our posterior distribution, we can then query this distribution in any way that we wish thereby allowing us to explore all of our research questions simultaneously.
Bayesian "contrasts" can be performed either:
- defined within the Bayesian sampling model, or
- constructed from the returned MCMC samples (since they are drawn from the posteriors) - a minimal sketch of this approach follows below
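To make the second option concrete before introducing the contrast-matrix machinery below, a single comparison can be derived directly from the posterior samples. The following is a minimal sketch using the MCMCpack samples (column names as shown in the summary above); the Group C vs Group B difference is just the difference between the corresponding coefficient columns, draw by draw:

library(coda)
# Group C - Group B difference computed for every MCMC draw
diff.CB <- as.matrix(data.mcmcpack)[, "AGroup C"] - as.matrix(data.mcmcpack)[, "AGroup B"]
median(diff.CB)                 # point estimate of the difference
HPDinterval(as.mcmc(diff.CB))   # 95% credibility interval for the difference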
In order to allow direct comparison to the frequentist equivalents, I will explore the same set of planned and "Tukey's" test comparisons described here. For the "planned comparison" we defined two contrasts:
- group B vs group C
- group A vs the average of groups B and C
Let's start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix. This is a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter at what level of B we estimate these effects - so let's use the mean of B.
mcmc = data.mcmcpack coefs <- as.matrix(mcmc)[, 1:4] newdata = expand.grid(A = levels(data$A), B = mean(data$B)) # A Tukeys contrast matrix library(multcomp) tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey") tuk.mat
Multiple Comparisons of Means: Tukey Contrasts Group A Group B Group C Group B - Group A -1 1 0 Group C - Group A -1 0 1 Group C - Group B 0 -1 1
Xmat <- model.matrix(~A + B, data = newdata) Xmat
(Intercept) AGroup B AGroup C B 1 1 0 0 16.50175 2 1 1 0 16.50175 3 1 0 1 16.50175 attr(,"assign") [1] 0 1 1 2 attr(,"contrasts") attr(,"contrasts")$A [1] "contr.treatment"
pairwise.mat <- tuk.mat %*% Xmat pairwise.mat
(Intercept) AGroup B AGroup C B Group B - Group A 0 1 0 0 Group C - Group A 0 0 1 0 Group C - Group B 0 -1 1 0
mcmc_areas(coefs %*% t(pairwise.mat))
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Group B - Group A -17.143346 1.878211 -20.750365 -13.348582 2 Group C - Group A -22.913257 1.868172 -26.602078 -19.185214 3 Group C - Group B -5.769911 1.821653 -9.552934 -2.271914
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
With a couple of modifications, we could also express these comparisons as percentage changes. A percentage change represents the change (difference between groups) divided by one of the groups (determined by which group you want to express the percentage change relative to). Hence, we generate an additional MCMC matrix that represents the cell means for the divisor group (the group we want to express change relative to).
Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 values with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instead of the differences. We can then divide the original MCMC matrix above by this new divisor MCMC matrix to yield an MCMC matrix of percentage changes.
# Modify the tuk.mat to replace -1 with 0. This will allow us to get a
# mcmc matrix of divisor (reference group) cell means
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat
(Intercept) AGroup B AGroup C B Group B - Group A 1 1 0 16.50175 Group C - Group A 1 0 1 16.50175 Group C - Group B 1 0 1 16.50175
comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat) (comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Group B - Group A -65.78138 9.862131 -84.99925 -46.617429 2 Group C - Group A -112.84960 15.200206 -144.80969 -84.547452 3 Group C - Group B -28.70578 10.358636 -50.05632 -9.004512
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() + theme_classic()
And now for the specific planned comparisons (Group B vs Group C as well as Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast).
c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3)) c.mat
[,1] [,2] [,3] [1,] 0.0 1.0000000 -1.0000000 [2,] 0.5 -0.3333333 -0.3333333
mcmc = data.mcmcpack coefs <- as.matrix(mcmc)[, 1:4] newdata <- data.frame(A = levels(data$A), B = mean(data$B)) Xmat <- model.matrix(~A + B, data = newdata) c.mat = c.mat %*% Xmat c.mat
(Intercept) AGroup B AGroup C B [1,] 0.0000000 1.0000000 -1.0000000 0.000000 [2,] -0.1666667 -0.3333333 -0.3333333 -2.750292
(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 var1 5.769911 1.8216528 2.271914 9.552934 2 var2 6.123330 0.9187204 4.344498 7.965326
Let's start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix. This is a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter at what level of B we estimate these effects - so let's use the mean of B.
mcmc = data.r2jags$BUGSoutput$sims.matrix coefs <- as.matrix(mcmc)[, 1:4] newdata <- data.frame(A = levels(data$A), B = mean(data$B)) # A Tukeys contrast matrix library(multcomp) tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey") Xmat <- model.matrix(~A + B, data = newdata) pairwise.mat <- tuk.mat %*% Xmat pairwise.mat
(Intercept) AGroup B AGroup C B Group B - Group A 0 1 0 0 Group C - Group A 0 0 1 0 Group C - Group B 0 -1 1 0
mcmc_areas(coefs %*% t(pairwise.mat))
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Group B - Group A -17.140773 1.879343 -20.898966 -13.503813 2 Group C - Group A -22.916543 1.882434 -26.590761 -19.148042 3 Group C - Group B -5.775771 1.880763 -9.402033 -1.979244
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
With a couple of modifications, we could also express these comparisons as percentage changes. A percentage change represents the change (difference between groups) divided by one of the groups (determined by which group you want to express the percentage change relative to). Hence, we generate an additional MCMC matrix that represents the cell means for the divisor group (the group we want to express change relative to).
Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 values with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instead of the differences. We can then divide the original MCMC matrix above by this new divisor MCMC matrix to yield an MCMC matrix of percentage changes.
# Modify the tuk.mat to replace -1 with 0. This will allow us to get a
# mcmc matrix of divisor (reference group) cell means
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat
(Intercept) AGroup B AGroup C B Group B - Group A 1 1 0 16.50175 Group C - Group A 1 0 1 16.50175 Group C - Group B 1 0 1 16.50175
comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat) (comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Group B - Group A -65.77396 9.861029 -85.38307 -46.803881 2 Group C - Group A -112.95081 15.533036 -144.44128 -83.512153 3 Group C - Group B -28.79007 10.740187 -50.07490 -8.241323
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() + theme_classic()
And now for the specific planned comparisons (Group B vs Group C as well as Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast).
c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3)) c.mat
[,1] [,2] [,3] [1,] 0.0 1.0000000 -1.0000000 [2,] 0.5 -0.3333333 -0.3333333
mcmc = data.r2jags$BUGSoutput$sims.matrix coefs <- as.matrix(mcmc)[, 1:4] newdata <- data.frame(A = levels(data$A), B = mean(data$B)) Xmat <- model.matrix(~A + B, data = newdata) c.mat = c.mat %*% Xmat c.mat
(Intercept) AGroup B AGroup C B [1,] 0.0000000 1.0000000 -1.0000000 0.000000 [2,] -0.1666667 -0.3333333 -0.3333333 -2.750292
(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 var1 5.775771 1.880763 1.979244 9.402033 2 var2 6.124007 0.913995 4.278228 7.879558
Let's start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix. This is a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter at what level of B we estimate these effects - so let's use the mean of B.
mcmc = data.rstan coefs <- as.matrix(mcmc)[, 1:4] newdata <- data.frame(A = levels(data$A), B = mean(data$B)) # A Tukeys contrast matrix library(multcomp) tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey") Xmat <- model.matrix(~A + B, data = newdata) pairwise.mat <- tuk.mat %*% Xmat pairwise.mat
(Intercept) AGroup B AGroup C B Group B - Group A 0 1 0 0 Group C - Group A 0 0 1 0 Group C - Group B 0 -1 1 0
mcmc_areas(coefs %*% t(pairwise.mat))
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Group B - Group A -17.120600 1.916002 -20.721678 -13.023381 2 Group C - Group A -22.879458 1.927939 -26.818438 -19.313017 3 Group C - Group B -5.758857 1.908307 -9.465031 -1.972144
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
With a couple of modifications, we could also express these comparisons as percentage changes. A percentage change represents the change (difference between groups) divided by one of the groups (determined by which group you want to express the percentage change relative to). Hence, we generate an additional MCMC matrix that represents the cell means for the divisor group (the group we want to express change relative to).
Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 values with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instead of the differences. We can then divide the original MCMC matrix above by this new divisor MCMC matrix to yield an MCMC matrix of percentage changes.
# Modify the tuk.mat to replace -1 with 0. This will allow us to get a
# mcmc matrix of divisor (reference group) cell means
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat
(Intercept) AGroup B AGroup C B Group B - Group A 1 1 0 16.50175 Group C - Group A 1 0 1 16.50175 Group C - Group B 1 0 1 16.50175
comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat) (comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Group B - Group A -65.68005 9.938618 -86.57138 -46.874022 2 Group C - Group A -112.65860 15.767824 -145.18904 -84.337691 3 Group C - Group B -28.69047 10.825916 -48.80312 -6.685448
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() + theme_classic()
And now for the specific planned comparisons (Group B vs Group C as well as Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast).
c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3)) c.mat
[,1] [,2] [,3] [1,] 0.0 1.0000000 -1.0000000 [2,] 0.5 -0.3333333 -0.3333333
mcmc = data.rstan coefs <- as.matrix(mcmc)[, 1:4] newdata <- data.frame(A = levels(data$A), B = mean(data$B)) Xmat <- model.matrix(~A + B, data = newdata) c.mat = c.mat %*% Xmat c.mat
(Intercept) AGroup B AGroup C B [1,] 0.0000000 1.0000000 -1.0000000 0.000000 [2,] -0.1666667 -0.3333333 -0.3333333 -2.750292
(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 var1 5.758857 1.9083068 1.972144 9.465031 2 var2 6.107318 0.9312412 4.373877 8.159077
Let's start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix. This is a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter at what level of B we estimate these effects - so let's use the mean of B.
mcmc = data.rstanarm coefs <- as.matrix(mcmc)[, 1:4] newdata <- data.frame(A = levels(data$A), B = mean(data$B)) # A Tukeys contrast matrix library(multcomp) tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey") Xmat <- model.matrix(~A + B, data = newdata) pairwise.mat <- tuk.mat %*% Xmat pairwise.mat
(Intercept) AGroup B AGroup C B Group B - Group A 0 1 0 0 Group C - Group A 0 0 1 0 Group C - Group B 0 -1 1 0
mcmc_areas(coefs %*% t(pairwise.mat))
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Group B - Group A -17.211444 1.938649 -21.006648 -13.433809 2 Group C - Group A -22.903757 1.967936 -26.675721 -19.018964 3 Group C - Group B -5.692313 1.892174 -9.159957 -1.799432
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
With a couple of modifications, we could also express these comparisons as percentage changes. A percentage change represents the change (difference between groups) divided by one of the groups (determined by which group you want to express the percentage change relative to). Hence, we generate an additional MCMC matrix that represents the cell means for the divisor group (the group we want to express change relative to).
Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 values with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instead of the differences. We can then divide the original MCMC matrix above by this new divisor MCMC matrix to yield an MCMC matrix of percentage changes.
# Modify the tuk.mat to replace -1 with 0. This will allow us to get a
# mcmc matrix of divisor (reference group) cell means
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat
(Intercept) AGroup B AGroup C B Group B - Group A 1 1 0 16.50175 Group C - Group A 1 0 1 16.50175 Group C - Group B 1 0 1 16.50175
comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat) (comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Group B - Group A -66.16030 10.09871 -86.76507 -46.675359 2 Group C - Group A -112.70201 15.95929 -144.71321 -82.068118 3 Group C - Group B -28.33604 10.75415 -50.15437 -8.304887
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() + theme_classic()
And now for the specific planned comparisons (Group B vs Group C as well as Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast).
c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3)) c.mat
[,1] [,2] [,3] [1,] 0.0 1.0000000 -1.0000000 [2,] 0.5 -0.3333333 -0.3333333
mcmc = data.rstanarm coefs <- as.matrix(mcmc)[, 1:4] newdata <- data.frame(A = levels(data$A), B = mean(data$B)) Xmat <- model.matrix(~A + B, data = newdata) c.mat = c.mat %*% Xmat c.mat
(Intercept) AGroup B AGroup C B [1,] 0.0000000 1.0000000 -1.0000000 0.000000 [2,] -0.1666667 -0.3333333 -0.3333333 -2.750292
(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 var1 5.692313 1.8921744 1.799432 9.159957 2 var2 6.138562 0.9558204 4.235113 7.985014
Let's start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix. This is a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter at what level of B we estimate these effects - so let's use the mean of B.
mcmc = data.brms coefs <- as.matrix(mcmc)[, 1:4] newdata <- data.frame(A = levels(data$A), B = mean(data$B)) # A Tukeys contrast matrix library(multcomp) tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey") Xmat <- model.matrix(~A + B, data = newdata) pairwise.mat <- tuk.mat %*% Xmat pairwise.mat
(Intercept) AGroup B AGroup C B Group B - Group A 0 1 0 0 Group C - Group A 0 0 1 0 Group C - Group B 0 -1 1 0
mcmc_areas(coefs %*% t(pairwise.mat))
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Group B - Group A -17.172833 1.831351 -20.680311 -13.558229 2 Group C - Group A -22.891586 1.854493 -26.384632 -19.217887 3 Group C - Group B -5.718753 1.846079 -9.169838 -2.065845
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
With a couple of modifications, we could also express these comparisons as percentage changes. A percentage change represents the change (difference between groups) divided by one of the groups (determined by which group you want to express the percentage change relative to). Hence, we generate an additional MCMC matrix that represents the cell means for the divisor group (the group we want to express change relative to).
Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 values with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instead of the differences. We can then divide the original MCMC matrix above by this new divisor MCMC matrix to yield an MCMC matrix of percentage changes.
# Modify the tuk.mat to replace -1 with 0. This will allow us to get a
# mcmc matrix of divisor (reference group) cell means
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat
(Intercept) AGroup B AGroup C B Group B - Group A 1 1 0 16.50175 Group C - Group A 1 0 1 16.50175 Group C - Group B 1 0 1 16.50175
comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat) (comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Group B - Group A -65.90339 9.650503 -84.05345 -46.582017 2 Group C - Group A -112.55948 15.341280 -143.66383 -84.530670 3 Group C - Group B -28.43318 10.498519 -47.72283 -7.691681
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() + theme_classic()
And now for the specific planned comparisons (Group B vs Group C as well as Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast).
c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3)) c.mat
[,1] [,2] [,3] [1,] 0.0 1.0000000 -1.0000000 [2,] 0.5 -0.3333333 -0.3333333
mcmc = data.brms coefs <- as.matrix(mcmc)[, 1:4] newdata <- data.frame(A = levels(data$A), B = mean(data$B)) Xmat <- model.matrix(~A + B, data = newdata) c.mat = c.mat %*% Xmat c.mat
(Intercept) AGroup B AGroup C B [1,] 0.0000000 1.0000000 -1.0000000 0.000000 [2,] -0.1666667 -0.3333333 -0.3333333 -2.750292
(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 var1 5.718753 1.8460785 2.065845 9.169838 2 var2 6.122501 0.8984429 4.344599 7.808558
Finite Population Standard Deviations
Variance components, the amount of added variance attributed to each influence, are traditionally estimated for so-called random effects. These are the effects for which the levels employed in the design are randomly selected to represent a broader range of possible levels. For such effects, effect sizes (differences between each level and a reference level) are of little value. Instead, the 'importance' of the variables is measured in units of variance components.
On the other hand, regular variance components for fixed factors (those whose measured levels represent the only levels of interest) are not logical - since variance components estimate variance as if the levels are randomly selected from a larger population. Nevertheless, in order to compare and contrast the scale of variability of both fixed and random factors, it is necessary to measure both on the same scale (sample or population based variance).
Finite-population variance components (Gelman, 2005) assume that the levels of all factors (fixed and random) in the design are all the possible levels available. In other words, they are assumed to represent finite populations of levels. Sample (rather than population) statistics are then used to calculate these finite-population variances (or standard deviations).
Since standard deviations (and variances) are bounded at zero, their posteriors are typically non-normal. Consequently, medians and HPD intervals are more robust estimates.
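As a minimal illustration of why (using simulated values rather than any of the fitted models), for a right-skewed posterior the mean is dragged towards the long tail whereas the median and HPD interval remain representative:

library(coda)
set.seed(1)
sd.post <- sqrt(rchisq(10000, df = 3))   # hypothetical right-skewed posterior for a standard deviation
mean(sd.post)                            # pulled upwards by the right tail
median(sd.post)                          # more robust point estimate
HPDinterval(as.mcmc(sd.post))            # 95% HPD interval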
library(broom) mcmc = data.mcmcpack head(mcmc)
Markov Chain Monte Carlo (MCMC) output: Start = 1001 End = 1007 Thinning interval = 1 (Intercept) AGroup B AGroup C B sigma2 [1,] 50.18839 -18.23379 -22.87414 -0.4394640 14.11917 [2,] 49.39556 -15.64877 -20.98086 -0.4528605 13.22004 [3,] 49.44854 -16.36367 -23.66886 -0.3887300 15.57976 [4,] 51.46891 -16.26642 -24.26313 -0.4709924 12.01597 [5,] 50.55617 -17.54613 -22.44262 -0.4252965 16.35805 [6,] 50.73340 -17.54078 -23.75846 -0.4598987 10.84731 [7,] 51.16774 -17.41150 -22.41515 -0.3996571 17.57170
# A
wch = grep("^A", colnames(mcmc))
# Get the rowwise standard deviations between the effect parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("^B", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get the posterior draws of the model coefficients
wch = grep("(Intercept)|^A|^B", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)
sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 sd.A 4.080665 1.2858143 1.606486 6.754944 2 sd.B 5.030454 0.7743607 3.478134 6.527505 3 sd.resid 3.983878 0.1906139 3.761980 4.356994
# OR expressed as a percentage (fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median", conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 sd.A 31.28801 7.123578 16.55312 44.15266 2 sd.B 38.61193 5.120548 28.29820 48.32862 3 sd.resid 30.05025 4.065060 24.60465 38.59845
## we can even plot this as a Bayesian ANOVA table ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0, linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) + scale_y_continuous("Finite population standard deviation") + scale_x_discrete() + coord_flip() + theme_classic()
Conclusions: approximately 31.3% of the total finite population standard deviation is due to the treatment (A).
library(broom) mcmc = data.r2jags$BUGSoutput$sims.matrix head(mcmc)
beta[1] beta[2] beta[3] beta[4] deviance sigma [1,] 52.60917 -18.82408 -23.38489 -0.4880702 167.0879 4.283265 [2,] 53.75233 -18.89312 -24.92630 -0.5191816 171.4824 5.124059 [3,] 51.15722 -16.81356 -23.25841 -0.5336914 170.4625 3.705732 [4,] 50.32853 -16.98682 -23.41316 -0.4149829 165.9603 4.545555 [5,] 48.78829 -15.18438 -24.19762 -0.4395482 172.1889 4.458376 [6,] 51.55891 -17.64261 -22.37656 -0.5040106 166.7893 4.155680
# A
wch = grep("beta.[2-3]", colnames(mcmc))
# Get the rowwise standard deviations between the effect parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("beta.4", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get the posterior draws of the model coefficients
wch = grep("beta.[1-4]", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)
sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 sd.A 4.085929 1.3242287 1.399537 6.648241 2 sd.B 5.031876 0.7938012 3.468931 6.606399 3 sd.resid 3.991447 0.1970424 3.762837 4.379866
# OR expressed as a percentage (fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median", conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 sd.A 31.33381 7.376887 15.57164 43.97278 2 sd.B 38.65910 5.313671 27.75097 48.62132 3 sd.resid 29.97544 4.188131 24.58803 39.19370
## we can even plot this as a Bayesian ANOVA table ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0, linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) + scale_y_continuous("Finite population standard deviation") + scale_x_discrete() + coord_flip() + theme_classic()
Conclusions:
Approximately 31.3% of the total finite population standard deviation is due to the treatment (factor A).
library(broom) mcmc = as.matrix(data.rstan) head(mcmc)
parameters iterations beta[1] beta[2] beta[3] beta[4] sigma mu[1] mu[2] mu[3] mu[4] [1,] 52.68797 -15.89902 -22.48872 -0.5325084 4.190521 49.72136 44.37439 39.97070 51.24060 [2,] 51.04073 -14.86065 -23.63103 -0.4404722 3.869872 48.58686 44.16402 40.52145 49.84351 [3,] 52.08091 -17.19621 -23.79866 -0.4785798 4.179663 49.41474 44.60927 40.65155 50.78012 [4,] 51.54719 -19.52916 -24.99191 -0.3788141 4.283020 49.43681 45.63310 42.50041 50.51756 [5,] 49.24231 -15.18013 -21.04165 -0.3906899 3.719325 47.06578 43.14281 39.91192 48.18041 [6,] 49.65178 -16.39404 -19.21428 -0.4629938 3.691510 47.07244 42.42347 38.59465 48.39336 parameters iterations mu[5] mu[6] mu[7] mu[8] mu[9] mu[10] mu[11] mu[12] mu[13] [1,] 40.47396 41.79718 41.35552 33.11874 51.77358 31.91453 32.08782 35.17453 31.86079 [2,] 40.93773 42.03225 41.66692 34.85374 50.28437 33.85767 32.29147 34.84469 32.10367 [3,] 41.10385 42.29307 41.89613 34.49351 51.25912 33.41126 30.65966 33.43378 30.45562 [4,] 42.85842 43.79973 43.48554 37.62609 50.89671 36.76944 28.67375 30.86957 28.51224 [5,] 40.28116 41.25197 40.92793 34.88479 48.57144 34.00129 30.61307 32.87772 30.44650 [6,] 39.03221 40.18270 39.79869 32.63715 48.85676 31.59015 29.17031 31.85408 28.97291 parameters iterations mu[14] mu[15] mu[16] mu[17] mu[18] mu[19] mu[20] mu[21] mu[22] [1,] 24.12069 24.92430 28.59623 33.75113 31.31670 16.35942 24.54276 24.16979 27.07583 [2,] 25.70134 26.36605 29.40334 33.66730 31.65363 19.28149 26.05046 22.42234 24.82611 [3,] 23.49939 24.22161 27.52168 32.15453 29.96664 16.52413 23.87872 22.86342 25.47515 [4,] 23.00612 23.57779 26.18991 29.85699 28.12520 17.48493 23.30637 22.26606 24.33334 [5,] 24.76775 25.35734 28.05136 31.83340 30.04731 19.07348 25.07742 23.77698 25.90907 [6,] 22.24322 22.94193 26.13451 30.61648 28.49985 15.49512 22.61020 25.19514 27.72182 parameters iterations mu[23] mu[24] mu[25] mu[26] mu[27] mu[28] mu[29] mu[30] log_lik[1] [1,] 21.17639 32.86026 23.41873 25.46849 10.27852 11.46507 20.12482 28.62927 -2.355750 [2,] 19.94630 29.61079 21.80108 23.49658 10.93196 11.91344 19.07648 26.11106 -2.348154 [3,] 20.17316 30.67378 22.18841 24.03059 10.37895 11.44533 19.22809 26.87127 -2.362435 [4,] 20.13662 28.44826 21.73177 23.18992 12.38413 13.22821 19.38856 25.43843 -2.385424 [5,] 21.58078 30.15298 23.22594 24.72981 13.58525 14.45580 20.80927 27.04880 -2.564270 [6,] 22.59250 32.75114 24.54212 26.32431 13.11726 14.14891 21.67820 29.07247 -2.560301 parameters iterations log_lik[2] log_lik[3] log_lik[4] log_lik[5] log_lik[6] log_lik[7] log_lik[8] log_lik[9] [1,] -2.380651 -2.392311 -2.359395 -2.644059 -2.370981 -2.656101 -2.358963 -3.937789 [2,] -2.321658 -2.285946 -2.297979 -2.721302 -2.283643 -2.564277 -2.439362 -3.463785 [3,] -2.366243 -2.356686 -2.349263 -2.769866 -2.352204 -2.562282 -2.450072 -3.731235 [4,] -2.375320 -2.422274 -2.374746 -3.224846 -2.411614 -2.408977 -3.057791 -3.556057 [5,] -2.413647 -2.289148 -2.466128 -2.560216 -2.299999 -2.726477 -2.418549 -2.888813 [6,] -2.546048 -2.467202 -2.424089 -2.338922 -2.442708 -3.079584 -2.224991 -2.983441 parameters iterations log_lik[10] log_lik[11] log_lik[12] log_lik[13] log_lik[14] log_lik[15] log_lik[16] [1,] -2.560221 -2.385313 -2.758740 -2.999655 -2.808351 -2.357355 -5.539627 [2,] -2.291578 -2.298123 -2.669736 -2.956470 -3.313616 -2.305454 -6.602203 [3,] -2.391007 -2.530008 -2.468268 -3.440631 -2.676762 -2.386746 -4.935821 [4,] -2.499489 -2.925431 -2.381090 -4.170168 -2.601230 -2.460896 -4.195091 [5,] -2.246331 -2.469401 -2.312064 -3.614919 -3.014530 -2.232484 -5.873203 [6,] -2.561867 
-2.812913 -2.232745 -4.376755 -2.390970 -2.440833 -4.643851 parameters iterations log_lik[17] log_lik[18] log_lik[19] log_lik[20] log_lik[21] log_lik[22] log_lik[23] [1,] -2.602428 -2.541644 -2.411276 -2.925312 -2.550560 -2.796703 -2.397480 [2,] -2.549710 -2.440502 -2.909103 -3.472430 -2.915533 -2.369028 -2.480363 [3,] -2.402926 -2.791773 -2.423397 -2.767720 -2.795438 -2.507550 -2.496696 [4,] -2.397022 -3.282272 -2.553796 -2.661806 -2.936898 -2.413541 -2.518648 [5,] -2.272281 -2.768729 -2.857917 -3.144357 -2.565447 -2.513088 -2.259383 [6,] -2.226004 -3.294590 -2.237378 -2.464606 -2.320908 -3.001039 -2.225789 parameters iterations log_lik[24] log_lik[25] log_lik[26] log_lik[27] log_lik[28] log_lik[29] log_lik[30] [1,] -4.438382 -3.307734 -2.499837 -2.496503 -3.808212 -2.814545 -2.829593 [2,] -3.213944 -2.854593 -2.275338 -2.554512 -3.772549 -3.133717 -2.355333 [3,] -3.512033 -2.945362 -2.369487 -2.507912 -3.821285 -3.044316 -2.505695 [4,] -2.842706 -2.833433 -2.373597 -2.891791 -3.165134 -2.993189 -2.395956 [5,] -3.470845 -3.366605 -2.318398 -3.350387 -2.858380 -2.637408 -2.461301 [6,] -4.845740 -3.980841 -2.585876 -3.176836 -2.957508 -2.450304 -2.981159 parameters iterations lp__ [1,] -56.99328 [2,] -56.17719 [3,] -54.73138 [4,] -55.42088 [5,] -54.94832 [6,] -56.99186
# A wch = grep("beta.[2-3]", colnames(mcmc)) # Get the rowwise standard deviations between effects parameters sd.A = apply(mcmc[, wch], 1, sd) # B wch = grep("beta.[4]", colnames(mcmc)) sd.B = sd(data$B) * abs(mcmc[, wch]) # generate a model matrix newdata = data Xmat = model.matrix(~A + B, newdata) ## get median parameter estimates wch = grep("beta", colnames(mcmc)) coefs = mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, data$Y, "-") sd.resid = apply(resid, 1, sd) sd.all = cbind(sd.A, sd.B, sd.resid) (fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 sd.A 4.072432 1.3484539 1.394517 6.692788 2 sd.B 5.053796 0.7704775 3.514660 6.518542 3 sd.resid 3.993504 0.1978560 3.764060 4.389092
# OR expressed as a percentage (fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median", conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 sd.A 31.42998 7.687620 14.88253 43.87950 2 sd.B 38.57359 5.482235 27.76693 48.93887 3 sd.resid 30.05401 4.027620 25.10582 39.01512
## we can even plot this as a Bayesian ANOVA table ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0, linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) + scale_y_continuous("Finite population standard deviation") + scale_x_discrete() + coord_flip() + theme_classic()
Conclusions:
Approximately 31.4% of the total finite population standard deviation is due to the treatment (factor A).
library(broom) mcmc = as.matrix(data.rstanarm) head(mcmc)
parameters iterations (Intercept) AGroup B AGroup C B sigma [1,] 49.25161 -18.13674 -20.71374 -0.3963549 4.384061 [2,] 50.28149 -20.18446 -23.60224 -0.3823662 4.601748 [3,] 47.22865 -13.74348 -20.42610 -0.3439257 3.559008 [4,] 50.36383 -18.72387 -23.75338 -0.3776768 4.211066 [5,] 49.33028 -18.88190 -20.94786 -0.3634085 3.830714 [6,] 49.08087 -14.40159 -22.87480 -0.4144611 3.793997
# A wch = grep("^A", colnames(mcmc)) # Get the rowwise standard deviations between effects parameters sd.A = apply(mcmc[, wch], 1, sd) # B wch = grep("^B", colnames(mcmc)) sd.B = sd(data$B) * abs(mcmc[, wch]) # generate a model matrix newdata = data Xmat = model.matrix(~A + B, newdata) ## get median parameter estimates wch = grep("(Intercept)|^A|^B", colnames(mcmc)) coefs = mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, data$Y, "-") sd.resid = apply(resid, 1, sd) sd.all = cbind(sd.A, sd.B, sd.resid) (fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 sd.A 4.026941 1.3323358 1.272390 6.477067 2 sd.B 5.031510 0.8284226 3.467658 6.627846 3 sd.resid 4.004512 0.2119432 3.762475 4.420304
# OR expressed as a percentage (fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median", conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 sd.A 31.14949 7.543440 13.88979 43.33205 2 sd.B 38.63719 5.482952 27.23552 49.17710 3 sd.resid 30.23369 4.336975 24.98450 40.19062
## we can even plot this as a Bayesian ANOVA table ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0, linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) + scale_y_continuous("Finite population standard deviation") + scale_x_discrete() + coord_flip() + theme_classic()
Conclusions:
Approximately 31.1% of the total finite population standard deviation is due to the treatment (factor A).
library(broom) mcmc = as.matrix(data.brms) head(mcmc)
parameters iterations b_Intercept b_AGroupB b_AGroupC b_B sigma lp__ [1,] 46.14429 -17.89799 -23.20626 -0.1747196 4.234569 -112.8789 [2,] 50.46466 -17.00736 -21.94628 -0.4189975 3.964478 -105.7635 [3,] 49.51423 -16.74688 -22.42555 -0.4566537 3.784797 -106.7988 [4,] 52.03476 -20.63035 -26.45616 -0.3795796 4.637355 -108.6522 [5,] 49.58570 -18.21573 -23.70766 -0.4119922 4.153369 -106.9716 [6,] 51.87136 -19.76110 -26.12697 -0.4639165 3.485067 -109.3362
# A wch = grep("^b_A", colnames(mcmc)) # Get the rowwise standard deviations between effects parameters sd.A = apply(mcmc[, wch], 1, sd) # B wch = grep("^b_B", colnames(mcmc)) sd.B = sd(data$B) * abs(mcmc[, wch]) # generate a model matrix newdata = data Xmat = model.matrix(~A + B, newdata) ## get median parameter estimates wch = c(grep("(b_Intercept)|^b_", colnames(mcmc))) coefs = mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, data$Y, "-") sd.resid = apply(resid, 1, sd) sd.all = cbind(sd.A, sd.B, sd.resid) (fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 sd.A 4.043769 1.3053747 1.460773 6.484054 2 sd.B 5.012018 0.7745195 3.599942 6.585264 3 sd.resid 3.982644 0.1855119 3.761674 4.356373
# OR expressed as a percentage (fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median", conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 sd.A 31.29667 7.332669 15.34081 43.31182 2 sd.B 38.43989 5.149650 28.75407 49.05275 3 sd.resid 30.14396 4.349195 24.45494 39.31055
## we can even plot this as a Bayesian ANOVA table ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0, linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) + scale_y_continuous("Finite population standard deviation") + scale_x_discrete() + coord_flip() + theme_classic()
Conclusions:
Approximately 31.3% of the total finite population standard deviation is due to the treatment (factor A).
$R^2$
In a frequentist context, the $R^2$ value is seen as a useful indicator of goodness of fit. Whilst it has long been acknowledged that this measure is not appropriate for comparing models (for such purposes information criteria such as AIC are more appropriate), it is nevertheless useful for estimating the amount (percent) of variance explained by the model.
In a frequentist context, $R^2$ is calculated as the variance in predicted values divided by the variance in the observed (response) values.
Unfortunately, this classical formulation does not translate simply into a Bayesian context, since the equivalently calculated numerator can be larger than the equivalently calculated denominator - thereby resulting in an $R^2$ greater than 100%. Gelman, Goodrich, Gabry, and Ali (2017) proposed an alternative formulation in which the denominator comprises the sum of the explained variance and the variance of the residuals.
So in the standard regression model notation of: $$ \begin{align} y_i \sim{}& N(\mu_i, \sigma)\\ \mu_i =& \mathbf{X}\boldsymbol{\beta} \end{align} $$ the $R^2$ can be formulated as: $$ R^2 = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_e} $$ where $\sigma^2_f = var(\mu)$ (with $\mu = \mathbf{X}\boldsymbol{\beta}$) and, for Gaussian models, $\sigma^2_e = var(y-\mu)$. Because both variance components are non-negative, this formulation cannot exceed 1.
library(broom) mcmc <- data.mcmcpack Xmat = model.matrix(~A + B, data) wch = grep("(Intercept)|^A|^B", colnames(mcmc)) coefs = mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, data$Y, "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.8712662 0.02114025 0.8316521 0.8958785
# for comparison with frequentist summary(lm(Y ~ A + B, data))
Call: lm(formula = Y ~ A + B, data = data) Residuals: Min 1Q Median 3Q Max -8.671 -2.911 0.328 2.188 7.407 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 50.20598 1.71450 29.283 < 2e-16 *** AGroup B -17.15033 1.78618 -9.602 4.91e-10 *** AGroup C -22.91188 1.79775 -12.745 1.09e-12 *** B -0.41403 0.06143 -6.740 3.76e-07 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 3.973 on 26 degrees of freedom Multiple R-squared: 0.884, Adjusted R-squared: 0.8706 F-statistic: 66.05 on 3 and 26 DF, p-value: 2.719e-12
library(broom) mcmc <- data.r2jags$BUGSoutput$sims.matrix Xmat = model.matrix(~A + B, data) wch = grep("beta", colnames(mcmc)) coefs = mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, data$Y, "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.8708952 0.02144096 0.8290317 0.8960092
# for comparison with frequentist summary(lm(Y ~ A + B, data))
Call: lm(formula = Y ~ A + B, data = data) Residuals: Min 1Q Median 3Q Max -8.671 -2.911 0.328 2.188 7.407 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 50.20598 1.71450 29.283 < 2e-16 *** AGroup B -17.15033 1.78618 -9.602 4.91e-10 *** AGroup C -22.91188 1.79775 -12.745 1.09e-12 *** B -0.41403 0.06143 -6.740 3.76e-07 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 3.973 on 26 degrees of freedom Multiple R-squared: 0.884, Adjusted R-squared: 0.8706 F-statistic: 66.05 on 3 and 26 DF, p-value: 2.719e-12
library(broom) mcmc <- as.matrix(data.rstan) Xmat = model.matrix(~A + B, data) wch = grep("beta", colnames(mcmc)) coefs = mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, data$Y, "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.8706419 0.02178395 0.8297436 0.8959157
# for comparison with frequentist summary(lm(Y ~ A + B, data))
Call: lm(formula = Y ~ A + B, data = data) Residuals: Min 1Q Median 3Q Max -8.671 -2.911 0.328 2.188 7.407 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 50.20598 1.71450 29.283 < 2e-16 *** AGroup B -17.15033 1.78618 -9.602 4.91e-10 *** AGroup C -22.91188 1.79775 -12.745 1.09e-12 *** B -0.41403 0.06143 -6.740 3.76e-07 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 3.973 on 26 degrees of freedom Multiple R-squared: 0.884, Adjusted R-squared: 0.8706 F-statistic: 66.05 on 3 and 26 DF, p-value: 2.719e-12
library(broom) mcmc <- as.matrix(data.rstanarm) Xmat = model.matrix(~A + B, data) wch = grep("Intercept|^A|^B", colnames(mcmc)) coefs = mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, data$Y, "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.8699886 0.0241677 0.8256175 0.8960312
# for comparison with frequentist summary(lm(Y ~ A + B, data))
Call: lm(formula = Y ~ A + B, data = data) Residuals: Min 1Q Median 3Q Max -8.671 -2.911 0.328 2.188 7.407 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 50.20598 1.71450 29.283 < 2e-16 *** AGroup B -17.15033 1.78618 -9.602 4.91e-10 *** AGroup C -22.91188 1.79775 -12.745 1.09e-12 *** B -0.41403 0.06143 -6.740 3.76e-07 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 3.973 on 26 degrees of freedom Multiple R-squared: 0.884, Adjusted R-squared: 0.8706 F-statistic: 66.05 on 3 and 26 DF, p-value: 2.719e-12
library(broom) mcmc <- as.matrix(data.brms) Xmat = model.matrix(~A + B, data) wch = grep("b_", colnames(mcmc)) coefs = mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, data$Y, "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.8712322 0.02007052 0.8322119 0.8956157
# for comparison with frequentist summary(lm(Y ~ A + B, data))
Call: lm(formula = Y ~ A + B, data = data) Residuals: Min 1Q Median 3Q Max -8.671 -2.911 0.328 2.188 7.407 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 50.20598 1.71450 29.283 < 2e-16 *** AGroup B -17.15033 1.78618 -9.602 4.91e-10 *** AGroup C -22.91188 1.79775 -12.745 1.09e-12 *** B -0.41403 0.06143 -6.740 3.76e-07 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 3.973 on 26 degrees of freedom Multiple R-squared: 0.884, Adjusted R-squared: 0.8706 F-statistic: 66.05 on 3 and 26 DF, p-value: 2.719e-12
Dealing with heterogeneous slopes
- the sample size per treatment=10
- the categorical $x$ variable with 3 levels
- the first treatment group has a population mean of 40.
- the other two treatments reduce the mean by 15 and 20 units respectively
- the data are drawn from normal distributions with a mean of 0 and standard deviation of 4 ($\sigma^2=16$)
- the covariate (B) is a continuous variable with a mean of 20 and a standard deviation of 15
- unlike the previous scenario, the effect of the covariate (the slope) differs between groups: the baseline slope is -0.45, and the interaction effects shift this by -0.1 and 0.5 units for Groups B and C respectively
set.seed(6) n <- 10 p <- 3 A.eff <- c(40, -15, -20) beta <- c(-0.45, -0.1, 0.5) sigma <- 4 B <- rnorm(n * p, 0, 15) A <- gl(p, n, lab = paste("Group", LETTERS[1:3])) mm <- model.matrix(~A * B) data1 <- data.frame(A = A, B = B, Y = as.numeric(c(A.eff, beta) %*% t(mm)) + rnorm(n * p, 0, 4)) data1$B <- data1$B + 20 head(data1)
A B Y 1 Group A 24.04409 35.49432 2 Group A 10.55022 46.22216 3 Group A 33.02990 29.41898 4 Group A 45.90793 24.10656 5 Group A 20.36281 44.38834 6 Group A 25.52038 36.87477
Exploratory data analysis
library(car) scatterplot(Y ~ B | A, data = data1)
boxplot(Y ~ A, data1)
# OR via ggplot library(ggplot2) ggplot(data1, aes(y = Y, x = B, group = A)) + geom_point() + geom_smooth(method = "lm")
ggplot(data1, aes(y = Y, x = A)) + geom_boxplot()
- there is no evidence of obvious non-normality
- the assumption of linearity seems reasonable
- the variability of the three groups seems approximately equal
- the slopes (Y vs B trends) do appear to differ between treatment groups - in particular, Group C seems to display a different trend from Groups A and B (see the quick numerical check below).
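As a quick numerical check of that last impression, we can fit a separate ordinary least squares regression of Y against B within each group and compare the estimated slopes. This is only an exploratory sketch (using the data1 generated above); the formal assessment of unequal slopes follows in the next section.

## Within-group OLS slopes of Y against B (exploratory check only)
sapply(split(data1, data1$A), function(d) coef(lm(Y ~ B, data = d))["B"])

If Group C's slope stands apart from those of Groups A and B, that is consistent with the pattern evident in the scatterplot.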
Homogeneity of slopes
We can explore inferential evidence of unequal slopes by examining the estimated effects of the interaction between the categorical variable and the covariate. Note, pay no attention to the main effects - only the interaction. Even though I intend to illustrate Bayesian analyses here, for such a simple model it is considerably simpler to use traditional OLS to test for the presence of an interaction.
anova(lm(Y ~ B * A, data = data1))
Analysis of Variance Table Response: Y Df Sum Sq Mean Sq F value Pr(>F) B 1 1257.14 1257.14 98.520 5.685e-10 *** A 2 2042.02 1021.01 80.015 2.420e-11 *** B:A 2 510.02 255.01 19.985 7.778e-06 *** Residuals 24 306.25 12.76 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model fitting or statistical analysis
The multiplicative model is constructed and specified in the same way as the additive model. The difference is that the model matrix ($\mathbf{X}$) contains the interaction terms.
$$ \begin{align} y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em] \beta_0 &\sim{} N(0,100)\\ \beta &\sim{} N(0,10)\\ \sigma &\sim{} cauchy(0,5)\\ \end{align} $$
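Before fitting, it can help to look at what the interaction adds to the model matrix. The sketch below (using the data1 generated above) simply lists the columns of the multiplicative model matrix; the last two columns are the products of the group dummy variables and the covariate.

## Columns of the multiplicative (interaction) model matrix
colnames(model.matrix(~A * B, data = data1))
## "(Intercept)" "AGroup B" "AGroup C" "B" "AGroup B:B" "AGroup C:B"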
library(MCMCpack) data1.mcmcpack <- MCMCregress(Y ~ A * B, data = data1)
Define the model
modelString = " model { #Likelihood for (i in 1:n) { y[i]~dnorm(mean[i],tau) mean[i] <- inprod(beta[],X[i,]) } #Priors for (i in 1:ngroups) { beta[i] ~ dnorm(0, 1.0E-6) } sigma ~ dunif(0, 100) tau <- 1 / (sigma * sigma) } "
Define the data
Arrange the data as a list (as required by BUGS). As input, JAGS will need to be supplied with:
- the response variable (y)
- the predictor model matrix (X)
- the total number of observed items (n)
- the number of predictor terms (ngroups)
X <- model.matrix(~A * B, data1) data1.list <- with(data1, list(y = Y, X = X, n = nrow(data1), ngroups = ncol(X)))
Define the MCMC chain parameters
Next we should define the behavioural parameters of the MCMC sampling chains. Include the following:
- the nodes (estimated parameters) to monitor (return samples for)
- the number of MCMC chains (3)
- the number of burnin steps (3000)
- the thinning factor (10)
- the number of MCMC iterations - determined by the number of samples to save, the rate of thinning and the number of chains
params <- c("beta", "sigma") nChains = 3 burnInSteps = 3000 thinSteps = 10 numSavedSteps = 15000 #across all chains nIter = ceiling(burnInSteps + (numSavedSteps * thinSteps)/nChains) nIter
[1] 53000
Fit the model
Now run the JAGS code via the R2jags interface. Note that the first time jags is run after the R2jags package is loaded, it is often necessary to run any kind of randomization function just to initiate the .Random.seed variable.
library(R2jags)
data1.r2jags <- jags(data = data1.list, inits = NULL, parameters.to.save = params, model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter, n.burnin = burnInSteps, n.thin = thinSteps)
Compiling model graph Resolving undeclared variables Allocating nodes Graph information: Observed stochastic nodes: 30 Unobserved stochastic nodes: 7 Total graph size: 298 Initializing model
print(data1.r2jags)
Inference for Bugs model at "5", fit using jags, 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10 n.sims = 15000 iterations saved mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff beta[1] 49.553 2.336 45.049 48.003 49.538 51.092 54.162 1.001 15000 beta[2] -12.227 3.093 -18.248 -14.285 -12.228 -10.182 -6.142 1.001 11000 beta[3] -28.270 2.993 -34.198 -30.236 -28.275 -26.304 -22.340 1.001 15000 beta[4] -0.495 0.093 -0.680 -0.556 -0.494 -0.433 -0.313 1.001 15000 beta[5] -0.121 0.115 -0.347 -0.197 -0.122 -0.046 0.105 1.001 15000 beta[6] 0.410 0.111 0.193 0.336 0.410 0.482 0.636 1.001 15000 sigma 3.772 0.585 2.827 3.354 3.707 4.106 5.105 1.001 15000 deviance 163.101 4.379 156.875 159.924 162.329 165.512 173.639 1.001 15000 For each parameter, n.eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor (at convergence, Rhat=1). DIC info (using the rule, pD = var(deviance)/2) pD = 9.6 and DIC = 172.7 DIC is an estimate of expected predictive error (lower deviance is better).
data1.mcmc.list <- as.mcmc(data1.r2jags)
Model matrix formulation
The minimum model in Stan required to fit the above model follows. Note the following modifications from the model defined in JAGS:
- the normal distribution is defined by standard deviation rather than precision
- rather than using a uniform prior for sigma, I am using a half-Cauchy
We now translate the likelihood model into STAN code.
$$\begin{align}
y_{ij}&\sim{}N(\mu_{ij}, \sigma)\\
\mu_{ij} &= \mathbf{X}\boldsymbol{\beta}\\
\beta_0&\sim{}N(0,100)\\
\beta&\sim{}N(0,10)\\
\sigma&\sim{}Cauchy(0,5)\\
\end{align}
$$
Define the model
modelString = " data { int<lower=1> n; int<lower=1> nX; vector [n] y; matrix [n,nX] X; } parameters { vector[nX] beta; real<lower=0> sigma; } transformed parameters { vector[n] mu; mu = X*beta; } model { #Likelihood y~normal(mu,sigma); #Priors beta ~ normal(0,100); sigma~cauchy(0,5); } generated quantities { vector[n] log_lik; for (i in 1:n) { log_lik[i] = normal_lpdf(y[i] | mu[i], sigma); } } "
Define the data
Arrange the data as a list (as required by Stan). As input, Stan will need to be supplied with:
- the response variable (y)
- the predictor model matrix (X)
- the total number of observed items (n)
- the number of predictor terms (nX)
Xmat <- model.matrix(~A * B, data1) data1.list <- with(data1, list(y = Y, X = Xmat, nX = ncol(Xmat), n = nrow(data1)))
Fit the model
Now run the Stan code via the rstan interface.
## load the rstan package library(rstan)
data1.rstan <- stan(data = data1.list, model_code = modelString, chains = 3, iter = 2000, warmup = 500, thin = 3)
SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 1). Gradient evaluation took 2.2e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.099447 seconds (Warm-up) 0.151706 seconds (Sampling) 0.251153 seconds (Total) SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 2). Gradient evaluation took 7e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.106983 seconds (Warm-up) 0.116191 seconds (Sampling) 0.223174 seconds (Total) SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 3). Gradient evaluation took 7e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.142543 seconds (Warm-up) 0.116691 seconds (Sampling) 0.259234 seconds (Total)
print(data1.rstan, par = c("beta", "sigma"))
Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a. 3 chains, each with iter=2000; warmup=500; thin=3; post-warmup draws per chain=500, total post-warmup draws=1500. mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat beta[1] 49.50 0.08 2.34 44.55 48.01 49.58 51.01 54.11 902 1 beta[2] -12.29 0.09 3.04 -18.60 -14.30 -12.17 -10.22 -6.70 1094 1 beta[3] -28.21 0.10 2.97 -33.80 -30.23 -28.34 -26.40 -22.04 957 1 beta[4] -0.49 0.00 0.09 -0.67 -0.55 -0.49 -0.44 -0.31 978 1 beta[5] -0.12 0.00 0.11 -0.34 -0.19 -0.12 -0.05 0.11 1164 1 beta[6] 0.41 0.00 0.11 0.18 0.34 0.41 0.48 0.62 953 1 sigma 3.73 0.02 0.58 2.81 3.32 3.66 4.05 5.07 1271 1 Samples were drawn using NUTS(diag_e) at Sat Nov 4 07:45:04 2017. For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1).
library(rstanarm) library(broom) library(coda)
data1.rstanarm = stan_glm(Y ~ A * B, data = data1, iter = 2000, warmup = 500, chains = 3, thin = 2, refresh = 0, prior_intercept = normal(0, 100), prior = normal(0, 100), prior_aux = cauchy(0, 2))
Gradient evaluation took 7.1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.71 seconds. Adjust your expectations accordingly! Elapsed Time: 0.586557 seconds (Warm-up) 0.34573 seconds (Sampling) 0.932287 seconds (Total) Gradient evaluation took 2.5e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds. Adjust your expectations accordingly! Elapsed Time: 0.478284 seconds (Warm-up) 0.353797 seconds (Sampling) 0.832081 seconds (Total) Gradient evaluation took 1.2e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds. Adjust your expectations accordingly! Elapsed Time: 0.438293 seconds (Warm-up) 0.338371 seconds (Sampling) 0.776664 seconds (Total)
print(data1.rstanarm)
stan_glm family: gaussian [identity] formula: Y ~ A * B ------ Estimates: Median MAD_SD (Intercept) 49.6 2.3 AGroup B -12.3 3.1 AGroup C -28.3 3.0 B -0.5 0.1 AGroup B:B -0.1 0.1 AGroup C:B 0.4 0.1 sigma 3.7 0.6 Sample avg. posterior predictive distribution of y (X = xbar): Median MAD_SD mean_PPD 26.9 1.0 ------ For info on the priors used see help('prior_summary.stanreg').
tidyMCMC(data1.rstanarm, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 (Intercept) 49.5438430 2.38328655 44.8579446 54.1707410 2 AGroup B -12.3097610 3.13510315 -18.8059933 -6.6018889 3 AGroup C -28.2751561 3.02810337 -33.8195233 -21.9065319 4 B -0.4940998 0.09381214 -0.6922853 -0.3200262 5 AGroup B:B -0.1204605 0.11724241 -0.3514843 0.1104638 6 AGroup C:B 0.4111006 0.11271108 0.1810014 0.6214744 7 sigma 3.7727701 0.59136185 2.7322401 4.9499110
library(brms) library(broom) library(coda)
data1.brms = brm(Y ~ A * B, data = data1, iter = 2000, warmup = 500, chains = 3, thin = 2, refresh = 0, prior = c(prior(normal(0, 100), class = "Intercept"), prior(normal(0, 100), class = "b"), prior(cauchy(0, 5), class = "sigma")))
Gradient evaluation took 2.9e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.29 seconds. Adjust your expectations accordingly! Elapsed Time: 0.094494 seconds (Warm-up) 0.100503 seconds (Sampling) 0.194997 seconds (Total) Gradient evaluation took 8e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds. Adjust your expectations accordingly! Elapsed Time: 0.076777 seconds (Warm-up) 0.104894 seconds (Sampling) 0.181671 seconds (Total) Gradient evaluation took 9e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds. Adjust your expectations accordingly! Elapsed Time: 0.091052 seconds (Warm-up) 0.09965 seconds (Sampling) 0.190702 seconds (Total)
print(data1.brms)
Family: gaussian(identity) Formula: Y ~ A * B Data: data1 (Number of observations: 30) Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; total post-warmup samples = 2250 ICs: LOO = NA; WAIC = NA; R2 = NA Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept 49.60 2.32 44.97 54.25 1182 1 AGroupB -12.33 3.06 -18.49 -6.37 1216 1 AGroupC -28.29 2.97 -34.18 -22.34 1246 1 B -0.49 0.09 -0.67 -0.32 1160 1 AGroupB:B -0.12 0.11 -0.35 0.11 1186 1 AGroupC:B 0.41 0.11 0.19 0.62 1248 1 Family Specific Parameters: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat sigma 3.71 0.54 2.81 4.98 1423 1 Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat = 1).
tidyMCMC(data1.brms, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 b_Intercept 49.6027270 2.3173028 44.8624049 54.0547491 2 b_AGroupB -12.3262056 3.0553552 -18.7775957 -6.7594663 3 b_AGroupC -28.2870152 2.9659002 -33.8736952 -22.1495177 4 b_B -0.4926334 0.0908813 -0.6696454 -0.3177777 5 b_AGroupB:B -0.1212027 0.1126979 -0.3497945 0.1053154 6 b_AGroupC:B 0.4067397 0.1097094 0.1878784 0.6203292 7 sigma 3.7129053 0.5394484 2.6776034 4.7406944
MCMC diagnostics
In addition to the regular model diagnostic checks (such as residual plots), for Bayesian analyses, it is necessary to explore the characteristics of the MCMC chains and the sampler in general. Recall that the purpose of MCMC sampling is to replicate the posterior distribution of the model likelihood and priors by drawing a known number of samples from this posterior (thereby formulating a probability distribution). This is only reliable if the MCMC samples accurately reflect the posterior.
Unfortunately, since we only know the posterior in the most trivial of circumstances, it is necessary to rely on indirect measures of how accurately the MCMC samples are likely to reflect the likelihood. I will briefly outline the most important diagnostics, however, please refer to Tutorial 4.3, Section 3.1: Markov Chain Monte Carlo sampling for a discussion of these diagnostics.
- Traceplots for each parameter illustrate the MCMC sample values after each successive
iteration along the chain. Bad chain mixing (characterized by any sort of pattern) suggests
that the MCMC sampling chains may not have completely traversed all features of the posterior
distribution and that more iterations are required to ensure the distribution has been accurately
represented.
- Autocorrelation plots for each parameter illustrate the degree of correlation between
MCMC samples separated by different lags. For example, a lag of 0 represents the degree of
correlation between each MCMC sample and itself (obviously this will be a correlation of 1).
A lag of 1 represents the degree of correlation between each MCMC sample and the next sample along the Chain
and so on. In order to be able to generate unbiased estimates of parameters, the MCMC samples should be
independent (uncorrelated). In the figures below, this would be violated in the top autocorrelation plot and met in the bottom
autocorrelation plot.
- Rhat statistic for each parameter provides a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space. A numerical sketch of these diagnostics follows this list.
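The three diagnostics above can also be obtained numerically. The following is a minimal sketch using coda on the multi-chain JAGS fit from above (the MCMCpack fit comprises a single chain and so is not suitable for the Rhat calculation):

library(coda)
## Convert the three JAGS chains into an mcmc.list and summarise the diagnostics
data1.mcmc <- as.mcmc(data1.r2jags)
gelman.diag(data1.mcmc)    # Rhat (potential scale reduction factor) per parameter
effectiveSize(data1.mcmc)  # effective number of independent samples
autocorr.diag(data1.mcmc)  # autocorrelation at increasing lags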
Prior to inspecting any summaries of the parameter estimates, it is prudent to inspect a range of chain convergence diagnostics.
- Trace plots
View trace plots
library(MCMCpack) plot(data1.mcmcpack)
- Raftery diagnostic
View Raftery diagnostic
library(MCMCpack) raftery.diag(data1.mcmcpack)
Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) (Intercept) 2 3929 3746 1.050 AGroup B 2 3650 3746 0.974 AGroup C 2 3710 3746 0.990 B 2 3650 3746 0.974 AGroup B:B 2 3962 3746 1.060 AGroup C:B 2 3865 3746 1.030 sigma2 2 3929 3746 1.050
- Autocorrelation diagnostic
View autocorrelations
library(MCMCpack) autocorr.diag(data1.mcmcpack)
(Intercept) AGroup B AGroup C B AGroup B:B AGroup C:B sigma2 Lag 0 1.00000000 1.000000000 1.000000000 1.000000000 1.000000000 1.0000000000 1.000000000 Lag 1 -0.01234075 -0.008277569 -0.011438936 -0.003198364 -0.011825247 -0.0003818285 0.217142614 Lag 5 0.01859622 0.019122855 -0.003823357 0.013434001 0.026688474 0.0020263629 0.005952516 Lag 10 0.01991699 0.021417875 0.011460174 0.013885559 0.022966909 0.0030683915 0.023030028 Lag 50 -0.00543317 -0.009493830 -0.007385782 -0.005974147 -0.008253629 0.0009582506 0.011223549
Again, prior to examining the summaries, we should have explored the convergence diagnostics.
library(coda) data1.mcmc = as.mcmc(data1.r2jags)
- Trace plots
plot(data1.mcmc)
When there are a lot of parameters, this can result in a very large number of traceplots. To focus on just certain parameters (such as the $\beta$s):
preds <- c("beta[1]", "beta[2]", "beta[3]", "beta[4]") plot(as.mcmc(data1.r2jags)[, preds])
- Raftery diagnostic
raftery.diag(data1.mcmc)
[[1]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta[1] 20 37410 3746 9.99 beta[2] 20 38660 3746 10.30 beta[3] 20 36800 3746 9.82 beta[4] 20 38660 3746 10.30 beta[5] 20 37410 3746 9.99 beta[6] 20 38030 3746 10.20 deviance 20 39000 3746 10.40 sigma 20 36200 3746 9.66 [[2]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta[1] 20 37410 3746 9.99 beta[2] 20 38030 3746 10.20 beta[3] 20 38050 3746 10.20 beta[4] 10 37410 3746 9.99 beta[5] 20 36800 3746 9.82 beta[6] 20 36800 3746 9.82 deviance 20 38030 3746 10.20 sigma 20 36800 3746 9.82 [[3]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta[1] 20 36200 3746 9.66 beta[2] 20 37410 3746 9.99 beta[3] 20 39300 3746 10.50 beta[4] 20 39300 3746 10.50 beta[5] 20 36800 3746 9.82 beta[6] 20 36800 3746 9.82 deviance 20 39300 3746 10.50 sigma 20 36810 3746 9.83
- Autocorrelation diagnostic
autocorr.diag(data1.mcmc)
beta[1] beta[2] beta[3] beta[4] beta[5] beta[6] deviance Lag 0 1.000000000 1.000000000 1.000000000 1.000000000 1.000000000 1.000000000 1.0000000000 Lag 10 0.004437399 0.002545394 0.005457633 0.002408673 0.005032132 0.003931065 -0.0005090962 Lag 50 -0.005470768 -0.006470215 -0.008316906 -0.004031177 -0.009725834 -0.003744444 0.0069346156 Lag 100 -0.006510539 -0.018290658 -0.009018695 0.003875208 -0.005127466 0.004179197 -0.0033519319 Lag 500 -0.002091585 -0.006610883 0.001302301 -0.000777923 -0.003703581 0.005995650 0.0081995570 sigma Lag 0 1.000000000 Lag 10 0.002181234 Lag 50 -0.002582261 Lag 100 0.004125329 Lag 500 0.008170166
Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).
- extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
- use the numerous routines that come with the rstan package
- use the routines that come with the bayesplot package
- explore the diagnostics interactively via shinystan
- via coda
- Traceplots
- Autocorrelation
library(coda) s = as.array(data1.rstan) wch = grep("beta", dimnames(s)$parameters) s = s[, , wch] mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc)) plot(mcmc)
library(coda) s = as.array(data1.rstan) wch = grep("beta", dimnames(s)$parameters) s = s[, , wch] mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc)) autocorr.diag(mcmc)
beta[1] beta[2] beta[3] beta[4] beta[5] Lag 0 1.00000000 1.000000000 1.000000000 1.000000000 1.000000000 Lag 1 0.19962255 0.134452951 0.162486507 0.181592704 0.125717018 Lag 5 -0.01473278 0.013954338 -0.020209597 -0.007642484 0.007016602 Lag 10 -0.01802321 -0.008248189 -0.007702265 -0.019093851 -0.002760047 Lag 50 -0.03711757 -0.035737496 -0.030485873 -0.020661680 -0.014791965
- via rstan
- Traceplots
stan_trace(data1.rstan)
- Raftery diagnostic
raftery.diag(data1.rstan)
Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
- Autocorrelation diagnostic
stan_ac(data1.rstan)
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
stan_rhat(data1.rstan)
- Another measure of sampling efficiency is Effective Sample Size (ess).
ESS indicates the number of samples (or proportion of samples) that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria and when it does so, the previous sample value is used. Hence while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable. A numerical extraction of these values is sketched after the plot call below.
stan_ess(data1.rstan)
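For a numeric rather than graphical view, the effective sample sizes can also be pulled straight from the stanfit summary; a brief sketch (the n_eff column holds rstan's effective sample size estimates):

## Effective sample sizes for the regression parameters from the stanfit summary
summary(data1.rstan, pars = c("beta", "sigma"))$summary[, "n_eff"]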
- via bayesplot
- Trace plots and density plots
library(bayesplot) mcmc_trace(as.matrix(data1.rstan), regex_pars = "beta|sigma")
library(bayesplot) mcmc_combo(as.matrix(data1.rstan), regex_pars = "beta|sigma")
- Density plots
library(bayesplot) mcmc_dens(as.matrix(data1.rstan), regex_pars = "beta|sigma")
- via shinystan
library(shinystan) launch_shinystan(data1.rstan)
- It is worth exploring the influence of our priors.
Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STANARM model fits (for exploring diagnostics and summarization).
- extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
- use the numerous routines that come with the rstan package
- use the routines that come with the bayesplot package
- explore the diagnostics interactively via shinystan
- via coda
- Traceplots
- Autocorrelation
library(coda) s = as.array(data1.rstanarm) mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc)) plot(mcmc)
library(coda) s = as.array(data1.rstanarm) mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc)) autocorr.diag(mcmc)
(Intercept) AGroup B AGroup C B AGroup B:B AGroup C:B Lag 0 1.000000000 1.000000000 1.000000000 1.000000000 1.000000000 1.000000000 Lag 1 0.051997304 0.022839434 0.049122699 0.029332271 0.018600498 0.039303026 Lag 5 -0.009030666 -0.023573121 -0.021521616 0.002608995 -0.008629405 -0.008541973 Lag 10 0.022740083 0.005983251 -0.006472736 0.009874178 -0.009109182 -0.041524169 Lag 50 0.023718877 0.022629378 0.043072824 0.029933371 0.029119110 0.025530832
- via rstan
- Traceplots
stan_trace(data1.rstanarm)
- Raftery diagnostic
raftery.diag(data1.rstanarm)
Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
- Autocorrelation diagnostic
stan_ac(data1.rstanarm)
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
stan_rhat(data1.rstanarm)
- Another measure of sampling efficiency is Effective Sample Size (ess).
ESS indicates the number of samples (or proportion of samples) that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria and when it does so, the previous sample value is used. Hence while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
stan_ess(data1.rstanarm)
- via bayesplot
- Trace plots and density plots
library(bayesplot) mcmc_trace(as.array(data1.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
mcmc_combo(as.array(data1.rstanarm))
- Density plots
mcmc_dens(as.array(data1.rstanarm))
- via rstanarm
The rstanarm package provides additional posterior checks.
- Posterior vs Prior - this compares the posterior estimate for each parameter against the associated prior. If the spread of the priors is small relative to the posterior, then it is likely that the priors are too influential. On the other hand, overly wide priors can lead to computational issues.
library(rstanarm) posterior_vs_prior(data1.rstanarm, color_by = "vs", group_by = TRUE, facet_args = list(scales = "free_y"))
Gradient evaluation took 6.9e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.69 seconds. Adjust your expectations accordingly! Elapsed Time: 0.403063 seconds (Warm-up) 0.077008 seconds (Sampling) 0.480071 seconds (Total) Gradient evaluation took 1.1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds. Adjust your expectations accordingly! Elapsed Time: 0.280175 seconds (Warm-up) 0.104319 seconds (Sampling) 0.384494 seconds (Total)
- via shinystan
library(shinystan) launch_shinystan(data1.rstanarm)
Again, prior to examining the summaries, we should have explored the convergence diagnostics. Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model. There are numerous ways of working with BRMS model fits (for exploring diagnostics and summarization).
- extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
- use the numerous routines that come with the rstan package
- use the routines that come with the bayesplot package
- explore the diagnostics interactively via shinystan
- via coda
- Traceplots
- Autocorrelation
library(coda) mcmc = as.mcmc(data1.brms) plot(mcmc)
library(coda) mcmc = as.mcmc(data1.brms) autocorr.diag(mcmc)
Error in ts(x, start = start(x), end = end(x), deltat = thin(x)): invalid time series parameters specified
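autocorr.diag() fails here on the object returned by as.mcmc() for the brms fit. One simple workaround (a sketch only, not the only option) is to pool the posterior draws into a plain matrix and coerce that to an mcmc object; note that pooling discards the chain structure:

library(coda)
## Pool posterior draws across chains and compute autocorrelations on the pooled samples
mcmc.pooled <- as.mcmc(as.matrix(data1.brms))
autocorr.diag(mcmc.pooled)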
- via rstan
- Traceplots
stan_trace(data1.brms$fit)
- Raftery diagnostic
raftery.diag(data1.brms)
Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
- Autocorrelation diagnostic
stan_ac(data1.brms$fit)
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
stan_rhat(data1.brms$fit)
- Another measure of sampling efficiency is Effective Sample Size (ess).
ESS indicates the number of samples (or proportion of samples) that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria and when it does so, the previous sample value is used. Hence while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
stan_ess(data1.brms$fit)
Model validation
Model validation involves exploring the model diagnostics and fit to ensure that the model is broadly appropriate for the data. As such, exploration of the residuals should be routine.
For more complex models (those that contain multiple effects), it is also advisable to plot the residuals against each of the individual predictors. For sampling designs that involve sample collection over space or time, it is also a good idea to explore whether there are any temporal or spatial patterns in the residuals.
There are numerous situations (e.g. when applying specific variance-covariance structures to a model) where raw residuals do not reflect the interior workings of the model. Typically, this is because they do not take into account the variance-covariance matrix or assume a very simple variance-covariance matrix. Since the purpose of exploring residuals is to evaluate the model, for these cases, it is arguably better to draw conclusions based on standardized (or studentized) residuals.
Unfortunately, the definitions of standardized and studentized residuals appear to vary and the two terms get used interchangeably. I will adopt the following definitions:
- Standardized residuals: the raw residuals divided by the true standard deviation of the residuals (which of course is rarely known).
- Studentized residuals: the raw residuals divided by the standard deviation of the residuals. Note that externally studentized residuals are calculated by dividing the raw residuals by a unique standard deviation for each observation that is calculated from regressions having left each successive observation out.
- Pearson residuals: the raw residuals divided by the standard deviation of the response variable.
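To make these distinctions concrete, the sketch below contrasts the studentized and Pearson scalings of residuals computed from the posterior medians of the multiplicative MCMCpack fit (it simply anticipates the residual calculations worked through engine by engine below):

## Residuals from posterior-median coefficients of the MCMCpack fit
mcmc <- as.data.frame(data1.mcmcpack)
Xmat <- model.matrix(~A * B, data1)
coefs <- apply(mcmc[, 1:6], 2, median)
fit <- as.vector(coefs %*% t(Xmat))
resid <- data1$Y - fit
resid/sd(resid)    # (internally) studentized residuals
resid/sd(data1$Y)  # Pearson residuals (scaled by the sd of the response)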
The mark of a good model is being able to predict well. In an ideal world, we would have a sufficiently large sample size to permit us to hold a fraction (such as 25%) back, thereby allowing us to train the model on 75% of the data and then see how well the model can predict the withheld 25%. Unfortunately, such a luxury is still rare in ecology.
The next best option is to see how well the model can predict the observed data. Models tend to struggle most with the extremes of trends and have particular issues when the extremes approach logical boundaries (such as zero for count data and standard deviations). We can use the fitted model to generate random predicted observations and then explore some properties of these compared to the actual observed data.
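Before working through this manually for each engine below, it is worth noting that for the rstanarm and brms fits this style of posterior predictive check is available in a single call via their pp_check() methods; a minimal sketch:

library(rstanarm)
library(bayesplot)
## Overlay densities of data replicated from the posterior on the observed response
pp_check(data1.rstanarm)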
Residuals are not computed directly within MCMCpack. However, we can calculate them manually from the posteriors.
mcmc = as.data.frame(data1.mcmcpack) # generate a model matrix newdata1 = data1 Xmat = model.matrix(~A * B, newdata1) ## get median parameter estimates head(mcmc)
(Intercept) AGroup B AGroup C B AGroup B:B AGroup C:B sigma2 1 48.70652 -11.241113 -25.43591 -0.4057086 -0.1929317 0.2966322 9.505405 2 43.17996 -8.274010 -22.66362 -0.3300430 -0.1959714 0.2629181 13.109720 3 49.48138 -10.480517 -27.46155 -0.4829662 -0.1457857 0.3635239 14.426140 4 47.75332 -11.387909 -27.18387 -0.3879251 -0.2055644 0.3181976 10.130435 5 49.43988 -11.918492 -29.20874 -0.4436741 -0.2074017 0.3961033 9.917436 6 46.51075 -6.677204 -22.78787 -0.3225533 -0.3883363 0.1657746 11.947826
coefs = apply(mcmc[, 1:6], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = data1$Y - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Residuals against predictors
mcmc = as.data.frame(data1.mcmcpack) # generate a model matrix newdata1 = newdata1 Xmat = model.matrix(~A * B, newdata1) ## get median parameter estimates coefs = apply(mcmc[, 1:6], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = data1$Y - fit newdata1 = newdata1 %>% cbind(fit, resid) ggplot(newdata1) + geom_point(aes(y = resid, x = A))
ggplot(newdata1) + geom_point(aes(y = resid, x = B))
And now for studentized residuals
mcmc = as.data.frame(data1.mcmcpack) # generate a model matrix newdata1 = data1 Xmat = model.matrix(~A * B, newdata1) ## get median parameter estimates coefs = apply(mcmc[, 1:6], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = data1$Y - fit sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).
Let's see how well data simulated from the model reflect the raw data.
mcmc = as.matrix(data1.mcmcpack) # generate a model matrix Xmat = model.matrix(~A * B, data1) ## get median parameter estimates coefs = mcmc[, 1:6] fit = coefs %*% t(Xmat) ## draw samples from this model yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data1), fit[i, ], sqrt(mcmc[i, "sigma2"]))) newdata1 = data.frame(A = data1$A, B = data1$B, yRep) %>% gather(key = Sample, value = Value, -A, -B) ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"), alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A, fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y, x = A), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model", group = B, color = A), alpha = 0.5) + geom_point(data = data1, aes(y = Y, x = B, group = B, color = A))
Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals as we are seeking intervals within which we can predict individual observations rather than means.
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(data1.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")
mcmc_areas(as.matrix(data1.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")
Residuals are not computed directly within JAGS. However, we can calculate them manually from the posteriors.
mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix # generate a model matrix newdata1 = data1 Xmat = model.matrix(~A * B, newdata1) ## get median parameter estimates coefs = apply(mcmc[, 1:6], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = data1$Y - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Residuals against predictors
mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix # generate a model matrix newdata1 = newdata1 Xmat = model.matrix(~A * B, newdata1) ## get median parameter estimates coefs = apply(mcmc[, 1:6], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = data1$Y - fit newdata1 = newdata1 %>% cbind(fit, resid) ggplot(newdata1) + geom_point(aes(y = resid, x = A))
ggplot(newdata1) + geom_point(aes(y = resid, x = B))
And now for studentized residuals
mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix # generate a model matrix newdata1 = data1 Xmat = model.matrix(~A * B, newdata1) ## get median parameter estimates coefs = apply(mcmc[, 1:6], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = data1$Y - fit sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).
Let's see how well data simulated from the model reflect the raw data.
mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix # generate a model matrix Xmat = model.matrix(~A * B, data1) ## get median parameter estimates coefs = mcmc[, 1:6] fit = coefs %*% t(Xmat) ## draw samples from this model yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data1), fit[i, ], mcmc[i, "sigma"])) newdata1 = data.frame(A = data1$A, B = data1$B, yRep) %>% gather(key = Sample, value = Value, -A, -B) ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"), alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A, fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y, x = A), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model", group = B, color = A), alpha = 0.5) + geom_point(data = data1, aes(y = Y, x = B, group = B, color = A))
Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals as we are seeking intervals within which we can predict individual observations rather than means.
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(data1.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
mcmc_areas(data1.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
Residuals are not computed directly within RSTAN. However, we can calculate them manually from the posteriors.
mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix # generate a model matrix newdata1 = data1 Xmat = model.matrix(~A * B, newdata1) ## get median parameter estimates coefs = apply(mcmc[, 1:6], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = data1$Y - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Residuals against predictors
mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix # generate a model matrix newdata1 = data1 Xmat = model.matrix(~A * B, newdata1) ## get median parameter estimates coefs = apply(mcmc[, 1:6], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = data1$Y - fit newdata1 = newdata1 %>% cbind(fit, resid) ggplot(newdata1) + geom_point(aes(y = resid, x = A))
ggplot(newdata1) + geom_point(aes(y = resid, x = B))
And now for studentized residuals
mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix # generate a model matrix newdata1 = data1 Xmat = model.matrix(~A * B, newdata1) ## get median parameter estimates coefs = apply(mcmc[, 1:6], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = data1$Y - fit sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).
Let's see how well data simulated from the model reflect the raw data.
mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"), sigma) %>% as.matrix # generate a model matrix Xmat = model.matrix(~A * B, data1) ## get median parameter estimates coefs = mcmc[, 1:6] fit = coefs %*% t(Xmat) ## draw samples from this model yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data1), fit[i, ], mcmc[i, "sigma"])) newdata1 = data.frame(A = data1$A, B = data1$B, yRep) %>% gather(key = Sample, value = Value, -A, -B) ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"), alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A, fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y, x = A), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model", group = B, color = A), alpha = 0.5) + geom_point(data = data1, aes(y = Y, x = B, group = B, color = A))
Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note that these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(data1.rstan), regex_pars = "beta|sigma")
mcmc_areas(as.matrix(data1.rstan), regex_pars = "beta|sigma")
Residuals are not computed directly within RSTANARM. However, we can calculate them from the posteriors via the resid() and fitted() extractor functions.
resid = resid(data1.rstanarm) fit = fitted(data1.rstanarm) ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Residuals against predictors
resid = resid(data1.rstanarm) dat = data1 %>% mutate(resid = resid) ggplot(dat) + geom_point(aes(y = resid, x = A))
ggplot(dat) + geom_point(aes(y = resid, x = B))
And now for studentized residuals
resid = resid(data1.rstanarm) sresid = resid/sd(resid) fit = fitted(data1.rstanarm) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).
Let's see how well data simulated from the model reflect the raw data.
y_pred = posterior_predict(data1.rstanarm) newdata1 = data1 %>% cbind(t(y_pred)) %>% gather(key = "Rep", value = "Value", -A:-Y) ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"), alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A, fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y, x = A), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model", group = B, color = A), alpha = 0.5) + geom_point(data = data1, aes(y = Y, x = B, group = B, color = A))
Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note that these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(data1.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
mcmc_areas(as.matrix(data1.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
Residuals are not computed directly within BRMS. However, we can calculate them from the posteriors via the resid() and fitted() extractor functions.
resid = resid(data1.brms)[, "Estimate"] fit = fitted(data1.brms)[, "Estimate"] ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Residuals against predictors
resid = resid(data1.brms)[, "Estimate"] dat = data1 %>% mutate(resid = resid) ggplot(dat) + geom_point(aes(y = resid, x = A))
ggplot(dat) + geom_point(aes(y = resid, x = B))
And now for studentized residuals
resid = resid(data1.brms)[, "Estimate"] sresid = resid/sd(resid) fit = fitted(data1.brms)[, "Estimate"] ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).
Let's see how well data simulated from the model reflect the raw data.
y_pred = posterior_predict(data1.brms) newdata1 = data1 %>% cbind(t(y_pred)) %>% gather(key = "Rep", value = "Value", -A:-Y) ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"), alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A, fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y, x = A), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model", group = B, color = A), alpha = 0.5) + geom_point(data = data1, aes(y = Y, x = B, group = B, color = A))
Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note that these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(data1.brms), regex_pars = "b_|sigma")
mcmc_areas(as.matrix(data1.brms), regex_pars = "b_|sigma")
Parameter estimates (posterior summaries)
Although all parameters in a Bayesian analysis are random variables characterised by full posterior distributions, it is rarely useful to present tables of all the samples from each distribution. Plots of the posterior distributions do have some use. Nevertheless, most workers prefer to present simple statistical summaries of the posteriors. Popular choices include the median (or mean) and 95% credibility intervals.
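Such summaries can be generated directly from a matrix of posterior samples. The following is a minimal sketch (illustrated with the MCMCpack samples; any of the other sample matrices could be substituted):
library(coda)
samples = as.matrix(data1.mcmcpack)
## posterior medians alongside 95% highest posterior density intervals
cbind(median = apply(samples, 2, median), HPDinterval(as.mcmc(samples), prob = 0.95))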
library(coda) mcmcpvalue <- function(samp) { ## elementary version that creates an empirical p-value for the ## hypothesis that the columns of samp have mean zero versus a general ## multivariate distribution with elliptical contours. ## differences from the mean standardized by the observed ## variance-covariance factor ## Note, I put in the bit for single terms if (length(dim(samp)) == 0) { std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - mean(samp), transpose = TRUE) sqdist <- colSums(std * std) sum(sqdist[-1] > sqdist[1])/length(samp) } else { std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp), transpose = TRUE) sqdist <- colSums(std * std) sum(sqdist[-1] > sqdist[1])/nrow(samp) } }
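As a quick illustration of what this function returns (toy samples only, not drawn from any of the models in this tutorial): a posterior located well away from zero yields a p-value close to 0, whereas a posterior centred on zero yields a p-value close to 1.
set.seed(123)
mcmcpvalue(rnorm(1000, mean = 3, sd = 1))  # samples well away from zero: p-value near 0
mcmcpvalue(rnorm(1000, mean = 0, sd = 1))  # samples centred on zero: p-value near 1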
Matrix model (MCMCpack)
summary(data1.mcmcpack)
Iterations = 1001:11000 Thinning interval = 1 Number of chains = 1 Sample size per chain = 10000 1. Empirical mean and standard deviation for each variable, plus standard error of the mean: Mean SD Naive SE Time-series SE (Intercept) 49.5353 2.29334 0.0229334 0.0226186 AGroup B -12.2262 3.09179 0.0309179 0.0309179 AGroup C -28.2439 2.93282 0.0293282 0.0293282 B -0.4938 0.09038 0.0009038 0.0008764 AGroup B:B -0.1211 0.11358 0.0011358 0.0011178 AGroup C:B 0.4089 0.10829 0.0010829 0.0010829 sigma2 13.8964 4.43200 0.0443200 0.0552651 2. Quantiles for each variable: 2.5% 25% 50% 75% 97.5% (Intercept) 45.0690 48.0206 49.5282 51.01720 54.0757 AGroup B -18.3801 -14.2357 -12.1642 -10.19136 -6.1832 AGroup C -34.0521 -30.1795 -28.2026 -26.30389 -22.5493 B -0.6743 -0.5528 -0.4940 -0.43428 -0.3178 AGroup B:B -0.3412 -0.1958 -0.1209 -0.04575 0.1030 AGroup C:B 0.1948 0.3382 0.4086 0.47902 0.6240 sigma2 7.7536 10.8320 13.0991 16.09205 24.8023
# OR library(broom) tidyMCMC(data1.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 (Intercept) 49.5353232 2.29333768 45.1289511 54.1244471 2 AGroup B -12.2262443 3.09178506 -18.4166155 -6.2375566 3 AGroup C -28.2438752 2.93282206 -33.9185333 -22.4861115 4 B -0.4938007 0.09037721 -0.6733914 -0.3173760 5 AGroup B:B -0.1211118 0.11358441 -0.3414655 0.1027879 6 AGroup C:B 0.4088582 0.10828868 0.1931050 0.6219178 7 sigma2 13.8964411 4.43200264 7.0646504 22.9045959
- the intercept of the first group (Group A) is 49.5353232
- the mean of the second group (Group B) differs from that of Group A by -12.2262443 units
- the mean of the third group (Group C) differs from that of Group A by -28.2438752 units
- a one unit increase in B in Group A is associated with a -0.4938007 unit change in Y
- the difference in slope between Group B and Group A is -0.1211118
- the difference in slope between Group C and Group A is 0.4088582
While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.
## since values are less than zero mcmcpvalue(data1.mcmcpack[, 2]) # effect of (B-A = 0)
[1] 0.001
mcmcpvalue(data1.mcmcpack[, 3]) # effect of (C-A = 0)
[1] 0
mcmcpvalue(data1.mcmcpack[, 4]) # effect of (slope = 0)
[1] 0
mcmcpvalue(data1.mcmcpack[, 5]) # effect of (slopeB - slopeA = 0)
[1] 0.2763
mcmcpvalue(data1.mcmcpack[, 6]) # effect of (slopeC - slopeA = 0)
[1] 7e-04
mcmcpvalue(data1.mcmcpack[, 2:6]) # effect of (model)
[1] 0
There is evidence that the response differs between the groups.
Matrix model (JAGS)
print(data1.r2jags)
Inference for Bugs model at "5", fit using jags, 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10 n.sims = 15000 iterations saved mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff beta[1] 49.553 2.336 45.049 48.003 49.538 51.092 54.162 1.001 15000 beta[2] -12.227 3.093 -18.248 -14.285 -12.228 -10.182 -6.142 1.001 11000 beta[3] -28.270 2.993 -34.198 -30.236 -28.275 -26.304 -22.340 1.001 15000 beta[4] -0.495 0.093 -0.680 -0.556 -0.494 -0.433 -0.313 1.001 15000 beta[5] -0.121 0.115 -0.347 -0.197 -0.122 -0.046 0.105 1.001 15000 beta[6] 0.410 0.111 0.193 0.336 0.410 0.482 0.636 1.001 15000 sigma 3.772 0.585 2.827 3.354 3.707 4.106 5.105 1.001 15000 deviance 163.101 4.379 156.875 159.924 162.329 165.512 173.639 1.001 15000 For each parameter, n.eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor (at convergence, Rhat=1). DIC info (using the rule, pD = var(deviance)/2) pD = 9.6 and DIC = 172.7 DIC is an estimate of expected predictive error (lower deviance is better).
# OR library(broom) tidyMCMC(as.mcmc(data1.r2jags), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 beta[1] 49.5534512 2.33556926 45.0234105 54.1328491 2 beta[2] -12.2268098 3.09333864 -18.3836687 -6.3089149 3 beta[3] -28.2700616 2.99256629 -34.2921382 -22.4881054 4 beta[4] -0.4947167 0.09264266 -0.6838619 -0.3183081 5 beta[5] -0.1214410 0.11480484 -0.3469758 0.1054090 6 beta[6] 0.4099855 0.11145489 0.1877435 0.6289609 7 deviance 163.1011311 4.37873657 156.1255607 171.7611813 8 sigma 3.7720815 0.58508086 2.7189187 4.9291545
- the intercept of the first group (Group A) is 49.5534512
- the mean of the second group (Group B) differs from that of Group A by -12.2268098 units
- the mean of the third group (Group C) differs from that of Group A by -28.2700616 units
- a one unit increase in B in Group A is associated with a -0.4947167 unit change in Y
- the difference in slope between Group B and Group A is -0.121441
- the difference in slope between Group C and Group A is 0.4099855
While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.
## since values are less than zero mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[2]"]) # effect of (B-A = 0)
[1] 0.0003333333
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[3]"]) # effect of (C-A = 0)
[1] 0
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[4]"]) # effect of (slope = 0)
[1] 0
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[5]"]) # effect of (slopeB - slopeA = 0)
[1] 0.28
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[6]"]) # effect of (slopeC - slopeA = 0)
[1] 0.0007333333
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, 2:6]) # effect of (model)
[1] 0
There is evidence that the response differs between the groups.
Matrix model (RSTAN)
print(data1.rstan, pars = c("beta", "sigma"))
Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a. 3 chains, each with iter=2000; warmup=500; thin=3; post-warmup draws per chain=500, total post-warmup draws=1500. mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat beta[1] 49.50 0.08 2.34 44.55 48.01 49.58 51.01 54.11 902 1 beta[2] -12.29 0.09 3.04 -18.60 -14.30 -12.17 -10.22 -6.70 1094 1 beta[3] -28.21 0.10 2.97 -33.80 -30.23 -28.34 -26.40 -22.04 957 1 beta[4] -0.49 0.00 0.09 -0.67 -0.55 -0.49 -0.44 -0.31 978 1 beta[5] -0.12 0.00 0.11 -0.34 -0.19 -0.12 -0.05 0.11 1164 1 beta[6] 0.41 0.00 0.11 0.18 0.34 0.41 0.48 0.62 953 1 sigma 3.73 0.02 0.58 2.81 3.32 3.66 4.05 5.07 1271 1 Samples were drawn using NUTS(diag_e) at Sat Nov 4 07:45:04 2017. For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1).
# OR library(broom) tidyMCMC(data1.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta", "sigma"))
term estimate std.error conf.low conf.high 1 beta[1] 49.5013545 2.34230542 44.9503609 54.24168440 2 beta[2] -12.2889364 3.04350709 -18.5232918 -6.66905473 3 beta[3] -28.2129289 2.96719159 -33.8515230 -22.12391688 4 beta[4] -0.4929557 0.09152697 -0.6703121 -0.30402421 5 beta[5] -0.1195490 0.11233322 -0.3536692 0.08719164 6 beta[6] 0.4082751 0.11091872 0.1750039 0.61796175 7 sigma 3.7255865 0.57520983 2.6288615 4.77664381
- the intercept of the first group (Group A) is 49.5013545
- the mean of the second group (Group B) differs from that of Group A by -12.2889364 units
- the mean of the third group (Group C) differs from that of Group A by -28.2129289 units
- a one unit increase in B in Group A is associated with a -0.4929557 unit change in Y
- the difference in slope between Group B and Group A is -0.119549
- the difference in slope between Group C and Group A is 0.4082751
While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.
## since values are less than zero mcmcpvalue(as.matrix(data1.rstan)[, "beta[2]"]) # effect of (B-A = 0)
[1] 0.0006666667
mcmcpvalue(as.matrix(data1.rstan)[, "beta[3]"]) # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data1.rstan)[, "beta[4]"]) # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data1.rstan)[, "beta[5]"]) # effect of (slopeB - slopeA = 0)
[1] 0.278
mcmcpvalue(as.matrix(data1.rstan)[, "beta[6]"]) # effect of (slopeC - slopeA = 0)
[1] 0.0006666667
mcmcpvalue(as.matrix(data1.rstan)[, 2:6]) # effect of (model)
[1] 0
There is evidence that a model that includes an interaction is better than an additive model.
library(loo) (full = loo(extract_log_lik(data1.rstan)))
Computed from 1500 by 30 log-likelihood matrix Estimate SE elpd_loo -85.3 4.0 p_loo 6.3 1.6 looic 170.6 7.9 Pareto k diagnostic values: Count Pct (-Inf, 0.5] (good) 26 86.7% (0.5, 0.7] (ok) 4 13.3% (0.7, 1] (bad) 0 0.0% (1, Inf) (very bad) 0 0.0% All Pareto k estimates are ok (k < 0.7) See help('pareto-k-diagnostic') for details.
# now fit a reduced (additive) model without the interaction modelString = " data { int<lower=1> n; int<lower=1> nX; vector [n] y; matrix [n,nX] X; } parameters { vector[nX] beta; real<lower=0> sigma; } transformed parameters { vector[n] mu; mu = X*beta; } model { #Likelihood y~normal(mu,sigma); #Priors beta ~ normal(0,1000); sigma~cauchy(0,5); } generated quantities { vector[n] log_lik; for (i in 1:n) { log_lik[i] = normal_lpdf(y[i] | mu[i], sigma); } } " Xmat <- model.matrix(~A + B, data1) data1.list <- with(data1, list(y = Y, X = Xmat, n = nrow(data1), nX = ncol(Xmat))) data1.rstan.red <- stan(data = data1.list, model_code = modelString, chains = 3, iter = 2000, warmup = 500, thin = 3)
SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 1). Gradient evaluation took 4.2e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.42 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.138805 seconds (Warm-up) 0.122949 seconds (Sampling) 0.261754 seconds (Total) SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 2). Gradient evaluation took 1.1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.106299 seconds (Warm-up) 0.113143 seconds (Sampling) 0.219442 seconds (Total) SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 3). Gradient evaluation took 1.1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 501 / 2000 [ 25%] (Sampling) Iteration: 700 / 2000 [ 35%] (Sampling) Iteration: 900 / 2000 [ 45%] (Sampling) Iteration: 1100 / 2000 [ 55%] (Sampling) Iteration: 1300 / 2000 [ 65%] (Sampling) Iteration: 1500 / 2000 [ 75%] (Sampling) Iteration: 1700 / 2000 [ 85%] (Sampling) Iteration: 1900 / 2000 [ 95%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.120129 seconds (Warm-up) 0.110436 seconds (Sampling) 0.230565 seconds (Total)
(reduced = loo(extract_log_lik(data1.rstan.red)))
Computed from 1500 by 30 log-likelihood matrix Estimate SE elpd_loo -97.9 3.3 p_loo 5.0 1.1 looic 195.8 6.7 Pareto k diagnostic values: Count Pct (-Inf, 0.5] (good) 28 93.3% (0.5, 0.7] (ok) 2 6.7% (0.7, 1] (bad) 0 0.0% (1, Inf) (very bad) 0 0.0% All Pareto k estimates are ok (k < 0.7) See help('pareto-k-diagnostic') for details.
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3) plot(full, label_points = TRUE) plot(reduced, label_points = TRUE)
compare_models(full, reduced)
Error in discrete == discrete[1]: comparison of these types is not implemented
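The error arises because compare_models() is an rstanarm helper that expects loo objects generated from rstanarm fits. For the plain loo objects computed above from extract_log_lik(), the loo package's own comparison function can be used instead (a sketch; in more recent versions of loo the equivalent function is loo_compare()).
loo::compare(full, reduced)  # difference in expected predictive accuracy (elpd) and its SE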
Matrix model (RSTANARM)
summary(data1.rstanarm)
Model Info: function: stan_glm family: gaussian [identity] formula: Y ~ A * B algorithm: sampling priors: see help('prior_summary') sample: 2250 (posterior sample size) num obs: 30 Estimates: mean sd 2.5% 25% 50% 75% 97.5% (Intercept) 49.5 2.4 44.9 48.0 49.6 51.1 54.3 AGroup B -12.3 3.1 -18.3 -14.4 -12.3 -10.2 -6.1 AGroup C -28.3 3.0 -34.3 -30.3 -28.3 -26.3 -22.3 B -0.5 0.1 -0.7 -0.6 -0.5 -0.4 -0.3 AGroup B:B -0.1 0.1 -0.3 -0.2 -0.1 0.0 0.1 AGroup C:B 0.4 0.1 0.2 0.3 0.4 0.5 0.6 sigma 3.8 0.6 2.8 3.3 3.7 4.1 5.2 mean_PPD 26.9 1.0 24.9 26.2 26.9 27.5 28.9 log-posterior -97.2 2.2 -102.5 -98.4 -96.8 -95.6 -94.1 Diagnostics: mcse Rhat n_eff (Intercept) 0.1 1.0 1938 AGroup B 0.1 1.0 1974 AGroup C 0.1 1.0 1902 B 0.0 1.0 2126 AGroup B:B 0.0 1.0 2068 AGroup C:B 0.0 1.0 1989 sigma 0.0 1.0 1226 mean_PPD 0.0 1.0 1497 log-posterior 0.1 1.0 1030 For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
# OR library(broom) tidyMCMC(data1.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 (Intercept) 49.5438430 2.38328655 44.8579446 54.1707410 2 AGroup B -12.3097610 3.13510315 -18.8059933 -6.6018889 3 AGroup C -28.2751561 3.02810337 -33.8195233 -21.9065319 4 B -0.4940998 0.09381214 -0.6922853 -0.3200262 5 AGroup B:B -0.1204605 0.11724241 -0.3514843 0.1104638 6 AGroup C:B 0.4111006 0.11271108 0.1810014 0.6214744 7 sigma 3.7727701 0.59136185 2.7322401 4.9499110 8 mean_PPD 26.8713702 1.00446504 24.9737303 29.0017404 9 log-posterior -97.2208223 2.19450449 -101.6130115 -93.7209496
- the intercept of the first group (Group A) is 49.543843
- the mean of the second group (Group B) differs from that of Group A by -12.309761 units
- the mean of the third group (Group C) differs from that of Group A by -28.2751561 units
- a one unit increase in B in Group A is associated with a -0.4940998 unit change in Y
- the difference in slope between Group B and Group A is -0.1204605
- the difference in slope between Group C and Group A is 0.4111006
While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.
## since values are less than zero head(as.matrix(data1.rstanarm))
parameters iterations (Intercept) AGroup B AGroup C B AGroup B:B AGroup C:B sigma [1,] 49.10544 -12.03203 -30.24254 -0.5408720 -0.07177445 0.5097346 3.368864 [2,] 49.57108 -10.46682 -27.74189 -0.5167232 -0.15566016 0.4159785 3.202435 [3,] 52.79242 -16.30169 -31.83083 -0.5749994 -0.08438700 0.4699279 3.849197 [4,] 50.85134 -14.46853 -28.92998 -0.5535872 0.03970759 0.4676661 4.339336 [5,] 52.41561 -18.30582 -30.55315 -0.4583573 -0.10973607 0.3507038 4.801621 [6,] 52.90121 -18.31527 -30.77699 -0.6350685 0.13775093 0.4858842 3.541416
mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup B"]) # effect of (B-A = 0)
[1] 0
mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup C"]) # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data1.rstanarm)[, "B"]) # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup B:B"]) # effect of (slopeB - slopeA = 0)
[1] 0.2951111
mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup C:B"]) # effect of (slopeC - slopeA = 0)
[1] 0.001777778
mcmcpvalue(as.matrix(data1.rstanarm)[, 2:6]) # effect of (model)
[1] 0
There is evidence that the response differs between the groups. There is very little evidence that the slope for Group B differs from that for Group A.
library(loo) (full = loo(data1.rstanarm))
Computed from 2250 by 30 log-likelihood matrix Estimate SE elpd_loo -85.9 4.1 p_loo 6.8 1.8 looic 171.9 8.2 Pareto k diagnostic values: Count Pct (-Inf, 0.5] (good) 24 80.0% (0.5, 0.7] (ok) 5 16.7% (0.7, 1] (bad) 1 3.3% (1, Inf) (very bad) 0 0.0% See help('pareto-k-diagnostic') for details.
data1.rstanarm.red = update(data1.rstanarm, . ~ A + B)
Gradient evaluation took 4.1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.41 seconds. Adjust your expectations accordingly! Elapsed Time: 0.252022 seconds (Warm-up) 0.130327 seconds (Sampling) 0.382349 seconds (Total) Gradient evaluation took 1.9e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds. Adjust your expectations accordingly! Elapsed Time: 0.182518 seconds (Warm-up) 0.152707 seconds (Sampling) 0.335225 seconds (Total) Gradient evaluation took 1.2e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds. Adjust your expectations accordingly! Elapsed Time: 0.164992 seconds (Warm-up) 0.158132 seconds (Sampling) 0.323124 seconds (Total)
(reduced = loo(data1.rstanarm.red))
Computed from 2250 by 30 log-likelihood matrix Estimate SE elpd_loo -97.7 3.1 p_loo 4.7 1.0 looic 195.5 6.3 Pareto k diagnostic values: Count Pct (-Inf, 0.5] (good) 27 90.0% (0.5, 0.7] (ok) 3 10.0% (0.7, 1] (bad) 0 0.0% (1, Inf) (very bad) 0 0.0% All Pareto k estimates are ok (k < 0.7) See help('pareto-k-diagnostic') for details.
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3) plot(full, label_points = TRUE) plot(reduced, label_points = TRUE)
compare_models(full, reduced)
elpd_diff se -11.8 4.2
Matrix model (BRMS)
summary(data1.brms)
Family: gaussian(identity) Formula: Y ~ A * B Data: data1 (Number of observations: 30) Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; total post-warmup samples = 2250 ICs: LOO = NA; WAIC = NA; R2 = NA Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept 49.60 2.32 44.97 54.25 1182 1 AGroupB -12.33 3.06 -18.49 -6.37 1216 1 AGroupC -28.29 2.97 -34.18 -22.34 1246 1 B -0.49 0.09 -0.67 -0.32 1160 1 AGroupB:B -0.12 0.11 -0.35 0.11 1186 1 AGroupC:B 0.41 0.11 0.19 0.62 1248 1 Family Specific Parameters: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat sigma 3.71 0.54 2.81 4.98 1423 1 Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat = 1).
# OR library(broom) tidyMCMC(data1.brms$fit, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 b_Intercept 49.6027270 2.3173028 44.8624049 54.0547491 2 b_AGroupB -12.3262056 3.0553552 -18.7775957 -6.7594663 3 b_AGroupC -28.2870152 2.9659002 -33.8736952 -22.1495177 4 b_B -0.4926334 0.0908813 -0.6696454 -0.3177777 5 b_AGroupB:B -0.1212027 0.1126979 -0.3497945 0.1053154 6 b_AGroupC:B 0.4067397 0.1097094 0.1878784 0.6203292 7 sigma 3.7129053 0.5394484 2.6776034 4.7406944
- the intercept of the first group (Group A) is 49.602727
- the mean of the second group (Group B) differs from that of Group A by -12.3262056 units
- the mean of the third group (Group C) differs from that of Group A by -28.2870152 units
- a one unit increase in B in Group A is associated with a -0.4926334 unit change in Y
- the difference in slope between Group B and Group A is -0.1212027
- the difference in slope between Group C and Group A is 0.4067397
While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.
## since values are less than zero head(as.matrix(data1.brms))
parameters iterations b_Intercept b_AGroupB b_AGroupC b_B b_AGroupB:B b_AGroupC:B sigma lp__ [1,] 47.70303 -10.255010 -25.32093 -0.3742985 -0.17137635 0.2600049 3.092922 -116.2110 [2,] 50.55262 -14.941232 -30.12870 -0.4643934 -0.13460334 0.4086522 3.230887 -114.2718 [3,] 48.22410 -9.538243 -25.80487 -0.4858593 -0.10874505 0.4073568 4.423214 -116.0591 [4,] 48.07606 -10.397199 -26.96966 -0.4404635 -0.16751208 0.3346061 2.849359 -113.2790 [5,] 49.43749 -10.676002 -28.03187 -0.5099977 -0.11303887 0.4276380 3.777916 -113.2435 [6,] 50.64586 -13.340054 -27.57699 -0.5712098 -0.02596519 0.4449288 2.659352 -115.2269
mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupB"]) # effect of (B-A = 0)
[1] 0.0004444444
mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupC"]) # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data1.brms)[, "b_B"]) # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupB:B"]) # effect of (slopeB - slopeA = 0)
[1] 0.264
mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupC:B"]) # effect of (slopeB - slopeA = 0)
[1] 0.001333333
mcmcpvalue(as.matrix(data1.brms)[, 2:6]) # effect of (model)
[1] 0
There is evidence that the response differs between the groups. There is very little evidence that the slope for Group B differs from that for Group A.
library(loo) (full = loo(data1.brms))
LOOIC SE 170.84 8.04
data1.brms.red = update(data1.brms, . ~ A + B)
SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1). Gradient evaluation took 2.4e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.24 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 600 / 2000 [ 30%] (Warmup) Iteration: 800 / 2000 [ 40%] (Warmup) Iteration: 1000 / 2000 [ 50%] (Warmup) Iteration: 1001 / 2000 [ 50%] (Sampling) Iteration: 1200 / 2000 [ 60%] (Sampling) Iteration: 1400 / 2000 [ 70%] (Sampling) Iteration: 1600 / 2000 [ 80%] (Sampling) Iteration: 1800 / 2000 [ 90%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.059643 seconds (Warm-up) 0.028751 seconds (Sampling) 0.088394 seconds (Total) SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2). Gradient evaluation took 8e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 600 / 2000 [ 30%] (Warmup) Iteration: 800 / 2000 [ 40%] (Warmup) Iteration: 1000 / 2000 [ 50%] (Warmup) Iteration: 1001 / 2000 [ 50%] (Sampling) Iteration: 1200 / 2000 [ 60%] (Sampling) Iteration: 1400 / 2000 [ 70%] (Sampling) Iteration: 1600 / 2000 [ 80%] (Sampling) Iteration: 1800 / 2000 [ 90%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.05915 seconds (Warm-up) 0.028504 seconds (Sampling) 0.087654 seconds (Total) SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3). Gradient evaluation took 8e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Iteration: 200 / 2000 [ 10%] (Warmup) Iteration: 400 / 2000 [ 20%] (Warmup) Iteration: 600 / 2000 [ 30%] (Warmup) Iteration: 800 / 2000 [ 40%] (Warmup) Iteration: 1000 / 2000 [ 50%] (Warmup) Iteration: 1001 / 2000 [ 50%] (Sampling) Iteration: 1200 / 2000 [ 60%] (Sampling) Iteration: 1400 / 2000 [ 70%] (Sampling) Iteration: 1600 / 2000 [ 80%] (Sampling) Iteration: 1800 / 2000 [ 90%] (Sampling) Iteration: 2000 / 2000 [100%] (Sampling) Elapsed Time: 0.056265 seconds (Warm-up) 0.027707 seconds (Sampling) 0.083972 seconds (Total)
(reduced = loo(data1.brms.red))
LOOIC SE 195.67 6.7
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3) plot(full, label_points = TRUE) plot(reduced, label_points = TRUE)
compare_models(full, reduced)
Error in discrete == discrete[1]: comparison of these types is not implemented
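As above, compare_models() only accepts rstanarm loo objects, hence the error. For brms fits, the two models can instead be compared directly (a sketch, assuming the installed brms version allows multiple fits to be passed to LOO()):
LOO(data1.brms, data1.brms.red)  # LOOIC for each model along with their difference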
Graphical summaries
A nice graphic is often a great accompaniment to a statistical analysis. Although there are no fixed assumptions associated with graphing (in contrast to statistical analyses), we often want the graphical summaries to reflect the associated statistical analyses. After all, the sample is just one perspective on the population(s). What we are more interested in is being able to estimate and depict likely population parameters/trends.
Thus, whilst we could easily provide a plot displaying the raw data along with simple measures of location and spread, arguably, we should use estimates that reflect the fitted model. In this case, it would be appropriate to plot the credibility interval associated with each group.
Matrix model (MCMCpack)
mcmc = data1.mcmcpack ## Calculate the fitted values newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B), len = 100)) Xmat = model.matrix(~A * B, newdata1) coefs = mcmc[, 1:6] fit = coefs %*% t(Xmat) newdata1 = newdata1 %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval")) ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
As this is a relatively simple model, we could simply add the raw data to this figure. However, to remain consistent with the approach required for more complex designs with additional predictors, we will instead overlay the partial residuals.
## Calculate partial residuals fitted values fdata1 = rdata1 = data1 fMat = rMat = model.matrix(~A * B, fdata1) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(data1$Y - apply(coefs, 2, median) %*% t(rMat)) rdata1 = rdata1 %>% mutate(partial.resid = resid + fit) ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1, aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
Matrix model (JAGS)
mcmc = data1.r2jags$BUGSoutput$sims.matrix ## Calculate the fitted values newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B), len = 100)) Xmat = model.matrix(~A * B, newdata1) coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]", "beta[5]", "beta[6]")] fit = coefs %*% t(Xmat) newdata1 = newdata1 %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval")) ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
As this is a relatively simple model, we could simply add the raw data to this figure. However, to remain consistent with the approach required for more complex designs with additional predictors, we will instead overlay the partial residuals.
## Calculate partial residuals fitted values fdata1 = rdata1 = data1 fMat = rMat = model.matrix(~A * B, fdata1) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(data1$Y - apply(coefs, 2, median) %*% t(rMat)) rdata1 = rdata1 %>% mutate(partial.resid = resid + fit) ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1, aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
Matrix model (RSTAN)
mcmc = as.matrix(data1.rstan) ## Calculate the fitted values newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B), len = 100)) Xmat = model.matrix(~A * B, newdata1) coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]", "beta[5]", "beta[6]")] fit = coefs %*% t(Xmat) newdata1 = newdata1 %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval")) ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
As this is a relatively simple model, we could simply add the raw data to this figure. However, to remain consistent with the approach required for more complex designs with additional predictors, we will instead overlay the partial residuals.
## Calculate partial residuals fitted values fdata1 = rdata1 = data1 fMat = rMat = model.matrix(~A * B, fdata1) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(data1$Y - apply(coefs, 2, median) %*% t(rMat)) rdata1 = rdata1 %>% mutate(partial.resid = resid + fit) ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1, aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
Matrix model (RSTANARM)
## Calculate the fitted values newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B), len = 100)) fit = posterior_linpred(data1.rstanarm, newdata = newdata1) newdata1 = newdata1 %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE, conf.method = "HPDinterval")) ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
As this is a relatively simple model, we could simply add the raw data to this figure. However, to remain consistent with the approach required for more complex designs with additional predictors, we will instead overlay the partial residuals.
## Calculate partial residuals rdata1 = data1 pp = posterior_linpred(data1.rstanarm, newdata = rdata1) fit = as.vector(apply(pp, 2, median)) resid = resid(data1.rstanarm) rdata1 = rdata1 %>% mutate(partial.resid = resid + fit) ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1, aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
Matrix model (BRMS)
Although we could calculate the fitted values via matrix multiplication of the coefficients and the model matrix (as for MCMCpack, RJAGS and RSTAN), for more complex models it is more convenient to use the marginal_effects function that comes with brms.
plot(marginal_effects(data1.brms), points = TRUE)
# OR eff = plot(marginal_effects(data1.brms), points = TRUE, plot = FALSE) eff
$A, $B, $`B:A` (the corresponding marginal-effects plot is displayed for each term)
## Calculate the fitted values newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B), len = 100)) fit = fitted(data1.brms, newdata = newdata1, summary = FALSE) newdata1 = newdata1 %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE, conf.method = "HPDinterval")) ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
As this is a relatively simple model, we could simply add the raw data to this figure. However, to remain consistent with the approach required for more complex designs with additional predictors, we will instead overlay the partial residuals.
## Calculate partial residuals rdata1 = data1 fit = fitted(data1.brms, summary = TRUE)[, "Estimate"] resid = resid(data1.brms)[, "Estimate"] rdata1 = rdata1 %>% mutate(partial.resid = resid + fit) ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1, aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("B") + theme_classic()
References
Gelman, A. (2005). “Analysis of Variance - Why it is More Important Than Ever”. In: The Annals of Statistics 33.1, pp. 1–53.
Gelman, A., B. Goodrich, J. Gabry, et al. (2017). “R-squared for Bayesian regression models”.
Worked Examples
- McCarthy (2007) - Chpt 6
- Kery (2010) - Chpt 10
- Gelman & Hill (2007) - Chpt 4
- Logan (2010) - Chpt 12
- Quinn & Keough (2002) - Chpt 9
Homogeneous slopes
To investigate the impacts of sexual activity on fruitfly longevity, Partridge and Farquhar (1981) measured the longevity of male fruitflies with access to either one virgin female (a potential mate), eight virgin females, one pregnant female (not a potential mate), eight pregnant females or no females. The pool of available male fruitflies varied in size and, since size is known to impact longevity, the researchers also measured thorax length as a covariate.
Download Partridge1 data set
Format of partridge1.csv data file: each row is a single male fruitfly, with columns TREATMENT (the number and type of female partners provided), THORAX (thorax length, the covariate) and LONGEV (longevity, the response).
partridge <- read.csv("../downloads/data/partridge1.csv", strip.white = T) head(partridge)
TREATMENT THORAX LONGEV 1 Preg8 0.64 35 2 Preg8 0.68 37 3 Preg8 0.68 49 4 Preg8 0.72 46 5 Preg8 0.72 63 6 Preg8 0.76 39
A subset of these data were analysed in Tutorial 7.4b Q3 (the same data, yet without the continuous covariate). On that occasion we decided that although the response was a count (and thus a good match for a Poisson distribution), the magnitude of the observations would probably mean that a Gaussian distribution would be adequate.
The addition of a continuous covariate is likely to complicate this somewhat. Now we not only assume homogeneity of variance between treatments, we also assume homogeneity of variance with thorax length.
- Perform basic exploratory data analysis to examine:
- normality
- homogeneity of variance
- homogeneity of slopes (and equality of covariate ranges)
ggplot(partridge, aes(y = LONGEV, x = THORAX, color = TREATMENT)) + geom_smooth() + geom_point()
plot(lm(LONGEV ~ THORAX + TREATMENT, partridge), which = 1)
There is definitely evidence of non-homogeneity of variance and perhaps even non-linearity. As already indicated, this is perhaps not surprising. Arguably, the most appropriate starting point would be a Poisson distribution, which would not only accommodate the increasing variance but, because the Poisson uses a log-link, would also expect the relationship on the response scale to be non-linear.
Nevertheless, we will reserve the use of a Poisson distribution until Tutorial 9.4b and, for now, simply use a log (base 10) transformation to be consistent with the original authors. Note, as there are no zeros in these data and (perhaps more importantly) the response does not approach zero, a logarithmic transformation is likely to be reasonable.
ggplot(partridge, aes(y = log10(LONGEV), x = THORAX, color = TREATMENT)) + geom_smooth() + geom_point()
plot(lm(log10(LONGEV) ~ THORAX + TREATMENT, partridge), which = 1)
That does appear more reasonable.
And now to explore the range of the covariate.
ggplot(partridge, aes(y = THORAX, x = TREATMENT)) + geom_boxplot()
summary(lm(THORAX ~ TREATMENT, partridge))
Call: lm(formula = THORAX ~ TREATMENT, data = partridge) Residuals: Min 1Q Median 3Q Max -0.1960 -0.0456 0.0144 0.0544 0.1344 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.83600 0.01543 54.194 <2e-16 *** TREATMENTPreg1 -0.01040 0.02182 -0.477 0.634 TREATMENTPreg8 -0.03040 0.02182 -1.393 0.166 TREATMENTVirg1 0.00160 0.02182 0.073 0.942 TREATMENTVirg8 -0.03600 0.02182 -1.650 0.102 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.07713 on 120 degrees of freedom Multiple R-squared: 0.04032, Adjusted R-squared: 0.008335 F-statistic: 1.261 on 4 and 120 DF, p-value: 0.2893
Conclusions: there is no evidence that the range of thorax length varies substantially between treatments. Furthermore, homogeneity of slopes seems reasonable, and so we will use an additive model (a quick check of this assumption is sketched below).
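As an additional quick (frequentist) sanity check of the homogeneity of slopes assumption, we could compare the additive model against one in which the THORAX slope is allowed to differ between treatments; a non-significant interaction term supports the additive model. This is only a rough check and not part of the Bayesian analyses that follow.
anova(lm(log10(LONGEV) ~ THORAX + TREATMENT, partridge), lm(log10(LONGEV) ~ THORAX * TREATMENT, partridge))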
- Fit the appropriate Bayesian model to explore the effect of treatment and thorax length on male fruitfly longevity.
$$
\begin{align}
y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\
\mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em]
\beta_0 &\sim{} N(0,10)\\
\beta &\sim{} N(0,10)\\
\sigma &\sim{} cauchy(0,5)\\
\end{align}
$$
library(MCMCpack) partridge.mcmcpack = MCMCregress(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)
modelString = " model { #Likelihood for (i in 1:n) { y[i]~dnorm(mu[i],tau) mu[i] <- beta0 + inprod(beta[],X[i,]) } #Priors beta0 ~ dnorm(0.01,1.0E-6) for (j in 1:nX) { beta[j] ~ dnorm(0.01,1.0E-6) } tau <- 1 / (sigma * sigma) sigma~dunif(0,100) } " X = model.matrix(~TREATMENT + THORAX, data = partridge) partridge.list <- with(partridge, list(y = log10(LONGEV), X = X[, -1], nX = ncol(X) - 1, n = nrow(partridge))) params <- c("beta0", "beta", "sigma") burnInSteps = 3000 nChains = 3 numSavedSteps = 15000 thinSteps = 10 nIter = ceiling((numSavedSteps * thinSteps)/nChains) partridge.r2jags <- jags(data = partridge.list, inits = NULL, parameters.to.save = params, model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter, n.burnin = burnInSteps, n.thin = thinSteps)
Compiling model graph Resolving undeclared variables Allocating nodes Graph information: Observed stochastic nodes: 125 Unobserved stochastic nodes: 7 Total graph size: 998 Initializing model
modelString=" data { int
n; // total number of observations vector[n] Y; // response variable int nX; // number of effects matrix[n, nX] X; // model matrix } transformed data { matrix[n, nX - 1] Xc; // centered version of X vector[nX - 1] means_X; // column means of X before centering for (i in 2:nX) { means_X[i - 1] = mean(X[, i]); Xc[, i - 1] = X[, i] - means_X[i - 1]; } } parameters { vector[nX-1] beta; // population-level effects real cbeta0; // center-scale intercept real sigma; // residual SD } transformed parameters { } model { vector[n] mu; mu = Xc * beta + cbeta0; // prior specifications beta ~ normal(0, 10); cbeta0 ~ normal(0, 10); sigma ~ cauchy(0, 5); // likelihood contribution Y ~ normal(mu, sigma); } generated quantities { real beta0; // population-level intercept vector[n] log_lik; beta0 = cbeta0 - dot_product(means_X, beta); for (i in 1:n) { log_lik[i] = normal_lpdf(Y[i] | Xc[i] * beta + cbeta0, sigma); } } " X = model.matrix(~TREATMENT + THORAX, data = partridge) partridge.list <- with(partridge, list(Y = log10(LONGEV), X = X, nX = ncol(X), n = nrow(partridge))) library(rstan) partridge.rstan <- stan(data = partridge.list, model_code = modelString, chains = 3, iter = 5000, warmup = 500, thin = 2)
In file included from /usr/local/lib/R/site-library/BH/include/boost/config.hpp:39:0, from /usr/local/lib/R/site-library/BH/include/boost/math/tools/config.hpp:13, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/var.hpp:7, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core.hpp:12, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/mat.hpp:4, from /usr/local/lib/R/site-library/StanHeaders/include/stan/math.hpp:4, from /usr/local/lib/R/site-library/StanHeaders/include/src/stan/model/model_header.hpp:4, from file6f0539983e2.cpp:8: /usr/local/lib/R/site-library/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined # define BOOST_NO_CXX11_RVALUE_REFERENCES ^ <command-line>:0:0: note: this is the location of the previous definition SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 1). Gradient evaluation took 3.1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.31 seconds. Adjust your expectations accordingly! Iteration: 1 / 5000 [ 0%] (Warmup) Iteration: 500 / 5000 [ 10%] (Warmup) Iteration: 501 / 5000 [ 10%] (Sampling) Iteration: 1000 / 5000 [ 20%] (Sampling) Iteration: 1500 / 5000 [ 30%] (Sampling) Iteration: 2000 / 5000 [ 40%] (Sampling) Iteration: 2500 / 5000 [ 50%] (Sampling) Iteration: 3000 / 5000 [ 60%] (Sampling) Iteration: 3500 / 5000 [ 70%] (Sampling) Iteration: 4000 / 5000 [ 80%] (Sampling) Iteration: 4500 / 5000 [ 90%] (Sampling) Iteration: 5000 / 5000 [100%] (Sampling) Elapsed Time: 0.077105 seconds (Warm-up) 0.380042 seconds (Sampling) 0.457147 seconds (Total) SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 2). Gradient evaluation took 1.6e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds. Adjust your expectations accordingly! Iteration: 1 / 5000 [ 0%] (Warmup) Iteration: 500 / 5000 [ 10%] (Warmup) Iteration: 501 / 5000 [ 10%] (Sampling) Iteration: 1000 / 5000 [ 20%] (Sampling) Iteration: 1500 / 5000 [ 30%] (Sampling) Iteration: 2000 / 5000 [ 40%] (Sampling) Iteration: 2500 / 5000 [ 50%] (Sampling) Iteration: 3000 / 5000 [ 60%] (Sampling) Iteration: 3500 / 5000 [ 70%] (Sampling) Iteration: 4000 / 5000 [ 80%] (Sampling) Iteration: 4500 / 5000 [ 90%] (Sampling) Iteration: 5000 / 5000 [100%] (Sampling) Elapsed Time: 0.067367 seconds (Warm-up) 0.394947 seconds (Sampling) 0.462314 seconds (Total) SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 3). Gradient evaluation took 1.4e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds. Adjust your expectations accordingly! Iteration: 1 / 5000 [ 0%] (Warmup) Iteration: 500 / 5000 [ 10%] (Warmup) Iteration: 501 / 5000 [ 10%] (Sampling) Iteration: 1000 / 5000 [ 20%] (Sampling) Iteration: 1500 / 5000 [ 30%] (Sampling) Iteration: 2000 / 5000 [ 40%] (Sampling) Iteration: 2500 / 5000 [ 50%] (Sampling) Iteration: 3000 / 5000 [ 60%] (Sampling) Iteration: 3500 / 5000 [ 70%] (Sampling) Iteration: 4000 / 5000 [ 80%] (Sampling) Iteration: 4500 / 5000 [ 90%] (Sampling) Iteration: 5000 / 5000 [100%] (Sampling) Elapsed Time: 0.067727 seconds (Warm-up) 0.42144 seconds (Sampling) 0.489167 seconds (Total)
partridge.rstanarm = stan_glm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge, iter = 5000, warmup = 500, chains = 3, thin = 2, refresh = 0, prior_intercept = normal(0, 10), prior = normal(0, 10), prior_aux = cauchy(0, 5))
Gradient evaluation took 0.000161 seconds 1000 transitions using 10 leapfrog steps per transition would take 1.61 seconds. Adjust your expectations accordingly! Elapsed Time: 0.109844 seconds (Warm-up) 0.657854 seconds (Sampling) 0.767698 seconds (Total) Gradient evaluation took 2.1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds. Adjust your expectations accordingly! Elapsed Time: 0.08606 seconds (Warm-up) 0.507298 seconds (Sampling) 0.593358 seconds (Total) Gradient evaluation took 3e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.3 seconds. Adjust your expectations accordingly! Elapsed Time: 0.084726 seconds (Warm-up) 0.50021 seconds (Sampling) 0.584936 seconds (Total)
partridge.brm = brm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge, iter = 5000, warmup = 500, chains = 3, thin = 2, refresh = 0, prior = c(prior(normal(0, 10), class = "Intercept"), prior(normal(0, 10), class = "b"), prior(cauchy(0, 5), class = "sigma")))
Gradient evaluation took 2.5e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds. Adjust your expectations accordingly! Elapsed Time: 0.060042 seconds (Warm-up) 0.340743 seconds (Sampling) 0.400785 seconds (Total) Gradient evaluation took 1.2e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds. Adjust your expectations accordingly! Elapsed Time: 0.061698 seconds (Warm-up) 0.346451 seconds (Sampling) 0.408149 seconds (Total) Gradient evaluation took 1.4e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds. Adjust your expectations accordingly! Elapsed Time: 0.064057 seconds (Warm-up) 0.336682 seconds (Sampling) 0.400739 seconds (Total)
- Explore MCMC diagnostics
library(MCMCpack) plot(partridge.mcmcpack)
raftery.diag(partridge.mcmcpack)
Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) (Intercept) 2 3865 3746 1.030 TREATMENTPreg1 2 3710 3746 0.990 TREATMENTPreg8 2 3741 3746 0.999 TREATMENTVirg1 2 3710 3746 0.990 TREATMENTVirg8 2 3771 3746 1.010 THORAX 2 3710 3746 0.990 sigma2 2 3994 3746 1.070
autocorr.diag(partridge.mcmcpack)
(Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX Lag 0 1.000000000 1.000000000 1.000000e+00 1.000000000 1.000000000 1.000000e+00 Lag 1 -0.006504213 0.005715657 -1.260043e-03 -0.008276112 -0.008425989 -6.216742e-03 Lag 5 0.020132882 0.012739945 -2.900389e-02 -0.001338169 -0.015262191 2.106973e-02 Lag 10 0.005066356 0.011293213 -8.019851e-03 -0.003100969 -0.013767490 5.865134e-03 Lag 50 -0.001937426 -0.024530392 8.454565e-05 0.004761843 -0.008212958 -3.677482e-05 sigma2 Lag 0 1.000000e+00 Lag 1 5.041996e-02 Lag 5 7.127112e-05 Lag 10 1.653846e-02 Lag 50 -7.121217e-03
library(R2jags) library(coda) partridge.mcmc = as.mcmc(partridge.r2jags) plot(partridge.mcmc)
raftery.diag(partridge.mcmc)
[[1]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta0 10 37660 3746 10.10 beta[1] 20 37020 3746 9.88 beta[2] 20 37020 3746 9.88 beta[3] 20 37020 3746 9.88 beta[4] 20 35750 3746 9.54 beta[5] 20 37020 3746 9.88 deviance 20 37020 3746 9.88 sigma 10 37660 3746 10.10 [[2]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta0 10 37660 3746 10.10 beta[1] 10 37660 3746 10.10 beta[2] 30 40390 3746 10.80 beta[3] 20 37020 3746 9.88 beta[4] 10 37660 3746 10.10 beta[5] 20 38330 3746 10.20 deviance 20 38330 3746 10.20 sigma 20 36380 3746 9.71 [[3]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta0 20 37020 3746 9.88 beta[1] 10 37660 3746 10.10 beta[2] 20 39000 3746 10.40 beta[3] 20 37020 3746 9.88 beta[4] 20 37020 3746 9.88 beta[5] 10 37660 3746 10.10 deviance 20 37020 3746 9.88 sigma 20 38330 3746 10.20
autocorr.diag(partridge.mcmc)
beta0 beta[1] beta[2] beta[3] beta[4] beta[5] deviance Lag 0 1.000000000 1.000000000 1.000000000 1.000000000 1.000000000 1.000000000 1.000000000 Lag 10 -0.013921949 0.005149797 0.004975120 -0.001853710 -0.014382555 -0.012575832 -0.008892071 Lag 50 -0.003696594 -0.002835295 0.001318796 0.010156118 -0.011928061 -0.003560853 0.010943677 Lag 100 0.003584436 -0.001331310 0.003826374 0.009171373 0.008765503 0.003492416 -0.008883727 Lag 500 0.016409182 0.013260824 0.009091810 0.001029463 -0.003070906 0.015760118 0.002411228 sigma Lag 0 1.0000000000 Lag 10 -0.0092198712 Lag 50 0.0059702667 Lag 100 -0.0008803947 Lag 500 0.0077816203
library(rstan) library(coda) s = as.array(partridge.rstan) partridge.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , c("beta0", "beta[1]", "beta[2]", "beta[3]", "beta[4]", "beta[5]", "sigma")], 2, as.mcmc)) plot(partridge.mcmc)
raftery.diag(partridge.mcmc)
$`1` Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s $`2` Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s $`3` Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
autocorr.diag(partridge.mcmc)
beta0 beta[1] beta[2] beta[3] beta[4] beta[5] sigma Lag 0 1.0000000000 1.000000000 1.000000000 1.000000000 1.000000000 1.000000000 1.0000000000 Lag 1 0.0362085746 0.091738066 0.093666654 0.065303202 0.083647790 0.029088441 0.0297993560 Lag 5 0.0075315830 -0.007179393 -0.013474037 0.019288850 -0.002217306 0.007947353 -0.0114935490 Lag 10 0.0154634582 0.004801029 0.020335742 0.010920588 0.025187625 0.014189952 0.0007989306 Lag 50 -0.0001803206 -0.001614632 -0.005646201 0.003760134 -0.019317695 0.003126408 -0.0024392021
library(rstan) library(coda) stan_ac(partridge.rstan, pars = c("beta0", "beta", "sigma"))
stan_rhat(partridge.rstan, pars = c("beta0", "beta", "sigma"))
stan_ess(partridge.rstan, pars = c("beta0", "beta", "sigma"))
# using bayesplot library(bayesplot) detach("package:reshape") mcmc_trace(as.array(partridge.rstan), regex_pars = "beta|sigma")
mcmc_dens(as.array(partridge.rstan), regex_pars = "beta|sigma")
detach("package:reshape") library(bayesplot) mcmc_combo(as.array(partridge.rstan), regex_par = "beta|sigma")
library(rstanarm) library(coda) s = as.array(partridge.rstanarm) colnames(as.matrix(partridge.rstanarm))
[1] "(Intercept)" "TREATMENTPreg1" "TREATMENTPreg8" "TREATMENTVirg1" "TREATMENTVirg8" [6] "THORAX" "sigma"
partridge.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , c("(Intercept)", "TREATMENTPreg1", "TREATMENTPreg8", "TREATMENTVirg1", "TREATMENTVirg8", "THORAX", "sigma")], 2, as.mcmc)) plot(partridge.mcmc)
raftery.diag(partridge.mcmc)
$`1` Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s $`2` Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s $`3` Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
autocorr.diag(partridge.mcmc)
(Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX Lag 0 1.000000000 1.000000000 1.000000000 1.0000000000 1.000000000 1.0000000000 Lag 1 0.075353592 0.177654851 0.188350285 0.1751439722 0.185668547 0.0617834129 Lag 5 0.003490198 0.005324381 0.000440865 -0.0045602106 -0.002274830 0.0006088013 Lag 10 -0.003447542 0.012745177 -0.005106027 0.0115046730 -0.010358881 -0.0033071860 Lag 50 -0.004644673 -0.009315829 0.011216715 -0.0005416519 -0.007338707 -0.0043183473 sigma Lag 0 1.000000000 Lag 1 0.048851176 Lag 5 -0.011582830 Lag 10 -0.008910899 Lag 50 0.003726101
library(rstanarm) library(coda) stan_ac(partridge.rstanarm, regex_pars = "Intercept|TREATMENT|THORAX|sigma")
stan_rhat(partridge.rstanarm, regex_pars = "Intercept|TREATMENT|THORAX|sigma")
stan_ess(partridge.rstanarm, regex_pars = "Intercept|TREATMENT|THORAX|sigma")
# using bayesplot library(bayesplot) detach("package:reshape") mcmc_trace(as.array(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")
mcmc_dens(as.array(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")
detach("package:reshape") library(bayesplot) mcmc_combo(as.array(partridge.rstanarm), regex_par = "Intercept|TREATMENT|THORAX|sigma")
library(rstanarm) posterior_vs_prior(partridge.rstanarm, color_by = "vs", group_by = TRUE, facet_args = list(scales = "free_y"))
Gradient evaluation took 4.7e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.47 seconds. Adjust your expectations accordingly! Elapsed Time: 0.064874 seconds (Warm-up) 0.139432 seconds (Sampling) 0.204306 seconds (Total) Gradient evaluation took 4e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.4 seconds. Adjust your expectations accordingly! Elapsed Time: 0.074223 seconds (Warm-up) 0.240174 seconds (Sampling) 0.314397 seconds (Total)
library(coda) library(brms) partridge.mcmc = as.mcmc(partridge.brm) plot(partridge.mcmc)
raftery.diag(partridge.mcmc)
[[1]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s [[2]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s [[3]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
autocorr.diag(partridge.mcmc)
Error in ts(x, start = start(x), end = end(x), deltat = thin(x)): invalid time series parameters specified
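The autocorr.diag() call appears to fail on the mcmc object produced by as.mcmc(partridge.brm) with these package versions. One possible workaround (a sketch that simply mirrors the approach used above for the rstan fit) is to rebuild a coda mcmc.list directly from the underlying stanfit object and retry:

library(coda)
# Rebuild an mcmc.list from the stanfit stored inside the brms fit; dimension 2
# of the posterior array indexes the chains.
s = as.array(partridge.brm$fit)
partridge.brm.mcmc <- do.call(mcmc.list, plyr:::alply(s, 2, as.mcmc))
autocorr.diag(partridge.brm.mcmc)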
library(coda) stan_ac(partridge.brm$fit)
stan_rhat(partridge.brm$fit)
stan_ess(partridge.brm$fit)
- Perform model validation
library(MCMCpack) partridge.mcmc = as.data.frame(partridge.mcmcpack) # generate a model matrix newdata = partridge Xmat = model.matrix(~TREATMENT + THORAX, newdata) ## get median parameter estimates wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc)) coefs = apply(partridge.mcmc[, wch], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = log10(partridge$LONGEV) - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
# Plot residuals against treatment level newdata = newdata %>% cbind(fit, resid) newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX) ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
# Standardized residuals sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
library(MCMCpack) partridge.mcmc = as.matrix(partridge.mcmcpack) # generate a model matrix Xmat = model.matrix(~TREATMENT + THORAX, partridge) ## get median parameter estimates wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] fit = 10^(coefs %*% t(Xmat)) ## draw samples from this model yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge), fit[i, ], sqrt(partridge.mcmc[i, "sigma2"]))) newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX, yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX) ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"), alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT, fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV, x = TREATMENT), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model", group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))
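Rather than assembling posterior predictive plots entirely by hand, the bayesplot package also offers ready-made posterior predictive checking functions. A minimal sketch reusing the yRep matrix of replicates generated above (plotting only the first 100 draws):

library(bayesplot)
# yRep is observations x draws, whereas the ppc_* functions expect draws x
# observations, hence the transpose; overlay 100 replicate densities on the data.
ppc_dens_overlay(y = partridge$LONGEV, yrep = t(yRep[, 1:100]))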
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(partridge.mcmcpack), regex_pars = "TREATMENT|THORAX")
mcmc_areas(as.matrix(partridge.mcmcpack), regex_pars = "TREATMENT|THORAX")
partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix # generate a model matrix newdata = partridge Xmat = model.matrix(~TREATMENT + THORAX, newdata) ## get median parameter estimates wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc))) coefs = apply(partridge.mcmc[, wch], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = log10(partridge$LONGEV) - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
# Plot residuals against treatment level newdata = newdata %>% cbind(fit, resid) newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX) ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
# Standardized residuals sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix # generate a model matrix Xmat = model.matrix(~TREATMENT + THORAX, partridge) ## get median parameter estimates wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc))) coefs = partridge.mcmc[, wch] fit = 10^(coefs %*% t(Xmat)) ## draw samples from this model yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge), fit[i, ], partridge.mcmc[i, "sigma"])) newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX, yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX) ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"), alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT, fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV, x = TREATMENT), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model", group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(partridge.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
mcmc_areas(partridge.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
partridge.mcmc = as.matrix(partridge.rstan) # generate a model matrix newdata = partridge Xmat = model.matrix(~TREATMENT + THORAX, newdata) ## get median parameter estimates wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc))) coefs = apply(partridge.mcmc[, wch], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = log10(partridge$LONGEV) - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
# Plot residuals against treatment level newdata = newdata %>% cbind(fit, resid) newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX) ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
# Standardized residuals sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
partridge.mcmc = as.matrix(partridge.rstan) # generate a model matrix Xmat = model.matrix(~TREATMENT + THORAX, partridge) ## get median parameter estimates wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc))) coefs = partridge.mcmc[, wch] fit = 10^(coefs %*% t(Xmat)) ## draw samples from this model yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge), fit[i, ], partridge.mcmc[i, "sigma"])) newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX, yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX) ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"), alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT, fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV, x = TREATMENT), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model", group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(partridge.rstan), regex_pars = "beta|sigma")
mcmc_areas(as.matrix(partridge.rstan), regex_pars = "beta|sigma")
partridge.mcmc = as.matrix(partridge.rstanarm) # generate a model matrix newdata = partridge Xmat = model.matrix(~TREATMENT + THORAX, newdata) ## get median parameter estimates wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc)) coefs = apply(partridge.mcmc[, wch], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = log10(partridge$LONGEV) - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
# Plot residuals against treatment level newdata = newdata %>% cbind(fit, resid) newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX) ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
# Standardized residuals sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
partridge.mcmc = as.matrix(partridge.rstanarm) # generate a model matrix Xmat = model.matrix(~TREATMENT + THORAX, partridge) ## get median parameter estimates wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] fit = 10^(coefs %*% t(Xmat)) ## draw samples from this model yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge), fit[i, ], partridge.mcmc[i, "sigma"])) newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX, yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX) ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"), alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT, fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV, x = TREATMENT), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model", group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")
mcmc_areas(as.matrix(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")
partridge.mcmc = as.matrix(partridge.brm) # generate a model matrix newdata = partridge Xmat = model.matrix(~TREATMENT + THORAX, newdata) ## get median parameter estimates wch = grep("b_", colnames(partridge.mcmc)) coefs = apply(partridge.mcmc[, wch], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = log10(partridge$LONGEV) - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
# Plot residuals against treatment level newdata = newdata %>% cbind(fit, resid) newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX) ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
# Standardized residuals sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
partridge.mcmc = as.matrix(partridge.brm) # generate a model matrix Xmat = model.matrix(~TREATMENT + THORAX, partridge) ## get median parameter estimates wch = grep("b_", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] fit = 10^(coefs %*% t(Xmat)) ## draw samples from this model yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge), fit[i, ], partridge.mcmc[i, "sigma"])) newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX, yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX) ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"), alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT, fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV, x = TREATMENT), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model", group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(partridge.brm), regex_pars = "b_|sigma")
mcmc_areas(as.matrix(partridge.brm), regex_pars = "b_|sigma")
- Explore parameter estimates
library(MCMCpack) summary(partridge.mcmcpack)
Iterations = 1001:11000 Thinning interval = 1 Number of chains = 1 Sample size per chain = 10000 1. Empirical mean and standard deviation for each variable, plus standard error of the mean: Mean SD Naive SE Time-series SE (Intercept) 0.79153 0.0862441 8.624e-04 8.643e-04 TREATMENTPreg1 0.02226 0.0238689 2.387e-04 2.387e-04 TREATMENTPreg8 0.03650 0.0241417 2.414e-04 2.414e-04 TREATMENTVirg1 -0.05404 0.0240102 2.401e-04 2.401e-04 TREATMENTVirg8 -0.18198 0.0241462 2.415e-04 2.415e-04 THORAX 1.19335 0.1009666 1.010e-03 1.006e-03 sigma2 0.00713 0.0009452 9.452e-06 9.942e-06 2. Quantiles for each variable: 2.5% 25% 50% 75% 97.5% (Intercept) 0.625379 0.732743 0.791865 0.849579 0.961182 TREATMENTPreg1 -0.025297 0.006360 0.022323 0.038120 0.068806 TREATMENTPreg8 -0.010526 0.020301 0.036510 0.052720 0.083886 TREATMENTVirg1 -0.101254 -0.069767 -0.053919 -0.038007 -0.006695 TREATMENTVirg8 -0.230098 -0.197744 -0.181983 -0.165952 -0.134440 THORAX 0.992453 1.124938 1.193252 1.262005 1.389023 sigma2 0.005491 0.006455 0.007057 0.007723 0.009211
library(broom) tidyMCMC(partridge.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 (Intercept) 0.791534756 0.0862440749 0.627282417 0.962326127 2 TREATMENTPreg1 0.022262435 0.0238688705 -0.025800765 0.068161673 3 TREATMENTPreg8 0.036498778 0.0241417477 -0.012028238 0.082087102 4 TREATMENTVirg1 -0.054043317 0.0240102293 -0.101607483 -0.007550831 5 TREATMENTVirg8 -0.181978738 0.0241461559 -0.231333679 -0.135891212 6 THORAX 1.193347232 0.1009665978 0.995580301 1.390356975 7 sigma2 0.007129683 0.0009451969 0.005343271 0.008958414
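Note that MCMCregress monitors the residual variance (sigma2) rather than the residual standard deviation reported by the other routines. If a directly comparable sigma is required, one option (a sketch) is to transform the posterior draws before summarizing them:

library(broom)
library(coda)
# Summarize sigma = sqrt(sigma2) draw by draw so that the estimate is on the
# same scale as the sigma reported by the JAGS/Stan based fits.
tidyMCMC(as.mcmc(sqrt(as.matrix(partridge.mcmcpack)[, "sigma2"])), conf.int = TRUE, conf.method = "HPDinterval")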
mcmcpvalue(partridge.mcmcpack[, "TREATMENTPreg1"])
[1] 0.3491
mcmcpvalue(partridge.mcmcpack[, "TREATMENTPreg8"])
[1] 0.133
mcmcpvalue(partridge.mcmcpack[, "TREATMENTVirg1"])
[1] 0.0271
mcmcpvalue(partridge.mcmcpack[, "TREATMENTVirg8"])
[1] 0
mcmcpvalue(partridge.mcmcpack[, "THORAX"])
[1] 0
wch = grep("TREATMENT|THORAX", colnames(partridge.mcmcpack)) mcmcpvalue(partridge.mcmcpack[, wch])
[1] 0
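The mcmcpvalue() function used throughout is the helper defined earlier in this tutorial. As a reminder of the underlying idea for a single parameter, an empirical two-sided "Bayesian p-value" can be approximated as twice the smaller of the two tail proportions of posterior samples on either side of zero. A minimal single-parameter sketch (not necessarily identical to the tutorial's own definition, which also accepts a matrix of parameters as in the joint test above):

# Empirical two-sided p-value for a single parameter: the proportion of
# posterior mass on the less supported side of zero, doubled.
mcmcpvalue1 <- function(samp) 2 * min(mean(samp > 0), mean(samp < 0))
mcmcpvalue1(partridge.mcmcpack[, "TREATMENTVirg1"])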
## Frequentist for comparison summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, partridge))
Call: lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge) Residuals: Min 1Q Median 3Q Max -0.226736 -0.058445 -0.003469 0.059961 0.170389 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.79095 0.08443 9.368 5.89e-16 *** TREATMENTPreg1 0.02260 0.02368 0.954 0.3419 TREATMENTPreg8 0.03648 0.02385 1.530 0.1287 TREATMENTVirg1 -0.05381 0.02366 -2.275 0.0247 * TREATMENTVirg8 -0.18165 0.02392 -7.592 7.79e-12 *** THORAX 1.19385 0.09900 12.060 < 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.08364 on 119 degrees of freedom Multiple R-squared: 0.7055, Adjusted R-squared: 0.6932 F-statistic: 57.02 on 5 and 119 DF, p-value: < 2.2e-16
print(partridge.r2jags)
Inference for Bugs model at "5", fit using jags, 3 chains, each with 50000 iterations (first 3000 discarded), n.thin = 10 n.sims = 14100 iterations saved mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff beta[1] 0.022 0.024 -0.025 0.006 0.022 0.039 0.069 1.001 5000 beta[2] 0.037 0.024 -0.010 0.020 0.037 0.053 0.083 1.001 11000 beta[3] -0.054 0.024 -0.101 -0.070 -0.054 -0.038 -0.007 1.001 14000 beta[4] -0.182 0.024 -0.229 -0.198 -0.182 -0.166 -0.134 1.001 6500 beta[5] 1.194 0.100 0.995 1.128 1.194 1.262 1.389 1.001 7500 beta0 0.791 0.085 0.626 0.733 0.790 0.848 0.960 1.001 6800 sigma 0.085 0.006 0.074 0.081 0.084 0.088 0.096 1.001 14000 deviance -264.409 3.899 -269.889 -267.273 -265.103 -262.305 -254.937 1.002 2600 For each parameter, n.eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor (at convergence, Rhat=1). DIC info (using the rule, pD = var(deviance)/2) pD = 7.6 and DIC = -256.8 DIC is an estimate of expected predictive error (lower deviance is better).
library(broom) tidyMCMC(partridge.r2jags$BUGSoutput$sims.matrix, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 beta[1] 0.02222087 0.024107038 -0.02387136 7.028915e-02 2 beta[2] 0.03655562 0.024131694 -0.01101267 8.252448e-02 3 beta[3] -0.05390009 0.023948614 -0.09989605 -6.049349e-03 4 beta[4] -0.18170503 0.024261555 -0.22861460 -1.338749e-01 5 beta[5] 1.19380034 0.100331683 1.00121995 1.392995e+00 6 beta0 0.79102584 0.085462563 0.62940375 9.630549e-01 7 deviance -264.40914257 3.898646906 -270.71869433 -2.567381e+02 8 sigma 0.08453005 0.005586101 0.07400793 9.562832e-02
partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix mcmcpvalue(partridge.mcmc[, "beta[1]"])
[1] 0.3587234
mcmcpvalue(partridge.mcmc[, "beta[2]"])
[1] 0.1301418
mcmcpvalue(partridge.mcmc[, "beta[3]"])
[1] 0.02468085
mcmcpvalue(partridge.mcmc[, "beta[4]"])
[1] 0
mcmcpvalue(partridge.mcmc[, "beta[5]"])
[1] 0
wch = grep("beta\\[", colnames(partridge.mcmc)) mcmcpvalue(partridge.mcmc[, wch])
[1] 0
# summary(partridge.rstan) library(broom) partridge.mcmc = as.matrix(partridge.rstan) tidyMCMC(partridge.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta0", "beta", "sigma"), ess = TRUE, rhat = TRUE)
term estimate std.error conf.low conf.high rhat ess 1 beta0 0.79128931 0.086061774 0.62566968 0.961482006 1.000007 5827 2 beta[1] 0.02299254 0.024083810 -0.02571685 0.067810084 1.000434 5184 3 beta[2] 0.03628671 0.024376240 -0.01049910 0.083787879 1.000117 5274 4 beta[3] -0.05368096 0.023662283 -0.10081880 -0.009464246 1.000751 5262 5 beta[4] -0.18108684 0.024518494 -0.22869622 -0.132604319 1.000599 5420 6 beta[5] 1.19318572 0.100771538 1.00337859 1.396651732 1.000058 6383 7 sigma 0.08455309 0.005593207 0.07383390 0.095473807 1.000092 6376
mcmcpvalue(partridge.mcmc[, "beta[1]"])
[1] 0.3402963
mcmcpvalue(partridge.mcmc[, "beta[2]"])
[1] 0.1362963
mcmcpvalue(partridge.mcmc[, "beta[3]"])
[1] 0.02237037
mcmcpvalue(partridge.mcmc[, "beta[4]"])
[1] 0
mcmcpvalue(partridge.mcmc[, "beta[5]"])
[1] 0
wch = grep("beta\\[", colnames(partridge.mcmc)) mcmcpvalue(partridge.mcmc[, wch])
[1] 0
summary(partridge.rstanarm)
Model Info: function: stan_glm family: gaussian [identity] formula: log10(LONGEV) ~ TREATMENT + THORAX algorithm: sampling priors: see help('prior_summary') sample: 6750 (posterior sample size) num obs: 125 Estimates: mean sd 2.5% 25% 50% 75% 97.5% (Intercept) 0.8 0.1 0.6 0.7 0.8 0.9 1.0 TREATMENTPreg1 0.0 0.0 0.0 0.0 0.0 0.0 0.1 TREATMENTPreg8 0.0 0.0 0.0 0.0 0.0 0.1 0.1 TREATMENTVirg1 -0.1 0.0 -0.1 -0.1 -0.1 0.0 0.0 TREATMENTVirg8 -0.2 0.0 -0.2 -0.2 -0.2 -0.2 -0.1 THORAX 1.2 0.1 1.0 1.1 1.2 1.3 1.4 sigma 0.1 0.0 0.1 0.1 0.1 0.1 0.1 mean_PPD 1.7 0.0 1.7 1.7 1.7 1.7 1.8 log-posterior 122.3 1.9 117.6 121.3 122.6 123.7 125.0 Diagnostics: mcse Rhat n_eff (Intercept) 0.0 1.0 5726 TREATMENTPreg1 0.0 1.0 4536 TREATMENTPreg8 0.0 1.0 4695 TREATMENTVirg1 0.0 1.0 4830 TREATMENTVirg8 0.0 1.0 4579 THORAX 0.0 1.0 5922 sigma 0.0 1.0 6149 mean_PPD 0.0 1.0 6728 log-posterior 0.0 1.0 4744 For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
library(broom) partridge.mcmc = as.matrix(partridge.rstanarm) tidyMCMC(partridge.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)
term estimate std.error conf.low conf.high rhat ess 1 (Intercept) 0.79293103 0.085741974 0.63442374 0.967936705 0.9998648 5726 2 TREATMENTPreg1 0.02234996 0.024074877 -0.02338140 0.070865192 0.9999204 4536 3 TREATMENTPreg8 0.03641433 0.024403496 -0.01013371 0.086076915 1.0000061 4695 4 TREATMENTVirg1 -0.05403215 0.023548978 -0.09925124 -0.006906492 0.9997467 4830 5 TREATMENTVirg8 -0.18226103 0.023984235 -0.22966723 -0.136666158 1.0002442 4579 6 THORAX 1.19165812 0.100358278 0.98759586 1.382129994 0.9999683 5922 7 sigma 0.08462483 0.005556366 0.07493118 0.096523655 0.9996408 6149 8 mean_PPD 1.73567551 0.010668382 1.71548076 1.757712481 1.0000841 6728 9 log-posterior 122.29436093 1.937479394 118.54507644 125.359663348 0.9998040 4744
mcmcpvalue(partridge.mcmc[, "TREATMENTPreg1"])
[1] 0.354963
mcmcpvalue(partridge.mcmc[, "TREATMENTPreg8"])
[1] 0.1346667
mcmcpvalue(partridge.mcmc[, "TREATMENTVirg1"])
[1] 0.02222222
mcmcpvalue(partridge.mcmc[, "TREATMENTVirg8"])
[1] 0
mcmcpvalue(partridge.mcmc[, "THORAX"])
[1] 0
wch = grep("TREATMENT|THORAX", colnames(partridge.mcmc)) mcmcpvalue(partridge.mcmc[, wch])
[1] 0
summary(partridge.brm)
Family: gaussian(identity) Formula: log10(LONGEV) ~ TREATMENT + THORAX Data: partridge (Number of observations: 125) Samples: 3 chains, each with iter = 5000; warmup = 500; thin = 2; total post-warmup samples = 6750 ICs: LOO = NA; WAIC = NA; R2 = NA Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept 0.79 0.09 0.63 0.96 6165 1 TREATMENTPreg1 0.02 0.02 -0.02 0.07 5774 1 TREATMENTPreg8 0.04 0.02 -0.01 0.08 5790 1 TREATMENTVirg1 -0.05 0.02 -0.10 -0.01 5633 1 TREATMENTVirg8 -0.18 0.02 -0.23 -0.13 5810 1 THORAX 1.19 0.10 1.00 1.39 6155 1 Family Specific Parameters: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat sigma 0.08 0.01 0.07 0.1 6265 1 Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat = 1).
library(broom) partridge.mcmc = as.matrix(partridge.brm) tidyMCMC(partridge.brm$fit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)
term estimate std.error conf.low conf.high rhat ess 1 b_Intercept 0.79087403 0.085559712 0.628005663 0.963051668 0.9999102 6165 2 b_TREATMENTPreg1 0.02240460 0.024011041 -0.024874580 0.068487474 0.9997021 5774 3 b_TREATMENTPreg8 0.03620256 0.023843389 -0.008463557 0.084372278 0.9999173 5790 4 b_TREATMENTVirg1 -0.05404743 0.023815863 -0.100910410 -0.008313408 0.9997183 5633 5 b_TREATMENTVirg8 -0.18198728 0.023869345 -0.228420724 -0.135256737 0.9997294 5810 6 b_THORAX 1.19414513 0.100754965 0.995884101 1.389117902 0.9998769 6155 7 sigma 0.08463568 0.005584061 0.074166370 0.095825344 1.0004425 6265
mcmcpvalue(partridge.mcmc[, "b_TREATMENTPreg1"])
[1] 0.3511111
mcmcpvalue(partridge.mcmc[, "b_TREATMENTPreg8"])
[1] 0.128
mcmcpvalue(partridge.mcmc[, "b_TREATMENTVirg1"])
[1] 0.024
mcmcpvalue(partridge.mcmc[, "b_TREATMENTVirg8"])
[1] 0
wch = grep("b_TREATMENT|b_THORAX", colnames(partridge.mcmc)) mcmcpvalue(partridge.mcmc[, wch])
[1] 0
- Generate graphical summaries
library(MCMCpack) partridge.mcmc = partridge.mcmcpack ## Calculate the fitted values newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX), max(THORAX), len = 100))) Xmat = model.matrix(~TREATMENT + THORAX, newdata) wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] fit = 10^(coefs %*% t(Xmat)) newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95, conf.method = "HPDinterval")) newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8, conf.method = "HPDinterval")) newdata.95 %>% head
TREATMENT THORAX estimate std.error conf.low conf.high 1 None 0.6400000 35.98025 2.167746 31.85706 40.31875 2 Preg1 0.6400000 37.86609 2.167869 33.72704 42.22615 3 Preg8 0.6400000 39.12340 2.159726 34.98930 43.45699 4 Virg1 0.6400000 31.77057 1.920246 28.06600 35.52507 5 Virg8 0.6400000 23.65556 1.281681 21.25357 26.25122 6 None 0.6430303 36.27993 2.166315 32.15499 40.61696
ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() + scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type", breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners")) + scale_color_discrete("Substrate type", breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners")) + theme_classic()
# And now with partial residuals fdata = rdata = partridge fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*% t(rMat)) rdata = rdata %>% mutate(partial.resid = 10^(resid + fit)) br <- log10(pretty(partridge$LONGEV)) ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) + geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT, fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") + scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type", values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type", values = c("black", "black", "black", "white", "white"), breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() + theme(legend.justification = c(0, 1), legend.position = c(0.1, 1), axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2, size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix ## Calculate the fitted values newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX), max(THORAX), len = 100))) Xmat = model.matrix(~TREATMENT + THORAX, newdata) wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc))) coefs = partridge.mcmc[, wch] fit = 10^(coefs %*% t(Xmat)) newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95, conf.method = "HPDinterval")) newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8, conf.method = "HPDinterval")) newdata.95 %>% head
TREATMENT THORAX estimate std.error conf.low conf.high 1 None 0.6400000 35.96099 2.150108 31.84513 40.20482 2 Preg1 0.6400000 37.84543 2.204559 33.43408 42.08953 3 Preg8 0.6400000 39.10798 2.144250 34.96804 43.38181 4 Virg1 0.6400000 31.76434 1.908763 28.00014 35.46843 5 Virg8 0.6400000 23.65820 1.274451 21.22775 26.21792 6 None 0.6430303 36.26065 2.148823 32.15733 40.50511
ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() + scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type", breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners")) + scale_color_discrete("Substrate type", breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners")) + theme_classic()
# And now with partial residuals fdata = rdata = partridge fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*% t(rMat)) rdata = rdata %>% mutate(partial.resid = 10^(resid + fit)) br <- log10(pretty(partridge$LONGEV)) ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) + geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT, fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") + scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type", values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type", values = c("black", "black", "black", "white", "white"), breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() + theme(legend.justification = c(0, 1), legend.position = c(0.1, 1), axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2, size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
partridge.mcmc = as.matrix(partridge.rstan) ## Calculate the fitted values newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX), max(THORAX), len = 100))) Xmat = model.matrix(~TREATMENT + THORAX, newdata) wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc))) coefs = partridge.mcmc[, wch] fit = 10^(coefs %*% t(Xmat)) newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95, conf.method = "HPDinterval")) newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8, conf.method = "HPDinterval")) newdata.95 %>% head
TREATMENT THORAX estimate std.error conf.low conf.high 1 None 0.6400000 35.95150 2.171163 32.04491 40.56730 2 Preg1 0.6400000 37.90240 2.225653 33.73523 42.38996 3 Preg8 0.6400000 39.07261 2.150087 34.97066 43.26534 4 Virg1 0.6400000 31.77152 1.919648 28.18076 35.68269 5 Virg8 0.6400000 23.68545 1.287215 21.32823 26.26502 6 None 0.6430303 36.25091 2.169872 32.34034 40.84279
ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() + scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type", breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners")) + scale_color_discrete("Substrate type", breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners")) + theme_classic()
# And now with partial residuals fdata = rdata = partridge fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*% t(rMat)) rdata = rdata %>% mutate(partial.resid = 10^(resid + fit)) br <- log10(pretty(partridge$LONGEV)) ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) + geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT, fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") + scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type", values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type", values = c("black", "black", "black", "white", "white"), breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() + theme(legend.justification = c(0, 1), legend.position = c(0.1, 1), axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2, size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
partridge.mcmc = as.matrix(partridge.rstanarm) ## Calculate the fitted values newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX), max(THORAX), len = 100))) Xmat = model.matrix(~TREATMENT + THORAX, newdata) wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] fit = 10^(coefs %*% t(Xmat)) newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95, conf.method = "HPDinterval")) newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8, conf.method = "HPDinterval")) newdata.95 %>% head
TREATMENT THORAX estimate std.error conf.low conf.high 1 None 0.6400000 36.00583 2.159989 31.81425 40.09867 2 Preg1 0.6400000 37.90180 2.179165 33.48592 42.07839 3 Preg8 0.6400000 39.14320 2.141342 35.04090 43.33835 4 Virg1 0.6400000 31.79228 1.886028 27.94081 35.30833 5 Virg8 0.6400000 23.65624 1.258929 21.30229 26.22091 6 None 0.6430303 36.30531 2.158638 32.06754 40.35430
ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() + scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type", breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners")) + scale_color_discrete("Substrate type", breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners")) + theme_classic()
# And now with partial residuals fdata = rdata = partridge fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*% t(rMat)) rdata = rdata %>% mutate(partial.resid = 10^(resid + fit)) br <- log10(pretty(partridge$LONGEV)) ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) + geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT, fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") + scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type", values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type", values = c("black", "black", "black", "white", "white"), breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() + theme(legend.justification = c(0, 1), legend.position = c(0.1, 1), axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2, size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
partridge.mcmc = as.matrix(partridge.brm) ## Calculate the fitted values newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX), max(THORAX), len = 100))) Xmat = model.matrix(~TREATMENT + THORAX, newdata) wch = grep("b_", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] fit = 10^(coefs %*% t(Xmat)) newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95, conf.method = "HPDinterval")) newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8, conf.method = "HPDinterval")) newdata.95 %>% head
TREATMENT THORAX estimate std.error conf.low conf.high 1 None 0.6400000 35.96536 2.125997 31.82643 40.13090 2 Preg1 0.6400000 37.86659 2.190934 33.72999 42.27928 3 Preg8 0.6400000 39.08282 2.151109 34.95758 43.39081 4 Virg1 0.6400000 31.75761 1.891817 27.93170 35.27973 5 Virg8 0.6400000 23.64734 1.288577 21.28669 26.30815 6 None 0.6430303 36.26515 2.124425 32.13702 40.42999
ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) + geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() + scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type", breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners")) + scale_color_discrete("Substrate type", breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners")) + theme_classic()
# And now with partial residuals fdata = rdata = partridge fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata) fit = as.vector(apply(coefs, 2, median) %*% t(fMat)) resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*% t(rMat)) rdata = rdata %>% mutate(partial.resid = 10^(resid + fit)) br <- log10(pretty(partridge$LONGEV)) ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) + geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT, fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") + scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type", values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type", values = c("black", "black", "black", "white", "white"), breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() + theme(legend.justification = c(0, 1), legend.position = c(0.1, 1), axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2, size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
- We have established that male fruitfly longevity varies across the treatments.
The effects model directly compared each of the partner types to the control treatment with no partners.
We might also be interested in describing the difference in male fruitfly longevity between other
combinations of partner types, so let's compare each treatment group to every other in a pairwise manner.
Note that, since we elected to fit the model to logarithmic (base ten) transformed longevity, we need to be a
little careful about how we process the results. Without any back-transformation, the comparisons are on the log
scale. If we wish to express them on the natural response scale, we need a little more manipulation and our
log laws ($\log(A) - \log(B) = \log(A/B)$). Thus, if we back-transform the contrasts, the results reflect the
factor (multiplicative) change in longevity rather than the difference in days, as the short sketch below illustrates.
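For instance, with two arbitrary longevities (values chosen purely for illustration, not model output):

# A difference on the log10 scale back-transforms to a ratio (factor change)
# on the natural scale.
A <- 25; B <- 40
log10(A) - log10(B)        # -0.204
10^(log10(A) - log10(B))   # 0.625
A/B                        # 0.625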
library(MCMCpack) partridge.mcmc = partridge.mcmcpack wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE))) # A Tukeys contrast matrix library(multcomp) tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey") Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata) pairwise.mat <- tuk.mat %*% Xmat pairwise.mat
(Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX Preg1 - None 0 1 0 0 0 0 Preg8 - None 0 0 1 0 0 0 Virg1 - None 0 0 0 1 0 0 Virg8 - None 0 0 0 0 1 0 Preg8 - Preg1 0 -1 1 0 0 0 Virg1 - Preg1 0 -1 0 1 0 0 Virg8 - Preg1 0 -1 0 0 1 0 Virg1 - Preg8 0 0 -1 1 0 0 Virg8 - Preg8 0 0 -1 0 1 0 Virg8 - Virg1 0 0 0 -1 1 0
mcmc_areas(coefs %*% t(pairwise.mat))
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Preg1 - None 0.02226244 0.02386887 -0.02580077 0.068161673 2 Preg8 - None 0.03649878 0.02414175 -0.01202824 0.082087102 3 Virg1 - None -0.05404332 0.02401023 -0.10160748 -0.007550831 4 Virg8 - None -0.18197874 0.02414616 -0.23133368 -0.135891212 5 Preg8 - Preg1 0.01423634 0.02396522 -0.03512377 0.059624253 6 Virg1 - Preg1 -0.07630575 0.02391487 -0.12156718 -0.028062534 7 Virg8 - Preg1 -0.20424117 0.02371341 -0.25045812 -0.156732472 8 Virg1 - Preg8 -0.09054210 0.02433891 -0.13738602 -0.041521436 9 Virg8 - Preg8 -0.21847752 0.02396819 -0.26613085 -0.171726858 10 Virg8 - Virg1 -0.12793542 0.02426936 -0.17484823 -0.080668677
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
## With back transformed mcmc_areas(10^(coefs %*% t(pairwise.mat)))
(comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Preg1 - None 1.0541883 0.05796784 0.9423218 1.1699348 2 Preg8 - None 1.0893557 0.06060224 0.9684462 1.2032111 3 Virg1 - None 0.8843419 0.04891830 0.7906505 0.9820142 4 Virg8 - None 0.6587068 0.03662892 0.5855194 0.7297431 5 Preg8 - Preg1 1.0348974 0.05712792 0.9223085 1.1471607 6 Virg1 - Preg1 0.8401419 0.04629954 0.7512269 0.9324124 7 Virg8 - Preg1 0.6257574 0.03418221 0.5595466 0.6945545 8 Virg1 - Preg8 0.8130922 0.04560181 0.7219419 0.9014199 9 Virg8 - Preg8 0.6055967 0.03343095 0.5375230 0.6689410 10 Virg8 - Virg1 0.7460068 0.04173700 0.6685775 0.8304841
ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
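The same machinery accommodates specific planned comparisons: any comparison of interest can be written as a row of coefficients applied to the treatment-level predictions. As an illustrative sketch (this particular grouping is my own choice rather than part of the original analysis), the average of the two pregnant-partner treatments could be compared with the average of the two virgin-partner treatments; the result is on the log10 scale and can be back-transformed with 10^ to a factor change as above:

# Hypothetical planned contrast (pregnant vs virgin partners), reusing the
# coefs and Xmat objects from above. Rows of Xmat are ordered None, Preg1,
# Preg8, Virg1, Virg8; the coefficients sum to zero, so the intercept and
# THORAX columns cancel out of the contrast.
cont.mat <- rbind(`Preg - Virg` = c(0, 1/2, 1/2, -1/2, -1/2))
tidyMCMC(coefs %*% t(cont.mat %*% Xmat), conf.int = TRUE, conf.method = "HPDinterval")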
library(MCMCpack) partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc))) coefs = partridge.mcmc[, wch] newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE))) # A Tukeys contrast matrix library(multcomp) tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey") Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata) pairwise.mat <- tuk.mat %*% Xmat pairwise.mat
(Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX Preg1 - None 0 1 0 0 0 0 Preg8 - None 0 0 1 0 0 0 Virg1 - None 0 0 0 1 0 0 Virg8 - None 0 0 0 0 1 0 Preg8 - Preg1 0 -1 1 0 0 0 Virg1 - Preg1 0 -1 0 1 0 0 Virg8 - Preg1 0 -1 0 0 1 0 Virg1 - Preg8 0 0 -1 1 0 0 Virg8 - Preg8 0 0 -1 0 1 0 Virg8 - Virg1 0 0 0 -1 1 0
mcmc_areas(coefs %*% t(pairwise.mat))
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Preg1 - None 0.02222087 0.02410704 -0.02387136 0.070289150 2 Preg8 - None 0.03655562 0.02413169 -0.01101267 0.082524482 3 Virg1 - None -0.05390009 0.02394861 -0.09989605 -0.006049349 4 Virg8 - None -0.18170503 0.02426155 -0.22861460 -0.133874873 5 Preg8 - Preg1 0.01433475 0.02402038 -0.03409306 0.060470474 6 Virg1 - Preg1 -0.07612096 0.02380513 -0.12371581 -0.030310117 7 Virg8 - Preg1 -0.20392590 0.02419627 -0.25216901 -0.157132792 8 Virg1 - Preg8 -0.09045571 0.02415687 -0.13813097 -0.043683249 9 Virg8 - Preg8 -0.21826065 0.02403122 -0.26479914 -0.171535168 10 Virg8 - Virg1 -0.12780494 0.02419583 -0.17496673 -0.080363292
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
## With back transformed mcmc_areas(10^(coefs %*% t(pairwise.mat)))
(comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Preg1 - None 1.0541196 0.05855850 0.9400143 1.1681953 2 Preg8 - None 1.0894963 0.06054854 0.9744850 1.2087235 3 Virg1 - None 0.8846269 0.04881624 0.7922053 0.9837108 4 Virg8 - None 0.6591326 0.03686490 0.5880233 0.7316295 5 Preg8 - Preg1 1.0351401 0.05731642 0.9245000 1.1493981 6 Virg1 - Preg1 0.8404883 0.04613413 0.7521149 0.9325881 7 Virg8 - Preg1 0.6262506 0.03492673 0.5595398 0.6964135 8 Virg1 - Preg8 0.8132350 0.04527198 0.7270732 0.9038008 9 Virg8 - Preg8 0.6059049 0.03357668 0.5435016 0.6736973 10 Virg8 - Virg1 0.7462239 0.04162240 0.6678105 0.8304426
ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
library(MCMCpack) partridge.mcmc = as.matrix(partridge.rstan) Xmat = model.matrix(~TREATMENT + THORAX, newdata) wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc))) coefs = partridge.mcmc[, wch] newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE))) # A Tukeys contrast matrix library(multcomp) tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey") Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata) pairwise.mat <- tuk.mat %*% Xmat pairwise.mat
(Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX Preg1 - None 0 1 0 0 0 0 Preg8 - None 0 0 1 0 0 0 Virg1 - None 0 0 0 1 0 0 Virg8 - None 0 0 0 0 1 0 Preg8 - Preg1 0 -1 1 0 0 0 Virg1 - Preg1 0 -1 0 1 0 0 Virg8 - Preg1 0 -1 0 0 1 0 Virg1 - Preg8 0 0 -1 1 0 0 Virg8 - Preg8 0 0 -1 0 1 0 Virg8 - Virg1 0 0 0 -1 1 0
mcmc_areas(coefs %*% t(pairwise.mat))
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Preg1 - None 0.02299254 0.02408381 -0.02571685 0.067810084 2 Preg8 - None 0.03628671 0.02437624 -0.01049910 0.083787879 3 Virg1 - None -0.05368096 0.02366228 -0.10081880 -0.009464246 4 Virg8 - None -0.18108684 0.02451849 -0.22869622 -0.132604319 5 Preg8 - Preg1 0.01329417 0.02395326 -0.03264648 0.062360678 6 Virg1 - Preg1 -0.07667351 0.02383856 -0.12247183 -0.030528752 7 Virg8 - Preg1 -0.20407938 0.02424689 -0.25291361 -0.156602516 8 Virg1 - Preg8 -0.08996768 0.02416122 -0.13611621 -0.041840729 9 Virg8 - Preg8 -0.21737355 0.02391817 -0.26345342 -0.170901602 10 Virg8 - Virg1 -0.12740587 0.02428197 -0.17411665 -0.079437471
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
## With back transformed mcmc_areas(10^(coefs %*% t(pairwise.mat)))
(comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Preg1 - None 1.0559907 0.05858701 0.9383382 1.1646937 2 Preg8 - None 1.0888563 0.06114756 0.9758079 1.2124721 3 Virg1 - None 0.8850402 0.04819619 0.7928321 0.9784435 4 Virg8 - None 0.6600920 0.03724069 0.5898256 0.7359186 5 Preg8 - Preg1 1.0326541 0.05703137 0.9215931 1.1470933 6 Virg1 - Preg1 0.8394233 0.04615294 0.7542723 0.9321188 7 Virg8 - Preg1 0.6260333 0.03498437 0.5583422 0.6969889 8 Virg1 - Preg8 0.8141491 0.04529563 0.7309435 0.9081535 9 Virg8 - Preg8 0.6071349 0.03348412 0.5447600 0.6741852 10 Virg8 - Virg1 0.7469190 0.04185802 0.6680023 0.8307720
ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
library(MCMCpack) partridge.mcmc = as.matrix(partridge.rstanarm) Xmat = model.matrix(~TREATMENT + THORAX, newdata) wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE))) # A Tukeys contrast matrix library(multcomp) tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey") Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata) pairwise.mat <- tuk.mat %*% Xmat pairwise.mat
(Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX Preg1 - None 0 1 0 0 0 0 Preg8 - None 0 0 1 0 0 0 Virg1 - None 0 0 0 1 0 0 Virg8 - None 0 0 0 0 1 0 Preg8 - Preg1 0 -1 1 0 0 0 Virg1 - Preg1 0 -1 0 1 0 0 Virg8 - Preg1 0 -1 0 0 1 0 Virg1 - Preg8 0 0 -1 1 0 0 Virg8 - Preg8 0 0 -1 0 1 0 Virg8 - Virg1 0 0 0 -1 1 0
mcmc_areas(coefs %*% t(pairwise.mat))
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Preg1 - None 0.02234996 0.02407488 -0.02338140 0.070865192 2 Preg8 - None 0.03641433 0.02440350 -0.01013371 0.086076915 3 Virg1 - None -0.05403215 0.02354898 -0.09925124 -0.006906492 4 Virg8 - None -0.18226103 0.02398424 -0.22966723 -0.136666158 5 Preg8 - Preg1 0.01406437 0.02412565 -0.02968572 0.065601507 6 Virg1 - Preg1 -0.07638211 0.02392343 -0.12328881 -0.029989351 7 Virg8 - Preg1 -0.20461099 0.02415189 -0.25341139 -0.159071623 8 Virg1 - Preg8 -0.09044648 0.02433552 -0.13815607 -0.042642378 9 Virg8 - Preg8 -0.21867536 0.02387648 -0.26542574 -0.170323352 10 Virg8 - Virg1 -0.12822888 0.02393159 -0.17425325 -0.080769365
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
## With back transformed mcmc_areas(10^(coefs %*% t(pairwise.mat)))
(comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Preg1 - None 1.0544289 0.05852064 0.9411695 1.1700444 2 Preg8 - None 1.0891800 0.06122090 0.9669837 1.2075192 3 Virg1 - None 0.8843145 0.04804840 0.7933975 0.9816892 4 Virg8 - None 0.6582655 0.03637684 0.5892950 0.7300185 5 Preg8 - Preg1 1.0345108 0.05759814 0.9221479 1.1490921 6 Virg1 - Preg1 0.8399958 0.04635241 0.7520730 0.9324562 7 Virg8 - Preg1 0.6252592 0.03476928 0.5579414 0.6933115 8 Virg1 - Preg8 0.8132717 0.04565201 0.7240084 0.9027741 9 Virg8 - Preg8 0.6053146 0.03332360 0.5402115 0.6728358 10 Virg8 - Virg1 0.7454694 0.04105743 0.6646395 0.8252720
ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
library(MCMCpack) partridge.mcmc = as.matrix(partridge.brm) Xmat = model.matrix(~TREATMENT + THORAX, newdata) wch = grep("b_", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE))) # A Tukeys contrast matrix library(multcomp) tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey") Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata) pairwise.mat <- tuk.mat %*% Xmat pairwise.mat
(Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX Preg1 - None 0 1 0 0 0 0 Preg8 - None 0 0 1 0 0 0 Virg1 - None 0 0 0 1 0 0 Virg8 - None 0 0 0 0 1 0 Preg8 - Preg1 0 -1 1 0 0 0 Virg1 - Preg1 0 -1 0 1 0 0 Virg8 - Preg1 0 -1 0 0 1 0 Virg1 - Preg8 0 0 -1 1 0 0 Virg8 - Preg8 0 0 -1 0 1 0 Virg8 - Virg1 0 0 0 -1 1 0
mcmc_areas(coefs %*% t(pairwise.mat))
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Preg1 - None 0.02240460 0.02401104 -0.024874580 0.068487474 2 Preg8 - None 0.03620256 0.02384339 -0.008463557 0.084372278 3 Virg1 - None -0.05404743 0.02381586 -0.100910410 -0.008313408 4 Virg8 - None -0.18198728 0.02386935 -0.228420724 -0.135256737 5 Preg8 - Preg1 0.01379796 0.02421139 -0.033271153 0.061604483 6 Virg1 - Preg1 -0.07645203 0.02403209 -0.123225559 -0.028926763 7 Virg8 - Preg1 -0.20439189 0.02427000 -0.250323885 -0.155967121 8 Virg1 - Preg8 -0.09024999 0.02394566 -0.136639855 -0.043432356 9 Virg8 - Preg8 -0.21818985 0.02415905 -0.265948399 -0.171503685 10 Virg8 - Virg1 -0.12793985 0.02412511 -0.176962850 -0.081758344
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
## With back-transformed values
mcmc_areas(10^(coefs %*% t(pairwise.mat)))
(comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
term estimate std.error conf.low conf.high 1 Preg1 - None 1.0545530 0.05836951 0.9443336 1.1708128 2 Preg8 - None 1.0885725 0.05986066 0.9797715 1.2133891 3 Virg1 - None 0.8843113 0.04850064 0.7926648 0.9810397 4 Virg8 - None 0.6586722 0.03629164 0.5909888 0.7323914 5 Preg8 - Preg1 1.0338855 0.05765372 0.9262513 1.1524033 6 Virg1 - Preg1 0.8398717 0.04652671 0.7490572 0.9312345 7 Virg8 - Preg1 0.6255845 0.03497534 0.5603392 0.6965585 8 Virg1 - Preg8 0.8135973 0.04483752 0.7239839 0.8980175 9 Virg8 - Preg8 0.6060138 0.03377457 0.5403503 0.6717646 10 Virg8 - Virg1 0.7459856 0.04150406 0.6613482 0.8240819
ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) + geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() + theme_classic()
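Although Tukey's contrasts cover all of the pairwise comparisons, the same posterior samples can be used for any planned contrast. The following is purely an illustrative sketch (it is not part of the original analysis and the particular grouping is arbitrary): it compares the average of the two virgin treatments with the average of the two pregnant treatments, assuming the coefs, Xmat and newdata objects created in the brms chunk above.

## A sketch only (not part of the original analysis): a planned contrast comparing
## the average of the two virgin treatments with the average of the two pregnant
## treatments. Assumes coefs, Xmat and newdata as created in the brms chunk above.
library(coda)
library(broom)
virgin <- newdata$TREATMENT %in% c("Virg1", "Virg8")
pregnant <- newdata$TREATMENT %in% c("Preg1", "Preg8")
cont <- colMeans(Xmat[virgin, , drop = FALSE]) - colMeans(Xmat[pregnant, , drop = FALSE])
virg.vs.preg <- as.vector(coefs %*% cont)  # posterior of the contrast (log10 scale)
tidyMCMC(as.mcmc(virg.vs.preg), conf.int = TRUE, conf.method = "HPDinterval")
## and back-transformed to a ratio of longevities
tidyMCMC(as.mcmc(10^virg.vs.preg), conf.int = TRUE, conf.method = "HPDinterval")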
So (for example), the longevity of male fruitflies housed with 8 virgin partners is approximately 0.61 times that of male fruitflies housed with 8 pregnant partners. Similar contrasts to those defined in Tutorial 7.4b, Q3 could be defined to explore specific comparisons (one such planned contrast is sketched above).
- Explore $R^2$
library(MCMCpack) library(broom) partridge.mcmc <- partridge.mcmcpack Xmat = model.matrix(~TREATMENT + THORAX, data = partridge) wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, log10(partridge$LONGEV), "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.6984863 0.02600139 0.6472273 0.7451868
# for comparison with frequentist summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))
Call: lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge) Residuals: Min 1Q Median 3Q Max -0.226736 -0.058445 -0.003469 0.059961 0.170389 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.79095 0.08443 9.368 5.89e-16 *** TREATMENTPreg1 0.02260 0.02368 0.954 0.3419 TREATMENTPreg8 0.03648 0.02385 1.530 0.1287 TREATMENTVirg1 -0.05381 0.02366 -2.275 0.0247 * TREATMENTVirg8 -0.18165 0.02392 -7.592 7.79e-12 *** THORAX 1.19385 0.09900 12.060 < 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.08364 on 119 degrees of freedom Multiple R-squared: 0.7055, Adjusted R-squared: 0.6932 F-statistic: 57.02 on 5 and 119 DF, p-value: < 2.2e-16
partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc))) coefs = partridge.mcmc[, wch] Xmat = model.matrix(~TREATMENT + THORAX, data = partridge) fit = coefs %*% t(Xmat) resid = sweep(fit, 2, log10(partridge$LONGEV), "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.6983588 0.02576383 0.6474857 0.7445594
# for comparison with frequentist summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))
Call: lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge) Residuals: Min 1Q Median 3Q Max -0.226736 -0.058445 -0.003469 0.059961 0.170389 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.79095 0.08443 9.368 5.89e-16 *** TREATMENTPreg1 0.02260 0.02368 0.954 0.3419 TREATMENTPreg8 0.03648 0.02385 1.530 0.1287 TREATMENTVirg1 -0.05381 0.02366 -2.275 0.0247 * TREATMENTVirg8 -0.18165 0.02392 -7.592 7.79e-12 *** THORAX 1.19385 0.09900 12.060 < 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.08364 on 119 degrees of freedom Multiple R-squared: 0.7055, Adjusted R-squared: 0.6932 F-statistic: 57.02 on 5 and 119 DF, p-value: < 2.2e-16
partridge.mcmc = as.matrix(partridge.rstan) Xmat = model.matrix(~TREATMENT + THORAX, data = partridge) wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc))) coefs = partridge.mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, log10(partridge$LONGEV), "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.376688 0.148472 0.1275442 0.6879338
# for comparison with frequentist summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))
Call: lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge) Residuals: Min 1Q Median 3Q Max -0.226736 -0.058445 -0.003469 0.059961 0.170389 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.79095 0.08443 9.368 5.89e-16 *** TREATMENTPreg1 0.02260 0.02368 0.954 0.3419 TREATMENTPreg8 0.03648 0.02385 1.530 0.1287 TREATMENTVirg1 -0.05381 0.02366 -2.275 0.0247 * TREATMENTVirg8 -0.18165 0.02392 -7.592 7.79e-12 *** THORAX 1.19385 0.09900 12.060 < 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.08364 on 119 degrees of freedom Multiple R-squared: 0.7055, Adjusted R-squared: 0.6932 F-statistic: 57.02 on 5 and 119 DF, p-value: < 2.2e-16
partridge.mcmc = as.matrix(partridge.rstanarm) Xmat = model.matrix(~TREATMENT + THORAX, data = partridge) wch = grep("Intercept|TREAT|THORAX", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, log10(partridge$LONGEV), "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.3773228 0.1478388 0.1241807 0.6818132
# for comparison with frequentist summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))
Call: lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge) Residuals: Min 1Q Median 3Q Max -0.226736 -0.058445 -0.003469 0.059961 0.170389 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.79095 0.08443 9.368 5.89e-16 *** TREATMENTPreg1 0.02260 0.02368 0.954 0.3419 TREATMENTPreg8 0.03648 0.02385 1.530 0.1287 TREATMENTVirg1 -0.05381 0.02366 -2.275 0.0247 * TREATMENTVirg8 -0.18165 0.02392 -7.592 7.79e-12 *** THORAX 1.19385 0.09900 12.060 < 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.08364 on 119 degrees of freedom Multiple R-squared: 0.7055, Adjusted R-squared: 0.6932 F-statistic: 57.02 on 5 and 119 DF, p-value: < 2.2e-16
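As a cross-check on the manually derived $R^2$ (a sketch only, assuming a reasonably recent version of rstanarm), the fitted stanreg object also provides posterior draws of $R^2$ directly via bayes_R2():

## sketch: posterior draws of R-squared from the fitted rstanarm model
library(rstanarm)
library(broom)
library(coda)
r2.draws <- bayes_R2(partridge.rstanarm)
tidyMCMC(as.mcmc(as.vector(r2.draws)), conf.int = TRUE, conf.method = "HPDinterval")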
partridge.mcmc = as.matrix(partridge.brm) Xmat = model.matrix(~TREATMENT + THORAX, data = partridge) wch = grep("b_", colnames(partridge.mcmc)) coefs = partridge.mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, log10(partridge$LONGEV), "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.3771413 0.1483151 0.1227373 0.6817201
# for comparison with frequentist summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))
Call: lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge) Residuals: Min 1Q Median 3Q Max -0.226736 -0.058445 -0.003469 0.059961 0.170389 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.79095 0.08443 9.368 5.89e-16 *** TREATMENTPreg1 0.02260 0.02368 0.954 0.3419 TREATMENTPreg8 0.03648 0.02385 1.530 0.1287 TREATMENTVirg1 -0.05381 0.02366 -2.275 0.0247 * TREATMENTVirg8 -0.18165 0.02392 -7.592 7.79e-12 *** THORAX 1.19385 0.09900 12.060 < 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.08364 on 119 degrees of freedom Multiple R-squared: 0.7055, Adjusted R-squared: 0.6932 F-statistic: 57.02 on 5 and 119 DF, p-value: < 2.2e-16
The chains appear well mixed and have converged on what is likely to be a stable posterior.
The model validation diagnostics all seem reasonable, suggesting that the model is likely to be reliable.
Heterogeneous slopes
Constable (1993) compared the inter-radial suture widths of urchins maintained on one of three food regimes (Initial: no additional food supplied above what was in the initial sample; Low: food supplied periodically; High: food supplied ad libitum). In an attempt to control for substantial variability in urchin sizes, the initial body volume of each urchin was measured as a covariate.
Download constable data set

Format of constable.csv data file: TREAT (food regime: Initial, Low or High), IV (initial body volume of each urchin, the covariate), SUTW (inter-radial suture width, the response).
constable <- read.csv("../downloads/data/constable.csv", strip.white = T) head(constable)
TREAT IV SUTW 1 Initial 3.5 0.010 2 Initial 5.0 0.020 3 Initial 8.0 0.061 4 Initial 10.0 0.051 5 Initial 13.0 0.041 6 Initial 13.0 0.061
- Perform basic exploratory data analysis to examine (a quick check of the covariate ranges is sketched after the scatterplots below):
- normality
- homogeneity of variance
- homogeneity of slopes (and equality of covariate ranges)
ggplot(constable, aes(y = SUTW, x = IV, color = TREAT)) + geom_smooth() + geom_point() + theme_classic()
# Or with linear smoother ggplot(constable, aes(y = SUTW, x = IV, color = TREAT)) + geom_smooth(method = "lm") + geom_point() + theme_classic()
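The last point, equality of covariate ranges, can be examined directly by comparing the distribution of IV across the three treatment groups. The following is a minimal sketch (not part of the original code) using a boxplot and simple group summaries:

library(ggplot2)
library(dplyr)
## distribution of the covariate (IV) within each treatment group
ggplot(constable, aes(y = IV, x = TREAT)) + geom_boxplot() + theme_classic()
## numeric summaries of the covariate range per group
constable %>% group_by(TREAT) %>% summarize(Min = min(IV), Mean = mean(IV), Max = max(IV))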
- To obtain a quick estimate of the likely residuals, let's explore the residuals from a simple linear model fitted in a frequentist framework.
plot(lm(SUTW ~ IV * TREAT, data = constable))
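Note that the models fitted below use the cube root of IV (via I(IV^(1/3))), presumably so that the volumetric covariate is expressed on a linear (length-like) scale. As a quick sketch (not part of the original code), the same frequentist residual check can be repeated on that scale:

## sketch: residual diagnostics for the cube-root covariate model
plot(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))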
- Given the likely heterogeneous slopes, it is arguably best to fit a model that permits heterogeneous slopes (that is, includes the IV by TREAT interaction).
Fit the appropriate Bayesian model to explore the effect of treatment and initial volume (and their interaction) on the width of sutures.
$$
\begin{align}
y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\
\mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em]
\beta_0 &\sim{} N(0,10)\\
\beta &\sim{} N(0,10)\\
\sigma &\sim{} cauchy(0,5)\\
\end{align}
$$
library(MCMCpack) constable.mcmcpack = MCMCregress(SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
modelString = " model { #Likelihood for (i in 1:n) { y[i]~dnorm(mu[i],tau) mu[i] <- beta0 + inprod(beta[],X[i,]) } #Priors beta0 ~ dnorm(0.01,1.0E-6) for (j in 1:nX) { beta[j] ~ dnorm(0.01,1.0E-6) } tau <- 1 / (sigma * sigma) sigma~dunif(0,100) } " X = model.matrix(~I(IV^(1/3)) * TREAT, data = constable) constable.list <- with(constable, list(y = SUTW, X = X[, -1], nX = ncol(X) - 1, n = nrow(constable))) params <- c("beta0", "beta", "sigma") burnInSteps = 3000 nChains = 3 numSavedSteps = 15000 thinSteps = 10 nIter = ceiling((numSavedSteps * thinSteps)/nChains) constable.r2jags <- jags(data = constable.list, inits = NULL, parameters.to.save = params, model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter, n.burnin = burnInSteps, n.thin = thinSteps)
Compiling model graph Resolving undeclared variables Allocating nodes Graph information: Observed stochastic nodes: 72 Unobserved stochastic nodes: 7 Total graph size: 645 Initializing model
modelString=" data { int
n; // total number of observations vector[n] Y; // response variable int nX; // number of effects matrix[n, nX] X; // model matrix } transformed data { matrix[n, nX - 1] Xc; // centered version of X vector[nX - 1] means_X; // column means of X before centering for (i in 2:nX) { means_X[i - 1] = mean(X[, i]); Xc[, i - 1] = X[, i] - means_X[i - 1]; } } parameters { vector[nX-1] beta; // population-level effects real cbeta0; // center-scale intercept real sigma; // residual SD } transformed parameters { } model { vector[n] mu; mu = Xc * beta + cbeta0; // prior specifications beta ~ normal(0, 10); cbeta0 ~ normal(0, 10); sigma ~ cauchy(0, 5); // likelihood contribution Y ~ normal(mu, sigma); } generated quantities { real beta0; // population-level intercept vector[n] log_lik; beta0 = cbeta0 - dot_product(means_X, beta); for (i in 1:n) { log_lik[i] = normal_lpdf(Y[i] | Xc[i] * beta + cbeta0, sigma); } } " X = model.matrix(~I(IV^(1/3)) * TREAT, data = constable) constable.list <- with(constable, list(Y = SUTW, X = X, nX = ncol(X), n = nrow(constable))) library(rstan) constable.rstan <- stan(data = constable.list, model_code = modelString, chains = 3, iter = 5000, warmup = 500, thin = 5)
SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 1). Gradient evaluation took 1.9e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds. Adjust your expectations accordingly! Iteration: 1 / 5000 [ 0%] (Warmup) Iteration: 500 / 5000 [ 10%] (Warmup) Iteration: 501 / 5000 [ 10%] (Sampling) Iteration: 1000 / 5000 [ 20%] (Sampling) Iteration: 1500 / 5000 [ 30%] (Sampling) Iteration: 2000 / 5000 [ 40%] (Sampling) Iteration: 2500 / 5000 [ 50%] (Sampling) Iteration: 3000 / 5000 [ 60%] (Sampling) Iteration: 3500 / 5000 [ 70%] (Sampling) Iteration: 4000 / 5000 [ 80%] (Sampling) Iteration: 4500 / 5000 [ 90%] (Sampling) Iteration: 5000 / 5000 [100%] (Sampling) Elapsed Time: 0.144418 seconds (Warm-up) 0.984664 seconds (Sampling) 1.12908 seconds (Total) SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 2). Gradient evaluation took 1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds. Adjust your expectations accordingly! Iteration: 1 / 5000 [ 0%] (Warmup) Iteration: 500 / 5000 [ 10%] (Warmup) Iteration: 501 / 5000 [ 10%] (Sampling) Iteration: 1000 / 5000 [ 20%] (Sampling) Iteration: 1500 / 5000 [ 30%] (Sampling) Iteration: 2000 / 5000 [ 40%] (Sampling) Iteration: 2500 / 5000 [ 50%] (Sampling) Iteration: 3000 / 5000 [ 60%] (Sampling) Iteration: 3500 / 5000 [ 70%] (Sampling) Iteration: 4000 / 5000 [ 80%] (Sampling) Iteration: 4500 / 5000 [ 90%] (Sampling) Iteration: 5000 / 5000 [100%] (Sampling) Elapsed Time: 0.140768 seconds (Warm-up) 1.06787 seconds (Sampling) 1.20863 seconds (Total) SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 3). Gradient evaluation took 1.6e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds. Adjust your expectations accordingly! Iteration: 1 / 5000 [ 0%] (Warmup) Iteration: 500 / 5000 [ 10%] (Warmup) Iteration: 501 / 5000 [ 10%] (Sampling) Iteration: 1000 / 5000 [ 20%] (Sampling) Iteration: 1500 / 5000 [ 30%] (Sampling) Iteration: 2000 / 5000 [ 40%] (Sampling) Iteration: 2500 / 5000 [ 50%] (Sampling) Iteration: 3000 / 5000 [ 60%] (Sampling) Iteration: 3500 / 5000 [ 70%] (Sampling) Iteration: 4000 / 5000 [ 80%] (Sampling) Iteration: 4500 / 5000 [ 90%] (Sampling) Iteration: 5000 / 5000 [100%] (Sampling) Elapsed Time: 0.15773 seconds (Warm-up) 1.077 seconds (Sampling) 1.23473 seconds (Total)
constable.rstanarm = stan_glm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable, iter = 5000, warmup = 500, chains = 3, thin = 5, refresh = 0, prior_intercept = normal(0, 10), prior = normal(0, 10), prior_aux = cauchy(0, 5))
Gradient evaluation took 2.8e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.28 seconds. Adjust your expectations accordingly! Elapsed Time: 0.276502 seconds (Warm-up) 1.72962 seconds (Sampling) 2.00612 seconds (Total) Gradient evaluation took 1.3e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds. Adjust your expectations accordingly! Elapsed Time: 0.256832 seconds (Warm-up) 1.64027 seconds (Sampling) 1.8971 seconds (Total) Gradient evaluation took 1.3e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds. Adjust your expectations accordingly! Elapsed Time: 0.263878 seconds (Warm-up) 1.5121 seconds (Sampling) 1.77597 seconds (Total)
constable.brm = brm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable, iter = 5000, warmup = 500, chains = 3, thin = 5, refresh = 0, prior = c(prior(normal(0, 10), class = "Intercept"), prior(normal(0, 10), class = "b"), prior(cauchy(0, 5), class = "sigma")))
Gradient evaluation took 1.6e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds. Adjust your expectations accordingly! Elapsed Time: 0.143567 seconds (Warm-up) 1.0836 seconds (Sampling) 1.22716 seconds (Total) Gradient evaluation took 9e-06 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds. Adjust your expectations accordingly! Elapsed Time: 0.133804 seconds (Warm-up) 1.12441 seconds (Sampling) 1.25822 seconds (Total) Gradient evaluation took 1.1e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds. Adjust your expectations accordingly! Elapsed Time: 0.139708 seconds (Warm-up) 1.10982 seconds (Sampling) 1.24953 seconds (Total)
- Explore MCMC diagnostics
library(MCMCpack) plot(constable.mcmcpack)
raftery.diag(constable.mcmcpack)
Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) (Intercept) 2 3865 3746 1.030 I(IV^(1/3)) 2 3650 3746 0.974 TREATInitial 2 3710 3746 0.990 TREATLow 2 3710 3746 0.990 I(IV^(1/3)):TREATInitial 2 3994 3746 1.070 I(IV^(1/3)):TREATLow 2 3897 3746 1.040 sigma2 2 3929 3746 1.050
autocorr.diag(constable.mcmcpack)
(Intercept) I(IV^(1/3)) TREATInitial TREATLow I(IV^(1/3)):TREATInitial Lag 0 1.000000000 1.000000000 1.000000000 1.0000000000 1.000000000 Lag 1 -0.005949108 -0.008338260 -0.008217864 0.0004490301 -0.010759741 Lag 5 0.020882211 0.023896208 -0.003101549 0.0115059926 0.001748665 Lag 10 0.015981305 0.019153534 -0.002528803 0.0022271117 0.001748848 Lag 50 0.007896860 0.005767901 0.008420514 -0.0058100440 0.003191181 I(IV^(1/3)):TREATLow sigma2 Lag 0 1.0000000000 1.0000000000 Lag 1 -0.0006921279 0.0907121678 Lag 5 0.0146699538 0.0005388206 Lag 10 0.0066382223 0.0170684058 Lag 50 -0.0064255545 0.0075084152
library(R2jags) library(coda) constable.mcmc = as.mcmc(constable.r2jags) plot(constable.mcmc)
raftery.diag(constable.mcmc)
[[1]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta0 20 38330 3746 10.20 beta[1] 10 37670 3746 10.10 beta[2] 10 37660 3746 10.10 beta[3] 20 37020 3746 9.88 beta[4] 20 38330 3746 10.20 beta[5] 20 39000 3746 10.40 deviance 10 37660 3746 10.10 sigma 20 37020 3746 9.88 [[2]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta0 20 37020 3746 9.88 beta[1] 20 36380 3746 9.71 beta[2] 10 37660 3746 10.10 beta[3] 20 37020 3746 9.88 beta[4] 20 36380 3746 9.71 beta[5] 20 36380 3746 9.71 deviance 20 38330 3746 10.20 sigma 20 38330 3746 10.20 [[3]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta0 20 38330 3746 10.20 beta[1] 20 39000 3746 10.40 beta[2] 10 37660 3746 10.10 beta[3] 10 37660 3746 10.10 beta[4] 20 38330 3746 10.20 beta[5] 20 36380 3746 9.71 deviance 10 37660 3746 10.10 sigma 20 36380 3746 9.71
autocorr.diag(constable.mcmc)
beta0 beta[1] beta[2] beta[3] beta[4] beta[5] Lag 0 1.0000000000 1.0000000000 1.000000000 1.0000000000 1.000000000 1.0000000000 Lag 10 -0.0008583855 -0.0008855223 0.005002736 0.0044326527 0.005410357 0.0049190350 Lag 50 -0.0046942718 -0.0052147927 -0.002716165 0.0001732884 -0.003266711 0.0001285954 Lag 100 0.0073399128 0.0063289399 0.005797145 0.0107637633 0.004719665 0.0102551639 Lag 500 0.0009396995 0.0022012866 -0.004393656 -0.0112211954 -0.003541198 -0.0096620252 deviance sigma Lag 0 1.000000000 1.000000000 Lag 10 0.003079761 -0.004958203 Lag 50 -0.013198225 -0.008532456 Lag 100 -0.006321487 -0.014973307 Lag 500 0.007831084 -0.006825210
library(rstan)
library(coda)
s = as.array(constable.rstan)
# the model has five slope coefficients (beta[1] to beta[5]) plus beta0 and sigma
constable.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , c("beta0", "beta[1]", "beta[2]",
    "beta[3]", "beta[4]", "beta[5]", "sigma")], 2, as.mcmc))
plot(constable.mcmc)
raftery.diag(constable.mcmc)
[[1]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta0 20 38330 3746 10.20 beta[1] 10 37670 3746 10.10 beta[2] 10 37660 3746 10.10 beta[3] 20 37020 3746 9.88 beta[4] 20 38330 3746 10.20 beta[5] 20 39000 3746 10.40 deviance 10 37660 3746 10.10 sigma 20 37020 3746 9.88 [[2]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta0 20 37020 3746 9.88 beta[1] 20 36380 3746 9.71 beta[2] 10 37660 3746 10.10 beta[3] 20 37020 3746 9.88 beta[4] 20 36380 3746 9.71 beta[5] 20 36380 3746 9.71 deviance 20 38330 3746 10.20 sigma 20 38330 3746 10.20 [[3]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 Burn-in Total Lower bound Dependence (M) (N) (Nmin) factor (I) beta0 20 38330 3746 10.20 beta[1] 20 39000 3746 10.40 beta[2] 10 37660 3746 10.10 beta[3] 10 37660 3746 10.10 beta[4] 20 38330 3746 10.20 beta[5] 20 36380 3746 9.71 deviance 10 37660 3746 10.10 sigma 20 36380 3746 9.71
autocorr.diag(constable.mcmc)
beta0 beta[1] beta[2] beta[3] beta[4] beta[5] Lag 0 1.0000000000 1.0000000000 1.000000000 1.0000000000 1.000000000 1.0000000000 Lag 10 -0.0008583855 -0.0008855223 0.005002736 0.0044326527 0.005410357 0.0049190350 Lag 50 -0.0046942718 -0.0052147927 -0.002716165 0.0001732884 -0.003266711 0.0001285954 Lag 100 0.0073399128 0.0063289399 0.005797145 0.0107637633 0.004719665 0.0102551639 Lag 500 0.0009396995 0.0022012866 -0.004393656 -0.0112211954 -0.003541198 -0.0096620252 deviance sigma Lag 0 1.000000000 1.000000000 Lag 10 0.003079761 -0.004958203 Lag 50 -0.013198225 -0.008532456 Lag 100 -0.006321487 -0.014973307 Lag 500 0.007831084 -0.006825210
library(rstan) library(coda) stan_ac(constable.rstan, pars = c("beta0", "beta", "sigma"))
stan_rhat(constable.rstan, pars = c("beta0", "beta", "sigma"))
stan_ess(constable.rstan, pars = c("beta0", "beta", "sigma"))
# using bayesplot
library(bayesplot)
detach("package:reshape")
mcmc_trace(as.array(constable.rstan), regex_pars = "beta|sigma")
mcmc_dens(as.array(constable.rstan), regex_pars = "beta|sigma")
detach("package:reshape") library(bayesplot) mcmc_combo(as.array(constable.rstan), regex_par = "beta|sigma")
library(rstanarm) library(coda) s = as.array(constable.rstanarm) colnames(as.matrix(constable.rstanarm))
[1] "(Intercept)" "I(IV^(1/3))" "TREATInitial" [4] "TREATLow" "I(IV^(1/3)):TREATInitial" "I(IV^(1/3)):TREATLow" [7] "sigma"
constable.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , colnames(as.matrix(constable.rstanarm))], 2, as.mcmc)) plot(constable.mcmc)
raftery.diag(constable.mcmc)
$`1` Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s $`2` Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s $`3` Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
autocorr.diag(constable.mcmc)
(Intercept) I(IV^(1/3)) TREATInitial TREATLow I(IV^(1/3)):TREATInitial Lag 0 1.000000000 1.000000000 1.00000000 1.000000000 1.000000000 Lag 1 0.094390624 0.093078031 0.09427246 0.088553480 0.094495420 Lag 5 0.014550240 0.017891288 0.02036189 -0.003087282 0.025476705 Lag 10 -0.018253834 -0.015818047 -0.01071217 -0.007707348 -0.006127881 Lag 50 0.003953687 0.002826433 0.01488185 0.010416070 0.012186204 I(IV^(1/3)):TREATLow sigma Lag 0 1.000000000 1.000000000 Lag 1 0.085966027 0.019297999 Lag 5 0.004263184 -0.001089781 Lag 10 -0.007911326 0.013127724 Lag 50 0.002241020 0.019541951
library(rstanarm) library(coda) stan_ac(constable.rstanarm, regex_pars = "Intercept|TREAT|IV|sigma")
stan_rhat(constable.rstanarm, regex_pars = "Intercept|TREAT|IV|sigma")
stan_ess(constable.rstanarm, regex_pars = "Intercept|TREAT|IV|sigma")
# using bayesplot
library(bayesplot)
detach("package:reshape")
mcmc_trace(as.array(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")
mcmc_dens(as.array(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")
detach("package:reshape") library(bayesplot) mcmc_combo(as.array(constable.rstanarm), regex_par = "Intercept|TREAT|IV|sigma")
library(rstanarm) posterior_vs_prior(constable.rstanarm, color_by = "vs", group_by = TRUE, facet_args = list(scales = "free_y"))
Gradient evaluation took 2.2e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds. Adjust your expectations accordingly! Elapsed Time: 0.048119 seconds (Warm-up) 0.080688 seconds (Sampling) 0.128807 seconds (Total) Gradient evaluation took 1.5e-05 seconds 1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds. Adjust your expectations accordingly! Elapsed Time: 0.038981 seconds (Warm-up) 0.124669 seconds (Sampling) 0.16365 seconds (Total)
library(coda) library(brms) constable.mcmc = as.mcmc(constable.brm) plot(constable.mcmc)
raftery.diag(constable.mcmc)
[[1]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s [[2]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s [[3]] Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
autocorr.diag(constable.mcmc)
Error in ts(x, start = start(x), end = end(x), deltat = thin(x)): invalid time series parameters specified
library(coda) stan_ac(constable.brm$fit)
stan_rhat(constable.brm$fit)
stan_ess(constable.brm$fit)
- Perform model validation
library(MCMCpack) constable.mcmc = as.data.frame(constable.mcmcpack) # generate a model matrix newdata = constable Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata) ## get median parameter estimates wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc)) coefs = apply(constable.mcmc[, wch], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = constable$SUTW - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
# Plot residuals against TREAT level newdata = newdata %>% cbind(fit, resid) newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV) ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
# Standardized residuals sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
library(MCMCpack) constable.mcmc = as.matrix(constable.mcmcpack) # generate a model matrix Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata) ## get median parameter estimates wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc)) coefs = constable.mcmc[, wch] fit = (coefs %*% t(Xmat)) ## draw samples from this model yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable), fit[i, ], sqrt(constable.mcmc[i, "sigma2"]))) newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>% gather(key = Sample, value = Value, -TREAT, -IV) ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"), alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT, fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW, x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV, color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW, x = IV, group = IV, color = TREAT))
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(constable.mcmcpack), regex_pars = "TREAT|IV")
mcmc_areas(as.matrix(constable.mcmcpack), regex_pars = "TREAT|IV")
constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix # generate a model matrix newdata = constable Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata) ## get median parameter estimates wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc))) coefs = apply(constable.mcmc[, wch], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = constable$SUTW - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
# Plot residuals against TREAT level newdata = newdata %>% cbind(fit, resid) newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV) ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
# Standardized residuals sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix # generate a model matrix Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata) ## get median parameter estimates wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc))) coefs = constable.mcmc[, wch] fit = (coefs %*% t(Xmat)) ## draw samples from this model yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable), fit[i, ], constable.mcmc[i, "sigma"])) newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>% gather(key = Sample, value = Value, -TREAT, -IV) ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"), alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT, fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW, x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV, color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW, x = IV, group = IV, color = TREAT))
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(constable.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
mcmc_areas(constable.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
constable.mcmc = as.matrix(constable.rstan) # generate a model matrix newdata = constable Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata) ## get median parameter estimates wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc))) coefs = apply(constable.mcmc[, wch], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = constable$SUTW - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
# Plot residuals against TREAT level newdata = newdata %>% cbind(fit, resid) newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV) ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
# Standardized residuals sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
constable.mcmc = as.matrix(constable.rstan) # generate a model matrix Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata) ## get median parameter estimates wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc))) coefs = constable.mcmc[, wch] fit = (coefs %*% t(Xmat)) ## draw samples from this model yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable), fit[i, ], constable.mcmc[i, "sigma"])) newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>% gather(key = Sample, value = Value, -TREAT, -IV) ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"), alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT, fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW, x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV, color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW, x = IV, group = IV, color = TREAT))
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(constable.rstan), regex_pars = "beta|sigma")
mcmc_areas(as.matrix(constable.rstan), regex_pars = "beta|sigma")
constable.mcmc = as.matrix(constable.rstanarm) # generate a model matrix newdata = constable Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata) ## get median parameter estimates wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc)) coefs = apply(constable.mcmc[, wch], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = (constable$SUTW) - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
# Plot residuals against TREAT level newdata = newdata %>% cbind(fit, resid) newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV) ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
# Standardized residuals sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
constable.mcmc = as.matrix(constable.rstanarm) # generate a model matrix Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata) ## get median parameter estimates wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc)) coefs = constable.mcmc[, wch] fit = (coefs %*% t(Xmat)) ## draw samples from this model yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable), fit[i, ], constable.mcmc[i, "sigma"])) newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>% gather(key = Sample, value = Value, -TREAT, -IV) ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"), alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT, fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW, x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV, color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW, x = IV, group = IV, color = TREAT))
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")
mcmc_areas(as.matrix(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")
constable.mcmc = as.matrix(constable.brm) # generate a model matrix newdata = constable Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata) ## get median parameter estimates wch = grep("b_", colnames(constable.mcmc)) coefs = apply(constable.mcmc[, wch], 2, median) fit = as.vector(coefs %*% t(Xmat)) resid = constable$SUTW - fit ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
# Plot residuals against TREAT level newdata = newdata %>% cbind(fit, resid) newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV) ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
# Standardized residuals sresid = resid/sd(resid) ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
constable.mcmc = as.matrix(constable.brm) # generate a model matrix Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata) ## get median parameter estimates wch = grep("b_", colnames(constable.mcmc)) coefs = constable.mcmc[, wch] fit = (coefs %*% t(Xmat)) ## draw samples from this model yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable), fit[i, ], constable.mcmc[i, "sigma"])) newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>% gather(key = Sample, value = Value, -TREAT, -IV) ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"), alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT, fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW, x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")
ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV, color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW, x = IV, group = IV, color = TREAT))
We can also explore the posteriors of each parameter.
library(bayesplot) mcmc_intervals(as.matrix(constable.brm), regex_pars = "b_|sigma")
mcmc_areas(as.matrix(constable.brm), regex_pars = "b_|sigma")
- Explore parameter estimates
library(MCMCpack) summary(constable.mcmcpack)
Iterations = 1001:11000 Thinning interval = 1 Number of chains = 1 Sample size per chain = 10000 1. Empirical mean and standard deviation for each variable, plus standard error of the mean: Mean SD Naive SE Time-series SE (Intercept) -0.0281407 3.212e-02 3.212e-04 3.212e-04 I(IV^(1/3)) 0.0467453 1.256e-02 1.256e-04 1.277e-04 TREATInitial -0.0004898 3.950e-02 3.950e-04 3.950e-04 TREATLow 0.0627394 3.903e-02 3.903e-04 3.903e-04 I(IV^(1/3)):TREATInitial -0.0113577 1.512e-02 1.512e-04 1.512e-04 I(IV^(1/3)):TREATLow -0.0376004 1.480e-02 1.480e-04 1.480e-04 sigma2 0.0004486 8.176e-05 8.176e-07 8.955e-07 2. Quantiles for each variable: 2.5% 25% 50% 75% 97.5% (Intercept) -0.0904732 -0.0498242 -0.0280381 -0.0068701 0.0348912 I(IV^(1/3)) 0.0220741 0.0384528 0.0468257 0.0552233 0.0709941 TREATInitial -0.0774232 -0.0271221 -0.0004452 0.0258165 0.0769013 TREATLow -0.0146895 0.0369222 0.0626721 0.0889185 0.1394793 I(IV^(1/3)):TREATInitial -0.0409856 -0.0213530 -0.0114781 -0.0011979 0.0179675 I(IV^(1/3)):TREATLow -0.0663009 -0.0474367 -0.0376417 -0.0278228 -0.0084968 sigma2 0.0003162 0.0003897 0.0004405 0.0004971 0.0006353
library(broom) tidyMCMC(constable.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 (Intercept) -0.0281407318 3.211710e-02 -0.0921183584 0.0328974751 2 I(IV^(1/3)) 0.0467453220 1.256260e-02 0.0216638045 0.0704309315 3 TREATInitial -0.0004897981 3.950309e-02 -0.0768221293 0.0772328696 4 TREATLow 0.0627394109 3.902797e-02 -0.0164210766 0.1365547275 5 I(IV^(1/3)):TREATInitial -0.0113576982 1.511950e-02 -0.0399768488 0.0185830940 6 I(IV^(1/3)):TREATLow -0.0376004095 1.480079e-02 -0.0661201076 -0.0083137914 7 sigma2 0.0004485999 8.176199e-05 0.0003025742 0.0006110242
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
terms = attr(Xmat, "assign")
## Effect of each parameter
for (i in 1:length(terms)) mcmcpvalue(constable.mcmcpack[, i])
## Effect of covariate
mcmcpvalue(constable.mcmcpack[, which(terms == 1)])
[1] 6e-04
## Effect of treatment (marginal) mcmcpvalue(constable.mcmcpack[, which(terms == 2)])
[1] 0.096
## Effect of interaction mcmcpvalue(constable.mcmcpack[, which(terms == 3)])
[1] 0.0165
# Overall model
wch = grep("TREAT|IV", colnames(constable.mcmcpack))
mcmcpvalue(constable.mcmcpack[, wch])
[1] 0
## Frequentist for comparison summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))
Call: lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable) Residuals: Min 1Q Median 3Q Max -0.047764 -0.012727 0.001153 0.012959 0.065303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0282017 0.0307338 -0.918 0.362163 I(IV^(1/3)) 0.0467744 0.0119981 3.898 0.000229 *** TREATInitial -0.0005899 0.0380584 -0.015 0.987680 TREATLow 0.0630730 0.0373498 1.689 0.095994 . I(IV^(1/3)):TREATInitial -0.0113263 0.0145247 -0.780 0.438302 I(IV^(1/3)):TREATLow -0.0377210 0.0141608 -2.664 0.009700 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.02048 on 66 degrees of freedom Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576 F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
print(constable.r2jags)
Inference for Bugs model at "5", fit using jags, 3 chains, each with 50000 iterations (first 3000 discarded), n.thin = 10 n.sims = 14100 iterations saved mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff beta[1] 0.047 0.012 0.022 0.039 0.047 0.055 0.071 1.001 14000 beta[2] 0.000 0.039 -0.077 -0.026 0.000 0.025 0.076 1.001 14000 beta[3] 0.063 0.038 -0.013 0.037 0.063 0.088 0.139 1.001 14000 beta[4] -0.011 0.015 -0.040 -0.021 -0.012 -0.001 0.018 1.001 14000 beta[5] -0.038 0.015 -0.067 -0.047 -0.038 -0.028 -0.009 1.001 14000 beta0 -0.028 0.032 -0.090 -0.049 -0.028 -0.007 0.035 1.001 14000 sigma 0.021 0.002 0.018 0.020 0.021 0.022 0.025 1.001 6700 deviance -354.340 3.998 -360.045 -357.282 -355.026 -352.167 -344.666 1.001 14000 For each parameter, n.eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor (at convergence, Rhat=1). DIC info (using the rule, pD = var(deviance)/2) pD = 8.0 and DIC = -346.3 DIC is an estimate of expected predictive error (lower deviance is better).
library(broom) tidyMCMC(constable.r2jags$BUGSoutput$sims.matrix, conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 beta[1] 4.670045e-02 0.012301012 0.02305671 7.161889e-02 2 beta[2] -4.557708e-04 0.038871071 -0.07545077 7.762928e-02 3 beta[3] 6.281161e-02 0.038401967 -0.01312925 1.383286e-01 4 beta[4] -1.136903e-02 0.014860096 -0.04083542 1.740297e-02 5 beta[5] -3.762094e-02 0.014575422 -0.06722313 -9.735764e-03 6 beta0 -2.801120e-02 0.031515102 -0.08943297 3.500509e-02 7 deviance -3.543402e+02 3.997734559 -360.86935897 -3.465127e+02 8 sigma 2.089732e-02 0.001885792 0.01735789 2.460693e-02
constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc))) constable.mcmc = constable.mcmc[, wch] str(constable.mcmc)
num [1:14100, 1:6] -0.0208 -0.00522 -0.01208 -0.00336 0.07698 ... - attr(*, "dimnames")=List of 2 ..$ : NULL ..$ : chr [1:6] "beta0" "beta[1]" "beta[2]" "beta[3]" ...
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
terms = attr(Xmat, "assign")
## Effect of each parameter
for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))
[1] 0.3675887 [1] 0.000141844 [1] 0.9907801 [1] 0.1035461 [1] 0.4405674 [1] 0.01078014
## Effect of covariate mcmcpvalue(constable.mcmc[, which(terms == 1)])
[1] 0.000141844
## Effect of treatment (marginal) mcmcpvalue(constable.mcmc[, which(terms == 2)])
[1] 0.08524823
## Effect of interaction mcmcpvalue(constable.mcmc[, which(terms == 3)])
[1] 0.01297872
# Overall model
wch = grep("beta\\[", colnames(constable.mcmc))
mcmcpvalue(constable.mcmc[, wch])
[1] 0
## Frequentist for comparison summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))
Call: lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable) Residuals: Min 1Q Median 3Q Max -0.047764 -0.012727 0.001153 0.012959 0.065303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0282017 0.0307338 -0.918 0.362163 I(IV^(1/3)) 0.0467744 0.0119981 3.898 0.000229 *** TREATInitial -0.0005899 0.0380584 -0.015 0.987680 TREATLow 0.0630730 0.0373498 1.689 0.095994 . I(IV^(1/3)):TREATInitial -0.0113263 0.0145247 -0.780 0.438302 I(IV^(1/3)):TREATLow -0.0377210 0.0141608 -2.664 0.009700 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.02048 on 66 degrees of freedom Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576 F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
library(broom) tidyMCMC(constable.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta0", "beta", "sigma"), ess = TRUE, rhat = TRUE)
term estimate std.error conf.low conf.high rhat ess 1 beta0 -0.0278369985 0.032630197 -0.08856619 0.03976331 1.0011292 1778 2 beta[1] 0.0466281311 0.012731092 0.01859167 0.06908683 1.0010891 2018 3 beta[2] -0.0007384085 0.039219652 -0.07668034 0.07500954 0.9997629 2152 4 beta[3] 0.0618907446 0.039193081 -0.01722613 0.13859100 1.0000331 1810 5 beta[4] -0.0112233626 0.014999941 -0.04061147 0.01841634 0.9998709 2106 6 beta[5] -0.0372331205 0.014919054 -0.06591119 -0.00682343 1.0001722 1767 7 sigma 0.0208891691 0.001872266 0.01713790 0.02446211 1.0004836 2571
constable.mcmc = as.matrix(constable.rstan) wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc))) constable.mcmc = constable.mcmc[, wch] str(constable.mcmc)
num [1:2700, 1:6] -0.097 -0.0165 -0.0329 -0.0255 0.0274 ... - attr(*, "dimnames")=List of 2 ..$ iterations: NULL ..$ parameters: chr [1:6] "beta0" "beta[1]" "beta[2]" "beta[3]" ...
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
terms = attr(Xmat, "assign")
## Effect of each parameter
for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))
[1] 0.387037 [1] 0.0007407407 [1] 0.987037 [1] 0.1166667 [1] 0.422963 [1] 0.01444444
## Effect of covariate mcmcpvalue(constable.mcmc[, which(terms == 1)])
[1] 0.0007407407
## Effect of treatment (marginal) mcmcpvalue(constable.mcmc[, which(terms == 2)])
[1] 0.0962963
## Effect of interaction mcmcpvalue(constable.mcmc[, which(terms == 3)])
[1] 0.02074074
# Overall model
wch = grep("beta\\[", colnames(constable.mcmc))
mcmcpvalue(constable.mcmc[, wch])
[1] 0
## Frequentist for comparison summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))
Call: lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable) Residuals: Min 1Q Median 3Q Max -0.047764 -0.012727 0.001153 0.012959 0.065303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0282017 0.0307338 -0.918 0.362163 I(IV^(1/3)) 0.0467744 0.0119981 3.898 0.000229 *** TREATInitial -0.0005899 0.0380584 -0.015 0.987680 TREATLow 0.0630730 0.0373498 1.689 0.095994 . I(IV^(1/3)):TREATInitial -0.0113263 0.0145247 -0.780 0.438302 I(IV^(1/3)):TREATLow -0.0377210 0.0141608 -2.664 0.009700 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.02048 on 66 degrees of freedom Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576 F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
summary(constable.rstan)
$summary mean se_mean sd 2.5% 25% 50% beta[1] 4.662813e-02 2.834136e-04 0.012731092 0.02031882 0.03842780 4.691583e-02 beta[2] -7.384085e-04 8.455188e-04 0.039219652 -0.07661515 -0.02554020 5.813345e-05 beta[3] 6.189074e-02 9.213031e-04 0.039193081 -0.01636681 0.03653035 6.230858e-02 beta[4] -1.122336e-02 3.268895e-04 0.014999941 -0.04039101 -0.02063698 -1.126060e-02 beta[5] -3.723312e-02 3.549614e-04 0.014919054 -0.06648371 -0.04703486 -3.747664e-02 cbeta0 7.244843e-02 4.899537e-05 0.002545874 0.06748716 0.07071114 7.243238e-02 sigma 2.088917e-02 3.692373e-05 0.001872266 0.01745036 0.01961179 2.082794e-02 beta0 -2.783700e-02 7.738103e-04 0.032630197 -0.09105514 -0.04933693 -2.819851e-02 log_lik[1] 2.555053e+00 8.279597e-03 0.412625825 1.48002609 2.37565170 2.668924e+00 log_lik[2] 2.689869e+00 5.886304e-03 0.291712678 1.94169390 2.56823620 2.765803e+00 log_lik[3] 2.481437e+00 6.408258e-03 0.321730766 1.70879340 2.29899530 2.541077e+00 log_lik[4] 2.897622e+00 2.274280e-03 0.116633788 2.64138042 2.83049352 2.907077e+00 log_lik[5] 2.705299e+00 3.510597e-03 0.174263133 2.31349245 2.61010610 2.728462e+00 log_lik[6] 2.876877e+00 2.286666e-03 0.117374730 2.61123537 2.80959929 2.887540e+00 log_lik[7] 2.560570e+00 3.857742e-03 0.194705349 2.12570324 2.45110459 2.581614e+00 log_lik[8] 2.752342e+00 2.840581e-03 0.146866659 2.40683117 2.67032992 2.770447e+00 log_lik[9] 1.775614e+00 6.846079e-03 0.348432336 0.99574820 1.56920160 1.809863e+00 log_lik[10] 2.778273e+00 2.698093e-03 0.135681420 2.48780917 2.70187045 2.792084e+00 log_lik[11] 2.673033e+00 3.005238e-03 0.156156766 2.33437255 2.58537417 2.689725e+00 log_lik[12] 2.686719e+00 2.928559e-03 0.152172402 2.33092385 2.59809399 2.701096e+00 log_lik[13] 1.661739e+00 6.924671e-03 0.359816441 0.87329102 1.45046387 1.695174e+00 log_lik[14] 2.316722e+00 4.462239e-03 0.231864737 1.78428279 2.18156727 2.335296e+00 log_lik[15] 2.333874e+00 4.626360e-03 0.240392696 1.79242627 2.19062150 2.361844e+00 log_lik[16] 2.745435e+00 2.852268e-03 0.148208202 2.40700659 2.66093197 2.759778e+00 log_lik[17] 2.845838e+00 2.312208e-03 0.120145827 2.58647190 2.77648380 2.855412e+00 log_lik[18] 2.849556e+00 2.509111e-03 0.128577370 2.55917320 2.77590000 2.860467e+00 log_lik[19] 2.822053e+00 2.724216e-03 0.141554399 2.49033298 2.74328713 2.837016e+00 log_lik[20] 2.873692e+00 2.581911e-03 0.131199533 2.55664162 2.80698230 2.888037e+00 log_lik[21] 2.026990e+00 8.336486e-03 0.433176512 1.04071341 1.76431161 2.086384e+00 log_lik[22] 2.787911e+00 3.709842e-03 0.192769033 2.29265457 2.70351788 2.824156e+00 log_lik[23] 2.399441e+00 6.861572e-03 0.356537753 1.53604209 2.20306355 2.459279e+00 log_lik[24] 2.839579e+00 3.259153e-03 0.169350543 2.39523909 2.76589671 2.871430e+00 log_lik[25] 2.762956e+00 4.962960e-03 0.241860710 2.11040028 2.66516845 2.820364e+00 log_lik[26] 2.334286e+00 7.549496e-03 0.372101667 1.41557147 2.13302617 2.400365e+00 log_lik[27] 2.723067e+00 4.487012e-03 0.218326652 2.16329698 2.61613234 2.763296e+00 log_lik[28] 2.078010e+00 7.032911e-03 0.364625211 1.26525032 1.86876336 2.121607e+00 log_lik[29] 2.581916e+00 4.920716e-03 0.255687918 1.97016520 2.45489963 2.616756e+00 log_lik[30] 2.886086e+00 2.276165e-03 0.114568562 2.63526733 2.82031241 2.895578e+00 log_lik[31] 2.910017e+00 2.060950e-03 0.104986513 2.68691980 2.84838520 2.915109e+00 log_lik[32] 2.222772e+00 5.339118e-03 0.275582619 1.61533570 2.06599337 2.248303e+00 log_lik[33] 2.924467e+00 1.925423e-03 0.097708975 2.72776828 2.86256038 2.927915e+00 log_lik[34] 2.924467e+00 1.925423e-03 0.097708975 2.72776828 
2.86256038 2.927915e+00 log_lik[35] 2.752860e+00 2.808358e-03 0.144358905 2.43182351 2.67047426 2.770488e+00 log_lik[36] 2.490421e+00 3.964137e-03 0.200794362 2.03246369 2.37362879 2.510642e+00 log_lik[37] 2.930899e+00 1.871049e-03 0.095021752 2.73934583 2.86974024 2.932586e+00 log_lik[38] 2.929999e+00 1.879522e-03 0.095393674 2.73761400 2.86804181 2.931895e+00 log_lik[39] 2.438143e+00 4.251045e-03 0.219655037 1.93589288 2.30486686 2.461310e+00 log_lik[40] 2.927682e+00 1.885400e-03 0.096231668 2.73082310 2.86582223 2.929051e+00 log_lik[41] 2.872912e+00 2.539450e-03 0.131822831 2.56366237 2.80429922 2.889157e+00 log_lik[42] 2.615441e+00 5.050475e-03 0.262430360 1.97208172 2.47776371 2.660553e+00 log_lik[43] 2.864143e+00 2.761726e-03 0.143503495 2.50170516 2.79416131 2.883797e+00 log_lik[44] 2.851867e+00 2.988698e-03 0.155297296 2.45692210 2.77823353 2.876845e+00 log_lik[45] 2.873560e+00 2.681272e-03 0.139322967 2.54704307 2.80559996 2.891678e+00 log_lik[46] 2.689669e+00 5.963575e-03 0.280196813 1.97318990 2.56913396 2.756082e+00 log_lik[47] 2.435259e+00 6.821403e-03 0.324916984 1.68842917 2.25753450 2.485678e+00 log_lik[48] 2.699319e+00 5.452140e-03 0.238778108 2.11922450 2.58832085 2.747008e+00 log_lik[49] 2.279966e+00 8.200409e-03 0.392277547 1.34804999 2.06478509 2.343175e+00 log_lik[50] 2.719443e+00 3.563200e-03 0.177344190 2.31249922 2.62214007 2.743720e+00 log_lik[51] 2.418619e+00 4.698342e-03 0.240938206 1.86752765 2.27902509 2.445047e+00 log_lik[52] 1.504658e+00 7.710691e-03 0.400659237 0.63779990 1.25467856 1.541890e+00 log_lik[53] 2.672471e+00 3.090797e-03 0.160602534 2.32001149 2.57212431 2.685993e+00 log_lik[54] 1.854814e+00 6.997527e-03 0.360937583 1.04480966 1.63603052 1.886166e+00 log_lik[55] 2.869673e+00 2.582922e-03 0.128647935 2.56622994 2.80382516 2.884795e+00 log_lik[56] 2.714235e+00 3.056682e-03 0.158474265 2.35245828 2.62540402 2.731998e+00 log_lik[57] 1.609668e+00 7.477827e-03 0.388559294 0.76105837 1.36902866 1.643380e+00 log_lik[58] 2.883513e+00 2.143228e-03 0.109107631 2.64053696 2.82097839 2.890654e+00 log_lik[59] 2.903621e+00 2.067792e-03 0.103244226 2.68945368 2.83949878 2.906093e+00 log_lik[60] 2.841091e+00 2.293749e-03 0.119186689 2.57997266 2.77081249 2.847705e+00 log_lik[61] 2.171238e+00 5.368653e-03 0.277867834 1.58346702 1.99630633 2.198158e+00 log_lik[62] 2.894702e+00 2.153447e-03 0.107158693 2.67477624 2.83273941 2.900048e+00 log_lik[63] -2.080309e+00 2.132534e-02 1.075508483 -4.42410661 -2.74679165 -1.978252e+00 log_lik[64] 2.529744e-01 1.288737e-02 0.665509296 -1.22137507 -0.16784435 3.028655e-01 log_lik[65] 2.866611e+00 2.735208e-03 0.134152799 2.55952547 2.79899747 2.881208e+00 log_lik[66] 1.357742e+00 1.521447e-02 0.735355609 -0.30509519 0.93351831 1.446848e+00 log_lik[67] 1.958808e+00 9.352149e-03 0.462405731 0.96385176 1.67520242 2.014435e+00 log_lik[68] 2.829446e+00 3.852587e-03 0.185710982 2.35089393 2.75399152 2.865516e+00 log_lik[69] 1.858093e+00 1.656846e-02 0.786121382 0.08716029 1.44708394 2.012602e+00 log_lik[70] 2.020418e+00 9.992458e-03 0.517707061 0.79164539 1.72680316 2.100529e+00 log_lik[71] 2.828951e+00 3.433464e-03 0.178408006 2.35566306 2.75449274 2.862395e+00 log_lik[72] 2.224567e+00 6.792642e-03 0.329938655 1.44633068 2.04324904 2.270276e+00 lp__ 2.394017e+02 4.082506e-02 2.031752421 234.41983863 238.38380889 2.397764e+02 75% 97.5% n_eff Rhat beta[1] 0.055023423 0.071624769 2017.855 1.0010891 beta[2] 0.023974137 0.074990062 2151.597 0.9997629 beta[3] 0.088135179 0.140132781 1809.730 1.0000331 beta[4] -0.001792672 0.018702992 
2105.607 0.9998709 beta[5] -0.027204509 -0.007036234 1766.527 1.0001722 cbeta0 0.074176438 0.077529500 2700.000 0.9999105 sigma 0.022074894 0.024931207 2571.130 1.0004836 beta0 -0.006626736 0.037656284 1778.158 1.0011292 log_lik[1] 2.854702141 3.027767860 2483.673 0.9998988 log_lik[2] 2.895481029 3.046566686 2455.982 0.9999209 log_lik[3] 2.723139669 2.933952105 2520.607 0.9994487 log_lik[4] 2.972367347 3.102625886 2630.032 1.0005403 log_lik[5] 2.826250193 2.983700261 2464.047 0.9997244 log_lik[6] 2.955006090 3.080157288 2634.777 1.0002897 log_lik[7] 2.694386076 2.885570014 2547.355 0.9994033 log_lik[8] 2.852596334 2.992124738 2673.204 0.9994858 log_lik[9] 2.015621018 2.348219252 2590.318 0.9992977 log_lik[10] 2.870571012 3.011641778 2528.873 0.9998255 log_lik[11] 2.780840934 2.935968260 2700.000 0.9993728 log_lik[12] 2.791360985 2.936695879 2700.000 0.9991006 log_lik[13] 1.914075113 2.262740896 2700.000 0.9992730 log_lik[14] 2.481417212 2.695023365 2700.000 0.9989629 log_lik[15] 2.504695666 2.736091080 2700.000 0.9989709 log_lik[16] 2.845698153 2.998183094 2700.000 0.9993760 log_lik[17] 2.929761685 3.048195661 2700.000 0.9996054 log_lik[18] 2.933501427 3.068342788 2625.968 0.9996591 log_lik[19] 2.917454089 3.057175764 2700.000 0.9994765 log_lik[20] 2.958389389 3.088677398 2582.152 0.9995303 log_lik[21] 2.339246000 2.712177499 2700.000 0.9991631 log_lik[22] 2.916649381 3.066762554 2700.000 0.9992182 log_lik[23] 2.659415848 2.913365312 2700.000 0.9994161 log_lik[24] 2.949364739 3.078818204 2700.000 0.9996608 log_lik[25] 2.923960888 3.063099634 2374.921 0.9992894 log_lik[26] 2.596365553 2.875991184 2429.334 0.9990705 log_lik[27] 2.877477863 3.028095503 2367.550 0.9991650 log_lik[28] 2.331170183 2.673624000 2687.962 0.9990338 log_lik[29] 2.763275814 2.961246284 2700.000 0.9990731 log_lik[30] 2.963759598 3.084771224 2533.514 0.9998388 log_lik[31] 2.979865899 3.100417988 2594.970 1.0002251 log_lik[32] 2.419334769 2.679983872 2664.186 0.9990859 log_lik[33] 2.989942674 3.104349686 2575.233 1.0003795 log_lik[34] 2.989942674 3.104349686 2575.233 1.0003795 log_lik[35] 2.851052734 2.995707580 2642.301 0.9994472 log_lik[36] 2.632252355 2.819382873 2565.699 0.9994201 log_lik[37] 2.994852604 3.109550116 2579.145 1.0004301 log_lik[38] 2.994345912 3.109560401 2575.990 1.0004253 log_lik[39] 2.592934426 2.801667081 2669.874 0.9992269 log_lik[40] 2.993273134 3.109223738 2605.128 1.0003846 log_lik[41] 2.959792583 3.082061406 2694.648 0.9998075 log_lik[42] 2.806068088 2.982019372 2700.000 0.9992635 log_lik[43] 2.959054769 3.085870780 2700.000 0.9997016 log_lik[44] 2.954728408 3.084005662 2700.000 0.9996167 log_lik[45] 2.964382735 3.087830485 2700.000 0.9997871 log_lik[46] 2.884750873 3.039018211 2207.563 1.0007124 log_lik[47] 2.673113795 2.900754398 2268.806 1.0009325 log_lik[48] 2.864892011 3.029914666 1918.029 1.0007769 log_lik[49] 2.564203145 2.851315200 2288.316 1.0009860 log_lik[50] 2.841307097 2.998910586 2477.156 1.0004831 log_lik[51] 2.590685805 2.803870352 2629.796 1.0003989 log_lik[52] 1.785915792 2.185808724 2700.000 1.0004247 log_lik[53] 2.786300204 2.939195545 2700.000 0.9999469 log_lik[54] 2.109215956 2.466082445 2660.573 1.0002311 log_lik[55] 2.951541567 3.081437035 2480.750 1.0001909 log_lik[56] 2.826197665 2.975459589 2687.924 0.9999721 log_lik[57] 1.883403557 2.264911382 2700.000 1.0004782 log_lik[58] 2.956303334 3.078375450 2591.634 0.9998096 log_lik[59] 2.973835591 3.091447248 2492.975 1.0004020 log_lik[60] 2.921379864 3.054351592 2700.000 1.0001861 log_lik[61] 2.371166175 2.625066849 
2678.834 0.9994100 log_lik[62] 2.967469714 3.092051497 2476.205 1.0004038 log_lik[63] -1.345084019 -0.151002795 2543.522 0.9998128 log_lik[64] 0.716672150 1.386447846 2666.733 1.0004144 log_lik[65] 2.954903918 3.080417715 2405.574 1.0002802 log_lik[66] 1.901290620 2.479112735 2336.044 1.0002470 log_lik[67] 2.299771383 2.682749103 2444.689 1.0001665 log_lik[68] 2.947306682 3.076343209 2323.648 1.0001059 log_lik[69] 2.455324591 2.901446564 2251.204 1.0005565 log_lik[70] 2.404917313 2.781645283 2684.253 0.9993194 log_lik[71] 2.945926717 3.076691532 2700.000 0.9994939 log_lik[72] 2.449956379 2.748148303 2359.331 0.9991587 lp__ 240.880868913 242.260342463 2476.783 1.0000042 $c_summary , , chains = chain:1 stats parameter mean sd 2.5% 25% 50% 75% beta[1] 0.04643878 0.012628851 0.02049142 0.03819352 0.047001702 0.054672036 beta[2] -0.00140730 0.038707451 -0.07566133 -0.02657238 -0.001206666 0.023579548 beta[3] 0.06144959 0.038584649 -0.01440674 0.03616748 0.062178000 0.086730348 beta[4] -0.01092832 0.014820893 -0.03970598 -0.01989798 -0.010928983 -0.001653754 beta[5] -0.03703277 0.014717248 -0.06681856 -0.04624826 -0.037645450 -0.027113155 cbeta0 0.07233788 0.002567254 0.06744530 0.07047948 0.072270528 0.074185837 sigma 0.02088342 0.001801060 0.01756746 0.01968722 0.020763381 0.022048136 beta0 -0.02752243 0.032411022 -0.09009245 -0.04939535 -0.028054042 -0.006953943 log_lik[1] 2.56619719 0.386706491 1.59400980 2.37328301 2.672661552 2.859236864 log_lik[2] 2.69769984 0.272845128 1.98620346 2.56874004 2.761938188 2.897393450 log_lik[3] 2.47594474 0.323693862 1.69910023 2.28717682 2.547969182 2.728703862 log_lik[4] 2.89727496 0.111618302 2.64978004 2.83629549 2.906427271 2.970347414 log_lik[5] 2.70914643 0.170436041 2.32168344 2.61641677 2.726831739 2.829876539 log_lik[6] 2.87563467 0.114646737 2.60723928 2.81059665 2.885256035 2.957128443 log_lik[7] 2.56457166 0.192918143 2.12902567 2.45110459 2.584004526 2.697265641 log_lik[8] 2.75010690 0.147052720 2.40386344 2.66460552 2.771962688 2.852821975 log_lik[9] 1.77134842 0.349560716 0.99003727 1.56550220 1.815295936 2.012138440 log_lik[10] 2.78006235 0.133421540 2.47801156 2.70383799 2.792799223 2.868294665 log_lik[11] 2.67475691 0.155625778 2.32203922 2.58018861 2.692522610 2.782818989 log_lik[12] 2.68498164 0.155023970 2.29977060 2.59330457 2.698320431 2.798590880 log_lik[13] 1.65985037 0.360418884 0.89068235 1.46004939 1.697094096 1.914231916 log_lik[14] 2.31487288 0.235761992 1.77381209 2.17536525 2.334570894 2.487175516 log_lik[15] 2.33468537 0.244450693 1.79728694 2.18363611 2.363513939 2.507809026 log_lik[16] 2.74531350 0.146942403 2.40098995 2.66206681 2.759728108 2.845661447 log_lik[17] 2.84473866 0.121651793 2.56637788 2.77018750 2.855964719 2.930368125 log_lik[18] 2.84836050 0.124991501 2.56720774 2.77502286 2.860786724 2.930722856 log_lik[19] 2.82067855 0.138656474 2.49797805 2.74079294 2.836743786 2.914811821 log_lik[20] 2.87160993 0.127822155 2.55664162 2.80458157 2.887362248 2.956733309 log_lik[21] 2.02326072 0.442043884 1.01995598 1.76598398 2.091522685 2.337897774 log_lik[22] 2.78483581 0.191962506 2.29043026 2.70465406 2.825989913 2.914067188 log_lik[23] 2.40144099 0.360625988 1.52170229 2.21428410 2.454859425 2.663272843 log_lik[24] 2.83878972 0.171493897 2.36936858 2.77060961 2.870977392 2.951985745 log_lik[25] 2.76133288 0.244915410 2.08467745 2.66901967 2.814460740 2.927555797 log_lik[26] 2.33706705 0.373495823 1.40039185 2.14087735 2.401893557 2.586771140 log_lik[27] 2.72367597 0.217117568 2.14375907 2.62772142 2.759618192 
2.880211910 log_lik[28] 2.07252272 0.377822224 1.22616249 1.87548557 2.126163599 2.335521250 log_lik[29] 2.57746553 0.270483601 1.91450699 2.46143195 2.616645250 2.757665637 log_lik[30] 2.88675406 0.110410301 2.64244539 2.82040964 2.894386823 2.965726700 log_lik[31] 2.90888211 0.105058582 2.68203084 2.85311175 2.915896937 2.980100965 log_lik[32] 2.21823719 0.281967292 1.59731029 2.06529730 2.244658537 2.418590396 log_lik[33] 2.92415553 0.094569850 2.73343181 2.86704316 2.927491276 2.989958024 log_lik[34] 2.92415553 0.094569850 2.73343181 2.86704316 2.927491276 2.989958024 log_lik[35] 2.75061357 0.145843757 2.43537000 2.66769711 2.765627025 2.848613027 log_lik[36] 2.49607600 0.198002437 2.04899646 2.38959501 2.512395924 2.637258523 log_lik[37] 2.93143450 0.090363117 2.75250085 2.87302732 2.934560232 2.994320317 log_lik[38] 2.93030317 0.090979234 2.74790741 2.87207759 2.932461862 2.992971635 log_lik[39] 2.43531206 0.213065398 1.98449071 2.30651265 2.453205797 2.588139651 log_lik[40] 2.92917402 0.091618923 2.74163152 2.87227354 2.933409764 2.992376304 log_lik[41] 2.87407193 0.128092826 2.57449305 2.80683064 2.888796562 2.955843418 log_lik[42] 2.61463105 0.254516015 2.01089367 2.49005984 2.656709687 2.797789416 log_lik[43] 2.86815347 0.141087264 2.53640461 2.80010385 2.885622298 2.967183895 log_lik[44] 2.85616001 0.152884332 2.47186398 2.78796614 2.878885033 2.960758216 log_lik[45] 2.87559553 0.138250730 2.56917960 2.81058096 2.890960303 2.963947123 log_lik[46] 2.69008281 0.283917921 1.96586691 2.57628890 2.763585968 2.887108630 log_lik[47] 2.43747472 0.328414163 1.65197874 2.26339289 2.486732406 2.676129202 log_lik[48] 2.69839958 0.236417416 2.17502125 2.58295372 2.745340511 2.861814041 log_lik[49] 2.27883673 0.386355982 1.45222305 2.06555808 2.343264621 2.557833239 log_lik[50] 2.71678324 0.179708493 2.32029715 2.61505324 2.742927293 2.842417539 log_lik[51] 2.42328589 0.242035474 1.86492726 2.28534136 2.445765955 2.598111981 log_lik[52] 1.51516071 0.405765066 0.63754932 1.24500374 1.561765287 1.811060194 log_lik[53] 2.66809741 0.165060831 2.31515588 2.56318601 2.684335816 2.788183906 log_lik[54] 1.84906107 0.357617969 1.07283639 1.63513126 1.878543706 2.100243272 log_lik[55] 2.87035773 0.125928891 2.56220684 2.81045538 2.887266935 2.951675898 log_lik[56] 2.71733895 0.155961154 2.35872958 2.63572326 2.737815662 2.825749716 log_lik[57] 1.61918583 0.393637151 0.74655674 1.37246160 1.661039538 1.904720334 log_lik[58] 2.88490353 0.104276129 2.64891234 2.82659054 2.895265125 2.953385111 log_lik[59] 2.90174367 0.103735279 2.68693494 2.83570346 2.904555372 2.973073249 log_lik[60] 2.83763561 0.121571323 2.58481459 2.76487096 2.840841214 2.926235407 log_lik[61] 2.16193728 0.280500856 1.57032379 1.98537353 2.188761272 2.363179124 log_lik[62] 2.89219948 0.107328018 2.67744335 2.83274848 2.896589033 2.967400958 log_lik[63] -2.10034498 1.041646594 -4.24981910 -2.73987793 -2.014045595 -1.380158421 log_lik[64] 0.27561537 0.665348554 -1.20843591 -0.15119014 0.330771393 0.725514176 log_lik[65] 2.86342990 0.134861235 2.55853032 2.79991943 2.876403988 2.951796612 log_lik[66] 1.33790401 0.735276393 -0.38787265 0.92206097 1.431817388 1.869504079 log_lik[67] 1.97541155 0.463248298 0.98445109 1.70116814 2.043817962 2.308607555 log_lik[68] 2.82591211 0.188062849 2.33717742 2.75523581 2.863449237 2.944952099 log_lik[69] 1.88260766 0.782023975 0.09272054 1.50201925 2.030936022 2.469296567 log_lik[70] 2.03136356 0.502394575 0.76271761 1.72487556 2.107985446 2.397895212 log_lik[71] 2.83371694 0.175854009 2.36938926 
2.76275895 2.868649328 2.955230203 log_lik[72] 2.22923323 0.331626335 1.42320658 2.05182700 2.271105437 2.446813604 lp__ 239.44026955 2.010619423 234.67689436 238.46085590 239.797177966 240.872201552 stats parameter 97.5% beta[1] 0.070997930 beta[2] 0.073470942 beta[3] 0.135432823 beta[4] 0.017911495 beta[5] -0.006989883 cbeta0 0.077166216 sigma 0.024781110 beta0 0.036654853 log_lik[1] 3.035155201 log_lik[2] 3.058167044 log_lik[3] 2.914011898 log_lik[4] 3.087978727 log_lik[5] 2.996974110 log_lik[6] 3.070112311 log_lik[7] 2.889723644 log_lik[8] 2.988011154 log_lik[9] 2.346405109 log_lik[10] 3.015262487 log_lik[11] 2.935126827 log_lik[12] 2.934023523 log_lik[13] 2.262333605 log_lik[14] 2.697797765 log_lik[15] 2.735874665 log_lik[16] 2.998391043 log_lik[17] 3.039732820 log_lik[18] 3.068541307 log_lik[19] 3.056813475 log_lik[20] 3.088247309 log_lik[21] 2.719985017 log_lik[22] 3.056647660 log_lik[23] 2.918513926 log_lik[24] 3.070462984 log_lik[25] 3.058959782 log_lik[26] 2.870613272 log_lik[27] 3.021389854 log_lik[28] 2.663906666 log_lik[29] 2.957058564 log_lik[30] 3.081060162 log_lik[31] 3.099254939 log_lik[32] 2.662048937 log_lik[33] 3.102196023 log_lik[34] 3.102196023 log_lik[35] 3.000591811 log_lik[36] 2.810342407 log_lik[37] 3.102119633 log_lik[38] 3.102970987 log_lik[39] 2.792410045 log_lik[40] 3.105380151 log_lik[41] 3.087018410 log_lik[42] 2.973726612 log_lik[43] 3.071301206 log_lik[44] 3.072751994 log_lik[45] 3.093960956 log_lik[46] 3.023221570 log_lik[47] 2.887093200 log_lik[48] 3.018906758 log_lik[49] 2.855439411 log_lik[50] 3.002832613 log_lik[51] 2.803885796 log_lik[52] 2.193960653 log_lik[53] 2.939089632 log_lik[54] 2.466902790 log_lik[55] 3.072813188 log_lik[56] 2.959424332 log_lik[57] 2.264911382 log_lik[58] 3.066635705 log_lik[59] 3.087659949 log_lik[60] 3.045881107 log_lik[61] 2.622928278 log_lik[62] 3.082103318 log_lik[63] -0.240078709 log_lik[64] 1.404206011 log_lik[65] 3.074284684 log_lik[66] 2.444088025 log_lik[67] 2.705141768 log_lik[68] 3.070300109 log_lik[69] 2.907808508 log_lik[70] 2.770486038 log_lik[71] 3.068707350 log_lik[72] 2.761004676 lp__ 242.332331928 , , chains = chain:2 stats parameter mean sd 2.5% 25% 50% 75% beta[1] 0.046573687 0.012968365 0.01922452 0.03835562 4.661458e-02 0.055590980 beta[2] -0.000679594 0.040203025 -0.07731360 -0.02586551 5.623988e-04 0.024419305 beta[3] 0.061865889 0.039153730 -0.01625011 0.03530993 6.245849e-02 0.088713322 beta[4] -0.011300228 0.015393923 -0.04019140 -0.02072802 -1.139336e-02 -0.001914545 beta[5] -0.037263246 0.014880247 -0.06671254 -0.04719578 -3.782046e-02 -0.027202808 cbeta0 0.072487794 0.002452994 0.06759219 0.07086659 7.248487e-02 0.074086053 sigma 0.020822161 0.001911947 0.01719854 0.01954794 2.080912e-02 0.021998851 beta0 -0.027567000 0.033285659 -0.09202049 -0.04997486 -2.714554e-02 -0.006274045 log_lik[1] 2.555705813 0.403423762 1.48454389 2.35885186 2.666857e+00 2.857311348 log_lik[2] 2.692566472 0.282362063 1.98030750 2.55752521 2.773079e+00 2.895380478 log_lik[3] 2.488818904 0.314253339 1.73848551 2.30260169 2.550341e+00 2.718012127 log_lik[4] 2.904301284 0.114464995 2.66369293 2.83579996 2.910541e+00 2.973724689 log_lik[5] 2.708181240 0.165508750 2.34289436 2.60738262 2.726828e+00 2.825767111 log_lik[6] 2.882838378 0.116550615 2.64664631 2.81232773 2.891636e+00 2.959361921 log_lik[7] 2.562381074 0.185216998 2.15783856 2.44856916 2.578104e+00 2.694194514 log_lik[8] 2.757078048 0.143584590 2.44330464 2.67191527 2.772100e+00 2.858925630 log_lik[9] 1.772948432 0.336986219 1.01470755 1.56772350 1.812002e+00 
2.003120904 log_lik[10] 2.782270219 0.129327353 2.51908851 2.70783879 2.789442e+00 2.872401573 log_lik[11] 2.676493721 0.149165473 2.37028892 2.58833726 2.685417e+00 2.781310637 log_lik[12] 2.689346179 0.147548781 2.34898298 2.59842165 2.703917e+00 2.790301885 log_lik[13] 1.655117646 0.350212508 0.89035564 1.44754192 1.692204e+00 1.897835567 log_lik[14] 2.315735602 0.223570596 1.82856620 2.18855504 2.339754e+00 2.470968679 log_lik[15] 2.336257552 0.234659918 1.80962277 2.20220904 2.359357e+00 2.499221976 log_lik[16] 2.750067765 0.144799562 2.43936329 2.66846252 2.761700e+00 2.847352381 log_lik[17] 2.849237344 0.119040432 2.59302233 2.77905012 2.855412e+00 2.936732183 log_lik[18] 2.854873665 0.128383725 2.56594111 2.78634984 2.867374e+00 2.939324668 log_lik[19] 2.827445694 0.141079131 2.49274560 2.75050895 2.844707e+00 2.921729538 log_lik[20] 2.879222346 0.132562882 2.56239744 2.81618183 2.892368e+00 2.960501675 log_lik[21] 2.031276845 0.430813964 1.04375922 1.77254493 2.092124e+00 2.340981437 log_lik[22] 2.794210646 0.192898810 2.29975020 2.70992941 2.829427e+00 2.922225572 log_lik[23] 2.394610738 0.354565258 1.57687928 2.19509281 2.461157e+00 2.655566303 log_lik[24] 2.842089822 0.170187987 2.42147977 2.75955151 2.876108e+00 2.954115976 log_lik[25] 2.766117805 0.248386281 2.10687229 2.67176434 2.825790e+00 2.924098495 log_lik[26] 2.331365276 0.375907485 1.38641093 2.13389389 2.390509e+00 2.596039001 log_lik[27] 2.724524902 0.222618176 2.17864248 2.62328313 2.766477e+00 2.875725092 log_lik[28] 2.079046902 0.359767402 1.25986051 1.88166778 2.115781e+00 2.314875490 log_lik[29] 2.586032398 0.249523499 1.99385199 2.45626700 2.620800e+00 2.763710609 log_lik[30] 2.888643896 0.117231901 2.63984452 2.82861591 2.899498e+00 2.961617625 log_lik[31] 2.913703139 0.107853256 2.68263237 2.84652764 2.916332e+00 2.982253251 log_lik[32] 2.223400348 0.275288841 1.61533570 2.06813725 2.253168e+00 2.415660064 log_lik[33] 2.927525467 0.101472925 2.72150855 2.86208301 2.929968e+00 2.992745671 log_lik[34] 2.927525467 0.101472925 2.72150855 2.86208301 2.929968e+00 2.992745671 log_lik[35] 2.755647376 0.147836897 2.41523010 2.67709073 2.772998e+00 2.855836564 log_lik[36] 2.487393541 0.206046024 1.98245468 2.36167166 2.510462e+00 2.632465902 log_lik[37] 2.933274090 0.098615500 2.74204186 2.86895264 2.934076e+00 2.997768181 log_lik[38] 2.932564812 0.099111750 2.73933259 2.86591848 2.934297e+00 2.996539613 log_lik[39] 2.437828466 0.228237669 1.90492161 2.29833236 2.456494e+00 2.601064487 log_lik[40] 2.929237732 0.099076377 2.73652812 2.86610808 2.928029e+00 2.993847562 log_lik[41] 2.872977576 0.137155080 2.53364805 2.80200486 2.891932e+00 2.961984770 log_lik[42] 2.613639764 0.273410265 1.94461724 2.46231211 2.656621e+00 2.815443167 log_lik[43] 2.863288232 0.147422553 2.49647431 2.79417469 2.882816e+00 2.954996633 log_lik[44] 2.850717739 0.159741131 2.43437337 2.77547577 2.877347e+00 2.949145748 log_lik[45] 2.872785300 0.143342279 2.52580455 2.80503759 2.892572e+00 2.962509449 log_lik[46] 2.685380832 0.292611053 1.93400117 2.55999152 2.750716e+00 2.886388763 log_lik[47] 2.427475556 0.333365670 1.63538610 2.24720149 2.475169e+00 2.674218088 log_lik[48] 2.703845523 0.238349124 2.11995209 2.58779645 2.756249e+00 2.868074998 log_lik[49] 2.284398132 0.394152205 1.31242504 2.05321112 2.354677e+00 2.571721537 log_lik[50] 2.725743277 0.175421428 2.32753495 2.62855619 2.751185e+00 2.842482819 log_lik[51] 2.410985515 0.243619157 1.83048111 2.27471731 2.440028e+00 2.582200688 log_lik[52] 1.484840583 0.407022483 0.58639078 
1.23938840 1.509267e+00 1.760637242 log_lik[53] 2.679988030 0.157482077 2.34833440 2.57916554 2.699494e+00 2.793792204 log_lik[54] 1.860775940 0.356882336 1.03674010 1.63799099 1.898505e+00 2.115788390 log_lik[55] 2.870298842 0.134041249 2.56742267 2.80068961 2.882107e+00 2.952984959 log_lik[56] 2.711573351 0.159903051 2.33805445 2.62600057 2.723940e+00 2.823512093 log_lik[57] 1.591186467 0.394174023 0.73181106 1.35362082 1.613820e+00 1.866826900 log_lik[58] 2.884902478 0.110680249 2.63966371 2.82448235 2.888689e+00 2.957854051 log_lik[59] 2.909263246 0.105483239 2.69454185 2.84375911 2.913201e+00 2.975829342 log_lik[60] 2.848098240 0.120460802 2.60433367 2.77054882 2.859614e+00 2.924853777 log_lik[61] 2.179722887 0.267678945 1.63529342 2.01854716 2.206163e+00 2.368992763 log_lik[62] 2.900948499 0.110290993 2.67519424 2.83559986 2.908828e+00 2.972706274 log_lik[63] -2.089037883 1.055614388 -4.46485297 -2.74823615 -1.978084e+00 -1.388993058 log_lik[64] 0.217412591 0.691331758 -1.30244131 -0.19300339 2.577232e-01 0.718684971 log_lik[65] 2.873034012 0.138028791 2.57703490 2.80208973 2.890364e+00 2.958817045 log_lik[66] 1.365285258 0.724725875 -0.19645973 0.93482568 1.448547e+00 1.898831201 log_lik[67] 1.943082513 0.468814524 0.92274043 1.65337128 2.000942e+00 2.292893347 log_lik[68] 2.835324941 0.191329297 2.38977540 2.75220845 2.876349e+00 2.954466999 log_lik[69] 1.837905780 0.800885947 0.03459476 1.41330361 2.013041e+00 2.437525180 log_lik[70] 2.011678229 0.546238574 0.75603394 1.71100922 2.105416e+00 2.420740559 log_lik[71] 2.827298672 0.183832251 2.34311443 2.74559818 2.865355e+00 2.942688040 log_lik[72] 2.219976843 0.331687049 1.45466437 2.03431183 2.269058e+00 2.444227358 lp__ 239.409255794 2.010666043 234.21417225 238.40189389 2.397621e+02 240.927403580 stats parameter 97.5% beta[1] 0.071992668 beta[2] 0.073293156 beta[3] 0.141115378 beta[4] 0.019578524 beta[5] -0.007556109 cbeta0 0.077377172 sigma 0.024884828 beta0 0.039992168 log_lik[1] 3.026332813 log_lik[2] 3.047029543 log_lik[3] 2.936209844 log_lik[4] 3.118233931 log_lik[5] 2.975066210 log_lik[6] 3.095420634 log_lik[7] 2.872583557 log_lik[8] 2.988620973 log_lik[9] 2.327503062 log_lik[10] 3.010633233 log_lik[11] 2.928746939 log_lik[12] 2.931229033 log_lik[13] 2.219096585 log_lik[14] 2.685523533 log_lik[15] 2.739687069 log_lik[16] 2.991905960 log_lik[17] 3.061505579 log_lik[18] 3.070723784 log_lik[19] 3.060320927 log_lik[20] 3.094216870 log_lik[21] 2.706689967 log_lik[22] 3.069935435 log_lik[23] 2.914197728 log_lik[24] 3.096854114 log_lik[25] 3.072929553 log_lik[26] 2.874959521 log_lik[27] 3.027264044 log_lik[28] 2.673131999 log_lik[29] 2.954345189 log_lik[30] 3.091642472 log_lik[31] 3.112538394 log_lik[32] 2.674622119 log_lik[33] 3.120917517 log_lik[34] 3.120917517 log_lik[35] 2.991665016 log_lik[36] 2.821154493 log_lik[37] 3.123357094 log_lik[38] 3.121133963 log_lik[39] 2.820126271 log_lik[40] 3.112857625 log_lik[41] 3.092796353 log_lik[42] 3.000980740 log_lik[43] 3.093727037 log_lik[44] 3.092411004 log_lik[45] 3.094175887 log_lik[46] 3.048142404 log_lik[47] 2.897845283 log_lik[48] 3.034282903 log_lik[49] 2.857102923 log_lik[50] 3.005570431 log_lik[51] 2.803074764 log_lik[52] 2.164799968 log_lik[53] 2.945753353 log_lik[54] 2.481880445 log_lik[55] 3.091733647 log_lik[56] 2.979217346 log_lik[57] 2.244341122 log_lik[58] 3.084134002 log_lik[59] 3.107411797 log_lik[60] 3.063353061 log_lik[61] 2.617976030 log_lik[62] 3.107239376 log_lik[63] -0.104810738 log_lik[64] 1.361214817 log_lik[65] 3.101222371 log_lik[66] 2.505238319 log_lik[67] 
2.673459163 log_lik[68] 3.088420640 log_lik[69] 2.888233643 log_lik[70] 2.812263455 log_lik[71] 3.088213914 log_lik[72] 2.745370189 lp__ 242.154798132 , , chains = chain:3 stats parameter mean sd 2.5% 25% 50% 75% beta[1] 4.687192e-02 0.012603092 0.02224220 0.03867635 0.047139790 0.054893417 beta[2] -1.283318e-04 0.038763274 -0.07783005 -0.02446199 0.000084502 0.023693054 beta[3] 6.235675e-02 0.039868629 -0.01668995 0.03842740 0.061796315 0.088541641 beta[4] -1.144154e-02 0.014789280 -0.04132340 -0.02073492 -0.011456795 -0.001800738 beta[5] -3.740335e-02 0.015170379 -0.06604108 -0.04750805 -0.037285619 -0.027253067 cbeta0 7.251963e-02 0.002613811 0.06749783 0.07076701 0.072521665 0.074274167 sigma 2.096193e-02 0.001901251 0.01773725 0.01956603 0.020908996 0.022155558 beta0 -2.842157e-02 0.032212332 -0.08978096 -0.04848694 -0.029013968 -0.007174124 log_lik[1] 2.543255e+00 0.445648129 1.35643247 2.38973617 2.668732449 2.848637326 log_lik[2] 2.679340e+00 0.318002708 1.83242494 2.57231879 2.767129757 2.891120727 log_lik[3] 2.479547e+00 0.327324202 1.66949962 2.31774815 2.533466420 2.720535050 log_lik[4] 2.891291e+00 0.123267181 2.61890325 2.82533162 2.903477493 2.971991740 log_lik[5] 2.698571e+00 0.186173125 2.27612565 2.61092779 2.729292028 2.824701722 log_lik[6] 2.872157e+00 0.120721951 2.60140898 2.80731748 2.886327032 2.950186436 log_lik[7] 2.554756e+00 0.205521180 2.08956405 2.45378128 2.586371941 2.691147705 log_lik[8] 2.749841e+00 0.149942242 2.39220996 2.67530446 2.766726334 2.848744030 log_lik[9] 1.782544e+00 0.358691448 0.96038763 1.57342869 1.801725763 2.028949554 log_lik[10] 2.772485e+00 0.143839111 2.46159142 2.69159289 2.793381958 2.871859762 log_lik[11] 2.667849e+00 0.163393600 2.32331294 2.58386800 2.689725491 2.778114534 log_lik[12] 2.685829e+00 0.153970898 2.33092385 2.60561674 2.700734799 2.788799366 log_lik[13] 1.670250e+00 0.368810375 0.81809852 1.45175679 1.697262192 1.924279780 log_lik[14] 2.319557e+00 0.236269756 1.77811673 2.18246032 2.333584566 2.484271179 log_lik[15] 2.330679e+00 0.242191082 1.77962269 2.18436460 2.361122876 2.501892752 log_lik[16] 2.740925e+00 0.152790818 2.38029758 2.65480512 2.757621960 2.843398770 log_lik[17] 2.843539e+00 0.119788649 2.60376944 2.77674104 2.854694013 2.921465129 log_lik[18] 2.845433e+00 0.132216436 2.54584447 2.77123839 2.854735799 2.931767124 log_lik[19] 2.818036e+00 0.144849657 2.48360608 2.73542575 2.830595022 2.914239450 log_lik[20] 2.870244e+00 0.133116362 2.55836437 2.80230385 2.884702813 2.958389389 log_lik[21] 2.026432e+00 0.426974261 1.06753753 1.76148862 2.075696186 2.334955456 log_lik[22] 2.784688e+00 0.193502497 2.29078868 2.69561585 2.821072439 2.914967101 log_lik[23] 2.402272e+00 0.354735536 1.53525342 2.21494043 2.463363730 2.664450834 log_lik[24] 2.837859e+00 0.166489089 2.43289367 2.76790740 2.867090885 2.942032049 log_lik[25] 2.761416e+00 0.232218915 2.16651161 2.65500375 2.818873499 2.922452774 log_lik[26] 2.334425e+00 0.367239856 1.43915107 2.12329551 2.406861297 2.610997989 log_lik[27] 2.720999e+00 0.215406481 2.18526727 2.60539194 2.764038407 2.877389917 log_lik[28] 2.082459e+00 0.356254341 1.30625966 1.85106525 2.123343001 2.347841754 log_lik[29] 2.582251e+00 0.246605185 1.99841557 2.44865824 2.614345599 2.765280355 log_lik[30] 2.882860e+00 0.115999558 2.62788423 2.81367354 2.894781170 2.963137365 log_lik[31] 2.907466e+00 0.101980319 2.70196050 2.84611255 2.911904471 2.977204132 log_lik[32] 2.226680e+00 0.269593263 1.63323880 2.06213572 2.246627350 2.423137703 log_lik[33] 2.921720e+00 0.096980106 2.72719266 
2.85846907 2.925002858 2.988224740 log_lik[34] 2.921720e+00 0.096980106 2.72719266 2.85846907 2.925002858 2.988224740 log_lik[35] 2.752319e+00 0.139375641 2.45220917 2.66904890 2.767407925 2.850811976 log_lik[36] 2.487794e+00 0.198335059 2.06437099 2.37395622 2.506328349 2.625296854 log_lik[37] 2.927989e+00 0.095929992 2.73202238 2.86684606 2.929645461 2.993729564 log_lik[38] 2.927129e+00 0.095941733 2.73326851 2.86616420 2.927792276 2.993193641 log_lik[39] 2.441287e+00 0.217589122 1.91279392 2.31437840 2.476516993 2.590618562 log_lik[40] 2.924636e+00 0.097867797 2.71596945 2.86192363 2.926170520 2.992888492 log_lik[41] 2.871686e+00 0.130185605 2.56676868 2.80296213 2.887305574 2.962437543 log_lik[42] 2.618054e+00 0.259268166 1.97570786 2.48792198 2.670616488 2.808522364 log_lik[43] 2.860988e+00 0.141985036 2.51742669 2.78987478 2.882226215 2.954328525 log_lik[44] 2.848723e+00 0.153247799 2.46276145 2.77345460 2.875510662 2.949810676 log_lik[45] 2.872298e+00 0.136415664 2.56090920 2.80057521 2.892332344 2.966459442 log_lik[46] 2.693543e+00 0.263516998 2.00993637 2.57235205 2.755530615 2.880694694 log_lik[47] 2.440828e+00 0.312830395 1.75240830 2.25904706 2.490531141 2.665107608 log_lik[48] 2.695712e+00 0.241731074 2.08650310 2.59475265 2.739669276 2.859903144 log_lik[49] 2.276663e+00 0.396646685 1.30603706 2.07779517 2.335814769 2.560675260 log_lik[50] 2.715804e+00 0.176903798 2.28948005 2.62036002 2.736187641 2.836878428 log_lik[51] 2.421584e+00 0.237196957 1.89738930 2.27718525 2.447709815 2.593513923 log_lik[52] 1.513972e+00 0.388635526 0.67569262 1.27511861 1.557145771 1.781665415 log_lik[53] 2.669327e+00 0.159077357 2.27953195 2.57883574 2.678221201 2.780516762 log_lik[54] 1.854605e+00 0.368501379 1.02196362 1.63704496 1.883634141 2.119129762 log_lik[55] 2.868364e+00 0.125936387 2.56736765 2.80216209 2.882735413 2.950846082 log_lik[56] 2.713792e+00 0.159650706 2.35490186 2.61368570 2.735357061 2.827847699 log_lik[57] 1.618631e+00 0.377404851 0.79196877 1.38615520 1.660955990 1.877974517 log_lik[58] 2.880732e+00 0.112270531 2.63823331 2.80849040 2.885498452 2.959566422 log_lik[59] 2.899855e+00 0.100322020 2.68656831 2.83802576 2.902549832 2.972149943 log_lik[60] 2.837540e+00 0.115255717 2.57048769 2.77478079 2.841990214 2.914173713 log_lik[61] 2.172053e+00 0.285151335 1.56670605 1.98080320 2.206015395 2.383092413 log_lik[62] 2.890959e+00 0.103594113 2.66722047 2.82970932 2.896851054 2.965754884 log_lik[63] -2.051545e+00 1.127862154 -4.48890840 -2.74414429 -1.920156284 -1.241284998 log_lik[64] 2.658952e-01 0.638058407 -1.14200598 -0.14728861 0.329610001 0.713418734 log_lik[65] 2.863370e+00 0.129342612 2.55939992 2.79486933 2.877800212 2.951068869 log_lik[66] 1.370036e+00 0.746314536 -0.27054195 0.93559717 1.461945778 1.951360927 log_lik[67] 1.957931e+00 0.454997252 0.98142421 1.67520242 2.000501556 2.306393491 log_lik[68] 2.827101e+00 0.177525467 2.35745483 2.75316732 2.859642402 2.943337063 log_lik[69] 1.853766e+00 0.775452868 0.13718003 1.41781292 1.995595237 2.456305549 log_lik[70] 2.018212e+00 0.503666454 0.85765690 1.74248777 2.088647125 2.386342436 log_lik[71] 2.825837e+00 0.175513384 2.36819786 2.74830371 2.854341721 2.941178275 log_lik[72] 2.224492e+00 0.326780516 1.46098246 2.03768262 2.272001753 2.459073299 lp__ 2.393557e+02 2.074657630 234.39765993 238.34523662 239.750981231 240.847796260 stats parameter 97.5% beta[1] 0.071101532 beta[2] 0.078602922 beta[3] 0.140317772 beta[4] 0.018162768 beta[5] -0.007173446 cbeta0 0.077935288 sigma 0.025175510 beta0 0.036585080 log_lik[1] 
3.020214142 log_lik[2] 3.039543731 log_lik[3] 2.944433765 log_lik[4] 3.095518023 log_lik[5] 2.981648625 log_lik[6] 3.065689222 log_lik[7] 2.889227799 log_lik[8] 3.000572582 log_lik[9] 2.378855814 log_lik[10] 3.009334319 log_lik[11] 2.939300034 log_lik[12] 2.945842961 log_lik[13] 2.308624793 log_lik[14] 2.699801767 log_lik[15] 2.731453705 log_lik[16] 2.994657001 log_lik[17] 3.045674943 log_lik[18] 3.065330036 log_lik[19] 3.056970236 log_lik[20] 3.081258086 log_lik[21] 2.707667044 log_lik[22] 3.067973384 log_lik[23] 2.906512509 log_lik[24] 3.066921951 log_lik[25] 3.057859099 log_lik[26] 2.883942507 log_lik[27] 3.034807944 log_lik[28] 2.673624000 log_lik[29] 2.972272327 log_lik[30] 3.082047715 log_lik[31] 3.088764392 log_lik[32] 2.687800554 log_lik[33] 3.092161010 log_lik[34] 3.092161010 log_lik[35] 2.989691164 log_lik[36] 2.824570429 log_lik[37] 3.100103357 log_lik[38] 3.097497426 log_lik[39] 2.791071280 log_lik[40] 3.097253257 log_lik[41] 3.074381737 log_lik[42] 2.977064743 log_lik[43] 3.095820345 log_lik[44] 3.092828391 log_lik[45] 3.076685914 log_lik[46] 3.039502292 log_lik[47] 2.907380515 log_lik[48] 3.028609563 log_lik[49] 2.837183843 log_lik[50] 2.985369808 log_lik[51] 2.805323449 log_lik[52] 2.217762713 log_lik[53] 2.932874831 log_lik[54] 2.444394038 log_lik[55] 3.071198074 log_lik[56] 2.986448837 log_lik[57] 2.291878056 log_lik[58] 3.070533537 log_lik[59] 3.084007544 log_lik[60] 3.047210898 log_lik[61] 2.627840652 log_lik[62] 3.083295756 log_lik[63] -0.135317133 log_lik[64] 1.394156822 log_lik[65] 3.074725146 log_lik[66] 2.480596310 log_lik[67] 2.685455658 log_lik[68] 3.069122790 log_lik[69] 2.911229857 log_lik[70] 2.775577287 log_lik[71] 3.080374891 log_lik[72] 2.729579098 lp__ 242.277502989
library(broom) tidyMCMC(constable.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)
term estimate std.error conf.low conf.high rhat ess 1 (Intercept) -0.02662339 0.030640340 -0.08652069 0.032339963 0.9990184 2266 2 I(IV^(1/3)) 0.04617694 0.011955033 0.02348520 0.069677622 0.9990403 2279 3 TREATInitial -0.00262214 0.037692592 -0.07602074 0.073892281 0.9990874 2277 4 TREATLow 0.06178365 0.037596699 -0.01092534 0.133310013 0.9989826 2298 5 I(IV^(1/3)):TREATInitial -0.01050189 0.014390980 -0.03897242 0.018017304 0.9991925 2267 6 I(IV^(1/3)):TREATLow -0.03725927 0.014237953 -0.06442732 -0.009792636 0.9990557 2308 7 sigma 0.02083855 0.001867566 0.01735171 0.024641950 0.9999474 2569 8 mean_PPD 0.07249442 0.003450626 0.06591716 0.079287256 0.9993001 2598 9 log-posterior 169.80857221 1.999673060 165.95405422 172.982238703 1.0003394 2513
constable.mcmc = as.matrix(constable.rstanarm) str(constable.mcmc)
num [1:2700, 1:7] -0.00323 -0.08829 -0.03744 0.04064 -0.03022 ... - attr(*, "dimnames")=List of 2 ..$ iterations: NULL ..$ parameters: chr [1:7] "(Intercept)" "I(IV^(1/3))" "TREATInitial" "TREATLow" ...
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
terms = attr(Xmat, "assign")
## Effect of each individual coefficient
for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))
[1] 0.3711111 [1] 0.0003703704 [1] 0.9322222 [1] 0.09259259 [1] 0.4592593 [1] 0.01111111
## Effect of covariate
mcmcpvalue(constable.mcmc[, which(terms == 1)])
[1] 0.0003703704
## Effect of treatment (marginal)
mcmcpvalue(constable.mcmc[, which(terms == 2)])
[1] 0.08888889
## Effect of interaction
mcmcpvalue(constable.mcmc[, which(terms == 3)])
[1] 0.01555556
## Overall model
wch = grep("TREAT|IV", colnames(constable.mcmc))
mcmcpvalue(constable.mcmc[, wch])
[1] 0
## Frequentist for comparison
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))
Call: lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable) Residuals: Min 1Q Median 3Q Max -0.047764 -0.012727 0.001153 0.012959 0.065303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0282017 0.0307338 -0.918 0.362163 I(IV^(1/3)) 0.0467744 0.0119981 3.898 0.000229 *** TREATInitial -0.0005899 0.0380584 -0.015 0.987680 TREATLow 0.0630730 0.0373498 1.689 0.095994 . I(IV^(1/3)):TREATInitial -0.0113263 0.0145247 -0.780 0.438302 I(IV^(1/3)):TREATLow -0.0377210 0.0141608 -2.664 0.009700 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.02048 on 66 degrees of freedom Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576 F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
summary(constable.brm)
Family: gaussian(identity) Formula: SUTW ~ I(IV^(1/3)) * TREAT Data: constable (Number of observations: 72) Samples: 3 chains, each with iter = 5000; warmup = 500; thin = 5; total post-warmup samples = 2700 ICs: LOO = NA; WAIC = NA; R2 = NA Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept -0.03 0.03 -0.09 0.03 2365 1 IIVE1D3 0.05 0.01 0.02 0.07 2361 1 TREATInitial 0.00 0.04 -0.07 0.07 2375 1 TREATLow 0.06 0.04 -0.01 0.14 2432 1 IIVE1D3:TREATInitial -0.01 0.01 -0.04 0.02 2369 1 IIVE1D3:TREATLow -0.04 0.01 -0.07 -0.01 2417 1 Family Specific Parameters: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat sigma 0.02 0 0.02 0.02 2583 1 Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat = 1).
library(broom) tidyMCMC(constable.brm$fit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)
term estimate std.error conf.low conf.high rhat ess 1 b_Intercept -0.0280889091 0.030786671 -0.08771759 0.030012311 1.0014465 2365 2 b_IIVE1D3 0.0467122190 0.012000835 0.02321772 0.068893899 1.0012960 2361 3 b_TREATInitial -0.0009689161 0.038416903 -0.07024299 0.077346297 1.0008347 2375 4 b_TREATLow 0.0629078547 0.038275435 -0.01974641 0.129499738 1.0000511 2432 5 b_IIVE1D3:TREATInitial -0.0111653298 0.014644996 -0.04043846 0.016701214 1.0009657 2369 6 b_IIVE1D3:TREATLow -0.0376073240 0.014462735 -0.06485635 -0.008515056 1.0000974 2417 7 sigma 0.0209160338 0.001855845 0.01739081 0.024625223 0.9993802 2583
constable.mcmc = as.matrix(constable.rstanarm) str(constable.mcmc)
num [1:2700, 1:7] -0.00323 -0.08829 -0.03744 0.04064 -0.03022 ... - attr(*, "dimnames")=List of 2 ..$ iterations: NULL ..$ parameters: chr [1:7] "(Intercept)" "I(IV^(1/3))" "TREATInitial" "TREATLow" ...
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
terms = attr(Xmat, "assign")
## Effect of each individual coefficient
for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))
[1] 0.3711111 [1] 0.0003703704 [1] 0.9322222 [1] 0.09259259 [1] 0.4592593 [1] 0.01111111
## Effect of covariate
mcmcpvalue(constable.mcmc[, which(terms == 1)])
[1] 0.0003703704
## Effect of treatment (marginal)
mcmcpvalue(constable.mcmc[, which(terms == 2)])
[1] 0.08888889
## Effect of interaction
mcmcpvalue(constable.mcmc[, which(terms == 3)])
[1] 0.01555556
## Overall model
wch = grep("b_TREAT|IV", colnames(constable.mcmc))
mcmcpvalue(constable.mcmc[, wch])
[1] 0
## Frequentist for comparison
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))
Call: lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable) Residuals: Min 1Q Median 3Q Max -0.047764 -0.012727 0.001153 0.012959 0.065303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0282017 0.0307338 -0.918 0.362163 I(IV^(1/3)) 0.0467744 0.0119981 3.898 0.000229 *** TREATInitial -0.0005899 0.0380584 -0.015 0.987680 TREATLow 0.0630730 0.0373498 1.689 0.095994 . I(IV^(1/3)):TREATInitial -0.0113263 0.0145247 -0.780 0.438302 I(IV^(1/3)):TREATLow -0.0377210 0.0141608 -2.664 0.009700 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.02048 on 66 degrees of freedom Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576 F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
- Explore the range of initial volumes (IV) over which pairwise differences between treatments occur.
library(MCMCpack)
constable.mcmc = constable.mcmcpack
## It is easier to first determine the range of IV for which the treatment pairs do not differ;
## we can then subtract this interval from the full domain.
## Consider the three treatments (High, Initial, Low).
## Start by calculating the intercepts and slopes.
a.High = as.vector(constable.mcmc[, 1])
a.Initial = as.vector(constable.mcmc[, 3]) + a.High
a.Low = as.vector(constable.mcmc[, 4]) + a.High
b.High = as.vector(constable.mcmc[, 2])
b.Initial = as.vector(constable.mcmc[, 5]) + b.High
b.Low = as.vector(constable.mcmc[, 6]) + b.High
## Generate a function that finds the intersection of two lines
inter <- function(m1, a1, m2, a2) {
    data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
}
## We will also define the minimum and maximum of the domain over which to explore the range of initial volume.
minx = min(constable$IV)
maxx = max(constable$IV)
## Start with High vs Initial
intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
## Narrow this to the likely domain
intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")
intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))
lower upper 1 9.026002 25.59764
## Now High vs Low
intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))
lower upper 1 10.62578 47.5
## Finally Initial vs Low
intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))
lower upper 1 23.49624 47.5
constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
constable.mcmc = constable.mcmc[, wch]
## It is easier to first determine the range of IV for which the treatment pairs do not differ;
## we can then subtract this interval from the full domain.
## Consider the three treatments (High, Initial, Low).
## Start by calculating the intercepts and slopes.
a.High = as.vector(constable.mcmc[, 1])
a.Initial = as.vector(constable.mcmc[, 3]) + a.High
a.Low = as.vector(constable.mcmc[, 4]) + a.High
b.High = as.vector(constable.mcmc[, 2])
b.Initial = as.vector(constable.mcmc[, 5]) + b.High
b.Low = as.vector(constable.mcmc[, 6]) + b.High
## Generate a function that finds the intersection of two lines
inter <- function(m1, a1, m2, a2) {
    data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
}
## We will also define the minimum and maximum of the domain over which to explore the range of initial volume.
minx = min(constable$IV)
maxx = max(constable$IV)
## Start with High vs Initial
intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
## Narrow this to the likely domain
intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")
intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))
lower upper 1 8.816773 31.58686
## Now High vs Low
intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))
lower upper 1 10.39011 47.5
## Finally Initial vs Low
intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))
lower upper 1 22.97178 47.5
constable.mcmc = as.matrix(constable.rstan)
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
constable.mcmc = constable.mcmc[, wch]
## It is easier to first determine the range of IV for which the treatment pairs do not differ;
## we can then subtract this interval from the full domain.
## Consider the three treatments (High, Initial, Low).
## Start by calculating the intercepts and slopes.
a.High = as.vector(constable.mcmc[, 1])
a.Initial = as.vector(constable.mcmc[, 3]) + a.High
a.Low = as.vector(constable.mcmc[, 4]) + a.High
b.High = as.vector(constable.mcmc[, 2])
b.Initial = as.vector(constable.mcmc[, 5]) + b.High
b.Low = as.vector(constable.mcmc[, 6]) + b.High
## Generate a function that finds the intersection of two lines
inter <- function(m1, a1, m2, a2) {
    data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
}
## We will also define the minimum and maximum of the domain over which to explore the range of initial volume.
minx = min(constable$IV)
maxx = max(constable$IV)
## Start with High vs Initial
intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
## Narrow this to the likely domain
intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")
intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))
lower upper 1 9.191592 28.29357
## Now High vs Low
intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))
lower upper 1 10.52219 47.5
## Finally Initial vs Low
intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))
lower upper 1 23.28955 47.5
constable.mcmc = as.matrix(constable.rstan)
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
constable.mcmc = constable.mcmc[, wch]
## It is easier to first determine the range of IV for which the treatment pairs do not differ;
## we can then subtract this interval from the full domain.
## Consider the three treatments (High, Initial, Low).
## Start by calculating the intercepts and slopes.
a.High = as.vector(constable.mcmc[, 1])
a.Initial = as.vector(constable.mcmc[, 3]) + a.High
a.Low = as.vector(constable.mcmc[, 4]) + a.High
b.High = as.vector(constable.mcmc[, 2])
b.Initial = as.vector(constable.mcmc[, 5]) + b.High
b.Low = as.vector(constable.mcmc[, 6]) + b.High
## Generate a function that finds the intersection of two lines
inter <- function(m1, a1, m2, a2) {
    data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
}
## We will also define the minimum and maximum of the domain over which to explore the range of initial volume.
minx = min(constable$IV)
maxx = max(constable$IV)
## Start with High vs Initial
intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
## Narrow this to the likely domain
intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")
intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))
lower upper 1 9.191592 28.29357
## Now High vs Low
intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))
lower upper 1 10.52219 47.5
## Finally Initial vs Low
intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))
lower upper 1 23.28955 47.5
constable.mcmc = as.matrix(constable.brm)
wch = grep("b_", colnames(constable.mcmc))
constable.mcmc = constable.mcmc[, wch]
## It is easier to first determine the range of IV for which the treatment pairs do not differ;
## we can then subtract this interval from the full domain.
## Consider the three treatments (High, Initial, Low).
## Start by calculating the intercepts and slopes.
a.High = as.vector(constable.mcmc[, 1])
a.Initial = as.vector(constable.mcmc[, 3]) + a.High
a.Low = as.vector(constable.mcmc[, 4]) + a.High
b.High = as.vector(constable.mcmc[, 2])
b.Initial = as.vector(constable.mcmc[, 5]) + b.High
b.Low = as.vector(constable.mcmc[, 6]) + b.High
## Generate a function that finds the intersection of two lines
inter <- function(m1, a1, m2, a2) {
    data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
}
## We will also define the minimum and maximum of the domain over which to explore the range of initial volume.
minx = min(constable$IV)
maxx = max(constable$IV)
## Start with High vs Initial
intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
## Narrow this to the likely domain
intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")
intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))
lower upper 1 8.654387 31.48765
## Now High vs Low
intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))
lower upper 1 10.36942 47.5
## Finally Initial vs Low
intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))
lower upper 1 22.98159 47.5
- Generate a summary figure.
library(MCMCpack)
library(scales)  # for trans_new()
constable.mcmc = constable.mcmcpack
newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)
newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) +
    geom_point(data = constable, aes(y = SUTW, x = IV, shape = TREAT, fill = TREAT)) +
    geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1, 1], xend = comp.HvsI[1, 2])) +
    annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]), label = "High vs Initial") +
    geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1, 1], xend = comp.HvsL[1, 2])) +
    annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]), label = "High vs Low") +
    geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1, 2])) +
    annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
    geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) +
    scale_y_continuous("Suture width (mm)") +
    scale_x_continuous("Initial volume", trans = trans_new("root", function(x) x^(1/3), function(x) x^3, domain = c(0, Inf))) +
    scale_shape_manual("Food regime", values = c(21, 21, 21)) +
    scale_fill_manual("Food regime", values = c("black", "grey40", "white")) +
    scale_linetype_manual("Food regime", values = c("solid", "dashed", "dotted")) +
    theme_classic() +
    theme(legend.justification = c(0, 1), legend.position = c(0, 1), legend.key.width = unit(1, "cm"),
        axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2, size = rel(1.25)),
        plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)
newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) +
    geom_point(data = constable, aes(y = SUTW, x = IV, shape = TREAT, fill = TREAT)) +
    geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1, 1], xend = comp.HvsI[1, 2])) +
    annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]), label = "High vs Initial") +
    geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1, 1], xend = comp.HvsL[1, 2])) +
    annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]), label = "High vs Low") +
    geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1, 2])) +
    annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
    geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) +
    scale_y_continuous("Suture width (mm)") +
    scale_x_continuous("Initial volume", trans = trans_new("root", function(x) x^(1/3), function(x) x^3, domain = c(0, Inf))) +
    scale_shape_manual("Food regime", values = c(21, 21, 21)) +
    scale_fill_manual("Food regime", values = c("black", "grey40", "white")) +
    scale_linetype_manual("Food regime", values = c("solid", "dashed", "dotted")) +
    theme_classic() +
    theme(legend.justification = c(0, 1), legend.position = c(0, 1), legend.key.width = unit(1, "cm"),
        axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2, size = rel(1.25)),
        plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
constable.mcmc = as.matrix(constable.rstan)
newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)
newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) +
    geom_point(data = constable, aes(y = SUTW, x = IV, shape = TREAT, fill = TREAT)) +
    geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1, 1], xend = comp.HvsI[1, 2])) +
    annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]), label = "High vs Initial") +
    geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1, 1], xend = comp.HvsL[1, 2])) +
    annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]), label = "High vs Low") +
    geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1, 2])) +
    annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
    geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) +
    scale_y_continuous("Suture width (mm)") +
    scale_x_continuous("Initial volume", trans = trans_new("root", function(x) x^(1/3), function(x) x^3, domain = c(0, Inf))) +
    scale_shape_manual("Food regime", values = c(21, 21, 21)) +
    scale_fill_manual("Food regime", values = c("black", "grey40", "white")) +
    scale_linetype_manual("Food regime", values = c("solid", "dashed", "dotted")) +
    theme_classic() +
    theme(legend.justification = c(0, 1), legend.position = c(0, 1), legend.key.width = unit(1, "cm"),
        axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2, size = rel(1.25)),
        plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
constable.mcmc = as.matrix(constable.rstanarm)
newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)
newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) +
    geom_point(data = constable, aes(y = SUTW, x = IV, shape = TREAT, fill = TREAT)) +
    geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1, 1], xend = comp.HvsI[1, 2])) +
    annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]), label = "High vs Initial") +
    geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1, 1], xend = comp.HvsL[1, 2])) +
    annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]), label = "High vs Low") +
    geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1, 2])) +
    annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
    geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) +
    scale_y_continuous("Suture width (mm)") +
    scale_x_continuous("Initial volume", trans = trans_new("root", function(x) x^(1/3), function(x) x^3, domain = c(0, Inf))) +
    scale_shape_manual("Food regime", values = c(21, 21, 21)) +
    scale_fill_manual("Food regime", values = c("black", "grey40", "white")) +
    scale_linetype_manual("Food regime", values = c("solid", "dashed", "dotted")) +
    theme_classic() +
    theme(legend.justification = c(0, 1), legend.position = c(0, 1), legend.key.width = unit(1, "cm"),
        axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2, size = rel(1.25)),
        plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
constable.mcmc = as.matrix(constable.brm)
newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
wch = grep("b_", colnames(constable.mcmc))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)
newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) +
    geom_point(data = constable, aes(y = SUTW, x = IV, shape = TREAT, fill = TREAT)) +
    geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1, 1], xend = comp.HvsI[1, 2])) +
    annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]), label = "High vs Initial") +
    geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1, 1], xend = comp.HvsL[1, 2])) +
    annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]), label = "High vs Low") +
    geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1, 2])) +
    annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
    geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) +
    scale_y_continuous("Suture width (mm)") +
    scale_x_continuous("Initial volume", trans = trans_new("root", function(x) x^(1/3), function(x) x^3, domain = c(0, Inf))) +
    scale_shape_manual("Food regime", values = c(21, 21, 21)) +
    scale_fill_manual("Food regime", values = c("black", "grey40", "white")) +
    scale_linetype_manual("Food regime", values = c("solid", "dashed", "dotted")) +
    theme_classic() +
    theme(legend.justification = c(0, 1), legend.position = c(0, 1), legend.key.width = unit(1, "cm"),
        axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2, size = rel(1.25)),
        plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
- Explore $R^2$
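One way to summarise the fit of the model is to compute an $R^2$ for each MCMC draw as the proportion of variance attributable to the fitted values:
$$R^2 = \frac{\mathrm{var}(\hat{y})}{\mathrm{var}(\hat{y}) + \mathrm{var}(y - \hat{y})}$$
where $\hat{y}$ are the fitted values generated from a single draw of the posterior coefficients. This is the `var_f/(var_f + var_e)` calculation in each of the code blocks that follow, and the posterior median can be compared against the frequentist multiple $R^2$.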
library(MCMCpack) library(broom) constable.mcmc <- constable.mcmcpack Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable) wch = grep("Intercept|IV|TREAT", colnames(constable.mcmc)) coefs = constable.mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, constable$SUTW, "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.488892 0.06173019 0.3642251 0.596765
# for comparison with frequentist
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
Call: lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable) Residuals: Min 1Q Median 3Q Max -0.047764 -0.012727 0.001153 0.012959 0.065303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0282017 0.0307338 -0.918 0.362163 I(IV^(1/3)) 0.0467744 0.0119981 3.898 0.000229 *** TREATInitial -0.0005899 0.0380584 -0.015 0.987680 TREATLow 0.0630730 0.0373498 1.689 0.095994 . I(IV^(1/3)):TREATInitial -0.0113263 0.0145247 -0.780 0.438302 I(IV^(1/3)):TREATLow -0.0377210 0.0141608 -2.664 0.009700 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.02048 on 66 degrees of freedom Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576 F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
library(broom) constable.mcmc <- constable.r2jags$BUGSoutput$sims.matrix Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable) wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc))) coefs = constable.mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, constable$SUTW, "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.4889044 0.05956118 0.3673575 0.5925647
# for comparison with frequentist
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
Call: lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable) Residuals: Min 1Q Median 3Q Max -0.047764 -0.012727 0.001153 0.012959 0.065303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0282017 0.0307338 -0.918 0.362163 I(IV^(1/3)) 0.0467744 0.0119981 3.898 0.000229 *** TREATInitial -0.0005899 0.0380584 -0.015 0.987680 TREATLow 0.0630730 0.0373498 1.689 0.095994 . I(IV^(1/3)):TREATInitial -0.0113263 0.0145247 -0.780 0.438302 I(IV^(1/3)):TREATLow -0.0377210 0.0141608 -2.664 0.009700 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.02048 on 66 degrees of freedom Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576 F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
library(broom) constable.mcmc <- as.matrix(constable.rstan) Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable) wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc))) coefs = constable.mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, constable$SUTW, "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.4882446 0.06035995 0.3734576 0.6005657
# for comparison with frequentist
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
Call: lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable) Residuals: Min 1Q Median 3Q Max -0.047764 -0.012727 0.001153 0.012959 0.065303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0282017 0.0307338 -0.918 0.362163 I(IV^(1/3)) 0.0467744 0.0119981 3.898 0.000229 *** TREATInitial -0.0005899 0.0380584 -0.015 0.987680 TREATLow 0.0630730 0.0373498 1.689 0.095994 . I(IV^(1/3)):TREATInitial -0.0113263 0.0145247 -0.780 0.438302 I(IV^(1/3)):TREATLow -0.0377210 0.0141608 -2.664 0.009700 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.02048 on 66 degrees of freedom Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576 F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
library(broom) constable.mcmc <- as.matrix(constable.rstanarm) Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable) wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc)) coefs = constable.mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, constable$SUTW, "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.4892397 0.05991533 0.3732155 0.5966039
# for comparison with frequentist
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
Call: lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable) Residuals: Min 1Q Median 3Q Max -0.047764 -0.012727 0.001153 0.012959 0.065303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0282017 0.0307338 -0.918 0.362163 I(IV^(1/3)) 0.0467744 0.0119981 3.898 0.000229 *** TREATInitial -0.0005899 0.0380584 -0.015 0.987680 TREATLow 0.0630730 0.0373498 1.689 0.095994 . I(IV^(1/3)):TREATInitial -0.0113263 0.0145247 -0.780 0.438302 I(IV^(1/3)):TREATLow -0.0377210 0.0141608 -2.664 0.009700 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.02048 on 66 degrees of freedom Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576 F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
library(broom) constable.mcmc <- as.matrix(constable.brm) Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable) wch = grep("b_", colnames(constable.mcmc)) coefs = constable.mcmc[, wch] fit = coefs %*% t(Xmat) resid = sweep(fit, 2, constable$SUTW, "-") var_f = apply(fit, 1, var) var_e = apply(resid, 1, var) R2 = var_f/(var_f + var_e) tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
term estimate std.error conf.low conf.high 1 var1 0.4888112 0.06064694 0.3706096 0.5948545
# for comparison with frequentist
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
Call: lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable) Residuals: Min 1Q Median 3Q Max -0.047764 -0.012727 0.001153 0.012959 0.065303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0282017 0.0307338 -0.918 0.362163 I(IV^(1/3)) 0.0467744 0.0119981 3.898 0.000229 *** TREATInitial -0.0005899 0.0380584 -0.015 0.987680 TREATLow 0.0630730 0.0373498 1.689 0.095994 . I(IV^(1/3)):TREATInitial -0.0113263 0.0145247 -0.780 0.438302 I(IV^(1/3)):TREATLow -0.0377210 0.0141608 -2.664 0.009700 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.02048 on 66 degrees of freedom Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576 F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
It appears likely that homogeneity of slopes has been violated: the slope associated with the Low treatment appears less steep than the slopes associated with either the High or Initial treatments. Although the range of the continuous covariate (IV) is slightly narrower for the High treatment than for the Initial and Low treatments, the difference is not dramatic (and appears broadly symmetrical). A quick way to eyeball the treatment-specific slopes is sketched below.
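The following is a minimal visual check (a sketch only, not part of the original analysis): fit a separate ordinary least-squares line to each treatment on the cube-root scale of IV and compare the slopes by eye.

library(ggplot2)
## Separate least-squares fits per treatment; clearly different slopes
## indicate that homogeneity of slopes is doubtful.
ggplot(constable, aes(y = SUTW, x = IV^(1/3), colour = TREAT)) +
    geom_point() +
    geom_smooth(method = "lm", se = FALSE) +
    theme_classic()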
Constable applied a cube (third) root transformation to IV to improve normality and linearity. For consistency's sake, we will do the same.
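As a rough illustration of what the transformation does (again a sketch, not part of the original tutorial), the raw and cube-root transformed covariate can be compared side by side:

## Compare the distribution of the covariate before and after the cube-root transformation
par(mfrow = c(1, 2))
hist(constable$IV, main = "Raw IV", xlab = "Initial volume")
hist(constable$IV^(1/3), main = "Cube-root IV", xlab = "Initial volume^(1/3)")
par(mfrow = c(1, 1))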
plot(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
The chains appear well mixed and have converged on what is likely to be a stable posterior.
The model validation diagnostics all seem reasonable, suggesting that the model is likely to be reliable.
The presence of an interaction means that we cannot draw conclusions about the effect of treatment without reference to initial volume. That is, we need to delve further into the interaction. One approach we took for multiple linear regression was to explore the effect of one predictor at set levels of the other predictor; a sketch of what that could look like here follows.
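The sketch below is illustrative only. It assumes the MCMCpack samples stored in constable.mcmcpack and the cube-root model matrix used above, and that the treatment levels are named High, Initial and Low; it contrasts the High and Low treatments at the quartiles of initial volume.

library(MCMCpack)
library(broom)
library(coda)
## Posterior draws of the regression coefficients (MCMCpack parameterisation)
coefs <- constable.mcmcpack[, grep("Intercept|TREAT|IV", colnames(constable.mcmcpack))]
## Representative initial volumes: the quartiles of the observed IV (an arbitrary choice)
ivs <- quantile(constable$IV, c(0.25, 0.5, 0.75))
newdata <- expand.grid(TREAT = levels(constable$TREAT), IV = ivs)
Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
## Posterior draws of the expected suture width for each treatment at each IV
fit <- coefs %*% t(Xmat)
## High vs Low differences at each of the three initial volumes
highVsLow <- fit[, newdata$TREAT == "High"] - fit[, newdata$TREAT == "Low"]
colnames(highVsLow) <- paste("IV =", round(ivs, 1))
tidyMCMC(as.mcmc(highVsLow), conf.int = TRUE, conf.method = "HPDinterval")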
Whilst we could take a similar approach here, Constable took a different approach. Rather than exploring the effects of the different treatments at 3-5 specific initial volumes, Constable elected to estimate the range of initial volumes over which each pair of treatments differed. We will take the same approach here, just in a Bayesian framework.
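For reference, the inter() helper used in the code above simply locates where two treatment-specific regression lines cross. For lines $y = a_1 + b_1 x$ and $y = a_2 + b_2 x$ (with $x$ on the cube-root scale of IV), the intersection occurs at
$$x^{*} = \frac{a_2 - a_1}{b_1 - b_2}$$
Cubing the posterior draws of $x^{*}$ back-transforms them to the original IV scale. The HPD intervals of these draws bound the region of initial volumes within which a pair of treatments cannot be distinguished, and the remainder of the observed IV domain is then the range over which the pairwise differences occur.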