Tutorial 8.2b - Dealing with Heterogeneity
22 Jan 2018
Overview
The various representations of the standard linear model highlight the assumptions that are imposed upon the underlying data. The current tutorial will focus directly on issues related to the nature of residual variability.
Dealing with heterogeneity
The validity and reliability of the above linear models are very much dependent on variance homogeneity. In particular, variances that increase (or decrease) with a change in expected values represent substantial violations.
Whilst non-normality can also be a source of heterogeneity (and therefore normalizing transformations can address both issues), heterogeneity can also occur independently of normality. Similarly, generalized linear models (which accommodate alternative residual distributions, such as the Poisson, binomial, Gamma etc.) can be useful for more appropriately modelling both the distribution and the variance of a model.
However, for Gaussian (normal) models in which there is evidence of heterogeneity of variance, yet no evidence of non-normality, it is also possible to explicitly model an alternative variance structure. For example, we can elect to allow the variance to increase proportionally to a covariate.
To assist us in the following demonstration, we will generate another data set - one that has heteroscedasticity (unequal variance) by design. Rather than draw each residual (and thus observation) from a normal distribution with a constant standard deviation, we will draw the residuals from normal distributions whose standard deviation is proportional to the X predictor.
set.seed(126)
n <- 16
a <- 40      # intercept
b <- 1.5     # slope
x <- 1:n     # values of the covariate
sigma <- 1.5 * x
sigma
 [1]  1.5  3.0  4.5  6.0  7.5  9.0 10.5 12.0 13.5
[10] 15.0 16.5 18.0 19.5 21.0 22.5 24.0
eps <- rnorm(n, mean = 0, sd = sigma)  # residuals
y <- a + b * x + eps                   # response variable
# OR y <- (model.matrix(~x) %*% c(a, b)) + eps
data.het <- data.frame(y = round(y, 1), x)  # dataset
head(data.het)  # print out the first six rows of the data set
     y x
1 42.1 1
2 44.2 2
3 41.2 3
4 51.7 4
5 43.5 5
6 48.3 6
# scatterplot of y against x
library(car)
scatterplot(y ~ x, data.het)
# regular simple linear regression
data.het.lm <- lm(y ~ x, data.het)
# plot of standardized residuals
plot(rstandard(data.het.lm) ~ fitted(data.het.lm))
# plot of standardized residuals against the predictor
plot(rstandard(data.het.lm) ~ x)
- The above scatterplot suggests that variance may increase with increasing X.
- The residual plot (using standardized residuals) suggests that mean and variance could be related - there is a hint of a wedge-shaped pattern.
- Importantly, the plot of standardized residuals against the predictor shows the same pattern as the residual plot, implying that the heterogeneity is likely to be due to a relationship between variance and X. That is, an increase in X is associated with an increase in variance.
In response to this, we could incorporate an alternative variance structure. The simple model we fit earlier assumed that the expected values were all drawn from normal distributions with the same precision ($\tau$), and therefore the same variance.
$$
\mathbf{V} =
\underbrace{
\begin{pmatrix}
\sigma^2&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&\sigma^2&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&\sigma^2&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&\sigma^2&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&\sigma^2&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&\sigma^2&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&\sigma^2&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&\sigma^2&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&\sigma^2&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&\sigma^2&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&\sigma^2&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&\sigma^2&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&\sigma^2&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&\sigma^2&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&\sigma^2&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&\sigma^2\\
\end{pmatrix}}_\text{Variance matrix}
$$
This is often summarized as:
$$
\mathbf{V} = \sigma^2\times
\underbrace{
\begin{pmatrix}
1&0&\cdots&0\\
0&1&\cdots&0\\
\vdots&\vdots&\ddots&\vdots\\
0&0&\cdots&1\\
\end{pmatrix}}_\text{Identity matrix}
=
\underbrace{
\begin{pmatrix}
\sigma^2&0&\cdots&0\\
0&\sigma^2&\cdots&0\\
\vdots&\vdots&\ddots&\vdots\\
0&0&\cdots&\sigma^2\\
\end{pmatrix}}_\text{Variance matrix}
$$
Rather than assume that each observation is drawn from a normal distribution with the same amount of variance, we could alternatively assume that the variance is proportional to the level of the covariate:
$$
\mathbf{V} = \sigma^2\times
\underbrace{
\begin{pmatrix}
X_1&0&\cdots&0\\
0&X_2&\cdots&0\\
\vdots&\vdots&\ddots&\vdots\\
0&0&\cdots&X_n\\
\end{pmatrix}}_\text{Weights matrix}
=
\underbrace{
\begin{pmatrix}
\sigma^2 X_1&0&\cdots&0\\
0&\sigma^2 X_2&\cdots&0\\
\vdots&\vdots&\ddots&\vdots\\
0&0&\cdots&\sigma^2 X_n\\
\end{pmatrix}}_\text{Variance matrix}
$$
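To make this structure concrete, the implied (diagonal) variance matrix can be constructed directly in R; a small sketch, using a purely hypothetical value of $\sigma = 1$:
## sketch: the diagonal variance matrix implied by variance proportional
## to the covariate (sigma = 1 is hypothetical, purely for illustration)
sigma <- 1
V <- diag(sigma^2 * x)  # x = 1:16 from the data generation step above
V[1:4, 1:4]             # variances increase down the diagonal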
With a couple of small adjustments, we can modify the JAGS code in order to accommodate a variance structure in which variance is proportional to the predictor variable. Note that since JAGS works with precision ($\tau = 1/\sigma^2$), I have elected to express the predictor as $1/x$. This way the weightings are compatible with precision rather than variance.
In previous tutorials, we have used a flat, uniform distribution [0,100] for variance priors. Whilst this is a reasonable choice for a non-informative prior, Gelman (2006) suggests that half-Cauchy priors are more appropriate when the number of groups is low.
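To get a feel for what a half-Cauchy(0, 5) prior implies, we can fold a Cauchy sample about zero; a quick sketch in base R:
## draw from a half-Cauchy(0, 5) by taking absolute values of Cauchy draws
set.seed(1)
z <- abs(rcauchy(10000, location = 0, scale = 5))
quantile(z, c(0.5, 0.9, 0.99))  # note the very heavy upper tail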
Model fitting or statistical analysis
All the Bayesian linear modelling tutorials so far (Tutorial 7.x) have demonstrated modelling using a variety of tools (such as MCMCpack, JAGS, RSTAN, RSTANARM and BRMS). This is also the intention of this tutorial. However, MCMCpack, RSTANARM and BRMS do not support heterogeneous variances and thus these tools will not be demonstrated. Whilst JAGS and RSTAN are extremely flexible and thus allow models to be formulated that contain not only the simple model, but also additional derivatives, the other approaches are more restrictive. Consequently, I will mostly restrict models to just the minimum necessary and all derivatives will instead be calculated in R itself from the returned posteriors.
The observed responses ($y_i$) are assumed to be drawn from a normal distribution with a given mean ($\mu$) and a standard deviation weighted by the value of the covariate ($\sigma\times\omega$). The expected values ($\mu$) are themselves determined by the linear predictor ($\beta_0 + \beta X_i$). In this case, $\beta_0$ represents the y-intercept and $\beta$ represents the rate of change (slope) in $y$ per unit change in $X$.
MCMC sampling requires priors on all parameters. We will employ weakly informative priors. Specifying 'uninformative' priors is always a bit of a balancing act. If the priors are too vague (wide), the MCMC sampler can wander off into nonsense areas of likelihood rather than concentrate around areas of highest likelihood (desired when wanting the outcomes to be largely driven by the data). On the other hand, if the priors are too strong, they may unduly influence the parameter estimates. In such a simple model, this balance is very forgiving - it is for more complex models that prior choice becomes more important.
For this simple model, we will go with zero-centered Gaussian (normal) priors with relatively large standard deviations (1000) for both the intercept and the slope, and a wide half-Cauchy (scale = 5) for the standard deviation. $$ \begin{align} y_i &\sim{} N(\mu_i, \sigma\times\omega_i)\\ \mu_i &= \beta_0 + \beta X_i\\[1em] \beta_0 &\sim{} N(0,1000)\\ \beta &\sim{} N(0,1000)\\ \sigma &\sim{} \text{Cauchy}(0,5)\\ \end{align} $$
Specific formulation
For very simple models such as this example, we can write the model (using the BUGS precision parameterization of the normal distribution) as: $$\begin{align} y_i&\sim{}N(\mu_i, \tau\times 1/X_i)\\ \mu_i &= \beta_0 + \beta X_i\\ \beta_0&\sim{}N(0,1.0{E-6}) \hspace{1cm}\mathsf{non\text{-}informative~prior~for~intercept}\\ \beta&\sim{}N(0,1.0{E-6}) \hspace{1cm}\mathsf{non\text{-}informative~prior~for~slope}\\ \sigma &=z/\sqrt{\chi}\\ z&\sim{}N(0,0.04)I(0,)\\ \chi&\sim{}\Gamma(0.5,0.5)\\ \tau &= \sigma^{-2}\\ \end{align} $$
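This construction of the half-Cauchy can be checked numerically; a sketch comparing the scaled half-normal/$\sqrt{\chi^2}$ ratio against directly folded Cauchy draws:
## numerical check: half-normal(sd = 5) / sqrt(chi-square with 1 d.f.)
## should be distributed as half-Cauchy(0, 5)
set.seed(1)
z <- abs(rnorm(1e+05, 0, 5))     # dnorm(0, 0.04)I(0,) in JAGS (precision 0.04 = sd 5)
chSq <- rgamma(1e+05, 0.5, 0.5)  # dgamma(0.5, 0.5): chi-square with 1 d.f.
quantile(z/sqrt(chSq), c(0.25, 0.5, 0.75))
quantile(abs(rcauchy(1e+05, 0, 5)), c(0.25, 0.5, 0.75))  # reference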
Define the model
Note in the following model that the half-Cauchy prior on sigma is constructed as a half-normal divided by the square root of a chi-square (see the comments in the code).
modelString = " model { #Likelihood for (i in 1:n) { y[i]~dnorm(mu[i],tau*(1/x[i])) mu[i] <- beta0+beta1*x[i] } #Priors and derivatives beta0 ~ dnorm(0,1.0E-6) beta1 ~ dnorm(0,1.0E-6) sigma <- z/sqrt(chSq) # prior for sigma; cauchy = normal/sqrt(chi^2) z ~ dnorm(0, 0.04)I(0,) chSq ~ dgamma(0.5, 0.5) # chi^2 with 1 d.f. tau <- pow(sigma, -2) } "
Arrange the data as a list (as required by BUGS). As input, JAGS will need to be supplied with:
- the response variable (y)
- a numeric representation of the predictor variable (x)
- the total number of observed items (n)
data.het.list <- with(data.het, list(y = y, x = x, n = nrow(data.het)))
Define the MCMC chain parameters
Next we should define the behavioural parameters of the MCMC sampling chains. Include the following:
- the nodes (estimated parameters) to monitor (return samples for)
- the number of MCMC chains (3)
- the number of burnin steps (3000)
- the thinning factor (10)
- the number of MCMC iterations - determined by the number of samples to save, the rate of thinning and the number of chains
params <- c("beta0", "beta1", "sigma") nChains = 3 burnInSteps = 3000 thinSteps = 10 numSavedSteps = 15000 #across all chains nIter = ceiling(burnInSteps + (numSavedSteps * thinSteps)/nChains) nIter
[1] 53000
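The arithmetic can be confirmed directly: after discarding the burn-in and thinning, each chain retains (53000 - 3000)/10 = 5000 samples, giving the requested 15000 across the three chains.
## check the number of samples retained after burn-in and thinning
(nIter - burnInSteps)/thinSteps            # 5000 per chain
nChains * (nIter - burnInSteps)/thinSteps  # 15000 across all chains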
Fit the model
Now run the JAGS code via the R2jags interface. Note that the first time jags is run after the R2jags package is loaded, it is often necessary to run any kind of randomization function just to initiate the .Random.seed variable.
## load the R2jags package
library(R2jags)
data.het.r2jags <- jags(data = data.het.list, inits = NULL, parameters.to.save = params,
    model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter,
    n.burnin = burnInSteps, n.thin = thinSteps)
Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Graph information:
   Observed stochastic nodes: 16
   Unobserved stochastic nodes: 4
   Total graph size: 148

Initializing model
print(data.het.r2jags)
Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta0     41.505   2.588  36.324  39.849  41.505  43.155  46.587 1.001 15000
beta1      1.110   0.406   0.315   0.848   1.107   1.371   1.925 1.001  6100
sigma      3.070   0.642   2.118   2.617   2.971   3.408   4.578 1.001 15000
deviance 110.930   2.800 107.743 108.871 110.207 112.222 118.054 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 3.9 and DIC = 114.9
DIC is an estimate of expected predictive error (lower deviance is better).
data.mcmc.list <- as.mcmc(data.het.r2jags)
Whilst Gibbs sampling provides an elegantly simple MCMC sampling routine, very complex hierarchical models can take enormous numbers of iterations (often prohibitively large) to converge on a stable posterior distribution.
To address this, Andrew Gelman (and other collaborators) have implemented a variation on Hamiltonian Monte Carlo (HMC: a sampler that selects subsequent samples in a way that reduces the correlation between samples, thereby speeding up convergence) called the No-U-Turn (NUTS) sampler. All of these developments are brought together into a tool called Stan ("Sampling Through Adaptive Neighborhoods").
By design (to appeal to the vast BUGS users), Stan models are defined in a manner reminiscent of BUGS. Stan first converts these models into C++ code which is then compiled to allow very rapid computation.
Consistent with the use of C++, the model must be accompanied by variable declarations for all inputs and parameters.
One important difference between Stan and JAGS is that whereas BUGS (and thus JAGS) parameterizes the normal distribution in terms of the precision, Stan parameterizes it in terms of the standard deviation.
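A quick conversion reference between the two parameterizations (these numbers match the prior on z in the JAGS model above):
## precision (BUGS/JAGS) versus standard deviation (Stan)
sigma <- 5        # standard deviation
tau <- 1/sigma^2  # precision: 0.04, as in dnorm(0, 0.04)
tau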
Stan itself is a stand-alone command line application. However, conveniently, the authors of Stan have also developed an R interface to Stan called Rstan which can be used much like R2jags.
Model matrix formulation
The minimum model in Stan required to fit the above simple regression follows. Note the following modifications from the model defined in JAGS:
- the normal distribution is defined in terms of the standard deviation rather than the precision
- rather than using a uniform prior for sigma, I am using a half-Cauchy
We now translate the likelihood model into STAN code.
$$\begin{align}
y_i&\sim{}N(\mu_i, \sigma\times\omega_i)\\
\omega_i&=\sqrt{X_i}\\
\mu_i &= \beta_0+\beta X_i\\
\beta_0&\sim{}N(0,1000)\\
\beta&\sim{}N(0,1000)\\
\sigma&\sim{}\text{Cauchy}(0,5)\\
\end{align}
$$
Define the model
modelString = " data { int<lower=1> n; int<lower=1> nX; vector [n] y; matrix [n,nX] X; vector [n] w; } parameters { vector[nX] beta; real<lower=0> sigma; } transformed parameters { vector[n] mu; mu = X*beta; } model { #Likelihood y~normal(mu,sigma*w); #Priors beta ~ normal(0,1000); sigma~cauchy(0,5); } generated quantities { vector[n] log_lik; for (i in 1:n) { log_lik[i] = normal_lpdf(y[i] | mu[i], sigma*w[i]); } } "
Define the data
Arrange the data as a list (as required by Stan). As input, Stan will need to be supplied with:
- the response variable (y)
- the predictor model matrix (X)
- the variance weights (w)
- the total number of observed items (n)
- the number of columns in the model matrix (nX)
Xmat <- model.matrix(~x, data.het)
data.het.list <- with(data.het, list(y = y, X = Xmat, w = sqrt(data.het$x),
    n = nrow(data.het), nX = ncol(Xmat)))
Fit the model
Now run the STAN code via the rstan interface.
## load the rstan package
library(rstan)
data.het.rstan <- stan(data = data.het.list, model_code = modelString, chains = 3,
    iter = 2000, warmup = 500, thin = 3)
SAMPLING FOR MODEL '9618c7d3665ad12862b950f5eeac698e' NOW (CHAIN 1).

Gradient evaluation took 1.3e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
Adjust your expectations accordingly!

Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.02066 seconds (Warm-up)
               0.05731 seconds (Sampling)
               0.07797 seconds (Total)

SAMPLING FOR MODEL '9618c7d3665ad12862b950f5eeac698e' NOW (CHAIN 2).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!

Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.023828 seconds (Warm-up)
               0.052005 seconds (Sampling)
               0.075833 seconds (Total)

SAMPLING FOR MODEL '9618c7d3665ad12862b950f5eeac698e' NOW (CHAIN 3).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!

Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.019657 seconds (Warm-up)
               0.051736 seconds (Sampling)
               0.071393 seconds (Total)
print(data.het.rstan, par = c("beta", "sigma"))
Inference for Stan model: 9618c7d3665ad12862b950f5eeac698e.
3 chains, each with iter=2000; warmup=500; thin=3;
post-warmup draws per chain=500, total post-warmup draws=1500.

         mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
beta[1] 41.44    0.07 2.56 36.24 39.79 41.49 43.02 46.68  1199    1
beta[2]  1.12    0.01 0.39  0.30  0.87  1.12  1.37  1.87  1218    1
sigma    3.04    0.02 0.62  2.12  2.61  2.94  3.39  4.52  1453    1

Samples were drawn using NUTS(diag_e) at Wed Jan 10 11:36:49 2018.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
MCMC diagnostics
In addition to the regular model diagnostic checks (such as residual plots), for Bayesian analyses, it is necessary to explore the characteristics of the MCMC chains and the sampler in general. Recall that the purpose of MCMC sampling is to replicate the posterior distribution of the model likelihood and priors by drawing a known number of samples from this posterior (thereby formulating a probability distribution). This is only reliable if the MCMC samples accurately reflect the posterior.
Unfortunately, since we only know the posterior in the most trivial of circumstances, it is necessary to rely on indirect measures of how accurately the MCMC samples are likely to reflect the likelihood. I will briefly outline the most important diagnostics; however, please refer to Tutorial 4.3, Section 3.1: Markov Chain Monte Carlo sampling for a discussion of these diagnostics.
- Traceplots for each parameter illustrate the MCMC sample values after each successive iteration along the chain. Bad chain mixing (characterized by any sort of pattern) suggests that the MCMC sampling chains may not have completely traversed all features of the posterior distribution and that more iterations are required to ensure the distribution has been accurately represented.
- Autocorrelation plots for each parameter illustrate the degree of correlation between MCMC samples separated by different lags. For example, a lag of 0 represents the degree of correlation between each MCMC sample and itself (obviously this will be a correlation of 1). A lag of 1 represents the degree of correlation between each MCMC sample and the next sample along the chain, and so on. In order to be able to generate unbiased estimates of parameters, the MCMC samples should be independent (uncorrelated).
- Rhat statistic for each parameter provides a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been; more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
Again, prior to examining the summaries, we should have explored the convergence diagnostics.
library(coda)
data.het.mcmc = as.mcmc(data.het.r2jags)
- Trace plots
plot(data.het.mcmc)
When there are a lot of parameters, this can result in a very large number of traceplots. To focus on just certain parameters (such as the $\beta$s):
preds <- c("beta0", "beta1") plot(as.mcmc(data.het.r2jags)[, preds])
- Raftery diagnostic
raftery.diag(data.het.mcmc)
[[1]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95

          Burn-in  Total  Lower bound  Dependence
          (M)      (N)    (Nmin)       factor (I)
 beta0    20       37410  3746          9.99
 beta1    20       36800  3746          9.82
 deviance 20       39300  3746         10.50
 sigma    20       36200  3746          9.66

[[2]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95

          Burn-in  Total  Lower bound  Dependence
          (M)      (N)    (Nmin)       factor (I)
 beta0    20       35940  3746          9.59
 beta1    20       36200  3746          9.66
 deviance 20       37410  3746          9.99
 sigma    20       39950  3746         10.70

[[3]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95

          Burn-in  Total  Lower bound  Dependence
          (M)      (N)    (Nmin)       factor (I)
 beta0    20       37410  3746          9.99
 beta1    20       37410  3746          9.99
 deviance 20       36800  3746          9.82
 sigma    20       39300  3746         10.50
- Autocorrelation diagnostic
autocorr.diag(data.het.mcmc)
               beta0         beta1      deviance        sigma
Lag 0    1.000000000  1.0000000000  1.0000000000  1.000000000
Lag 10  -0.008104376 -0.0130139443 -0.0076700704 -0.009472480
Lag 50   0.020836024 -0.0008688921 -0.0060345758  0.006069673
Lag 100  0.004116739 -0.0040378285 -0.0047948633 -0.003370821
Lag 500  0.024370623  0.0079407136 -0.0003892607 -0.010212581
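coda also provides the Gelman-Rubin diagnostic (the basis of the Rhat values reported by print() above); a quick sketch:
## potential scale reduction factors; values close to 1 indicate that the
## chains have converged on the same posterior
gelman.diag(data.het.mcmc)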
Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).
- extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
- use the numerous routines that come with the rstan package
- use the routines that come with the bayesplot package
- explore the diagnostics interactively via shinystan
- via coda
- Traceplots
- Autocorrelation
library(coda)
s = as.array(data.het.rstan)
wch = grep("beta|sigma", dimnames(s)$parameters)
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , wch], 2, as.mcmc))
plot(mcmc)
library(coda)
s = as.array(data.het.rstan)
wch = grep("beta|sigma", dimnames(s)$parameters)
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , wch], 2, as.mcmc))
autocorr.diag(mcmc)
            beta[1]      beta[2]        sigma
Lag 0   1.000000000  1.000000000  1.000000000
Lag 1   0.097078125  0.085187946  0.017292690
Lag 5   0.049182168  0.039826812 -0.015386788
Lag 10 -0.031717059 -0.026227268  0.017832942
Lag 50  0.004150904 -0.003924103 -0.002915989
- via rstan
- Traceplots
stan_trace(data.het.rstan, pars = c("beta", "sigma"))
- Raftery diagnostic
raftery.diag(data.het.rstan)
Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95

You need a sample size of at least 3746 with these values of q, r and s
- Autocorrelation diagnostic
stan_ac(data.het.rstan, pars = c("beta", "sigma"))
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been; more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
stan_rhat(data.het.rstan, pars = c("beta", "sigma"))
- Another measure of sampling efficiency is effective sample size (ESS). ESS indicates the number (or proportion) of samples that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria, and when it does so, the previous sample value is used. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
stan_ess(data.het.rstan, pars = c("beta", "sigma"))
- via bayesplot
- Trace plots and density plots
library(bayesplot)
mcmc_trace(as.array(data.het.rstan), regex_pars = "beta|sigma")
library(bayesplot)
mcmc_combo(as.array(data.het.rstan), regex_pars = "beta|sigma")
- Density plots
library(bayesplot)
mcmc_dens(as.array(data.het.rstan), regex_pars = "beta|sigma")
- via shinystan
library(shinystan)
launch_shinystan(data.het.rstan)
- It is worth exploring the influence of our priors.
Model validation
Model validation involves exploring the model diagnostics and fit to ensure that the model is broadly appropriate for the data. As such, exploration of the residuals should be routine.
For more complex models (those that contain multiple effects), it is also advisable to plot the residuals against each of the individual predictors. For sampling designs that involve sample collection over space or time, it is also a good idea to explore whether there are any temporal or spatial patterns in the residuals.
There are numerous situations (e.g. when applying specific variance-covariance structures to a model) where raw residuals do not reflect the interior workings of the model. Typically, this is because they do not take into account the variance-covariance matrix or assume a very simple variance-covariance matrix. Since the purpose of exploring residuals is to evaluate the model, for these cases, it is arguably better to draw conclusions based on standardized (or studentized) residuals.
Unfortunately, the definitions of standardized and studentized residuals appear to vary, and the two terms get used interchangeably. I will adopt the following definitions:
| Residual type | Definition |
|---|---|
| Standardized residuals | the raw residuals divided by the true standard deviation of the residuals (which of course is rarely known). |
| Studentized residuals | the raw residuals divided by the standard deviation of the residuals. Note that externally studentized residuals are calculated by dividing the raw residuals by a unique standard deviation for each observation that is calculated from regressions having left each successive observation out. |
| Pearson residuals | the raw residuals divided by the standard deviation of the response variable. |
The mark of a good model is being able to predict well. In an ideal world, we would have sufficiently large sample size as to permit us to hold a fraction (such as 25%) back thereby allowing us to train the model on 75% of the data and then see how well the model can predict the withheld 25%. Unfortunately, such a luxury is still rare in ecology.
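Purely as an illustration of such a split (with only 16 observations it would rarely be sensible in practice), a minimal sketch:
## hypothetical 75/25 training/test split of the data
set.seed(123)
train <- sample(seq_len(nrow(data.het)), size = floor(0.75 * nrow(data.het)))
data.train <- data.het[train, ]
data.test <- data.het[-train, ]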
The next best option is to see how well the model can predict the observed data. Models tend to struggle most with the extremes of trends and have particular issues when the extremes approach logical boundaries (such as zero for count data and standard deviations). We can use the fitted model to generate random predicted observations and then explore some properties of these compared to the actual observed data.
Although residuals can be computed directly within R2jags, we can calculate them manually from the posteriors to be consistent across other approaches.
library(ggplot2)
mcmc = data.het.r2jags$BUGSoutput$sims.matrix[, c("beta0", "beta1")]
# generate a model matrix
newdata = data.frame(x = data.het$x)
Xmat = model.matrix(~x, newdata)
## get median parameter estimates
coefs = apply(mcmc, 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data.het$y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
The above residual plot would make us believe that we had a homogeneity of variance issue (which we thought we were addressing by defining a model that allowed the variance to be proportional to the predictor). This is because we have plotted the raw residuals rather than residuals that have been standardized by the variances. The above plot is also what the residual plot would look like if we had not made any attempt to define a model in which the variance was related to the predictor.
Whenever we fit a model that incorporates changes to the variance-covariance structures, we should explore standardized residuals. In this case, we should divide the residuals by $\sigma$ and then by the square root of the weights (here the weights are the values of $X$): $$ Res_i = \frac{Y_{i} - \mu_i}{\sigma\times\sqrt{X_i}} $$
mcmc = data.het.r2jags$BUGSoutput$sims.matrix
coefs = mcmc[, c("beta0", "beta1")]
Xmat = model.matrix(~x, data.het)
fit = coefs %*% t(Xmat)
resid = -1 * sweep(fit, 2, data.het$y, "-")
resid = apply(resid, 2, median)/(median(mcmc[, "sigma"]) * sqrt(data.het$x))
fit = apply(fit, 2, median)
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Conclusions: this is certainly an improvement. Nevertheless, there is still an indication of a relationship between mean and variance. We could attempt to further address this by refining $\omega$ in the Bayesian model. That is, rather than indicate that variance is proportional to $x$, we could indicate that variance is proportional to $x^2$ (as an example) - we will leave this as an exercise for the reader (a sketch of the required change follows).
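As a starting point for that exercise, only the likelihood portion of the JAGS model needs to change so that the precision is scaled by $1/x^2$ (and hence the variance is proportional to $x^2$ - which, note, matches how the data were actually generated, with a standard deviation of $1.5\times x$). A sketch of the replacement lines:
for (i in 1:n) {
    y[i] ~ dnorm(mu[i], tau * (1/pow(x[i], 2)))  # variance proportional to x^2
    mu[i] <- beta0 + beta1 * x[i]
}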
Residuals against predictors
ggplot() + geom_point(data = NULL, aes(y = resid, x = data.het$x))
Let's see how well data simulated from the model reflects the raw data.
mcmc = data.het.r2jags$BUGSoutput$sims.matrix
# generate a model matrix
Xmat = model.matrix(~x, data.het)
## get median parameter estimates
coefs = mcmc[, c("beta0", "beta1")]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data.het), fit[i, ], mcmc[i, "sigma"]))
ggplot() + geom_density(data = NULL, aes(x = as.vector(yRep), fill = "Model"),
    alpha = 0.5) + geom_density(data = data.het, aes(x = y, fill = "Obs"), alpha = 0.5)
Although residuals can be computed directly within the Stan model, we can calculate them manually from the posteriors to be consistent across other approaches.
mcmc = as.matrix(data.het.rstan)[, c("beta[1]", "beta[2]")]
# generate a model matrix
newdata = data.frame(x = data.het$x)
Xmat = model.matrix(~x, newdata)
## get median parameter estimates
coefs = apply(mcmc, 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data.het$y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
The above residual plot would make us believe that we had a homogeneity of variance issue (which we thought we were addressing by defining a model that allowed the variance to be proportional to the predictor). This is because we have plotted the raw residuals rather than residuals that have been standardized by the variances. The above plot is also what the residual plot would look like if we had not made any attempt to define a model in which the variance was related to the predictor.
Whenever we fit a model that incorporates changes to the variance-covariance structures, we should explore standardized residuals. In this case, we should divide the residuals by $\sigma$ and then by the square root of the weights (here the weights are the values of $X$): $$ Res_i = \frac{Y_{i} - \mu_i}{\sigma\times\sqrt{X_i}} $$
mcmc = as.matrix(data.het.rstan)
coefs = mcmc[, c("beta[1]", "beta[2]")]
Xmat = model.matrix(~x, data.het)
fit = coefs %*% t(Xmat)
resid = -1 * sweep(fit, 2, data.het$y, "-")
resid = apply(resid, 2, median)/(median(mcmc[, "sigma"]) * sqrt(data.het$x))
fit = apply(fit, 2, median)
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Conclusions: this is certainly an improvement. Nevertheless, there is still an indication of a relationship between mean and variance. We could attempt to further address this by refining $\omega$ in the Bayesian model. That is, rather than indicate that variance is proportional to $x$, we could indicate that variance is proportional to $x^2$ (as an example) - we will leave this as an exercise for the reader (see the sketch below).
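In the Stan version, the same refinement enters through the supplied weights: since the model uses $\sigma\times w$ as the standard deviation, passing w = x (rather than w = sqrt(x)) makes the variance proportional to $x^2$. A sketch of the modified data list:
## sketch: weights for variance proportional to x^2 in the Stan model
data.het.list2 <- with(data.het, list(y = y, X = Xmat, w = x,
    n = nrow(data.het), nX = ncol(Xmat)))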
Residuals against predictors
ggplot() + geom_point(data = NULL, aes(y = resid, x = data.het$x))
Let's see how well data simulated from the model reflects the raw data.
mcmc = as.matrix(data.het.rstan)
# generate a model matrix
Xmat = model.matrix(~x, data.het)
## get median parameter estimates
coefs = mcmc[, c("beta[1]", "beta[2]")]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data.het), fit[i, ], mcmc[i, "sigma"]))
ggplot() + geom_density(data = NULL, aes(x = as.vector(yRep), fill = "Model"),
    alpha = 0.5) + geom_density(data = data.het, aes(x = y, fill = "Obs"), alpha = 0.5)
Parameter estimates (posterior summaries)
Although all parameters in a Bayesian analysis are considered random and have full posterior distributions, rarely would it be useful to present tables of all the samples from each distribution. On the other hand, plots of the posterior distributions do have some use. Nevertheless, most workers prefer to present simple statistical summaries of the posteriors. Popular choices include the median (or mean) and 95% credibility intervals.
print(data.het.r2jags)
Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta0     41.505   2.588  36.324  39.849  41.505  43.155  46.587 1.001 15000
beta1      1.110   0.406   0.315   0.848   1.107   1.371   1.925 1.001  6100
sigma      3.070   0.642   2.118   2.617   2.971   3.408   4.578 1.001 15000
deviance 110.930   2.800 107.743 108.871 110.207 112.222 118.054 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 3.9 and DIC = 114.9
DIC is an estimate of expected predictive error (lower deviance is better).
# OR
library(broom)
tidyMCMC(as.mcmc(data.het.r2jags), conf.int = TRUE, conf.method = "HPDinterval")
      term   estimate std.error    conf.low  conf.high
1    beta0  41.504855 2.5880932  36.2211185  46.428550
2    beta1   1.109973 0.4059171   0.3133687   1.919113
3 deviance 110.930135 2.8003152 107.5038338 116.440782
4    sigma   3.070266 0.6421196   2.0157253   4.355305
Conclusions: a one unit increase in x is associated with a 1.1099725 change in y. That is, y increases at a rate of 1.1099725 per unit increase in x. The 95% confidence interval for the slope does not overlap with 0, implying a significant effect of x on y.
While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.
library(coda)
mcmcpvalue <- function(samp) {
    ## elementary version that creates an empirical p-value for the
    ## hypothesis that the columns of samp have mean zero versus a general
    ## multivariate distribution with elliptical contours.

    ## differences from the mean standardized by the observed
    ## variance-covariance factor

    ## Note, I put in the bit for single terms
    if (length(dim(samp)) == 0) {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - mean(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/length(samp)
    } else {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/nrow(samp)
    }
}

mcmcpvalue(data.het.r2jags$BUGSoutput$sims.matrix[, c("beta1")])
[1] 0.0104
With a p-value of 0.0104, we would conclude that there is very little evidence that the slope is equal to zero, suggesting there is a relationship.
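A simpler, complementary summary is the posterior probability that the slope is at or below zero, estimated directly as the proportion of MCMC samples satisfying that condition:
## proportion of posterior samples with a non-positive slope
mean(data.het.r2jags$BUGSoutput$sims.matrix[, "beta1"] <= 0)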
summary(data.het.rstan)
$summary
                  mean     se_mean        sd        2.5%         25%        50%        75%
beta[1]      41.442992 0.073836985 2.5568358  36.2395852  39.7883053  41.487069  43.020728
beta[2]       1.115389 0.011220582 0.3915228   0.3035451   0.8708631   1.115613   1.370817
sigma         3.044974 0.016157950 0.6158596   2.1237005   2.6116157   2.939065   3.388581
mu[1]        42.558381 0.065981928 2.2801513  37.8793785  41.0791169  42.584920  43.984087
mu[2]        43.673770 0.059205662 2.0414090  39.3566131  42.3459746  43.686376  44.944022
mu[3]        44.789158 0.053853568 1.8553145  40.9597137  43.6303630  44.811367  45.954376
mu[4]        45.904547 0.051303366 1.7388540  42.3442188  44.8184005  45.916842  46.974475
mu[5]        47.019936 0.049992430 1.7063455  43.5304194  45.9485474  47.028453  48.110881
mu[6]        48.135324 0.051054709 1.7624406  44.4856697  47.0420123  48.146505  49.262421
mu[7]        49.250713 0.054284998 1.8993049  45.3203368  48.0904899  49.273091  50.465238
mu[8]        50.366102 0.059596375 2.1012143  46.1100462  49.0847834  50.365418  51.712390
mu[9]        51.481490 0.066265040 2.3514727  46.6581488  50.0493571  51.485135  52.902086
mu[10]       52.596879 0.073986726 2.6363473  47.2285624  51.0173194  52.621610  54.145544
mu[11]       53.712268 0.082550189 2.9458123  47.6985818  51.9495578  53.770529  55.461604
mu[12]       54.827656 0.091849243 3.2728999  48.2355634  52.8156275  54.902845  56.824694
mu[13]       55.943045 0.101595811 3.6128268  48.7050314  53.6913432  55.996575  58.176609
mu[14]       57.058434 0.111619608 3.9622899  49.1437058  54.5452518  57.111648  59.540074
mu[15]       58.173822 0.121852237 4.3189751  49.6091489  55.3979373  58.222692  60.920433
mu[16]       59.289211 0.132245232 4.6812317  49.8967607  56.2642676  59.314612  62.283450
log_lik[1]   -2.297277 0.013837640 0.4679187  -3.6180401  -2.4475348  -2.186895  -2.006298
log_lik[2]   -2.476510 0.007329130 0.2560505  -3.1087962  -2.6162324  -2.440820  -2.305207
log_lik[3]   -2.883341 0.008429006 0.2874254  -3.5561315  -3.0315994  -2.852148  -2.684144
log_lik[4]   -3.249161 0.008169641 0.2919630  -3.9049075  -3.4090507  -3.213980  -3.045913
log_lik[5]   -2.999270 0.005767221 0.1989539  -3.4156628  -3.1271936  -2.986768  -2.861143
log_lik[6]   -2.936932 0.005223006 0.1957769  -3.3399875  -3.0651037  -2.919002  -2.798798
log_lik[7]   -3.017143 0.005188627 0.1958132  -3.4228552  -3.1443770  -2.997645  -2.878875
log_lik[8]   -3.979441 0.010315037 0.3645965  -4.8835340  -4.1734291  -3.919540  -3.735370
log_lik[9]   -3.325072 0.005694328 0.2055005  -3.7713981  -3.4438154  -3.313126  -3.188136
log_lik[10]  -3.206157 0.005318377 0.1998369  -3.6282113  -3.3299997  -3.185634  -3.069982
log_lik[11]  -4.032289 0.010792269 0.3878336  -4.9664812  -4.2458943  -3.977621  -3.773950
log_lik[12]  -5.765511 0.027636556 1.0078730  -8.2547659  -6.3137862  -5.598133  -5.055433
log_lik[13]  -3.348961 0.005511922 0.2076098  -3.8038542  -3.4708539  -3.329661  -3.206951
log_lik[14]  -3.472361 0.006479063 0.2334174  -4.0031197  -3.6120043  -3.447464  -3.305322
log_lik[15]  -3.434614 0.005712109 0.2155605  -3.9066866  -3.5664219  -3.414566  -3.287066
log_lik[16]  -4.991900 0.021339926 0.7754726  -6.8351376  -5.4083531  -4.875211  -4.430311
lp__        -39.939390 0.040398678 1.3460605 -43.4473362 -40.5840790 -39.557955 -38.937435
                 97.5%    n_eff      Rhat
beta[1]      46.676081 1199.107 0.9990098
beta[2]       1.874006 1217.540 0.9989519
sigma         4.517810 1452.750 0.9996971
mu[1]        47.260131 1194.201 0.9989272
mu[2]        47.814239 1188.868 0.9987994
mu[3]        48.575770 1186.878 0.9986200
mu[4]        49.388986 1148.774 0.9984065
mu[5]        50.414817 1164.999 0.9982116
mu[6]        51.675919 1191.674 0.9980950
mu[7]        53.051423 1224.136 0.9980736
mu[8]        54.535378 1243.086 0.9981195
mu[9]        56.108635 1259.247 0.9981955
mu[10]       57.851426 1269.690 0.9982772
mu[11]       59.521531 1273.427 0.9983530
mu[12]       61.326717 1269.738 0.9984194
mu[13]       63.015019 1264.569 0.9984760
mu[14]       64.754118 1260.119 0.9985240
mu[15]       66.549062 1256.303 0.9985645
mu[16]       68.364529 1253.026 0.9985990
log_lik[1]   -1.726202 1143.449 1.0031401
log_lik[2]   -2.064946 1220.524 1.0007140
log_lik[3]   -2.447192 1162.780 0.9996992
log_lik[4]   -2.771311 1277.173 0.9984121
log_lik[5]   -2.633946 1190.067 0.9991651
log_lik[6]   -2.591999 1405.018 0.9991954
log_lik[7]   -2.676223 1424.226 0.9990995
log_lik[8]   -3.430373 1249.348 0.9986004
log_lik[9]   -2.950170 1302.389 0.9986628
log_lik[10]  -2.849156 1411.864 0.9990402
log_lik[11]  -3.438390 1291.414 0.9988735
log_lik[12]  -4.267266 1329.976 0.9989939
log_lik[13]  -2.978654 1418.697 0.9990472
log_lik[14]  -3.081081 1297.903 0.9988837
log_lik[15]  -3.066279 1424.116 0.9990421
log_lik[16]  -3.872842 1320.526 0.9993335
lp__        -38.440798 1110.184 1.0016240

$c_summary
, , chains = chain:1

         stats
parameter          mean        sd        2.5%         25%        50%        75%      97.5%
  beta[1]    41.528564 2.8019920  35.8530063  39.7410560  41.624789  43.285900  47.908926
  beta[2]     1.100926 0.4088769   0.1860436   0.8430244   1.087629   1.362995   1.899605
  sigma       3.046492 0.5924986   2.1243840   2.6440293   2.942268   3.413697   4.352459
  mu[1]      42.629489 2.5042292  37.3375023  41.0585913  42.716571  44.223422  48.460018
  mu[2]      43.730415 2.2417693  38.9499522  42.3322862  43.819803  45.152049  48.738744
  mu[3]      44.831340 2.0283628  40.5141179  43.5426354  44.882811  45.999981  49.063722
  mu[4]      45.932266 1.8807824  42.0166596  44.7867889  45.923635  46.983407  49.566786
  mu[5]      47.033192 1.8151557  43.2474401  45.9074891  47.022698  48.111791  50.579488
  mu[6]      48.134117 1.8402713  44.1966952  47.0441049  48.063611  49.143153  51.959850
  mu[7]      49.235043 1.9526308  45.3100844  48.0883693  49.191250  50.336489  53.241805
  mu[8]      50.335968 2.1385268  46.3487674  49.1431912  50.184464  51.656064  54.770749
  mu[9]      51.436894 2.3807955  46.9057794  49.9922007  51.344531  52.897557  56.367727
  mu[10]     52.537819 2.6641017  47.5758564  50.8685391  52.502873  54.251752  57.909990
  mu[11]     53.638745 2.9767516  47.9074109  51.6746331  53.607749  55.532238  59.507157
  mu[12]     54.739670 3.3104414  48.3495382  52.5541041  54.617292  56.842567  61.380926
  mu[13]     55.840596 3.6594201  49.1286691  53.3190315  55.746453  58.158934  63.015019
  mu[14]     56.941521 4.0197076  49.4731250  54.2188418  56.842231  59.493962  64.730752
  mu[15]     58.042447 4.3885195  49.7629817  55.1352003  57.995215  60.844878  66.488052
  mu[16]     59.143373 4.7638764  49.9697883  55.9820378  59.092284  62.258982  68.309433
  log_lik[1] -2.350647 0.5286499  -3.9996124  -2.4975791  -2.219434  -2.017799  -1.741279
  log_lik[2] -2.496446 0.2780079  -3.1807610  -2.6295756  -2.447306  -2.311133  -2.052748
  log_lik[3] -2.896916 0.3042769  -3.6828110  -3.0453972  -2.857694  -2.684845  -2.479164
  log_lik[4] -3.251114 0.3052603  -3.9561235  -3.4243398  -3.206582  -3.039089  -2.787134
  log_lik[5] -3.003003 0.2029217  -3.4335834  -3.1209541  -2.986768  -2.863224  -2.662462
  log_lik[6] -2.940562 0.1925755  -3.3364549  -3.0681762  -2.917111  -2.814044  -2.591560
  log_lik[7] -3.020188 0.1919842  -3.4198148  -3.1470315  -2.996127  -2.894281  -2.655380
  log_lik[8] -3.982608 0.3579581  -4.8131497  -4.1638441  -3.950991  -3.733416  -3.437800
  log_lik[9] -3.321859 0.2040838  -3.7726359  -3.4409782  -3.305373  -3.181361  -2.957259
  log_lik[10] -3.208818 0.1960512 -3.6238857  -3.3327523  -3.184843  -3.078938  -2.832242
  log_lik[11] -4.017825 0.3872340 -4.9216734  -4.2211122  -3.953278  -3.748361  -3.437021
  log_lik[12] -5.772309 0.9860186 -8.1881140  -6.3149963  -5.623262  -5.067952  -4.249900
  log_lik[13] -3.351522 0.2054336 -3.7889271  -3.4731418  -3.330360  -3.214229  -2.973434
  log_lik[14] -3.479038 0.2325899 -3.9572592  -3.6186142  -3.456716  -3.316510  -3.115289
  log_lik[15] -3.436701 0.2150003 -3.9210547  -3.5676785  -3.414137  -3.297301  -3.061986
  log_lik[16] -4.961413 0.7732680 -6.7995259  -5.3686746  -4.876133  -4.408572  -3.861280
  lp__       -40.012700 1.4208910 -43.5349286 -40.6725931 -39.631028 -38.955392 -38.418625

, , chains = chain:2

         stats
parameter          mean        sd        2.5%         25%        50%        75%      97.5%
  beta[1]    41.387569 2.3501586  36.9401470  39.8348194  41.359489  42.937543  46.111209
  beta[2]     1.124364 0.3673544   0.3566062   0.8805587   1.138574   1.361100   1.805460
  sigma       3.023866 0.5927952   2.1353513   2.5858607   2.920922   3.363023   4.446748
  mu[1]      42.511933 2.1061507  38.3961946  41.0385493  42.473801  43.864066  46.654408
  mu[2]      43.636297 1.9022079  39.9259193  42.3307690  43.605724  44.867031  47.394732
  mu[3]      44.760662 1.7523748  41.4037171  43.6447262  44.739049  45.960363  48.209565
  mu[4]      45.885026 1.6712685  42.3989641  44.8636335  45.912400  46.989998  49.029743
  mu[5]      47.009390 1.6689392  43.6554264  46.0290883  47.049534  48.137699  50.083733
  mu[6]      48.133754 1.7457021  44.5370295  47.0609188  48.210916  49.366050  51.232292
  mu[7]      49.258119 1.8919547  45.3219753  48.0450289  49.393772  50.573615  52.623551
  mu[8]      50.382483 2.0931812  46.0823520  49.0433743  50.509600  51.833755  54.188182
  mu[9]      51.506847 2.3352133  46.6652669  50.0684306  51.699848  52.964821  55.820282
  mu[10]     52.631211 2.6067093  47.3256742  51.0740745  52.926153  54.182575  57.311846
  mu[11]     53.755575 2.8994040  47.9457351  52.0818479  54.027030  55.474568  58.962860
  mu[12]     54.879940 3.2074995  48.4031871  52.9827386  55.202891  56.834914  60.811384
  mu[13]     56.004304 3.5269619  48.8950122  53.8301599  56.256864  58.235318  62.501756
  mu[14]     57.128668 3.8549665  49.4375653  54.6467123  57.410980  59.659023  64.203041
  mu[15]     58.253032 4.1895072  49.8120628  55.4766821  58.525225  61.027451  65.977429
  mu[16]     59.377396 4.5291361  50.4053368  56.3087753  59.571903  62.345379  67.719287
  log_lik[1] -2.256983 0.3961362  -3.3729315  -2.4114361  -2.178736  -1.992598  -1.720997
  log_lik[2] -2.460245 0.2341579  -2.9671122  -2.5959978  -2.425476  -2.303776  -2.065856
  log_lik[3] -2.870771 0.2702772  -3.4828180  -3.0255020  -2.855847  -2.683271  -2.429853
  log_lik[4] -3.249038 0.2913879  -3.9561643  -3.4035476  -3.216322  -3.033241  -2.803796
  log_lik[5] -2.992498 0.1921351  -3.3874051  -3.1265917  -2.992668  -2.856202  -2.631395
  log_lik[6] -2.930363 0.1922738  -3.3242045  -3.0644053  -2.912951  -2.793122  -2.597045
  log_lik[7] -3.010712 0.1926961  -3.4005792  -3.1433367  -2.989037  -2.870833  -2.695970
  log_lik[8] -3.979931 0.3707352  -4.9000950  -4.1897195  -3.882413  -3.715182  -3.466265
  log_lik[9] -3.321140 0.1993858  -3.7032030  -3.4472564  -3.314551  -3.188137  -2.948161
  log_lik[10] -3.198711 0.1961180 -3.6004363  -3.3272556  -3.177500  -3.058721  -2.876940
  log_lik[11] -4.036147 0.3747043 -4.9001937  -4.2519599  -3.998740  -3.780704  -3.468442
  log_lik[12] -5.772195 1.0073231 -8.1079058  -6.3556736  -5.623463  -5.066196  -4.299187
  log_lik[13] -3.339984 0.2014331 -3.7615858  -3.4646174  -3.317085  -3.197151  -2.996124
  log_lik[14] -3.461620 0.2266057 -3.9756796  -3.6024832  -3.437863  -3.292926  -3.081081
  log_lik[15] -3.424779 0.2067852 -3.8535273  -3.5532130  -3.400772  -3.274675  -3.068120
  log_lik[16] -5.005918 0.7519932 -6.5710436  -5.4318842  -4.902504  -4.457718  -3.916247
  lp__       -39.836644 1.2070517 -43.0354449 -40.4515601 -39.502016 -38.916113 -38.472828

, , chains = chain:3

         stats
parameter          mean        sd        2.5%         25%        50%        75%      97.5%
  beta[1]    41.412844 2.5005399  36.1563135  39.8535638  41.405371  42.922031  46.586136
  beta[2]     1.120876 0.3975358   0.2691932   0.8855867   1.114963   1.394922   1.946018
  sigma       3.064565 0.6603598   2.0961555   2.6181457   2.953643   3.393950   4.607653
  mu[1]      42.533721 2.2143251  38.0572641  41.1492681  42.580736  43.889740  47.172894
  mu[2]      43.654597 1.9671910  39.9193431  42.4087083  43.691735  44.847774  47.737954
  mu[3]      44.775473 1.7755323  41.3192544  43.6490241  44.828206  45.895562  48.598482
  mu[4]      45.896349 1.6586919  42.6263564  44.7889704  45.940357  46.889625  49.344418
  mu[5]      47.017226 1.6328110  43.7877095  45.9232632  46.972680  48.002241  50.319649
  mu[6]      48.138102 1.7020442  44.8627950  47.0337988  48.075746  49.229521  51.548975
  mu[7]      49.258978 1.8557764  45.4183344  48.1124501  49.264211  50.413801  52.969915
  mu[8]      50.379854 2.0753138  46.0888917  49.1652269  50.378598  51.518357  54.575569
  mu[9]      51.500731 2.3422250  46.6227881  50.0573726  51.456317  52.757954  56.148035
  mu[10]     52.621607 2.6421919  47.0440883  51.0730300  52.541727  54.076264  58.031888
  mu[11]     53.742483 2.9651994  47.4296843  51.9871419  53.709935  55.440968  59.824271
  mu[12]     54.863359 3.3044979  47.6901479  52.8668727  54.828282  56.771707  61.711814
  mu[13]     55.984235 3.6555539  48.0366720  53.7732189  55.976891  58.096926  63.490503
  mu[14]     57.105112 4.0152847  48.3767095  54.7146668  57.002651  59.492655  65.313177
  mu[15]     58.225988 4.3815543  48.7153556  55.5886539  58.165211  60.827961  67.044824
  mu[16]     59.346864 4.7528512  49.1759795  56.4162857  59.314612  62.223581  68.682883
  log_lik[1] -2.284202 0.4654831  -3.4855222  -2.4231831  -2.173146  -1.996454  -1.757730
  log_lik[2] -2.472839 0.2532848  -3.0723563  -2.6242692  -2.452670  -2.301271  -2.071693
  log_lik[3] -2.882336 0.2866934  -3.5389797  -3.0292128  -2.846143  -2.684548  -2.435088
  log_lik[4] -3.247331 0.2792325  -3.8609855  -3.4064961  -3.221445  -3.064650  -2.737466
  log_lik[5] -3.002308 0.2018522  -3.4233435  -3.1318872  -2.981898  -2.863587  -2.609326
  log_lik[6] -2.939871 0.2025334  -3.3531087  -3.0603725  -2.924456  -2.798436  -2.581108
  log_lik[7] -3.020528 0.2028057  -3.4369340  -3.1419938  -3.002460  -2.878680  -2.671117
  log_lik[8] -3.975784 0.3656808  -4.8837286  -4.1519040  -3.913367  -3.748051  -3.416039
  log_lik[9] -3.332216 0.2130229  -3.7953188  -3.4496174  -3.314124  -3.196204  -2.945342
  log_lik[10] -3.210942 0.2073172 -3.6294905  -3.3351649  -3.192107  -3.068282  -2.850840
  log_lik[11] -4.042896 0.4014434 -5.0650864  -4.2467461  -3.974052  -3.777690  -3.439002
  log_lik[12] -5.752027 1.0316442 -8.2617409  -6.2387360  -5.568174  -5.035430  -4.260218
  log_lik[13] -3.355376 0.2158043 -3.8084198  -3.4733959  -3.339935  -3.208145  -2.973054
  log_lik[14] -3.476424 0.2409242 -4.0428982  -3.6146913  -3.450678  -3.311357  -3.075989
  log_lik[15] -3.442363 0.2245851 -3.9441121  -3.5792966  -3.425329  -3.294395  -3.058617
  log_lik[16] -5.008369 0.8010282 -6.9388527  -5.4170407  -4.837982  -4.412641  -3.832509
  lp__       -39.968826 1.3964822 -43.6992157 -40.6423614 -39.549928 -38.928931 -38.429648
# OR
library(broom)
tidyMCMC(data.het.rstan, conf.int = TRUE, conf.method = "HPDinterval",
    rhat = TRUE, ess = TRUE)
          term  estimate std.error   conf.low conf.high      rhat  ess
1      beta[1] 41.442992 2.5568358 36.3065961 46.710851 0.9990098 1199
2      beta[2]  1.115389 0.3915228  0.3935158  1.950012 0.9989519 1218
3        sigma  3.044974 0.6158596  2.0240661  4.244559 0.9996971 1453
4        mu[1] 42.558381 2.2801513 38.2161837 47.424090 0.9989272 1194
5        mu[2] 43.673770 2.0414090 39.8751614 48.073139 0.9987994 1189
6        mu[3] 44.789158 1.8553145 41.2587511 48.790015 0.9986200 1187
7        mu[4] 45.904547 1.7388540 42.6584041 49.583352 0.9984065 1149
8        mu[5] 47.019936 1.7063455 43.8359191 50.635564 0.9982116 1165
9        mu[6] 48.135324 1.7624406 44.8519818 51.900358 0.9980950 1192
10       mu[7] 49.250713 1.8993049 45.5863836 53.280343 0.9980736 1224
11       mu[8] 50.366102 2.1012143 46.1162274 54.548014 0.9981195 1243
12       mu[9] 51.481490 2.3514727 46.6227738 56.048615 0.9981955 1259
13      mu[10] 52.596879 2.6363473 47.4156477 57.960830 0.9982772 1270
14      mu[11] 53.712268 2.9458123 48.4289047 60.162033 0.9983530 1273
15      mu[12] 54.827656 3.2728999 48.7551524 61.736528 0.9984194 1270
16      mu[13] 55.943045 3.6128268 48.9284807 63.162194 0.9984760 1265
17      mu[14] 57.058434 3.9622899 49.8414423 65.330459 0.9985240 1260
18      mu[15] 58.173822 4.3189751 49.9295381 66.839547 0.9985645 1256
19      mu[16] 59.289211 4.6812317 50.4342562 68.776186 0.9985990 1253
20  log_lik[1] -2.297277 0.4679187 -3.2418483 -1.604999 1.0031401 1143
21  log_lik[2] -2.476510 0.2560505 -2.9617001 -1.979047 1.0007140 1221
22  log_lik[3] -2.883341 0.2874254 -3.4492556 -2.395858 0.9996992 1163
23  log_lik[4] -3.249161 0.2919630 -3.9051849 -2.771500 0.9984121 1277
24  log_lik[5] -2.999270 0.1989539 -3.4218408 -2.646321 0.9991651 1190
25  log_lik[6] -2.936932 0.1957769 -3.3233935 -2.579678 0.9991954 1405
26  log_lik[7] -3.017143 0.1958132 -3.4297453 -2.682976 0.9990995 1424
27  log_lik[8] -3.979441 0.3645965 -4.8028118 -3.407681 0.9986004 1249
28  log_lik[9] -3.325072 0.2055005 -3.7121453 -2.922404 0.9986628 1302
29 log_lik[10] -3.206157 0.1998369 -3.5952655 -2.829848 0.9990402 1412
30 log_lik[11] -4.032289 0.3878336 -4.7700990 -3.364148 0.9988735 1291
31 log_lik[12] -5.765511 1.0078730 -7.9007419 -4.167272 0.9989939 1330
32 log_lik[13] -3.348961 0.2076098 -3.7636119 -2.970973 0.9990472 1419
33 log_lik[14] -3.472361 0.2334174 -3.9053621 -3.019839 0.9988837 1298
34 log_lik[15] -3.434614 0.2155605 -3.8481680 -3.022367 0.9990421 1424
35 log_lik[16] -4.991900 0.7754726 -6.4469669 -3.676209 0.9993335 1321
Conclusions: a one unit increase in x is associated with a 1.1153887 change in y. That is, y increases at a rate of 1.1153887 per unit increase in x. The 95% confidence interval for the slope does not overlap with 0, implying a significant effect of x on y.
While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.
library(coda)
mcmcpvalue <- function(samp) {
    ## elementary version that creates an empirical p-value for the
    ## hypothesis that the columns of samp have mean zero versus a general
    ## multivariate distribution with elliptical contours.

    ## differences from the mean standardized by the observed
    ## variance-covariance factor

    ## Note, I put in the bit for single terms
    if (length(dim(samp)) == 0) {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - mean(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/length(samp)
    } else {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/nrow(samp)
    }
}

mcmcpvalue(as.matrix(data.het.rstan)[, c("beta[2]")])
[1] 0.009333333
With a p-value of 0.0093, we would conclude that there is very little evidence that the slope is equal to zero, suggesting there is a relationship.
Graphical summaries
A nice graphic is often a great accompaniment to a statistical analysis. Although there are no fixed assumptions associated with graphing (in contrast to statistical analyses), we often want the graphical summaries to reflect the associated statistical analyses. After all, the sample is just one perspective on the population(s). What we are more interested in is being able to estimate and depict likely population parameters/trends.
Thus, whilst we could easily provide a plot displaying the raw data along with simple measures of location and spread, arguably, we should use estimates that reflect the fitted model. In this case, it would be appropriate to plot the credibility interval associated with the fitted trend.
library(dplyr)
mcmc = data.het.r2jags$BUGSoutput$sims.matrix
## Calculate the fitted values
newdata = data.frame(x = seq(min(data.het$x, na.rm = TRUE), max(data.het$x,
    na.rm = TRUE), len = 1000))
Xmat = model.matrix(~x, newdata)
coefs = mcmc[, c("beta0", "beta1")]
fit = coefs %*% t(Xmat)
newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
ggplot(newdata, aes(y = estimate, x = x)) + geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "blue", alpha = 0.3) +
    scale_y_continuous("Y") + scale_x_continuous("X") + theme_classic()
If you wanted to represent sample data on the figure in such a simple example (single predictor) we could simply over- (or under-) lay the raw data.
ggplot(newdata, aes(y = estimate, x = x)) +
    geom_point(data = data.het, aes(y = y, x = x), color = "gray") + geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "blue", alpha = 0.3) +
    scale_y_continuous("Y") + scale_x_continuous("X") + theme_classic()
A more general solution would be to add the partial residuals to the figure. Partial residuals are the fitted values plus the residuals. In this simple case, that equates to exactly the same as the raw observations since $resid = obs - fitted$ and the fitted values depend only on the single predictor we are interested in.
## Calculate partial residuals fitted values
fdata = rdata = data.het
fMat = rMat = model.matrix(~x, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data.het$y - apply(coefs, 2, median) %*% t(rMat))
rdata = rdata %>% mutate(partial.resid = resid + fit)
ggplot(newdata, aes(y = estimate, x = x)) +
    geom_point(data = rdata, aes(y = partial.resid), color = "gray") + geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "blue", alpha = 0.3) +
    scale_y_continuous("Y") + scale_x_continuous("X") + theme_classic()
mcmc = as.matrix(data.het.rstan)
## Calculate the fitted values
newdata = data.frame(x = seq(min(data.het$x, na.rm = TRUE), max(data.het$x,
    na.rm = TRUE), len = 1000))
Xmat = model.matrix(~x, newdata)
coefs = mcmc[, c("beta[1]", "beta[2]")]
fit = coefs %*% t(Xmat)
newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
ggplot(newdata, aes(y = estimate, x = x)) + geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "blue", alpha = 0.3) +
    scale_y_continuous("Y") + scale_x_continuous("X") + theme_classic()
If you wanted to represent sample data on the figure in such a simple example (single predictor) we could simply over- (or under-) lay the raw data.
ggplot(newdata, aes(y = estimate, x = x)) +
    geom_point(data = data.het, aes(y = y, x = x), color = "gray") + geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "blue", alpha = 0.3) +
    scale_y_continuous("Y") + scale_x_continuous("X") + theme_classic()
A more general solution would be to add the partial residuals to the figure. Partial residuals are the fitted values plus the residuals. In this simple case, they equate exactly to the raw observations, since $resid = obs - fitted$ and the fitted values depend only on the single predictor we are interested in.
## Calculate the partial residuals (fitted values plus residuals)
fdata = rdata = data.het
fMat = rMat = model.matrix(~x, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data.het$y - apply(coefs, 2, median) %*% t(rMat))
rdata = rdata %>% mutate(partial.resid = resid + fit)
ggplot(newdata, aes(y = estimate, x = x)) +
    geom_point(data = rdata, aes(y = partial.resid), color = "gray") +
    geom_line() +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "blue", alpha = 0.3) +
    scale_y_continuous("Y") + scale_x_continuous("X") +
    theme_classic()
$R^2$
In a frequentist context, the $R^2$ value is seen as a useful indicator of goodness of fit. Whilst it has long been acknowledged that this measure is not appropriate for comparing models (for such purposes information criteria such as AIC are more appropriate), it is nevertheless useful for estimating the amount (percent) of variance explained by the model.
In a frequentist context, $R^2$ is calculated as the variance in predicted values divided by the variance in the observed (response) values.
Unfortunately, this classical formulation does not translate simply into a Bayesian context, since the equivalently calculated numerator can be larger than the equivalently calculated denominator - thereby resulting in an $R^2$ greater than 100%. Gelman, Goodrich, Gabry, and Ali (2017) proposed an alternative formulation in which the denominator comprises the sum of the explained variance and the variance of the residuals.
So in the standard regression model notation of:
$$
\begin{align}
y_i &\sim{} N(\mu_i, \sigma)\\
\mu_i &= \mathbf{X}_i\boldsymbol{\beta}
\end{align}
$$
the $R^2$ could be formulated as:
$$
R^2 = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_e}
$$
where $\sigma^2_f = var(\mu)$ ($\mu = \mathbf{X}\boldsymbol{\beta}$) and, for Gaussian models, $\sigma^2_e = var(y-\mu)$.
library(broom)
mcmc <- data.het.r2jags$BUGSoutput$sims.matrix
Xmat = model.matrix(~x, data.het)
coefs = mcmc[, c("beta0", "beta1")]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data.het$y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
  term  estimate std.error  conf.low conf.high
1 var1 0.2426089 0.1090922 0.0324328 0.4434094
# for comparison with the frequentist equivalent
summary(lm(y ~ x, data.het))
Heteroscedasticity with categorical predictors
For regression models that include a categorical variable (e.g. ANOVA), heterogeneity manifests as vastly different variances for different levels (treatment groups) of the categorical variable. Recall that this is diagnosed from the relative sizes of boxplots. Whilst the degree of group variability may not be related to the means of the groups, having wildly different variances does inflate standard errors and thus lower power.
In such cases, we would like to be able to indicate that the variances should be estimated separately for each group. That is, the variance term is multiplied by a different weight for each group, yielding a diagonal variance matrix with a separate variance per group (this is the structure that frequentist tools such as nlme refer to as varIdent).
Again, to assist in the explanation, some fabricated ANOVA data - data that has heteroscedasticity by design - will be useful.
set.seed(1)
ngroups <- 5  #number of populations
nsample <- 10  #number of reps in each
pop.means <- c(40, 45, 55, 40, 30)  #population mean length
sigma <- rep(c(6, 4, 2, 0.5, 1), each = nsample)  #residual standard deviation
n <- ngroups * nsample  #total sample size
eps <- rnorm(n, 0, sigma)  #residuals
x <- gl(ngroups, nsample, n, lab = LETTERS[1:5])  #factor
means <- rep(pop.means, rep(nsample, ngroups))
X <- model.matrix(~x - 1)  #create a design matrix
y <- as.numeric(X %*% pop.means + eps)
data.het1 <- data.frame(y, x)
boxplot(y ~ x, data.het1)
plot(lm(y ~ x, data.het1), which = 3)
It is clear that there is gross heteroscedasticity: the residuals are obviously more spread in some groups than in others, yet there is no real pattern with the means (the residual plot does not show an obvious wedge). Note that for assessing homogeneity of variance, it is best to use standardized residuals.
It turns out that if we switch over to maximum (log) likelihood estimation methods, we can model in a within-group heteroscedasticity structure rather than just assume one very narrow form of variance structure.
Let's take a step back and reflect on our simple ANOVA (regression) model that has five groups each with 10 observations: $$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad \varepsilon_{ij} \sim \mathcal{N}(0,\sigma^2)$$ This is shorthand notation to indicate that the response variable is being modelled against a specific linear predictor and that the residuals follow a normal distribution with a certain variance (the same for each group).
Rather than assume that the variance of each group is the same, we could relax this a little so as to permit different levels of variance per group: $$\varepsilon_{ij} \sim \mathcal{N}(0,\sigma_i^2)$$
To achieve this, we multiply the variance by a group-specific weight, where the weight for each group is the ratio of that group's standard deviation to the standard deviation of the first (reference) group: $$\varepsilon_{ij} \sim \mathcal{N}(0,\sigma^2\times \omega_i)$$
So returning to our five groups of 10 observations example, the weights would be determined as:
data.het1.sd <- with(data.het1, tapply(y, x, sd))
1/(data.het1.sd[1]/data.het1.sd)
         A          B          C          D          E
1.00000000 0.91342905 0.40807277 0.08632027 0.12720488
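As an aside, this per-group weighting is exactly what a frequentist generalized least squares fit applies via the varIdent variance structure in the nlme package. A minimal sketch for comparison (assuming nlme is installed):

## frequentist comparison: gls with a separate residual variance per
## group via the varIdent variance structure
library(nlme)
data.het1.gls <- gls(y ~ x, data = data.het1,
    weights = varIdent(form = ~1 | x))
summary(data.het1.gls)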
Model fitting or statistical analysis
All the Bayesian linear modelling tutorials so far (Tutorial 7.x) have demonstrated modelling using a variety of tools (such as MCMCpack, JAGS, RSTAN, RSTANARM and BRMS). This is also the intention of this tutorial. However, MCMCpack, RSTANARM and BRMS do not support heterogeneous variances and thus these tools will not be demonstrated. Whilst JAGS and RSTAN are extremely flexible and thus allow models to be formulated that contain not only the simple model, but also additional derivatives, the other approaches are more restrictive. Consequently, I will mostly restrict models to just the minimum necessary and all derivatives will instead be calculated in R itself from the returned posteriors.
The observed response ($y_i$) are assumed to be drawn from a normal distribution with a given mean ($\mu$) and a standard deviation weighted by a group-specific multiplier ($\sigma\times\omega$). The expected values ($\mu$) are themselves determined by the linear predictor ($\beta_0 + \beta X_i$). In this case, $\beta_0$ represents the mean of the first group and the set of $\beta$'s represent the differences between each other group and the first group.
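To see this (treatment contrasts) parameterization concretely, it can help to inspect the model matrix itself - the first column codes the intercept (group A) and the remaining columns flag membership of groups B through E:

## one row per unique group; column 1 is the intercept (group A),
## the remaining columns indicate membership of groups B-E
unique(model.matrix(~x, data.het1))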
MCMC sampling requires priors on all parameters. We will employ weakly informative priors. Specifying 'uninformative' priors is always a bit of a balancing act. If the priors are too vague (wide), the MCMC sampler can wander off into nonsense areas of the likelihood rather than concentrate around areas of highest likelihood (desired when wanting the outcomes to be largely driven by the data). On the other hand, if the priors are too strong, they may have an undue influence on the parameters. In such a simple model, this balance is very forgiving - it is for more complex models that prior choice becomes more important.
For this simple model, we will go with zero-centered Gaussian (normal) priors with relatively large standard deviations (1000) for both the intercept and the treatment effects, and a wide half-Cauchy (scale=5) for the standard deviation. $$ \begin{align} y_i &\sim{} N(\mu_i, \sigma\times\omega)\\ \mu_i &= \beta_0 + \beta X_i\\[1em] \beta_0 &\sim{} N(0,1000)\\ \beta &\sim{} N(0,1000)\\ \sigma &\sim{} cauchy(0,5)\\ \end{align} $$
Specific formulation
For very simple models such as this example, we can write the models as: $$\begin{align} y_i&\sim{}N(\mu_i, \tau_{x_i})\\ \mu_i &= \beta_0 + \beta X_i\\ \beta_0&\sim{}N(0,1.0{E-6}) \hspace{1cm}\mathsf{non-informative~prior~for~intercept}\\ \beta_j&\sim{}N(0,1.0{E-6}) \hspace{1cm}\mathsf{non-informative~prior~for~partial~slopes}\\ \sigma_j &=z_j/\sqrt{\chi_j}\\ z_j&\sim{}N(0,0.04)I(0,)\\ \chi_j&\sim{}\Gamma(0.5,0.5)\\ \tau_j &= \sigma_j^{-2}\\ \end{align} $$
Define the model
Note that in the following model a separate variance (precision) is estimated for each group.
modelString = "
model {
    #Likelihood
    for (i in 1:n) {
        y[i] ~ dnorm(mu[i], tau[x[i]])
        mu[i] <- inprod(beta[], X[i,])
    }
    #Priors and derivatives
    for (i in 1:ngroups) {
        beta[i] ~ dnorm(0, 1.0E-6)
        sigma[i] <- z[i]/sqrt(chSq[i])   #half-Cauchy prior on sigma
        z[i] ~ dnorm(0, 0.04)I(0,)
        chSq[i] ~ dgamma(0.5, 0.5)
        tau[i] <- pow(sigma[i], -2)
    }
}
"
Arrange the data as a list (as required by BUGS). As input, JAGS will need to be supplied with:
- the response variable (y)
- a numeric representation of the predictor variable (x)
- the model matrix (X)
- the total number of observed items (n)
- the number of groups (ngroups)
X = model.matrix(~x, data.het1)
data.het1.list <- with(data.het1, list(y = y, x = as.numeric(x), X = X,
    n = nrow(data.het1), ngroups = ncol(X)))
Define the MCMC chain parameters
Next we should define the behavioural parameters of the MCMC sampling chains. Include the following:
- the nodes (estimated parameters) to monitor (return samples for)
- the number of MCMC chains (3)
- the number of burnin steps (3000)
- the thinning factor (10)
- the number of MCMC iterations - determined by the number of samples to save, the rate of thinning and the number of chains
params <- c("beta", "sigma")
nChains = 3
burnInSteps = 3000
thinSteps = 10
numSavedSteps = 15000  #across all chains
nIter = ceiling(burnInSteps + (numSavedSteps * thinSteps)/nChains)
nIter
[1] 53000
Fit the model
Now run the JAGS code via the R2jags interface. Note that the first time jags is run after the R2jags package is loaded, it is often necessary to run any kind of randomization function just to initiate the .Random.seed variable.
## load the R2jags package
library(R2jags)
data.het1.r2jags <- jags(data = data.het1.list, inits = NULL,
    parameters.to.save = params,
    model.file = textConnection(modelString),
    n.chains = nChains, n.iter = nIter,
    n.burnin = burnInSteps, n.thin = thinSteps)
Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Graph information:
   Observed stochastic nodes: 50
   Unobserved stochastic nodes: 15
   Total graph size: 484

Initializing model
print(data.het1.r2jags)
Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   40.785   1.654  37.484  39.758  40.788  41.829  44.042 1.001 13000
beta[2]    5.196   2.246   0.729   3.736   5.203   6.662   9.629 1.001 15000
beta[3]   13.945   1.811  10.353  12.783  13.963  15.105  17.551 1.001 15000
beta[4]   -0.725   1.661  -4.015  -1.773  -0.723   0.316   2.603 1.001 12000
beta[5]  -10.649   1.671 -13.947 -11.701 -10.642  -9.600  -7.302 1.001 11000
sigma[1]   5.092   1.310   3.242   4.164   4.861   5.738   8.295 1.001  8700
sigma[2]   4.676   1.231   2.968   3.830   4.459   5.266   7.646 1.001 15000
sigma[3]   2.184   0.601   1.360   1.769   2.069   2.477   3.672 1.001 15000
sigma[4]   0.473   0.139   0.288   0.377   0.446   0.537   0.812 1.001 15000
sigma[5]   0.697   0.203   0.425   0.558   0.658   0.789   1.214 1.001  3800
deviance 192.691   5.109 184.841 188.956 191.923 195.616 204.517 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 13.1 and DIC = 205.7
DIC is an estimate of expected predictive error (lower deviance is better).
data.mcmc.list <- as.mcmc(data.het1.r2jags)
Whilst Gibbs sampling provides an elegantly simple MCMC sampling routine, very complex hierarchical models can take enormous numbers of iterations (often prohibitively large) to converge on a stable posterior distribution.
To address this, Andrew Gelman (and other collaborators) have implemented a variation on Hamiltonian Monte Carlo (HMC: a sampler that selects subsequent samples in a way that reduces the correlation between samples, thereby speeding up convergence) called the No-U-Turn (NUTS) sampler. All of these developments are brought together into a tool called Stan ("Sampling Through Adaptive Neighborhoods").
By design (to appeal to the vast number of existing BUGS users), Stan models are defined in a manner reminiscent of BUGS. Stan first converts these models into C++ code, which is then compiled to allow very rapid computation.
Consistent with the use of C++, the model must be accompanied by variable declarations for all inputs and parameters.
One important difference between Stan and JAGS is that whereas BUGS (and thus JAGS) parameterize the normal distribution in terms of precision, Stan parameterizes it in terms of standard deviation.
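As a quick illustration of the correspondence between the two parameterizations, note that a JAGS precision of 0.04 equates to a Stan standard deviation of 5:

## JAGS: y[i] ~ dnorm(mu[i], tau)   (tau is a precision)
## Stan: y ~ normal(mu, sigma)      (sigma is a standard deviation)
tau <- 0.04           # example precision
sigma <- 1/sqrt(tau)  # equivalent standard deviation (5)
tau2 <- sigma^-2      # and back to a precision (0.04)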
Stan itself is a stand-alone command line application. However, conveniently, the authors of Stan have also developed an R interface to Stan called Rstan which can be used much like R2jags.
Model matrix formulation
The minimum model in Stan required to fit the above model follows. Note the following modifications from the model defined in JAGS:
- the normal distribution is defined by standard deviation rather than precision
- rather than using a uniform prior for sigma, I am using a half-Cauchy
We now translate the likelihood model into STAN code.
$$\begin{align}
y_i&\sim{}N(\mu_i, \sigma_i)\\
\mu_i &= \mathbf{X}_i\boldsymbol{\beta}\\
\sigma_i &= \mathbf{X}_i\boldsymbol{\sigma}\\
\beta_j&\sim{}N(0,1000)\\
\sigma_j&\sim{}\text{Cauchy}(0,5)\\
\end{align}
$$
Define the model
modelString = "
data {
    int<lower=1> n;
    int<lower=1> nX;
    vector [n] y;
    matrix [n,nX] X;
}
parameters {
    vector[nX] beta;
    vector<lower=0>[nX] sigma;
}
transformed parameters {
    vector[n] mu;
    vector<lower=0>[n] sig;
    mu = X*beta;
    sig = X*sigma;
}
model {
    #Likelihood
    y~normal(mu,sig);

    #Priors
    beta ~ normal(0,1000);
    sigma~cauchy(0,5);
}
generated quantities {
    vector[n] log_lik;
    for (i in 1:n) {
        log_lik[i] = normal_lpdf(y[i] | mu[i], sig[i]);
    }
}
"
Define the data
Arrange the data as a list (as required by Stan). As input, Stan will need to be supplied with:
- the response variable (y)
- the predictor model matrix (X)
- the total number of observed items (n)
- the number of columns of the model matrix (nX)
Xmat <- model.matrix(~x, data.het1)
data.het1.list <- with(data.het1, list(y = y, X = Xmat,
    n = nrow(data.het1), nX = ncol(Xmat)))
Fit the model
Now run the STAN code via the rstan interface.
## load the rstan package
library(rstan)
data.het1.rstan <- stan(data = data.het1.list, model_code = modelString,
    chains = 3, iter = 2000, warmup = 500, thin = 3)
SAMPLING FOR MODEL 'e2ff7293cce3da0e2847b4a953e89d26' NOW (CHAIN 1).

Gradient evaluation took 2.8e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.28 seconds.
Adjust your expectations accordingly!

Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.081552 seconds (Warm-up)
               0.175503 seconds (Sampling)
               0.257055 seconds (Total)

SAMPLING FOR MODEL 'e2ff7293cce3da0e2847b4a953e89d26' NOW (CHAIN 2).

Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!

Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.087268 seconds (Warm-up)
               0.137277 seconds (Sampling)
               0.224545 seconds (Total)

SAMPLING FOR MODEL 'e2ff7293cce3da0e2847b4a953e89d26' NOW (CHAIN 3).

Gradient evaluation took 1.3e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
Adjust your expectations accordingly!

Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.085839 seconds (Warm-up)
               0.120639 seconds (Sampling)
               0.206478 seconds (Total)
print(data.het1.rstan, par = c("beta", "sigma"))
Inference for Stan model: e2ff7293cce3da0e2847b4a953e89d26.
3 chains, each with iter=2000; warmup=500; thin=3;
post-warmup draws per chain=500, total post-warmup draws=1500.

           mean se_mean   sd   2.5%    25%    50%   75% 97.5% n_eff Rhat
beta[1]   40.77    0.02 0.83  39.16  40.22  40.78 41.33 42.40  1132    1
beta[2]    5.27    0.05 1.78   1.50   4.16   5.35  6.40  8.60  1387    1
beta[3]   13.98    0.03 1.33  11.38  13.12  13.95 14.84 16.74  1500    1
beta[4]   -0.74    0.04 1.32  -3.29  -1.66  -0.73  0.15  1.79  1316    1
beta[5]  -10.66    0.03 1.28 -13.12 -11.53 -10.65 -9.80 -8.11  1375    1
sigma[1]   2.70    0.01 0.34   2.16   2.46   2.67  2.89  3.45  1395    1
sigma[2]   2.04    0.03 1.21   0.31   1.20   1.84  2.65  4.96  1391    1
sigma[3]   0.56    0.02 0.58   0.02   0.17   0.38  0.78  2.05  1500    1
sigma[4]   0.38    0.01 0.43   0.01   0.10   0.25  0.51  1.51  1500    1
sigma[5]   0.37    0.01 0.44   0.01   0.10   0.23  0.50  1.53  1500    1

Samples were drawn using NUTS(diag_e) at Fri Jan 12 14:25:24 2018.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
MCMC diagnostics
In addition to the regular model diagnostic checks (such as residual plots), for Bayesian analyses, it is necessary to explore the characteristics of the MCMC chains and the sampler in general. Recall that the purpose of MCMC sampling is to replicate the posterior distribution of the model likelihood and priors by drawing a known number of samples from this posterior (thereby formulating a probability distribution). This is only reliable if the MCMC samples accurately reflect the posterior.
Unfortunately, since we only know the posterior in the most trivial of circumstances, it is necessary to rely on indirect measures of how accurately the MCMC samples are likely to reflect the likelihood. I will briefly outline the most important diagnostics; however, please refer to Tutorial 4.3, Section 3.1: Markov Chain Monte Carlo sampling for a discussion of these diagnostics.
- Traceplots for each parameter illustrate the MCMC sample values after each successive iteration along the chain. Bad chain mixing (characterized by any sort of pattern) suggests that the MCMC sampling chains may not have completely traversed all features of the posterior distribution and that more iterations are required to ensure the distribution has been accurately represented.
- Autocorrelation plots for each parameter illustrate the degree of correlation between MCMC samples separated by different lags. For example, a lag of 0 represents the degree of correlation between each MCMC sample and itself (obviously this will be a correlation of 1). A lag of 1 represents the degree of correlation between each MCMC sample and the next sample along the chain, and so on. In order to be able to generate unbiased estimates of parameters, the MCMC samples should be independent (uncorrelated). In the figures below, this would be violated in the top autocorrelation plot and met in the bottom autocorrelation plot.
- The Rhat statistic for each parameter provides a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater, it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been; more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
Again, prior to examining the summaries, we should have explored the convergence diagnostics.
library(coda)
data.het1.mcmc = as.mcmc(data.het1.r2jags)
- Trace plots
plot(data.het1.mcmc)
When there are a lot of parameters, this can result in a very large number of traceplots. To focus on just certain parameters (such as the $\beta$s), we can subset the chains by parameter name. Note that the names must match those used in the JAGS model ("beta[1]", "beta[2]", and so on); supplying names that do not exist in the posterior would trigger a subscript-out-of-bounds error.
preds <- c("beta[1]", "beta[2]")
plot(as.mcmc(data.het1.r2jags)[, preds])
- Raftery diagnostic
raftery.diag(data.het1.mcmc)
[[1]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95

          Burn-in  Total  Lower bound  Dependence
          (M)      (N)    (Nmin)       factor (I)
 beta[1]  20       39950  3746         10.70
 beta[2]  20       35610  3746          9.51
 beta[3]  10       37120  3746          9.91
 beta[4]  20       36530  3746          9.75
 beta[5]  10       37120  3746          9.91
 deviance 20       38660  3746         10.30
 sigma[1] 20       38660  3746         10.30
 sigma[2] 20       38660  3746         10.30
 sigma[3] 20       39300  3746         10.50
 sigma[4] 20       39300  3746         10.50
 sigma[5] 20       36800  3746          9.82

[[2]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95

          Burn-in  Total  Lower bound  Dependence
          (M)      (N)    (Nmin)       factor (I)
 beta[1]  20       36800  3746          9.82
 beta[2]  20       37410  3746          9.99
 beta[3]  20       36200  3746          9.66
 beta[4]  20       37410  3746          9.99
 beta[5]  20       38030  3746         10.20
 deviance 20       37410  3746          9.99
 sigma[1] 20       38030  3746         10.20
 sigma[2] 20       36200  3746          9.66
 sigma[3] 20       37410  3746          9.99
 sigma[4] 20       38660  3746         10.30
 sigma[5] 20       37410  3746          9.99

[[3]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95

          Burn-in  Total  Lower bound  Dependence
          (M)      (N)    (Nmin)       factor (I)
 beta[1]  20       38050  3746         10.20
 beta[2]  20       38030  3746         10.20
 beta[3]  20       37410  3746          9.99
 beta[4]  20       36800  3746          9.82
 beta[5]  20       36800  3746          9.82
 deviance 20       38660  3746         10.30
 sigma[1] 20       37410  3746          9.99
 sigma[2] 20       36200  3746          9.66
 sigma[3] 20       38030  3746         10.20
 sigma[4] 20       36200  3746          9.66
 sigma[5] 20       38660  3746         10.30
- Autocorrelation diagnostic
autocorr.diag(data.het1.mcmc)
             beta[1]      beta[2]       beta[3]      beta[4]       beta[5]      deviance
Lag 0    1.000000000  1.000000000  1.0000000000  1.000000000  1.0000000000  1.0000000000
Lag 10   0.006603699  0.018909630  0.0009571795  0.005684844  0.0087022166  0.0213191727
Lag 50  -0.002067179 -0.011985435 -0.0033123020 -0.001565876 -0.0007810205  0.0043646350
Lag 100 -0.012927351  0.010332369 -0.0174684798 -0.012247736 -0.0119502509 -0.0007506807
Lag 500 -0.003686503  0.005948948 -0.0015486610 -0.003904431 -0.0040623726  0.0066224862
            sigma[1]      sigma[2]     sigma[3]     sigma[4]     sigma[5]
Lag 0    1.000000000  1.0000000000  1.000000000  1.000000000  1.000000000
Lag 10  -0.004448528 -0.0163243558  0.012285807  0.038074420  0.022099702
Lag 50   0.002346916  0.0006850755  0.009547599 -0.005532154 -0.005737194
Lag 100  0.001769861 -0.0081116144 -0.007363180 -0.004142998  0.011675582
Lag 500  0.009409273  0.0013516649  0.007830877 -0.000662670  0.025155077
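The coda package can also compute the potential scale reduction factor (Rhat) and the effective sample size directly from the mcmc.list:

## potential scale reduction factor (values close to 1 indicate convergence)
gelman.diag(data.het1.mcmc)
## effective number of (roughly independent) samples per parameter
effectiveSize(data.het1.mcmc)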
Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).
- extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
- use the numerous routines that come with the rstan package
- use the routines that come with the bayesplot package
- explore the diagnostics interactively via shinystan
- via coda
- Traceplots
- Autocorrelation
library(coda)
s = as.array(data.het1.rstan)
wch = grep("beta|sigma", dimnames(s)$parameters)
mcmc <- do.call(mcmc.list, plyr::alply(s[, , wch], 2, as.mcmc))
plot(mcmc)
library(coda)
s = as.array(data.het1.rstan)
wch = grep("beta|sigma", dimnames(s)$parameters)
mcmc <- do.call(mcmc.list, plyr::alply(s[, , wch], 2, as.mcmc))
autocorr.diag(mcmc)
             beta[1]      beta[2]      beta[3]      beta[4]     beta[5]     sigma[1]    sigma[2]
Lag 0    1.000000000  1.000000000  1.000000000  1.000000000  1.00000000  1.000000000  1.00000000
Lag 1    0.108761562  0.011384181 -0.003696379  0.072403523  0.04965819  0.035069519  0.03796329
Lag 5   -0.004933822 -0.015572611  0.007148490 -0.010417865 -0.01970024  0.017409266 -0.03821191
Lag 10  -0.023161487  0.033159056 -0.012393661  0.003153679  0.01995686  0.001256854 -0.01543960
Lag 50  -0.001947863 -0.009388251 -0.033871161 -0.025387437 -0.01485885 -0.044782697 -0.01786053
           sigma[3]     sigma[4]     sigma[5]
Lag 0    1.00000000   1.00000000  1.000000000
Lag 1   -0.03782855  -0.02271615 -0.052463821
Lag 5    0.02018003  -0.02603872  0.007162945
Lag 10  -0.01788819  -0.02657259 -0.007370835
Lag 50   0.04245349  -0.02695635  0.022472988
- via rstan
- Traceplots
stan_trace(data.het1.rstan, pars = c("beta", "sigma"))
- Raftery diagnostic
raftery.diag(data.het1.rstan)
Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95

You need a sample size of at least 3746 with these values of q, r and s
- Autocorrelation diagnostic
stan_ac(data.het1.rstan, pars = c("beta", "sigma"))
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater, it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been; more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
stan_rhat(data.het1.rstan, pars = c("beta", "sigma"))
- Another measure of sampling efficiency is effective sample size (ESS). ESS indicates the number (or proportion) of samples that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria, and when it does so, the previous sample value is reused. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
stan_ess(data.het1.rstan, pars = c("beta", "sigma"))
- via bayesplot
- Trace plots and density plots
library(bayesplot)
mcmc_trace(as.array(data.het1.rstan), regex_pars = "beta|sigma")
library(bayesplot)
mcmc_combo(as.array(data.het1.rstan), regex_pars = "beta|sigma")
- Density plots
library(bayesplot)
mcmc_dens(as.array(data.het1.rstan), regex_pars = "beta|sigma")
- via shinystan
library(shinystan)
launch_shinystan(data.het1.rstan)
- It is worth exploring the influence of our priors.
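One simple, if informal, way to explore prior influence is to refit the model with the prior scales altered and confirm that the posteriors are essentially unchanged. A minimal sketch, assuming the modelString and data list defined above (here the half-Cauchy scale on sigma is simply halved via text substitution):

## informal prior sensitivity check: halve the half-Cauchy scale on sigma
modelString2 <- gsub("cauchy(0,5)", "cauchy(0,2.5)", modelString, fixed = TRUE)
data.het1.rstan2 <- stan(data = data.het1.list, model_code = modelString2,
    chains = 3, iter = 2000, warmup = 500, thin = 3)
## compare the posterior summaries with those of the original fit
print(data.het1.rstan2, par = c("beta", "sigma"))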
Model validation
Model validation involves exploring the model diagnostics and fit to ensure that the model is broadly appropriate for the data. As such, exploration of the residuals should be routine.
For more complex models (those that contain multiple effects), it is also advisable to plot the residuals against each of the individual predictors. For sampling designs that involve sample collection over space or time, it is also a good idea to explore whether there are any temporal or spatial patterns in the residuals.
There are numerous situations (e.g. when applying specific variance-covariance structures to a model) where raw residuals do not reflect the interior workings of the model. Typically, this is because they do not take into account the variance-covariance matrix or assume a very simple variance-covariance matrix. Since the purpose of exploring residuals is to evaluate the model, for these cases, it is arguably better to draw conclusions based on standardized (or studentized) residuals.
Unfortunately, the definitions of standardized and studentized residuals appear to vary and the two terms get used interchangeably. I will adopt the following definitions:
Standardized residuals: the raw residuals divided by the true standard deviation of the residuals (which of course is rarely known).
Studentized residuals: the raw residuals divided by the standard deviation of the residuals. Note that externally studentized residuals are calculated by dividing the raw residuals by a unique standard deviation for each observation, calculated from regressions having left each successive observation out.
Pearson residuals: the raw residuals divided by the standard deviation of the response variable.
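For a simple frequentist linear model these flavours can be extracted (or computed) directly in R, which can be a useful cross-check. A small illustration on the same data: rstandard() and rstudent() are base R functions, and the Pearson version here just follows the definition above.

## residual flavours for the frequentist fit of the same data
data.het1.lm <- lm(y ~ x, data = data.het1)
head(rstandard(data.het1.lm))              # standardized (internally studentized)
head(rstudent(data.het1.lm))               # externally studentized
head(resid(data.het1.lm)/sd(data.het1$y))  # Pearson, as defined above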
The mark of a good model is being able to predict well. In an ideal world, we would have sufficiently large sample size as to permit us to hold a fraction (such as 25%) back thereby allowing us to train the model on 75% of the data and then see how well the model can predict the withheld 25%. Unfortunately, such a luxury is still rare in ecology.
The next best option is to see how well the model can predict the observed data. Models tend to struggle most with the extremes of trends and have particular issues when the extremes approach logical boundaries (such as zero for count data and standard deviations). We can use the fitted model to generate random predicted observations and then explore some properties of these compared to the actual observed data.
Although residuals can be computed directly within R2jags, we can calculate them manually from the posteriors to be consistent across other approaches.
mcmc = data.het1.r2jags$BUGSoutput$sims.matrix
wch = grep("beta", colnames(mcmc))
# generate a model matrix
newdata = data.frame(x = data.het1$x)
Xmat = model.matrix(~x, newdata)
## get median parameter estimates
coefs = apply(mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data.het1$y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
The above residual plot would lead us to believe that we still have a homogeneity of variance issue (which we thought we were addressing by defining a model in which each group is permitted its own variance). This is because we have plotted the raw residuals rather than residuals that have been standardized by the group variances. The above plot is also what the residual plot would look like if we had made no attempt to model the group-specific variances.
Whenever we fit a model that incorporates changes to the variance-covariance structures, we should explore standardized residuals. In this case, we should divide each residual by the sigma associated with its group (level of the predictor). $$ Res_{ij} = \frac{Y_{ij} - \mu_j}{\sigma_j} $$
mcmc = data.het1.r2jags$BUGSoutput$sims.matrix
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
Xmat = model.matrix(~x, data.het1)
fit = coefs %*% t(Xmat)
resid = -1 * sweep(fit, 2, data.het1$y, "-")
wch = grep("sigma", colnames(mcmc))
## divide the median residuals by the median sigma of the associated group
resid = apply(resid, 2, median)/rep(apply(mcmc[, wch], 2, median),
    table(data.het1$x))
# resid = apply(resid, 2, median)/(median(mcmc[, "sigma"]) * sqrt(data.het1$x))
fit = apply(fit, 2, median)
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Conclusions: This is certainly an improvement. Nevertheless, there is still an indication of a relationship between mean and variance. We could attempt to further address this by refining $\omega$ in the Bayesian model. That is, rather than indicate that variance is proportional to $x$, we could indicate that variance is proportional to $x^2$ (as an example) - we will leave this as an exercise for the reader.
Residuals against predictors
ggplot() + geom_point(data = NULL, aes(y = resid, x = data.het1$x))
Let's see how well data simulated from the model reflect the raw data.
library(tidyr)   # for gather
mcmc = data.het1.r2jags$BUGSoutput$sims.matrix
# generate a model matrix
Xmat = model.matrix(~x, data.het1)
## get the parameter estimates
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
## draw samples from this model, using the group-specific sigma for each
## observation within each MCMC draw
wch = grep("sigma", colnames(mcmc))
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data.het1), fit[i, ],
    mcmc[i, wch][as.numeric(data.het1$x)]))
newdata = data.frame(x = data.het1$x, yRep) %>%
    gather(key = Sample, value = Value, -x)
ggplot(newdata) +
    geom_violin(aes(y = Value, x = x, fill = "Model"), alpha = 0.5) +
    geom_violin(data = data.het1, aes(y = y, x = x, fill = "Obs"), alpha = 0.5) +
    geom_point(data = data.het1, aes(y = y, x = x),
        position = position_jitter(width = 0.1, height = 0), color = "black")
Although residuals can be computed directly within RSTAN, we can calculate them manually from the posteriors to be consistent across other approaches.
mcmc = as.matrix(data.het1.rstan)
wch = grep("beta", colnames(mcmc))
# generate a model matrix
newdata = data.frame(x = data.het1$x)
Xmat = model.matrix(~x, newdata)
## get median parameter estimates
coefs = apply(mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data.het1$y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
The above residual plot would lead us to believe that we still have a homogeneity of variance issue (which we thought we were addressing by defining a model in which each group is permitted its own variance). This is because we have plotted the raw residuals rather than residuals that have been standardized by the group variances. The above plot is also what the residual plot would look like if we had made no attempt to model the group-specific variances.
Whenever we fit a model that incorporates changes to the variance-covariance structures, we should explore standardized residuals. In this case, we should divide each residual by the sigma associated with its group (level of the predictor). $$ Res_{ij} = \frac{Y_{ij} - \mu_j}{\sigma_j} $$
mcmc = as.matrix(data.het1.rstan)
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
Xmat = model.matrix(~x, data.het1)
fit = coefs %*% t(Xmat)
resid = -1 * sweep(fit, 2, data.het1$y, "-")
## the Stan sigmas are on the effects scale, so use the model's
## per-observation standard deviations (sig = X * sigma) instead
wch = grep("^sig\\[", colnames(mcmc))
resid = apply(resid, 2, median)/apply(mcmc[, wch], 2, median)
fit = apply(fit, 2, median)
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
Conclusions: This is certainly an improvement. Nevertheless, there is still an indication of a relationship between mean and variance. We could attempt to further address this by refining $\omega$ in the Bayesian model. That is, rather than indicate that variance is proportional to $x$, we could indicate that variance is proportional to $x^2$ (as an example) - we will leave this as an exercise for the reader.
Residuals against predictors
ggplot() + geom_point(data = NULL, aes(y = resid, x = data.het1$x))
Let's see how well data simulated from the model reflect the raw data.
mcmc = as.matrix(data.het1.rstan)
# generate a model matrix
Xmat = model.matrix(~x, data.het1)
## get the parameter estimates
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
## draw samples from this model, using the model's per-observation
## standard deviations (the sig transformed parameters) for each draw
wch = grep("^sig\\[", colnames(mcmc))
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data.het1), fit[i, ],
    mcmc[i, wch]))
newdata = data.frame(x = data.het1$x, yRep) %>%
    gather(key = Sample, value = Value, -x)
ggplot(newdata) +
    geom_violin(aes(y = Value, x = x, fill = "Model"), alpha = 0.5) +
    geom_violin(data = data.het1, aes(y = y, x = x, fill = "Obs"), alpha = 0.5) +
    geom_point(data = data.het1, aes(y = y, x = x),
        position = position_jitter(width = 0.1, height = 0), color = "black")
Parameter estimates (posterior summaries)
Although all parameters in a Bayesian analysis are considered random and are described by a distribution, rarely would it be useful to present tables of all the samples from each distribution. On the other hand, plots of the posterior distributions do have some use. Nevertheless, most workers prefer to present simple statistical summaries of the posteriors. Popular choices include the median (or mean) and 95% credibility intervals.
print(data.het1.r2jags)
Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   40.785   1.654  37.484  39.758  40.788  41.829  44.042 1.001 13000
beta[2]    5.196   2.246   0.729   3.736   5.203   6.662   9.629 1.001 15000
beta[3]   13.945   1.811  10.353  12.783  13.963  15.105  17.551 1.001 15000
beta[4]   -0.725   1.661  -4.015  -1.773  -0.723   0.316   2.603 1.001 12000
beta[5]  -10.649   1.671 -13.947 -11.701 -10.642  -9.600  -7.302 1.001 11000
sigma[1]   5.092   1.310   3.242   4.164   4.861   5.738   8.295 1.001  8700
sigma[2]   4.676   1.231   2.968   3.830   4.459   5.266   7.646 1.001 15000
sigma[3]   2.184   0.601   1.360   1.769   2.069   2.477   3.672 1.001 15000
sigma[4]   0.473   0.139   0.288   0.377   0.446   0.537   0.812 1.001 15000
sigma[5]   0.697   0.203   0.425   0.558   0.658   0.789   1.214 1.001  3800
deviance 192.691   5.109 184.841 188.956 191.923 195.616 204.517 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 13.1 and DIC = 205.7
DIC is an estimate of expected predictive error (lower deviance is better).
# OR
library(broom)
tidyMCMC(as.mcmc(data.het1.r2jags), conf.int = TRUE, conf.method = "HPDinterval")
       term     estimate std.error    conf.low  conf.high
1   beta[1]   40.7850424 1.6542208  37.6371542  44.176439
2   beta[2]    5.1957407 2.2459418   0.7257995   9.629303
3   beta[3]   13.9450143 1.8114036  10.2871931  17.458411
4   beta[4]   -0.7248304 1.6606094  -4.0956109   2.494425
5   beta[5]  -10.6491390 1.6711343 -14.0674955  -7.427036
6  deviance  192.6910835 5.1086446 184.0758276 202.948858
7  sigma[1]    5.0917776 1.3104984   2.9554600   7.701112
8  sigma[2]    4.6763204 1.2311885   2.7129254   7.017940
9  sigma[3]    2.1839291 0.6009005   1.2405276   3.377972
10 sigma[4]    0.4734292 0.1394671   0.2586298   0.743741
11 sigma[5]    0.6968348 0.2033296   0.3760953   1.086726
- the mean of the first group (A) is 40.7850424
- the mean of the second group (B) is 5.1957407 units greater than (A)
- the mean of the third group (C) is 13.9450143 units greater than (A)
- the mean of the fourth group (D) is -0.7248304 units greater (i.e. less) than (A)
- the mean of the fifth group (E) is -10.649139 units greater (i.e. less) than (A)
While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.
library(coda)
mcmcpvalue <- function(samp) {
    ## elementary version that creates an empirical p-value for the
    ## hypothesis that the columns of samp have mean zero versus a general
    ## multivariate distribution with elliptical contours.

    ## differences from the mean standardized by the observed
    ## variance-covariance factor

    ## Note, I put in the bit for single terms
    if (length(dim(samp)) == 0) {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - mean(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/length(samp)
    } else {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/nrow(samp)
    }
}

## since values are less than zero
mcmc = data.het1.r2jags$BUGSoutput$sims.matrix
for (i in grep("beta", colnames(mcmc), value = TRUE))
    print(paste(i, mcmcpvalue(mcmc[, i])))
[1] "beta[1] 0"
[1] "beta[2] 0.0256666666666667"
[1] "beta[3] 0"
[1] "beta[4] 0.637533333333333"
[1] "beta[5] 0"
mcmcpvalue(mcmc[, grep("beta", colnames(mcmc))])
[1] 0
With a p-value of essentially 0, we would conclude that there is almost no evidence that the group effects are all simultaneously equal to zero, suggesting an overall effect of the grouping variable.
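Of course, within a Bayesian framework we can also interrogate the posterior directly. For example, the posterior probability that group B has a higher mean than group A is simply the proportion of MCMC samples in which $\beta_2 > 0$:

## posterior probability that the B-A effect is positive
mean(mcmc[, "beta[2]"] > 0)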
summary(data.het1.rstan)
$summary mean se_mean sd 2.5% 25% 50% 75% beta[1] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 beta[2] 5.2712547 0.047769475 1.7790475 1.501944030 4.16281964 5.3485634 6.4005936 beta[3] 13.9814691 0.034313016 1.3289374 11.379003146 13.11924687 13.9499366 14.8399112 beta[4] -0.7422862 0.036261724 1.3155776 -3.294380302 -1.65606498 -0.7256937 0.1451465 beta[5] -10.6614901 0.034482231 1.2787261 -13.115652123 -11.52846014 -10.6476838 -9.8003538 sigma[1] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sigma[2] 2.0364122 0.032373531 1.2075927 0.306000389 1.20460273 1.8442266 2.6532291 sigma[3] 0.5628853 0.015012494 0.5814314 0.017854206 0.17479857 0.3820855 0.7757560 sigma[4] 0.3846713 0.011019046 0.4267658 0.006912231 0.10245830 0.2492161 0.5060454 sigma[5] 0.3742896 0.011297714 0.4375586 0.008776779 0.09664511 0.2310830 0.5007172 mu[1] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 mu[2] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 mu[3] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 mu[4] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 mu[5] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 mu[6] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 mu[7] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 mu[8] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 mu[9] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 mu[10] 40.7738740 0.024718668 0.8317252 39.155154826 40.21910248 40.7752284 41.3297589 mu[11] 46.0451287 0.041268554 1.5983242 42.827343704 45.10248872 46.0960302 47.0187735 mu[12] 46.0451287 0.041268554 1.5983242 42.827343704 45.10248872 46.0960302 47.0187735 mu[13] 46.0451287 0.041268554 1.5983242 42.827343704 45.10248872 46.0960302 47.0187735 mu[14] 46.0451287 0.041268554 1.5983242 42.827343704 45.10248872 46.0960302 47.0187735 mu[15] 46.0451287 0.041268554 1.5983242 42.827343704 45.10248872 46.0960302 47.0187735 mu[16] 46.0451287 0.041268554 1.5983242 42.827343704 45.10248872 46.0960302 47.0187735 mu[17] 46.0451287 0.041268554 1.5983242 42.827343704 45.10248872 46.0960302 47.0187735 mu[18] 46.0451287 0.041268554 1.5983242 42.827343704 45.10248872 46.0960302 47.0187735 mu[19] 46.0451287 0.041268554 1.5983242 42.827343704 45.10248872 46.0960302 47.0187735 mu[20] 46.0451287 0.041268554 1.5983242 42.827343704 45.10248872 46.0960302 47.0187735 mu[21] 54.7553431 0.026368752 1.0212574 52.728625818 54.09673115 54.7345090 55.3928841 mu[22] 54.7553431 0.026368752 1.0212574 52.728625818 54.09673115 54.7345090 55.3928841 mu[23] 54.7553431 0.026368752 1.0212574 52.728625818 54.09673115 54.7345090 55.3928841 mu[24] 54.7553431 0.026368752 1.0212574 52.728625818 54.09673115 54.7345090 55.3928841 mu[25] 54.7553431 0.026368752 1.0212574 52.728625818 54.09673115 54.7345090 55.3928841 mu[26] 54.7553431 0.026368752 1.0212574 52.728625818 54.09673115 54.7345090 55.3928841 mu[27] 54.7553431 0.026368752 1.0212574 52.728625818 54.09673115 54.7345090 55.3928841 mu[28] 54.7553431 0.026368752 1.0212574 52.728625818 54.09673115 54.7345090 55.3928841 mu[29] 54.7553431 0.026368752 1.0212574 52.728625818 54.09673115 54.7345090 55.3928841 mu[30] 54.7553431 0.026368752 1.0212574 52.728625818 54.09673115 54.7345090 55.3928841 mu[31] 40.0315878 0.025664359 0.9939763 38.031759362 
39.38117569 40.0647110 40.7109113 mu[32] 40.0315878 0.025664359 0.9939763 38.031759362 39.38117569 40.0647110 40.7109113 mu[33] 40.0315878 0.025664359 0.9939763 38.031759362 39.38117569 40.0647110 40.7109113 mu[34] 40.0315878 0.025664359 0.9939763 38.031759362 39.38117569 40.0647110 40.7109113 mu[35] 40.0315878 0.025664359 0.9939763 38.031759362 39.38117569 40.0647110 40.7109113 mu[36] 40.0315878 0.025664359 0.9939763 38.031759362 39.38117569 40.0647110 40.7109113 mu[37] 40.0315878 0.025664359 0.9939763 38.031759362 39.38117569 40.0647110 40.7109113 mu[38] 40.0315878 0.025664359 0.9939763 38.031759362 39.38117569 40.0647110 40.7109113 mu[39] 40.0315878 0.025664359 0.9939763 38.031759362 39.38117569 40.0647110 40.7109113 mu[40] 40.0315878 0.025664359 0.9939763 38.031759362 39.38117569 40.0647110 40.7109113 mu[41] 30.1123839 0.025695462 0.9951810 28.254798284 29.45510087 30.0947810 30.7365997 mu[42] 30.1123839 0.025695462 0.9951810 28.254798284 29.45510087 30.0947810 30.7365997 mu[43] 30.1123839 0.025695462 0.9951810 28.254798284 29.45510087 30.0947810 30.7365997 mu[44] 30.1123839 0.025695462 0.9951810 28.254798284 29.45510087 30.0947810 30.7365997 mu[45] 30.1123839 0.025695462 0.9951810 28.254798284 29.45510087 30.0947810 30.7365997 mu[46] 30.1123839 0.025695462 0.9951810 28.254798284 29.45510087 30.0947810 30.7365997 mu[47] 30.1123839 0.025695462 0.9951810 28.254798284 29.45510087 30.0947810 30.7365997 mu[48] 30.1123839 0.025695462 0.9951810 28.254798284 29.45510087 30.0947810 30.7365997 mu[49] 30.1123839 0.025695462 0.9951810 28.254798284 29.45510087 30.0947810 30.7365997 mu[50] 30.1123839 0.025695462 0.9951810 28.254798284 29.45510087 30.0947810 30.7365997 sig[1] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sig[2] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sig[3] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sig[4] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sig[5] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sig[6] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sig[7] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sig[8] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sig[9] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sig[10] 2.7004282 0.008997658 0.3360063 2.158164242 2.45867625 2.6726864 2.8923518 sig[11] 4.7368405 0.031485862 1.1788020 3.055702754 3.91041206 4.5409221 5.3858095 sig[12] 4.7368405 0.031485862 1.1788020 3.055702754 3.91041206 4.5409221 5.3858095 sig[13] 4.7368405 0.031485862 1.1788020 3.055702754 3.91041206 4.5409221 5.3858095 sig[14] 4.7368405 0.031485862 1.1788020 3.055702754 3.91041206 4.5409221 5.3858095 sig[15] 4.7368405 0.031485862 1.1788020 3.055702754 3.91041206 4.5409221 5.3858095 sig[16] 4.7368405 0.031485862 1.1788020 3.055702754 3.91041206 4.5409221 5.3858095 sig[17] 4.7368405 0.031485862 1.1788020 3.055702754 3.91041206 4.5409221 5.3858095 sig[18] 4.7368405 0.031485862 1.1788020 3.055702754 3.91041206 4.5409221 5.3858095 sig[19] 4.7368405 0.031485862 1.1788020 3.055702754 3.91041206 4.5409221 5.3858095 sig[20] 4.7368405 0.031485862 1.1788020 3.055702754 3.91041206 4.5409221 5.3858095 sig[21] 3.2633135 0.017205945 0.6663834 2.353417562 2.82058706 3.1409900 3.5745387 sig[22] 3.2633135 0.017205945 0.6663834 2.353417562 2.82058706 3.1409900 3.5745387 sig[23] 3.2633135 0.017205945 0.6663834 
2.353417562 2.82058706 3.1409900 3.5745387 sig[24] 3.2633135 0.017205945 0.6663834 2.353417562 2.82058706 3.1409900 3.5745387 sig[25] 3.2633135 0.017205945 0.6663834 2.353417562 2.82058706 3.1409900 3.5745387 sig[26] 3.2633135 0.017205945 0.6663834 2.353417562 2.82058706 3.1409900 3.5745387 sig[27] 3.2633135 0.017205945 0.6663834 2.353417562 2.82058706 3.1409900 3.5745387 sig[28] 3.2633135 0.017205945 0.6663834 2.353417562 2.82058706 3.1409900 3.5745387 sig[29] 3.2633135 0.017205945 0.6663834 2.353417562 2.82058706 3.1409900 3.5745387 sig[30] 3.2633135 0.017205945 0.6663834 2.353417562 2.82058706 3.1409900 3.5745387 sig[31] 3.0850996 0.014907775 0.5694002 2.270615267 2.70153526 2.9950922 3.3499556 sig[32] 3.0850996 0.014907775 0.5694002 2.270615267 2.70153526 2.9950922 3.3499556 sig[33] 3.0850996 0.014907775 0.5694002 2.270615267 2.70153526 2.9950922 3.3499556 sig[34] 3.0850996 0.014907775 0.5694002 2.270615267 2.70153526 2.9950922 3.3499556 sig[35] 3.0850996 0.014907775 0.5694002 2.270615267 2.70153526 2.9950922 3.3499556 sig[36] 3.0850996 0.014907775 0.5694002 2.270615267 2.70153526 2.9950922 3.3499556 sig[37] 3.0850996 0.014907775 0.5694002 2.270615267 2.70153526 2.9950922 3.3499556 sig[38] 3.0850996 0.014907775 0.5694002 2.270615267 2.70153526 2.9950922 3.3499556 sig[39] 3.0850996 0.014907775 0.5694002 2.270615267 2.70153526 2.9950922 3.3499556 sig[40] 3.0850996 0.014907775 0.5694002 2.270615267 2.70153526 2.9950922 3.3499556 sig[41] 3.0747179 0.014768591 0.5658406 2.259440637 2.70170932 2.9798159 3.3487467 sig[42] 3.0747179 0.014768591 0.5658406 2.259440637 2.70170932 2.9798159 3.3487467 sig[43] 3.0747179 0.014768591 0.5658406 2.259440637 2.70170932 2.9798159 3.3487467 sig[44] 3.0747179 0.014768591 0.5658406 2.259440637 2.70170932 2.9798159 3.3487467 sig[45] 3.0747179 0.014768591 0.5658406 2.259440637 2.70170932 2.9798159 3.3487467 sig[46] 3.0747179 0.014768591 0.5658406 2.259440637 2.70170932 2.9798159 3.3487467 sig[47] 3.0747179 0.014768591 0.5658406 2.259440637 2.70170932 2.9798159 3.3487467 sig[48] 3.0747179 0.014768591 0.5658406 2.259440637 2.70170932 2.9798159 3.3487467 sig[49] 3.0747179 0.014768591 0.5658406 2.259440637 2.70170932 2.9798159 3.3487467 sig[50] 3.0747179 0.014768591 0.5658406 2.259440637 2.70170932 2.9798159 3.3487467 log_lik[1] -3.4240449 0.016637710 0.5780161 -4.722889183 -3.76208027 -3.3623782 -3.0011371 log_lik[2] -1.9593423 0.003841752 0.1409695 -2.264188326 -2.04758570 -1.9500383 -1.8561461 log_lik[3] -4.3523990 0.023452583 0.8131038 -6.094140334 -4.86258712 -4.2860232 -3.7637863 log_lik[4] -7.4998435 0.043977463 1.5913399 -11.198891667 -8.44605022 -7.3397304 -6.3658948 log_lik[5] -2.0554220 0.005322932 0.1821811 -2.484721137 -2.16000329 -2.0273872 -1.9264318 log_lik[6] -4.2775255 0.022915476 0.7944084 -5.972271553 -4.77347530 -4.2107198 -3.7026028 log_lik[7] -2.2832356 0.007674743 0.2625414 -2.914356125 -2.43035118 -2.2441994 -2.0905816 log_lik[8] -2.9098430 0.012532743 0.4452326 -3.903595497 -3.17724093 -2.8573023 -2.5886100 log_lik[9] -2.4668368 0.008980682 0.3195303 -3.220101803 -2.65329760 -2.4171129 -2.2401503 log_lik[10] -2.4383905 0.008478281 0.3088511 -3.157087648 -2.60966763 -2.3886935 -2.2058476 log_lik[11] -3.1559451 0.010461430 0.3859263 -4.048496446 -3.37357060 -3.0870201 -2.8835626 log_lik[12] -2.5062180 0.006847935 0.2474331 -3.020060268 -2.66336762 -2.4908089 -2.3391005 log_lik[13] -2.8205738 0.007398486 0.2865421 -3.469124245 -2.98563835 -2.7924319 -2.6131572 log_lik[14] -5.0477153 0.034011473 1.1699932 -7.930043952 -5.67988432 -4.8533128 
-4.1944821
log_lik[15]  -2.8132604 0.007954993 0.2882831  -3.449790205  -2.97752344  -2.7801118  -2.6146090
...          (log_lik[16] to log_lik[50] rows omitted)
lp__        -85.2044141 0.078855694 2.5397119 -91.275402322 -86.60809171 -84.8514564 -83.3494041

                     97.5%    n_eff      Rhat
beta[1]          42.402410 1132.165 0.9990285
beta[2]           8.603538 1386.993 0.9991307
beta[3]          16.741698 1500.000 0.9985816
beta[4]           1.790178 1316.243 0.9983823
beta[5]          -8.109900 1375.195 0.9985087
sigma[1]          3.451063 1394.555 0.9999528
sigma[2]          4.959012 1391.428 0.9996267
sigma[3]          2.053646 1500.000 0.9994395
sigma[4]          1.505785 1500.000 0.9989855
sigma[5]          1.530478 1500.000 0.9982514
mu[1]-mu[10]     42.402410 1132.165 0.9990285
mu[11]-mu[20]    49.101303 1500.000 0.9987779
mu[21]-mu[30]    56.810588 1500.000 1.0004037
mu[31]-mu[40]    41.929967 1500.000 0.9991494
mu[41]-mu[50]    32.076056 1500.000 0.9985074
sig[1]-sig[10]    3.451063 1394.555 0.9999528
sig[11]-sig[20]   7.729039 1401.685 1.0002103
sig[21]-sig[30]   4.969244 1500.000 1.0005360
sig[31]-sig[40]   4.390017 1458.847 0.9993640
sig[41]-sig[50]   4.418945 1467.946 0.9988473
...              (log_lik[1] to log_lik[50] rows omitted)
lp__            -81.473498 1037.296 1.0029999

$c_summary
, , chains = chain:1

         stats
parameter        mean        sd          2.5%          25%         50%        75%     97.5%
 beta[1]   40.7712155 0.7820977  39.166723948  40.25349564  40.7846422 41.2910400 42.232372
 beta[2]    5.2281927 1.5885934   1.993541292   4.17690695   5.3000160  6.2968627  8.154668
 beta[3]   13.9556827 1.3142590  11.343302336  13.12109330  13.9531619 14.8291287 16.765310
 beta[4]   -0.7232736 1.3180504  -3.330836453  -1.62311202  -0.7253056  0.2106963  1.694791
 beta[5]  -10.6437187 1.1879234 -12.908818218 -11.41312821 -10.7123233 -9.7812215 -8.239104
 sigma[1]   2.6902343 0.3256680   2.143039552   2.43955475   2.6749490  2.8950809  3.408437
 sigma[2]   1.9532387 1.1691994   0.310878911   1.15758818   1.7509422  2.5755265  4.655595
 sigma[3]   0.5690396 0.5612323   0.027512599   0.19576148   0.3984997  0.7722033  2.054960
 sigma[4]   0.3786102 0.4472500   0.003130016   0.08858497   0.2380481  0.4855946  1.606986
 sigma[5]   0.3654333 0.3955284   0.010914764   0.09959668   0.2433846  0.4977710  1.470844
 ...        (mu, sig and log_lik rows omitted)
 lp__     -84.9341893 2.4059787 -90.674590519 -86.44105562 -84.6310230 -83.1552580 -81.304775

, , chains = chain:2

         stats
parameter        mean        sd          2.5%         25%         50%        75%     97.5%
 beta[1]   40.7769661 0.8943155  38.914021843  40.1899512  40.8010097 41.3291754 42.583572
 beta[2]    5.2691400 1.8564342   1.334443647   4.1715358   5.3548616  6.4124180  8.956890
 beta[3]   14.0167507 1.4125027  11.238180492  13.1163174  13.9483147 14.8619178 16.990505
 beta[4]   -0.7409155 1.3496317  -3.357026120  -1.7012575  -0.7256937  0.2183396  1.801516
 beta[5]  -10.6746007 1.2905632 -13.252580104 -11.5344831 -10.6066309 -9.8331765 -8.214350
 sigma[1]   2.6988316 0.3344015   2.168697260   2.4667691   2.6628699  2.8648499  3.490899
 sigma[2]   2.0998840 1.2062120   0.375590110   1.2597086   1.9112189  2.6554080  5.007330
 sigma[3]   0.5421650 0.5429991   0.010456782   0.1782580   0.3824281  0.7330747  1.894810
 sigma[4]   0.3940634 0.4305440   0.010846850   0.1155183   0.2553385  0.5586880  1.392940
 sigma[5]   0.3836904 0.4337490   0.008372317   0.1073459   0.2442177  0.4955668  1.561145
 ...        (mu, sig and log_lik rows omitted)
 lp__     -85.2619975 2.5456001 -91.780155171 -86.6891237 -84.8685183 -83.4380535 -81.473498

, , chains = chain:3

         stats
parameter        mean        sd          2.5%          25%         50%         75%     97.5%
 beta[1]   40.7734404 0.8164429  39.290014740  40.20316825  40.7337043 41.36014484 42.347128
 beta[2]    5.3164314 1.8798487   1.315519216   4.16048503   5.3804502  6.51678019  8.750722
 beta[3]   13.9719740 1.2573218  11.513381027  13.12900456  13.9475551 14.82253290 16.472185
 beta[4]   -0.7626695 1.2804792  -3.082010264  -1.65082981  -0.7234673  0.04978384  1.753563
 beta[5]  -10.6661510 1.3545292 -13.247131555 -11.60208697 -10.6468607 -9.77898391 -8.006506
 sigma[1]   2.7122189 0.3478838   2.163380460   2.46656472   2.6846101  2.90807802  3.440627
 sigma[2]   2.0561140 1.2439264   0.286894471   1.20043894   1.8918098  2.70645120  5.191169
 sigma[3]   0.5774513 0.6364212   0.009072128   0.15274334   0.3595200  0.83163012  2.320782
 sigma[4]   0.3813404 0.4019704   0.008819222   0.10527083   0.2512103  0.49022258  1.499428
 sigma[5]   0.3737452 0.4799912   0.009275243   0.08977503   0.2115983  0.50997778  1.585466
 ...        (mu, sig and log_lik rows omitted)
 lp__     -85.4170555 2.6430383 -91.458316508 -86.76551169 -85.0733902 -83.51843220 -81.747799
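Note that most of the rows in the summary above are per-observation derived quantities (mu, sig and log_lik). If only the core parameters are of interest, the summary can be restricted to them. The following is a minimal sketch (assuming the stanfit object data.het1.rstan from above):

## summarize only the parameters of interest, skipping the derived
## per-observation quantities
summary(data.het1.rstan, pars = c("beta", "sigma"))$summary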
# OR
library(broom)
tidyMCMC(data.het1.rstan, pars = c("beta", "sigma"), conf.int = TRUE,
    conf.method = "HPDinterval", rhat = TRUE, ess = TRUE)
       term    estimate std.error      conf.low  conf.high      rhat  ess
1   beta[1]  40.7738740 0.8317252  3.930977e+01  42.539779 0.9990285 1132
2   beta[2]   5.2712547 1.7790475  1.696636e+00   8.733072 0.9991307 1387
3   beta[3]  13.9814691 1.3289374  1.131683e+01  16.611148 0.9985816 1500
4   beta[4]  -0.7422862 1.3155776 -3.073054e+00   2.018431 0.9983823 1316
5   beta[5] -10.6614901 1.2787261 -1.331217e+01  -8.340837 0.9985087 1375
6  sigma[1]   2.7004282 0.3360063  2.112094e+00   3.386388 0.9999528 1395
7  sigma[2]   2.0364122 1.2075927  9.504926e-02   4.296391 0.9996267 1391
8  sigma[3]   0.5628853 0.5814314  7.720314e-04   1.606201 0.9994395 1500
9  sigma[4]   0.3846713 0.4267658  1.223422e-04   1.182084 0.9989855 1500
10 sigma[5]   0.3742896 0.4375586  1.001235e-05   1.156901 0.9982514 1500
- the mean of the first group (A) is 40.773874
- the mean of the second group (B) is 5.2712547 units greater than (A)
- the mean of the third group (C) is 13.9814691 units greater than (A)
- the mean of the fourth group (D) is 0.7422862 units less than (A)
- the mean of the fifth group (E) is 10.6614901 units less than (A)
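To report the group means themselves (rather than contrasts from group A), the posterior samples of the coefficients can be mapped onto the groups through the model matrix. The following is a minimal sketch, assuming the data.het1.rstan fit and the data.het1 data from earlier in this tutorial:

library(broom)
mcmc = as.matrix(data.het1.rstan)
## model matrix mapping the effects parameterization onto the five groups
newdata = data.frame(x = levels(data.het1$x))
Xmat = model.matrix(~x, newdata)
## posterior samples of each group mean (one column per group)
group.means = mcmc[, grep("beta", colnames(mcmc))] %*% t(Xmat)
colnames(group.means) = levels(data.het1$x)
tidyMCMC(group.means, conf.int = TRUE, conf.method = "HPDinterval")

The first column should recover the estimate for group A above, and each remaining column the sum of that baseline and the corresponding contrast.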
As researchers become comfortable with a new statistical framework, it is only natural that they want to evaluate and comprehend the new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to the frequentist p-values used to investigate the hypothesis that a parameter is equal to zero.
library(coda)
mcmcpvalue <- function(samp) {
    ## elementary version that creates an empirical p-value for the
    ## hypothesis that the columns of samp have mean zero versus a general
    ## multivariate distribution with elliptical contours.

    ## differences from the mean standardized by the observed
    ## variance-covariance factor

    ## Note, I put in the bit for single terms
    if (length(dim(samp)) == 0) {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - mean(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/length(samp)
    } else {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/nrow(samp)
    }
}

## since values are less than zero
mcmc = as.matrix(data.het1.rstan)
for (i in grep("beta", colnames(mcmc), value = TRUE))
    print(paste(i, mcmcpvalue(mcmc[, i])))
[1] "beta[1] 0" [1] "beta[2] 0.00733333333333333" [1] "beta[3] 0" [1] "beta[4] 0.583333333333333" [1] "beta[5] 0"
mcmcpvalue(mcmc[, grep("beta", colnames(mcmc))])
[1] 0
With a p-value of essentially 0, we would conclude that there is almost no evidence that the group effects are jointly equal to zero, suggesting that the response does differ between the groups.
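For a single parameter, a simpler and closely related quantity is the two-tailed exceedance proportion (the fraction of posterior samples on the minority side of zero, doubled). Purely as an illustrative sketch, using the mcmc matrix extracted above:

## proportion of samples on the minority side of zero, doubled to
## approximate a two-tailed empirical p-value for beta[2]
2 * min(mean(mcmc[, "beta[2]"] > 0), mean(mcmc[, "beta[2]"] < 0))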
Graphical summaries
A nice graphic is often a great accompaniment to a statistical analysis. Although there are no fixed assumptions associated with graphing (in contrast to statistical analyses), we often want the graphical summaries to reflect the associated statistical analyses. After all, the sample is just one perspective on the population(s). What we are more interested in is being able to estimate and depict likely population parameters/trends.
Thus, whilst we could easily provide a plot displaying the raw data along with simple measures of location and spread, arguably, we should use estimates that reflect the fitted model. In this case, it would be appropriate to plot the credibility interval associated with each group.
mcmc = data.het1.r2jags$BUGSoutput$sims.matrix
## Calculate the fitted values
newdata = data.frame(x = levels(data.het1$x))
Xmat = model.matrix(~x, newdata)
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
ggplot(newdata, aes(y = estimate, x = x)) +
    geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    scale_y_continuous("Y") +
    scale_x_discrete("X") +
    theme_classic()
If we wanted to represent the sample data on the figure, then in such a simple example (single predictor) we could simply over- (or under-) lay the raw data.
ggplot(newdata, aes(y = estimate, x = x)) +
    geom_point(data = data.het1, aes(y = y, x = x), color = "gray") +
    geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    scale_y_continuous("Y") +
    scale_x_discrete("X") +
    theme_classic()
A more general solution would be to add the partial residuals to the figure. Partial residuals are the fitted values plus the residuals. In this simple case they equate exactly to the raw observations, since $fitted + resid = fitted + (obs - fitted) = obs$ and the fitted values depend only on the single predictor we are interested in.
## Calculate partial residuals (fitted values plus residuals)
fdata = rdata = data.het1
fMat = rMat = model.matrix(~x, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data.het1$y - apply(coefs, 2, median) %*% t(rMat))
rdata = rdata %>% mutate(partial.resid = resid + fit)
ggplot(newdata, aes(y = estimate, x = x)) +
    geom_point(data = rdata, aes(y = partial.resid), color = "gray") +
    geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    scale_y_continuous("Y") +
    scale_x_discrete("X") +
    theme_classic()
mcmc = as.matrix(data.het1.rstan)
## Calculate the fitted values
newdata = data.frame(x = levels(data.het1$x))
Xmat = model.matrix(~x, newdata)
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
ggplot(newdata, aes(y = estimate, x = x)) +
    geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    scale_y_continuous("Y") +
    scale_x_discrete("X") +
    theme_classic()
If we wanted to represent the sample data on the figure, then in such a simple example (single predictor) we could simply over- (or under-) lay the raw data.
ggplot(newdata, aes(y = estimate, x = x)) +
    geom_point(data = data.het1, aes(y = y, x = x), color = "gray") +
    geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    scale_y_continuous("Y") +
    scale_x_discrete("X") +
    theme_classic()
A more general solution would be to add the partial residuals to the figure. Partial residuals are the fitted values plus the residuals. In this simple case they equate exactly to the raw observations, since $fitted + resid = fitted + (obs - fitted) = obs$ and the fitted values depend only on the single predictor we are interested in.
## Calculate partial residuals (fitted values plus residuals)
fdata = rdata = data.het1
fMat = rMat = model.matrix(~x, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data.het1$y - apply(coefs, 2, median) %*% t(rMat))
rdata = rdata %>% mutate(partial.resid = resid + fit)
ggplot(newdata, aes(y = estimate, x = x)) +
    geom_point(data = rdata, aes(y = partial.resid), color = "gray") +
    geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    scale_y_continuous("Y") +
    scale_x_discrete("X") +
    theme_classic()