Tutorials on R and statistics
23 April 2011
This series of tutorials will gradually work through an extensive range of frequentist and Bayesian graphical and statistical theory and practice (focusing on R or JAGS interfaced from R). It is advisable that you initially work through the following tutorials sequentially.
- Simulating data allows us to fabricate the true underlying trends responsible for the data, and therefore enables us to evaluate how accurately the analysis tools subsequently reveal these trends.
- The process of simulating data is typically the reverse of analysing the data. We must consider how the response relates to the predictors, the scale (normal, binomial, Poisson etc.) of the variables and parameters, and how to incorporate sensible variability (noise). Thus the process of simulating data specifically for a particular statistical analysis can be as informative as a description of the analysis itself.
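As a rough illustration of the two points above, the following sketch simulates a response from a known linear trend plus normal noise and then fits a simple linear model to see how closely the estimates recover the values used to generate the data. The sample size, intercept, slope and noise level are arbitrary values assumed for illustration only; they are not taken from any of the tutorials.

```r
# Minimal sketch: fabricate data from a known trend, then analyse it.
# All parameter values below are assumptions chosen for illustration.
set.seed(1)                      # make the simulation reproducible

n     <- 100                     # number of observations (assumed)
beta0 <- 2                       # true intercept (assumed)
beta1 <- 1.5                     # true slope (assumed)
sigma <- 1                       # standard deviation of the noise (assumed)

# Predictor drawn uniformly over an arbitrary range
x <- runif(n, min = 0, max = 10)

# Response = deterministic trend + normally distributed noise
y <- beta0 + beta1 * x + rnorm(n, mean = 0, sd = sigma)

# Analysis step: the reverse of the simulation above
fit <- lm(y ~ x)

coef(fit)                        # estimates to compare with beta0 and beta1
confint(fit)                     # interval estimates for the same parameters
```

Because the true intercept and slope are known, the fitted coefficients can be compared directly against them, which is not possible with real data.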
Each tutorial is also associated with a workshop featuring real data sets and research questions (many of which appear in prominent biostatistical texts). These workshops are designed to provide extensive guided practice of the concepts and techniques highlighted in the tutorials. Moreover, as worked examples of real biological data, they provide insights into the diversity of analysis options and challenges presented by real data.
R syntax
Tutorial 1 - R basics
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- Installation of R
- Basic syntax
- Data types
- Object manipulation
- R Editors
Tutorial 2 - R Dataframes
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- Constructing dataframes
- Importing (reading) data
- Exporting (writing) data
- Vectors within dataframes
- Manipulating dataframes
- Reshaping dataframes
- Merging dataframes
- Aggregating dataframes
- Transformations and derivatives
- Alterations
- List manipulations
- More complex manipulations
- Dummy data sets - random data generation
Tutorial 3 - More advanced R
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- Package management
- Matrix algebra
- R programming
R data summaries - numerical and graphical
Tutorial 4 - Exploratory data analysis
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- Numerical summaries
- Graphical summaries
Tutorial 5 - Traditional R graphics
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- High level plotting
- Graphical parameters
- Enhancements and customizations
- Exporting graphics
Tutorial 6 - The Grammar of Graphics in R (ggplot2)
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- Introducing the grammar of graphics
- Coordinate systems
- Geometric objects - geom
- Scales
- Facets
- Options
- Themes
- Showcase
Linear modelling
Tutorial 7 - Statistical philosophies and estimation
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- Distributions
- Probability
- Opposing philosophies
- Frequentist
- Bayesian
- Estimation and inference
- Least squares
- Maximum likelihood
- Bayesian
Tutorial 8 - Simple hypothesis testing
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- Frequentist philosophy revisited
- One and two-tailed tests
- t-tests
- Assumptions
- Power
- Robust tests
Tutorial 9 - Traditional linear modelling in R
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- An introduction to linear models
- Regression
- Simple
- Multiple
- ANOVA
- Single factor
- Nested
- Factorial
- Randomized Complete Block
- Partly nested (split-plot and randomized block)
- ANCOVA
Tutorial 10 - The power of contrasts
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- To derive predictions
- To derive treatment means
- To derive effect sizes
Tutorial 11 - Generalized linear models
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- χ² tests
- Contingency tables
- Generalized linear models
- Logistic regression
- Log-linear modelling
Tutorial 12 - Generalized linear mixed effects
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- χ² tests
- Contingency tables
- Generalized linear models
- Logistic regression
- Log-linear modelling
Multivariate analyses
Tutorial 13 - R-mode analyses
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- Principal components analysis
- Redundancy analysis
- Ordination
Tutorial 14 - Q-mode analyses
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- Dissimilarity
- Classification and regression trees, clustering
- Multidimensional scaling
- ANOSIM
- Mantel
Other topics
Tutorial 15 - Recommendations and Itemsets
«Goto Tutorial»
«Goto Worksheet»
Topics covered
- Itemsets