We start with an introduction to R and give you a roadmap for exploring your data while avoiding common statistical pitfalls. We'll look at how to find outliers, handle collinearity, and decide when (or when not) to apply transformations.

Next on the list is multiple linear regression, one of the most widely used tools in statistics. We'll go over the basics of linear regression with a touch of biology for context. We'll also address some potential challenges and introduce generalised linear models (GLM) for data such as counts, presence/absence, and proportions. And what if standard models like linear regression or GLM don't fit? Enter generalised additive models (GAM), which use smoothers to capture non-linear relationships.
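
As a rough illustration of how the type of response variable guides the choice of GLM, here is a minimal R sketch on simulated data. The variable names (temperature, counts, present, successes, trials) and all simulation settings are assumptions made up for this example and are not course material; only the base R function glm() is used.

```r
# Illustrative sketch only: simulated data with hypothetical variable names.
set.seed(42)
n <- 200
temperature <- runif(n, min = 5, max = 20)  # hypothetical covariate

# Counts: Poisson GLM with a log link
counts <- rpois(n, lambda = exp(0.2 + 0.1 * temperature))
m_pois <- glm(counts ~ temperature, family = poisson(link = "log"))

# Presence/absence (0/1): Bernoulli GLM, i.e. binomial family with a logit link
present <- rbinom(n, size = 1, prob = plogis(-3 + 0.25 * temperature))
m_bern <- glm(present ~ temperature, family = binomial(link = "logit"))

# Proportions (successes out of a known number of trials): binomial GLM
trials <- rep(20, n)
successes <- rbinom(n, size = trials, prob = plogis(-3 + 0.25 * temperature))
m_bin <- glm(cbind(successes, trials - successes) ~ temperature,
             family = binomial(link = "logit"))

# Estimated regression parameters, on the scale of the link function
coef(m_pois)
coef(m_bern)
coef(m_bin)
```

In each case the model formula stays the same; only the response and the family argument change with the type of data.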

And to keep things grounded, we'll work through case studies throughout the course, blending theory with real-world analysis.

Outline

Module 1 (Monday):

  • General introduction.
  • Introduction to R.
  • Theory presentation on data exploration (outliers, collinearity, transformations, relationships, interactions).
  • Based on Zuur et al. (2010) and Ieno and Zuur (2015).
  • Two exercises.

Module 2 (Tuesday):

  • Theory presentation on linear regression.
  • One exercise.
  • Sketching model fit.
  • Dealing with categorical covariates.
  • One exercise.

Module 3 (Wednesday morning):

  • Different strategies for model selection.
  • Interactions.
  • Two exercises.

Module 4 (Wednesday afternoon and Thursday):

  • Theory presentation on Poisson, negative binomial, Bernoulli and binomial distributions.
  • How to deal with overdispersion.
  • Three exercises:
    • Poisson GLM.
    • Negative binomial GLM.
    • Bernoulli GLM.
  • Short theory presentation on DHARMa (for model validation).
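
For the overdispersion topic listed above, one commonly used check for a Poisson GLM can be written as follows. This is a standard textbook formulation and not necessarily the exact one presented in the course. The symbols are introduced here for illustration: y_i is the observed count, \hat{\mu}_i the fitted value, N the sample size, and p the number of estimated regression parameters.

```latex
e_i = \frac{y_i - \hat{\mu}_i}{\sqrt{\hat{\mu}_i}}, \qquad
\hat{\phi} = \frac{\sum_{i=1}^{N} e_i^2}{N - p}
```

Here e_i are the Pearson residuals. A dispersion statistic \hat{\phi} well above 1 suggests overdispersion, in which case a negative binomial GLM is one option to consider.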

Module 5 (Friday):

  • Theory presentation on GAM.
  • Two exercises covering Gaussian, Poisson and negative binomial GAMs.
  • Based on various chapters in Zuur (2012).

On-demand video page:

  • Short theory presentation on binomial, beta, Gamma and Tweedie GLMs.
  • Exercises for binomial, beta, Gamma and Tweedie GLMs.
  • What to present in a paper.

General information

  • The course material consists of relevant pdf files of presentations, data sets, and clearly documented R code.
  • Course participants will be given access to the course website with all data sets, R solution code, and course material 1 week before the start of the course. Access to the course website is for 12 months.
  • A discussion board (access for 12 months) allows for interaction on course content between instructors and participants after the course.

1-hour face-to-face video chat

  • The course includes a 1-hour face-to-face video chat with the instructors (to be used after the course).
  • You are invited to apply the statistical techniques discussed during the course to your own data. If you encounter any problems, you can ask questions during the 1-hour face-to-face video chat.

Cited literature:

  • Zuur, Ieno, Elphick (2010). A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution, 1: 3-14.
  • Zuur (2012). Beginner’s Guide to GAM with R.
  • Zuur, Hilbe, Ieno (2013). Beginner’s Guide to GLM and GLMM with R.

Keywords: Introduction to R. Outliers. Transformations. Collinearity (correlation between covariates). Multiple linear regression. Model selection. Visualising results. Poisson GLM. Overdispersion. Negative binomial GLM. Binary and proportional data. ggplot2. Logistic regression. DHARMa. Generalised additive models. Tweedie and Gamma GLM.