Welcome!
This is a course about Bayesian statistics, targeted at systems biologists.
There are three intended learning outcomes:
Understand the theoretical basis for applying Bayesian data analysis to practical scientific problems
Develop a familiarity with implementing Bayesian data analysis using modern software tools
Gain a deep understanding of both the theory and practice of the elements of Bayesian data analysis that are particularly relevant to computational biology, including custom hierarchical models, large analyses, and statistical models with embedded ODE systems
General format
Each week we have a one-hour seminar. The goal is to spend the time approximately as follows:
25-35 minutes on ‘theory’: learning things from the book and getting more reading material
25-35 minutes on practical computer work
Plan
Week 1: What is Bayesian inference?
Theory
Statistical inference in general
Bayesian statistical inference
The big challenge: dimensionality
Practice
Set up development environment
git basics
Install Stan and cmdstanpy
Reading
Jaynes (2003, Ch. 1)
Laplace (1986)
Box and Tiao (1992, Ch. 1.1)
Week 2: MCMC and Stan
Theory
What is MCMC?
Hamiltonian Monte Carlo
Probabilistic programming
Practice
Run an MCMC algorithm and inspect the results
Reading
Betancourt (2018)
Week 3: Metropolis-Hastings
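As a preview of this week's topic, here is a minimal sketch of a random-walk Metropolis sampler, the simplest special case of Metropolis-Hastings. It is an illustration and not course material: the standard normal target, the function names, the step size and the burn-in length are all assumptions chosen for the example.

```python
import math
import random


def log_target(x):
    # Unnormalised log density of a standard normal (illustrative target, an assumption).
    return -0.5 * x * x


def metropolis_hastings(log_density, x0, n_steps, step_size, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    x = x0
    log_p = log_density(x)
    samples = []
    n_accepted = 0
    for _ in range(n_steps):
        # Propose a move from a Gaussian centred on the current state.
        proposal = x + rng.gauss(0.0, step_size)
        log_p_proposal = log_density(proposal)
        # The proposal is symmetric, so the Hastings correction cancels and
        # the acceptance probability is min(1, p(proposal) / p(x)).
        if math.log(1.0 - rng.random()) < log_p_proposal - log_p:
            x, log_p = proposal, log_p_proposal
            n_accepted += 1
        # The current state is recorded whether or not the proposal was accepted.
        samples.append(x)
    return samples, n_accepted / n_steps


samples, acceptance_rate = metropolis_hastings(
    log_target, x0=0.0, n_steps=20000, step_size=1.0
)
kept = samples[2000:]  # discard an arbitrary burn-in period
mean = sum(kept) / len(kept)
variance = sum((s - mean) ** 2 for s in kept) / len(kept)
print(f"mean={mean:.3f} variance={variance:.3f} acceptance={acceptance_rate:.2f}")
```

Working on the log scale avoids numerical underflow when densities are very small, which is why the acceptance test compares log quantities.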
Week 4: After MCMC: diagnostics and decisions
Theory
Diagnostics: convergence, divergent transitions, effective sample size
Model evaluation as decision theory
Why negative log likelihood is a good default loss function
Practice
Diagnose some good and bad MCMC runs
Reading
Vehtari et al. (2021)
Vehtari, Gelman, and Gabry (2017)
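To give a feel for the convergence diagnostics covered this week, here is a rough sketch of the classic split R-hat statistic. It is the simpler, non-rank-normalised version, not the improved diagnostic proposed in Vehtari et al. (2021), and the function name and simulated chains are assumptions made for illustration.

```python
import math
import random
from statistics import mean, variance


def split_rhat(chains):
    """Classic split R-hat for a list of equal-length chains of scalar draws."""
    half = min(len(c) for c in chains) // 2
    # Split each chain in half so that within-chain drift also shows up.
    halves = []
    for c in chains:
        halves.append(c[:half])
        halves.append(c[half:2 * half])
    n = half
    chain_means = [mean(h) for h in halves]
    within = mean(variance(h) for h in halves)  # W: mean within-chain variance
    between = n * variance(chain_means)         # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


rng = random.Random(1)
# Four chains that all sample the same distribution: R-hat should be close to 1.
good = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four chains stuck around two different locations: R-hat should be well above 1.
bad = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 0.0, 3.0, 3.0)]
rhat_good = split_rhat(good)
rhat_bad = split_rhat(bad)
print(f"good chains: {rhat_good:.3f}, bad chains: {rhat_bad:.3f}")
```

In practice the diagnostics reported by Stan tooling should be preferred; the sketch only shows the idea of comparing within-chain and between-chain variance.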
Week 5: Regression models in biology
Theory
Generalised linear models
Prior elicitation
Hierarchical models
Practice
Compare some statistical models of a simulated biological dataset
Reading
Betancourt (2024)
Week 6: Hierarchical models
Week 7: ODEs
Theory
What is an ODE?
ODE solvers
ODE solvers inside probabilistic programs
Practice
Fit a model with an ODE
Reading
Timonen et al. (2022)
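To make the idea of an ODE solver concrete, here is a small sketch of a fixed-step fourth-order Runge-Kutta solver applied to logistic growth, checked against the known analytic solution. The choice of model, the parameter values and all function names are assumptions made for illustration; the solvers used inside Stan are adaptive and more sophisticated.

```python
import math


def logistic_rhs(t, y, r, K):
    # Right-hand side of dy/dt = r * y * (1 - y / K).
    return r * y * (1.0 - y / K)


def rk4_solve(rhs, y0, t0, t1, n_steps, *args):
    """Integrate a scalar ODE from t0 to t1 with fixed-step classical RK4."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        k1 = rhs(t, y, *args)
        k2 = rhs(t + h / 2, y + h * k1 / 2, *args)
        k3 = rhs(t + h / 2, y + h * k2 / 2, *args)
        k4 = rhs(t + h, y + h * k3, *args)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y


def logistic_exact(t, y0, r, K):
    # Closed-form solution of the logistic equation, used only as a check.
    return K / (1.0 + (K / y0 - 1.0) * math.exp(-r * t))


y0, r, K, t_end = 0.1, 0.8, 10.0, 5.0  # illustrative values
numerical = rk4_solve(logistic_rhs, y0, 0.0, t_end, 100, r, K)
exact = logistic_exact(t_end, y0, r, K)
print(f"numerical={numerical:.6f} exact={exact:.6f}")
```

When a solver like this sits inside a probabilistic program, it is called once per evaluation of the log density, so its accuracy and cost both affect the inference.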
Week 8: Bayesian workflow
Theory
Parts of a statistical analysis (not just inference!)
Why Bayesian workflow is complex: non-linearity and plurality
Writing scalable statistical programming projects
Practice
Write a scalable statistical analysis with bibat
Reading
Gelman et al. (2020)
Weeks 9-10: Project
Format: one hour joint feedback and help session