ST 599 - W 26: Missing data and causal inference

Syllabus

Textbooks:

  • Required: Statistical Analysis with Missing Data, 3rd edition, Little and Rubin

  • Required: Causal Inference for Statistics, Social, and Biomedical Sciences, Imbens and Rubin

  • Optional: Bayesian Inference for Partially Identified Models, Gustafson

Course objectives

Upon completion of this course, students should be able to critically evaluate how published literature handles (or does not handle) missing data, and analyze datasets that have missing values by designing models that account for missingness. Students should also be able to read published literature using randomized study designs, and assess whether researchers’ causal conclusions are reasonable.

Course learning outcomes

  1. Differentiate between missing-completely-at-random (MCAR), missing-at-random (MAR), and missing-not-at-random (MNAR) processes via assumptions about the joint distribution of missingness indicators, outcomes, and covariates.
  2. Evaluate whether estimands of interest are appropriately estimated under missingness for a given missing data technique (complete case analysis, multiple imputation, data augmentation, etc.).
  3. Derive the identification region and limiting posterior density for partially-identified models.
  4. Derive a principal causal effect using the Neyman-Rubin causal model.
  5. Construct and fit maximum likelihood models (in R) and Bayesian models (in Stan) for MAR, MNAR, and causal settings.
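
As a rough illustration of the distinction in outcome 1, the sketch below simulates one outcome under the three missingness mechanisms and compares the complete-case mean with the full-data mean. It is written in Python for brevity (the course itself uses R and Stan). The function name `simulate`, the logistic missingness model, and the sample size are hypothetical choices made for this example, not course material.

```python
import math
import random


def simulate(mechanism, n=20000, seed=0):
    """Return (full-data mean of y, complete-case mean of y) for one mechanism.

    Assumed toy model (illustrative only): x ~ N(0, 1) is always observed,
    y = x + N(0, 1) noise may be missing.
    """
    rng = random.Random(seed)
    all_y, observed_y = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + rng.gauss(0.0, 1.0)
        if mechanism == "MCAR":
            p_obs = 0.5                          # depends on nothing
        elif mechanism == "MAR":
            p_obs = 1.0 / (1.0 + math.exp(-x))   # depends only on observed x
        elif mechanism == "MNAR":
            p_obs = 1.0 / (1.0 + math.exp(-y))   # depends on y itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        all_y.append(y)
        if rng.random() < p_obs:                 # y is observed with prob. p_obs
            observed_y.append(y)
    return sum(all_y) / n, sum(observed_y) / len(observed_y)


if __name__ == "__main__":
    for mech in ("MCAR", "MAR", "MNAR"):
        full, cc = simulate(mech)
        print(f"{mech}: full-data mean = {full:.3f}, complete-case mean = {cc:.3f}")
```

In this toy setup the complete-case mean should track the full-data mean under MCAR and drift upward under MAR and MNAR. The difference between the latter two is that under MAR the bias can in principle be removed by conditioning on the observed x, whereas under MNAR it cannot be removed without further assumptions about the missingness process.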

Schedule and readings

Date      | Topic                                                       | Reading    | LO   | Assignments
1/6/2026  | Missingness mechanisms and patterns                         | LR 1.1-1.4 | 1, 2 | NA
1/8/2026  | Missing data techniques: complete case analysis, weighting  | LR Ch. 3   | 1, 2 | NA
1/13/2026 | Single imputation techniques                                | LR Ch. 4   | 1, 2 | NA
1/15/2026 | Multiple imputation techniques                              | LR Ch. 5   | 1, 2 | NA
1/20/2026 | Likelihood theory                                           | LR Ch. 6   | 1, 5 | NA
1/22/2026 | Likelihood theory                                           | LR Ch. 6   | 1, 5 | 1

Course syllabus

Project proposal

Course notes

MCMC notes

HW