A data-driven approach to address missing data in the 1970 British birth cohort

Missing data may induce bias when analysing longitudinal population surveys. We aimed to tackle this problem in the 1970 British Cohort Study (BCS70) using a data-driven approach. Our method consisted of a 3-step process to identify important predictors of non-response from a pool of ~20,000 variables from 9 sweeps in 18,037 individuals. We used parametric regression models to identify predictors of non-response that can be used as auxiliary variab