Multilevel modeling of educational data with cross-classification and missing identification for units

Hill, P. and Goldstein, H.
Journal of Educational and Behavioral Statistics, 16:2, 2665-2678

When multilevel models are estimated from survey data derived using multistage sampling, unequal selection probabilities at any stage of sampling may induce bias in standard estimators, unless the sources of the unequal probabilities are fully controlled for in the covariates. This paper proposes alternative ways of weighting the estimation of a two-level model by using the reciprocals of the selection probabilities at each stage of sampling. Consistent estimators are obtained when both the sample number of level 2 units and the sample number of level 1 units within sample level 2 units increase. Scaling of the weights is proposed to improve the properties of the estimators and to simplify computation. Variance estimators are also proposed. In a limited simulation study the scaled weighted estimators are found to perform well, although non-negligible bias starts to arise for informative designs when the sample number of level 1 units becomes small. The variance estimators perform extremely well. The procedures are illustrated using data from the survey of psychiatric morbidity.

Number of levels
Model data structure
Response types
Multivariate response model?
Longitudinal data?
Substantive discipline

The first paper on multiple membership models, and highly influential for later developments.

Paper submitted by
Harvey Goldstein, Graduate School of Education, University of Bristol