Generalized Partially Linear Regression with Misclassified Data and an Application to Labour Market Transitions

Stephan Dlugosz, Enno Mammen, Ralf Wilke

Research output: Contribution to journal › Journal article › Research › peer-review


Large data sets that originate from administrative or operational activity are increasingly used for statistical analysis, as they often contain very precise information and a large number of observations. However, there is evidence that some variables can be subject to severe misclassification or contain missing values. Given the size of the data, a flexible semiparametric misclassification model would be a good choice, but such models are rarely used in practice. To close this gap, a semiparametric model for the probability of observing labour market transitions is estimated using a sample of 20 million observations from Germany. It is shown that the estimated marginal effects of a number of covariates are substantially affected by misclassification and missing values in the analysis data. The proposed generalized partially linear regression extends existing models by allowing a misclassified discrete covariate to be interacted with a nonparametric function of a continuous covariate.
Original language: English
Journal: Computational Statistics & Data Analysis
Pages (from-to): 145-159
Number of pages: 15
Publication status: Published - Jun 2017


  • Semiparametric regression
  • Measurement error
  • Side information
