Double Machine Learning and Automated Confounder Selection: A Cautionary Tale

Paul Hünermund, Beyers Louw, Itamar Caspi

Research output: Chapter in Book/Report/Conference proceeding › Conference abstract in proceedings › Research › peer-review


Double machine learning (DML) is an increasingly popular tool for automated model selection in high-dimensional settings. DML approaches rely on the assumption of conditional independence, which may not hold in big-data settings where the covariate space is large. This paper shows that DML is very sensitive to the inclusion of even a few "bad controls" in the covariate space. The resulting bias varies with the nature of the causal model, which raises concerns about the feasibility of selecting control variables in a data-driven way.
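The sensitivity described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design. It assumes a hypothetical linear data-generating process with one confounder `x` and one collider `c` (a variable caused by both treatment and outcome, used here as an example of a "bad control"), and it uses plain least squares as the nuisance learner in a cross-fitted partialling-out estimator. All function names, coefficients, and sample sizes are illustrative assumptions.

```python
import numpy as np


def simulate(n, theta, rng):
    """Hypothetical linear design: x confounds d and y; c is a collider."""
    x = rng.normal(size=n)                    # confounder (a good control)
    d = x + rng.normal(size=n)                # treatment
    y = theta * d + x + rng.normal(size=n)    # outcome, true effect = theta
    c = d + y + rng.normal(size=n)            # collider (a bad control)
    return y, d, x, c


def ols_predict(w_train, t_train, w_test):
    """Fit least squares with an intercept on the training fold, predict on the test fold."""
    a_train = np.column_stack([np.ones(len(w_train)), w_train])
    a_test = np.column_stack([np.ones(len(w_test)), w_test])
    beta, *_ = np.linalg.lstsq(a_train, t_train, rcond=None)
    return a_test @ beta


def dml_plr(y, d, w, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of the coefficient on d.

    Residualizes y and d on the controls w out of fold, then regresses
    the y residuals on the d residuals.
    """
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        y_res[test] = y[test] - ols_predict(w[train], y[train], w[test])
        d_res[test] = d[test] - ols_predict(w[train], d[train], w[test])
    return float(d_res @ y_res / (d_res @ d_res))


rng = np.random.default_rng(42)
y, d, x, c = simulate(n=20_000, theta=1.0, rng=rng)

theta_good = dml_plr(y, d, x.reshape(-1, 1))          # controls: confounder only
theta_bad = dml_plr(y, d, np.column_stack([x, c]))    # controls: confounder + collider

print(f"true effect:             1.000")
print(f"controls = [x]:          {theta_good:.3f}")
print(f"controls = [x, c]:       {theta_bad:.3f}")
```

In this assumed design, adjusting for the confounder alone recovers an estimate close to the true effect, while adding the single collider to the control set pulls the estimate far away from it. The direction and size of the distortion depend on the assumed coefficients, which is consistent with the abstract's point that the bias varies with the causal model.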
Original language: English
Title of host publication: Proceedings of the Eighty-second Annual Meeting of the Academy of Management
Editors: Sonia Taneja
Number of pages: 1
Place of Publication: Briarcliff Manor, NY
Publisher: Academy of Management
Publication date: 2022
Publication status: Published - 2022
Event: The Academy of Management Annual Meeting 2022: Creating a Better World Together - Seattle, United States
Duration: 5 Aug 2022 – 9 Aug 2022
Conference number: 82


Conference: The Academy of Management Annual Meeting 2022
Country/Territory: United States
Series: Academy of Management Proceedings
