Shrinking the Cross-Section (Again): What Changes When Data Does? A Scientific Replication of a Paper by Kozak et al. Using the Global Factor Data

Ulrik Simmelholt & Christian Carl Gartmann

Student thesis: Master thesis

Abstract

This thesis investigates the replicability of the findings in "Shrinking the Cross-Section" by Kozak, Nagel, and Santosh (2020), who propose a dual-penalty estimator to construct a stochastic discount factor in high-dimensional settings. We replicate their methodology but apply it to an alternative dataset: the Global Factor Data by Jensen, Kelly, and Pedersen (2023). Using both ridge and elastic net regularization, we assess the predictive performance of dense and sparse factor models, both on raw factor returns and in the principal component space.
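To make the dense (ridge) case concrete, the following is a minimal sketch rather than the thesis's own code. It assumes the standard ridge form of the stochastic discount factor coefficients, b = (Σ + γI)⁻¹ μ, and an out-of-sample R² of the form 1 − (μ₂ − Σ₂b)′(μ₂ − Σ₂b) / (μ₂′μ₂), where μ₂ and Σ₂ are holdout moments. The function names, the simulated returns, and the value of `gamma` are hypothetical and chosen only for illustration. The elastic net case adds an L1 penalty, needs an iterative solver, and is not shown.

```python
import numpy as np


def ridge_sdf_coefficients(returns, gamma):
    """Ridge-shrunk SDF coefficients b = (Sigma + gamma * I)^{-1} mu.

    `returns` is a (T, N) array of factor returns; `gamma` is the
    ridge penalty (hypothetical parameter name).
    """
    mu = returns.mean(axis=0)
    sigma = np.cov(returns, rowvar=False)
    return np.linalg.solve(sigma + gamma * np.eye(len(mu)), mu)


def oos_r2(b, holdout_returns):
    """Cross-sectional R^2 of the pricing errors mu - Sigma b on holdout data."""
    mu = holdout_returns.mean(axis=0)
    sigma = np.cov(holdout_returns, rowvar=False)
    resid = mu - sigma @ b
    return 1.0 - (resid @ resid) / (mu @ mu)


# Simulated data only (assumption): 600 periods, 10 factors.
rng = np.random.default_rng(0)
simulated = rng.normal(0.005, 0.04, size=(600, 10))
train, holdout = simulated[:400], simulated[400:]

b_hat = ridge_sdf_coefficients(train, gamma=0.01)
r2 = oos_r2(b_hat, holdout)
```

In practice the penalty would be chosen by cross-validation rather than fixed; a very large `gamma` shrinks all coefficients toward zero, which drives this R² toward zero as well.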

Our analysis finds mixed success in replicating the original results. The dense ridge model yields out-of-sample R² values comparable to those reported by Kozak et al. (2020). Once we allow for sparsity through the elastic net model, however, we find indications that some degree of variable selection improves predictive performance, in contrast to the conclusions of Kozak et al. (2020). In the principal component space, results align more closely with the original findings. We further evaluate the recovered stochastic discount factors in a mean-variance portfolio allocation framework under a true out-of-sample test, though the results are inconclusive across investment horizons.

Given the change in dataset, our study constitutes a scientific rather than a pure replication, making one-to-one comparability inherently difficult. Nonetheless, the convergence of several results across models supports the internal validity of the methodology and contributes to the broader discussion on replication in empirical asset pricing.

Educations: MSc in Applied Economics and Finance, (Graduate Programme) Final Thesis
Language: English
Publication date: 15 May 2025
Number of pages: 85
Supervisors: Jonas Striaukas