On Enhancing the Explainability and Fairness of Tree Ensembles

Emilio Carrizosa, Kseniia Kurishchenko*, Dolores Romero Morales

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › Peer-review

Abstract

Tree ensembles are one of the most powerful methodologies in Machine Learning. In this paper, we investigate how to make tree ensembles more flexible to incorporate explainability and fairness in the training process, possibly at the expense of a decrease in accuracy. While explainability helps the user understand the key features that play a role in the classification task, with fairness we ensure that the ensemble does not discriminate against a group of observations that share a sensitive attribute. We propose a Mixed Integer Linear Optimization formulation to train an ensemble of trees that, apart from minimizing the misclassification cost, controls for sparsity as well as the accuracy in the sensitive group. Our formulation is scalable in the number of observations since its number of binary decision variables is independent of the number of observations. In our numerical results, we show that for standard datasets used in the fairness literature, we can dramatically enhance the fairness of the benchmark, namely the popular Random Forest, while using only a few features, all without damaging the misclassification cost.
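To illustrate the shape of the optimization problem the abstract describes, the following is a schematic sketch only, written with assumed notation (loss ℓ, misclassification costs c_i, feature-selection variables z_j, sensitive group S, threshold τ, sparsity weight λ, ensemble parameters θ); it is not the paper's exact formulation:

\begin{align*}
\min_{\theta,\, z}\quad & \sum_{i=1}^{n} c_i\,\ell\big(y_i, f_\theta(x_i)\big) \;+\; \lambda \sum_{j=1}^{p} z_j
  && \text{(misclassification cost + sparsity)}\\
\text{s.t.}\quad & \text{feature } j \text{ may enter the trees only if } z_j = 1, && j = 1,\dots,p,\\
& \frac{1}{|S|}\sum_{i \in S} \ell\big(y_i, f_\theta(x_i)\big) \;\le\; \tau
  && \text{(accuracy control on the sensitive group)}\\
& z_j \in \{0,1\}, && j = 1,\dots,p.
\end{align*}

In such a sketch, only the feature-selection variables z_j are binary, so their number grows with the number of features p rather than with the number of observations n, which is consistent with the scalability remark in the abstract.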
Original language: English
Journal: European Journal of Operational Research
Number of pages: 32
ISSN: 0377-2217
DOIs
Publication status: Published - 16 Jan 2025

Bibliographical note

Epub ahead of print. Published online 16 January 2025.

Keywords

  • Machine learning
  • Tree ensembles
  • Explainability
  • Fairness
  • Mixed integer linear optimization
