Optimal Randomized Decision Trees

Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río, Dolores Romero Morales

Research output: Chapter in Book/Report/Conference proceedingConference abstract in proceedingsResearchpeer-review

Abstract

Classic decision trees [1] are defined by a set of orthogonal cuts, i.e., the branching rules are of the form variable X not lower than threshold c. The variables and thresholds are obtained by a greedy procedure. The use of a greedy strategy yields low computational cost, but may lead to myopic decisions. Although oblique cuts, with at least two variables, have also been proposed, they involve cumbersome algorithms to identify each cut of the tree. The latest advances in Optimization techniques have motivated further research on procedures to build optimal decision trees, with either orthogonal or oblique cuts. Mixed-Integer Optimization models have been recently proposed to tackle this problem, see for instance [2]. Although the results of such optimal decision trees are encouraging, the use of integer decision variables leads to hard optimization problems. In this talk, we propose to build optimal decision trees by solving nonlinear continuous optimization problems, thus avoiding the difficulties associated with integer decision variables. This is achieved by including a cumulative density function that will indicate the path to be followed inside the tree. Numerical results show the usefulness of this approach: using one single tree, we obtain better accuracies than classic decision trees, being much more flexible that those since sparsity or preference of performance in a subsample can be easily controlled.
Original languageEnglish
Title of host publicationConference proceedings : 2nd Spanish Young Statisticians and Operational Researchers Meeting
Number of pages1
Place of PublicationMadrid
PublisherUniversidad Complutense de Madrid * Servicio de Publicaciones
Publication date2019
Pages53
Publication statusPublished - 2019

Cite this