Optimal Randomized Decision Trees

Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río, Dolores Romero Morales

Research output: Chapter in Book/Report/Conference proceedingConference abstract in proceedingsResearchpeer-review

Abstract

Classic decision trees [1] are defined by a set of orthogonal cuts, i.e., the branching rules are of the form variable X not lower than threshold c. The variables and thresholds are obtained by a greedy procedure. The use of a greedy strategy yields low computational cost, but may lead to myopic decisions. Although oblique cuts, with at least two variables, have also been proposed, they involve cumbersome algorithms to identify each cut of the tree. The latest advances in Optimization techniques have motivated further research on procedures to build optimal decision trees, with either orthogonal or oblique cuts. Mixed-Integer Optimization models have been recently proposed to tackle this problem, see for instance [2]. Although the results of such optimal decision trees are encouraging, the use of integer decision variables leads to hard optimization problems. In this talk, we propose to build optimal decision trees by solving nonlinear continuous optimization problems, thus avoiding the difficulties associated with integer decision variables. This is achieved by including a cumulative density function that will indicate the path to be followed inside the tree. Numerical results show the usefulness of this approach: using one single tree, we obtain better accuracies than classic decision trees, being much more flexible that those since sparsity or preference of performance in a subsample can be easily controlled.
Original languageEnglish
Title of host publicationConference proceedings : 2nd Spanish Young Statisticians and Operational Researchers Meeting
Number of pages1
Place of PublicationMadrid
PublisherUniversidad Complutense de Madrid * Servicio de Publicaciones
Publication date2019
Pages53
Publication statusPublished - 2019

Cite this

Blanquero, R., Carrizosa, E., Molero-Río, C., & Romero Morales, D. (2019). Optimal Randomized Decision Trees. In Conference proceedings: 2nd Spanish Young Statisticians and Operational Researchers Meeting (pp. 53). Madrid: Universidad Complutense de Madrid * Servicio de Publicaciones.
Blanquero, Rafael ; Carrizosa, Emilio ; Molero-Río, Cristina ; Romero Morales, Dolores . / Optimal Randomized Decision Trees. Conference proceedings: 2nd Spanish Young Statisticians and Operational Researchers Meeting. Madrid : Universidad Complutense de Madrid * Servicio de Publicaciones, 2019. pp. 53
@inbook{222f4679eb4a4b4f845e818e50d811bd,
title = "Optimal Randomized Decision Trees",
abstract = "Classic decision trees [1] are defined by a set of orthogonal cuts, i.e., the branching rules are of the form variable X not lower than threshold c. The variables and thresholds are obtained by a greedy procedure. The use of a greedy strategy yields low computational cost, but may lead to myopic decisions. Although oblique cuts, with at least two variables, have also been proposed, they involve cumbersome algorithms to identify each cut of the tree. The latest advances in Optimization techniques have motivated further research on procedures to build optimal decision trees, with either orthogonal or oblique cuts. Mixed-Integer Optimization models have been recently proposed to tackle this problem, see for instance [2]. Although the results of such optimal decision trees are encouraging, the use of integer decision variables leads to hard optimization problems. In this talk, we propose to build optimal decision trees by solving nonlinear continuous optimization problems, thus avoiding the difficulties associated with integer decision variables. This is achieved by including a cumulative density function that will indicate the path to be followed inside the tree. Numerical results show the usefulness of this approach: using one single tree, we obtain better accuracies than classic decision trees, being much more flexible that those since sparsity or preference of performance in a subsample can be easily controlled.",
author = "Rafael Blanquero and Emilio Carrizosa and Cristina Molero-R{\'i}o and {Romero Morales}, Dolores",
year = "2019",
language = "English",
pages = "53",
booktitle = "Conference proceedings",
publisher = "Universidad Complutense de Madrid * Servicio de Publicaciones",

}

Blanquero, R, Carrizosa, E, Molero-Río, C & Romero Morales, D 2019, Optimal Randomized Decision Trees. in Conference proceedings: 2nd Spanish Young Statisticians and Operational Researchers Meeting. Universidad Complutense de Madrid * Servicio de Publicaciones, Madrid, pp. 53.

Optimal Randomized Decision Trees. / Blanquero, Rafael; Carrizosa, Emilio; Molero-Río, Cristina; Romero Morales, Dolores .

Conference proceedings: 2nd Spanish Young Statisticians and Operational Researchers Meeting. Madrid : Universidad Complutense de Madrid * Servicio de Publicaciones, 2019. p. 53.

Research output: Chapter in Book/Report/Conference proceedingConference abstract in proceedingsResearchpeer-review

TY - ABST

T1 - Optimal Randomized Decision Trees

AU - Blanquero, Rafael

AU - Carrizosa, Emilio

AU - Molero-Río, Cristina

AU - Romero Morales, Dolores

PY - 2019

Y1 - 2019

N2 - Classic decision trees [1] are defined by a set of orthogonal cuts, i.e., the branching rules are of the form variable X not lower than threshold c. The variables and thresholds are obtained by a greedy procedure. The use of a greedy strategy yields low computational cost, but may lead to myopic decisions. Although oblique cuts, with at least two variables, have also been proposed, they involve cumbersome algorithms to identify each cut of the tree. The latest advances in Optimization techniques have motivated further research on procedures to build optimal decision trees, with either orthogonal or oblique cuts. Mixed-Integer Optimization models have been recently proposed to tackle this problem, see for instance [2]. Although the results of such optimal decision trees are encouraging, the use of integer decision variables leads to hard optimization problems. In this talk, we propose to build optimal decision trees by solving nonlinear continuous optimization problems, thus avoiding the difficulties associated with integer decision variables. This is achieved by including a cumulative density function that will indicate the path to be followed inside the tree. Numerical results show the usefulness of this approach: using one single tree, we obtain better accuracies than classic decision trees, being much more flexible that those since sparsity or preference of performance in a subsample can be easily controlled.

AB - Classic decision trees [1] are defined by a set of orthogonal cuts, i.e., the branching rules are of the form variable X not lower than threshold c. The variables and thresholds are obtained by a greedy procedure. The use of a greedy strategy yields low computational cost, but may lead to myopic decisions. Although oblique cuts, with at least two variables, have also been proposed, they involve cumbersome algorithms to identify each cut of the tree. The latest advances in Optimization techniques have motivated further research on procedures to build optimal decision trees, with either orthogonal or oblique cuts. Mixed-Integer Optimization models have been recently proposed to tackle this problem, see for instance [2]. Although the results of such optimal decision trees are encouraging, the use of integer decision variables leads to hard optimization problems. In this talk, we propose to build optimal decision trees by solving nonlinear continuous optimization problems, thus avoiding the difficulties associated with integer decision variables. This is achieved by including a cumulative density function that will indicate the path to be followed inside the tree. Numerical results show the usefulness of this approach: using one single tree, we obtain better accuracies than classic decision trees, being much more flexible that those since sparsity or preference of performance in a subsample can be easily controlled.

M3 - Conference abstract in proceedings

SP - 53

BT - Conference proceedings

PB - Universidad Complutense de Madrid * Servicio de Publicaciones

CY - Madrid

ER -

Blanquero R, Carrizosa E, Molero-Río C, Romero Morales D. Optimal Randomized Decision Trees. In Conference proceedings: 2nd Spanish Young Statisticians and Operational Researchers Meeting. Madrid: Universidad Complutense de Madrid * Servicio de Publicaciones. 2019. p. 53