Optimal Randomized Decision Trees

Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río, Dolores Romero Morales

Publikation: Bidrag til bog/antologi/rapportKonferenceabstrakt i proceedingsForskningpeer review

Resumé

Classic decision trees [1] are defined by a set of orthogonal cuts, i.e., the branching rules are of the form variable X not lower than threshold c. The variables and thresholds are obtained by a greedy procedure. The use of a greedy strategy yields low computational cost, but may lead to myopic decisions. Although oblique cuts, with at least two variables, have also been proposed, they involve cumbersome algorithms to identify each cut of the tree. The latest advances in Optimization techniques have motivated further research on procedures to build optimal decision trees, with either orthogonal or oblique cuts. Mixed-Integer Optimization models have been recently proposed to tackle this problem, see for instance [2]. Although the results of such optimal decision trees are encouraging, the use of integer decision variables leads to hard optimization problems. In this talk, we propose to build optimal decision trees by solving nonlinear continuous optimization problems, thus avoiding the difficulties associated with integer decision variables. This is achieved by including a cumulative density function that will indicate the path to be followed inside the tree. Numerical results show the usefulness of this approach: using one single tree, we obtain better accuracies than classic decision trees, being much more flexible that those since sparsity or preference of performance in a subsample can be easily controlled.
OriginalsprogEngelsk
TitelConference proceedings : 2nd Spanish Young Statisticians and Operational Researchers Meeting
Antal sider1
Udgivelses stedMadrid
ForlagUniversidad Complutense de Madrid * Servicio de Publicaciones
Publikationsdato2019
Sider53
StatusUdgivet - 2019

Citer dette

Blanquero, R., Carrizosa, E., Molero-Río, C., & Romero Morales, D. (2019). Optimal Randomized Decision Trees. I Conference proceedings: 2nd Spanish Young Statisticians and Operational Researchers Meeting (s. 53). Madrid: Universidad Complutense de Madrid * Servicio de Publicaciones.
Blanquero, Rafael ; Carrizosa, Emilio ; Molero-Río, Cristina ; Romero Morales, Dolores . / Optimal Randomized Decision Trees. Conference proceedings: 2nd Spanish Young Statisticians and Operational Researchers Meeting. Madrid : Universidad Complutense de Madrid * Servicio de Publicaciones, 2019. s. 53
@inbook{222f4679eb4a4b4f845e818e50d811bd,
title = "Optimal Randomized Decision Trees",
abstract = "Classic decision trees [1] are defined by a set of orthogonal cuts, i.e., the branching rules are of the form variable X not lower than threshold c. The variables and thresholds are obtained by a greedy procedure. The use of a greedy strategy yields low computational cost, but may lead to myopic decisions. Although oblique cuts, with at least two variables, have also been proposed, they involve cumbersome algorithms to identify each cut of the tree. The latest advances in Optimization techniques have motivated further research on procedures to build optimal decision trees, with either orthogonal or oblique cuts. Mixed-Integer Optimization models have been recently proposed to tackle this problem, see for instance [2]. Although the results of such optimal decision trees are encouraging, the use of integer decision variables leads to hard optimization problems. In this talk, we propose to build optimal decision trees by solving nonlinear continuous optimization problems, thus avoiding the difficulties associated with integer decision variables. This is achieved by including a cumulative density function that will indicate the path to be followed inside the tree. Numerical results show the usefulness of this approach: using one single tree, we obtain better accuracies than classic decision trees, being much more flexible that those since sparsity or preference of performance in a subsample can be easily controlled.",
author = "Rafael Blanquero and Emilio Carrizosa and Cristina Molero-R{\'i}o and {Romero Morales}, Dolores",
year = "2019",
language = "English",
pages = "53",
booktitle = "Conference proceedings",
publisher = "Universidad Complutense de Madrid * Servicio de Publicaciones",

}

Blanquero, R, Carrizosa, E, Molero-Río, C & Romero Morales, D 2019, Optimal Randomized Decision Trees. i Conference proceedings: 2nd Spanish Young Statisticians and Operational Researchers Meeting. Universidad Complutense de Madrid * Servicio de Publicaciones, Madrid, s. 53.

Optimal Randomized Decision Trees. / Blanquero, Rafael; Carrizosa, Emilio; Molero-Río, Cristina; Romero Morales, Dolores .

Conference proceedings: 2nd Spanish Young Statisticians and Operational Researchers Meeting. Madrid : Universidad Complutense de Madrid * Servicio de Publicaciones, 2019. s. 53.

Publikation: Bidrag til bog/antologi/rapportKonferenceabstrakt i proceedingsForskningpeer review

TY - ABST

T1 - Optimal Randomized Decision Trees

AU - Blanquero, Rafael

AU - Carrizosa, Emilio

AU - Molero-Río, Cristina

AU - Romero Morales, Dolores

PY - 2019

Y1 - 2019

N2 - Classic decision trees [1] are defined by a set of orthogonal cuts, i.e., the branching rules are of the form variable X not lower than threshold c. The variables and thresholds are obtained by a greedy procedure. The use of a greedy strategy yields low computational cost, but may lead to myopic decisions. Although oblique cuts, with at least two variables, have also been proposed, they involve cumbersome algorithms to identify each cut of the tree. The latest advances in Optimization techniques have motivated further research on procedures to build optimal decision trees, with either orthogonal or oblique cuts. Mixed-Integer Optimization models have been recently proposed to tackle this problem, see for instance [2]. Although the results of such optimal decision trees are encouraging, the use of integer decision variables leads to hard optimization problems. In this talk, we propose to build optimal decision trees by solving nonlinear continuous optimization problems, thus avoiding the difficulties associated with integer decision variables. This is achieved by including a cumulative density function that will indicate the path to be followed inside the tree. Numerical results show the usefulness of this approach: using one single tree, we obtain better accuracies than classic decision trees, being much more flexible that those since sparsity or preference of performance in a subsample can be easily controlled.

AB - Classic decision trees [1] are defined by a set of orthogonal cuts, i.e., the branching rules are of the form variable X not lower than threshold c. The variables and thresholds are obtained by a greedy procedure. The use of a greedy strategy yields low computational cost, but may lead to myopic decisions. Although oblique cuts, with at least two variables, have also been proposed, they involve cumbersome algorithms to identify each cut of the tree. The latest advances in Optimization techniques have motivated further research on procedures to build optimal decision trees, with either orthogonal or oblique cuts. Mixed-Integer Optimization models have been recently proposed to tackle this problem, see for instance [2]. Although the results of such optimal decision trees are encouraging, the use of integer decision variables leads to hard optimization problems. In this talk, we propose to build optimal decision trees by solving nonlinear continuous optimization problems, thus avoiding the difficulties associated with integer decision variables. This is achieved by including a cumulative density function that will indicate the path to be followed inside the tree. Numerical results show the usefulness of this approach: using one single tree, we obtain better accuracies than classic decision trees, being much more flexible that those since sparsity or preference of performance in a subsample can be easily controlled.

M3 - Conference abstract in proceedings

SP - 53

BT - Conference proceedings

PB - Universidad Complutense de Madrid * Servicio de Publicaciones

CY - Madrid

ER -

Blanquero R, Carrizosa E, Molero-Río C, Romero Morales D. Optimal Randomized Decision Trees. I Conference proceedings: 2nd Spanish Young Statisticians and Operational Researchers Meeting. Madrid: Universidad Complutense de Madrid * Servicio de Publicaciones. 2019. s. 53