Binarized Support Vector Machines

Emilio Carrizosa, Belén Martín-Barragán, Dolores Romero Morales

Publication: Contribution to journal > Journal article > Research > Peer-reviewed

Abstract

The widely used support vector machine (SVM) method has been shown to yield very good results in supervised classification problems. Other methods, such as classification trees, have become more popular among practitioners than SVM thanks to their interpretability, which is an important issue in data mining.
In this work, we propose an SVM-based method that automatically detects the most important predictor variables and the role they play in the classifier. In particular, the proposed method is able to detect those values and intervals that are critical for the classification. The method involves the optimization of a linear programming problem, in the spirit of the Lasso method, with a large number of decision variables. The numerical experience reported shows that a rather direct use of the standard column generation strategy leads to a classification method that, in terms of classification ability, is competitive with the standard linear SVM and classification trees. Moreover, the proposed method is robust; i.e., it is stable in the presence of outliers and invariant to changes of scale or measurement units of the predictor variables.
When the complexity of the classifier is an important issue, a wrapper feature selection method is applied, yielding simpler but still competitive classifiers.
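The abstract does not spell out how predictors are binarized, so the following is only a minimal sketch under one assumption: each predictor is replaced by 0/1 indicators of the form "is the value at least b?", with cutoffs b taken from the observed values. The helper names `cutoffs` and `binarize` are hypothetical, and the linear program and the column generation step are not reproduced here. The sketch only illustrates why features of this kind would be unaffected by a change of measurement units and bounded for outlying values.

```python
def cutoffs(column):
    """Candidate cutoffs for one predictor (assumed rule, not from the paper):
    every distinct observed value except the smallest, since an indicator
    that equals 1 for every training object carries no information."""
    return sorted(set(column))[1:]


def binarize(X, cuts):
    """Replace each value row[j] by one 0/1 indicator per cutoff b of
    predictor j: 1 if row[j] >= b, else 0."""
    return [
        [1 if row[j] >= b else 0 for j in range(len(cuts)) for b in cuts[j]]
        for row in X
    ]


# Toy data: the last object has an outlying value in predictor 0.
X = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0], [250.0, 20.0]]
cuts = [cutoffs([row[j] for row in X]) for j in range(2)]
B = binarize(X, cuts)

# Change the units of predictor 0 (e.g. metres to millimetres).
X_scaled = [[1000.0 * row[0], row[1]] for row in X]
cuts_scaled = [cutoffs([row[j] for row in X_scaled]) for j in range(2)]
B_scaled = binarize(X_scaled, cuts_scaled)

for row in B:
    print(row)
print("same binarized features after rescaling:", B == B_scaled)
```

Because the indicators depend only on the ordering of the observed values, the rescaled data yield exactly the same binary features, and the outlying value 250.0 contributes entries no larger than 1. A classifier that is linear in such indicators assigns a weight to each cutoff, which is one way the "critical values and intervals" mentioned above could be read off.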
Language: English
Journal: INFORMS Journal on Computing
Volume: 22
Issue number: 1
Pages: 154-167
ISSN: 1091-9856
DOI: 10.1287/ijoc.1090.0317
Status: Published - 2010
Published externally: Yes

Keywords

  • Supervised classification
  • Binarization
  • Column generation
  • Support vector machines

Cite this

Carrizosa, Emilio; Martín-Barragán, Belén; Morales, Dolores Romero. / Binarized Support Vector Machines. In: INFORMS Journal on Computing. 2010; Vol. 22, No. 1. pp. 154-167.
@article{4459e7e9567140bc90b5d7fa19286380,
title = "Binarized Support Vector Machines",
abstract = "The widely used support vector machine (SVM) method has been shown to yield very good results in supervised classification problems. Other methods, such as classification trees, have become more popular among practitioners than SVM thanks to their interpretability, which is an important issue in data mining. In this work, we propose an SVM-based method that automatically detects the most important predictor variables and the role they play in the classifier. In particular, the proposed method is able to detect those values and intervals that are critical for the classification. The method involves the optimization of a linear programming problem, in the spirit of the Lasso method, with a large number of decision variables. The numerical experience reported shows that a rather direct use of the standard column generation strategy leads to a classification method that, in terms of classification ability, is competitive with the standard linear SVM and classification trees. Moreover, the proposed method is robust; i.e., it is stable in the presence of outliers and invariant to changes of scale or measurement units of the predictor variables. When the complexity of the classifier is an important issue, a wrapper feature selection method is applied, yielding simpler but still competitive classifiers.",
keywords = "Supervised classification, Binarization, Column generation, Support vector machines",
author = "Emilio Carrizosa and Bel{\'e}n Mart{\'i}n-Barrag{\'a}n and Morales, {Dolores Romero}",
year = "2010",
doi = "10.1287/ijoc.1090.0317",
language = "English",
volume = "22",
pages = "154--167",
journal = "INFORMS Journal on Computing",
issn = "1091-9856",
publisher = "Institute for Operations Research and the Management Sciences",
number = "1",

}

Carrizosa, E, Martín-Barragán, B & Morales, DR 2010, 'Binarized Support Vector Machines', INFORMS Journal on Computing, vol. 22, no. 1, pp. 154-167. DOI: 10.1287/ijoc.1090.0317

Binarized Support Vector Machines. / Carrizosa, Emilio; Martín-Barragán, Belén; Morales, Dolores Romero.

In: INFORMS Journal on Computing, Vol. 22, No. 1, 2010, pp. 154-167.

TY - JOUR

T1 - Binarized Support Vector Machines

AU - Carrizosa,Emilio

AU - Martín-Barragán,Belén

AU - Morales,Dolores Romero

PY - 2010

Y1 - 2010

N2 - The widely used support vector machine (SVM) method has been shown to yield very good results in supervised classification problems. Other methods, such as classification trees, have become more popular among practitioners than SVM thanks to their interpretability, which is an important issue in data mining. In this work, we propose an SVM-based method that automatically detects the most important predictor variables and the role they play in the classifier. In particular, the proposed method is able to detect those values and intervals that are critical for the classification. The method involves the optimization of a linear programming problem, in the spirit of the Lasso method, with a large number of decision variables. The numerical experience reported shows that a rather direct use of the standard column generation strategy leads to a classification method that, in terms of classification ability, is competitive with the standard linear SVM and classification trees. Moreover, the proposed method is robust; i.e., it is stable in the presence of outliers and invariant to changes of scale or measurement units of the predictor variables. When the complexity of the classifier is an important issue, a wrapper feature selection method is applied, yielding simpler but still competitive classifiers.

KW - Supervised classification

KW - Binarization

KW - Column generation

KW - Support vector machines

U2 - 10.1287/ijoc.1090.0317

DO - 10.1287/ijoc.1090.0317

M3 - Journal article

VL - 22

SP - 154

EP - 167

JO - INFORMS Journal on Computing

T2 - INFORMS Journal on Computing

JF - INFORMS Journal on Computing

SN - 1091-9856

IS - 1

ER -

Carrizosa E, Martín-Barragán B, Morales DR. Binarized Support Vector Machines. INFORMS Journal on Computing. 2010;22(1):154-167. DOI: 10.1287/ijoc.1090.0317