Clustering Categories in Support Vector Machines

Emilio Carrizosa, Amaya Nogales-Gómez, Dolores Romero Morales

Publication: Working paper, Research, peer-reviewed

Abstract

Support Vector Machines (SVM) is the state-of-the-art in Supervised Classification. In this paper the Cluster Support Vector Machines (CLSVM) methodology is proposed with the aim to reduce the complexity of the SVM classifier in the presence of categorical features. The CLSVM methodology lets categories cluster around their peers and builds an SVM classifier using the clustered dataset. Four strategies for building the CLSVM classifier are presented based on solving: the original SVM formulation, a Quadratically Constrained Quadratic Programming formulation, and a Mixed Integer Quadratic Programming formulation as well as its continuous relaxation. The computational study illustrates the performance of the CLSVM classifier using two clusters. In the tested datasets our methodology achieves comparable accuracy to that of the SVM with original data but with a dramatic decrease in complexity.
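To make the idea of clustering categories concrete, here is a minimal sketch (not the authors' implementation) of how grouping the categories of one categorical feature into two clusters shrinks the number of binary inputs a classifier would see. The feature, its category names, and the cluster assignment are hypothetical and fixed by hand; in the paper the assignment is obtained by solving one of the four optimization formulations, which this sketch does not attempt.

```python
# Hypothetical categorical feature with five categories (illustrative only).
CATEGORIES = ["red", "green", "blue", "black", "white"]

# Hand-picked assignment of each category to one of two clusters.
# Assumption: in CLSVM this assignment would come from the optimization
# model, not from a fixed dictionary.
CLUSTER_OF = {"red": 0, "green": 0, "blue": 1, "black": 1, "white": 0}


def one_hot(value, categories=CATEGORIES):
    """Standard dummy encoding: one binary input per category."""
    return [1 if value == c else 0 for c in categories]


def clustered(value, cluster_of=CLUSTER_OF):
    """Clustered encoding: a single binary indicator records which of the
    two clusters the category belongs to."""
    return [cluster_of[value]]


records = ["blue", "white", "red", "black"]
original = [one_hot(v) for v in records]
reduced = [clustered(v) for v in records]

print(len(original[0]), "inputs per record before clustering")
print(len(reduced[0]), "input per record after clustering")
```

Any SVM trained on `reduced` instead of `original` has fewer coefficients to estimate for this feature, which is the kind of complexity reduction the abstract refers to; whether accuracy is preserved depends on how well the clusters are chosen.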
Language: English
Place of publication: www
Publisher: Mathematical Optimization Society
Number of pages: 20
Status: Published - 2014
Published externally: Yes
Name: Optimization Online
Number: 4403
Volume: 06

Keywords

  • Support vector machines
  • Categorical features
  • Classifier complexity
  • Clustering
  • Quadratically constrained programming
  • 0-1 programming

Cite this

Carrizosa, E., Nogales-Gómez, A., & Morales, D. R. (2014). Clustering Categories in Support Vector Machines. www: Mathematical Optimization Society. Optimization Online, No. 4403, Vol. 06.
Carrizosa, Emilio ; Nogales-Gómez, Amaya ; Morales, Dolores Romero. / Clustering Categories in Support Vector Machines. www : Mathematical Optimization Society, 2014. (Optimization Online; No. 4403, Vol. 06).
@techreport{e863502c734a40958328dfe1ad784b05,
title = "Clustering Categories in Support Vector Machines",
abstract = "Support Vector Machines (SVM) is the state-of-the-art in Supervised Classification. In this paper the Cluster Support Vector Machines (CLSVM) methodology is proposed with the aim to reduce the complexity of the SVM classifier in the presence of categorical features. The CLSVM methodology lets categories cluster around their peers and builds an SVM classifier using the clustered dataset. Four strategies for building the CLSVM classifier are presented based on solving: the original SVM formulation, a Quadratically Constrained Quadratic Programming formulation, and a Mixed Integer Quadratic Programming formulation as well as its continuous relaxation. The computational study illustrates the performance of the CLSVM classifier using two clusters. In the tested datasets our methodology achieves comparable accuracy to that of the SVM with original data but with a dramatic decrease in complexity.",
keywords = "Support vector machines, Categorical features, Classifier complexity, Clustering, Quadratically constrained programming, 0-1 programming",
author = "Emilio Carrizosa and Amaya Nogales-G{\'o}mez and Morales, {Dolores Romero}",
year = "2014",
language = "English",
publisher = "Mathematical Optimization Society",
address = "United States",
type = "WorkingPaper",
institution = "Mathematical Optimization Society",

}

Carrizosa, E, Nogales-Gómez, A & Morales, DR 2014 'Clustering Categories in Support Vector Machines' Mathematical Optimization Society, www.

Clustering Categories in Support Vector Machines. / Carrizosa, Emilio; Nogales-Gómez, Amaya; Morales, Dolores Romero.

www : Mathematical Optimization Society, 2014.

TY - UNPB

T1 - Clustering Categories in Support Vector Machines

AU - Carrizosa, Emilio

AU - Nogales-Gómez, Amaya

AU - Morales, Dolores Romero

PY - 2014

Y1 - 2014

AB - Support Vector Machines (SVM) is the state-of-the-art in Supervised Classification. In this paper the Cluster Support Vector Machines (CLSVM) methodology is proposed with the aim to reduce the complexity of the SVM classifier in the presence of categorical features. The CLSVM methodology lets categories cluster around their peers and builds an SVM classifier using the clustered dataset. Four strategies for building the CLSVM classifier are presented based on solving: the original SVM formulation, a Quadratically Constrained Quadratic Programming formulation, and a Mixed Integer Quadratic Programming formulation as well as its continuous relaxation. The computational study illustrates the performance of the CLSVM classifier using two clusters. In the tested datasets our methodology achieves comparable accuracy to that of the SVM with original data but with a dramatic decrease in complexity.

KW - Support vector machines

KW - Categorical features

KW - Classifier complexity

KW - Clustering

KW - Quadratically constrained programming

KW - 0-1 programming

M3 - Working paper

BT - Clustering Categories in Support Vector Machines

PB - Mathematical Optimization Society

CY - www

ER -

Carrizosa E, Nogales-Gómez A, Morales DR. Clustering Categories in Support Vector Machines. www: Mathematical Optimization Society. 2014.