Clustering Categories in Support Vector Machines

Emilio Carrizosa, Amaya Nogales-Gómez, Dolores Romero Morales

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

The support vector machine (SVM) is a state-of-the-art method in supervised classification. In this paper the Cluster Support Vector Machine (CLSVM) methodology is proposed with the aim to increase the sparsity of the SVM classifier in the presence of categorical features, leading to a gain in interpretability. The CLSVM methodology clusters categories and builds the SVM classifier in the clustered feature space. Four strategies for building the CLSVM classifier are presented based on solving: the SVM formulation in the original feature space, a quadratically constrained quadratic programming formulation, and a mixed integer quadratic programming formulation as well as its continuous relaxation. The computational study illustrates the performance of the CLSVM classifier using two clusters. In the tested datasets our methodology achieves comparable accuracy to that of the SVM in the original feature space, with a dramatic increase in sparsity.
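To make the idea of a "clustered feature space" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single categorical feature encoded with one dummy column per category, and it assumes a simple rule for forming two clusters: categories are grouped by the sign of a per-category weight. The feature `colors`, the `weights` values, and the helper names `cluster_by_sign` and `one_hot` are all hypothetical and chosen only for illustration; the abstract does not specify how the clusters are obtained.

```python
def cluster_by_sign(weights):
    """Assign each category to cluster 0 (negative weight) or 1 (non-negative).

    This sign rule is an illustrative assumption, not taken from the paper.
    """
    return {cat: int(w >= 0) for cat, w in weights.items()}


def one_hot(values, levels):
    """Encode each value as a 0/1 indicator row over the given levels."""
    return [[1 if v == lvl else 0 for lvl in levels] for v in values]


# Hypothetical categorical feature with five categories.
colors = ["red", "green", "blue", "black", "white", "red", "blue"]
levels = sorted(set(colors))  # ['black', 'blue', 'green', 'red', 'white']

# Hypothetical weights, as if read off a linear classifier fitted on the
# five dummy columns of the original feature space.
weights = {"black": -0.8, "blue": 0.3, "green": 0.5, "red": -0.1, "white": 0.9}

clusters = cluster_by_sign(weights)

# Original feature space: one column per category (5 columns).
original = one_hot(colors, levels)

# Clustered feature space: one column per cluster (2 columns).
clustered = one_hot([clusters[c] for c in colors], [0, 1])

print(len(original[0]), "columns before clustering")
print(len(clustered[0]), "columns after clustering")
```

A linear classifier trained on `clustered` needs at most two coefficients for this feature instead of five, which is the kind of sparsity gain the abstract refers to.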
Language: English
Journal: Omega
Volume: 66
Issue number: Part A
Pages: 28-37
Number of pages: 20
ISSN: 0305-0483
DOI: 10.1016/j.omega.2016.01.008
State: Published - Jan 2017

Keywords

  • Support vector machine
  • Categorical features
  • Classifier sparsity
  • Clustering
  • Quadratically constrained programming
  • 0-1 programming

Cite this

Carrizosa, E., Nogales-Gómez, A., & Morales, D. R. (2017). Clustering Categories in Support Vector Machines. Omega, 66(Part A), 28-37. DOI: 10.1016/j.omega.2016.01.008
Carrizosa, Emilio; Nogales-Gómez, Amaya; Morales, Dolores Romero. / Clustering Categories in Support Vector Machines. In: Omega. 2017; Vol. 66, No. Part A. pp. 28-37.
@article{f57036cf7d44404ab999a3a638b5394c,
title = "Clustering Categories in Support Vector Machines",
abstract = "The support vector machine (SVM) is a state-of-the-art method in supervised classification. In this paper the Cluster Support Vector Machine (CLSVM) methodology is proposed with the aim to increase the sparsity of the SVM classifier in the presence of categorical features, leading to a gain in interpretability. The CLSVM methodology clusters categories and builds the SVM classifier in the clustered feature space. Four strategies for building the CLSVM classifier are presented based on solving: the SVM formulation in the original feature space, a quadratically constrained quadratic programming formulation, and a mixed integer quadratic programming formulation as well as its continuous relaxation. The computational study illustrates the performance of the CLSVM classifier using two clusters. In the tested datasets our methodology achieves comparable accuracy to that of the SVM in the original feature space, with a dramatic increase in sparsity.",
keywords = "Support vector machine, Categorical features, Classifier sparsity, Clustering, Quadratically constrained programming, 0-1 programming",
author = "Emilio Carrizosa and Amaya Nogales-G{\'o}mez and Morales, {Dolores Romero}",
year = "2017",
month = "1",
doi = "10.1016/j.omega.2016.01.008",
language = "English",
volume = "66",
pages = "28--37",
journal = "Omega: The International Journal of Management Science",
issn = "0305-0483",
publisher = "Elsevier",
number = "Part A",

}

Carrizosa, E, Nogales-Gómez, A & Morales, DR 2017, 'Clustering Categories in Support Vector Machines', Omega, vol. 66, no. Part A, pp. 28-37. DOI: 10.1016/j.omega.2016.01.008

Clustering Categories in Support Vector Machines. / Carrizosa, Emilio; Nogales-Gómez, Amaya; Morales, Dolores Romero.

In: Omega, Vol. 66, No. Part A, 01.2017, p. 28-37.

TY - JOUR

T1 - Clustering Categories in Support Vector Machines

AU - Carrizosa,Emilio

AU - Nogales-Gómez,Amaya

AU - Morales,Dolores Romero

PY - 2017/1

Y1 - 2017/1

AB - The support vector machine (SVM) is a state-of-the-art method in supervised classification. In this paper the Cluster Support Vector Machine (CLSVM) methodology is proposed with the aim to increase the sparsity of the SVM classifier in the presence of categorical features, leading to a gain in interpretability. The CLSVM methodology clusters categories and builds the SVM classifier in the clustered feature space. Four strategies for building the CLSVM classifier are presented based on solving: the SVM formulation in the original feature space, a quadratically constrained quadratic programming formulation, and a mixed integer quadratic programming formulation as well as its continuous relaxation. The computational study illustrates the performance of the CLSVM classifier using two clusters. In the tested datasets our methodology achieves comparable accuracy to that of the SVM in the original feature space, with a dramatic increase in sparsity.

KW - Support vector machine

KW - Categorical features

KW - Classifier sparsity

KW - Clustering

KW - Quadratically constrained programming

KW - 0-1 programming

UR - http://sfx-45cbs.hosted.exlibrisgroup.com/45cbs?url_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&ctx_enc=info:ofi/enc:UTF-8&ctx_ver=Z39.88-2004&rfr_id=info:sid/sfxit.com:azlist&sfx.ignore_date_threshold=1&rft.object_id=954925430321&rft.object_portfolio_id=&svc.holdings=yes&svc.fulltext=yes

U2 - 10.1016/j.omega.2016.01.008

DO - 10.1016/j.omega.2016.01.008

M3 - Journal article

VL - 66

SP - 28

EP - 37

JO - Omega: The International Journal of Management Science

T2 - Omega: The International Journal of Management Science

JF - Omega: The International Journal of Management Science

SN - 0305-0483

IS - Part A

ER -

Carrizosa E, Nogales-Gómez A, Morales DR. Clustering Categories in Support Vector Machines. Omega. 2017 Jan;66(Part A):28-37. DOI: 10.1016/j.omega.2016.01.008