TY - JOUR

T1 - On Clustering and Interpreting with Rules by Means of Mathematical Optimization

AU - Carrizosa, Emilio

AU - Kurishchenko, Kseniia

AU - Marín, Alfredo

AU - Romero Morales, Dolores

PY - 2023/6

Y1 - 2023/6

N2 - In this paper, we make Cluster Analysis more interpretable with a new approach that simultaneously allocates individuals to clusters and gives rule-based explanations to each cluster. The traditional homogeneity metric in clustering, namely the sum of the dissimilarities between individuals in the same cluster, is enriched by considering also, for each cluster and its associated explanation, two explainability criteria, namely, the accuracy of the explanation, i.e., how many individuals within the cluster satisfy its explanation, and the distinctiveness of the explanation, i.e., how many individuals outside the cluster satisfy its explanation. Finding the clusters and the explanations optimizing a joint measure of homogeneity, accuracy, and distinctiveness is formulated as a multi-objective Mixed Integer Linear Optimization problem, from which non-dominated solutions are generated. Our approach is tested on real-world datasets.

AB - In this paper, we make Cluster Analysis more interpretable with a new approach that simultaneously allocates individuals to clusters and gives rule-based explanations to each cluster. The traditional homogeneity metric in clustering, namely the sum of the dissimilarities between individuals in the same cluster, is enriched by considering also, for each cluster and its associated explanation, two explainability criteria, namely, the accuracy of the explanation, i.e., how many individuals within the cluster satisfy its explanation, and the distinctiveness of the explanation, i.e., how many individuals outside the cluster satisfy its explanation. Finding the clusters and the explanations optimizing a joint measure of homogeneity, accuracy, and distinctiveness is formulated as a multi-objective Mixed Integer Linear Optimization problem, from which non-dominated solutions are generated. Our approach is tested on real-world datasets.

KW - Machine learning

KW - Interpretability

KW - Cluster analysis

KW - Rules

KW - Mixed-Integer Programming

U2 - 10.1016/j.cor.2023.106180

DO - 10.1016/j.cor.2023.106180

M3 - Journal article

SN - 0305-0548

VL - 154

JO - Computers & Operations Research

JF - Computers & Operations Research

M1 - 106180

ER -