Decision-making Enhancement in a Big Data Environment: Application of the K-means Algorithm to Mixed Data

Oded Koren, Carina Antonia Hallin, Nir Perel, Dror Bendet

Publication: Contribution to journal › Journal article › Research › peer-review

Abstract

Big data research has become an important discipline in information systems research. However, the flood of data being generated on the Internet is increasingly unstructured and non-numeric in the form of images and texts. Thus, research indicates that there is an increasing need to develop more efficient algorithms for treating mixed data in big data for effective decision making. In this paper, we apply the classical K-means algorithm to both numeric and categorical attributes in big data platforms. We first present an algorithm that handles the problem of mixed data. We then use big data platforms to implement the algorithm, demonstrating its functionalities by applying the algorithm in a detailed case study. This provides us with a solid basis for performing more targeted profiling for decision making and research using big data. Consequently, the decision makers will be able to treat mixed data, numerical and categorical data, to explain and predict phenomena in the big data ecosystem. Our research includes a detailed end-to-end case study that presents an implementation of the suggested procedure. This demonstrates its capabilities and the advantages that allow it to improve the decision-making process by targeting organizations’ business requirements to a specific cluster[s]/profiles[s] based on the enhancement outcomes.
Original language: English
Journal: Journal of Artificial Intelligence and Soft Computing Research
Volume: 9
Issue number: 4
Pages (from-to): 293-302
Number of pages: 10
ISSN: 2083-2567
DOI: 10.2478/jaiscr-2019-0010
Status: Published - Oct. 2019

Keywords

  • Big data
  • Mixed data
  • Hadoop
  • K-means
  • Decision making

Cite this

@article{2ad6be5f7109470cb25aa2ff724b70fb,
title = "Decision-making Enhancement in a Big Data Environment: Application of the K-means Algorithm to Mixed Data",
abstract = "Big data research has become an important discipline in information systems research. However, the flood of data being generated on the Internet is increasingly unstructured and non-numeric in the form of images and texts. Thus, research indicates that there is an increasing need to develop more efficient algorithms for treating mixed data in big data for effective decision making. In this paper, we apply the classical K-means algorithm to both numeric and categorical attributes in big data platforms. We first present an algorithm that handles the problem of mixed data. We then use big data platforms to implement the algorithm, demonstrating its functionalities by applying the algorithm in a detailed case study. This provides us with a solid basis for performing more targeted profiling for decision making and research using big data. Consequently, the decision makers will be able to treat mixed data, numerical and categorical data, to explain and predict phenomena in the big data ecosystem. Our research includes a detailed end-to-end case study that presents an implementation of the suggested procedure. This demonstrates its capabilities and the advantages that allow it to improve the decision-making process by targeting organizations’ business requirements to a specific cluster[s]/profiles[s] based on the enhancement outcomes.",
keywords = "Big data, Mixed data, Hadoop, K-means, Decision making",
author = "Oded Koren and Hallin, {Carina Antonia} and Nir Perel and Dror Bendet",
year = "2019",
month = "10",
doi = "10.2478/jaiscr-2019-0010",
language = "English",
volume = "9",
pages = "293--302",
journal = "Journal of Artificial Intelligence and Soft Computing Research",
issn = "2083-2567",
publisher = "De Gruyter Open",
number = "4",

}

Decision-making Enhancement in a Big Data Environment: Application of the K-means Algorithm to Mixed Data. / Koren, Oded; Hallin, Carina Antonia; Perel, Nir; Bendet, Dror.

In: Journal of Artificial Intelligence and Soft Computing Research, Vol. 9, No. 4, 10.2019, pp. 293-302.

TY - JOUR

T1 - Decision-making Enhancement in a Big Data Environment

T2 - Application of the K-means Algorithm to Mixed Data

AU - Koren, Oded

AU - Hallin, Carina Antonia

AU - Perel, Nir

AU - Bendet, Dror

PY - 2019/10

Y1 - 2019/10

N2 - Big data research has become an important discipline in information systems research. However, the flood of data being generated on the Internet is increasingly unstructured and non-numeric in the form of images and texts. Thus, research indicates that there is an increasing need to develop more efficient algorithms for treating mixed data in big data for effective decision making. In this paper, we apply the classical K-means algorithm to both numeric and categorical attributes in big data platforms. We first present an algorithm that handles the problem of mixed data. We then use big data platforms to implement the algorithm, demonstrating its functionalities by applying the algorithm in a detailed case study. This provides us with a solid basis for performing more targeted profiling for decision making and research using big data. Consequently, the decision makers will be able to treat mixed data, numerical and categorical data, to explain and predict phenomena in the big data ecosystem. Our research includes a detailed end-to-end case study that presents an implementation of the suggested procedure. This demonstrates its capabilities and the advantages that allow it to improve the decision-making process by targeting organizations’ business requirements to a specific cluster[s]/profiles[s] based on the enhancement outcomes.

AB - Big data research has become an important discipline in information systems research. However, the flood of data being generated on the Internet is increasingly unstructured and non-numeric in the form of images and texts. Thus, research indicates that there is an increasing need to develop more efficient algorithms for treating mixed data in big data for effective decision making. In this paper, we apply the classical K-means algorithm to both numeric and categorical attributes in big data platforms. We first present an algorithm that handles the problem of mixed data. We then use big data platforms to implement the algorithm, demonstrating its functionalities by applying the algorithm in a detailed case study. This provides us with a solid basis for performing more targeted profiling for decision making and research using big data. Consequently, the decision makers will be able to treat mixed data, numerical and categorical data, to explain and predict phenomena in the big data ecosystem. Our research includes a detailed end-to-end case study that presents an implementation of the suggested procedure. This demonstrates its capabilities and the advantages that allow it to improve the decision-making process by targeting organizations’ business requirements to a specific cluster[s]/profiles[s] based on the enhancement outcomes.

KW - Big data

KW - Mixed data

KW - Hadoop

KW - K-means

KW - Decision making

UR - https://sfx-45cbs.hosted.exlibrisgroup.com/45cbs?url_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&ctx_enc=info:ofi/enc:UTF-8&ctx_ver=Z39.88-2004&rfr_id=info:sid/sfxit.com:azlist&sfx.ignore_date_threshold=1&rft.object_id=3380000000000378&rft.object_portfolio_id=&svc.holdings=yes&svc.fulltext=yes

U2 - 10.2478/jaiscr-2019-0010

DO - 10.2478/jaiscr-2019-0010

M3 - Journal article

VL - 9

SP - 293

EP - 302

JO - Journal of Artificial Intelligence and Soft Computing Research

JF - Journal of Artificial Intelligence and Soft Computing Research

SN - 2083-2567

IS - 4

ER -