A First Estimation of the Proportion of Cybercriminal Entities in the Bitcoin Ecosystem using Supervised Machine Learning

Haohua Sun Yin, Ravi Vatrapu

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › Peer-reviewed


Bitcoin, a peer-to-peer payment system and digital currency, is often involved in illicit activities such as scamming, ransomware attacks, illegal goods trading, and thievery. At the time of writing, the Bitcoin ecosystem has not yet been mapped, and as such there is no estimate of the share of illicit activities. This paper provides the first estimation of the portion of cybercriminal entities in the Bitcoin ecosystem. Our dataset consists of 854 observations categorised into 12 classes (of which 5 are cybercrime-related) and a total of 100,000 uncategorised observations. The dataset was obtained from the data provider, who applied three types of clustering of Bitcoin transactions to categorise entities: co-spend, intelligence-based, and behaviour-based. Thirteen supervised learning classifiers were then tested, of which four prevailed with cross-validation accuracies of 77.38%, 76.47%, 78.46%, and 80.76% respectively. From the top four classifiers, the Bagging and Gradient Boosting classifiers were selected based on their weighted average and per-class precision on the cybercrime-related categories. Both models were used to classify the 100,000 uncategorised entities, showing that the share of cybercrime-related entities is 29.81% according to Bagging and 10.95% according to Gradient Boosting, with number of entities as the metric. With regard to the number of addresses and the current coins held by this type of entity, the results are 5.79% and 10.02% according to Bagging, and 3.16% and 1.45% according to Gradient Boosting.
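
The abstract reports the cybercrime share under three different metrics: number of entities, number of addresses, and coins currently held. The following is a minimal sketch of how such shares could be computed from classifier output. It is not the authors' code. The `Entity` record, its field names, the label strings, and the sample values are all hypothetical, since the abstract does not name the 12 classes or describe the data schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

# Hypothetical placeholder labels; the abstract does not list the actual
# names of the 5 cybercrime-related classes.
CYBERCRIME_LABELS = {"ransomware", "scam"}


@dataclass
class Entity:
    """One classified entity (hypothetical schema)."""
    predicted_label: str  # class assigned by a classifier
    n_addresses: int      # number of Bitcoin addresses in the entity
    coins_held: float     # coins currently held by the entity


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage, or 0.0 if whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def cybercrime_shares(entities: Iterable[Entity]) -> Dict[str, float]:
    """Share of cybercrime-related entities under three metrics."""
    all_entities = list(entities)
    flagged = [e for e in all_entities if e.predicted_label in CYBERCRIME_LABELS]
    return {
        "entities": pct(len(flagged), len(all_entities)),
        "addresses": pct(
            sum(e.n_addresses for e in flagged),
            sum(e.n_addresses for e in all_entities),
        ),
        "coins": pct(
            sum(e.coins_held for e in flagged),
            sum(e.coins_held for e in all_entities),
        ),
    }


# Made-up sample data, purely for illustration.
sample = [
    Entity("exchange", 600, 90.0),
    Entity("ransomware", 100, 5.0),
    Entity("gambling", 200, 0.0),
    Entity("scam", 100, 5.0),
]
shares = cybercrime_shares(sample)
print(shares)
```

In this toy sample, half of the entities are flagged, yet they account for a much smaller fraction of addresses and coins. This illustrates why a share measured by entity count can differ substantially from one measured by addresses or holdings.
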
Title: Proceedings. 2017 IEEE International Conference on Big Data: IEEE Big Data 2017
Editors: Jian-Yun Nie, Zoran Obradovic, Toyotaro Suzumura, Rumi Ghosh, Raghunath Nambiar, Chonggang Wang, Hui Zang, Ricardo Baeza-Yates, Xiaohua Hu, Jeremy Kepner, Alfredo Cuzzocrea, Jian Tang, Masashi Toyoda
Number of pages: 10
Place of publication: Los Alamitos, CA
ISBN (Print): 9781538627167
ISBN (Electronic): 9781538627150, 9781538627143
Status: Published - 2017
Event: Fifth IEEE International Conference on Big Data. IEEE BigData 2017 - Boston, USA
Duration: 11 Dec 2017 to 14 Dec 2017
Conference number: 5


Conference: Fifth IEEE International Conference on Big Data. IEEE BigData 2017


  • Bitcoin
  • Blockchain
  • Cryptocurrency
  • Ecosystem
  • Cybercrime
  • Machine Learning
  • Supervised Learning
  • Ransomware