An Enhanced Spectral Clustering Algorithm with S-Distance

Kumar Sharma, Krishna; Seal, Ayan; Herrera Viedma, Enrique; Krejcar, Ondrej

doi:10.3390/sym13040596

symmetry-13-00596-v3.pdf (1.602Mb)

Identifiers

URI: http://hdl.handle.net/10481/68697

DOI: 10.3390/sym13040596

Publisher

MDPI

Subject

S-divergence

S-distance

Spectral clustering

Date

2021-04-02

Bibliographic reference

Kumar Sharma, K.; Seal, A.; Herrera-Viedma, E.; Krejcar, O. An Enhanced Spectral Clustering Algorithm with S-Distance. Symmetry 2021, 13, 596. [https://doi.org/10.3390/sym13040596]

Sponsor

project "Prediction of diseases through computer assisted diagnosis system using images captured by minimally-invasive and non-invasive modalities", Computer Science and Engineering, PDPM Indian Institute of Information Technology, Design and Manufacturing SPARCMHRD-231; project "Smart Solutions in Ubiquitous Computing Environments", Grant Agency of Excellence, University of Hradec Kralove, Faculty of Informatics and Management, Czech Republic UHK-FIM-GE-2204/2021; Universiti Teknologi Malaysia (UTM) 20H04; Malaysia Research University Network (MRUN) 4L876; Fundamental Research Grant Scheme (FRGS) by the Ministry of Education Malaysia 5F073

Abstract

Calculating and monitoring customer churn metrics helps companies retain customers and increase profit. In this study, a churn prediction framework is developed using a modified spectral clustering (SC) algorithm. Because the similarity measure plays a key role in how accurately clustering predicts churn from industrial data, the linear Euclidean distance in traditional SC is replaced by the non-linear S-distance (Sd), which is derived from the concept of S-divergence (SD). Several properties of Sd are discussed in this work. Experiments are conducted to validate the proposed clustering algorithm on four synthetic databases, eight UCI databases, two industrial databases and one telecommunications database related to customer churn. Three existing clustering algorithms (k-means, density-based spatial clustering of applications with noise, and conventional SC) are also implemented on these 15 databases. The empirical results show that the proposed clustering algorithm outperforms the three existing algorithms in terms of Jaccard index, f-score, recall, precision and accuracy. Finally, the significance of the clustering results is tested with Wilcoxon's signed-rank test, Wilcoxon's rank-sum test, and the sign test. The comparative study shows that the results of the proposed algorithm are noteworthy, especially for clusters of arbitrary shape.
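
The core idea of the abstract, replacing the Euclidean distance in spectral clustering with the S-distance, can be illustrated with a minimal sketch. The abstract does not give the Sd formula, so the code assumes the commonly cited form for strictly positive feature vectors, Sd(x, y) = sqrt(sum_i [log((x_i + y_i)/2) - (log x_i + log y_i)/2]). The Gaussian affinity, the normalized-affinity embedding, the `sigma` parameter, the small k-means helper and all function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def s_distance_matrix(X):
    """Pairwise S-distance between rows of X (assumed formula, see lead-in)."""
    X = np.asarray(X, dtype=float)
    if np.any(X <= 0):
        raise ValueError("this S-distance form needs strictly positive features")
    log_x = np.log(X)
    # log of the arithmetic mean minus the mean of the logs, per feature
    log_mean = np.log((X[:, None, :] + X[None, :, :]) / 2.0)
    mean_log = (log_x[:, None, :] + log_x[None, :, :]) / 2.0
    sq = np.sum(log_mean - mean_log, axis=-1)
    # non-negative by the AM-GM inequality; clip guards against round-off
    return np.sqrt(np.clip(sq, 0.0, None))

def spectral_embedding(D, k, sigma=1.0):
    """Row-normalized top-k eigenvectors of the normalized affinity matrix."""
    W = np.exp(-D ** 2 / (2.0 * sigma ** 2))  # Gaussian affinity (assumption)
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    M = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    U = vecs[:, -k:]
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(U, k, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

def sd_spectral_clustering(X, k, sigma=1.0):
    """Spectral clustering with S-distance in place of Euclidean distance."""
    D = s_distance_matrix(X)
    return kmeans(spectral_embedding(D, k, sigma), k)

# Toy data: two groups of positive 2-D points (made up for illustration).
X_demo = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.2],
                   [100.0, 100.0], [110.0, 95.0], [90.0, 105.0]])
labels_demo = sd_spectral_clustering(X_demo, k=2, sigma=0.5)
```

On this toy input the first three points and the last three points should fall into separate clusters. The choice of `sigma` strongly affects the affinity matrix and would need tuning on real data.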

Collections

DCCIA - Articles

Except where otherwise noted, this item's license is described as Attribution 3.0 Spain