HYBCIM: Hypercube Based Cluster Initialization Method for K-Means
Manoj Kumar Gupta¹, Pravin Chandra²
¹Manoj Kumar Gupta, Research Scholar, USIC&T, Guru Gobind Singh Indraprastha University, Delhi.
²Pravin Chandra, Professor, USIC&T, Guru Gobind Singh Indraprastha University, Delhi, India.
Manuscript received on 15 August 2019 | Revised Manuscript received on 20 August 2019 | Manuscript published on 30 August 2019 | PP: 3584-3587 | Volume-8 Issue-10, August 2019 | Retrieval Number: J97740881019/19©BEIESP | DOI: 10.35940/ijitee.J9774.0881019
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Clustering is a data processing technique that is extensively used in data mining to find novel patterns in data, and it is also used in classification techniques. The k-means algorithm is widely used for clustering because of its simplicity and reliability. The initial choice of cluster centroids has a major effect on the accuracy and performance of the k-means algorithm. Two commonly used quality parameters of a clustering technique are the sum of squared distances of the points in each cluster from that cluster's centroid (SSW), which should be minimized, and the sum of squared distances between the centroids of different clusters (SSB), which should be maximized. To improve the accuracy, performance, and quality parameters of the k-means algorithm, this work proposes a new Hypercube Based Cluster Initialization Method, called HYBCIM. In the proposed method, the collection of k equi-sized partitions of all dimensions is modeled as a hypercube. The motivation behind the proposed method is that clusters may spread horizontally, vertically, diagonally, or in an arc shape. The proposed method is empirically evaluated on four popular data sets. The results show that the proposed method is superior to basic k-means. HYBCIM is applicable to clustering both discrete and continuous data. Although HYBCIM is proposed for k-means, it can also be applied to other clustering algorithms that depend on initial cluster centroids.
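The two quality parameters named in the abstract can be illustrated with a minimal sketch. It assumes squared Euclidean distance and reads SSB as the sum over all unordered pairs of centroids; the abstract does not give exact formulas, so both choices are assumptions, and the function names (`sq_dist`, `ssw`, `ssb`) and the toy data are hypothetical rather than taken from the paper.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ssw(points, labels, centroids):
    """Sum of squared distances from each point to its own cluster centroid.

    labels[i] is the index into centroids of the cluster that holds points[i].
    Lower values indicate more compact clusters.
    """
    return sum(sq_dist(p, centroids[c]) for p, c in zip(points, labels))


def ssb(centroids):
    """Sum of squared distances over all unordered pairs of centroids.

    Assumption: this pairwise reading follows the abstract's wording; other
    definitions of SSB exist. Higher values indicate better separated clusters.
    """
    return sum(sq_dist(a, b) for a, b in combinations(centroids, 2))


# Toy data (hypothetical): two clusters of two points each in 2-D.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]
centroids = [(0.0, 1.0), (10.0, 1.0)]

print(ssw(points, labels, centroids))  # 4.0: each point is 1 unit from its centroid
print(ssb(centroids))                  # 100.0: the two centroids are 10 units apart
```

Under these assumptions, an initialization method can be compared with basic k-means by running both on the same data and checking which final clustering gives the lower SSW and the higher SSB.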
Index Terms: Clustering; K-Means; Cluster Initialization; Hypercube Based Cluster Initialization Method; Unsupervised Learning.
Scope of the Article: E-Learning