Hierarchical Clustering Based Improved Data Partitioning using Hybrid Similarity Measurement Approach
Kiranjit Kaur1, Vijay Laxmi2

1Kiranjit Kaur, Department of Electronics & Communication, Thapar Institute of Engineering & Technology, Patiala, India.
2Vijay Laxmi, Department of Electronics & Communication, Thapar Institute of Engineering & Technology, Patiala, India.

Manuscript received on 02 June 2019 | Revised Manuscript received on 10 June 2019 | Manuscript published on 30 June 2019 | PP: 3008-3014 | Volume-8 Issue-8, June 2019 | Retrieval Number: H7039068819/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Data partitioning makes data easier to use. Extracting information quickly from a collection of data is difficult, so this research proposes an improved data partitioning approach based on hierarchical clustering. General cloud architectures support and enforce partitioning to return search results quickly. The purpose of this study is to build a partitioning method that combines cosine and soft cosine similarity in order to improve partitioning speed and accuracy. Threshold-based methods, with cosine and soft cosine similarity as their basis, are used to perform both vertical and horizontal partitioning; the main focus is a hybrid approach that partitions data in both directions. To assess the proposed work, Quality of Service (QoS) parameters such as accuracy, recall, true positives, false positives, true negatives, and false negatives are calculated and compared with data partitioning that uses a cosine-like algorithm. The proposed work is also compared with the methods of H. Guo et al. and K. Korjus et al.: relative to H. Guo et al., recall improves by 2.28% and precision by 39.94%; relative to K. Korjus et al., accuracy improves by 16.98%.
Keywords: Data Partitioning, Clustering, Cosine Similarity, Soft Cosine Similarity, QoS measures.
Scope of the Article: Clustering
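
Illustrative sketch: The abstract describes threshold-based partitioning driven by a combination of cosine and soft cosine similarity, but it does not state how the two measures are combined. The Python sketch below is therefore only one plausible reading, not the authors' implementation. The weighted average in `hybrid`, the `weight` and `threshold` defaults, the term-similarity matrix `s`, and all function names are assumptions introduced here for illustration.

```python
import math


def cosine(a, b):
    """Standard cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def soft_cosine(a, b, s):
    """Soft cosine similarity; s[i][j] is the similarity of features i and j."""
    def form(u, v):
        # Bilinear form: sum over i, j of s[i][j] * u[i] * v[j]
        return sum(s[i][j] * u[i] * v[j]
                   for i in range(len(u)) for j in range(len(v)))

    norm = math.sqrt(form(a, a)) * math.sqrt(form(b, b))
    return form(a, b) / norm if norm else 0.0


def hybrid(a, b, s, weight=0.5):
    """Assumed combination rule: weighted average of the two measures."""
    return weight * cosine(a, b) + (1 - weight) * soft_cosine(a, b, s)


def partition(vectors, reference, s, threshold=0.5):
    """Split vector indices by whether their hybrid score meets the threshold."""
    close, far = [], []
    for idx, vec in enumerate(vectors):
        (close if hybrid(vec, reference, s) >= threshold else far).append(idx)
    return close, far


# Hypothetical data: two features that are 50% similar to each other.
s = [[1.0, 0.5], [0.5, 1.0]]
rows = [[1.0, 0.0], [0.0, 1.0]]
close, far = partition(rows, reference=[1.0, 0.0], s=s, threshold=0.5)
```

Under this reading, passing the rows of a table as `vectors` gives a horizontal split, and passing its columns (the transposed table) gives a vertical one. When `s` is the identity matrix, `soft_cosine` reduces to `cosine`, so the hybrid score only differs from plain cosine when features are treated as related.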