Weblog and Retail Industries Analysis using a Robust Modified Apriori Algorithm
Jayakumar Kaliappan1, S. Mohan Sai2, K. Shaily Preetham3
1Dr. Jayakumar Kaliappan, Associate Professor, Department of Computer Science, VIT University, Vellore (Tamil Nadu), India.
2S. Mohan Sai, UG Student, Department of Computer Science, VIT University, Vellore (Tamil Nadu), India.
3K. Shaily Preetham, UG Student, Department of Computer Science and Engineering, Indian Institute of Technology, Bhubaneswar (Odisha), India.
Manuscript received on 07 April 2019 | Revised Manuscript received on 20 April 2019 | Manuscript published on 30 April 2019 | PP: 1727-1733 | Volume-8 Issue-6, April 2019 | Retrieval Number: F5042048619/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: With the rapid growth of Internet usage, a large amount of information is now available online. Weblogs and retail industries accumulate large volumes of transaction data. It is important to use this data properly to find hidden patterns and mine the association rules among the items. The analysis has two parts. The first is the data pre-processing phase, which is the most important because it ensures that the data is of good quality. The second is the pattern discovery phase, for which we use the Apriori algorithm. The Apriori algorithm finds association rules among commodities and could help in promoting sales and user interaction. In this paper, we propose a modified algorithm that reduces the dimensions of the items in the transaction database Dk and the candidate set Ck. We also provide proof of the proposed optimization techniques. We introduce a hashing technique that finds the frequencies of items in one step with O(1) time complexity. A buffer is introduced to keep track of all the unwanted sets, which improves memory efficiency and computation. The paper presents an application with an example and an implementation of the proposed optimization techniques.
Keywords: Apriori Algorithm, Candidate Sets, Web Log Data, Data Cleaning, Hashing, Retailer Stores, Association Rules, Frequent Itemsets
Scope of the Article: Parallel and Distributed Algorithms
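Illustrative sketch: The abstract names two optimizations without giving their details: hash-based counting of item frequencies, and shrinking the transaction database before candidates are counted. The Python sketch below shows one common way these ideas can be realized, and it is not the authors' implementation. The function names (count_items, reduce_transactions, frequent_pairs), the sample baskets, and the absolute min_support threshold are all assumptions made for illustration. The buffer of unwanted sets mentioned in the abstract is not modeled here, because the abstract does not describe how it works.

```python
from collections import Counter
from itertools import combinations


def count_items(transactions):
    """Count the support of every item in one pass over the data.

    Counter is a hash table, so each increment is O(1) on average.
    """
    counts = Counter()
    for transaction in transactions:
        # set() so an item repeated inside one transaction counts once
        counts.update(set(transaction))
    return counts


def reduce_transactions(transactions, frequent_items, k):
    """Shrink the database before counting k-itemsets.

    Infrequent items are dropped from each transaction, and any
    transaction left with fewer than k items is dropped entirely,
    since it cannot contain a k-itemset.
    """
    reduced = []
    for transaction in transactions:
        kept = frozenset(transaction) & frequent_items
        if len(kept) >= k:
            reduced.append(kept)
    return reduced


def frequent_pairs(transactions, min_support):
    """Return the frequent 2-itemsets with their support counts."""
    counts = count_items(transactions)
    frequent_items = frozenset(
        item for item, count in counts.items() if count >= min_support
    )
    reduced = reduce_transactions(transactions, frequent_items, k=2)

    pair_counts = Counter()
    for transaction in reduced:
        # sorted() gives each pair one canonical ordering
        pair_counts.update(combinations(sorted(transaction), 2))
    return {
        pair: count
        for pair, count in pair_counts.items()
        if count >= min_support
    }


if __name__ == "__main__":
    # Hypothetical retail baskets, invented for this example
    baskets = [
        ["bread", "milk"],
        ["bread", "butter", "milk"],
        ["milk", "eggs"],
        ["bread", "milk", "eggs"],
    ]
    print(frequent_pairs(baskets, min_support=2))
```

With the sample baskets and min_support=2, butter appears only once and is removed before any pairs are formed, so the pair counter never sees candidates that contain it. The result contains ("bread", "milk") with support 3 and ("eggs", "milk") with support 2.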