Tobit Regressive Based Gaussian Independence Bayes Map Reduce Classifier on Data Warehouse for Predictive Analytics
R. Sivakkolundu1, V. Kavitha2
1R. Sivakkolundu, Department of Computer Science, Bharathiar University, Coimbatore, India.
2Dr. V. Kavitha, Professor, Department of PG and Research Department of Computer Applications (MCA), Hindusthan College of Arts and Science, Coimbatore, India.
Manuscript received on September 16, 2019. | Revised Manuscript received on September 24, 2019. | Manuscript published on October 10, 2019. | PP: 4269-4280 | Volume-8 Issue-12, October 2019. | Retrieval Number: L27091081219/2019©BEIESP | DOI: 10.35940/ijitee.L2709.1081219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: A data warehouse comprises data collected from different, possibly heterogeneous, sources at different time intervals, with the objective of answering users' analytic queries. Big data is a field concerned with analysing and extracting information from large datasets. Integrating Big Data raises multiple challenges that can compromise practical business research. The heterogeneous sources, high dimensionality, and massive volumes that characterize Big Data may prevent effective data and system integration. In this work, we develop a Tobit Regressive based Gaussian Independence Bayes Map Reduce Classifier (TR-GIBMRC) method for categorizing the collected and stored data, which helps users make decisions with minimum time consumption. The TR-GIBMRC method consists of two processes: Tobit Regressive Feature Selection and Gaussian Independence Bayes Map Reduce Classification. The Tobit Regressive Feature Selection process selects relevant features from the collected and stored data. It uses the Tobit statistical model, which describes the relationship between a non-negative dependent variable and an independent variable, to select those features. Next, the Gaussian Independence Bayes Map Reduce Classifier is applied to the selected relevant features to classify the data for decision making with less time consumption. This probabilistic classifier segments the data by class and measures the mean and variance of the data in each class. Each data point is allocated to the class with minimal variance. This in turn enables efficient data classification for accurate decision making. Experimental evaluation is carried out on factors such as feature selection rate, classification accuracy, classification time, and error rate with respect to the number of features and the number of data points.
Keywords: Big Data, Data warehouse, Feature Selection, Gaussian, Bayes, Map Reduce Classifier, Tobit Regressive
Scope of the Article: Big Data Analytics
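The classification stage described in the abstract, in which the mean and variance of the data are measured per class, can be illustrated with a minimal sketch. It is not the authors' TR-GIBMRC implementation. It assumes a textbook Gaussian naive Bayes classifier, in which a data point is assigned to the class with the highest log posterior rather than by the minimal-variance rule stated in the abstract, and it imitates the map and reduce phases in a single process instead of running on a MapReduce cluster. The Tobit feature selection stage is not covered. All function names (`map_point`, `combine`, `fit`, `predict`) and the sample data are hypothetical.

```python
import math
from collections import defaultdict
from functools import reduce


def map_point(features, label):
    """Map step: emit (label, (count, per-feature sums, per-feature sums of squares))."""
    return label, (1, list(features), [x * x for x in features])


def combine(a, b):
    """Reduce step: merge two partial statistics that belong to the same class."""
    return (
        a[0] + b[0],
        [x + y for x, y in zip(a[1], b[1])],
        [x + y for x, y in zip(a[2], b[2])],
    )


def fit(points, labels, var_floor=1e-9):
    """Estimate the prior, per-feature mean, and per-feature variance of each class."""
    grouped = defaultdict(list)
    for features, label in zip(points, labels):
        key, value = map_point(features, label)
        grouped[key].append(value)  # shuffle: group mapped values by class label

    total = len(labels)
    model = {}
    for label, values in grouped.items():
        n, sums, squares = reduce(combine, values)
        mean = [s / n for s in sums]
        # Population variance E[x^2] - E[x]^2, floored to avoid division by zero.
        var = [max(q / n - m * m, var_floor) for q, m in zip(squares, mean)]
        model[label] = (n / total, mean, var)
    return model


def log_posterior(entry, features):
    """Unnormalized log posterior under the feature independence assumption."""
    prior, mean, var = entry
    score = math.log(prior)
    for x, m, v in zip(features, mean, var):
        score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
    return score


def predict(model, features):
    """Assign the point to the class with the highest log posterior (assumed rule)."""
    return max(model, key=lambda label: log_posterior(model[label], features))


# Hypothetical toy data: two well-separated classes with two features each.
sample_points = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (5.0, 6.0), (5.2, 5.8), (4.8, 6.2)]
sample_labels = ["a", "a", "a", "b", "b", "b"]
sample_model = fit(sample_points, sample_labels)
print(predict(sample_model, (1.1, 2.0)), predict(sample_model, (5.1, 6.0)))
```

Because the per-class statistics are plain counts, sums, and sums of squares, partial results from different data partitions can be merged in any order, which is what makes this kind of classifier a natural fit for a map and reduce formulation.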