Manuscript received on May 04, 2020. | Revised Manuscript received on May 20, 2020. | Manuscript published on June 10, 2020. | PP: 124-130 | Volume-9 Issue-8, June 2020. | Retrieval Number: 100.1/ijitee.H6274069820 | DOI: 10.35940/ijitee.H6274.069820
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: A new split attribute measure for node splitting during decision tree construction is proposed. The new split measure consists of the sum of class counts of the distinct values of categorical attributes in the dataset. Larger counts induce larger partitions and smaller trees, thereby favoring the determination of the best split attribute. The new split attribute measure is termed maximum exponential class counts (MECC). Experimental results obtained over several UCI machine learning categorical datasets predominantly indicate that decision tree models created with the proposed MECC node split technique provide better classification accuracy and smaller trees than decision trees created using the popular gain ratio, normalized gain ratio, and Gini index measures. The experiments focus on analyzing the effect of the node splitting measures alone.
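The abstract does not state the MECC formula, so it is not reproduced here. As a minimal sketch of the ingredients it mentions, the Python below tabulates class counts per distinct value of a categorical attribute and computes two of the baseline measures named above, the Gini index and the gain ratio, in their standard textbook forms. All function names are illustrative assumptions, not the authors' code.

```python
from collections import Counter, defaultdict
from math import log2


def class_counts_by_value(values, labels):
    """Map each distinct attribute value to a Counter of class labels."""
    table = defaultdict(Counter)
    for value, label in zip(values, labels):
        table[value][label] += 1
    return dict(table)


def entropy(counts):
    """Shannon entropy (base 2) of a collection of non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def gini_index(values, labels):
    """Weighted Gini impurity of the partitions induced by the attribute."""
    n = len(labels)
    score = 0.0
    for counter in class_counts_by_value(values, labels).values():
        size = sum(counter.values())
        impurity = 1.0 - sum((c / size) ** 2 for c in counter.values())
        score += (size / n) * impurity
    return score


def gain_ratio(values, labels):
    """Information gain divided by split information (standard definition)."""
    n = len(labels)
    table = class_counts_by_value(values, labels)
    parent = entropy(Counter(labels).values())
    children = sum(
        (sum(c.values()) / n) * entropy(c.values()) for c in table.values()
    )
    split_info = entropy([sum(c.values()) for c in table.values()])
    if split_info == 0:
        return 0.0  # attribute has a single value; no useful split
    return (parent - children) / split_info


# Toy data (made up for illustration): attribute_a separates the classes,
# attribute_b does not.
labels = ["yes", "yes", "no", "no"]
attribute_a = ["p", "p", "q", "q"]
attribute_b = ["r", "s", "r", "s"]
```

On this toy data, a lower Gini index and a higher gain ratio both point to `attribute_a` as the better split. A measure built on per-value class counts, as the abstract describes for MECC, would presumably start from the table returned by `class_counts_by_value`, but how those counts are combined is not specified here.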
Keywords: Categorical attributes, Categorical datasets, Larger counts and larger partitions, Maximum exponential class counts (MECC).
Scope of the Article: Classification