Machine Learning Techniques for Prediction of Parkinson’s Disease using Big Data
S. Kanagaraj1, M.S. Hema2, M. Nageswara Gupta3

1S. Kanagaraj, Department of Information Technology, Kumaraguru College of Technology, Coimbatore, India.
2Dr. M.S. Hema, Department of Computer Science and Engineering, Aurora’s Scientific Technological and Research Academy, Hyderabad, India.
3Dr. M. Nageswara Gupta, Department of Computer Science and Engineering, Sri Venkateshwara College of Engineering, Bengaluru, India.

Manuscript received on 02 August 2019 | Revised Manuscript received on 08 August 2019 | Manuscript published on 30 August 2019 | PP: 3788-3791 | Volume-8 Issue-10, August 2019 | Retrieval Number: J99770881019/2019©BEIESP | DOI: 10.35940/ijitee.J9977.0881019
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license.

Abstract: Data in the healthcare industry is growing exponentially, at an annual rate of about 40%, and managing this volume of data is a challenging task. Big Data architectures and frameworks provide a platform for storing and processing massive volumes of healthcare data. This paper applies Big Data technologies and machine learning algorithms to predict Parkinson’s Disease (PD). The study uses the dataset from the Parkinson’s Progression Markers Initiative (PPMI) to observe the progression of PD. Features from the Movement Disorder Society-Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) are used for the prediction model. The study focuses on machine learning algorithms from Python libraries, using pandas, scikit-learn, NumPy and Matplotlib. Tremor, bradykinesia and facial expression are observed to be the important features for classification. Logistic regression and the multi-class classifier achieved an accuracy of 99.04%, higher than the other algorithms evaluated: Naïve Bayes, k-Nearest Neighbor, SVM and Neural Network.
Keywords: Parkinson’s Disease, Big Data, Machine Learning, Python, Jupyter Notebook, HDFS, Pandas, NumPy, Scikit-learn, Seaborn, Matplotlib, Classification.
Scope of the Article: Classification