Hybrid Machine Learning Framework for Clinical Diagnosis of Polycystic Ovary Syndrome
Fatima Khan Sarguroh¹, Srivaramangai R.²
¹ Fatima Khan Sarguroh, Student, Department of Information Technology, University of Mumbai, Mumbai (Maharashtra), India.
² Srivaramangai R., Head, Department of Information Technology, University of Mumbai, Mumbai (Maharashtra), India.
Manuscript received on 01 March 2026 | Revised Manuscript received on 07 March 2026 | Manuscript Accepted on 15 March 2026 | Manuscript published on 30 March 2026 | PP: 11-18 | Volume-15 Issue-4, March 2026 | Retrieval Number: 100.1/ijitee.A834415010526 | DOI: 10.35940/ijitee.A8344.15040326
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Polycystic Ovary Syndrome (PCOS) is a common hormonal disorder in women that is difficult to diagnose early because its symptoms vary and traditional diagnostic methods have limitations. Early detection is important for preventing long-term health complications. This study proposes a hybrid machine learning model for PCOS detection using clinical and hormonal data from a dataset of 541 patient records. Missing values were imputed using K-Nearest Neighbours (KNN), and the most relevant features were selected using Mutual Information. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to the training data. Individual machine learning models were evaluated first, and based on their performance, a hybrid model was built using weighted soft voting over Gaussian Naïve Bayes, Logistic Regression, and Random Forest. The experimental results suggest that the hybrid model achieves a better balance of accuracy, precision, and recall than any single model, making it a more reliable approach for predicting PCOS. This work is the first phase of the research and lays the groundwork for future studies that will combine these findings with ultrasound image analysis.
Keywords: Polycystic Ovary Syndrome, Hybrid Model, Random Forest, Support Vector Machines, Logistic Regression, SMOTE, Machine Learning, Ensemble Model, Medical Diagnosis.
Scope of the Article: Computer Science and Engineering
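The weighted soft-voting step described in the abstract can be illustrated with a minimal sketch in plain Python. It assumes each base model has already produced per-class probabilities for every sample; the function names, the model weights, and the probability values below are hypothetical, chosen only for illustration, and are not taken from the study. In scikit-learn, the equivalent mechanism is `VotingClassifier` with `voting="soft"` and a `weights` argument.

```python
from typing import Dict, List, Sequence


def weighted_soft_vote(
    probas: Dict[str, Sequence[Sequence[float]]],
    weights: Dict[str, float],
) -> List[List[float]]:
    """Combine per-class probabilities from several models.

    For each sample and class, the result is the weighted average of the
    probabilities reported by each model, normalised by the total weight.
    """
    names = list(probas)
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("model weights must sum to a positive value")

    n_samples = len(probas[names[0]])
    combined = []
    for i in range(n_samples):
        n_classes = len(probas[names[0]][i])
        row = [
            sum(weights[m] * probas[m][i][c] for m in names) / total
            for c in range(n_classes)
        ]
        combined.append(row)
    return combined


def predict_labels(combined: Sequence[Sequence[float]]) -> List[int]:
    """Pick the class index with the highest combined probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in combined]


# Hypothetical probabilities [P(no PCOS), P(PCOS)] for two patients,
# as if produced by the three base models named in the abstract.
example_probas = {
    "gaussian_nb": [[0.40, 0.60], [0.70, 0.30]],
    "logistic_regression": [[0.55, 0.45], [0.60, 0.40]],
    "random_forest": [[0.30, 0.70], [0.45, 0.55]],
}

# Hypothetical weights; the abstract does not state the values used.
example_weights = {
    "gaussian_nb": 1.0,
    "logistic_regression": 2.0,
    "random_forest": 3.0,
}

combined = weighted_soft_vote(example_probas, example_weights)
labels = predict_labels(combined)
print(labels)
```

With these illustrative numbers, the first patient is assigned class 1 even though Logistic Regression alone favours class 0, because the more heavily weighted Random Forest shifts the combined probability. This is the sense in which soft voting can balance the strengths of the individual models.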
