HYBRID EVENT: Join us in person in London, UK or attend virtually from anywhere.

4th Edition of Global Conference on Gynecology & Women's Health

September 28-30, 2026 | London, UK

Gynec 2026

A machine learning approach for non-invasive PCOS diagnosis from ultrasound and clinical features

Speaker: Mehtap Agirsoy
Rensselaer Polytechnic Institute, United States
Title: A machine learning approach for non-invasive PCOS diagnosis from ultrasound and clinical features

Abstract:

This study explores the application of machine learning algorithms to support clinicians in achieving faster and more accurate diagnoses of Polycystic Ovary Syndrome (PCOS), with an emphasis on both computational performance and clinical relevance. Several algorithms were evaluated, including Artificial Neural Networks (ANN), Support Vector Machines (SVM), Logistic Regression (LR), K-Nearest Neighbors (KNN), and Extreme Gradient Boosting (XGBoost). XGBoost consistently outperformed the others and was selected for further analysis. The study also investigates the potential to simplify the widely used Rotterdam criteria by identifying the most critical diagnostic features. The dataset was structured according to the Rotterdam framework, categorized into clinical, biomarker, and ultrasound data, and various combinations of these subsets were tested.
Feature selection using the chi-square-based SelectKBest method identified the top 10 predictive features, which aligned with XGBoost’s feature importance rankings, SHAP analysis, and manual feature selection. The final XGBoost model demonstrated strong performance, achieving:
Clinical + USG Features + AMH: AUC = 0.9947, Precision = 0.9553, F1 Score = 0.9553, Accuracy = 0.9553
Clinical + USG Features: AUC = 0.9852, Precision = 0.9583, F1 Score = 0.9388, Accuracy = 0.9384
The most impactful features included hair growth, weight gain, menstrual cycle regularity, fast food consumption, pimples, hair loss, follicle counts on both ovaries, and Anti-Müllerian Hormone (AMH) levels.
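The chi-square-based SelectKBest step described above can be illustrated with a minimal sketch. It assumes scikit-learn's `SelectKBest` with the `chi2` score function and uses randomly generated stand-in data, not the study's dataset; the array shapes, the seed, and the three artificially informative columns are assumptions made only for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical stand-in data (NOT the study's dataset).
# chi2 requires non-negative feature values, e.g. counts or encoded categories.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 20
X = rng.integers(0, 5, size=(n_samples, n_features)).astype(float)
y = rng.integers(0, 2, size=n_samples)

# Make the first three columns depend on the label so they carry signal.
X[:, :3] += 3 * y[:, None]

# Keep the 10 features with the highest chi-square score against the label.
selector = SelectKBest(score_func=chi2, k=10)
X_top = selector.fit_transform(X, y)

# Column indices of the retained features, in ascending order.
selected = selector.get_support(indices=True)
print("Reduced shape:", X_top.shape)
print("Selected column indices:", selected)
```

In a real workflow the selected indices would be mapped back to feature names and compared against other rankings, such as model feature importances or SHAP values, as the abstract describes.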
The model was also validated externally on a publicly available dataset of 320 instances with 18 diagnostic features. The XGBoost model trained on the top-ranked features achieved perfect scores on the test set (AUC = 1.0, Precision = 1.0, F1 Score = 1.0, Accuracy = 1.0). While these results are promising, further independent validation is required to confirm real-world applicability and to rule out potential data leakage or overfitting. These findings demonstrate that clinical and ultrasound features alone can achieve high diagnostic accuracy, offering a non-invasive, cost-effective approach to PCOS diagnosis. This study highlights the potential of machine learning-driven diagnostic models to enhance clinical workflows, reduce reliance on invasive procedures, and enable early and efficient intervention.
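One common way to reduce the risk of data leakage mentioned above is to fit feature selection inside each cross-validation fold rather than on the full dataset. The sketch below assumes scikit-learn and is not the authors' evaluation procedure: it uses scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost, and the data are random values whose shape (320 rows, 18 columns) only mirrors the description of the external dataset.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Hypothetical random data; only the shape echoes the abstract's description.
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(320, 18)).astype(float)
y = rng.integers(0, 2, size=320)
X[:, :3] += 2 * y[:, None]  # give three columns some signal

# Feature selection lives inside the pipeline, so it is re-fit on the
# training portion of every fold and never sees that fold's held-out rows.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=chi2, k=10)),
    # Stand-in classifier (assumption): a gradient-boosted model from scikit-learn.
    ("clf", GradientBoostingClassifier(random_state=0)),
])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_scores = cross_val_score(pipeline, X, y, cv=cv, scoring="roc_auc")
print("Per-fold AUC:", auc_scores)
print("Mean AUC:", auc_scores.mean())
```

Selecting features on the full dataset before splitting lets information from the held-out rows influence the model, which can inflate reported scores; wrapping both steps in one pipeline avoids that particular source of leakage.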
