Phenotype-Specific Lifestyle Prediction for PCOS Using Machine Learning Multi-Class Classification and SHAP Explainability

    Sudhir Kumar Sharma, Aung Nyein Chan Paing
    TLDR Machine learning can accurately predict PCOS phenotypes using lifestyle and symptom data.
    This study addresses the heterogeneity of Polycystic Ovary Syndrome (PCOS) by developing a machine learning framework that predicts four specific phenotypes from non-invasive lifestyle and symptom data collected from 267 patients. Five classifiers were evaluated: Support Vector Machines, Extreme Gradient Boosting (XGBoost), Random Forest, Logistic Regression, and K-Nearest Neighbours. All models achieved high accuracy (≥ 98%), and XGBoost and Random Forest separated the phenotypes perfectly. SHAP analysis identified cycle_length as the most important predictor across all models, underscoring its clinical relevance as a biomarker for PCOS. The study demonstrates that explainable machine learning can support phenotype-specific lifestyle recommendations, potentially improving PCOS management.
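
To make the SHAP step more concrete, the sketch below computes exact Shapley values, the quantity that SHAP approximates, for a single prediction of a toy scoring function. It is not the authors' pipeline and does not use the `shap` library. The model `toy_score`, the baseline and patient values, and every feature name other than `cycle_length` are assumptions made purely for illustration.

```python
from itertools import combinations
from math import factorial

# Only cycle_length is named in the study; the other features are hypothetical.
FEATURES = ["cycle_length", "bmi", "sleep_hours"]


def toy_score(x):
    """Hypothetical phenotype score (NOT the study's model).

    Includes an interaction: short sleep amplifies the effect of cycle length.
    """
    short_sleep = 1 if x["sleep_hours"] < 6 else 0
    return 0.08 * x["cycle_length"] + 0.02 * x["bmi"] + 0.01 * x["cycle_length"] * short_sleep


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every feature coalition."""
    n = len(FEATURES)

    def value(subset):
        # Features in the coalition take the patient's values; the rest stay at baseline.
        mixed = {f: (instance[f] if f in subset else baseline[f]) for f in FEATURES}
        return model(mixed)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Illustrative reference point and patient; values are made up.
baseline = {"cycle_length": 28, "bmi": 22, "sleep_hours": 8}
patient = {"cycle_length": 45, "bmi": 27, "sleep_hours": 5}

phi = shapley_values(toy_score, patient, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that makes per-feature rankings such as the one reported for cycle_length interpretable. Exact enumeration grows as 2^n in the number of features, so in practice a library such as `shap` would be used to estimate these values for the trained classifiers.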