
Evaluation of machine learning models for early prediction of gestational diabetes using retrospective electronic health records of current and previous pregnancies

Predicting Gestational Diabetes Mellitus: Leveraging Machine Learning for Early Detection

Gestational diabetes mellitus (GDM) poses significant health risks to both mothers and their newborns, and early detection and intervention are crucial for mitigating those risks. Machine learning (ML) has recently shown promise in predicting GDM from electronic health records (EHR) available at the first prenatal visit. This article summarizes a study that evaluates how well ML models predict GDM and whether data from prior pregnancies improves prediction accuracy.

Methods and Analysis

In a retrospective cohort study of 27,561 women (GDM prevalence 11.6%), researchers developed models using four ML algorithms: Logistic Regression (LR), Random Forest (RF), XGBoost (XGB), and Explainable Boosting Machine (EBM). The models were trained on seven to nine clinically selected predictors extracted from EHRs. They were evaluated on several cohorts: the first-trimester group (n = 27,561), nulliparous women (n = 11,623), multiparous women, and past cohorts (n = 4,005).

Key performance metrics were discrimination, measured by the area under the receiver operating characteristic curve (AUC) with a 95% confidence interval (CI), and calibration, assessed through slope and intercept. A decision curve analysis was also performed to assess the models' net benefit across decision thresholds.
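To make the discrimination metric and the decision curve analysis described above more concrete, here is a minimal sketch in plain Python. It is not the study's code: the function names, the percentile bootstrap used for the interval, and the four-patient toy data are all assumptions made for illustration, and the source does not say how its confidence intervals were computed.

```python
import random


def auc(y_true, y_score):
    """Rank-based AUC: the share of (positive, negative) pairs in which
    the positive case gets the higher score; ties count as half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, y_score, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        # Skip resamples that contain only one class: AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


def net_benefit(y_true, y_score, threshold):
    """Net benefit at one risk threshold, the quantity a decision curve
    plots: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy data: 1 = developed GDM, score = predicted risk.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print(auc(labels, scores))               # 0.75 (3 of 4 pairs ranked correctly)
print(bootstrap_auc_ci(labels, scores))  # interval depends on the resamples
print(net_benefit(labels, scores, 0.3))  # net benefit at a 30% risk threshold
```

A full decision curve would repeat `net_benefit` over a range of thresholds and compare the result against the "treat all" and "treat none" strategies. Calibration slope and intercept are omitted here because they require fitting a logistic recalibration model.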

Results

In the first-trimester cohort, LR attained an AUC of 0.819 (95% CI 0.811-0.827), comparable to the more complex XGB (AUC 0.818), EBM (AUC 0.817), and RF (AUC 0.817) models. In nulliparous women, differences among the models were also minimal: LR achieved an AUC of 0.813, while XGB, EBM, and RF ranged from 0.805 to 0.814.

For multiparous women, adding past pregnancy data to the first-trimester information substantially improved model discrimination. The EBM model achieved an AUC of 0.885 (95% CI 0.867 to 0.900), followed by RF (AUC 0.878), XGB (AUC 0.876), and LR (AUC 0.874). Models relying solely on data from previous pregnancies also showed good discrimination, with an AUC of 0.860 (95% CI 0.839 to 0.879).
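One informal way to read these numbers is to check whether each point estimate falls inside the confidence interval reported for the reference model in the same cohort. The sketch below does only that, using the figures from the Results section. The helper names are hypothetical, and this is not a formal comparison such as a paired test of AUCs; the article reports intervals for only some of the models.

```python
def within_ci(value, ci):
    """True if a point estimate lies inside a (low, high) interval."""
    low, high = ci
    return low <= value <= high


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Figures as reported in the Results section of this article.
LR_FIRST_TRIMESTER_CI = (0.811, 0.827)
FIRST_TRIMESTER_AUC = {"XGB": 0.818, "EBM": 0.817, "RF": 0.817}

EBM_MULTIPAROUS_CI = (0.867, 0.900)
MULTIPAROUS_AUC = {"RF": 0.878, "XGB": 0.876, "LR": 0.874}

PAST_ONLY_AUC = 0.860
PAST_ONLY_CI = (0.839, 0.879)

for name, value in FIRST_TRIMESTER_AUC.items():
    # Every first-trimester model sits inside the LR interval.
    print("first trimester", name, within_ci(value, LR_FIRST_TRIMESTER_CI))

for name, value in MULTIPAROUS_AUC.items():
    # Every combined-data model sits inside the EBM interval.
    print("multiparous", name, within_ci(value, EBM_MULTIPAROUS_CI))

# The previous-pregnancy-only estimate lies below the EBM interval,
# although the two intervals themselves overlap.
print("past only inside EBM CI:", within_ci(PAST_ONLY_AUC, EBM_MULTIPAROUS_CI))
print("intervals overlap:", intervals_overlap(PAST_ONLY_CI, EBM_MULTIPAROUS_CI))
```

Falling inside another model's interval suggests the models are hard to tell apart on this data, but it does not establish that they are equivalent.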

Conclusion

The findings show that a small set of clinically selected variables can deliver robust GDM predictions in early pregnancy (AUC ~0.81). In multiparous women, incorporating information from previous pregnancies improves prediction performance further (AUC ~0.86 to 0.89), and data from prior pregnancies alone provide useful preconception risk estimates. These results point to the feasibility of early GDM risk detection in both nulliparous and multiparous populations. However, further research, including external validation and clinical trials, is needed to assess the practical utility of these models and their impact on maternal and newborn outcomes.

