Am J Obstet Gynecol. April 14, 2022: S0002-9378(22)00280-0. doi: 10.1016/j.ajog.2022.04.008. Online ahead of print.
BACKGROUND: Severe maternal morbidity and mortality (SMM) remain public health priorities in the United States, given their high rates compared to other high-income countries and the notable racial and ethnic disparities that exist. In general, accurate risk stratification methods are needed to help patients, providers, hospitals, and health systems plan for and potentially avoid adverse outcomes.
OBJECTIVE: Our objective was to understand whether machine learning methods with natural language processing (NLP) of history and physical (H&P) notes could identify a group of patients at high risk of maternal morbidity at admission for delivery without relying on additional patient information (e.g., demographics, diagnosis codes).
METHODS: This is a retrospective study of people admitted for childbirth at two hospitals (Hospital A and B) in a single health care system between July 1, 2016 and June 30, 2020. The primary outcome was SMM, as defined by the Centers for Disease Control and Prevention; we also examined non-transfusion SMM (nt-SMM). Clinician documents designated as history and physical (H&P) notes were extracted from the electronic health record for processing and analysis. A bag-of-words (BOW) approach was used for this NLP analysis (i.e., each H&P note was converted into a matrix of counts of the individual words (or phrases) that occurred in the document). LASSO models were used to generate prediction probabilities for SMM and nt-SMM for each note. Model discrimination was assessed via the area under the receiver operating characteristic curve (AUC). Discrimination was compared between models using DeLong's test. Calibration plots were generated to assess model calibration. NLP models using H&P note text were also compared with validated obstetric comorbidity risk scores based on diagnosis codes.
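As an illustration of two ideas named in the Methods, the minimal Python sketch below builds a bag-of-words count matrix from note text and computes an AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. It is not the study's pipeline: the function names (`tokenize`, `bag_of_words`, `auc`), the tokenization rule, and the two example notes are assumptions made up for illustration, only single words (not phrases) are counted, and the LASSO model fitting step is omitted.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a note and split it into alphanumeric word tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(notes):
    """Return (vocabulary, matrix), where matrix[i][j] counts word j in note i."""
    counts = [Counter(tokenize(note)) for note in notes]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, matrix


def auc(labels, scores):
    """Rank-based AUC: share of positive/negative pairs in which the positive
    case has the higher score, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Two invented example notes (hypothetical text, not study data).
notes = [
    "History of chronic hypertension. Prior cesarean.",
    "No prior cesarean. No hypertension.",
]
vocabulary, matrix = bag_of_words(notes)
print(vocabulary)
print(matrix)

# Hypothetical labels (1 = outcome occurred) and predicted probabilities.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In a setting like the one described, each row of such a matrix would be the input to a penalized regression model, and the model's predicted probabilities would be the scores passed to the AUC calculation.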
RESULTS: There were 13,572 delivery encounters with H&P notes from Hospital A, split into training (Atrain, n=10,250) and test (Atest, n=3,322) sets for model derivation and internal validation. There were 23,397 delivery encounters with H&P notes from Hospital B (Bvalid) used for external validation. For the SMM outcome, the NLP model had an AUC of 0.67 (95% confidence interval (CI) 0.63, 0.72) and 0.72 (95% CI 0.70, 0.74) in the Atest and Bvalid data sets, respectively. For the nt-SMM outcome, the AUC was 0.72 (95% CI 0.65, 0.80) and 0.76 (95% CI 0.73, 0.79) in the Atest and Bvalid data sets, respectively. The calibration plots demonstrated the ability of the BOW model to distinguish a group of individuals at significantly higher risk for SMM and nt-SMM, notably those in the upper deciles of predicted risk. The AUCs of the NLP-based models were similar to those generated using a validated, retrospectively derived, diagnosis code-based comorbidity score.
CONCLUSION: In this practical application of machine learning, we demonstrated the capabilities of NLP for the prediction of SMM based on provider documentation inherently generated at the time of admission. This work should serve as a catalyst for providers, hospitals, and EHR systems to explore ways to integrate artificial intelligence into clinical practice and to rigorously assess its ability to improve health.
PMID:35430230 | DOI:10.1016/j.ajog.2022.04.008