TY  - JOUR
AU  - Nurmambetova, Elvira
AU  - Pan, Jie
AU  - Zhang, Zilong
AU  - Wu, Guosong
AU  - Lee, Seungwon
AU  - Southern, Danielle A
AU  - Martin, Elliot A
AU  - Ho, Chester
AU  - Xu, Yuan
AU  - Eastwood, Cathy A
PY  - 2023
DA  - 2023/03/08
TI  - Developing an Inpatient Electronic Medical Record Phenotype for Hospital-Acquired Pressure Injuries: Case Study Using Natural Language Processing Models
JO  - JMIR AI
SP  - e41264
VL  - 2
KW  - pressure injury
KW  - natural language processing
KW  - NLP
KW  - algorithm
KW  - phenotype algorithm
KW  - phenotyping algorithm
KW  - machine learning
KW  - electronic medical record
KW  - EMR
KW  - pressure sore
KW  - pressure wound
KW  - pressure ulcer
KW  - pressure injuries
KW  - detect
AB  - Background: Surveillance of hospital-acquired pressure injuries (HAPI) is often suboptimal when relying on administrative health data, as International Classification of Diseases (ICD) codes are known to have long delays and are undercoded. We leveraged natural language processing (NLP) applications on free-text notes, particularly the inpatient nursing notes, from electronic medical records (EMRs), to more accurately and timely identify HAPIs. Objective: This study aimed to show that EMR-based phenotyping algorithms are more fitted to detect HAPIs than ICD-10-CA algorithms alone, while the clinical logs are recorded with higher accuracy via NLP using nursing notes. Methods: Patients with HAPIs were identified from head-to-toe skin assessments in a local tertiary acute care hospital during a clinical trial that took place from 2015 to 2018 in Calgary, Alberta, Canada. Clinical notes documented during the trial were extracted from the EMR database after the linkage with the discharge abstract database. Different combinations of several types of clinical notes were processed by sequential forward selection during the model development. Text classification algorithms for HAPI detection were developed using random forest (RF), extreme gradient boosting (XGBoost), and deep learning models. The classification threshold was tuned to enable the model to achieve similar specificity to an ICD-based phenotyping study. Each model’s performance was assessed, and comparisons were made between the metrics, including sensitivity, positive predictive value, negative predictive value, and F1-score. Results: Data from 280 eligible patients were used in this study, among whom 97 patients had HAPIs during the trial. RF was the optimal performing model with a sensitivity of 0.464 (95% CI 0.365-0.563), specificity of 0.984 (95% CI 0.965-1.000), and F1-score of 0.612 (95% CI of 0.473-0.751). The machine learning (ML) model reached higher sensitivity without sacrificing much specificity compared to the previously reported performance of ICD-based algorithms. Conclusions: The EMR-based NLP phenotyping algorithms demonstrated improved performance in HAPI case detection over ICD-10-CA codes alone. Daily generated nursing notes in EMRs are a valuable data resource for ML models to accurately detect adverse events. The study contributes to enhancing automated health care quality and safety surveillance.
SN  - 2817-1705
UR  - https://ai.jmir.org/2023/1/e41264
UR  - https://doi.org/10.2196/41264
DO  - 10.2196/41264
ID  - info:doi/10.2196/41264
ER  -