%0 Journal Article
%@ 2817-1705
%I JMIR Publications
%V 3
%N
%P e52190
%T Traditional Machine Learning, Deep Learning, and BERT (Large Language Model) Approaches for Predicting Hospitalizations From Nurse Triage Notes: Comparative Evaluation of Resource Management
%A Patel,Dhavalkumar
%A Timsina,Prem
%A Gorenstein,Larisa
%A Glicksberg,Benjamin S
%A Raut,Ganesh
%A Cheetirala,Satya Narayan
%A Santana,Fabio
%A Tamegue,Jules
%A Kia,Arash
%A Zimlichman,Eyal
%A Levin,Matthew A
%A Freeman,Robert
%A Klang,Eyal
%+ Institute for Healthcare Delivery Science, Icahn School of Medicine at Mount Sinai, 2nd Floor, 150 East 42nd Street, New York, NY, 10017, United States, 1 (212) 523 5555, pateldhaval021@hotmail.com
%K Bio-Clinical-BERT
%K term frequency–inverse document frequency
%K TF-IDF
%K health informatics
%K patient care
%K hospital resource management
%K care
%K resource management
%K management
%K language model
%K machine learning
%K hospitalization
%K deep learning
%K logistic regression
%K retrospective analysis
%K training
%K large language model
%D 2024
%7 27.8.2024
%9 Original Paper
%J JMIR AI
%G English
%X Background: Predicting hospitalization from nurse triage notes has the potential to augment care. However, careful consideration is needed in choosing models for this goal. Specifically, health systems have varying degrees of computational infrastructure available and face budget constraints. Objective: To this end, we compared the performance of Bio-Clinical-BERT, a deep learning model based on Bidirectional Encoder Representations from Transformers (BERT), with a bag-of-words (BOW) logistic regression (LR) model incorporating term frequency–inverse document frequency (TF-IDF). These choices represent different levels of computational requirements. Methods: A retrospective analysis was conducted using data from 1,391,988 patients who visited emergency departments in the Mount Sinai Health System from 2017 to 2022. The models were trained on 4 hospitals’ data and externally validated on a fifth hospital’s data. Results: The Bio-Clinical-BERT model achieved higher areas under the receiver operating characteristic curve (0.82, 0.84, and 0.85) compared to the BOW-LR-TF-IDF model (0.81, 0.83, and 0.84) across training sets of 10,000; 100,000; and ~1,000,000 patients, respectively. Notably, both models proved effective at using triage notes for prediction, despite the modest performance gap. Conclusions: Our findings suggest that simpler machine learning models such as BOW-LR-TF-IDF could serve adequately in resource-limited settings. Given the potential implications for patient care and hospital resource management, further exploration of alternative models and techniques is warranted to enhance predictive performance in this critical domain. International Registered Report Identifier (IRRID): RR2-10.1101/2023.08.07.23293699
%R 10.2196/52190
%U https://ai.jmir.org/2024/1/e52190
%U https://doi.org/10.2196/52190