%0 Journal Article
%@ 2817-1705
%I JMIR Publications
%V 3
%N 
%P e42630
%T Online Health Search Via Multidimensional Information Quality Assessment Based on Deep Language Models: Algorithm Development and Validation
%A Zhang,Boya
%A Naderi,Nona
%A Mishra,Rahul
%A Teodoro,Douglas
%+ Department of Radiology and Medical Informatics, University of Geneva, 9 Chemin des Mines, Geneva, 1202, Switzerland, 41 782331908, boya.zhang@unige.ch
%K health misinformation
%K information retrieval
%K deep learning
%K language model
%K transfer learning
%K infodemic
%D 2024
%7 2.5.2024
%9 Original Paper
%J JMIR AI
%G English
%X Background: Widespread misinformation in web resources can lead to serious implications for individuals seeking health advice. Despite that, information retrieval models are often focused only on the query-document relevance dimension to rank results. Objective: We investigate a multidimensional information quality retrieval model based on deep learning to enhance the effectiveness of online health care information search results. Methods: In this study, we simulated online health information search scenarios with a topic set of 32 different health-related inquiries and a corpus containing 1 billion web documents from the April 2019 snapshot of Common Crawl. Using state-of-the-art pretrained language models, we assessed the quality of the retrieved documents according to their usefulness, supportiveness, and credibility dimensions for a given search query on 6030 human-annotated, query-document pairs. We evaluated this approach using transfer learning and more specific domain adaptation techniques. Results: In the transfer learning setting, the usefulness model provided the largest distinction between help- and harm-compatible documents, with a difference of +5.6%, leading to a majority of helpful documents in the top 10 retrieved. The supportiveness model achieved the best harm compatibility (+2.4%), while the combination of usefulness, supportiveness, and credibility models achieved the largest distinction between help- and harm-compatibility on helpful topics (+16.9%). In the domain adaptation setting, the linear combination of different models showed robust performance, with help-harm compatibility above +4.4% for all dimensions and going as high as +6.8%. Conclusions: These results suggest that integrating automatic ranking models created for specific information quality dimensions can increase the effectiveness of health-related information retrieval. Thus, our approach could be used to enhance searches made by individuals seeking online health information.
%R 10.2196/42630
%U https://ai.jmir.org/2024/1/e42630
%U https://doi.org/10.2196/42630