Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines

Abstract

Timely and relevant information enables clinicians to make informed decisions about patient care. However, finding relevant and understandable information in the vast medical literature is challenging. To address this problem, we aim to enable the development of search engines that meet the needs of medical practitioners by incorporating text difficulty features. We collected a dataset of 209 scientific research abstracts from different medical fields, available in both English and German. A total of 216 medical experts annotated the dataset to determine two difficulty aspects of each abstract: readability and technical level. We fine-tuned a pre-trained BERT model on our dataset to obtain a regression model that predicts these difficulty aspects. To highlight the strength of this approach, we compared the model with readability formulas currently in use. Analysis of the dataset revealed that German abstracts are more technically complex and less readable than their English counterparts. Our baseline model predicted domain-specific readability aspects more effectively than current readability formulas. Incorporating these text difficulty aspects into search engines will provide healthcare professionals with reliable and efficient information retrieval tools. Additionally, the dataset can serve as a starting point for future research.
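
To make the comparison baseline concrete, the following is a minimal sketch of one classic readability formula, Flesch Reading Ease. The abstract does not name the formulas that were used, so this choice is an assumption made for illustration only. The function names are hypothetical, and the syllable counter is a rough vowel-group heuristic for English rather than a dictionary-based method.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels (heuristic, English only)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate text that is easier to read.

    score = 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    """
    # Assumption: sentences end at '.', '!' or '?'; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    plain = "The cat sat on the mat."
    clinical = "Immunohistochemical characterization demonstrated heterogeneous expression."
    print(flesch_reading_ease(plain))
    print(flesch_reading_ease(clinical))
```

Formulas of this kind rely only on surface statistics such as sentence and word length, which may help explain why a model fine-tuned on expert annotations can capture domain-specific difficulty that they miss.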

Sameh Frihat
Researcher in the first cohort

My research interests include Information Retrieval, Natural Language Processing, Machine Learning, and Explainable AI.

Catharina Beckmann
Researcher in the first cohort

My research interests include Personalized Medicine, Clinical Decision Support Systems, Context Modeling, and Clinical Practice Guidelines.

Eva Hartmann
Researcher in the first cohort

My research interests include Deep Learning, Computer Vision, Radiomics, and Explainable AI.

Norbert Fuhr
Principal Investigator

My research interests include Information Retrieval, Natural Language Processing, and Computer Science.
