Preliminary Evaluation of an Open-Source LLM for Lay Translation of German Clinical Documents

Abstract

Clinical documents are essential to patient care, but their complexity often makes them inaccessible to patients. Large Language Models (LLMs) are a promising way to support the creation of lay translations of these documents, since producing such translations manually is infeasible in busy clinical settings. However, integrating LLMs into medical practice in Germany is challenging due to data scarcity and privacy regulations. This work evaluates an open-source LLM for lay translation in this data-scarce environment using datasets of synthetic German clinical documents and real tumor board protocols. The evaluation framework combines readability, semantic, and lexical measures with the G-Eval framework. Preliminary results show that zero-shot prompts significantly improve readability (e.g., FREde: 21.4 → 39.3), while few-shot prompts improve semantic and lexical fidelity. However, the results also reveal G-Eval's limitations in distinguishing between intentional omissions and factual inaccuracies. These findings underscore the need for manual review in clinical applications to ensure that lay translations are both accessible and accurate. Furthermore, the effectiveness of prompting motivates future work on applications that run predefined prompts in the background to reduce clinician workload.
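To make the readability figures in the abstract easier to interpret, the following is a minimal sketch of a German Flesch Reading Ease score. It assumes that FREde refers to Amstad's German adaptation of the Flesch formula (180 minus average sentence length minus 58.5 times average syllables per word), where higher scores mean easier text. The function names, the vowel-group syllable heuristic, and the sentence splitter are illustrative assumptions and are not taken from the paper's implementation.

```python
import re

# Assumption: vowel groups approximate German syllables. This is a rough
# heuristic, not a linguistically exact syllabifier.
_VOWEL_GROUP = re.compile(r"[aeiouyäöü]+", re.IGNORECASE)
_WORD = re.compile(r"[A-Za-zÄÖÜäöüß]+")
# Assumption: sentences end at '.', '!' or '?'. Abbreviations such as
# "z.B." will therefore be over-split.
_SENTENCE_END = re.compile(r"[.!?]+")


def count_syllables(word: str) -> int:
    """Estimate the syllable count as the number of vowel groups (at least 1)."""
    return max(1, len(_VOWEL_GROUP.findall(word)))


def fre_de(text: str) -> float:
    """Amstad's German Flesch Reading Ease: 180 - ASL - 58.5 * ASW."""
    words = _WORD.findall(text)
    if not words:
        raise ValueError("text contains no words")
    # Only count segments that actually contain a word as sentences.
    sentences = [s for s in _SENTENCE_END.split(text) if _WORD.search(s)]
    n_sentences = max(1, len(sentences))
    asl = len(words) / n_sentences  # average sentence length in words
    asw = sum(count_syllables(w) for w in words) / len(words)  # syllables per word
    return 180.0 - asl - 58.5 * asw


if __name__ == "__main__":
    clinical = "Die histopathologische Untersuchung ergab Metastasen."
    lay = "Der Arzt fand neue Tumore."
    print(f"clinical: {fre_de(clinical):.1f}, lay: {fre_de(lay):.1f}")
```

Because the syllable and sentence heuristics are approximate, scores from this sketch will not exactly match those of an established readability library or the values reported above; it is meant only to show how shorter sentences and shorter words raise the score.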

Publication
Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health)
Tabea Pakull
Researcher in the second cohort

My research interests include Deep Learning, Natural Language Processing, Lay Summarization and Explainable AI.

Hendrik Damm
Researcher in the second cohort

My research interests include Deep Learning, Natural Language Processing, and Information Retrieval.

Noëlle Bender
Researcher in the second cohort
Christoph M. Friedrich
Co-Speaker

My research interests include Deep Learning, Computer Vision, Radiomics, and Explainable AI.

Peter Horn
Principal Investigator

My research interests include Transfusion Medicine, Immunology, and Bioinformatics.

Jens Kleesiek
Principal Investigator
Dirk Schadendorf
Principal Investigator

My research interests include Dermatology, Medical Research, and Digitalization.
