Expert evaluation of large language models for clinical dialogue summarization

Abstract: We assessed the performance of large language models in summarizing clinical dialogues using computational metrics and human evaluations, comparing automatically generated summaries with human-produced ones. We conducted an exploratory evaluation of five language models: one g...

Bibliographic Details
Published in: Scientific Reports
Main Authors: David Fraile Navarro, Enrico Coiera, Thomas W. Hambly, Zoe Triplett, Nahyan Asif, Anindya Susanto, Anamika Chowdhury, Amaya Azcoaga Lorenzo, Mark Dras, Shlomo Berkovsky
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Online Access: https://doi.org/10.1038/s41598-024-84850-x