| Title: |
Enhancing Patient Understanding of Perianal Fistula MRI Findings Using ChatGPT: A Randomized, Single Centre Study |
| Authors: |
Anand, Easan; Ghersin, Itai; Lingam, Gita; Devlin, Katie; Pelly, Theo; Singer, Daniel; Tomlinson, Chris; Munro, Robin E.J.; Capstick, Rachel; Antoniou, Anna; Hart, Ailsa L.; Tozer, Phil; Sahnan, Kapil; Lung, Phillip |
| Source: |
Anand, E, Ghersin, I, Lingam, G, Devlin, K, Pelly, T, Singer, D, Tomlinson, C, Munro, R E J, Capstick, R, Antoniou, A, Hart, A L, Tozer, P, Sahnan, K & Lung, P 2026, 'Enhancing Patient Understanding of Perianal Fistula MRI Findings Using ChatGPT: A Randomized, Single Centre Study', Diagnostics, vol. 16, no. 1, 72. https://doi.org/10.3390/diagnostics16010072 |
| Publication Year: |
2026 |
| Collection: |
King's College, London: Research Portal |
| Subject Terms: |
artificial intelligence; Crohn’s disease; cryptoglandular fistula; large language models; magnetic resonance imaging; patient communication; perianal fistula |
| Description: |
Background/Objectives: Large Language Models (LLMs) may help translate complex Magnetic Resonance Imaging (MRI) fistula reports into accessible, patient-friendly summaries. This study evaluated the clinical utility, safety, and patient acceptability of Generative Pre-trained Transformer (GPT-4o) in generating such reports. Methods: A three-phase study was conducted at a single centre. Phase I involved prompt engineering and pilot testing of GPT-4o outputs for feasibility. Phase II assessed 250 consecutive MRI fistula reports from September 2024 to November 2024, each reviewed by a multi-disciplinary panel to determine hallucinations and thematic content. Phase III randomised patients to review either a simple or complex fistula case, each containing an original report and an Artificial Intelligence (AI)-generated summary (order randomised, origin blinded), and rate readability, trustworthiness, usefulness and comprehension. Results: Sixteen patients participated in Phase I pilot testing. In Phase II, hallucinations occurred in 11% of outputs, with unverified recommendations also identified. In Phase III, 61 patients (mean age 48, 41% female) evaluated paired original and AI-generated summaries. AI summaries scored significantly higher for readability, comprehension, and usefulness than original reports (all p < 0.001), with equivalent trust ratings. Mean Flesch-Kincaid scores were markedly higher for AI-generated summaries (66 vs. 26; p < 0.001). Clinicians highlighted improved anatomical structuring and accessible language, but emphasised risks of inaccuracies. A revised template incorporating Multi-Disciplinary Team (MDT)-focused action points and a lay summary section was co-developed. Conclusions: LLMs can enhance the readability and patient understanding of complex MRI reports but remain limited by hallucinations and inconsistent terminology. Safe implementation requires structured oversight, domain-specific refinement, and clinician validation. 
Future development should prioritise standardised ... |
| Document Type: |
article in journal/newspaper |
| File Description: |
application/pdf |
| Language: |
English |
| DOI: |
10.3390/diagnostics16010072 |
| Availability: |
https://kclpure.kcl.ac.uk/portal/en/publications/faaae206-d40f-42cc-b4a1-8af0f576b3fe; https://doi.org/10.3390/diagnostics16010072; https://kclpure.kcl.ac.uk/ws/files/367936932/diagnostics-16-00072.pdf; https://www.scopus.com/pages/publications/105027257168 |
| Rights: |
info:eu-repo/semantics/openAccess ; http://creativecommons.org/licenses/by/4.0/ |
| Accession Number: |
edsbas.FA8DB402 |
| Database: |
BASE |