| Title: |
AI language model applications for early diagnosis of childhood epilepsy based on unstructured first-visit patient narratives: A cohort study |
| Authors: |
Loyens, Jitse; Slinger, Geertruida; Doornebal, Nynke; Braun, Kees P J; van Diessen, Eric; Otte, Willem M; Projectafdeling KIND; Neurologen; Brain |
| Publication Year: |
2025 |
| Subject Terms: |
childhood epilepsy; early diagnosis; large language models; natural language processing; Neurology; Clinical Neurology |
| Description: |
OBJECTIVE: Language serves as an indispensable source of information for diagnosing epilepsy, and its computational analysis is increasingly explored. This study assessed and compared the diagnostic value of different language model applications in extracting information. The aim was to identify language patterns in first-visit documentation that may contain useful clinical information not overtly considered by the clinician, in order to improve the early diagnosis of childhood epilepsy. METHODS: We analyzed 1561 patient letters from two first-seizure clinics. The dataset was divided into training and test sets to evaluate performance and generalizability. We employed an established Naïve Bayes model as a natural language processing technique and a sentence-embedding (large language) model based on the Bidirectional Encoder Representations from Transformers (BERT) architecture. Both models analyzed only the anamnesis texts as noted by the treating physician. Within the training sets, we identified predictive features consisting of keywords indicative of 'epilepsy' or 'no epilepsy.' Model outputs were compared to the clinician's final diagnosis (gold standard) after a two-year follow-up period. We computed accuracy, sensitivity, and specificity for both models. RESULTS: The Naïve Bayes model achieved an accuracy of 0.73 (95% CI: 0.68-0.78), with a sensitivity of 0.79 (95% CI: 0.74-0.85) and a specificity of 0.62 (95% CI: 0.52-0.72). The sentence-embedding model demonstrated comparable performance with an accuracy of 0.74 (95% CI: 0.68-0.79), a sensitivity of 0.74 (95% CI: 0.68-0.80), and a specificity of 0.73 (95% CI: 0.61-0.84). SIGNIFICANCE: Both models demonstrated relatively good performance in diagnosing childhood epilepsy solely based on the first-visit patient anamnesis text. Notably, the more advanced sentence-embedding model showed no improvement over the computationally simpler Naïve Bayes model.
This suggests that modeling of anamnesis data does not depend on word order for this particular ... |
| Document Type: |
article in journal/newspaper |
| File Description: |
application/pdf |
| Language: |
English |
| ISSN: |
1294-9361 |
| Relation: |
https://dspace.library.uu.nl/handle/1874/467864 |
| Availability: |
https://dspace.library.uu.nl/handle/1874/467864 |
| Rights: |
info:eu-repo/semantics/OpenAccess |
| Accession Number: |
edsbas.74FA935C |
| Database: |
BASE |