| Title: |
Rule-based natural language processing to extract clinical trial and research study enrollment history from unstructured notes |
| Authors: |
Goryachev, Sergey D.; Wu, Julie Tsu-Yu; Lin, Eric; Friedman, Daphne R.; Zwolinski, Robert; Dhond, Rupali; Elbers, Danne C.; La, Jennifer; Yildirim, Cenk; Corrigan, June K.; Chen, Daniel C. R.; Brophy, Mary T.; Do, Nhan V.; Fillmore, Nathanael R. |
| Contributors: |
U.S. VA Cooperative Studies Program; VA Boston Medical Informatics Fellowship |
| Source: |
Health Informatics Journal ; volume 32, issue 1 ; ISSN 1460-4582 1741-2811 |
| Publisher Information: |
SAGE Publications |
| Publication Year: |
2026 |
| Description: |
Clinical trials are vital for advancing care. However, a systematic approach to tracking trial participation across different facilities and sponsors has been lacking. We developed natural language processing (NLP) methods to extract study enrollment history (enrollment status, consent date, and study title) from clinical notes that record trial participation, using national Veterans Affairs electronic health record data. On the test set, the method showed high precision for enrollment status (0.94), consent date (0.97), and study title (0.87), and acceptably high recall (0.76, 0.70, and 0.84, respectively). At a single center, the classifier correctly identified 111 of 125 trial participants (88.8%) across 12 distinct trials. Our study demonstrates the feasibility of using NLP to capture trial enrollment across a nationwide healthcare system. This algorithm creates a novel data resource for analyzing and tracking trial enrollment at the population level. |
| Document Type: |
article in journal/newspaper |
| Language: |
English |
| DOI: |
10.1177/14604582261430614 |
| Availability: |
https://doi.org/10.1177/14604582261430614; https://journals.sagepub.com/doi/pdf/10.1177/14604582261430614; https://journals.sagepub.com/doi/full-xml/10.1177/14604582261430614 |
| Rights: |
https://creativecommons.org/licenses/by-nc/4.0/ ; https://journals.sagepub.com/page/policies/text-and-data-mining-license |
| Accession Number: |
edsbas.6E179B28 |
| Database: |
BASE |