| Title: |
De-Identifying Student Personally Identifying Information in Discussion Forum Posts with Large Language Models |
| Language: |
English |
| Authors: |
Andres Felipe Zambrano; Shreya Singhal; Maciej Pankiewicz; Ryan Shaun Baker; Chelsea Porter; Xiner Liu |
| Source: |
Information and Learning Sciences. 2025 126(5-6):401-424. |
| Availability: |
Emerald Publishing Limited. Howard House, Wagon Lane, Bingley, West Yorkshire, BD16 1WA, UK. Tel: +44-1274-777700; Fax: +44-1274-785201; e-mail: emerald@emeraldinsight.com; Web site: http://www.emerald.com/insight |
| Peer Reviewed: |
Y |
| Page Count: |
24 |
| Publication Date: |
2025 |
| Document Type: |
Journal Articles; Reports - Research |
| Education Level: |
Higher Education; Postsecondary Education |
| Descriptors: |
Artificial Intelligence; Identification; Privacy; Information Security; Discussion Groups; MOOCs; College Students |
| Geographic Terms: |
Pennsylvania (Philadelphia) |
| DOI: |
10.1108/ILS-11-2024-0156 |
| ISSN: |
2398-5348; 2398-5356 |
| Abstract: |
Purpose: This study aims to evaluate the effectiveness of three large language models (LLMs), GPT-4o, Llama 3.3 70B and Llama 3.1 8B, in redacting personally identifying information (PII) from forum data in massive open online courses (MOOCs). Design/methodology/approach: Forum posts from students enrolled in nine MOOCs were redacted by three human reviewers. The GPT and Llama models were then tasked with de-identifying the same data set using standardized prompts. Discrepancies between LLM and human redactions were analyzed to identify patterns in LLM errors. Findings: All models achieved an average recall of over 0.9 in identifying PII and identified PII instances overlooked by humans. However, their precisions were lower -- 0.579 for GPT-4o, 0.506 for Llama 3.3 and 0.262 for Llama 3.1 -- showing a tendency to over-redact non-PII names and locations. Research limitations/implications: Data from several courses were analyzed to increase the findings' generalizability, but the models' performance may vary in other contexts. The GPT and Llama models were selected for their availability and cost-effectiveness at the time of the study; newer models may improve performance. Practical implications: The use of downloadable LLMs enables researchers to de-identify data without training specialized models or involving external companies, ensuring that student data remains private. Originality/value: Previous research on LLM text de-identification has largely used proprietary models, which require sharing data containing sensitive PII with third-party companies. This study evaluates the performance of two open-weight models that can be deployed locally, eliminating the need to share sensitive data externally.
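The precision and recall figures reported in the abstract compare model redactions against human redactions. A minimal sketch of how such metrics could be computed, assuming redactions are represented as sets of flagged token indices (an illustrative simplification, not the authors' actual evaluation code):

```python
def redaction_metrics(human_pii: set[int], model_pii: set[int]) -> tuple[float, float]:
    """Return (precision, recall) for PII redaction.

    human_pii: token indices flagged as PII by human reviewers (gold standard)
    model_pii: token indices flagged as PII by the LLM
    """
    true_pos = len(human_pii & model_pii)  # tokens flagged by both
    precision = true_pos / len(model_pii) if model_pii else 1.0
    recall = true_pos / len(human_pii) if human_pii else 1.0
    return precision, recall

# Hypothetical example: humans flagged tokens {3, 7}; the model also flagged
# token 12 (an over-redaction, e.g. a non-PII name mentioned in a post).
p, r = redaction_metrics({3, 7}, {3, 7, 12})
```

Under this scheme, over-redaction of non-PII names lowers precision (false positives inflate `model_pii`) while leaving recall intact, matching the pattern of high recall and lower precision the study reports.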
| Abstractor: |
As Provided |
| Entry Date: |
2025 |
| Accession Number: |
EJ1473727 |
| Database: |
ERIC |