The Insight-Inference Loop: Efficient Text Classification via Natural Language Inference and Threshold-Tuning

Title:	The Insight-Inference Loop: Efficient Text Classification via Natural Language Inference and Threshold-Tuning
Language:	English
Authors:	Sandrine Chausson (ORCID 0009-0005-4415-4962); Marion Fourcade (ORCID 0000-0002-4821-9031); David J. Harding (ORCID 0000-0002-2121-0790); Björn Ross (ORCID 0000-0003-2717-3705); Grégory Renard
Source:	Sociological Methods & Research. 2026 55(2):568-615.
Availability:	SAGE Publications. 2455 Teller Road, Thousand Oaks, CA 91320. Tel: 800-818-7243; Tel: 805-499-9774; Fax: 800-583-2665; e-mail: journals@sagepub.com; Web site: https://sagepub.com
Peer Reviewed:	Y
Page Count:	48
Publication Date:	2026
Document Type:	Journal Articles; Reports - Research
Descriptors:	Classification; Artificial Intelligence; Social Science Research; Natural Language Processing; Social Media; Elections
DOI:	10.1177/00491241251326819
ISSN:	0049-1241; 1552-8294
Abstract:	Modern computational text classification methods have brought social scientists tantalizingly close to the goal of unlocking vast insights buried in text data--from centuries of historical documents to streams of social media posts. Yet three barriers still stand in the way: the tedious labor of manual text annotation, the technical complexity that keeps these tools out of reach for many researchers, and, perhaps most critically, the challenge of bridging the gap between sophisticated algorithms and the deep theoretical understanding social scientists have already developed about human interactions, social structures, and institutions. To counter these limitations, we propose an approach to large-scale text analysis that requires substantially less human-labeled data, and no machine learning expertise, and efficiently integrates the social scientist into critical steps in the workflow. This approach, which allows the detection of statements in text, relies on large language models pre-trained for natural language inference, and a "few-shot" threshold-tuning algorithm rooted in active learning principles. We describe and showcase our approach by analyzing tweets collected during the 2020 U.S. presidential election campaign, and benchmark it against various computational approaches across three datasets.
Abstractor:	As Provided
Entry Date:	2026
Accession Number:	EJ1502021
Database:	ERIC