
Textual Factors: A Scalable, Interpretable, and Data-Driven Approach to Analyzing Unstructured Information.

Title: Textual Factors: A Scalable, Interpretable, and Data-Driven Approach to Analyzing Unstructured Information.
Authors: Cong, Lin William (AUTHOR) will.cong@cornell.edu; Liang, Tengyuan (AUTHOR) tengyuan.liang@chicagobooth.edu; Zhang, Xiao (AUTHOR) xzhang@compasslexecon.com; Zhu, Wu (AUTHOR) zhuwu@sem.tsinghua.edu.cn
Source: Management Science (INFORMS). Dec. 2025, Vol. 71, Issue 12, pp. 10727-10739 (13 pages).
Subject Terms: *SCALABILITY; *ARTIFICIAL neural networks; *EMPIRICAL research; DOCUMENT clustering; METADATA; CONTENT analysis; SOCIAL scientists
Abstract: We introduce a general approach for analyzing large-scale text-based data, combining the strengths of neural network language processing and generative statistical modeling to create a factor structure of unstructured data for downstream regressions typically used in social sciences. We generate textual factors by (i) representing texts using vector word embedding, (ii) clustering the vectors using locality-sensitive hashing to generate supports of topics, and (iii) identifying relatively interpretable spanning clusters (i.e., textual factors) through topic modeling. Our data-driven approach captures complex linguistic structures while ensuring computational scalability and economic interpretability, plausibly attaining certain advantages over and complementing other unstructured data analytics used by researchers, including emergent large language models. We conduct initial validation tests of the framework and discuss three types of its applications: (i) enhancing prediction and inference with texts, (ii) interpreting (non–text-based) models, and (iii) constructing new text-based metrics and explanatory variables. We illustrate each of these applications using examples in finance and economics such as macroeconomic forecasting from news articles, interpreting multifactor asset pricing models from corporate filings, and measuring theme-based technology breakthroughs from patents. Finally, we provide a flexible statistical package of textual factors for online distribution to facilitate future research and applications. [ABSTRACT FROM AUTHOR]
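The three-step pipeline in the abstract (embed words, cluster the embeddings with locality-sensitive hashing, then extract one factor per cluster) can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption: random-hyperplane hashing stands in for the LSH step, a rank-one SVD of the document-term submatrix stands in for the topic-modeling step, the function names `lsh_clusters` and `textual_factors` are hypothetical, and the inputs are random toy data rather than trained embeddings. It is not the authors' statistical package.

```python
import numpy as np

def lsh_clusters(vectors, n_planes, rng):
    """Group word vectors by the sign pattern of random hyperplane projections.

    Words whose embeddings fall on the same side of every hyperplane share a
    bucket; each bucket serves as the word support of one candidate topic.
    """
    planes = rng.standard_normal((vectors.shape[1], n_planes))
    bits = (vectors @ planes) > 0              # (n_words, n_planes) signatures
    buckets = {}
    for idx, sig in enumerate(bits):
        buckets.setdefault(tuple(sig.tolist()), []).append(idx)
    return list(buckets.values())

def textual_factors(doc_term, clusters):
    """Compute one factor per cluster (assumed rank-one SVD step).

    For each cluster, the leading right singular vector of the document-term
    submatrix gives word loadings; projecting documents onto it gives the
    document-level exposure used as a regressor downstream.
    """
    exposures = np.zeros((doc_term.shape[0], len(clusters)))
    loadings = []
    for k, words in enumerate(clusters):
        sub = doc_term[:, words]               # counts for this cluster's words
        _, _, vt = np.linalg.svd(sub, full_matrices=False)
        w = vt[0]
        if w.sum() < 0:                        # fix the arbitrary SVD sign
            w = -w
        loadings.append(w)
        exposures[:, k] = sub @ w
    return exposures, loadings

# Toy inputs (assumed): 8 "words" with 5-dimensional embeddings, 6 documents.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((8, 5))
doc_term = rng.integers(0, 4, size=(6, 8)).astype(float)

clusters = lsh_clusters(embeddings, n_planes=2, rng=rng)
exposures, loadings = textual_factors(doc_term, clusters)
print(exposures.shape)   # one column of exposures per cluster
```

The columns of `exposures` play the role of regressors in a downstream regression, and each entry of `loadings` lists the word weights that make the corresponding factor interpretable. With 2 hyperplanes there are at most 4 buckets; more hyperplanes give finer clusters.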
Copyright of Management Science (INFORMS) is the property of INFORMS: Institute for Operations Research & the Management Sciences and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Business Source Premier