| Title: |
Intertwining Generalization and Memorization |
| Authors: |
Coxe-Conklin, Henry; Smith, Kenny |
| Source: |
Proceedings of the Annual Meeting of the Cognitive Science Society, vol 44, iss 44 |
| Publisher Information: |
eScholarship, University of California |
| Publication Year: |
2022 |
| Collection: |
University of California: eScholarship |
| Subject Terms: |
Artificial Intelligence; Computer Science; Linguistics; Complex systems; Evolution; Natural Language Processing; Syntax; Agent-based Modeling; Neural Networks |
| Description: |
Over-parameterized neural models have become dominant in Natural Language Processing. Increasing the size of a neural network seems to result in improved performance across a broad range of tasks. Despite their size, these models have been shown to generalize poorly outside their training data, seemingly failing to extract the systematic generalizations that humans use to generate and interpret language. Increasingly, work has questioned whether these models are learning to generalize or memorize, with larger-capacity models potentially just memorizing their data more and more effectively. We suggest the tradeoff between memorization and generalization may be more nuanced, with the capacity of a model shaping the kinds of generalizations it is likely to acquire. Our results on a linguistic task suggest that while all models develop generalization strategies, smaller models may arrive at a smaller distribution of strategies that generalize more robustly to novel data. |
| Document Type: |
article in journal/newspaper |
| File Description: |
application/pdf |
| Language: |
unknown |
| Relation: |
qt5032871c; https://escholarship.org/uc/item/5032871c; https://escholarship.org/content/qt5032871c/qt5032871c.pdf |
| Availability: |
https://escholarship.org/uc/item/5032871c; https://escholarship.org/content/qt5032871c/qt5032871c.pdf |
| Rights: |
CC-BY |
| Accession Number: |
edsbas.18EA1446 |
| Database: |
BASE |