Katalog Plus
Library of Frankfurt UAS

Discovering Language Model Behaviors with Model-Written Evaluations

Title: Discovering Language Model Behaviors with Model-Written Evaluations
Authors: Perez, Ethan; Ringer, Sam; Lukosiute, Kamile; Nguyen, Karina; Chen, Edwin; Heiner, Scott; Pettit, Craig; Olsson, Catherine; Kundu, Sandipan; Kadavath, Saurav; Jones, Andy; Chen, Anna; Mann, Benjamin; Israel, Brian; Seethor, Bryan; McKinnon, Cameron; Olah, Christopher; Yan, Da; Amodei, Daniela; Amodei, Dario; Drain, Dawn; Li, Dustin; Tran-Johnson, Eli; Khundadze, Guro; Kernion, Jackson; Landis, James; Kerr, Jamie; Mueller, Jared; Hyun, Jeeyoon; Landau, Joshua; Ndousse, Kamal; Goldberg, Landon; Lovitt, Liane; Lucas, Martin; Sellitto, Michael; Zhang, Miranda; Kingsland, Neerav; Elhage, Nelson; Joseph, Nicholas; Mercado, Noemi; DasSarma, Nova; Rausch, Oliver; Larson, Robin; McCandlish, Sam; Johnston, Scott; Kravec, Shauna; El Showk, Sheer; Lanham, Tamera; Telleen-Lawton, Timothy; Brown, Tom
Source: Findings of the Association for Computational Linguistics: ACL 2023; pages 13387-13434
Publisher Information: Association for Computational Linguistics
Publication Year: 2023
Document Type: conference object
Language: unknown
ISBN: 978-1-338-71343-5; 1-338-71343-4
DOI: 10.18653/v1/2023.findings-acl.847
Availability: https://doi.org/10.18653/v1/2023.findings-acl.847
Accession Number: edsbas.195A0F23
Database: BASE