AlignGuard: scalable safety alignment for text-to-image generation

Title: AlignGuard: scalable safety alignment for text-to-image generation
Authors: Liu, R; Chen, IC; Gu, J; Zhang, J; Pi, R; Chen, Q; Torr, P; Khakzar, A; Pizzati, F
Publication Year: 2025
Collection: Oxford University Research Archive (ORA)
Description: Text-to-image (T2I) models are widespread, but their limited safety guardrails expose end users to harmful content and potentially allow for model misuse. Current safety measures are typically limited to text-based filtering or concept removal strategies, able to remove just a few concepts from the model’s generative capabilities. In this work, we introduce AlignGuard, a method for safety alignment of T2I models. We enable the application of Direct Preference Optimization (DPO) for safety purposes in T2I models by synthetically generating a dataset of harmful and safe image-text pairs, which we call CoProV2. Using a custom DPO strategy and this dataset, we train safety experts, in the form of low-rank adaptation (LoRA) matrices, able to guide the generation process away from specific safety-related concepts. Then, we merge the experts into a single LoRA using a novel merging strategy for optimal scaling performance. This expert-based approach enables scalability, allowing us to remove 7× more harmful concepts from T2I models compared to baselines. AlignGuard consistently outperforms the state-of-the-art on many benchmarks and establishes new practices for safety alignment in T2I networks. We will release code and models.
Document Type: conference object
Language: English
Availability: https://ora.ox.ac.uk/objects/uuid:0550d283-1253-4b8a-bdd3-55fb8b1a3854
Rights: info:eu-repo/semantics/openAccess ; CC Attribution (CC BY)
Accession Number: edsbas.32DCB319
Database: BASE