MoSKA: Mixture of Shared KV Attention for Efficient Long-Sequence LLM Inference

Title:	MoSKA: Mixture of Shared KV Attention for Efficient Long-Sequence LLM Inference
Authors:	Rhee, M.; Choi, S.; Kim, E.; Sim, J.; Joo, Y.; Kim, H.
Source:	IEEE Computer Architecture Letters (IEEE Comput. Arch. Lett.), 24(2):365-368, Dec. 2025
Database:	IEEE Xplore Digital Library