StreamDQ: HBM-Integrated On-the-Fly DeQuantization via Memory Load for Large Language Models
| Title: | StreamDQ: HBM-Integrated On-the-Fly DeQuantization via Memory Load for Large Language Models |
|---|---|
| Authors: | Jeong, M.; Yoon, D.; Ahn, S.; Lee, S.; Kim, J.; Jeon, J.; Sim, J.; Joo, Y.; Kim, H. |
| Source: | IEEE Computer Architecture Letters (IEEE Comput. Arch. Lett.), 24(2):373-376, Dec. 2025 |
| Database: | IEEE Xplore Digital Library |