OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization
| Title: | OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization |
|---|---|
| Authors: | Guo, Cong; Tang, Jiaming; Hu, Weiming; Leng, Jingwen; Zhang, Chen; Yang, Fan; Liu, Yunxin; Guo, Minyi; Zhu, Yuhao |
| Source: | Proceedings of the 50th Annual International Symposium on Computer Architecture, pp. 1-15 |
| Availability: | http://dl.acm.org/doi/10.1145/3579371.3589038 |
| Database: | ACM Full-Text Collection |