FastQuery: Communication-efficient Embedding Table Query for Private LLMs inference
| Title: | FastQuery: Communication-efficient Embedding Table Query for Private LLMs inference |
|---|---|
| Authors: | Lin, Chenqi; Xu, Tianshi; Yang, Zebin; Wang, Runsheng; Huang, Ru; Li, Meng |
| Source: | Proceedings of the 61st ACM/IEEE Design Automation Conference, pp. 1-6 |
| Availability: | http://dl.acm.org/doi/10.1145/3649329.3657374 |
| Database: | ACM Full-Text Collection |