FastQuery: Communication-efficient Embedding Table Query for Private LLMs inference
| Title: | FastQuery: Communication-efficient Embedding Table Query for Private LLMs inference |
|---|---|
| Authors: | Lin, Chenqi; Xu, Tianshi; Yang, Zebin; Wang, Runsheng; Huang, Ru; Li, Meng |
| Source: | 2024 61st ACM/IEEE Design Automation Conference (DAC), pp. 1-6, Jun. 2024 |
| Relation: | 2024 61st ACM/IEEE Design Automation Conference (DAC) |
| Database: | IEEE Xplore Digital Library |