X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms
| Title: | X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms |
|---|---|
| Authors: | Yuan, Yueming; Gupta, Ahan; Li, Jianping; Dash, Sajal; Wang, Feiyi; Zhang, Minjia |
| Source: | Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1315-1331 |
| Availability: | http://dl.acm.org/doi/10.1145/3712285.3759886 |
| Database: | ACM Full-Text Collection |