DataStates-LLM: Lazy Asynchronous Checkpointing for Large Language Models
| Title: | DataStates-LLM: Lazy Asynchronous Checkpointing for Large Language Models |
|---|---|
| Authors: | Maurya, Avinash; Underwood, Robert; Rafique, M. Mustafa; Cappello, Franck; Nicolae, Bogdan |
| Source: | Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, pp. 227-239 |
| Availability: | http://dl.acm.org/doi/10.1145/3625549.3658685 |
| Database: | ACM Full-Text Collection |