Efficient Inference for Diffusion Language Models
Goal: Reduce the inference cost of diffusion language models for text generation.
- Reduce the number of denoising iterations needed at sampling time via distillation.
- Develop a sparse attention mechanism for long-sequence generation.
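
The two directions above are complementary: distillation lowers the number of iterations, while sparse attention lowers the cost of each one. The sketch below is a rough back-of-the-envelope cost model, not the project's method. It assumes that cost is dominated by the number of query-key pairs scored per iteration, and it uses a symmetric sliding window as a stand-in sparsity pattern, since the outline names neither a pattern nor a distillation procedure. All function names and the example numbers are hypothetical.

```python
# Toy cost model (assumption: cost ~ iterations x attended query-key pairs).
# The sliding-window pattern is an illustrative choice, not the proposed mechanism.


def dense_attention_pairs(seq_len: int) -> int:
    """Query-key pairs scored by full bidirectional attention."""
    return seq_len * seq_len


def sliding_window_pairs(seq_len: int, window: int) -> int:
    """Query-key pairs when each position attends only to |i - j| <= window."""
    total = 0
    for i in range(seq_len):
        lo = max(0, i - window)            # leftmost key this query sees
        hi = min(seq_len - 1, i + window)  # rightmost key this query sees
        total += hi - lo + 1
    return total


def inference_cost(num_steps: int, pairs_per_step: int) -> int:
    """Total scored pairs over all denoising iterations."""
    return num_steps * pairs_per_step


if __name__ == "__main__":
    seq_len = 4096  # hypothetical sequence length
    baseline = inference_cost(256, dense_attention_pairs(seq_len))
    fewer_steps = inference_cost(16, dense_attention_pairs(seq_len))
    both = inference_cost(16, sliding_window_pairs(seq_len, window=128))
    print("baseline:", baseline)
    print("fewer steps only:", fewer_steps)
    print("fewer steps + sparse attention:", both)
```

In this model the savings multiply, because one factor depends only on the iteration count and the other only on the attention pattern. Real speedups also depend on kernel efficiency, memory traffic, and the non-attention layers, none of which this sketch captures.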