Fast-SLM: Towards Latency-Optimal Hybrid Small Language Models

Karsten Kreis; Yonggan Fu
Publication: Advances in Neural Information Processing Systems (NeurIPS)