NVIDIA researchers, alongside teams from MIT, HKU, and Tsinghua, have developed QeRL, a reinforcement learning framework that pushes post-training of 32-billion-parameter large language models (LLMs) into 4-bit NVFP4 precision. This enables running RL on a single NVIDIA H100 GPU while matching BF16-level accuracy, with per-step speedups of roughly 1.2x to 1.5x.
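Since the announcement describes QeRL only at a high level, the snippet below is a minimal sketch of the general recipe it builds on: a frozen 4-bit quantized base model with trainable low-rank adapters, which an RL trainer then optimizes. It uses bitsandbytes NF4 quantization and PEFT LoRA as stand-ins for QeRL's NVFP4 kernels; the model name, ranks, and target modules are illustrative assumptions, not QeRL's actual API.

```python
# Minimal sketch (assumption): quantized base weights + LoRA adapters as the
# starting point for RL post-training. This is NOT QeRL's API; bitsandbytes NF4
# stands in for QeRL's NVFP4 quantization.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "Qwen/Qwen2.5-7B"  # hypothetical choice; QeRL reports results up to 32B

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit base weights (NF4 here, NVFP4 in QeRL)
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # BF16 compute, matching the accuracy target
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

# Freezing the 4-bit base model and training only small LoRA adapters is what
# keeps the RL memory footprint small enough for a single GPU.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # illustrative; real setups often cover more projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# An RL trainer (e.g., a PPO/GRPO-style loop) would then generate rollouts with
# this model, score them with a reward function, and update only the adapters.
```

The design point this illustrates is that the expensive full-precision weights never need gradients: rollout generation benefits from the 4-bit base, while the RL update touches only a small adapter, which is the combination QeRL accelerates with NVFP4 kernels.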
This matters because it drastically reduces the computational resources and costs typically required to post-train models of this size. Developers and AI researchers can now explore and fine-tune large language models far more efficiently, making cutting-edge AI capabilities more accessible.
By open-sourcing QeRL, NVIDIA and collaborators invite the AI community to leverage this efficient quantization technique, which could reshape how large-scale reinforcement learning is approached and accelerate innovation in AI model training.