Senior Software Engineer, RL Post-Training Frameworks
In this role, you will design and build reinforcement learning post-training infrastructure that scales efficiently from single-GPU experiments to large distributed systems. You'll optimize training-inference-rollout loops across heterogeneous hardware, contribute to open-source RL and training frameworks like VeRL and TorchTitan, and collaborate with AI researchers and infrastructure teams to improve distributed runtimes such as Ray and Monarch. The position emphasizes fault tolerance, elastic scaling, and integration with next-generation hardware and deep learning tools.