ML Runtime Engineer (Mid-Level and Senior)
This role involves developing and optimizing the runtime stack for AI accelerators, with a focus on integration with open-source ML frameworks such as PyTorch and vLLM. The engineer will work closely with hardware and software teams, following a co-design approach, to enable high-performance inference for large language models. Key responsibilities include building a high-performance runtime in Rust and supporting inference server integrations.