Senior System Software Engineer, NCCL - Partner Enablement

Seniority: Senior
Posted: 17 Apr 2026 (3 weeks ago)

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars.

Come work for the team that brought you NCCL, NVSHMEM & GPUDirect. Our GPU communication libraries are crucial for scaling Deep Learning and HPC applications! We are looking for a motivated Partner Enablement Engineer to guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking (InfiniBand, RoCE, Ethernet). This is an outstanding opportunity to get an end-to-end understanding of the AI networking stack. Are you ready to contribute to the development of innovative technologies and help realize NVIDIA's vision?

What you will be doing:

  • Engage with our partners and customers to root-cause functional and performance issues reported with NCCL

  • Conduct performance characterization and analysis of NCCL and DL applications on groundbreaking GPU clusters

  • Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP, etc.)

  • Guide our customers and support teams on HPC knowledge and standard methodologies for running applications on multi-node clusters

  • Document and conduct trainings/webinars for NCCL

  • Engage with internal teams in different time zones on networking, GPUs, storage, infrastructure and support

What we need to see:

  • B.S./M.S. degree in CS/CE or equivalent experience, plus 5+ years of relevant experience. Experience with parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM)

  • Excellent C/C++ programming skills, including debugging, profiling, code optimization, performance analysis, and test design

  • Experience working with an engineering or academic research community supporting HPC or AI

  • Practical experience with high-performance networking: InfiniBand/RoCE/Ethernet networks, RDMA, topologies, congestion control

  • Expertise in Linux fundamentals and a scripting language, preferably Python

  • Familiarity with containers, cloud provisioning and scheduling tools (Docker, Docker Swarm, Kubernetes, SLURM, Ansible)

  • Adaptability and passion to learn new areas and tools

  • Flexibility to work and communicate effectively across different teams and time zones

Ways to stand out from the crowd:

  • Experience conducting performance benchmarking and developing infrastructure on HPC clusters. Prior system administration experience, especially for large clusters. Experience debugging network configuration issues in large-scale deployments

  • Familiarity with CUDA programming and/or GPUs. Good understanding of Machine Learning concepts and experience with Deep Learning frameworks such as PyTorch and TensorFlow

  • Deep understanding of technology and passion for what you do

NVIDIA is at the forefront of breakthroughs in Artificial Intelligence, High-Performance Computing, and Visualization. Our teams are composed of driven, innovative professionals dedicated to pushing the boundaries of technology. We offer highly competitive salaries, an extensive benefits package, and a work environment that promotes diversity, inclusion, and flexibility. As an equal opportunity employer, we are committed to fostering a supportive and empowering workplace for all.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. For Poland: The base salary range is 221,250 PLN - 383,500 PLN for Level 3, and 292,500 PLN - 507,000 PLN for Level 4.
