Principal GenAI Specialist Solutions Architect, Training & Inference
DESCRIPTION
Do you want to help define the future of Go to Market (GTM) at AWS using generative AI (GenAI)?
AWS Sales, Marketing, and Global Services (SMGS) is responsible for driving revenue, adoption, and growth across the largest and fastest-growing accounts, from small and mid-market businesses to enterprise-level customers, including the public sector.
Within SMGS, you will be part of the core worldwide GenAI Training and Inference team, responsible for defining, building, and deploying targeted strategies to accelerate customer adoption of our services and solutions across industry verticals.
You will work directly with the most important customers (across segments) in the GenAI model training and inference space, helping them adopt and scale large-scale workloads (e.g., foundation models) on AWS. The work includes running model performance evaluations, developing demos and proof-of-concepts, developing GTM plans, and external/internal evangelism.
Key job responsibilities
You will help develop the industry's best cloud-based solutions to grow the GenAI business. Working closely with our engineering teams, you will help enable new capabilities for our customers to develop and deploy GenAI workloads on AWS. You will enable the AWS technical community, solutions architects, and sales teams with specific, customer-centric value propositions and demos of end-to-end GenAI on the AWS cloud.
You will possess a technical and business background that enables you to drive an engagement and interact at the highest levels with startups, enterprises, and AWS partners. You will have the technical depth and business experience to easily articulate the potential and challenges of GenAI models and applications to engineering teams and C-level executives. This requires deep familiarity across the stack: compute infrastructure (Amazon EC2, Lustre), ML frameworks (PyTorch, JAX), orchestration layers (Kubernetes, Slurm), parallel computing (NCCL, MPI), and MLOps, as well as target use cases in the cloud.
You will drive the development of the GTM plan for building and scaling GenAI on AWS, interact with customers directly to understand their business problems, and help them with defining and implementing scalable GenAI solutions to solve them (often via proof-of-concepts). You will also work closely with account teams, research scientists, and product teams to drive model implementations and new solutions.
You should be passionate about helping companies/partners understand best practices for operating on AWS. An ideal candidate will be adept at interacting, communicating and partnering with other teams within AWS such as product teams, solutions architecture, sales, marketing, business development, and professional services, as well as representing your team to executive management. You will have a natural appetite to learn, optimize and build new technologies and techniques. You will also look for patterns and trends that can be broadly applied across an industry segment or a set of customers that can help accelerate innovation.
This is an opportunity to be at the forefront of technological transformations as a key technical leader. Additionally, you will work with the AWS ML and EC2 product teams to shape product vision and prioritize features for AI/ML frameworks and applications. A keen sense of ownership, drive, and scrappiness are a must.
About the team
The Foundation Models (fka Training & Inference) team is highly specialized in computational workloads, performance evaluation, and optimization. We work with foundation model builders and large-scale training customers, and we dive deep into the ML stack, including the hardware (GPUs, custom silicon), operating system (kernel), communication libraries (NCCL, MPI), frameworks (PyTorch, NeMo, JAX), and models (Llama, Nemotron...). We also work heavily with containers (Docker, Enroot) and orchestrators/schedulers (EKS).
Diverse Experiences
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying.
Why AWS
Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating. That's why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there's nothing we can't achieve in the cloud.
Inclusive Team Culture
Here at AWS, it's in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.
Mentorship and Career Growth
We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
BASIC QUALIFICATIONS
- Bachelor's degree in computer science, engineering, mathematics or equivalent
- Experience developing technology solutions and evangelising end-to-end technology roadmaps that guide IT transformations toward cloud computing
- Experience in specific technology domain areas like software development, cloud computing, systems engineering, infrastructure, security, networking, data and analytics
- Experience communicating across technical and non-technical audiences and at C-level, including training, workshops, publications
- Practical experience with distributed training frameworks and inference servers, orchestrators/schedulers (one or several of Kubernetes, EKS, Slurm), and storage systems (S3, Lustre, POSIX), as well as experience working with GPUs or custom silicon, including profiling and optimization
PREFERRED QUALIFICATIONS
- Knowledge of distributed systems design and implementation or equivalent
- Knowledge of large scale automation and workflow management or equivalent
- Strong presentation and whiteboarding skills, with a high degree of comfort speaking with internal and external executives, IT management, and developers
- Experience architecting, migrating, transforming or modernizing customer requirements to the cloud
- Practical experience in High Performance Computing (HPC) and/or distributed training, performance profiling and optimization
- Experience in distributed training (PyTorch, JAX, NeMo) and/or inference (NIM, TRT-LLM, TorchServe, Triton)
Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify and build. Protecting your privacy and the security of your data is a longstanding top priority for Amazon. Please consult our Privacy Notice (https://www.amazon.jobs/en/privacy_page) to know more about how we collect, use and transfer the personal data of our candidates.
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/content/en/how-we-hire/accommodations.