We are looking for a passionate, talented, and inventive Applied Scientist with a strong machine learning background to help build industry-leading language technology powering Alexa for Shopping, our AI-driven search and shopping assistant that helps customers at every step of their shopping journey.
This innovative role focuses on developing and optimizing language model-powered (LLM/SLM) conversational experiences. The core emphasis is on getting the best performance out of LLMs/SLMs via careful and methodical instruction design, contextual grounding, informed choices of MCP tools and agent/multi-agent systems, context engineering, model fine-tuning, evaluation frameworks, and experimentation to systematically improve quality, robustness, and customer impact. The work combines scientific rigor with product intuition to raise the bar for conversational AI performance at Amazon scale.
Our mission in conversational shopping is to make it easy for customers to find and discover the best products to meet their needs by helping with their product research, providing comparisons and recommendations, answering product questions, enabling shopping directly from images or videos, providing visual inspiration, and more. We do this by leveraging advanced analytics, Natural Language Processing (NLP), Machine Learning (ML), A/B testing, causal inference, and data-driven insights to continuously improve our systems.
Key job responsibilities
As an Applied Scientist on our team, you will develop and maintain LLM agents and the tooling used to evaluate them, including automated eval pipelines, LLM-as-a-judge methodologies, rubric design, and dataset curation to measure nuanced aspects of response quality.
You will partner with the wider org to experiment with techniques such as retrieval augmentation, context enrichment, prompt decomposition, and model fine-tuning or post-training strategies, if and when applicable. Where latency and cost constraints demand it, you will lead post-training of small language models (SLMs) — including supervised fine-tuning, preference optimisation, and distillation — to deliver low-latency conversational and shopping experiences.
You will apply machine learning and deep learning techniques as last-mile improvements to shopping experiences, which may span ranking, relevance, personalisation, and multimodal understanding. You will design and evaluate agentic architectures that balance the needs of diverse shopping use cases, making principled choices across paradigms such as single-agent and multi-agent systems, memory management strategies, and tool orchestration to optimise for quality, latency, and reliability at scale. You will work with petabytes of data and identify opportunities to apply machine learning models that make conversational systems more performant.
A day in the life
- Perform hands-on analysis of large-scale multimodal interaction datasets to develop insights into how customers engage with conversational AI systems and how to improve response quality and customer experience.
- Use statistical methods, experimentation, and data-driven analysis to develop scalable approaches for measuring, evaluating, and optimizing large language model (LLM)-based shopping assistant systems, leveraging structured and unstructured contextual signals.
- Conduct deep-dive analyses to identify opportunities for improving conversational relevance, grounding, customer satisfaction, and downstream business impact.
- Collaborate with product managers and engineers to translate analytical insights into production systems, working closely on model evaluation and deployment.
- Communicate results and insights to both technical and non-technical audiences, including through presentations, written reports, and data visualizations.
About the team
The Alexa for Shopping Science team, based in London, works alongside ~150 engineers, designers and product managers, shaping the future of AI-driven shopping experiences at Amazon. The team works on every aspect of the conversational AI system. This ranges from making it agentic, so that customers can set price alerts or empower the assistant to act on their behalf and automatically purchase products when the price is right, to understanding multimodal user queries and generating answers that combine text, image, audio and video, including deep research reports that scour the web and the Amazon catalog to provide detailed and personalised shopping guidance. We use and advance state-of-the-art techniques in the fields of Natural Language Processing, generative AI, Information Retrieval, Machine/Deep Learning, and Data Mining. We validate our work by actively participating in the internal and external scientific communities.