Maths for Machine Learning Jobs: The Only Topics You Actually Need (& How to Learn Them)
Machine learning job adverts in the UK love vague phrases like “strong maths” or “solid fundamentals”. That can make the whole field feel gatekept, especially if you are a career changer or a student who has not touched maths since A level.
Here is the practical truth. For most roles on MachineLearningJobs.co.uk, such as Machine Learning Engineer, Applied Scientist, Data Scientist, NLP Engineer, Computer Vision Engineer or MLOps Engineer with modelling responsibilities, the maths you actually use is concentrated in four areas:
Linear algebra essentials (vectors, matrices, projections, PCA intuition)
Probability & statistics (uncertainty, metrics, sampling, base rates)
Calculus essentials (derivatives, chain rule, gradients, backprop intuition)
Basic optimisation (loss functions, gradient descent, regularisation, tuning)
If you are comfortable in those four areas, you can build models, debug training, evaluate properly, explain trade-offs & sound credible in interviews.
This guide gives you a clear scope plus a six-week learning plan, portfolio projects & resources so you can learn with momentum rather than drowning in theory.
Who this is aimed at
Route A: Career changers
You can code or work with data already but you want the maths that makes ML training & evaluation feel logical rather than mystical.
Route B: Students & recent graduates
You have seen some maths at university but you want job-ready fluency: translating maths into code plus explaining decisions clearly.
Same topics for both routes. Route A learns best by building first. Route B often benefits from connecting concepts to model behaviour & evaluation.
What “good at maths” really means in ML interviews
Interviewers are rarely checking whether you can reproduce proofs. They are checking whether you can:
Keep shapes & dimensions correct
Explain why a model is underfitting or overfitting
Choose a sensible metric & validation method
Reason about uncertainty & class imbalance
Understand what gradients are doing during training
Make trade-offs between accuracy, latency, stability, interpretability & cost
So we will focus exactly on that.
1) Linear algebra essentials for ML
Most ML is linear algebra in disguise. Features are vectors. Datasets are matrices. Weights are vectors or matrices. Embeddings are vectors in high-dimensional space.
What you actually need
Vectors & matrices
Vector addition, scalar multiplication
Dot product
Norms as “size”
Matrix multiplication & shape rules
Transpose
Geometry intuition
Distance vs similarity
Cosine similarity for embeddings & retrieval
Projections & PCA intuition
What it means to project data onto a direction
PCA as “find directions that capture most variance”
Lightweight SVD or eigen intuition
You do not need deep proofs. You do need to understand why PCA produces components & what “variance explained” means.
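To make “components” & “variance explained” concrete, here is a minimal NumPy sketch of PCA via the SVD. The function name pca_via_svd & the toy dataset are illustrative assumptions, not a reference implementation. In real work you would normally reach for a library routine.

```python
import numpy as np

def pca_via_svd(X, n_components=2):
    """Return component directions, projected data & variance-explained ratios."""
    X_centred = X - X.mean(axis=0)           # PCA works on zero-mean columns
    # Rows of Vt are orthonormal directions, ordered by singular value
    U, S, Vt = np.linalg.svd(X_centred, full_matrices=False)
    variance = S**2 / (X.shape[0] - 1)       # variance captured along each direction
    explained_ratio = variance / variance.sum()
    components = Vt[:n_components]
    projected = X_centred @ components.T     # coordinates of each row on PC1, PC2, ...
    return components, projected, explained_ratio[:n_components]

rng = np.random.default_rng(0)
# Toy data (assumed): almost all of the spread lies along the direction (1, 1)
t = rng.normal(size=500)
X = np.column_stack([t, t + 0.1 * rng.normal(size=500)])

components, projected, ratio = pca_via_svd(X)
print("PC1 direction:", components[0])
print("variance explained:", ratio)
```

PC1 should point roughly along (1, 1), up to sign, & carry nearly all of the variance, which is exactly what “find directions that capture most variance” means for this dataset.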
Where it shows up at work
Linear regression, logistic regression
Neural nets as repeated matrix multiplies with non-linearities
Embeddings & nearest neighbour search
PCA as a sanity check on datasets & batch effects
How to learn it without getting stuck
If you want intuition fast, 3Blue1Brown’s linear algebra series is a strong visual route. YouTube
Mini exercises that build job skill
Implement cosine similarity & test it on simple vectors
Take an embeddings dataset or a toy dataset, run PCA, interpret PC1 & PC2
Write a “shape checker” function that asserts matrix dimensions in a small model pipeline
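The first & third exercises can be sketched in a few lines of NumPy. The helper names cosine_similarity & check_shape, along with the tiny pipeline dimensions, are hypothetical choices for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot product over product of norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(a @ b / denom)

def check_shape(name, array, expected):
    """Assert array.shape matches expected; None means any size on that axis."""
    actual = array.shape
    ok = len(actual) == len(expected) and all(
        e is None or e == a for a, e in zip(actual, expected)
    )
    assert ok, f"{name}: expected shape {expected}, got {actual}"
    return array

# Hypothetical one-layer pipeline: (batch, features) @ (features, hidden)
batch, n_features, n_hidden = 4, 3, 5
X = check_shape("X", np.ones((batch, n_features)), (None, n_features))
W = check_shape("W", np.zeros((n_features, n_hidden)), (n_features, n_hidden))
H = check_shape("H", X @ W, (None, n_hidden))

print(cosine_similarity([1, 0], [0, 1]))    # orthogonal vectors
print(cosine_similarity([1, 2], [2, 4]))    # same direction, different length
print(cosine_similarity([1, 0], [-1, 0]))   # opposite directions
```

The three calls should come out close to 0, 1 & -1 respectively. Note that cosine similarity ignores vector length, which is why it suits embeddings where direction carries the meaning.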
2) Probability & statistics you will actually use
ML is uncertainty management. Your data is noisy. Labels are imperfect. Sampling introduces randomness. Metrics wobble from run to run. Probability & stats is how you stop guessing.
What you actually need
Core probability
Conditional probability intuition
Independence vs dependence
Base rates and why rare events break naive thinking
Distributions you see in practice
Bernoulli and binomial for conversion events and churn
Normal as an approximation for aggregated metrics
Poisson for counts per time period
Statistics for evaluation
Mean, variance, standard deviation
Confidence interval intuition
Sampling variability
Correlation vs causation awareness
Metrics
Classification: precision, recall, F1, ROC AUC, PR AUC
Regression: MAE, RMSE
Calibration awareness if you use probabilities
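The classification metrics above all fall out of four confusion matrix counts. The sketch below computes them by hand on a made-up imbalanced example. The labels & predictions are invented for illustration, not real results.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels given as 0/1 arrays."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # of flagged items, how many were right
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # of true positives, how many were found
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Invented example: 1000 cases, 2% positive base rate
y_true = np.array([1] * 20 + [0] * 980)
# Hypothetical classifier: finds 5 of the 20 positives & raises 5 false alarms
y_pred = np.array([1] * 5 + [0] * 15 + [1] * 5 + [0] * 975)

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here accuracy is 0.98 while recall is only 0.25, which is the base rate problem in miniature: with rare positives, a model can look excellent on accuracy while missing most of the cases you care about.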
A solid free pathway for the basics is Khan Academy’s statistics & probability content. khanacademy.org
Where it shows up at work
Choosing thresholds for classifiers
Handling class imbalance
Explaining whether improvements are real or noise
Designing experiments & offline evaluations
Mini exercises that build job skill
Build a confusion matrix & compute precision & recall
Create an imbalanced dataset where accuracy looks good but recall is poor, then fix it
Show how metric estimates vary across different train/test splits
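For the last exercise, one possible sketch is to re-split the same data many times & watch the score move. The synthetic data & the nearest-centroid classifier are stand-ins chosen to keep the example dependency-free. Any model & dataset would do.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic two-class data (assumed): two overlapping Gaussian blobs
n = 200
X = np.vstack([rng.normal(0.0, 1.0, size=(n // 2, 2)),
               rng.normal(1.5, 1.0, size=(n // 2, 2))])
y = np.array([0] * (n // 2) + [1] * (n // 2))

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Classify each test point by the closer class mean, then score accuracy."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_test - c0, axis=1)
    d1 = np.linalg.norm(X_test - c1, axis=1)
    pred = (d1 < d0).astype(int)
    return float(np.mean(pred == y_test))

scores = []
for _ in range(30):
    order = rng.permutation(n)                 # a fresh random 80/20 split each time
    test_idx, train_idx = order[:40], order[40:]
    scores.append(nearest_centroid_accuracy(
        X[train_idx], y[train_idx], X[test_idx], y[test_idx]))

scores = np.array(scores)
print(f"accuracy over 30 splits: mean={scores.mean():.3f}, "
      f"std={scores.std(ddof=1):.3f}, min={scores.min():.3f}, max={scores.max():.3f}")
```

The spread between the minimum & maximum is the point. If a proposed improvement is smaller than that spread, a single split cannot tell you whether it is real or noise.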
3) Calculus essentials for gradients & backprop
Most working ML engineers do not do calculus on paper every day. But you must understand gradients conceptually because that is how models learn.
What you actually need
Derivative as rate of change
Partial derivatives conceptually
Chain rule idea
Gradient as direction of steepest increase
Gradient descent as repeated small steps downhill
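These ideas fit in a short sketch: an analytic gradient, a finite-difference check that it is correct, & repeated small steps downhill. The bowl-shaped loss & the learning rate of 0.1 are arbitrary choices for illustration.

```python
import numpy as np

def loss(w):
    # Simple bowl with its minimum at w = (3, -1)
    return (w[0] - 3.0) ** 2 + 2.0 * (w[1] + 1.0) ** 2

def grad(w):
    # Partial derivatives of the loss with respect to w[0] & w[1]
    return np.array([2.0 * (w[0] - 3.0), 4.0 * (w[1] + 1.0)])

def numerical_grad(f, w, eps=1e-6):
    """Central finite differences: a slow but trustworthy check on grad()."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        step = np.zeros_like(w)
        step[i] = eps
        g[i] = (f(w + step) - f(w - step)) / (2 * eps)
    return g

w = np.array([0.0, 0.0])
assert np.allclose(grad(w), numerical_grad(loss, w), atol=1e-4)

learning_rate = 0.1
history = [loss(w)]
for _ in range(100):
    w = w - learning_rate * grad(w)   # step against the gradient, i.e. downhill
    history.append(loss(w))

print("final w:", w)
print("first & last loss:", history[0], history[-1])
```

The gradient points towards steepest increase, so subtracting it moves towards lower loss. The finite-difference check is worth keeping as a habit whenever you derive a gradient by hand.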
For intuition, 3Blue1Brown’s calculus series is a strong option. YouTube
For clear step-by-step explanations of ML ideas including gradient descent, StatQuest is widely used by learners. YouTube
Where it shows up at work
Understanding why loss goes down or fails to go down
Diagnosing exploding or vanishing gradients
Choosing learning rates & schedules
Explaining backprop in plain English
Mini exercises that build job skill
Plot a simple loss function then implement one step of gradient descent
Train a tiny model then log gradient norms to see instability
Explain autograd in your own words
PyTorch’s autograd tutorial is a reliable reference for how automatic differentiation supports training. docs.pytorch.org
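If it helps to see the idea before reading the framework docs, here is a toy reverse-mode autodiff for scalars in plain Python. The Value class is a hypothetical teaching aid written for this guide. It is not how PyTorch is implemented, but it shows the same principle: record the computation, then apply the chain rule backwards.

```python
import math

class Value:
    """A scalar that remembers which values produced it & the local derivatives."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._local_grads = local_grads   # d(self)/d(parent), one per parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def tanh(self):
        t = math.tanh(self.data)
        return Value(t, (self,), (1.0 - t * t,))

    def backward(self):
        # Order nodes so every node comes after the values it depends on
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Chain rule: pass each node's gradient back to its parents
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad

# One "neuron": y = tanh(w * x + b)
x, w, b = Value(0.5), Value(-2.0), Value(1.5)
y = (w * x + b).tanh()
y.backward()
print("y:", y.data)
print("dy/dw:", w.grad, "dy/dx:", x.grad, "dy/db:", b.grad)
```

The forward pass builds a graph of operations. The backward pass walks it in reverse, multiplying local derivatives & summing contributions where a value is used more than once. That is backprop in plain English.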
4) Basic optimisation for training & tuning
Training is optimisation. You choose a loss function then you minimise it under constraints.
What you actually need
Loss functions: MSE, cross-entropy
SGD & mini-batch training intuition
Learning rate as a primary control knob
Regularisation: L2, dropout intuition
Early stopping
Hyperparameter search basics
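The two loss functions & the L2 penalty listed above are short enough to write out by hand. This is a minimal NumPy sketch. The function names & example predictions are illustrative, & frameworks provide numerically safer versions.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average squared distance between target & prediction."""
    return float(np.mean((y_true - y_pred) ** 2))

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Average negative log-likelihood of the true labels under predicted probabilities."""
    p = np.clip(p_pred, eps, 1.0 - eps)   # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

def l2_penalty(weights, strength):
    """L2 regularisation term added to the loss to discourage large weights."""
    return float(strength * np.sum(weights ** 2))

y = np.array([1.0, 0.0, 1.0, 0.0])
confident_right = np.array([0.9, 0.1, 0.9, 0.1])
confident_wrong = np.array([0.1, 0.9, 0.1, 0.9])

print("MSE right / wrong:", mse(y, confident_right), mse(y, confident_wrong))
print("BCE right / wrong:", binary_cross_entropy(y, confident_right),
      binary_cross_entropy(y, confident_wrong))
print("L2 penalty:", l2_penalty(np.array([3.0, 4.0]), strength=0.1))
```

Compare how the two losses treat confident mistakes. Cross-entropy grows much faster than MSE as a wrong prediction approaches certainty, which is one reason it is the usual choice for classification.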
A practical way to connect maths to code is to build a full training loop once. PyTorch’s training loop tutorial is a good reference. docs.pytorch.org
Where it shows up at work
Model training, fine-tuning
Debugging non-convergence
Preventing overfitting
Explaining trade-offs between model size, speed & performance
Mini exercises that build job skill
Train the same model with three learning rates & interpret the curves
Compare regularisation strengths & show how it affects validation performance
Run a small hyperparameter search & document findings
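The first exercise can be tried on a one-parameter regression before moving to a real model. In this sketch the data, the three learning rates & the step count are all assumptions picked so that one run is too slow, one converges & one diverges.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
x = x / np.sqrt(np.mean(x ** 2))            # rescale so mean(x^2) is 1
y = 2.0 * x + 0.1 * rng.normal(size=200)    # true slope is 2, plus a little noise

def train(learning_rate, steps=50):
    """Fit y = w * x by gradient descent on MSE; return final w & the loss curve."""
    w = 0.0
    curve = []
    for _ in range(steps):
        residual = w * x - y
        curve.append(float(np.mean(residual ** 2)))
        gradient = 2.0 * np.mean(residual * x)
        w -= learning_rate * gradient
    return w, curve

results = {lr: train(lr) for lr in (0.01, 0.3, 1.2)}
for lr, (w, curve) in results.items():
    print(f"lr={lr}: final w={w:.3f}, first loss={curve[0]:.3f}, last loss={curve[-1]:.3g}")
```

With this scaling each step multiplies the error in w by (1 - 2 * learning_rate). So 0.01 shrinks it slowly, 0.3 shrinks it quickly, & 1.2 flips its sign & grows it, which is what a diverging loss curve looks like.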
What you can ignore for most ML jobs
If your goal is an applied ML role, you can often postpone:
Proof-heavy linear algebra
Measure theory and advanced probability
Advanced calculus beyond chain rule & gradients
Abstract optimisation theory
Advanced information theory
These become relevant for research-heavy roles but they are not the fastest path to job readiness for most candidates.
A six-week maths plan for ML jobs
This plan is designed so you build confidence & output at the same time. Aim for 4 to 5 sessions per week of 45 to 60 minutes. Each week ends with something you can publish.
Week 1: Linear algebra for shapes & similarity
Learn
vectors, dot product, norms, matrix shapes
Build
cosine similarity notebook
simple matrix shape checks in NumPy
Output
repo with a short README explaining cosine similarity & why it matters
Use 3Blue1Brown for intuition support if needed. YouTube
Week 2: Linear models as your foundation
Learn
linear regression & logistic regression conceptually
link to vectors & matrices
Build
logistic regression from scratch in NumPy
plot decision boundary on a toy dataset
Output
repo showing training loop, loss curve & evaluation
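As a starting point for this week's build, here is one possible vectorised version. The toy dataset, learning rate & step count are assumptions, & plotting is left out to keep the sketch dependency-free.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, learning_rate=0.1, steps=500):
    """Batch gradient descent on mean cross-entropy; returns weights, bias & loss curve."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    losses = []
    for _ in range(steps):
        p = sigmoid(X @ w + b)                      # predicted probabilities, shape (n_samples,)
        p_safe = np.clip(p, 1e-12, 1.0 - 1e-12)     # avoid log(0)
        losses.append(float(-np.mean(y * np.log(p_safe)
                                     + (1.0 - y) * np.log(1.0 - p_safe))))
        error = p - y                               # gradient of the loss w.r.t. the logits
        w -= learning_rate * (X.T @ error) / n_samples
        b -= learning_rate * error.mean()
    return w, b, losses

rng = np.random.default_rng(1)
# Toy dataset (assumed): two Gaussian blobs centred at (-1, -1) & (1, 1)
X = np.vstack([rng.normal(-1.0, 1.0, size=(100, 2)),
               rng.normal(1.0, 1.0, size=(100, 2))])
y = np.array([0.0] * 100 + [1.0] * 100)

w, b, losses = train_logistic_regression(X, y)
accuracy = float(np.mean((sigmoid(X @ w + b) >= 0.5) == (y == 1.0)))
print(f"first loss={losses[0]:.3f}, last loss={losses[-1]:.3f}, train accuracy={accuracy:.2f}")
```

The decision boundary is the line where w[0] * x1 + w[1] * x2 + b = 0, so you can plot it directly from the learned parameters. The accuracy here is measured on the training data, so treat it as a sanity check rather than an evaluation.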
Week 3: Probability, metrics & class imbalance
Learn
base rates, confusion matrix, precision-recall trade-offs
Build
imbalanced classification example
threshold tuning based on a simple cost story
Output
report notebook that explains metric choice in plain English
Khan Academy is a straightforward refresher. khanacademy.org
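For the threshold tuning part of this week, a minimal sketch is to attach a cost to each type of error & pick the threshold with the lowest total. The scores are simulated, & the costs (a missed positive costing 100 times a false alarm) are invented for the example. In practice they come from the business context.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated classifier scores (assumed): 5% positives, which tend to score higher
n = 2000
y_true = (rng.random(n) < 0.05).astype(int)
scores = np.clip(rng.normal(0.3 + 0.4 * y_true, 0.2), 0.0, 1.0)

COST_FALSE_NEGATIVE = 100.0   # assumed cost of missing a positive case
COST_FALSE_POSITIVE = 1.0     # assumed cost of one false alarm

def total_cost(threshold):
    """Total cost of errors if everything scoring at or above threshold is flagged."""
    y_pred = scores >= threshold
    fn = np.sum((y_true == 1) & ~y_pred)
    fp = np.sum((y_true == 0) & y_pred)
    return float(COST_FALSE_NEGATIVE * fn + COST_FALSE_POSITIVE * fp)

thresholds = np.linspace(0.05, 0.95, 19)
costs = np.array([total_cost(t) for t in thresholds])
best_threshold = float(thresholds[np.argmin(costs)])

print(f"best threshold: {best_threshold:.2f}, cost: {costs.min():.0f}")
print(f"cost at the default 0.5 threshold: {total_cost(0.5):.0f}")
```

Because missed positives are much more expensive than false alarms in this cost story, the best threshold lands below the default 0.5. Change the two cost constants & the chosen threshold moves with them, which is the argument to make in the report notebook.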
Week 4: Calculus intuition & autograd
Learn
derivative, chain rule, gradient descent intuition
Build
show gradient descent on a simple function
replicate using PyTorch autograd
Output
notebook explaining gradients plus a short “what autograd is doing” section
PyTorch autograd tutorial is a good anchor reference. docs.pytorch.org
Week 5: Model evaluation the professional way
Learn
train/validation/test split
cross-validation
leakage pitfalls
Build
cross-validated evaluation for a model
compare two models properly
Output
a model evaluation pack including cross-validation results & metric discussion
scikit-learn’s cross-validation guide is a reliable reference. scikit-learn.org
Its metrics documentation is useful when choosing scoring methods. scikit-learn.org
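Before relying on library helpers, it can be useful to write k-fold cross-validation by hand once. This NumPy sketch uses hypothetical helper names & synthetic regression data, & it fits the scaling inside each fold to illustrate the leakage point.

```python
import numpy as np

def k_fold_indices(n_samples, k, rng):
    """Yield (train_idx, val_idx) pairs; each sample is used for validation exactly once."""
    order = rng.permutation(n_samples)
    folds = np.array_split(order, k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

def fit_least_squares(X, y):
    X1 = np.column_stack([X, np.ones(len(X))])      # add an intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

rng = np.random.default_rng(3)
# Synthetic data (assumed): linear signal plus noise with standard deviation 0.5
X = rng.normal(size=(120, 3))
y = X @ np.array([1.5, -2.0, 0.0]) + 0.5 * rng.normal(size=120)

rmse_per_fold = []
for train_idx, val_idx in k_fold_indices(len(X), k=5, rng=rng):
    # Fit preprocessing on the training fold only, so validation rows never leak in
    mean = X[train_idx].mean(axis=0)
    std = X[train_idx].std(axis=0)
    X_train = (X[train_idx] - mean) / std
    X_val = (X[val_idx] - mean) / std
    coef = fit_least_squares(X_train, y[train_idx])
    residual = predict(coef, X_val) - y[val_idx]
    rmse_per_fold.append(float(np.sqrt(np.mean(residual ** 2))))

print("RMSE per fold:", np.round(rmse_per_fold, 3))
print(f"mean={np.mean(rmse_per_fold):.3f}, std={np.std(rmse_per_fold, ddof=1):.3f}")
```

When comparing two models, run both over the same folds & look at the per-fold differences, not just the two means. The per-fold spread tells you how much of the gap could be noise.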
Week 6: Optimisation & a mini deep learning capstone
Learn
SGD, batch size, learning rate schedules, regularisation
Build
small PyTorch model end to end
track loss curves, plus p95 inference latency on your machine if relevant
Output
a clean repo with README explaining data, model, loss, optimiser, evaluation & next improvements
PyTorch training loop tutorial is useful for structure. docs.pytorch.org
TensorFlow’s beginner quickstart is an alternative if you prefer Keras. TensorFlow
Portfolio projects that prove the maths
You want projects that show you can translate maths into working systems.
Project 1: From-scratch logistic regression
Shows
linear algebra, probability intuition, optimisation basics
Deliver
vectorised implementation
loss curve
confusion matrix
threshold tuning explanation
Project 2: PCA plus clustering insight report
Shows
linear algebra intuition plus good analysis practice
Deliver
scaling rationale
PCA plot with interpretation
note on what could mislead you such as leakage or batch effects
Project 3: Imbalanced classification decision note
Shows
probability thinking plus metric maturity
Deliver
PR curve discussion
chosen threshold based on costs
monitoring plan for base rate shift
Project 4: Tiny neural net training loop with diagnostics
Shows
gradients, optimisation, debugging maturity
Deliver
training curves for multiple learning rates
gradient norm logging
short note on overfitting signs & fixes
How to write this on your CV
Avoid “strong maths” as a claim. Use outcomes.
Implemented logistic regression using vectorised NumPy operations & gradient descent with documented loss curves
Evaluated imbalanced classifiers using precision-recall trade-offs & threshold tuning aligned to business costs
Applied cross-validation to compare models & prevent overfitting with clear metric justification
Built a PyTorch training loop using autograd plus optimiser configuration & training diagnostics
Resources section
Linear algebra
3Blue1Brown Essence of Linear Algebra playlist. YouTube
Probability & statistics
Khan Academy Statistics & Probability. khanacademy.org
Calculus, gradients & intuition
3Blue1Brown Essence of Calculus. YouTube
StatQuest playlists including gradient descent explanations. YouTube
Core ML maths reference
Stanford CS229 lecture notes PDF. cs229.stanford.edu
Model evaluation & selection
scikit-learn cross-validation guide. scikit-learn.org
scikit-learn model evaluation metrics documentation. scikit-learn.org
Deep learning frameworks for building projects
PyTorch autograd tutorial. docs.pytorch.org
PyTorch training loop tutorial. docs.pytorch.org
TensorFlow beginner quickstart & tutorials hub. TensorFlow