Data Engineer / Data Architect
Location: Penryn, Cornwall (on-site 3-4 days a week)
About Us:
Aspia Space is building the next generation of planetary intelligence. We transform observational data into trusted, consumable intelligence for agriculture, finance, environmental planning and policy.
From our offices in Harwell and Cornwall, we push Earth data to its limits to deliver insights no one else can. Our products combine satellite imagery and expert ground truth with the very best observational science and deep learning tools to support clients across the world, powering everything from smallholder farming programmes in Africa to biodiversity net gain compliance in the UK.
We’re a multidisciplinary team of scientists, engineers, and product strategists who believe in delivering practical impact. Our ambition is global, but our focus is always local, measurable, and relevant.
Role Overview:
We’re looking for a highly skilled Data Engineer / Data Architect who can hit the ground running and join us in our Penryn office. You will be instrumental in building, managing, and optimising our data infrastructure across both on-premises HPC clusters and cloud platforms. You’ll work closely with ML engineers and researchers to wrangle, clean, and prepare large datasets, including geospatial data, for training our large-scale AI models.
Key Responsibilities:
• Architect, design, and manage scalable data pipelines and infrastructure across on-premises and cloud environments (AWS S3, Redshift, Glue, Step Functions).
• Ingest, clean, wrangle, and preprocess large, diverse, and often messy datasets—including structured, unstructured, and geospatial data.
• Collaborate with ML and research teams to ensure data pipelines align with model training requirements and schedules.
• Develop and maintain robust metadata management and data versioning strategies.
• Optimise data workflows for performance, reproducibility, and cost efficiency.
• Implement automated processes for data quality checks, validation, and governance.
• Champion data security, compliance, and privacy best practices.
• Monitor and troubleshoot data issues in real time, ensuring high availability and integrity.
Essential:
• 3+ years of experience in data engineering, data architecture, or similar roles.
• Expert proficiency in Python, including popular data libraries (Pandas, PySpark, NumPy, etc.).
• Strong experience with AWS services, specifically S3, Redshift, and Glue (Athena a plus).
• Solid understanding of applied statistics.
• Hands-on experience with large-scale datasets and distributed systems.
• Experience working across hybrid environments: on-premises HPC clusters and cloud platforms.
• Proficiency with Linux, bash scripting, and git.
• Proven ability to write clean, maintainable, and testable code.
• Ability to thrive in a fast-paced, dynamic environment with shifting priorities.
• Excellent problem-solving and communication skills.
• Proximity to our Penryn office in Cornwall, UK.
Desired:
• Experience supporting machine learning workflows, especially for large model training.
• Familiarity with handling geospatial datasets and related libraries (e.g., GDAL, GeoPandas, Rasterio).
• Familiarity with data cataloguing tools and practices.
• Prior experience in a startup or high-growth tech company.
• Familiarity with containerisation (Docker), orchestration tools (Airflow, Prefect), and CI/CD workflows.
• Understanding of foundational MLOps and data-centric AI practices.
• Experience of working in an Agile environment.
What We Offer:
• The opportunity to shape the data backbone of a transformative AI company.
• A dynamic and collaborative work environment where initiative is valued.
• Competitive salary and company benefits including private health insurance.
• Hybrid work options.
• Access to cutting-edge compute infrastructure and tools.
How To Apply:
To apply, please send a PDF of your CV (and an optional cover letter) to Laura Botha at .
Applications will be reviewed on a rolling basis until the position is filled.
Please also indicate that you are aware this role requires you to be on-site in the Cornish office 3-4 days a week.