About Springer Nature Group
Springer Nature opens the doors to discovery for researchers, educators, clinicians and other professionals. Every day, around the globe, our imprints, books, journals, platforms and technology solutions reach millions of people. For over 180 years our brands and imprints have been a trusted source of knowledge to these communities and today, more than ever, we see it as our responsibility to ensure that fundamental knowledge can be found, verified, understood and used by our communities – enabling them to improve outcomes, make progress, and benefit the generations that follow. Visit and follow @SpringerNature / @SpringerNatureGroup
Job Title: Senior Data Engineer, Project Data
Location(s): London, Berlin
Springer Nature is one of the world’s leading global research, educational and professional publishers. It is home to an array of respected and trusted brands and imprints, with more than 170 years of combined history behind them, providing quality content through a range of innovative products and services. Every day, around the globe, our imprints, books, journals and resources reach millions of people, helping researchers and scientists to discover, students to learn and professionals to achieve their goals and ambitions. The company has almost 13,000 staff in over 50 countries.
About Us
We’re looking for a Data Engineer to join Springer Nature Operations. Springer Nature is a leading publisher of scientific books, journals and magazines with over 3,000 journal titles and one of the world’s largest corpora of peer-reviewed scientific text data. You would be joining a new programme of work to transform how Springer Nature uses its data: building up data capabilities, creating a data platform and engineering capability (technology, people and process) to create a foundation for the future, adding value to cross-organisation initiatives and kick-starting data-driven innovation.
Across the programme, our teams are cross-functional, diverse and made up of different experience levels. All team members collaborate to deliver the best solutions that satisfy our customers’ needs.
We are committed to growing and nurturing our people for the long term. We attend conferences and make time for people to explore new technology that interests them and is relevant to our work, and we host data engineering community-of-practice sessions to share knowledge.
We work in a supportive environment. We value face-to-face co-working and require an average of two days per week in the office, measured over a month.
The position is based in either London (UK) or Berlin (Germany), with some travel required, typically 3-4 times a year. This role is on a small, autonomous team and you will be expected to impact what we do and how we work. We like to keep our processes light, and bureaucracy slim.
About You
You have strong SQL and data problem-solving skills and several years of experience in data/software engineering on a cloud platform (AWS/GCP/Azure) using tools such as DBT and programming languages such as Python, Scala or Java. You have experience designing, building, delivering, and optimising production data solutions, such as data pipelines, implementing the data supply chain from source systems to a wide range of data product consumers. You factor in non-functional aspects of data pipeline development, including quality checks, cost-effectiveness, sensitive data handling, usage monitoring, and observability of data pipelines and data quality. You promote working in a cross-functional team where there is collective code ownership. You understand how your teams’ work can impact interdependent teams and design accordingly. You are comfortable with making large-scale refactorings of a codebase. You can facilitate and guide technical discussions to a workable outcome. You enjoy mentoring junior team members and act as a role model on the team. You understand distributed systems concepts and are familiar with the pros and cons of common data architectures, including data meshes. You are comfortable with moving between teams and departments to deliver the most impactful work.
You understand the benefits of agile software engineering practices when applied to data engineering, especially:
You understand the benefits of test-driven development and automation. You are comfortable with pair programming and practising continuous integration and continuous delivery. You see the value in developers owning production software and view failure as a learning opportunity. You take a user-first approach to the design and delivery of data products. You look to continually simplify and automate data solutions using new tools and techniques.
What you will be doing
Within 3 months you will:
Become familiar with our emerging Google Cloud technology stack and data landscape. Understand the data requirements and issues facing our users, which include data analysts, BI developers, and AI/ML engineers. Collaborate effectively with each discipline on the team. Actively participate in technical discussions and share ideas. Understand the role of architects, data governance and other data engineers in the organisation.
By 3-6 months you will:
Have an understanding of the team’s context within the wider organisation. Be a supportive member of the team, leading the development of solutions using the appropriate technology to solve the problem at hand. Oversee support queries and diagnose issues in our live applications. Identify new sources of data across the organisation and build relationships with data providers to gain access. Understand the processes by which data is acquired and any resulting limitations or bias, and communicate this to the team. Lead the development and maintenance of data pipelines into systems like BigQuery to analyse, clean, and join datasets in an automated, repeatable manner. Consider how new data product views can better serve users' needs. Ensure that data is stored securely and in compliance with GDPR. Work with data owners to understand how we can allow them to self-serve their data using tools we develop.
By 6-12 months you will:
Develop processes and tools to monitor feeds, test data integrity and completeness, and alert users when a problem occurs. Understand our customers’ needs, both internal and external, and how your work affects their experience. Identify and suggest next steps for addressing architectural shortcomings in company-wide data flows. Be able to gauge the complexity or scope of a piece of work, breaking it into smaller pieces when appropriate.
Give and receive constructive feedback within your team. Mentor junior members of the team in the principles of data engineering and promote best practices. Promote and advocate the use of data across Springer Nature and contribute to the data engineering community of practice.
Day-to-day responsibilities
As part of a Lean/Agile data product team, day-to-day you will:
Contribute to team interaction, such as stand-ups, story writing, collaborative design sessions, retrospectives and subsequent actions. Actively engage in pair programming sessions, fostering a culture of knowledge-sharing and collective code ownership. Take part in the support and monitoring of our services.