Job Title: Data Engineer (Scala & Spark) - London
Get The Future You Want!
Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.
Your Role
As a Data Engineer within the exciting, new Finance Risk and Data Analytics capability, you will be building big data solutions to solve some of the organisation’s toughest problems and delivering significant business value.
This is an exciting time to join as you will be helping to shape the Reference Data Mastering and Distribution architecture and technology stack within our new cloud-based data lake-house.
What You Will Be Doing
- Shape the portfolio of business problems to solve by building detailed knowledge of internal data sources.
- Model the data landscape, obtain data extracts and define secure data exchange approaches.
- Acquire, ingest, and process data from multiple sources and systems into the Cloud Data Lake.
- Operate in a fast-paced, iterative environment while remaining compliant with the bank’s Information Security policies/standards.
- Collaborate with others to map data fields to hypotheses and curate, wrangle, and prepare data for use in
What We Need
- Experience in software development, including a clear understanding of data structures, algorithms, software design and core programming concepts.
- Comfortable multi-tasking, managing multiple stakeholders and working as part of an agile team.
Your profile
- Meaningful experience in the following technologies: Scala, SQL
- Experience and interest in Cloud platforms such as Azure (preferred) or AWS
- Experience in Distributed Processing using Apache Spark
- Ability to debug using tools like Ganglia UI, and expertise in optimizing Spark jobs
- The ability to work across structured, semi-structured, and unstructured data, extracting information and identifying linkages across disparate data sets.
- Expert in creating data structures optimized for storage and various query patterns, e.g., Parquet and Delta Lake
- Meaningful experience in at least one database technology such as:
  - Traditional RDBMS (MS SQL Server, Oracle)
  - NoSQL (MongoDB, Cassandra, Neo4J, Cosmos DB, Gremlin)
- Understanding of Information Security principles to ensure compliant handling and management of data.
- Experience in traditional data warehousing / ETL tools (Informatica, Azure Data Factory)
- Ability to clearly communicate complex solutions.
- Proficient at working with large and complex code bases (GitHub, Gitflow, Fork/Pull Model)
- Working experience in Agile methodologies (SCRUM, XP, Kanban)
About Capgemini
Capgemini is a global leader in partnering with companies to transform and manage their business by harnessing the power of technology. The Group is guided every day by its purpose of unleashing human energy through technology for an inclusive and sustainable future. It is a responsible and diverse organization of over 360,000 team members in more than 50 countries. With its strong 55-year heritage and deep industry expertise, Capgemini is trusted by its clients to address the entire breadth of their business needs, from strategy and design to operations, fueled by the fast-evolving and innovative world of cloud, data, AI, connectivity, software, digital engineering, and platforms. The Group reported 2022 global revenues of €22 billion.
Get The Future You Want | www.capgemini.com