At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world's most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and engineering services across all industries. Join us for a career full of opportunities. Where you can make a difference. Where no two days are the same.
Your role
As a Site Reliability Engineer on the Engineering Operations team, you will contribute to every phase of the software development lifecycle. In our fast-paced, cloud-native environment, you'll be responsible for scaling and managing AWS-based infrastructure, designing for high availability, and maintaining reliable CI/CD pipelines. You'll work closely with development, security, privacy, and quality teams to deliver secure, scalable, and resilient systems.
Our tech stack is centered around AWS, with services built in Kotlin and deployed using modern orchestration and telemetry tools. A strong foundation in distributed systems, observability, and cloud-native design is essential. Experience with AI/ML to enhance observability, automate incident response, and enable self-healing capabilities is a valuable plus.
Your profile
Essential Qualifications
- Proven experience in operationalizing large-scale, distributed, fault-tolerant, multi-tenant systems in production environments.
- 5+ years of experience working with AWS services, including but not limited to EC2, S3, EKS, DynamoDB, EBS, CloudFormation, Lambda, VPC, and Route 53.
- At least 5 years of experience with core SDLC and CI/CD processes, along with SRE concepts such as monitoring, alerting, and incident management.
- Experience working within a DevOps operating model, with exposure to data analytics and AI/ML use cases in infrastructure and operations.
- BS degree in Computer Science or equivalent field.
Preferred Qualifications
- AWS certifications (e.g., Certified Solutions Architect, Certified Developer).
- Launched and operated commercial products and services based on AWS (references required); specific examples in FinTech are a strong plus.
- Published papers or given talks on leading multi-functional projects.
- Familiarity with managing interdependent systems at scale, including both interactive and batch-oriented systems.
- Working knowledge of ML and GenAI concepts, including LLM architectures, attention mechanisms, function calling, and agentic workflows.
- Experience with AI/ML in observability and automation, such as proactive monitoring, anomaly detection, and root cause analysis using tools like Datadog, Splunk, or New Relic. Expertise in using frameworks such as LangChain or Spring AI to build AI-powered applications integrated with APIs and vector databases is a big plus.
If you're excited about this role but don't meet every requirement, we still encourage you to apply. Your unique experience could be just what we need.
What you'll love about working here
- Open access to digital learning platforms
- Active employee networks promoting diversity, equity, and inclusion, such as OutFront, CapAbility, or Women@Capgemini
- A work environment recognized by Ethisphere as one of the World's Most Ethical Companies
Need to know
- All roles will require a level of security clearance: BPSS, Security Clearance, or Developed Vetting.
- You can bring your whole self to work. At Capgemini, building an inclusive future is part of everyday life and will be part of your working reality. We have built a representative and welcoming environment for everyone.
Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With a strong heritage of over 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, generative AI, cloud and data, combined with its deep industry expertise and partner ecosystem.