We are seeking a highly motivated and experienced Service Delivery Manager to join our growing Platform Operations team. The portfolio includes Platform Operations, Site Reliability Engineering and Global Release Management. You will be part of the GCP Cloud Platform team, which is responsible for deploying cloud-native and third-party services for building an Analytics & Data Science platform. In this role, you will be pivotal in ensuring the smooth operation, release, reliability, and scalability of our analytics platform built on Google Cloud Platform (GCP). You will lead a team of release and support engineers to provide technical support, manage incidents and releases, and contribute to the continuous improvement of our platform operations.
What you will do
Build strong relationships with both internal and external stakeholders, including product, business and sales partners. Demonstrate excellent communication skills, with the ability to simplify complex problems and to dive deeper when needed
Manage teams with cross-functional skills, including software, quality and reliability engineers, project managers and scrum masters. Mentor, coach and develop junior and senior software, quality and reliability engineers
Collaborate with the architects, SRE leads and other technical leadership on strategic technical direction, guidelines, and best practices
Ensure compliance with EFX secure software development guidelines and best practices, and take responsibility for meeting and maintaining QE, DevSec, and FinOps KPIs
Define, maintain and report SLAs, SLOs, and SLIs that meet EFX engineering standards, in partnership with the product, engineering and architecture teams
Drive technical documentation, including support documentation, end user documentation and runbooks. Lead Sprint planning, Sprint Retrospectives, and other team activities
Implement architecture decisions associated with Product features/stories, refactoring work, and EOSL
Create and deliver technical presentations to internal and external technical and non-technical stakeholders, communicating with clarity and precision and presenting complex information in a concise, audience-appropriate format
Provide coaching, leadership and talent development; ensure the team functions as a high-performing team; identify performance gaps and opportunities for upskilling and transition when necessary. Drive a culture of accountability through actions, stakeholder engagement and expectation management
Develop the long-term technical vision and roadmap within, and often beyond, the scope of your teams. Oversee system designs within the scope of the broader area, and review product or system development code to solve ambiguous problems
What experience you need
Bachelor's degree or 7-12 years of software engineering experience.
Led design and development of solutions using event-driven and RESTful API architecture
Worked with SREs and dev teams to define and maintain SLAs, SLOs, and SLIs that meet EFX quality engineering standards
Collaborated with the architects, SRE leads and other technical leadership on strategic technical direction, guidelines, and best practices. Collaborated with the Product team, bringing a product mindset
Drove technical documentation, including support documentation, end user documentation and runbooks. Led Sprint planning, Sprint Retrospectives, and other team activities
Managed multiple development squads in a supervisory capacity, assuming all HR responsibilities, including hiring and interviewing, retention, training, and team capacity planning
Implemented architecture decisions associated with Product features/stories, refactoring work, EOSL, and continuous improvements while ensuring adherence to EHB guidelines
Created and delivered technical presentations to internal and external technical and non-technical stakeholders, communicating with clarity and precision and presenting complex information in a concise, audience-appropriate format
Experience building and managing strong technical teams that deliver complex software solutions that scale
Deep troubleshooting skills with the ability to lead and solve production and customer issues under pressure
Strong experience with full-stack software development and public cloud platforms such as GCP and AWS is preferred
Cloud certification strongly preferred
What could set you apart
Strong understanding of GCP services, including Vertex AI, MLOps, Compute Engine, Cloud Storage, Cloud Monitoring, and Cloud Logging.
Experience managing platform high availability and business continuity plans.
Familiarity with Linux operating systems and scripting languages (e.g., Bash, Python).
Excellent problem-solving, analytical, and communication skills.
Ability to work independently and as part of a team in a fast-paced environment.
Strong attention to detail and a commitment to high-quality work.
Experience with source control management systems (e.g., Git, Subversion) and build tools like Maven.
To adhere to our corporate location policies, you will be required to be local to the Atlanta, GA area. You are required to adhere to our Return To Office (RTO) / weekly onsite requirements (Tuesday, Wednesday, and Thursday).