Our client is seeking a dynamic Senior Site Reliability Engineer to join their team. This role is ideal for a candidate well-versed in modern reliability disciplines, capable of driving cross-team reliability initiatives. These initiatives include enhancing our client's reliability engineering practices through increased application resiliency, improved uptime/availability, and optimized application performance.
Key Responsibilities:
- Setting SLOs/SLIs/error budgets and managing reliability for infrastructure and applications
- Scripting and automation with languages and tools such as JavaScript, Node.js, Python, Bash, Maven, and Ansible
- Managing diverse systems with configuration management tools such as Puppet, Chef, and Ansible
- Leveraging automation for toil elimination
- Using tools like PagerDuty for managing incidents
- Monitoring and alerting systems like Prometheus, Grafana, Dynatrace
- Working with standard networking protocols and components
- Experience in Serverless Application Framework
- Managing containerized workloads and platforms such as Docker or Kubernetes
- Familiarity with distributed systems including Microservices
- Infrastructure automation tools such as CloudFormation, Terraform
- Understanding of CI/CD processes and deployment automation tools
- Debugging, troubleshooting, and problem-solving
- Effective communication, collaboration & negotiation skills
- Liaising with developers, operations staff, and third-party resources
- API integration projects
- Coaching/mentoring team members on reliability engineering aspects
Required Experience:
- 5+ years of experience in DevOps practices
- Hands-on experience with AWS Cloud and DevOps principles
- 2+ years of experience working with DevOps tools (GitLab CI, AWS CodePipeline)
- 2+ years of experience with scripting languages (Bash, Python, etc.)
- 1+ year of experience developing Node.js or TypeScript applications
- 2+ years of experience building and supporting applications in AWS
- 1+ year of experience with AWS CDK
Preferred Experience:
- Experience with containerization technologies such as Kubernetes, OpenShift, and Docker
- Experience evaluating application resiliency using AWS FIS
- Experience using Litmus for chaos engineering
- Exposure to Red Hat OpenShift Service on AWS (ROSA)
As a lead engineer with our client's team, you will be at the forefront of Cloud and Big Data technology. This role will support highly available, business-critical applications and serve as the escalation point for complex issues in both on-premises and AWS environments. We are seeking talented engineers who are well versed in DevOps technologies, automation, infrastructure orchestration, configuration management, and continuous integration.
Best Buy |