Title:  Engineering Manager - PSRE

Location: 

Hyderabad, TG, IN

Description: 

Team Summary

We are looking for an experienced Engineering Manager to lead our Site Reliability Engineering (SRE) team. The ideal candidate will have a strong background in SRE principles and practices, as well as experience managing and mentoring engineers. The SRE Manager will be responsible for the overall success of the SRE team, including ensuring that our systems are reliable, scalable, and secure. The team is responsible for monitoring the stability and availability of mission-critical production systems, managing incidents to speed up resolution, and establishing business-as-usual (BAU) operations. The team also builds tools and infrastructure that all development teams use for monitoring and troubleshooting.

What You'll Do

  • Lead and manage the SRE team in the design, implementation, and operation of our SRE practices and processes.
  • Lead and manage a team of engineers, providing coaching, technical guidance, mentorship, goal (OKR) and performance management, and career management for your direct reports.
  • Work with other engineering teams to ensure that our systems are designed and implemented in a way that is reliable, scalable, and secure.
  • Represent the SRE team to other stakeholders within the company.
  • Operations management
  • Manage on-call rotations to provide 24-hour coverage.
  • Provide day-to-day support of the dashboard, including responding to outages and triaging cases escalated by clients and internal teams.
  • Bring a flair for automation, seeking opportunities to automate manual processes and service catalog items.
  • Own operational success by continuously monitoring the stability and tech KPIs of the team and remediating any issues.

What You'll Need

  • 10+ years of experience in SRE or a related field.
  • Strong understanding of SRE principles and practices.
  • Experience with observability tools.
  • Experience with incident response and management.
  • Reliability: Exposure to chaos engineering and other reliability practices, including disaster recovery, is good to have.
  • Experience with cloud computing platforms such as AWS.
  • Experience with Kubernetes.
  • Experience with Agile practices (Scrum).
  • Excellent analytical, problem-solving and troubleshooting skills.
  • Excellent communication and presentation skills.
  • Experience managing and mentoring engineers.

Arcesium and its affiliates do not discriminate in employment matters on the basis of race, color, religion, gender, gender identity, pregnancy, national origin, age, military service eligibility, veteran status, sexual orientation, marital status, disability, or any other category protected by law. Note that for us, this is more than just legal boilerplate. We are genuinely committed to these principles, which form an important part of our corporate culture, and are eager to hear from extraordinarily well-qualified individuals with a wide range of backgrounds and personal characteristics.

 

Arcesium's Personal Data Privacy Notice for Candidates is linked at the bottom of this page.