Senior Site Reliability Engineer

Posted 17 September 2024
Salary Market related
Location Dublin
Discipline Software Engineering, Systems & Infrastructure
Reference 9822
Contact Jason O’Donoghue

Job description

Senior & Staff Site Reliability Engineers in Dublin

 

An amazing opportunity to join a new SRE team in Dublin!

 

We at Stelfox are collaborating with an innovative U.S.-based company that is transforming the energy landscape by leveraging otherwise wasted energy to power cutting-edge high-performance computing data centers. These centers are the backbone for customers running energy-demanding operations such as AI with GPUs, large language models (LLMs), bitcoin mining, and other intensive computational services.

 

We are actively seeking Senior and Staff Site Reliability Engineers to be part of this newly established team in Dublin. This team will work in close partnership with an existing SRE group in the U.S., ensuring seamless 24/7 operational support for customers and their mission-critical services.

 

What you'll be doing:

 

  • Monitoring and analyzing overnight alerts and system performance.
  • Automating standard procedures and developing tools to enhance monitoring capabilities.
  • Collaborating closely with software engineering teams to ensure system resilience prior to deployments.
  • Managing a range of tasks to maintain a robust and efficient operational environment.

 

What we're looking for:

 

  • A bachelor's degree, master's degree, or equivalent, plus more than 8 years of commercial experience.
  • 5+ years in SRE, Software, or Systems Engineering, preferably within a 24/7 operational setting.
  • Significant experience in system/software architecture and design for new or existing systems, such as monitoring systems, compute platforms, and internal tools.
  • Proficiency in at least one programming language, such as Python or Go.
  • Strong experience with modern infrastructure tools like Docker, Kubernetes, Ansible, CloudFormation, Terraform, etc.
  • Proficiency with logging, monitoring, and alerting tools.
  • Experience with CI/CD and build systems, ensuring smooth deployments alongside software engineering teams.
  • Solid experience with Linux/Unix environments, networking protocols, and standards.

 

Note: Interviews for these roles will be conducted over the next 2-3 weeks. Don't miss your chance to be part of something groundbreaking!

Please note:

We have a number of similar positions open now, and expect more in the future, which we would like to discuss with you if you indicate your interest in this role. When we receive your application for this role, we will contact you to explain our process for those other positions.

Stelfox is fully compliant with GDPR. You can read more in our privacy policy: https://www.stelfox.com/privacy-policy-gdpr/

Your shared data will not be disclosed or transferred to a third-party data controller or data processor located outside the EEA unless we have obtained your express consent.

We look forward to working with you.