Lead Site Reliability Engineer

Join our Platform Engineering Team in Sheffield

The role: Lead Site Reliability Engineer (SRE)

Egress presents a job description with a difference: a first-hand account of the role you are applying for, from the people currently doing it.

We are looking for someone to join us as a Lead Site Reliability Engineer (SRE) within the wider Platform Engineering team based in Yorkshire. If the Platform Engineering team function sounds interesting and you feel that you have a suitable level of experience and the technical skills to succeed in the role, then please do get in touch. In addition to becoming a vital part of the Platform Engineering team, we would require this person to contribute to pipeline development and advanced scripting and coding. Exposure to specific languages is not important, but a willingness to learn new ones certainly is!

Responsibilities

  • Lead the development engagement to improve and standardise operational requirements and ensure operational supportability across all products
  • Identify opportunities for improving cost efficiency programmes
  • Work closely with the development team on application architecture to increase operational availability, resilience, supportability, and scalability
  • Work closely with the VP of architecture to drive standardisation in development for operational efficiencies
  • Identify and implement KPIs to measure success of specific initiatives
  • Research and provide guidance on cloud native technologies
  • Ensure security is built into design and development
  • Implement quality assurance within operational initiatives

Who are the Platform Engineering Team and what do they do?

“The team’s role within Egress is to deploy and maintain the infrastructure where most, if not all, of Egress Products live. Our main purpose is to get product features from code developed by the Development teams and transform it into infrastructure so our customers can ultimately use it, while keeping this infrastructure secure and updated.”

“The Platform Engineering team at Egress play a pivotal role in providing deployment and support for infrastructure hosted in the cloud. This ranges from creating infrastructure diagrams to writing infrastructure as code and deploying pipelines for applications.”

“We are a small team providing 24/7/365 support, managing everything from single servers to huge clusters of servers for a wide range of customers, and dealing with practically every team in the business to get a customer’s dream out to production.”

“We are responsible for maintaining a fleet of over 1000 VMs and other cloud resources spread across Azure, AWS and other providers. We use tools like Terraform and Salt to automate the deployment of infrastructure and software whenever possible.”

“Our team play one of the most vital roles within the company. The developers need a platform to be able to host their applications safely, and without the PET (Platform Engineering Team) that platform wouldn’t exist.”

“We are highly flexible, adaptable and pivotal to the success of the organisation with close relationships to technical services, developers, senior management and more.”

What skills are needed for this role?

“The key focus for the Lead Site Reliability Engineer role is to have someone who is familiar with pipeline development and advanced scripting/coding, rather than just being a systems administrator.”

“Platform Engineering requires the ability to understand the software development process and the code being written to be able to design infrastructure to work well with the software.”

“Experience of software development, database administration and customer support can all feed in.”

“People with an enquiring mind and broad experience in IT. You need to feel comfortable solving problems with potentially unfamiliar technology and be confident in the solutions you find.”

“Systems administration skills are still vital for sure! A good understanding of Linux/Windows and cloud technology, as well as an interest in learning, using and applying new technologies.”

“Ability to write scripts to automate processes and monitoring systems.”

“We use many cloud technologies, ranging from tooling to cloud vendors. Having tools such as Terraform, Salt and Kubernetes/Docker at our disposal helps us deploy new infrastructure and applications.”

“We always aspire to go ‘Cloud-First’ and use what is available to us within providers such as Azure. Why manage an SQL Cluster when someone else can?”

“Breadth of skills is more important than specific skills; end-to-end solution understanding from infrastructure to UI is key.”

“It’s beneficial to be comfortable making decisions under pressure, as Egress manages systems critical for major organisations like the NHS and UK Government, and a wrong decision made by one platform engineer could have a wide-reaching impact.”

“Knowledge sharing is also an excellent habit to have.”

Who would suit a role in the Platform Engineering Team?

“Experience is a nice thing to have, but in the modern world, with how rapidly cloud technologies are expanding, it is more about the thrill of learning new technologies and expanding your personal knowledge. Always having that desire to learn new technologies and push yourself further is a fantastic trait.”

“Someone who wants a varied job that combines aspects of software development and system administration. People who want to focus solely on one project for weeks on end would not be a good fit as the work changes often and we have to manage incoming support requests and incidents alongside project work.”

“A willingness and interest to look beyond what you already know.”

“Oh, and of course you need a good personality! This team is very busy, so being calm under pressure is important, and out-of-hours work is sometimes required.”

Privacy and your data

Please take the time to check and read our recruitment privacy policy – you can find it at www.egress.com/legal/recruitment-privacy. The information you provide to us when you apply will be held, stored and processed by Egress Software Technologies Limited in accordance with it.

Any job offer that we may make to you will be subject to you successfully passing background checks.

Location

While this role is based in Sheffield, we will also consider home-based applicants. 

Benefits

Social
  • 25 days annual leave
  • Monthly fully funded office socials (Trampolining, Bowling, Rounders, Sports Day, Pub Quiz, Board Games Night etc.)
  • Work hard, play hard culture
  • Invite to our company Tech Summit (3-day funded retreat)

Physical
  • BUPA Private Health Cover
  • Cycle to work discount scheme
  • Free breakfasts

Financial
  • Pension scheme
  • Childcare voucher scheme

Sheffield

Office details

About our Sheffield office

Our Sheffield office is conveniently located in the centre of the city's new and growing digital campus.

It's a short walk from Sheffield train station, and an even shorter walk to Sheffield city centre and the Sheffield Hallam University campus.
