Job description
• As an SRE, you create a bridge between development and operations by applying a software engineering mindset to system and application administration topics.
• You implement, maintain and support CI/CD tools and pipelines across our operational environments (dev, non-prod and prod) in close collaboration with the different feature teams.
• You ensure that the service is automated and that proper follow-up is in place. To achieve this, you need to combine application development with application administration skills.
• You add and follow up on the non-functional and reliability requirements in the backlog of each feature or product team.
• As an SRE, it is essential that you have an in-depth understanding of the operational reality. This is why approximately half of your time will be spent on making sure the operational service is guaranteed. The other half you will spend on making the service more resilient and on automating ‘the hell’ out of everything.
• When implementing new products and technologies, you will provide an objective evaluation and substantiate your arguments for supporting the new product or technology, in close collaboration with product development and our internal security risk team. The goal is a solution secured to financial-industry grade.
• You take the lead in defining standards and methodologies around DevOps and Site Reliability Engineering.
• You will act as a thought leader and bring in new approaches and ways of working, in close collaboration with the product development, internal security and SRE teams.
• You will participate in a 24/7 duty roster every 4-5 weeks
• You have at least 5 years of previous experience in application development or operations
• You have coding skills, specifically for pipeline and automation code
• You can interact on a technical level with developers, SREs and infrastructure administrators
• You have a technology leader mindset: you are always looking to improve products and processes and to anticipate upcoming features and tools while keeping the existing set in mind.
• You have real working experience with public cloud solutions such as IBM Cloud, Google Cloud, AWS and Azure, and/or on-premise solutions based on OpenStack and OpenShift.
• You know concepts such as service discovery and service mesh
• You have a genuine DevOps and SRE mindset
• You have expertise in, and are passionate about, containerization technologies running on OpenShift/Kubernetes and the related tooling (Docker, Terraform, Ansible, AWX, Helm, Jenkins)
• You have a strong security mindset and focus: the solutions you propose have security built in from the start, at financial-industry grade
• You have knowledge of CI/CD principles and working experience in setting up pipelines and automated application deployment
• You have good knowledge of infrastructure: networking (routing, load balancing, firewalls), storage (NAS, SAN) and virtualization (VMware, Red Hat OpenStack)
• You know about agile and Large-Scale Scrum (LeSS) and believe in their benefits.
• You take initiative to challenge the status quo if it improves the quality of the overall product.
• You are fluent in English (at least CEFR level C1)
Within the financial landscape, Isabel is a key player with great ambitions. Our multibanking business is a major component of the corporate financial landscape, moving 400 million transactions a year for a total of 2,500 billion euros. A lot is moving in this area, with new legislation and new technologies challenging the status quo.
The key challenge for the IT Operations teams is to have applications and infrastructure robust enough to handle these volumes while at the same time having the agility and speed to market of a fintech.
A lot of new business initiatives are being launched, supported by the adoption of state-of-the-art technologies and new methodologies. Hybrid Cloud, PaaS, DevOps and SRE principles are good examples of how the IT Operations team supports the business need for a faster and easier go-to-market.
We are currently looking for a Senior Site Reliability Engineer (SRE) to implement, maintain and support deployment tools and standards for various technological stacks across our operational environments.
An SRE is fundamentally doing work that has historically been done by an operations team, but using engineers with software expertise and banking on the fact that these engineers are inherently both predisposed to, and have the ability to, substitute automation for human labor. In general, an SRE team is responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.