Site Reliability Engineer - Integrations

HORAteka
Cairo, Egypt
26 days ago

Job Description

Our client is seeking a Site Reliability Engineer (SRE) to join their Integration Factory team. This role is pivotal in ensuring the reliability, scalability, and performance of integration platforms and services. You will work at the intersection of software engineering and operations, with a focus on performance and availability, automation, observability, and continuous improvement of integration services — including problem management, reducing user-created incidents, and achieving a mean time to recovery (MTTR) of 48 hours or less.

  • Maintain and enhance the reliability and availability of integration platforms (e.g., API gateways, message brokers, ETL pipelines).
  • Design and implement monitoring, logging, alerting, and observability to ensure system health and performance.
  • Contribute to integration design by defining monitoring and end-to-end observability requirements.
  • Automate deployment, scaling, and recovery processes using Infrastructure as Code (IaC) and CI/CD pipelines.
  • Collaborate with API and event consumers, integration product managers, integration developers, and integration architects to ensure best practices and continuous improvement in system design and deployment (e.g., feature prioritization).
  • Troubleshoot and resolve incidents in production environments, performing root cause analysis and postmortems.
  • Define and track performance and availability metrics, Service Level and Operating Level Agreements (SLAs, OLAs), mean time to resolve (MTTR), and customer and peer satisfaction (NPS, P4G).
  • Continuously improve system resilience, fault tolerance, and recovery strategies.
  • Work closely with the integration support team to ensure accurate reporting and effective incident handling.
  • Work with the automated testing and observability teams to validate that monitoring points effectively detect and report issues.
  • Determine which dashboards should be created in the observability platform.

Requirements

Required:

o A passion for creating robust, scalable platforms that accelerate innovation.

o Bachelor’s degree in computer science, engineering, a related technical field, or equivalent practical experience.

o 5+ years of experience in Site Reliability Engineering, DevOps, and/or similar roles (e.g., Level 2 and 3 Operations Engineer, Integration Development).

o Strong understanding of core integration design principles and patterns (REST, GraphQL), authentication methods (OAuth, API Keys), and data formats (JSON, XML).

o Proficiency in scripting and automation (e.g., Python, Bash, Terraform, and Ansible).

o Experience with cloud platforms (e.g. Azure).

o Familiarity with monitoring and observability tools (e.g. Dynatrace).

o Solid understanding of CI/CD pipelines (e.g., Azure DevOps), containerization (Docker), and orchestration (Kubernetes).

o Exceptional communication and influencing skills, with a demonstrated ability to lead by consensus and drive standardization across multiple teams.

o Strong analytical and problem-solving skills, with a data-driven approach to decision-making.

o Proficiency in English as a day-to-day business language is required.

Preferred:

o Previous experience as a software developer, solutions architect, or in a similar technical role.

o Hands-on experience with enterprise integration platforms and technologies.

o Specific experience with SAP integration tools and interfaces such as SAP BTP Integration Suite, SAP BTP API-M, OData services, APIs, IDocs, and RFCs.

o Familiarity with event-driven architecture and streaming platforms like Apache Kafka.

o Experience with cloud-based API management services, particularly Azure API Management.

o Experience with agile development methodologies (e.g., SAFe, Scrum, Kanban).

o Familiarity with DevOps practices and tools.

o Experience with Azure, D365, SAP S/4HANA, and SAP MDG. SAP and Microsoft Azure certifications are a plus.

o Excellent leadership and communication skills.

o Strong problem-solving and decision-making abilities.

o Strong organizational and time-management skills.

o Ability to work in a fast-paced and dynamic environment.

o FMCG background or experience working with FMCG companies.