
Site Reliability Engineer (Egypt, USA, Philippines, Mexico)
- Pretoria, Gauteng
- Permanent
- Full-time
This role involves working with a team of talented SREs/DevOps Engineers to support highly scalable services. Responsibilities include:
- Build and maintain pipelines in accordance with requirements such as availability, performance, security and maintainability standards.
- Maintain services through monitoring of metrics, system health, and analysis of reports.
- Provide support for production and in-house systems. Participate in the on-call production support rota.
- Perform incident management, on-call support and root cause analysis, conducting post-incident reviews and 5-Whys.
- Remediate system vulnerabilities and strengthen security and resiliency measures.
- Improve processes and systems within the Program.
- Lead incident management efforts by proactively monitoring and analyzing ISO 8583 financial transaction messages across the 4-party payment model (Cardholder, Merchant, Acquirer, Issuer).
- Experience with CI/CD and Build pipelines using Jenkins.
- Experience in public and private Cloud offerings (PCF, Azure, AWS etc.).
- Knowledge of NoSQL & SQL databases such as Mongo / Oracle/
- Experience and knowledge of managing distributed systems and working
- Exposure to working with Monitoring and Alerting tools such as Splunk,
- Familiarity with defining SLOs and SLAs.
- Prior experience of working in an SRE/DevOps team and excellent understanding of SRE/DevOps principles.
- High degree of initiative and self-motivation, with a willingness to take on