
Senior DevOps Engineer
- Cape Town, Western Cape
- Permanent
- Full-time
- Hands-On Development - Design, implement, and optimise AWS infrastructure through Infrastructure as Code, ensuring environments are robust, scalable and cost-effective.
- Automation & CI/CD - Be an authority on CI/CD pipelines that deliver fast, secure and seamless deployments, with a strong emphasis on improving developer experience by automating routine tasks and increasing operational efficiency.
- Platform Reliability - Ensure high availability, scalability, and resilience of our platform, leveraging managed services and applying the AWS shared responsibility model to clearly define and manage the division of responsibilities for security, compliance, and maintenance. Participate in Disaster Recovery testing to ensure platform resilience and business continuity.
- Monitoring & Observability - Take ownership of proactive observability using Datadog and other tools to monitor system health, performance, and security, making sure we can see and fix issues before they impact users.
- Cloud Security & Best Practices - Apply cloud and security best practices, including patching and secure configuration of networking, encryption (at rest and in transit), secrets and identity/access management.
- Collaboration - Work closely with development, testing, and security teams to support application delivery and platform improvements.
- Continuous Improvement - Contribute ideas and solutions to evolve our DevOps processes, platform reliability, and developer experience.
- Mentorship - Mentor less experienced engineers.
- AI & Future Tech - We want to push the boundaries of AI-driven development - if you have ideas on how to embed AI into our DevOps processes, you'll have the space to explore them.
- Tech stack - We use Terraform, Terragrunt, Ansible, Helm, Python, Bash, AWS (EKS, Lambda, EC2, RDS/Aurora), Linux & GitHub Actions. You're comfortable with all of these and are an expert in Terraform and IaC principles, CI/CD and multi-region AWS environments.
- Strong experience with AWS networking (VPC, Subnets, Security Groups, API Gateway, Load Balancing, WAF, NAT & Transit Gateways, Network Firewall) and cloud configuration (Secrets Manager, IAM, KMS, SCPs).
- Proven experience running a production-grade Kubernetes platform, utilising ArgoCD, Istio & deployment strategies (blue/green & canary).
- Familiarity with Cloud Security services such as Security Hub, GuardDuty, Inspector and vulnerability management/patching. Knowledge of security tooling (SIEM/SOC, CrowdStrike & Rapid7) and compliance frameworks such as CIS Benchmarks, OWASP & PCI DSS (v4) is advantageous.
- Familiarity with Disaster Recovery and Resilience Strategies - Experience with automated AWS backup solutions and disaster recovery tools such as AWS Fault Injection Service (FIS).
- Observability Mindset - You believe in measuring everything. You've worked with Datadog (or similar) to ensure teams have visibility into platform health and security.
- Experience with embedding AI into DevOps processes is advantageous.