Site Reliability Engineer

  • Centurion, Gauteng; Pretoria, Gauteng
  • Permanent
  • Full-time
  • 13 days ago
Our client, a Mobile Virtual Network Operator (MVNO) aggregator (MVNE), integrates with host networks at the charging and provisioning tiers and is recognised for its innovative practices. They are looking for a Site Reliability Engineer with a specialised focus on Kubernetes to join their team. The successful candidate will be responsible for the robustness, scalability, and efficiency of the client's infrastructure and applications. Key duties include Kubernetes CI/CD implementation, monitoring, log analysis, application issue resolution, Linux and Active Directory management, DNS administration, end-user support, and PostgreSQL database administration. Candidates should have at least 3 years of experience in a Site Reliability Engineer role or a similar position.
DUTIES:
Kubernetes CI/CD:
  • Designing, implementing, and maintaining CI/CD pipelines for Kubernetes-based applications.
  • Automating deployment processes and ensuring continuous integration and delivery of software.
Monitoring and Reporting:
  • Implementing monitoring solutions for infrastructure and applications using tools such as Prometheus, Grafana, and Kubernetes-native monitoring.
  • Generating reports on system performance, availability, and reliability.
Log Analysis:
  • Analysing logs and metrics to identify trends, anomalies, and performance issues.
  • Implementing log aggregation and analysis solutions like ELK Stack or Splunk.
Application Troubleshooting:
  • Investigating and resolving issues related to application performance, availability, and reliability in Kubernetes environments.
  • Collaborating with development teams to diagnose and debug complex issues.
Alerting and Escalation:
  • Setting up alerting mechanisms to proactively detect and respond to incidents.
  • Escalating critical issues to appropriate teams and stakeholders.
Linux Administration and Maintenance:
  • Managing and maintaining Linux servers, including installation, configuration, and patch management.
  • Implementing security measures and best practices for Linux-based systems.
Active Directory Admin and Maintenance:
  • Managing user accounts, groups, and permissions in Active Directory.
  • Performing routine maintenance tasks and ensuring the security of AD infrastructure.
DNS Admin and Maintenance:
  • Configuring and managing DNS servers and zones.
  • Troubleshooting DNS-related issues and ensuring DNS resolution reliability.
End-User Support:
  • Providing technical support and assistance to end-users for infrastructure-related issues.
  • Resolving hardware, software, and connectivity problems promptly.
Database Administration (PostgreSQL):
  • Managing PostgreSQL databases, including installation, configuration, and performance tuning.
  • Performing routine maintenance tasks such as backups, restores, and upgrades.
REQUIREMENTS:
  • 3+ years of experience in a Site Reliability Engineer role or similar position.
  • Proficiency in Kubernetes administration and experience with CI/CD pipelines.
  • Strong Linux administration skills, including shell scripting and troubleshooting.
  • Experience with monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or Splunk.
  • Familiarity with Active Directory administration and DNS management.
  • Experience with PostgreSQL database administration is a plus.
ATTRIBUTES:
  • Excellent communication and problem-solving skills.
  • Ability to work effectively in a fast-paced, collaborative environment.
Copyright 2016-2024 © Datafin. All Rights Reserved.
