Cloud Engineer – SRE

January 29, 2023
Urgent

Job Description

We’re passionate about building software that solves problems. We count on our Site Reliability Engineers (SREs) to give our users the rich feature set, high availability, and stellar performance they need to pursue their missions. We are currently seeking an engineer with public cloud experience to plan, design, and implement next-generation cloud infrastructure solutions. The Cloud Engineer will be part of the Engineering team and needs strong knowledge of application monitoring, infrastructure monitoring, automation, maintenance, and service reliability improvements.
Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.

RESPONSIBILITIES

Role & Responsibilities

  • Design, automate and manage a highly available and scalable cloud deployment that allows development teams to deploy and run their services.
  • Collaborate with engineering and architecture teams to evaluate and identify optimal cloud solutions, taking scalability, high performance, and security into account.
  • Modernise existing on-prem solutions and improve existing systems.
  • Automate deployments and manage applications in GCP.
  • Develop and maintain cloud solutions in accordance with best practices.
  • Ensure the efficient functioning of data storage and processing in accordance with company security policies and cloud security best practices.
  • Collaborate with engineering teams to identify optimization strategies and help develop self-healing capabilities.
  • Develop strong observability capabilities.
  • Identify, analyse, and resolve infrastructure vulnerabilities and application deployment issues.
  • Regularly review existing systems and recommend improvements.

QUALIFICATIONS

Required Skills and Selection Criteria:

  • Proven work experience in designing, deploying, and operating mid- to large-scale public cloud environments.
    • Professional certification is an advantage.
    • Experience with GCP in particular is good to have.
  • Proven work experience with Docker and Kubernetes (image building, Kubernetes scheduling).
  • Experience in package, configuration, and deployment management with Helm, Kustomize, and ArgoCD.
  • Proven work experience in provisioning Infrastructure as Code (IaC) using Terraform Enterprise or the community edition.
  • Proven work experience in writing custom Terraform providers/plug-ins and Sentinel policy as code.
  • Strong knowledge of GitHub and DevOps practices (Tekton is an advantage).
  • Proficiency in scripting and coding in languages such as Python, Go, Java, and JavaScript (including Node.js).
  • Extensive knowledge of and hands-on experience with Grafana and Prometheus micro libraries.
  • Exposure to Cloud Monitoring and logging.
  • Experience with distributed storage technologies such as NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN).
  • Experience with automation tools is a priority.

Preferred Qualifications

  • Previous success in technical engineering
