Job Description
We’re passionate about building software that solves problems. We count on our Site Reliability Engineers (SREs) to give our users the rich feature set, high availability, and stellar performance they need to pursue their missions. We are currently seeking an engineer with public cloud experience to plan, design, and implement next-generation cloud infrastructure solutions. The Cloud Engineer will be part of the Engineering team, and the role requires strong knowledge of application monitoring, infrastructure monitoring, automation, maintenance, and service reliability improvements.
Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.
RESPONSIBILITIES
Role & Responsibilities
- Design, automate and manage a highly available and scalable cloud deployment that allows development teams to deploy and run their services.
- Collaborate with Engineering and Architect teams to evaluate and identify optimal cloud solutions that deliver scalability, high performance, and security.
- Modernise existing on-prem solutions and improve existing systems.
- Automate deployments extensively and manage applications in GCP.
- Develop and maintain cloud solutions in accordance with best practices.
- Ensure the efficient functioning of data storage and processing functions in accordance with company security policies and cloud security best practices.
- Collaborate with Engineering teams to identify optimization strategies and help develop self-healing capabilities.
- Develop strong observability capabilities.
- Identify, analyse, and resolve infrastructure vulnerabilities and application deployment issues.
- Regularly review existing systems and make recommendations for improvements.
QUALIFICATIONS
Required Skills and Selection Criteria:
- Proven work experience in designing, deploying, and operating mid- to large-scale public cloud environments.
- Professional certification is an advantage.
- Public cloud experience, GCP in particular, is good to have.
- Proven work experience with Docker and Kubernetes (image building, Kubernetes scheduling).
- Experience with package, configuration, and deployment management using Helm, Kustomize, and ArgoCD.
- Proven work experience in provisioning Infrastructure as Code (IaC) using Terraform Enterprise or community edition.
- Proven work experience in writing custom Terraform providers/plug-ins, along with Sentinel Policy as Code.
- Strong knowledge of GitHub and DevOps practices (Tekton is an advantage).
- Proficiency in scripting and coding in languages such as Python, Go, Java, and JavaScript (including Node.js).
- Extensive knowledge of, and hands-on experience with, Grafana and Prometheus micro libraries.
- Exposure to Cloud Monitoring and logging.
- Experience with distributed storage technologies such as NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN).
- Experience with automation tools is a priority.
Preferred Qualifications
- Previous success in technical engineering