Location – Pune/Chennai
Grade – P3
Sr. SRE | Mandatory? | Expected Proficiency [1-5 scale; 5 highest]
Infrastructure Monitoring – Enterprise solutions (Dynatrace/Datadog/New Relic) | Yes | 5
Kubernetes & Docker | Yes | 4
Cloud (AWS EKS/AWS VPC/AWS EC2) | Yes | 4
SIEM – Splunk | Yes | 4
Configuration & Release Management | Yes | 4
Infra as Code (Ansible/Terraform) | Yes | 4
Chaos Engineering Tools | Good to have | 3
Unix/Linux Scripting | Good to have | 2
GitLab | Good to have | 2
Communication | Yes | 3
Experience – 6+ years
We are looking to speak with candidates who are:
Highly experienced SRE engineers with excellent working knowledge of cloud platforms. You will be expected to interact with clients daily to discuss and share plans and outcomes. You should have strong technical flair, a passion for learning new technologies and finding innovative solutions to problems, and the ability to keep abreast of emerging technologies. You will work independently and collaborate with customer teams on a daily basis to understand requirements, plan and implement tasks, and provide regular status updates.
Technical / process skills:
Must have:
Has experience of designing, building, and/or operating large-scale production systems
Strong understanding of system design principles
Strong knowledge of cloud-native ecosystems, including containerization, service packaging, and orchestration with tools such as Docker, Kubernetes, and Helm
Strong experience in Unix/Linux and networking, along with scripting in shell/Bash
Understands networking and messaging, especially between services
Experience with build automation tools such as Maven and Gradle
Has experience automating infrastructure, testing, and deployments using Infrastructure as Code tools such as Ansible and Terraform
Experience building self-healing systems using strong SRE practices
Has developed dashboards and automated alerts to measure uptime, monitor services, and remediate issues
Experienced in integrating observability, monitoring, and logging tools with services, using a mix of open-source tools such as Prometheus, Grafana, ELK, and Jaeger and/or commercial platforms such as New Relic and Dynatrace
Strong knowledge of setting up cloud (AWS preferred) infrastructure and services using best practices
Experience implementing scalable, resilient, and secure infrastructure in line with industry best practices and defined processes
Familiarity with security best practices and experience managing secure systems
Good to have:
CKA certification
Familiarity with chaos engineering tools
Hands-on experience in setting up & managing production-level Kubernetes clusters
Knowledge of service mesh architecture, e.g. Istio
Experience working with Kafka clusters
Soft skills:
Excellent communication skills
Ability to gather business requirements and interface with customer teams as required
Excellent analytical and problem-solving skills
Willingness to explore and learn new technologies and build proofs of concept (POCs)
Team player