DevOps Engineer / Platform Engineer
Job id - 1121269
Job Description
Job Title: DevOps Engineer / Platform Engineer
Location: Gurugram
Work Mode: Work from Office
Working Days: 5 Days Working (Monday to Friday)
CTC Budget: 35–40 LPA
Project Overview
Decision-Making Engine
A product designed for Risk Managers and Anti-Fraud Strategy teams. The platform is built using Kogito as the core framework for business rules and process orchestration. Business logic and microservices are developed in Kotlin and managed in Kubernetes environments.
Marketing Automation Platform
A next-generation product for CRM and marketing automation teams, featuring a user-friendly UI and advanced AI-driven capabilities. The platform follows a microservices architecture using Python and FastAPI.
Data Lakehouse
A centralized data analytics platform supporting large-scale analytics, antifraud systems, and information security initiatives. The technology stack includes Apache Airflow, Apache Spark, and MS SQL, along with modern data and security technologies.
Job Summary
We are looking for a highly skilled DevOps Engineer / Platform Engineer to manage and improve cloud infrastructure, Kubernetes platforms, CI/CD pipelines, observability systems, and infrastructure automation. The ideal candidate has strong hands-on expertise in AWS, Terraform, Kubernetes, Helm, and GitOps practices, along with a solid understanding of infrastructure security and distributed systems.
Key Responsibilities
- Maintain, support, and continuously improve AWS cloud infrastructure, including VPC, EC2, EKS, IAM, Route 53, S3, and related networking services.
- Manage and support Kubernetes-based production platforms and microservices environments.
- Design, implement, and maintain Infrastructure as Code (IaC) using Terraform and Ansible.
- Standardize service deployments and manage reusable deployment templates using Helm and Helm Library Charts.
- Build, optimize, and maintain CI/CD pipelines using GitLab CI and Argo CD.
- Administer and support on-premises GitLab environments.
- Monitor platform performance and improve observability using Grafana, VictoriaMetrics, and VictoriaLogs.
- Administer and maintain HashiCorp Vault for secrets management and secure access control.
- Troubleshoot Linux-based environments running on EC2 instances and Kubernetes clusters.
- Provide operational support for PostgreSQL, MS SQL, Apache Airflow, and database workloads running on AWS RDS and Kubernetes environments.
- Implement and support infrastructure security controls, including encryption, network segmentation, secure connectivity, and least-privilege access management.
- Develop internal automation tools and scripts to improve platform efficiency and operational reliability.
- Investigate incidents, perform root cause analysis, and implement preventive improvements.
- Collaborate with engineering, data, security, and product teams to ensure platform scalability, reliability, and security.
Required Skills & Qualifications
- Strong hands-on experience with AWS cloud services, especially VPC, EC2, EKS, IAM, Route 53, and networking fundamentals.
- Solid production-level experience with Terraform and Infrastructure as Code practices.
- Strong practical experience managing Kubernetes clusters and containerized environments.
- Good knowledge of Helm, reusable deployment templates, and Helm chart management.
- Hands-on experience with Helm Library Charts and Go templating.
- Strong Linux system administration and troubleshooting skills.
- Good understanding of networking concepts including TCP/IP, DNS, HTTP(S), TLS, VPNs, load balancing, and ingress controllers.
- Experience building and maintaining CI/CD pipelines, preferably using GitLab CI.
- Experience with GitOps tools such as Argo CD.
- Strong understanding of monitoring, logging, observability, and troubleshooting distributed systems.
- Ability to write automation scripts and tools using Bash, Python, or Go.
- Good understanding of infrastructure and platform security principles, including secrets management, encryption, network access controls, and secure service communication.
Nice to Have
- Experience with PostgreSQL, MS SQL, or database administration in cloud/Kubernetes environments.
- Experience supporting Apache Airflow platforms.
- Experience administering on-premises GitLab environments.
- Experience administering and supporting HashiCorp Vault.
- Experience developing internal platform engineering or automation solutions.
- Exposure to data platforms, antifraud systems, or information security projects.
Preferred Candidate Profile
- Strong problem-solving and analytical skills.
- Ability to work in fast-paced, highly scalable production environments.
- Passion for automation, platform reliability, and infrastructure optimization.
- Strong collaboration and communication skills.
- Proactive mindset with an ownership-driven approach to execution.
- Experience working in microservices and cloud-native architectures.