PURPOSE:
We are seeking a Senior DevOps / Cloud Platform Engineer to design, implement, and manage scalable, secure, and highly available cloud infrastructure on AWS. The role focuses on Kubernetes-based container platforms, Infrastructure as Code (IaC), CI/CD automation, observability, and DevSecOps practices to enable reliable and efficient software delivery.
QUALIFICATIONS, SKILLS, AND EXPERIENCE:
- Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field.
- 4+ years of experience in DevOps, Cloud Engineering, or Platform Engineering roles.
- Strong expertise in AWS services (EKS, ECS, Lambda, RDS) and Kubernetes.
- Hands-on experience with Infrastructure as Code using Terraform (primary) and CloudFormation.
- Proven experience building and managing CI/CD pipelines using GitHub Actions, GitLab CI, or Jenkins.
- Strong knowledge of containerization using Docker and orchestration using Kubernetes (production-scale deployments, troubleshooting).
- Experience with Kubernetes scaling mechanisms (HPA, KEDA).
- Solid understanding of observability tools including OpenTelemetry, Prometheus, Grafana, CloudWatch, Datadog, and Splunk.
- Familiarity with DevSecOps practices including secrets management, IaC scanning, container security, and software supply chain security (SBOMs, artifact signing).
- Experience with GitOps, Ansible, and monitoring tools such as Honeybadger and AppSignal is a plus.
- Understanding of cloud security, IAM, networking, VPNs, firewalls, and service-to-service access controls.
- Knowledge of SRE principles (SLIs, SLOs) and incident management practices.
- Preferred certifications: CKA (Certified Kubernetes Administrator) and AWS Certified Solutions Architect or AWS Certified DevOps Engineer.
KEY RESPONSIBILITIES:
- Cloud & Platform Engineering: Design, provision, and maintain AWS cloud infrastructure using Infrastructure as Code (Terraform, CloudFormation), ensuring scalability, reliability, and security.
- Containerization & Orchestration: Build, manage, and optimize container platforms using Docker, Kubernetes, ECS, and EKS, including production deployments, scaling, and troubleshooting.
- CI/CD & Automation: Develop and maintain secure, scalable CI/CD pipelines using GitHub Actions, GitLab CI, Jenkins, or similar tools to enable efficient software delivery.
- Observability & Monitoring: Implement and enhance observability frameworks using OpenTelemetry, Prometheus, Grafana, CloudWatch, Elastic/OpenSearch, and other tools for logs, metrics, traces, and alerting.
- DevSecOps & Security: Establish and enforce DevSecOps practices, including secrets management, IaC scanning, container security, policy-as-code, and secure identity mechanisms (OIDC).
- Reliability & Performance: Define and manage SLIs, SLOs, and reliability processes aligned with SRE principles. Support incident response, postmortems, and disaster recovery planning.
- Cloud Networking & Security: Design and manage secure AWS networking, including IAM, private networking, VPNs, firewalls, and service-level access controls.
- Collaboration & Optimization: Partner with engineering, security, and product teams to improve developer experience, platform capabilities, system performance, and cost efficiency (FinOps).
A CULTURE OF BELONGING:
At our core, we value diversity and inclusion. As an equal opportunity employer, we are dedicated to creating a workplace where every voice is heard, every person is respected, and everyone has the opportunity to succeed.