AWS Cloud Infrastructure Support Engineer (L1 / L2) – MSP (24x7)
Job Summary
We are seeking a skilled AWS Cloud Infrastructure Support Engineer (L1/L2) to join our Managed Services team. The role involves providing 24x7 operational support for cloud-hosted environments, ensuring high availability, performance, and security of AWS infrastructure for multiple clients. The candidate will be responsible for incident resolution, monitoring, troubleshooting, and escalation support within defined SLAs.
Key Responsibilities
L1 Responsibilities (Primary Support / Monitoring)
Monitor AWS cloud infrastructure using monitoring tools (CloudWatch, Datadog, etc.)
Respond to alerts, incidents, and service requests in a 24x7 shift model
Perform initial triage and categorize incidents (severity assessment)
Execute predefined runbooks and operational playbooks
Handle basic troubleshooting of EC2 instance health, EBS volume status, ELB / ALB connectivity issues, basic RDS checks (CPU, storage, availability), and IAM access issues (basic validation)
Escalate complex issues to L2/L3 teams with proper documentation
Maintain incident logs and update ticketing systems (ServiceNow/Jira)
Ensure SLA adherence and provide timely status updates to stakeholders
L2 Responsibilities (Advanced Support / Troubleshooting)
Perform deep-dive troubleshooting of AWS infrastructure issues
Analyze the root cause of incidents and provide resolutions or workarounds
Manage and support AWS services including EC2, S3, VPC, IAM, RDS, CloudFront, Route 53, and Lambda (basic to intermediate level)
Handle performance issues, scaling concerns, and service degradation
Support deployment validation and post-deployment checks
Assist in automating routine tasks with scripts (Shell, Python, AWS CLI)
Work closely with DevOps and Engineering teams for problem resolution
Contribute to Root Cause Analysis (RCA) documentation
Improve operational runbooks and incident response processes
Required Skills & Experience
2–4 years of experience in AWS cloud infrastructure support (L1/L2 roles)
Strong understanding of AWS core services (EC2, S3, VPC, IAM, RDS)
Experience with monitoring tools (CloudWatch, Datadog, Nagios, etc.)
Hands-on experience with Linux/Unix systems administration
Basic scripting knowledge (Shell / Python preferred)
Experience with ITSM tools (ServiceNow, Jira, etc.)
Understanding of networking concepts (DNS, TCP/IP, Load Balancing)
Ability to work in 24x7 rotational shifts
Good to Have
AWS certifications (AWS Certified Cloud Practitioner / AWS Certified Solutions Architect - Associate)
Experience in MSP or multi-client environments
Exposure to Infrastructure as Code (Terraform, CloudFormation)
Knowledge of CI/CD pipelines
Familiarity with basic security best practices in AWS environments
Soft Skills
Strong problem-solving and analytical skills
Ability to work under pressure in production environments
Good communication skills for incident reporting and escalation
Team collaboration in a shift-based environment
Ownership mindset for incident resolution
Work Environment
24x7 rotational shifts (including nights, weekends, and holidays)
Production support for multiple enterprise AWS environments
Strong focus on SLA-driven service delivery