Lead MLOps Architect – AWS
Cairo, Cairo Governorate, Egypt
Full Time
Software Engineering
Experienced
Experience: 10–12 Years
Location: Egypt (Onsite Role)
Employment Type: Full-Time
Job Summary
We are seeking a highly experienced Lead MLOps Architect with deep AWS expertise to lead the design, architecture, and governance of enterprise-grade ML platforms. This role requires strong leadership capabilities, hands-on expertise in scalable ML systems, and experience managing large production environments.
Key Responsibilities
- Architect and lead enterprise-scale MLOps platforms on AWS
- Define best practices for ML lifecycle management, deployment standards, and governance
- Lead production deployment of ML models using AWS-native services
- Design automated CI/CD pipelines for ML workflows and infrastructure
- Implement advanced monitoring, drift detection, retraining automation, and observability
- Ensure high availability, scalability, security, and cost optimization
- Establish model versioning, reproducibility, and experiment tracking standards
- Lead troubleshooting of complex production issues
- Mentor and lead a team of MLOps and platform engineers
- Collaborate with stakeholders to align ML platform strategy with business objectives
Required Skills & Qualifications
MLOps & Machine Learning
- 10–12 years of overall experience with a strong focus on ML production systems
- Proven experience leading ML platform architecture and large-scale deployments
- Deep understanding of ML lifecycle management, governance, and reproducibility
- Hands-on experience with TensorFlow, PyTorch, and scikit-learn
- Strong experience with MLflow or enterprise model management tools
AWS Cloud (Mandatory)
- Advanced hands-on expertise in:
  - Amazon SageMaker (training, pipelines, endpoints)
  - S3, EC2, Lambda
  - ECR, ECS, EKS
  - IAM, CloudWatch
- Experience designing secure, compliant, and scalable ML architectures
- Experience implementing cost optimization strategies on AWS
DevOps, Containers & IaC
- Strong expertise in Docker and Kubernetes (EKS)
- Advanced CI/CD implementation
- Infrastructure as Code using Terraform and/or CloudFormation
- Experience implementing GitOps practices
Programming & Data
- Expert-level Python skills
- Experience designing robust data pipelines
- Strong understanding of SQL/NoSQL systems
- Exposure to streaming or real-time ML systems
Preferred Qualifications
- AWS Professional-level certifications
- Experience with ML security, explainability, and regulatory compliance
- Experience building enterprise feature stores
- Exposure to real-time inference systems