AI Infrastructure Manager
Management Role with a Leading AI Technology Consulting Corporation

Role Overview

As the AI Infrastructure Manager, you will be responsible for designing, deploying, and managing secure and scalable AI cloud infrastructure while ensuring proper IAM configurations and compliance. You will lead a team of cloud engineers and MLOps specialists to manage AI model pipelines, high-performance computing (HPC), and multi-cloud environments. This role is ideal for a cloud security and AI infrastructure expert who thrives in fast-paced AI-driven SaaS environments and has a passion for cloud automation, IAM security, and AI scalability.


Key Responsibilities

1. AI & Cloud Infrastructure Management

  • Architect, deploy, and optimize multi-cloud (AWS, Azure, GCP) environments for AI workloads.

  • Ensure highly available, auto-scaling, and cost-efficient cloud infrastructure for AI model training and inference.

  • Manage containerized workloads using Kubernetes (K8s) and Docker.

2. Identity & Access Management (IAM) Security

  • Design and implement secure IAM policies, role-based access control (RBAC), and least-privilege access for AI and cloud environments.

  • Monitor and audit IAM policies to ensure compliance with security standards and regulations (SOC 2, ISO 27001, GDPR).

  • Automate IAM configuration management using Infrastructure-as-Code (IaC) tools such as Terraform and AWS CloudFormation.

3. AI Model Deployment & MLOps

  • Implement CI/CD pipelines for AI models to ensure smooth deployment and lifecycle management.

  • Optimize AI inference performance using GPU/TPU acceleration and cloud-native AI services.

  • Automate model retraining, versioning, and rollback strategies.

4. Performance Optimization & Reliability

  • Optimize compute resource allocation for AI models to balance cost and performance.

  • Implement observability, logging, and monitoring solutions for real-time infrastructure health tracking.

  • Drive incident response and disaster recovery planning for AI infrastructure.

5. Security, Compliance & Governance

  • Enforce AI and cloud security best practices, including encryption, zero-trust architecture, and network segmentation.

  • Implement compliance frameworks for AI workloads in regulated industries (finance, healthcare, legal).

  • Conduct regular penetration testing, vulnerability assessments, and security audits.

6. Team Leadership & Collaboration

  • Lead and mentor a team of cloud engineers, security specialists, and MLOps professionals.

  • Collaborate with AI researchers, software engineers, and DevOps teams to align infrastructure with AI product needs.

  • Report to founding leadership on AI infrastructure roadmaps and optimization strategies.


Qualifications & Skills

Education & Experience

  • Bachelor’s or Master’s degree in Computer Science, AI, Cloud Computing, or a related field.

  • 7+ years of experience in cloud infrastructure, IAM security, and AI/ML systems architecture.

  • Hands-on experience managing large-scale AI cloud environments, IAM policies, and security compliance.

Technical Skills

  • Cloud Platforms: Deep expertise in AWS, Azure, and Google Cloud (GCP).

  • IAM Security: Strong knowledge of RBAC, SAML, OAuth, and IAM policy automation.

  • Infrastructure as Code (IaC): Hands-on experience with Terraform, CloudFormation, or Pulumi.

  • MLOps & AI Model Deployment: Experience with Kubeflow, MLflow, and TensorFlow Serving.

  • Containerization & Orchestration: Proficiency in Docker, Kubernetes, and Helm.

  • Security & Compliance: Familiarity with SOC 2, ISO 27001, GDPR, and cloud security best practices.

Soft Skills & Leadership

  • Ability to manage an offshore team, with a problem-solving mindset and a strategic approach to infrastructure optimization.

  • Excellent communication and cross-functional collaboration skills.

  • Ability to handle fast-paced, AI-driven startup environments.

What We Offer

  • Competitive salary, equity, and a comprehensive benefits package.
  • Opportunity to work on cutting-edge AI applications and projects.
  • A collaborative and innovation-driven work environment.
  • Professional development through training, mentorship, and industry events.
  • Flexible work arrangements, including hybrid work options.

How to Apply

To apply, please send your resume detailing your hands-on leadership experience, expertise in startups, and technology certifications in infrastructure and cloud.

About Us

Karbon Digital Limited is a Toronto-based leader in AI innovation, offering cutting-edge products and consulting services to industries such as healthcare, finance, retail, and logistics. Our mission is to help businesses unlock the full potential of AI, cloud, and data engineering technologies to solve real-world challenges and drive transformation. Join Karbon Digital and lead the charge in transforming industries with AI-powered solutions!
