AI Cloud Engineer

We are looking for an AI Cloud Engineer with a cloud-native, AI-infrastructure focus to design, build, and optimize the cloud environments that power LockedIn AI’s machine learning workloads, real-time inference systems, and AI-driven product features.

This is a specialized infrastructure role at the intersection of cloud engineering and AI — you will architect and operate the cloud-based systems where models are trained, fine-tuned, evaluated, served, and scaled for over 1 million users.

Job Title: AI Cloud Engineer
Reports To: Co-Founder / CEO
Employment Type: Full-Time
Work Model: Remote (US-Based) · Optional hybrid in New York, NY
Compensation: $140,000 – $195,000 USD / yr

About LockedIn AI
LockedIn AI is the #1 real-time AI interview and meeting copilot, trusted by over one million users worldwide. We are a fast-growing company building the most advanced career preparation platform on the market.

Our platform delivers real-time, AI-powered assistance during live job interviews, coding assessments, and professional meetings — helping candidates communicate with clarity, confidence, and competence.

Role Overview
As an AI Cloud Engineer, you will own the cloud infrastructure layer that supports the entire AI lifecycle.

You will design scalable GPU environments for model training and fine-tuning, build high-performance inference systems for real-time AI responses, optimize cloud costs for compute-heavy workloads, and manage AI-native cloud services at scale.

This role requires deep understanding of both cloud infrastructure and AI systems — bridging the gap between distributed systems engineering and machine learning operations.

Key Responsibilities
AI-Optimized Cloud Architecture
Design cloud infrastructure for AI/ML workloads including GPU training clusters and real-time inference systems
Architect scalable environments on AWS, GCP, or Azure optimized for large model workloads
Build multi-stage environments for training, evaluation, staging, and production AI systems
Implement elastic scaling strategies for AI workloads to balance performance and cost

AI Model Serving & Inference Infrastructure
Build production-grade inference systems for LLMs, speech-to-text, and RAG pipelines
Deploy and optimize model serving frameworks (vLLM, Triton, TensorRT, TGI, etc.)
Improve inference performance using batching, caching, and GPU optimization techniques
Design load balancing and failover systems for high-availability AI endpoints

GPU Compute & Training Infrastructure
Manage GPU clusters for model training, fine-tuning, and evaluation
Implement distributed training across multi-GPU and multi-node environments
Optimize compute utilization using spot instances, scheduling, and auto-scaling
Operate managed AI platforms such as SageMaker, Vertex AI, or Azure ML

Cloud Cost Optimization (FinOps for AI)
Monitor and optimize AI infrastructure spend across compute, storage, and APIs
Implement cost-saving strategies including reserved instances and GPU right-sizing
Track cost-per-inference and cost-per-training-job metrics
Optimize LLM usage and API consumption patterns

Networking, Security & Compliance
Design secure cloud networking for AI workloads (VPCs, private endpoints, IAM)
Protect sensitive AI assets including models, embeddings, and datasets
Implement encryption, access controls, and audit logging
Ensure compliance with privacy-first architecture principles

Infrastructure as Code & Observability
Build all infrastructure using Terraform, Pulumi, or CloudFormation
Create automated provisioning pipelines for AI environments
Implement monitoring for GPU health, latency, and system performance
Build alerting systems for failures, bottlenecks, and anomalies

Required Qualifications
Experience
3+ years in cloud engineering, DevOps, or infrastructure engineering
Experience with ML/AI workloads in production environments
Hands-on experience with GPU-based systems or AI cloud platforms
Experience working in cross-functional engineering teams
Startup or high-growth environment experience preferred

Technical Skills
Strong proficiency in Python, Go, or Bash
Deep experience with AWS, GCP, or Azure cloud platforms
Strong Kubernetes experience (GPU scheduling, clusters, Helm, autoscaling)
Experience with AI model serving frameworks (vLLM, Triton, TensorRT, etc.)
Experience with Infrastructure as Code tools (Terraform, Pulumi, CloudFormation)
Experience with monitoring tools (Prometheus, Grafana, Datadog, CloudWatch)

Preferred Qualifications
Experience with large-scale LLM inference systems
Background in distributed training and multi-GPU optimization
Experience with real-time streaming or low-latency systems
Familiarity with RDMA, InfiniBand, or high-performance networking
Contributions to open-source cloud or AI infrastructure tools
Startup founding or early-stage startup experience

What We Offer
Equity
Meaningful early-stage ownership in a fast-growing AI company

Impact
Your work directly powers a product used by 1M+ users worldwide

Team
Work with a lean, high-performance engineering team

Flexibility
Remote-first with optional NYC collaboration

Growth
Be part of a rapidly scaling AI-native company

Culture
Fast execution, high ownership, and deep user obsession

Why Join LockedIn AI?
Category-defining AI copilot platform
Massive and fast-growing global user base
Direct ownership of AI infrastructure at scale
Work at the frontier of applied AI systems
Build systems that ship fast and impact millions

How to Apply
Please submit:

Resume or CV
Short note explaining why you want to join LockedIn AI, whether you’ve used the product, and what you would improve
Optional: GitHub, portfolio, or technical writing

Equal Opportunity
LockedIn AI is committed to building a diverse and inclusive team. We welcome applicants from all backgrounds. Hiring decisions are based on merit, skills, and business needs.

To apply for this job, email your details to info.lockedinai@gmail.com