Senior Distributed Systems Engineer – AI Inference
$200K–$260K base + equity
Bay Area, CA (on-site only)
Next-gen infrastructure for large models
A well-funded startup is rethinking how we run today’s largest machine learning models at scale. Their focus: building highly efficient systems for transformer inference, designed from the hardware up. With a team of experienced systems engineers, ML practitioners, and hardware experts, they’re creating infrastructure purpose-built for running massive AI workloads faster and more reliably than current approaches allow.
Now they’re hiring a hands-on systems engineer to help scale the software that powers this next-gen compute platform. The work is performance-critical and latency-sensitive, sitting squarely at the intersection of machine learning and distributed systems.
What You’ll Do
- Design and implement strategies for running transformer and MoE models across multi-node compute clusters
- Write performance-optimized code in Python and C++, interfacing with ML frameworks like JAX or PyTorch
- Collaborate closely with platform and hardware engineers to coordinate model execution across custom infrastructure
- Troubleshoot complex system interactions, optimize data flows, and contribute to orchestration tooling
- Own software performance end-to-end, from modeling to deployment
What They’re Looking For
- Deep experience building distributed systems at scale
- Strong understanding of how modern ML models (LLMs, MoEs) operate in production
- Comfort working with orchestration tools (Kubernetes, Slurm, or similar)
- Proficiency in C++ and Python, and familiarity with model-serving architectures
- A systems mindset: performance tuning, memory efficiency, and reliability are second nature
Tech Stack & Environment
- Languages: C++, Python
- Frameworks: PyTorch, JAX (or equivalent)
- Infra: Container orchestration, custom hardware, internal runtime environments
- Culture: Deeply technical, fast-moving, collaborative, on-site
Why Join
This is a rare opportunity to help shape how large-scale AI inference gets done: not by tweaking existing tools, but by building new ones from scratch. If you're excited by low-level performance, cutting-edge model architectures, and collaborating closely with hardware teams, this is a place to have real impact.
About People In AI
We’re a boutique recruiting firm focused exclusively on top-tier AI and machine learning talent. We work directly with engineering leadership at frontier tech companies to connect them with candidates who want meaningful, technically challenging roles. When we reach out, it’s because we think it’s a genuinely strong fit.