
MoE Software Engineer

  • Permanent
  • $200,000 - $300,000
  • San Jose, United States, California
  • Data Infrastructure & MLOps

Senior Distributed Systems Engineer – AI Inference
$200K–$260K base + equity
Bay Area, CA (on-site only)
Next-gen infrastructure for large models

A well-funded startup is rethinking how we run today’s largest machine learning models at scale. Their focus: building highly efficient systems for transformer inference, designed from the hardware up. With a team of experienced systems engineers, ML practitioners, and hardware experts, they’re creating infrastructure purpose-built for running massive AI workloads faster and more reliably than current approaches allow.

Now they’re hiring a hands-on systems engineer to help scale the software that powers this next-gen compute platform. The work is low-latency, performance-critical, and squarely at the intersection of machine learning and distributed systems.

What You’ll Do

  • Design and implement strategies for running transformer and MoE models across multi-node compute clusters
  • Write performance-optimized code in Python and C++, interfacing with ML frameworks like JAX or PyTorch
  • Collaborate closely with platform and hardware engineers to coordinate model execution across custom infrastructure
  • Troubleshoot complex system interactions, optimize data flows, and contribute to orchestration tooling
  • Own software performance end-to-end, from modeling to deployment

What They’re Looking For

  • Deep experience building distributed systems at scale
  • Strong understanding of how modern ML models (LLMs, MoEs) operate in production
  • Comfort working with orchestration tools (Kubernetes, Slurm, or similar)
  • Proficiency in C++ and Python, and familiarity with model-serving architectures
  • A systems mindset: performance tuning, memory efficiency, and reliability are second nature

Tech Stack & Environment

  • Languages: C++, Python
  • Frameworks: PyTorch, JAX (or equivalent)
  • Infra: Container orchestration, custom hardware, internal runtime environments
  • Culture: Deeply technical, fast-moving, collaborative, on-site

Why Join

This is a rare opportunity to help shape how large-scale AI inference gets done: not by tweaking existing tools, but by building new ones from scratch. If you’re excited by low-level performance, cutting-edge model architectures, and collaborating closely with hardware teams, this is a place to have real impact.

About People In AI
We’re a boutique recruiting firm focused exclusively on top-tier AI and machine learning talent. We work directly with engineering leadership at frontier tech companies to connect them with candidates who want meaningful, technically challenging roles. When we reach out, it’s because we think it’s a genuinely strong fit.

