Senior / Staff AI Engineer (Agentic Systems, Evals & Enterprise AI Reliability)

Compensation: Senior ~$240k–$250k base; Staff up to ~$300k base; equity package included

Location: San Francisco preferred; hybrid / in-person with relocation support for strong candidates

The Company

A high-growth, venture-backed SaaS company is building AI agents for complex audit, assurance, cybersecurity, privacy, and financial workflows. Its platform is trusted by major accounting and consulting firms to power mission-critical work in a highly regulated, high-trust market.

The company is well funded, scaling quickly, and making AI central to its product and engineering strategy.

The Opportunity

This is a chance to build production AI agents for enterprise workflows where reliability, explainability, and human judgment matter. The role sits at the intersection of product engineering, AI infrastructure, retrieval, evals, observability, and agent orchestration.

Senior candidates will own meaningful product areas end-to-end. Staff candidates will shape technical direction, define standards, and create systems that multiply engineering velocity across the organization.

The Role

You will design, build, and scale LLM-powered agentic systems that automate complex professional workflows. You will work across agents, retrieval, orchestration, evaluation, observability, reliability, and product delivery.

This is a hands-on engineering role for someone who has already shipped AI into production and wants to build trusted systems used every day by professionals.

What You’ll Do

Build and ship AI agents that automate complex audit and advisory workflows end-to-end.
Design agent orchestration systems combining LLMs, tools, retrieval, prompts, business logic, and guardrails.
Build evaluation frameworks, feedback loops, and observability systems for production AI behavior.
Develop retrieval pipelines, RAG systems, vector database workflows, and embedding infrastructure.
Prototype quickly, validate what works, then harden systems for enterprise-grade reliability.
Partner with Product, Design, Engineering leadership, and domain experts to translate customer problems into agent capabilities.
Create reusable abstractions, patterns, platforms, and tooling that raise quality and accelerate the wider team.

What You’ll Bring

Strong production software engineering experience in complex, real-world systems.
Hands-on experience shipping LLM-powered products or features serving real production traffic.
Strong TypeScript, Python, and Postgres skills, with solid distributed systems fundamentals.
Experience with agent orchestration, retrieval pipelines, RAG architectures, vector databases, and embedding models.
Practical experience building eval frameworks for model outputs, agent behavior, reliability, or feedback loops.
Familiarity with modern LLM APIs (OpenAI, Anthropic, Gemini) and agent frameworks such as LangGraph or similar tools.
Strong product judgment, ownership, communication skills, and comfort operating in ambiguity.

What This Role Requires

Production AI shipping experience, not just research, POCs, demos, or internal experiments.
Strong software engineering fundamentals, including coding, systems design, and production ownership.
Hands-on experience with LLMs, agents, RAG, retrieval, evals, observability, or AI infrastructure.
For Staff-level candidates: experience setting technical direction across teams, mentoring senior engineers, and defining AI reliability or platform standards.
Ability to work closely with an SF-based team, with relocation support available for exceptional candidates.

Tech Stack

TypeScript
Python
Postgres
LLM APIs (OpenAI, Anthropic, Gemini)
LangGraph
RAG
Vector databases
Embedding pipelines
Agent orchestration
Evals
Observability
Guardrails
Distributed systems

Why Join

Build AI agents for high-stakes enterprise workflows where trust, reliability, explainability, and evaluation are core technical problems.
Join a well-funded, high-growth company with strong customer traction and AI at the center of its roadmap.
Take on meaningful ownership across agentic systems, AI platform infrastructure, and product experiences that customers use every day.

About People in AI

People in AI is a specialist recruitment partner connecting exceptional AI, machine learning, data, and engineering talent with some of the most ambitious technology companies in the world. We work closely with founders, technical leaders, and hiring teams to represent opportunities accurately and help candidates assess genuine fit.
