Staff AI Engineer

$225K–$250K + meaningful equity

San Francisco, CA (Onsite)

A well-funded AI infrastructure startup is hiring a Staff AI Engineer to help build the core agentic intelligence layer powering automation inside complex engineering software environments. The product already has meaningful traction with Fortune 100 customers and is backed by top-tier investors in the AI ecosystem.

This is a highly technical, deeply hands-on role for someone who wants to work on difficult real-world agent problems, not lightweight chatbot wrappers or internal prototypes.

The Opportunity

You’ll own foundational agent architecture and help define how AI systems reliably execute complex workflows inside real enterprise environments.

This role sits directly alongside company leadership and will heavily influence the future direction of the platform, evaluation infrastructure, and broader AI engineering strategy.

The environment is highly execution-oriented. Leadership is measured through technical contribution, system ownership, and shipping production systems under ambiguity.

What You’ll Be Doing

Build and improve production agentic AI systems capable of executing multi-step workflows across desktop software environments
Own core architectural decisions around:
- Tool orchestration
- Context management
- State handling
- Error recovery
- Model routing
- Workflow execution reliability

Design and scale evaluation frameworks measuring:
- Workflow success rate
- Reliability
- Failure modes
- Cost efficiency
- Regression detection

Define token budgets and optimize inference efficiency for commercially viable agent execution
Work closely with researchers, domain experts, and product stakeholders to translate real user workflows into measurable agent benchmarks
Lead technically while remaining highly hands-on in implementation and architecture
Collaborate with customers and internal teams to improve workflow coverage and production performance

What They’re Looking For

Strong production experience building agentic AI systems, not just LLM-powered interfaces
Deep understanding of:
- Tool-calling agents
- Multi-step orchestration
- State management
- Context handling
- Workflow reliability under ambiguity

Strong Python engineering background
Experience designing evaluation and benchmarking systems for AI agents or complex ML systems
Comfort operating in highly ambiguous startup environments with broad ownership
Strong systems thinking and architecture instincts
Ability to remain deeply hands-on technically while influencing engineering direction

Strong Signals

Experience with:
- SWE-bench, GAIA, or similar evaluation frameworks
- LangSmith, Logfire, tracing, or observability tooling
- Agent orchestration frameworks
- Desktop automation
- Enterprise AI deployments
- Workflow automation systems

Startup experience is highly valued, especially in technically demanding product environments

Location / Work Setup

Full-time onsite in San Francisco
Flexible hours and a high-autonomy environment
Compensation includes meaningful equity participation
