
Vector‑backed Retrieval
Hybrid search, chunking, schema‑aware re‑ranking, observability.
AI Engineer & Frontend Developer — shipping agentic systems, RAG pipelines, and developer UX. I blend product intuition with systems engineering to build fast, reliable LLM apps.
Based in San Francisco
Open to remote work
AI Systems + Frontend
RAG, agents, benchmarks
Currently available
Starting mid‑September



Multi‑tool planning, retries, guardrails, tracing via OpenTelemetry.

Inline suggestions, context windows, evals, and latency budgets.

Streaming data processing with Apache Kafka and real‑time dashboards.

Fine‑tuned transformers for domain‑specific tasks with custom datasets.

Microservices architecture with GraphQL, Redis caching, and auto‑scaling.
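The retries-with-guardrails pattern from the agents card above can be sketched as a small wrapper; a minimal sketch, assuming a flaky tool call (the `flaky` function, attempt counts, and backoff constants are illustrative, not part of any real system here):

```python
# Sketch: retry a transient-failure-prone tool call with exponential
# backoff. max_attempts and base_delay are illustrative assumptions.
import time


def with_retries(fn, max_attempts=3, base_delay=0.1):
    """Invoke fn, retrying on exception with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))


# Usage with a stand-in for a tool call that fails once, then succeeds:
calls = []

def flaky():
    calls.append(1)
    if len(calls) < 2:
        raise RuntimeError("transient")
    return "ok"

result = with_retries(flaky)
```

Capping attempts and re-raising the final error keeps the guardrail honest: the caller still sees failures that persist past the retry budget.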
From seed‑stage startups to enterprise platform groups.
Focus areas: RAG optimization, agentic workflows, prompt engineering, model evaluation, and production-ready AI systems with sub-second latency.
from fastapi import FastAPI
from rag import embed, search, rerank, answer
from tracers import trace

app = FastAPI()

@app.post("/ask")
@trace("ask")
def ask(q: str, user_id: str):
    q_vec = embed(q)
    chunks = search(q_vec, k=20, filters={"user": user_id})
    ranked = rerank(q, chunks)[:6]
    return answer(q, ranked, tools=["browser", "code"], guardrails=True)
2025
Independent — AI Engineer
Building production AI systems, RAG pipelines, and agentic workflows for startups and enterprise teams.
2022 — 2024
Senior Product Designer — Analytics
Led design for a data visualization platform; shipped an ML-powered insights dashboard used by 10k+ analysts.
2017 — 2021
Frontend Engineer — Commerce
Built responsive e-commerce platform with React/Node.js, optimized for mobile conversion and performance.
Continuous evaluation of prompts, tools, and retrieval quality across production workloads.
Automated testing pipeline with custom metrics, human feedback loops, and A/B testing. Tracks accuracy, hallucination rates, tool usage effectiveness, and user satisfaction scores across different model versions and prompt templates.
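One custom metric from such a pipeline can be sketched as follows; this is a minimal illustration, assuming a keyword-hit-rate metric and a stubbed model function (the case data and helper names are hypothetical, not the production setup):

```python
# Sketch of one eval metric plus an aggregation loop. EvalCase,
# keyword_hit_rate, and the sample data are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class EvalCase:
    question: str
    expected_keywords: list = field(default_factory=list)


def keyword_hit_rate(answer: str, case: EvalCase) -> float:
    """Fraction of expected keywords that appear in the model's answer."""
    hits = sum(1 for kw in case.expected_keywords if kw.lower() in answer.lower())
    return hits / len(case.expected_keywords)


def run_evals(model_fn, cases: list) -> dict:
    """Score a model function over a fixed case set; return aggregates."""
    scores = [keyword_hit_rate(model_fn(c.question), c) for c in cases]
    return {"mean_hit_rate": sum(scores) / len(scores), "n": len(cases)}


# Usage with a stubbed model function:
cases = [EvalCase("What powers retrieval?", ["hybrid", "rerank"])]
report = run_evals(lambda q: "Hybrid search plus a rerank stage.", cases)
```

Aggregating per-case scores into one report makes it cheap to compare model versions and prompt templates run over the same fixed case set.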

I build AI products end‑to‑end: data ingestion, retrieval, prompt/tooling, evals, and production UI. Pragmatic about latency, cost, and safety — with strong attention to developer experience.
Years
Projects
Clients
Structured outputs, memory architectures, and low‑latency tool use with vLLM + GPU batching.
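One way to make structured outputs safe for tool use is to validate model text against a schema before it reaches a tool. A minimal stdlib-only sketch, assuming a JSON tool-call shape with "tool" and "args" fields (the field names are illustrative, not a fixed contract):

```python
# Sketch: parse and validate a model's tool-call output as JSON before
# dispatching it. The "tool"/"args" schema is an illustrative assumption.
import json

REQUIRED_FIELDS = {"tool", "args"}


def parse_tool_call(raw: str) -> dict:
    """Parse model output into a validated tool-call dict, or raise."""
    call = json.loads(raw)  # raises on non-JSON output
    missing = REQUIRED_FIELDS - call.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(call["args"], dict):
        raise ValueError("args must be a JSON object")
    return call


# Usage: a well-formed call passes validation and is safe to dispatch.
call = parse_tool_call('{"tool": "browser", "args": {"url": "https://example.com"}}')
```

Failing loudly on malformed output keeps bad generations out of the tool layer, where they would be far more expensive to debug.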
Have a project in mind, a question, or just want to say hello? I'd love to hear from you. Fill out the form, and I'll get back to you as soon as possible.