Garry Tan and Diana Hu argue the startup unit of production is changing. YC portfolio companies are going from zero to tens of millions in revenue in a year, and they walk through the primitives that make it possible.
The bottleneck on building a company used to be people: you raised money to hire a team. AI agents change the unit of production, so a person at a terminal can do the work of roughly 500 to 1000 people. That lets a six-person team reach $10M in revenue and pushes revenue per employee to $1-2M, roughly 10x the public-company norm. Building this way means treating your company as a closed feedback loop where agents read every artifact, and treating skills, resolvers, and evals as the new org chart.
The unit of production changed
Garry raised $4M and hired 10 people to build Posterous over two years; he says he rebuilt that same software in about five days using a $200/month Claude Code Max plan. Diana notes YC portfolio companies now go from zero to tens of millions in revenue in a year, traction that used to take four or five years and hundreds of millions in capital. The claim: it is 2026 and a six-person team can hit $10M in revenue.
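The revenue-per-employee figure behind this claim is simple division, and a quick check shows where the $1-2M range comes from. The sketch below uses only the numbers quoted in the talk; the function name is illustrative, and the "implied norm" is just the talk's "roughly 10x" claim run backwards, not a sourced statistic.

```python
def revenue_per_employee(revenue_usd: float, headcount: int) -> float:
    """Annual revenue divided by headcount."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return revenue_usd / headcount


# The talk's headline case: six people, $10M in revenue.
six_person = revenue_per_employee(10_000_000, 6)
print(f"${six_person:,.0f} per employee")  # $1,666,667 per employee

# Assumption: "roughly 10x the public-company norm" implies a norm of about
# one tenth of this figure. This is derived from the claim, not measured.
implied_norm = six_person / 10
print(f"${implied_norm:,.0f} implied public-company norm")
```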
The software factory, not a co-pilot
Garry frames coding agents as a factory rather than autocomplete. He cites Steve Yegge's claim that agent users are 10x to 100x more productive, and says the real production test is not lines of code but 80-90% test coverage and whether customers actually pay. He calls out AI slop and hallucinations as the things you are actively fighting, not proof the tools fail.
Skills, resolvers, and skillify
A skill is a runbook in markdown that an agent follows, and the trick is that it can also call deterministic code. A resolver is a master index that loads an instruction only when needed, which keeps context small. Skillify goes one level up: you do a task once, get the agent to do it right, then have it write the skill plus unit tests, LLM evals, an integration test, and a trigger check so it repeats reliably.
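One way to picture the resolver idea is a small index that maps trigger words to runbooks and loads a runbook only when a task matches. This is a minimal sketch, not the speakers' implementation: the runbook text, the tool name `sum_revenue`, and the keyword-matching rule are all assumptions made for illustration, and the markdown files are stood in for by an in-memory dictionary.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# In-memory stand-ins for markdown skill files (hypothetical content).
RUNBOOKS = {
    "weekly-report": "# Weekly report\n1. Call tool sum_revenue.\n2. Summarize the total.",
    "refund": "# Refund\n1. Look up the order.\n2. Call tool issue_refund.",
}

# Deterministic code a skill can call instead of asking the model to add numbers.
TOOLS = {"sum_revenue": lambda rows: sum(rows)}

loaded: List[str] = []  # records which runbooks were actually pulled into context


@dataclass(frozen=True)
class IndexEntry:
    name: str
    triggers: Tuple[str, ...]


# The resolver's master index: small enough to keep in context at all times.
INDEX = [
    IndexEntry("weekly-report", ("weekly", "report")),
    IndexEntry("refund", ("refund", "chargeback")),
]


def load_runbook(name: str) -> str:
    """Fetch one runbook and record that it was loaded."""
    loaded.append(name)
    return RUNBOOKS[name]


def resolve(task: str) -> List[str]:
    """Return only the runbooks whose trigger words appear in the task."""
    words = set(re.findall(r"[a-z]+", task.lower()))
    return [load_runbook(e.name) for e in INDEX if words & set(e.triggers)]


# A trigger check of the kind skillify would generate alongside the skill.
context = resolve("Please write the weekly report.")
print(loaded)                             # ['weekly-report']
print(TOOLS["sum_revenue"]([1200, 800]))  # 2000
```

The refund runbook never enters the context for this task, which is the point of the resolver: the index is always present, the instructions are not.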
Latent space vs deterministic code
Agentic systems break when you ask the LLM to do work that should be deterministic code, or hard-code work that should stay in the model. Seating eight dinner guests is easy in latent space; seating an 800-person event is not, because the model hallucinates. The pattern is to make the fuzzy LLM layer and the exact code layer work together. Garry's example: a TypeScript context-now file with tests so the agent stops thinking it is 3am in Greenwich.
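The context-now idea is a good example of moving work out of latent space: the current local time is computed by code and handed to the agent, rather than guessed. Garry's version is a TypeScript file; the sketch below is a Python analogue under stated assumptions. The function name `context_now`, the fixed-offset timezone handling, and the output format are illustrative choices, not his code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def context_now(utc_offset_hours: float, now_utc: Optional[datetime] = None) -> str:
    """Return the local time as an ISO 8601 string for injection into a prompt.

    A fixed UTC offset is used to keep the sketch dependency-free; a real
    version would likely use named time zones to handle daylight saving.
    """
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    if now_utc.tzinfo is None:
        raise ValueError("now_utc must be timezone-aware")
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return now_utc.astimezone(local_tz).isoformat(timespec="minutes")


# Test with a pinned clock: 3am UTC is the previous evening at UTC-8.
fixed = datetime(2026, 1, 15, 3, 0, tzinfo=timezone.utc)
print(context_now(-8, fixed))  # 2026-01-14T19:00-08:00
```

Passing the clock in as a parameter is what makes the function testable: the test pins the time, and production code omits the argument.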
The company as a closed loop
Diana borrows from control systems: old companies run open-loop, so errors accumulate in unwritten meeting notes, DMs, and vibes until decisions drift. An AI-native company is closed-loop, with an agent that has read access to every artifact the company produces. She says YC's own engineering team cut sprint time in half and produced 10x the work this way, and the org flattens into three roles: individual contributors, DRIs, and the AI founder.
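The control-systems analogy can be made concrete with a toy model. In the sketch below, every step adds a fixed amount of drift; the open-loop version never observes it, while the closed-loop version corrects a fraction of the observed error each step. The drift and gain values are arbitrary assumptions chosen only to show the shape of the behavior, not anything measured in the talk.

```python
def open_loop_error(drift: float, steps: int) -> float:
    """Error after `steps` when nothing observes or corrects the drift."""
    error = 0.0
    for _ in range(steps):
        error += drift
    return error


def closed_loop_error(drift: float, steps: int, gain: float) -> float:
    """Error after `steps` when a fraction `gain` of it is corrected each step."""
    error = 0.0
    for _ in range(steps):
        error += drift         # new drift arrives this step
        error -= gain * error  # feedback removes part of the observed error
    return error


print(open_loop_error(1.0, 20))         # 20.0, grows without bound
print(closed_loop_error(1.0, 20, 0.5))  # stays below 1.0
```

With these numbers the closed-loop error settles toward drift * (1 - gain) / gain = 1.0 instead of growing linearly, which is the sense in which feedback keeps decisions from drifting.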
Taste, evals, and the wedge
The cost of shipping code goes to zero; taste does not. Generic benchmarks like MMLU do not tell you if your product works, so founders must build domain-specific evals and label failing traces by hand. The winning go-to-market is a wedge: pick a painful workflow, go undercover as a forward-deployed engineer, and automate messy real work. Examples: Salient (bank loan voice agents), Happy Robot (freight), and Reducto (document processing).
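A domain-specific eval can start as a list of prompts with a pass/fail check each, plus a place to collect failing traces for hand labeling. The following is a minimal sketch under stated assumptions: `toy_agent` is a hypothetical stand-in for the product under test, the two cases are invented, and the substring checks are far cruder than a real eval would be.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # domain-specific pass/fail rule


@dataclass
class Trace:
    prompt: str
    output: str
    label: str = ""  # filled in by a human reviewing the failure


def run_evals(system: Callable[[str], str],
              cases: List[EvalCase]) -> Tuple[float, List[Trace]]:
    """Run every case and return the pass rate plus the failing traces."""
    if not cases:
        raise ValueError("need at least one eval case")
    failures: List[Trace] = []
    for case in cases:
        output = system(case.prompt)
        if not case.check(output):
            failures.append(Trace(case.prompt, output))
    pass_rate = (len(cases) - len(failures)) / len(cases)
    return pass_rate, failures


def toy_agent(prompt: str) -> str:
    """Hypothetical stand-in for the product being evaluated."""
    if "due" in prompt.lower():
        return "Your payment is due on the 1st."
    return "I am not sure."


cases = [
    EvalCase("When is my payment due?", lambda out: "1st" in out),
    EvalCase("What is my payoff amount?", lambda out: "$" in out),
]

pass_rate, failures = run_evals(toy_agent, cases)
print(pass_rate)            # 0.5
print(failures[0].prompt)   # What is my payoff amount?

# The hand-labeling step: a person records why the trace failed.
failures[0].label = "agent declined instead of quoting an amount"
```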
- One person at a terminal can do the work of roughly 500 to 1000 people, and the models themselves have not caught up to this.
- A six-person team can now reach $10M in revenue, with revenue per employee around $1-2M, roughly 10x the public-company norm.
- Lines of code is a gameable metric; the true test is test coverage, whether it works for customers, and whether they pay.
- A skill is a markdown runbook that can call code; a resolver loads instructions on demand to keep context small.
- Skillify captures a one-off task as a reusable skill plus its tests, evals, and triggers, so it runs reliably next time.
- Run the company as a closed loop: an agent with read access to every artifact suggests what to work on next and self-heals the system.
- Generic benchmarks do not prove product quality; you need domain evals and a human labeling failure traces to preserve taste.
- The wedge play is going undercover in a painful workflow as a forward-deployed engineer, then automating the messy real work.
In their words
“Your generation is going to create the cognitive layer for all of society.”
“You sitting in front of one of these terminals can do the work of about 500 to 1,000 people.”
“Shipping code is going to zero, the cost of it. But what is not going to zero is the taste to build something good.”
Terms to know
- SAFE: Simple Agreement for Future Equity, YC's two-page funding document that became the seed-stage standard, used as the analogy for what code and markdown will standardize next.
- Skill: A markdown runbook of steps an agent follows to do a task, which can also invoke deterministic code.
- Resolver: A master index that tells the agent which instruction file to load only when a task needs it, keeping the context window small.
- Skillify: Promoting a proven one-off agent task into a reusable skill complete with unit tests, LLM evals, integration tests, and a trigger check.
- Closed loop: A company run like a control system where an agent reads every artifact and feeds errors back tightly, versus lossy open-loop information trapped in people's heads.
- DRI: Directly Responsible Individual, the single owner of an outcome who orchestrates ICs and agents to reach a goal.
Garry Tan & Diana Hu at Stanford CS 153: Frontier Systems