Most advice about shipping AI features is written for companies with AI teams. This guide isn't. If you're a web developer who knows your stack and has an idea you want in production before the week ends, keep reading.
The secret: stop treating AI as a new discipline. It's a new API call. You've shipped features on top of API calls a hundred times. Apply the same reflexes.
Day 1 — pick a problem that's small enough
Rule of thumb: if you can describe the output in one sentence, it's the right size. If you need two, it's too big.
Good first features look like:
A one-click summariser for a blog post
Autofill one form field from the content of another
A "why does this user report matter" triage button for your admin dashboard
Bad first features look like:
An "AI customer support agent" — too big, too open-ended
A general-purpose assistant — you have no way to evaluate "general"
Anything that ends in "copilot" — you'll be shipping in Q4
Day 2 — choose a model
Default: Claude Sonnet via the Vercel AI Gateway. Setup friction is near zero, the quality is more than enough for a v1, and you can swap providers later with a one-line change.
If you want a deeper comparison, read our companion piece on picking your first LLM. For today: Sonnet. Move on.
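One way to keep that later swap to a single line is to hold the model ID in one place that every handler reads. This is only a sketch: `DEFAULT_MODEL_ID`, `resolveModelId`, and the `AI_MODEL` variable name are hypothetical and not something the SDK or the gateway defines.

```typescript
// Same ID string as the Day 3 handler in this post.
const DEFAULT_MODEL_ID = "anthropic/claude-sonnet-4.6";

// Resolve the model ID from an environment-like object.
// `AI_MODEL` is an assumed variable name; pick whatever fits your config.
function resolveModelId(env: Record<string, string | undefined>): string {
  const override = env.AI_MODEL?.trim();
  // An empty or missing override falls back to the default.
  return override ? override : DEFAULT_MODEL_ID;
}
```

Passing the environment in as an argument, rather than reading a global, keeps the function easy to test.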
Day 3 — wire it in
Install the AI SDK. Write the handler. Resist the urge to build abstractions on day one — you don't know what you'll need yet.
import { generateObject } from "ai";
import { z } from "zod";

export async function POST(req: Request) {
  const { text } = await req.json();

  // Ask for structured output that must match the schema below.
  const { object } = await generateObject({
    model: "anthropic/claude-sonnet-4.6",
    schema: z.object({
      summary: z.string(),
      tags: z.array(z.string()).max(5),
    }),
    prompt: `Summarise in one sentence and suggest up to five tags:\n\n${text}`,
  });

  // Return the validated object as JSON.
  return Response.json(object);
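}
That's it. Schema-validated output, one round trip. The SDK retries failed API calls automatically. If the model's output doesn't match your schema, generateObject throws an error instead of retrying, so give the handler an error path.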
Day 4 — add the UI
What "good" looks like on day four: three states. Loading, error (with a retry button), and the result. That's the whole UI.
Don't stream yet. Streaming is a UX optimisation for v2. For v1, a 4-second spinner is fine if the output is right. Users forgive waiting. They don't forgive wrong answers.
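The UI described above can be sketched as a small state machine. This is one possible shape, not a prescribed one: it models loading and error as states, treats retry as the event that moves error back to loading, and adds a success state for the rendered result. The names `UiState`, `UiEvent`, and `nextState` are illustrative.

```typescript
// Everything the component can be showing at any moment.
type UiState =
  | { status: "loading" }
  | { status: "error"; message: string }
  | { status: "success"; summary: string };

// Everything that can happen to it.
type UiEvent =
  | { type: "succeeded"; summary: string }
  | { type: "failed"; message: string }
  | { type: "retry" };

// Pure transition function: events that make no sense in the current state are ignored.
function nextState(state: UiState, event: UiEvent): UiState {
  switch (event.type) {
    case "succeeded":
      return state.status === "loading"
        ? { status: "success", summary: event.summary }
        : state;
    case "failed":
      return state.status === "loading"
        ? { status: "error", message: event.message }
        : state;
    case "retry":
      // Retry is only available from the error state.
      return state.status === "error" ? { status: "loading" } : state;
  }
}
```

Because the transition function is pure, it can sit behind whatever your framework uses for state (a reducer hook, a store, a plain variable) without changes.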
Day 5 — test with 10 real inputs
Open a spreadsheet. Column A: input. Column B: expected output (roughly). Column C: actual output. Column D: pass / fail.
Ten rows. If fewer than eight pass, iterate on the prompt before you ship. If eight or more pass, you're ready for day six. This is the cheapest, most useful evaluation step in the whole process and almost everyone skips it.
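If you would rather keep the spreadsheet in code, the same four columns fit in a small type. A minimal sketch, assuming you still judge pass or fail by hand; `EvalRow` and `scoreEval` are hypothetical names, and the default of eight passes mirrors the threshold above.

```typescript
// One spreadsheet row: columns A to D.
interface EvalRow {
  input: string;
  expected: string; // rough expectation, written before running the model
  actual: string;
  pass: boolean; // your own judgment, same as column D
}

// Count the passes and compare against the ship threshold.
function scoreEval(
  rows: EvalRow[],
  minPasses: number = 8,
): { passed: number; total: number; ready: boolean } {
  const passed = rows.filter((row) => row.pass).length;
  return { passed, total: rows.length, ready: passed >= minPasses };
}
```

Keeping the rows in the repo means you can rerun the same ten inputs after every prompt change.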
Day 6 — ship behind a flag
If you're on Vercel, use a Feature Flag. If you're not, a plain environment variable toggle is fine. The flag lets you ship to production without showing the feature to production.
Turn it on for yourself. Use it for a day. Turn it on for five friends. Watch them use it. Then turn it on for the world. Three rollout steps, one flag.
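The three rollout steps can hang off one flag value. The sketch below assumes the plain environment variable route rather than any particular flag service; the stage names, `FlagConfig`, `parseStage`, and `isFeatureEnabled` are all made up for illustration.

```typescript
// One flag, four values: off, then the three rollout steps.
type RolloutStage = "off" | "me" | "friends" | "everyone";

interface FlagConfig {
  stage: RolloutStage;
  ownerId: string; // your own user ID
  friendIds: string[]; // the five friends
}

// Parse the raw variable value. Unknown or missing values fall back to "off",
// so a typo never exposes the feature.
function parseStage(value: string | undefined): RolloutStage {
  return value === "me" || value === "friends" || value === "everyone"
    ? value
    : "off";
}

// Decide whether a given user sees the feature at the current stage.
function isFeatureEnabled(userId: string, config: FlagConfig): boolean {
  switch (config.stage) {
    case "off":
      return false;
    case "me":
      return userId === config.ownerId;
    case "friends":
      return (
        userId === config.ownerId || config.friendIds.indexOf(userId) !== -1
      );
    case "everyone":
      return true;
  }
}
```

Moving from one step to the next is then a change to a single value, with no code deploy in between.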
Day 7 — watch it fail
Log four things, always:
The full prompt you sent
The full response you got back
What the user did next (accepted, edited, regenerated, abandoned)
Token counts and cost
Log Drains, a boring Postgres table, whatever. The first 48 hours of real traffic will surface every weird case your 10-row test set missed. That's the point.
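Whatever the destination, the four items above fit in one record per call. A rough sketch of that record and the cost arithmetic: the field names, `Pricing`, `estimateCostUsd`, and `buildLogEntry` are assumptions, and the prices you pass in must come from your provider's current rate card.

```typescript
type UserAction = "accepted" | "edited" | "regenerated" | "abandoned";

// One row per model call.
interface AiCallLog {
  prompt: string; // the full prompt you sent
  response: string; // the full response you got back
  userAction: UserAction | null; // null until the user does something
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

// Prices in USD per million tokens.
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Cost = tokens times per-token price, summed over input and output.
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  pricing: Pricing,
): number {
  return (
    (inputTokens * pricing.inputPerMTok +
      outputTokens * pricing.outputPerMTok) /
    1e6
  );
}

// Build the record at response time; fill in userAction later.
function buildLogEntry(
  prompt: string,
  response: string,
  usage: { inputTokens: number; outputTokens: number },
  pricing: Pricing,
): AiCallLog {
  return {
    prompt,
    response,
    userAction: null,
    inputTokens: usage.inputTokens,
    outputTokens: usage.outputTokens,
    costUsd: estimateCostUsd(usage.inputTokens, usage.outputTokens, pricing),
  };
}
```

Recording the user's next action as a separate, later update keeps the write on the request path small.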
What you didn't do
You didn't build an agent. You didn't fine-tune anything. You didn't pick a vector database. You didn't set up a multi-step tool-using orchestration layer.
That's deliberate. Those are all things you add when a real user problem forces you to add them. Until then, they're procrastination with extra steps.
Ship the small thing. Watch it work. Then earn the right to build the bigger thing.