Building with AI: What Actually Works (and what broke)

Lessons from three months building Sunday Planning with AI-led development. 

In late 2025, I started building Sunday Planning — a travel planning app — with an unusual constraint: AI wouldn’t just autocomplete code. It would act as a development partner.

Two months later, after thousands of commits, we’re shipping a beta with list creation, smart place discovery, group voting, public sharing, and a 973-test suite. Some sessions felt like superpowers. Others were expensive failures.

Here’s what actually worked — and what didn’t — when building a real product with AI.


The Biggest Unlock: Clear Ownership

What surprised me most is how much product rigor matters when you remove the engineering buffer. In a traditional team, some ambiguity gets resolved during implementation. When building with AI, ambiguity compounds. If a workflow, rule, or edge case isn’t defined, the system will still be built — just not the way you intended. Preparation isn’t overhead anymore — it’s leverage.

The most important improvement wasn’t technical — it was defining ownership.

I own: vision, features, UX, priorities
AI owns: quality, security, performance, testing

That simple framing changed how we worked. I stopped asking whether we should write tests or validate edge cases — those became default expectations. I focused on what we’re building and why. The AI focused on how to build it safely and correctly.

Reality check: AI is not truly accountable. You still have to think about everything, verify everything, and ask the right questions. But when you communicate ownership expectations clearly, the AI tends to work toward them. The collaboration becomes more structured and far less mentally exhausting.


What Worked: Describing Intent, Not Implementation

One surprising pattern: the more precise I got about implementation details, the worse the results.

When I gave pixel-level UI instructions, we got stuck in revision loops and mediocre outcomes. When I described intent and user feeling instead, results improved dramatically.

Better prompts sounded like:

  • “This feels cluttered — content should dominate actions.”

  • “Users might not realize this is clickable.”

  • “The loading state feels abrupt.”

Worse prompts sounded like:

  • “Move this button 12px right and change radius to 8px.”

AI is far better at solving problems than following micro-instructions. Describe the outcome and constraints — not the tweak.

This mirrors how good product specs work with strong engineering teams: define outcomes and constraints clearly, don’t prescribe every line of execution. AI responds to product intent the same way good engineers do — often with better solutions than the one you had in mind.


What Worked: Asking for Big Performance Wins

AI is excellent at tedious, high-leverage engineering work — especially optimization and refactoring.

One example: creating a list with many places originally took ~40 seconds. I flagged it as unacceptable and asked for aggressive optimization. The AI redesigned the flow with batched DB operations and parallel enrichment calls. Result: 2–3 seconds. Roughly a 15× improvement.

That kind of refactor would have taken me days of experimentation. AI will happily try ten structural approaches if you give it permission and a target.
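Directionally, the change looked like the sketch below: a per-place write-then-enrich loop becomes one batched write plus parallel enrichment. This is a minimal TypeScript sketch with illustrative names (`insertPlaces`, `enrichPlace`), not the actual Sunday Planning code.

```typescript
interface Place { id: string; name: string }

// Stubs standing in for the real database layer and enrichment API.
const db = {
  async insertPlace(listId: string, place: Place): Promise<void> { /* one INSERT per call */ },
  async insertPlaces(listId: string, places: Place[]): Promise<void> { /* single batched INSERT */ },
};
async function enrichPlace(placeId: string): Promise<void> { /* external enrichment call */ }

// Before: one DB round trip and one enrichment call per place, all sequential.
async function createListSlow(listId: string, places: Place[]): Promise<void> {
  for (const place of places) {
    await db.insertPlace(listId, place);
    await enrichPlace(place.id);
  }
}

// After: a single batched insert, then enrichment fanned out in parallel.
async function createListFast(listId: string, places: Place[]): Promise<void> {
  await db.insertPlaces(listId, places);
  await Promise.all(places.map((p) => enrichPlace(p.id)));
}
```

The gain comes from paying per-place latency roughly once per batch instead of twice per place.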

Don’t ask for small gains. Ask for order-of-magnitude improvements.


What Failed: Untested Architecture Decisions

Our most expensive mistake came from trusting an architectural recommendation that hadn’t been tested.

AI proposed a modern routing setup that sounded correct — but conflicted with our caching configuration. A two-minute proof-of-concept would have revealed the issue. Instead, we burned time and tokens implementing something fundamentally incompatible.

Even worse, I had suggested an alternative approach that would have worked. It was dismissed as “legacy.” My instinct was right.

New rule: No architectural decisions without a working proof-of-concept.

Modern doesn’t mean compatible. Reasonable doesn’t mean verified.


A Hard Truth About Iteration Speed

Iteration is still the core loop — but iteration gets expensive when direction is loose. The more upfront thinking you do about flows, failure states, permissions, and data behavior, the cleaner your iteration cycles become.

What’s changed with AI is the economics of experimentation. Proof-of-concepts and technical prototypes are now faster and cheaper to produce than long debates or detailed speculative design. That shifts responsibility earlier in the process.

If a decision is structurally important or technically uncertain, build a small working version first. Test the constraint. Exercise the edge cases. Let reality answer before architecture solidifies.

AI speeds up building — it doesn’t remove the cost of unclear thinking. But it dramatically lowers the cost of early validation — and we should be using that advantage much more aggressively.


What Failed: Unfocused Debugging

In one debugging session, a route returned 404. The AI ran dozens of commands: logs, restarts, middleware checks, layout scans.

The real bug? A typo in a path string. I found it in 30 seconds by reading the file.
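For illustration only (the framework and route names here are hypothetical, not the actual bug), the class of mistake was this small:

```typescript
import express from "express";

const app = express();

// Route registered with a one-character typo in the path string...
app.get("/api/litsts/:id", (req, res) => {
  res.json({ id: req.params.id });
});

// ...while clients request the intended path and get a 404:
//   GET /api/lists/42  ->  404 Not Found

app.listen(3000);
```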

New rule: After several failed attempts, stop searching — read the failing code directly.

AI tends to expand the search space. Humans are better at narrowing it.


What Failed: Declaring Success Without Verification

The most trust-damaging behavior pattern was premature completion claims:

“Everything is working.”
It wasn’t.

Missing dependencies, broken imports, failing routes. The issue wasn’t malice — it was assumption. AI often infers success from plans, not outcomes.

Now the rule is simple: Nothing is done until it runs. Not compiled — run. Not reasoned — verified.

Trust improves immediately when verification becomes mandatory.


Teaching a Stateless AI Accountability

A deeper lesson emerged over time: conversational promises don’t change AI behavior. System structure does.

AI will follow hard constraints (type checks, schema validation, auth rules) but quietly skip workflow discipline (updating docs, linking issues, running review steps) unless enforced mechanically.

We discovered three reliable levers:

  1. Auto-loaded instruction files — always read, treated as authoritative

  2. Codified commands — step sequences, not suggestions

  3. Mechanical enforcement — hooks and checks that can’t be skipped

Not reminders — requirements.

We embedded workflow expectations into files and commands the AI must execute, instead of instructions it might remember. Behavior improved immediately — without adding new tools or infrastructure.
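As a concrete example of the third lever, a pre-commit gate can make the rules mechanical. This is a minimal sketch assuming a Node/TypeScript project; the file name, commands, and hook wiring are illustrative, not our actual setup.

```typescript
// precommit-gate.ts: run as a git pre-commit hook (e.g. via husky or .git/hooks/pre-commit).
// The commit is rejected unless the gates actually pass; nothing here relies on memory.
import { execSync } from "node:child_process";

const gates: Array<{ name: string; command: string }> = [
  { name: "type check", command: "npx tsc --noEmit" },
  { name: "tests", command: "npm test --silent" },
];

for (const gate of gates) {
  try {
    execSync(gate.command, { stdio: "inherit" });
  } catch {
    console.error(`Pre-commit gate failed: ${gate.name}`);
    process.exit(1); // a step that cannot be skipped or "forgotten"
  }
}
```

The same pattern extends to docs and issue links: if a rule matters, put it behind a check that can fail rather than a sentence the model might remember.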

The key shift: Don’t rely on AI discipline. Design your workflow so discipline is required.


Practical Rules for AI-Assisted Development

After three months, here’s the condensed playbook:

  1. Define ownership up front

  2. Describe problems, not solutions

  3. Demand proof-of-concepts for architecture

  4. Trust your instincts about your codebase

  5. Pivot quickly when something smells wrong

  6. Verify before declaring done

  7. Document failures — future sessions should learn from them

  8. Encode workflow rules into files and commands, not conversations


The Bigger Picture

AI-assisted product building isn’t about writing less code. It’s about removing the distance between product thinking and product creation.

When you don’t have an engineering team translating your specs into reality, every gray area you leave unresolved becomes rework later. Planning depth matters more. Edge cases matter more. System behavior matters more. The product discipline that used to live in handoffs now has to live in your head.

AI accelerates implementation — but that makes clarity, prioritization, and decision quality even more important.

The best sessions feel like working with a tireless implementation partner who can turn product intent into working systems quickly. The worst sessions feel like executing against a plan that sounded right but wasn’t validated — and discovering the gap late.

That difference rarely comes down to AI capability. It comes down to planning clarity and verification discipline.

Three months in, I can’t imagine going back to product work where I can’t directly shape and test ideas myself. Not because I want to replace engineering — but because early product exploration has never been this fast or this hands-on.

The difference now is speed.

Once the decisions are made, the building happens fast.



Next

Feature Speed, Strategy Bottlenecks: The Future of Building Products With AI Pt. 3