The Outcome Testing Manifesto
Or: How We Shipped a Full Marketing Site from WhatsApp at 3 AM Without Writing a Single Line of Code
The Idea
It started with Jeff.
"What if we showed businesses exactly how much their bank is stealing from them on FX? Like, upload your statement and we expose every hidden fee."
The idea was simple: make the invisible visible. Canadian banks hide 2.5-3% markups on every foreign exchange transaction. They don't disclose it. They don't itemize it. They just take it. And most businesses have no idea.
If we could build a tool that proves this — with their own data — we'd have the most powerful lead generation tool Loop has ever had. Every scan creates outrage. Every outrage creates a share. Every share creates a new scan.
The growth loop was obvious. The question was: how fast could we build it?
The Method: Outcome Testing
Here's what I didn't do:
- I didn't open an IDE
- I didn't write a project plan
- I didn't create Jira tickets
- I didn't set up a development environment
- I didn't write a single line of code
Here's what I did:
I described outcomes.
I opened WhatsApp at 11 PM, texted my AI agent (Ghost 👻 — powered by OpenClaw + Claude), and started talking about what I wanted to exist in the world.
This is what I call Outcome Testing: instead of building a product and then testing if it works, you describe the outcome you want and let AI figure out how to build it. Then you test if the outcome matches your vision. If not, you describe the delta. Repeat.
It's not prompt engineering. It's not vibe coding. It's product management at the speed of thought.
The Night: A Play-by-Play
11:00 PM — The Bug
I opened the existing bleed prototype and the charts were broken. Bars overflowing the page, bleeding through other sections.
I took a screenshot. Sent it to Ghost on WhatsApp.
"The chart is broken"
Ghost analyzed the screenshot, read the codebase, found the bug (chart scaling only accounted for bank costs, not Loop plan costs — when plan costs exceeded bank costs, bars rendered at 500% height), fixed it, committed, pushed, and deployed.
Time: 5 minutes. Lines of code I wrote: 0.
11:30 PM — The Vision
"What else can we do?"
Ghost came back with 15 ideas across growth, SEO, product, and quick wins. I picked the ones that mattered:
"Do it all"
Ghost spun up three sub-agents in parallel:
- Technical SEO (meta tags, sitemap, structured data)
- Content SEO (keyword research + 3 blog posts)
- Blog pages (build the /blog route)
All three ran simultaneously. I went to make tea.
12:00 AM — The Army
The first batch shipped. I wanted more.
"Actually!!!! Do it all so smart!"
Ghost spun up five agents in parallel:
- Bank-specific landing pages (/rbc, /td, /bmo, /scotiabank, /cibc)
- Quick wins (favicon, OG image, analytics, mobile fixes, loading states)
- 7 more blog posts
- PDF export of audit reports
- Shareable report links
Five agents. Working simultaneously. Each one reading the codebase, writing code, running builds, pushing to git. I'm on my phone in bed.
1:00 AM — The Feedback Loop
Things came back. I tested on my phone. Spotted issues:
"The floating button is hidden behind the banner" → [screenshot] "The nav menu doesn't work on mobile" → [screenshot]
Screenshots in. Fixes out. Deploy. Test. Repeat.
This is the core of outcome testing: you're not debugging code, you're debugging outcomes. "This doesn't look right" is a valid bug report when your agent can see what you see.
2:00 AM — Going Deeper
"What can we do to make it viral?"
Ghost laid out a viral loop analysis. Report cards like Spotify Wrapped. Pre-written share messages. Real-time scan counters. Social proof.
I said: "Nah let's stop for now. We have something working."
Knowing when to stop is part of the method. Outcome testing isn't about building everything — it's about building the right thing and shipping it.
3:00 AM — The Pitch
"Can you do a presentation? It's my project for the hackathon. I want to get it provocative and winning!"
14-slide pitch deck. Dark green on black. Punchy. Aggressive. The narrative: outrage is the growth engine.
"Slides should be responsive for mobile. Maybe it's missing one slide on the technology we used, from me talking to you on WhatsApp to OpenClaw, Railway, Mistral…"
Updated. Deployed. Tested on my phone.
3:30 AM — Jeff's Idea, My Build
"Put the idea from Jeff — Yan (me) tech implementation"
Credits added. Title slide: 💡 Idea: Jeff · ⚡ Built by: Yan
4:30 AM — Done
Site live. Pitch deck live. Blog live. Bank pages live. SEO live. PDF export. Share links. Calculator. Mobile responsive.
I went to sleep.
The Morning: Feedback at the Speed of Trust
Jeff tested the site. Sent a wall of feedback via Slack. Screenshots, suggestions, bugs, UX issues.
I forwarded everything to Ghost:
"Here's Jeff's feedback" → [screenshot] → [screenshot] → [screenshot]
Ghost catalogued 14 issues. I said:
"Fix it"
Three more sub-agents. Parallel again:
- Tone + disclaimers + link fixes
- Calculator redesign + Compare Banks fix
- Prompt accuracy + math improvements
All shipped within 15 minutes. Jeff's entire feedback thread — addressed, committed, deployed.
What Outcome Testing Actually Is
Traditional development:
Idea → Spec → Design → Code → Test → Deploy → Feedback → Repeat
Outcome testing:
Describe outcome → AI builds → Test outcome → Describe delta → Repeat
The difference isn't just speed. It's altitude. You're operating at the product level, not the code level. You never drop down to implementation details unless you want to.
When I said "the chart is broken," I didn't need to know it was a CSS overflow issue caused by percentage-based height calculations. I just needed to show what was wrong and describe what right looks like.
When I said "add SEO," I didn't need to know about meta tags, JSON-LD, or sitemap protocols. I needed to describe the outcome: "people Googling 'RBC FX fees' should find us."
When I said "make it work on mobile," I didn't need to specify breakpoints. I sent a screenshot of what was wrong.
The product manager becomes the entire team.
The Stack
- 💬 WhatsApp — My phone. My interface.
- 👻 OpenClaw — AI orchestrator. Connects me to agents.
- 🤖 Claude (Anthropic) — Writes code, content, everything.
- 👁️ Mistral — OCR + analysis. Reads bank statements.
- ⚡ Next.js 16 — React framework. Frontend.
- 🎨 Tailwind — CSS. Styling.
- 🚂 Railway — Hosting. Deploy.
The Numbers
- ⏱️ Total time: ~5.5 hours
- ⌨️ Lines of code written by me: 0
- 🚀 Features shipped: 16 major features
- 📝 Blog posts written: 10
- 🏦 Landing pages created: 6 (home + 5 banks)
- 🔄 Deploys: 15+
- 🤖 Sub-agents spawned: 12+
- 📸 Screenshots used as bug reports: 6
- 💬 WhatsApp messages sent: ~60
- 🍵 Cups of tea: 2
What I Actually Did vs What AI Did
Me (the human):
- Had the vision (with Jeff)
- Chose what to build and what to skip
- Prioritized ruthlessly
- Tested on my phone
- Forwarded feedback
- Said "deploy" a lot
- Knew when to stop
Ghost (AI agent):
- Read and understood the entire codebase
- Diagnosed bugs from screenshots
- Wrote all code
- Wrote all content
- Led teams of sub-agents
- Managed git workflow
- Deployed to production
- Built the pitch deck
The human provides taste, judgment, and direction. The AI provides execution at scale.
How to Do This Yourself
- Get OpenClaw — openclaw.ai. Connect WhatsApp, Telegram, or Discord.
- Give your agent a workspace — a repo it can read, write, and push to.
- Give it deploy access — Railway CLI, Vercel, whatever you use.
- Start with outcomes, not tasks. Say "the mobile layout is broken" not "change the padding on line 47."
- Use screenshots. Your agent can see images. A screenshot is worth a thousand words.
- Let it lead. Say "add SEO" and let the agent figure out what that means. It knows more about meta tags than you do.
- Spawn sub-agents for parallel work. Don't do one thing at a time. Describe 5 things and let 5 agents work simultaneously.
- Know when to ship. Perfect is the enemy of live.
The Point
We're entering an era where the bottleneck isn't engineering capacity. It's knowing what to build.
Jeff had the idea. I had the tools and the taste. Ghost had the execution power of a 10-person team.
In 5.5 hours, from a phone, in bed, we built something that would have taken a team weeks.
This isn't the future. This is now.
Stop writing code. Start describing outcomes.
Built by Yan Matagne with Ghost 👻 (OpenClaw + Claude) at 3 AM on a Tuesday. Idea by Jeff. Shipped before sunrise.
See what your bank is really charging you
Upload your bank statement and get a free, instant breakdown of your hidden FX fees.
Try the Free Audit Tool →