Idea Validation Playbooks
How founders pressure-test an idea before building it — the demand signals worth chasing, the cheap experiments that surface real intent, and the traps that waste months. Every tactic below is quoted directly from a founder podcast and linked to its source.
245 tactics · page 6 of 9
“I've also right away like used uh for example a usability hub to run 5-second tests with two messaging and ask people like what their preference is and why and like what stood out that's also really really good in refining it because sometimes you use a word that just triggers people the wrong way or people don't understand what you mean.”
5-Second Tests Catch Words That Confuse or Repel Before They Live in the Product
Even copy that performs in ads can contain words that create friction — a term that means something different to the user, or that triggers an unexpected negative reaction. Daphne uses UsabilityHub 5-second tests to get qualitative feedback on competing message options: what stood out, what was confusing, what made someone trust or distrust the product. It's cheap insurance before the message lives in the onboarding.
“If you're a developer and you want to make a go at being an indie developer and making a good living on the app store instead go the opposite direction go really really niche but in a niche that people are willing to spend money and that people find valuable.”
Go Really Really Niche — Find an Audience That Happily Pays
The winning strategy for indie apps is extreme niche focus: not just a small niche, but one where spending is already normalized. Hobbyists, professionals, and enthusiasts who spend thousands on their interest in the physical world will pay $10-50/year for a great app in their domain. The goal is to find where people already spend freely, then build the obvious app for that community.
“Fishing hobby people spend thousands tens of thousands of dollars on boats and trips and other stuff so if you're a solo developer looking to make a living right now go find a niche one where people are gonna find it valuable enough to spend money.”
Hobby Niches Where People Already Spend Thousands Are Gold for Indie App Revenue
The fishing analogy is instructive: people who spend $30K on a boat will not blink at a $50/year app subscription. The same applies to golf, hunting, woodworking, homebrewing, sailing, and dozens of other hobbies with dedicated spending. Evaluating a niche means looking not just at user count but at existing offline spend — a community that spends big offline will spend proportionally on a great digital tool.
“Go find a niche one where people are gonna find it valuable enough to spend money and then two where there is some form where you understand how you're going to get attention for it.”
Two Filters for a Viable Solo App Niche: Pays Money + Has Built-In Attention
The two-filter test for evaluating an indie app niche: first, do people in this community already pay for things they value? Second, is there an existing channel or community through which you can reach them without paid advertising? Both filters must pass. A passionate community that does not pay is a content project. A paying audience with no accessible channel is a distribution puzzle you may never solve.
“We spend a lot of time looking at our data and pulling out specifically people that downloaded the app and then churned or took one class and then churned and we spend a lot of time talking to those people as well.”
Interview Churned Users Specifically — They Tell You What Kept Them From Converting
Most product teams over-interview power users and get survivorship-biased feedback. Zumba deliberately pulled churned segments from their data (people who downloaded the app and then left, or took one class and then left) and ran structured conversations with them. The result: they discovered a major churn driver was users preferring in-person classes, which shaped the app's class locator feature.
“i realized like okay they're just kind of you know i i say scam but it's not really scam but all these apps that are built quickly there must be some demand around it and so that's uh i started with the background remover idea”
Spot Demand From Bad Incumbents Then Ship an MVP in Two Weeks
Matthieu noticed the top photo apps in background-removal were low-quality cash-grabs — not signs of a dead market, but proof of unmet demand. He used 10 years of app development experience to ship a working prototype in two weeks with on-device ML. The insight: bad incumbents with real downloads validate the category far more reliably than surveys.
“the reason we launched after two weeks with revenue cat was like the feedback from pros is so much more valuable than the feedback from free users”
Paying Users Give 10x Better Feedback Than Free Users — Add a Paywall Early
PhotoRoom added a paid subscription within two weeks of launch, not primarily for revenue, but because feedback from paying users is far more valuable than feedback from free users. Free users have low stakes and low engagement; pro users who pay monthly have real jobs to be done and give actionable, specific feedback. The paywall is as much a research filter as it is a monetization mechanism.
“When we sort of pivoted into the burner standalone app… it was like a 10x to 100x level of response from even just friends we were showing this to… started to feel pull from the market instead of having to push yourself into the market.”
Stop Pushing — Wait for the Market to Pull
The shift from Wrangle to Burner happened because potential users stopped saying "my sister would like this" and started saying "I want this." Greg treats that reaction gap as the earliest reliable signal of product-market fit — 10–100x stronger response with essentially the same audience.
“I would back into that by thinking a little bit about retention… well what problem does it solve? The learning with Wrangle ultimately was it didn't solve a real problem… when we pivoted into the burner standalone app we were testing it by selling tickets on Craigslist and it quickly validated the problem.”
Use Retention to Tell Product Problem from Distribution Problem
Greg's framework for deciding whether to pivot starts with retention: if people keep coming back, the product solves a real problem and what remains is a distribution problem you can solve. If they don't, it's a product problem no amount of marketing can fix. The team tested Burner by using it to sell tickets on Craigslist, which quickly validated that the problem was real, not created.
“For an app like that to work you need a network right — it's sort of an empty restaurant problem, classic for social. We were basically trying to do a Web 2.0 idea on top of the phone.”
The Empty-Restaurant Problem Kills Social Apps Before They Start
Wrangle required simultaneous availability of two contacts to work — a classic two-sided coordination problem that makes early growth nearly impossible without a large existing user base or built-in viral loop. Recognising this pattern early is a key pivot signal: if the core value depends on a network that doesn't exist yet, you likely have a distribution problem too big to solve cheaply.
“The only way to know if it's a must-have is to get it in the hands of potential customers and ask them how they would feel if they could no longer use it. If they wouldn't care then your product's not a must-have.”
The "Very Disappointed" Survey Is the Only Reliable Must-Have Signal
Sean Ellis coined the PMF survey question used by hundreds of companies: how would you feel if you could no longer use this product? If more than 40% of users say they'd be 'very disappointed', you have enough signal to build a growth engine. Below that, you're still iterating toward must-have. The threshold isn't magic; it's the minimum observed level where sustainable growth is consistently achievable.
“we also are doing smoke testing so basically showing users burner plus and seeing if they're interested in purchasing before they can actually get it and so then we measure... actually purchase and then you give them the womp paywall like... sorry not ready yet but thanks for your interest coming soon”
Use smoke tests to validate new features before a single line of production code is written
Before building Burner Plus, Musetti surfaced a real purchase CTA to existing users and tracked click-through as a demand signal — anyone who tapped got a 'coming soon' modal. This costs almost nothing and generates real willingness-to-pay data. It is the fastest way to validate whether a premium tier will actually sell.
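A minimal sketch of how the result of such a smoke test might be read. The counts below are hypothetical, and the Wilson score interval is an addition for illustration, not something the source says Musetti used; it simply shows how much uncertainty sits around a tap-through rate from a small sample.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical numbers: users shown the offer, and users who tapped "purchase"
# before seeing the "coming soon" message.
shown, tapped = 1200, 84
rate = tapped / shown
low, high = wilson_interval(tapped, shown)
print(f"tap-through: {rate:.1%} (95% interval {low:.1%} to {high:.1%})")
```

A tap on a purchase button is still weaker evidence than a completed payment, so the rate is best treated as an upper bound on real conversion.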
“i used two pricing models to do that the Gabor-Granger model and the Van Westendorp model and so these are two different survey models that basically ask questions about pricing... are you interested in buying burner plus with xyz features for this price if yes survey done if no what about for this price ten dollars cheaper”
Van Westendorp and Gabor-Granger surveys quantify willingness-to-pay before any live test
Musetti used Gabor-Granger and Van Westendorp surveys on existing users before running any live paywall test for Burner Plus. Gabor-Granger traces how stated purchase intent falls as the price rises, which points to a revenue-maximizing price; Van Westendorp asks where a price starts to feel too cheap or too expensive, which reveals the psychologically comfortable price band. Running both surveys first shortens the live experiment window and reduces the risk of pricing wrong at launch.
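The Gabor-Granger ladder described in the quote (offer a price, and if the answer is no, offer a lower one) can be analysed with a few lines of code. This is a sketch under assumptions: the price points and responses are invented, and each respondent is reduced to the highest price they accepted (or `None` if they declined every price). Van Westendorp analysis is not shown.

```python
def gabor_granger(max_accepted, price_points):
    """Return (price, share_willing_to_buy, expected_revenue_per_respondent) rows.

    max_accepted: highest price each respondent said yes to, or None.
    A respondent counts as a buyer at every price at or below that maximum.
    """
    if not max_accepted:
        raise ValueError("no responses")
    n = len(max_accepted)
    curve = []
    for price in sorted(price_points):
        buyers = sum(1 for m in max_accepted if m is not None and m >= price)
        share = buyers / n
        curve.append((price, share, price * share))
    return curve

def revenue_maximising_price(curve):
    """Price point with the highest expected revenue per respondent."""
    return max(curve, key=lambda row: row[2])[0]

# Hypothetical survey: annual prices in dollars, ten respondents.
price_points = [20, 30, 40, 50]
max_accepted = [50, 50, 40, 40, 40, 30, 30, 20, 20, None]

curve = gabor_granger(max_accepted, price_points)
for price, share, revenue in curve:
    print(f"${price}: {share:.0%} would buy, ${revenue:.2f} per respondent")
print("revenue-maximising price:", revenue_maximising_price(curve))
```

Stated intent tends to overstate real purchases, which is why the source treats these surveys as a way to narrow the range before a live paywall test rather than as a replacement for one.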
“one of them is incrementality so you want to switch ads on switch ads off and see what's happening to your baseline which is always of course a good position for smaller developers if you don't have a ton of organic yet”
Use incrementality testing — switch ads on and off to isolate channel impact without a data science team
Before building data science infrastructure, Burke validates channel ROI using a simple incrementality check: pause spend for a period and observe baseline installs and trials. If they drop materially, the channel was contributing. This works especially well for early-stage apps with limited organic that can cleanly isolate the ad effect.
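The on/off check can be reduced to simple arithmetic. The sketch below assumes daily install counts for one period with ads running and one with ads paused; all figures and function names are hypothetical. It ignores seasonality and day-of-week effects, so it is only a rough first read.

```python
from statistics import mean

def incremental_lift(on_days, off_days):
    """Compare average daily installs with ads on versus ads off."""
    on_avg = mean(on_days)
    off_avg = mean(off_days)
    incremental = on_avg - off_avg
    return {
        "baseline_per_day": off_avg,          # installs that arrive without ads
        "incremental_per_day": incremental,   # installs attributable to ads
        "incremental_share": incremental / on_avg if on_avg else 0.0,
    }

def cost_per_incremental_install(daily_spend, on_days, off_days):
    """Daily spend divided by incremental installs; None if ads added nothing."""
    incremental = mean(on_days) - mean(off_days)
    if incremental <= 0:
        return None
    return daily_spend / incremental

# Hypothetical week with ads on, then a week with ads paused.
ads_on = [120, 130, 125, 135, 140, 128, 132]
ads_off = [80, 85, 78, 82, 90, 84, 75]

result = incremental_lift(ads_on, ads_off)
cpi = cost_per_incremental_install(240, ads_on, ads_off)
print(result)
print(f"cost per incremental install: ${cpi:.2f}")
```

Dividing spend by incremental installs rather than by all installs is the point of the exercise: the baseline would have arrived anyway.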
“The most important stuff was the qualitative in the early days. You need qualitative at scale — it's not just your friend, it's 50,000 MAUs at that time — and then people were actually able to record music on their phone and share it.”
Early PMF signal is qualitative at scale — not a single metric
Seth Miller's first sign of product-market fit was searching his own app's name on Twitter and finding strangers sharing tracks unsolicited. App Store reviews piled on. He had no formal retention dashboard — just unmistakable organic signals from people who genuinely loved it. His caution: qualitative enthusiasm only counts when it comes from a large enough pool of non-affiliated users, not friends and family.
“If you didn't have dyslexia you're like see a guy see a buddy. But if you had dyslexia you try five times because you're not gonna read the book by yourself and after the fifth time it works. To this day 15% of the reviews on the App Store are people who say they cried when they downloaded the app.”
Niche Audience as a Shoddy-Product Buffer — Acute Need Forgives Early Bugs
Cliff Weitzman shipped a buggy early version of Speechify that crashed frequently, but dyslexic users retried five times because there was no alternative for them. That acute, unmet need gave him time to improve the product without losing his early cohort. The lesson: the more specific the pain you solve, the more patience your users extend — a forgiving wedge that broad-market apps can't replicate.
“The main metric of success for us has been day-8 return on ad spend. What we try to figure out is not only are people willing to use an app but are they willing to pay for it — and then are we able to acquire customers in a way that makes sense economically with just immediate cash payback.”
Day-8 ROAS as Single Metric Took Opal from Zero to $5M ARR in One Year
Opal scaled from near-zero to $5M ARR in roughly 12 months by anchoring every product and marketing decision to a single metric: day-8 ROAS. The 7-day free trial makes this measurement tight and actionable. When above 100%, they scaled ad spend; when below, they cut back. No forecasting models, no complex attribution — just one number everyone understood.
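As a rough illustration of the rule, day-8 ROAS for an install cohort is the revenue collected within eight days of install divided by the ad spend that acquired the cohort. The sketch below uses hypothetical figures and names; treating exactly 100% as "hold" is an assumption, since the source only describes the above and below cases.

```python
def day8_roas(ad_spend, revenue_events):
    """Revenue within 8 days of install divided by cohort ad spend.

    revenue_events: list of (days_since_install, amount) for one cohort.
    """
    if ad_spend <= 0:
        raise ValueError("ad_spend must be positive")
    revenue = sum(amount for day, amount in revenue_events if day <= 8)
    return revenue / ad_spend

def spend_decision(roas, threshold=1.0):
    """Scale above the threshold, cut back below it (equal: hold, an assumption)."""
    if roas > threshold:
        return "scale"
    if roas < threshold:
        return "cut back"
    return "hold"

# Hypothetical cohort: $1,000 of spend; payments land as trials convert.
events = [(7, 600.0), (8, 500.0), (30, 400.0)]  # the day-30 payment is excluded
roas = day8_roas(1000.0, events)
print(f"day-8 ROAS: {roas:.0%} -> {spend_decision(roas)}")
```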
“At a prior publication, when asked what their audience was — who are your readers — they would have strong opinions based on the assumptions of the executives. But when you went back and said 'what is this based on?' — it's often no, it's just 'that's what I built it for originally and I'm assuming that's who my audience is because no one said otherwise.'”
Your Actual Audience Will Surprise You — Validate Assumptions With Data Early
Every founder has a mental model of who their user is — and it's almost always wrong in at least one material way. Taylor Wells saw this pattern repeatedly: executives defending audience assumptions that had never been validated by data. At Disney+, over 50% of viewers were single adults on a platform assumed to be for families. Building the habit of asking 'what is this based on?' early prevents six months of product decisions aimed at the wrong person.
“That moment when the product is growing despite all the problems — that's usually the definition that there's something there.”
Product growing despite problems is the clearest PMF signal
Talking Parents was held together with duct tape on a server in someone's house and still acquired ~1,000 users through word of mouth alone. The insight: organic growth through dysfunction signals genuine demand. Don't wait for a polished product before treating traction as meaningful validation.
“I bought an app for two thousand dollars, fixed it up, and ended up selling it for a pretty good amount of money — like 10x my money — and realized I could do this.”
Buying a $2K app and 10x-ing it proved the roll-up model before pitching investors
Before pitching a single VC, Michael Ritter validated the Maple Media concept with his own money. A $2K roadside-game app he acquired and monetised became the proof-of-concept that eventually raised $1M from angel investors and attracted a private equity backer. Cheap first experiments compress the hypothesis-test cycle for any acquisition-driven strategy.
“I took this map I hosted it on a web page and added it to the home screen on a couple iPads so it looks like it was an app but it was just totally a dynamic map on a web page and went to this show and was like oh right you got to have a company name right.”
Validate With a Fake App Before Writing a Line of Code
Adam spent under $1,000 on a boat-show booth and showed a web-hosted map on iPads dressed up to look like an app. People tried to find it in the App Store, asked for their emails to be added to a beta list, and visibly lit up — giving him the confidence to quit his engineering job. The lesson: $1,000 to validate beats months of building and five-figure contractor bills before you get a single signal.
“that's great that's one thing would people open up their wallets and give a credit card number was sort of the next thing that I ultimately wanted to validate.”
Sequential Validation: Excitement First, Then Prove People Will Actually Pay
Seeing people light up at a boat show was step one. Adam's step two was proving willingness to pay — not raising investment, not hiring a team. Each stage of validation is its own gate. Too many founders treat excitement as a proxy for business viability. Excitement is cheap; a credit card number is not.
“Shielding yourself from the inevitable question of whether somebody will pay you is very harmful to a founder... you can make a lot of metrics go up before you get to that one not being able to.”
Validate Willingness to Pay Early — Don't Kick That Can Down the Road
Phil distinguishes two types of startups: those with clear precedent that users will eventually pay (where building to scale first makes sense) and those inventing something genuinely new (where deferring monetization is existential risk). For the second type, every growth metric is a vanity metric until you know someone will actually pay. He advises founders to tackle willingness-to-pay early, because discovering it won't work at year three is far more damaging than finding out in month three.
“the further you are away from the metrics you want you should proportionately increase the radicalness of the testing and experimentation you try...if you have total trash like you're not losing anything just try something nuts”
When Metrics Are Far Off Benchmark, Increase The Radicalism Of Your Experiments
Jacob's calibration rule for founders: the scale of your experimentation should be proportional to how far you are from your targets. If conversion is in the ballpark, run conservative A/B tests. If it's terrible, incremental tweaks will not move the number enough to matter, so try a completely different paywall, a different business model, or a different app entirely. The time cost of trying something radical when things are broken is far lower than the time cost of grinding through marginal improvements.
“I ended up posting it in like the Apple subreddit I think and it really took off in a big way and people were really interested in beta testing it so kind of got to that point where I was like oh like this might have some legs.”
Post the beta in the community you are building for — that is your validation
Apollo's early traction came from a single post in a community Christian already lived in. The feedback was immediate, the beta filled fast, and the demand signal was real — not manufactured. Building for a community you already participate in shortens every loop: from idea validation to beta recruitment to launch distribution.
“Reddit was really big at the time... there wasn't really any like really iOS first app that really I loved so I was like you know screw it this is this could be a fun project.”
Scratch your own itch in a community you love — that is where great indie apps start
Apollo began not with a market research exercise but with a genuine gap Christian felt every time he opened Reddit on his iPhone. The best indie apps are built by people annoyed that the tool they want doesn't exist yet — because they will use the product obsessively, notice every flaw, and keep improving it long after a mercenary builder would have moved on.
“The way I think of benchmarks is they're one tool and a very big toolkit of things that Founders CEOs Indie developers product and marketing leaders can use to get a directional high level view of where they're performing better or worse than other peers in their set.”
Benchmarks are directional, not definitive — use them to filter ideas, not make decisions
Benchmarks can surface where a company is over- or underperforming relative to peers, but they cannot tell you why or what to do about it. Carter's framework: use benchmarks to eliminate wrong priorities and focus engineering and marketing bandwidth on the right problems — but always pair data with product intuition. The danger is treating them as a decision engine rather than an input to judgment.
“We ran a subscription survey that confirmed a lot of the same stuff that we had seen in the subscription value calculator but got into a lot more detail on the specifics of what needed to change in order to shore up some of those metrics.”
Subscription surveys complement benchmarks — quant shows where, qual shows why
For an edtech client, the subscription value calculator flagged low conversion, high price, and poor paid ad efficiency. A follow-up subscription survey confirmed the same diagnoses and revealed the specific onboarding and paywall changes needed. Carter's methodology: benchmarks first to triage across the loop, qualitative survey second to prescribe the precise intervention. Neither tool alone is sufficient.
“The first problem is can I actually trust this data... the second order problem is okay even if you're able to find a reliable source of Benchmark data often times it's too generic... the third order problem is okay even if your benchmarks are accurate and specific enough to be actionable they're only a jumping off point.”
Three problems with bad benchmarks: accuracy, specificity, and actionability — in that order
Carter sequences the three ways benchmarks fail: they're inaccurate (most publicly available numbers), they're too generic (global averages across all categories), or they're accurate and specific but still just a starting point that can't prescribe actions. The Subscription Value Loop calculator attempts to solve problems one and two by drawing on data from 30,000+ apps with category and tier filters, but problem three always requires founder judgment.
“Sean Ellis developed a simple test: if you were no longer able to use this product today, how disappointed would you be — very disappointed, somewhat, or not? The golden benchmark is 40%. Like Superhuman did, you can segment that data to find the nucleus of your highest-intent users and use it to build a product roadmap that moves you over the line.”
Use the 40% disappointment test to quantify product-market fit — not just 'feel it when you see it'
The binary 'you'll know it when you see it' PMF definition is not actionable. The 40% threshold gives a directional score that can be tracked over time and segmented by user type. Superhuman ran this test iteratively — found they were below 40%, identified which segments were closest, talked to those users, built toward their needs, and systematically crossed the threshold.
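Scoring the survey and segmenting it takes very little code. This is a minimal sketch with invented responses and segment names; it is not Superhuman's actual tooling. Counting a score of exactly 40% as passing is an assumption.

```python
from collections import defaultdict

VERY = "very disappointed"

def pmf_score(answers):
    """Share of respondents who answered 'very disappointed' (0.0 to 1.0)."""
    if not answers:
        raise ValueError("no answers")
    hits = sum(1 for a in answers if a.strip().lower() == VERY)
    return hits / len(answers)

def pmf_by_segment(rows):
    """rows: iterable of (segment, answer). Returns {segment: score}."""
    grouped = defaultdict(list)
    for segment, answer in rows:
        grouped[segment].append(answer)
    return {segment: pmf_score(answers) for segment, answers in grouped.items()}

# Hypothetical responses from two user segments.
rows = (
    [("founders", "very disappointed")] * 6
    + [("founders", "somewhat disappointed")] * 4
    + [("students", "very disappointed")] * 1
    + [("students", "not disappointed")] * 9
)

overall = pmf_score([answer for _, answer in rows])
by_segment = pmf_by_segment(rows)
print(f"overall: {overall:.0%}")
for segment, score in sorted(by_segment.items(), key=lambda kv: -kv[1]):
    status = "at or above 40%" if score >= 0.40 else "below 40%"
    print(f"{segment}: {score:.0%} ({status})")
```

In this made-up data the overall score is below the threshold while one segment is well above it, which is the pattern the Superhuman example describes: the high-scoring segment shows whose needs to build toward.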