Documenting how 534 SaaS launch checklist items were derived from 80 service analyses, 12 guidelines, and 5 real-world failures, including the P0-P3 priority logic.
Background: Launch Delays Happen Outside the Code
When building a SaaS, code is roughly half of the total work. The other half includes privacy policies, payment webhook idempotency, OG image setup, refund policy documentation, and HTTP security header configuration. These are separate from the ability to write code — most of them are the kind of task where “you can only do it if you know it exists.”
We experienced this firsthand during the commercialization of WICHI (a GEO analytics SaaS). Feature development was done in 2 weeks, but launch took 3 more weeks on top of that. The cause wasn’t code quality. It was not knowing the requirements that lived outside the code.
This post documents how those 3 weeks of scrambling became a 534-item checklist. It covers the collection process, classification criteria, category design, quality control methods, and how the number 534 emerged as a convergence point.
It Started with 7 Bullet Points
While fixing the items we’d missed during WICHI’s commercialization, we wrote them down as we went: a single Notion page with a bullet list.
The first 7:
- Write a privacy policy
- Write terms of service
- Integrate error monitoring (Sentry)
- Handle payment webhook idempotency
- Add sitemap.xml + robots.txt
- Move input validation to the backend
- Set up OG images
These were things we had actually missed on WICHI. Each took half a day or less to fix, but not knowing they existed delayed launch by 3 weeks.
The 7 items were unrelated to each other. Legal documents, payment infrastructure, SEO, security, monitoring — all mixed together. The only thing they had in common: they are items that don’t naturally come to mind during feature development. They’re invisible when you’re focused on building features, and only surface right before or right after launch.
After writing down 7, the thought came: “there must be more.” So we started looking systematically.
Collection Process — 3 Stages
The entire collection process progressed through three stages — experience, research, and systematization — each with a distinctly different character.
graph LR
subgraph S1["Stage 1: Direct Experience"]
A1["WICHI pain-point log"]
A2["~50 items"]
end
subgraph S2["Stage 2: Public Sources"]
B1["YC, Indie Hackers,<br/>HN, Product Hunt"]
B2["~200 items added"]
end
subgraph S3["Stage 3: Systematization"]
C1["OWASP, GDPR,<br/>Web Vitals, WCAG"]
C2["~280 items added"]
end
S1 -->|"There must be more"| S2
S2 -->|"Bullet lists can't manage this"| S3
S3 --> D["534 items / 15 categories"]
| Stage | Duration | Method | Items Found | Cumulative |
|---|---|---|---|---|
| Stage 1: Direct experience | 3 weeks (WICHI launch period) | Real failures → documented fixes | ~50 | 50 |
| Stage 2: Public sources | 2 weeks | Community posts, guides collection | ~200 | 250 |
| Stage 3: Systematization | 3 weeks | Framework mapping, gap analysis | ~284 | 534 |
Stage 1: Direct Experience (~50 Items)
These were items we personally encountered while building WICHI. They were extracted from the 6 categories covered in the previous post (legal notices, payments, security, monitoring, SEO, miscellaneous).
The defining trait of this stage: every item is one we actually missed and then paid for. These weren’t theoretically necessary items; they were failures we lived through.
Discovery Patterns
Classifying the 50 items by how they were discovered reveals four patterns:
| Pattern | Ratio | Description | Examples |
|---|---|---|---|
| User reports | ~35% | Actual users or beta testers flagged the issue | “The screen is blank” DM, “Where’s the privacy policy?” |
| External service warnings | ~25% | Google, Stripe, or other platforms detected the problem | Search Console sitemap warning, Stripe dispute alert |
| Self-testing | ~20% | Found while using the service ourselves | Mobile layout collapse, successful prompt injection |
| Chain discoveries | ~20% | Found additional issues in the same category while fixing one | Discovered missing refund handling while fixing payment webhook |
Concrete Examples: What WICHI Missed
| Source | Item Example | How Discovered | Fix Time | Launch Impact |
|---|---|---|---|---|
| Payment failure | Webhook idempotency | Credits double-charged during testing | 3 hours | Payment feature delayed 2 days |
| User report | Error page | Received “screen is blank” DM | 1 hour | User churn |
| Legal review request | Privacy policy | “Where’s the privacy policy?” question | Half day | Legal risk |
| SEO audit | robots.txt, sitemap | Warning during Search Console registration | 2 hours | Delayed search visibility |
| Security test | Prompt injection | Successfully altered LLM instructions during testing | Half day | Security vulnerability |
| Mobile check | Responsive layout | Layout collapsed when accessed from phone | 2 hours | UX defect |
| Payment flow | Missing refund policy | Noticed while fixing webhook | Half day | Stripe dispute risk |
| Auth flow | Password reset | “I forgot my password and can’t reset it” | 4 hours | User lockout |
Fix time for each item was mostly under half a day. None of it was technically difficult. The problem was “knowing it needed to be done.”
The total fix time for all 50 items was roughly 80 hours, spread across 3 weeks. If none had been missed, the work could have been completed in 1 week. The entire difference was in the timing of discovery.
WICHI-Specific Items: Additional Requirements for LLM-Based Services
Because WICHI is a GEO analytics service that uses LLMs, it needed items that wouldn’t apply to a typical SaaS:
- Prompt injection defense (preventing user input from altering LLM instructions)
- LLM API cost caps (preventing a single user from generating excessive costs)
- Hallucination warnings on LLM responses
- API key rotation and environment variable management
- Fallback handling when LLM services go down
These items were later classified as “conditional items” in Stage 3. They don’t apply to every SaaS, but are mandatory for any service that uses LLMs.
Stage 2: Public Sources (~200 Items Added)
Direct experience alone has limited scope. Because WICHI is a B2B SaaS, items relevant to B2C (e.g., social login, review systems) were missing. So we collected from other builders’ experiences.
Sources and Their Characteristics
| Source | Type | Items Extracted | Strength | Weakness |
|---|---|---|---|---|
| YC Launch Checklist | Official guide | ~30 | Systematic, validated | High-level. “Set up billing” level |
| Indie Hackers posts | Community stories | ~50 | Failure-focused, specific | Solo builder bias, hard to replicate |
| Hacker News “Ask HN” | Threads | ~40 | “What did you miss at launch” type, diverse stacks | Opinions scattered, unvalidated |
| Product Hunt postmortems | Blog posts | ~30 | Launch-day problem focus | Skewed toward SEO/OG |
| SaaS Checklists (GitHub) | Open lists | ~25 | Union of existing checklists | Often unmaintained |
| Dev.to / Medium | Tech posts | ~25 | Deep dives on specific categories | Mixed with promotional content |
graph TB
subgraph SOURCES["Collection Sources"]
direction LR
S1["Official Guides<br/>(YC, Google)"]
S2["Communities<br/>(IH, HN, PH)"]
S3["Open Source<br/>(GitHub lists)"]
S4["Tech Blogs<br/>(Dev.to, Medium)"]
end
subgraph FILTER["Filtering Criteria"]
F1["Applicable to SaaS in general?"]
F2["Actionable level?"]
F3["Duplicate of existing item?"]
end
subgraph RESULT["Result"]
R1["200 items added"]
R2["~80 items excluded"]
end
S1 --> F1
S2 --> F1
S3 --> F2
S4 --> F2
F1 --> F3
F2 --> F3
F3 -->|Pass| R1
F3 -->|Reject| R2
Notable Findings by Source
From the YC Launch Checklist:
The YC guide contains mostly high-level items, so rather than using them directly, we decomposed them. For example, the single item “Set up billing” breaks down into:
- Implement Stripe Checkout or payment page
- Handle success/failure/cancellation paths
- Verify webhook signatures
- Ensure webhook idempotency
- Sync subscription status with access permissions
- Document refund policy and publish the page
- Auto-generate receipts
One high-level item becomes 7 actionable items. This decomposition pattern was applied extensively in Stage 3.
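Two of these decomposed items, signature verification and idempotency, fit in a few lines. The sketch below is a generic illustration, not Stripe's actual signing scheme: the shared secret, the `handle_webhook` function, the payload shape, and the in-memory `processed_event_ids` set are all assumptions made for the example. A real integration would use the payment provider's SDK and a durable store.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; real providers issue one per endpoint.
WEBHOOK_SECRET = b"whsec_example"

# In-memory stand-ins for durable storage (assumption for this sketch).
processed_event_ids: set[str] = set()
credits: dict[str, int] = {}


def sign(payload: bytes, secret: bytes = WEBHOOK_SECRET) -> str:
    """Return the hex HMAC-SHA256 of the raw payload."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def handle_webhook(payload: bytes, signature: str) -> str:
    # 1. Verify the signature over the raw bytes with a constant-time compare.
    if not hmac.compare_digest(sign(payload), signature):
        return "rejected"

    event = json.loads(payload)

    # 2. Idempotency: a redelivered event must not grant credits twice.
    if event["id"] in processed_event_ids:
        return "duplicate"

    credits[event["user"]] = credits.get(event["user"], 0) + event["credits"]
    processed_event_ids.add(event["id"])
    return "processed"
```

Delivering the same event twice returns `"processed"` and then `"duplicate"`, and the credit balance changes only once. That is the behavior that was missing when credits were double-charged during WICHI testing.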
From Indie Hackers:
The value of Indie Hackers posts lies in their failure stories. Failure stories are more useful than success stories for building checklists. “I didn’t do X and it broke” identifies required items more precisely than “I did Y and it worked.”
Recurring patterns:
- “We didn’t realize we needed a refund policy until after we had set up payments”
- “Our first customer asked for a password reset and the feature didn’t exist”
- “We launched without a CDN and overseas users had 8-second load times”
- “Our Stripe account got frozen due to a dispute — we had no refund policy”
- “We posted on Product Hunt without OG tags and the share card looked broken”
From Hacker News “Ask HN” threads:
HN threads draw builders from diverse tech stacks, so we could discover items that aren’t biased toward any particular framework. Security, infrastructure, and legal items came disproportionately from HN comments.
From GitHub open lists:
We reviewed 10+ existing SaaS checklists on GitHub. Most were specialized in one category (security or SEO) or only listed high-level items. None covered “all categories at an actionable level.” Some hadn’t been updated in years and contained items irrelevant to current tech stacks. Others listed framework-specific items (Rails, Django) as if they were universal.
| Existing Checklist Type | Count Found | Problem |
|---|---|---|
| Security-focused | 4 | OWASP-based but not SaaS-universal. Heavy on infra security |
| SEO-focused | 3 | Only technical SEO, missing marketing infra and social previews |
| General launch | 2 | High-level. Stops at “add payments.” Not actionable |
| Stack-specific | 3 | “Rails launch checklist” etc. Useless after framework switch |
The conclusion from this analysis: the biggest problem with existing checklists wasn’t “depth” but “consistency.” Some had detailed security but a single line on legal. Others had thorough SEO but zero accessibility coverage. No resource provided consistent depth across all areas a builder actually needs to verify before launch.
Items We Didn’t Add: Exclusion Criteria
Roughly 80 items were intentionally excluded during collection. Three criteria for exclusion:
| Exclusion Criteria | Description | Examples | Items Excluded |
|---|---|---|---|
| Marketing/growth strategy | Growth playbook territory, not launch checklist | “Build an email newsletter before launch”, “Plan Product Hunt strategy” | ~30 |
| Business model dependent | Only valid for specific business models | ”Design marketplace commission structure”, “Build affiliate program” | ~25 |
| Not verifiable | Can’t be judged as “done/not done” | “Improve user experience”, “Establish brand image” | ~25 |
The hardest calls were on the boundary between launch items and marketing/growth strategy. “Register your site on Google Search Console” is a launch checklist item. “Conduct keyword research” is growth strategy. The test: “is the launch incomplete if this is missing?” Without Search Console you can’t submit a sitemap or see indexing errors, so indexing problems go unnoticed, and we treat that as a launch defect. Not doing keyword research means less search traffic, but it doesn’t block the launch itself.
By the same logic, “document a refund policy” is included, but “write a customer support manual” is excluded. Without a refund policy, your Stripe account can be frozen during a dispute. Without a support manual, things still work, at least for now.
Core exclusion principle: “Does the absence of this item cause launch failure or delay?” If the answer is “no,” exclude it. A checklist’s value is determined not by what it includes, but by what it leaves out. Put everything in and nobody reads it.
Stage 3: Systematization (~280 Items Added)
Once the count exceeded 250, bullet lists became unmanageable. Search was impossible, detecting duplicates was difficult, and prioritization was out of the question.
At this point, the approach shifted. Instead of adding items, we started mapping existing items to industry-standard frameworks.
Frameworks Used for Mapping
graph TB
subgraph FRAMEWORKS["Frameworks Used for Mapping"]
F1["OWASP Top 10<br/>Web Security Vulnerabilities"]
F2["GDPR / CCPA<br/>Data Privacy Regulations"]
F3["WCAG 2.1<br/>Web Accessibility Guidelines"]
F4["Core Web Vitals<br/>Performance Metrics"]
F5["12-Factor App<br/>Cloud-Native Principles"]
end
subgraph RESULT["Mapping Results"]
R1["Existing items structured"]
R2["Missing items discovered"]
R3["Category classification criteria established"]
end
F1 --> R1
F2 --> R2
F3 --> R2
F4 --> R1
F5 --> R3
| Framework | Mapped Category | Existing Coverage | Newly Discovered Items |
|---|---|---|---|
| OWASP Top 10 | Security, Auth | ~40% | ~65 |
| GDPR / CCPA | Legal, Email | ~30% | ~45 |
| WCAG 2.1 | Accessibility, Frontend | ~20% | ~55 |
| Core Web Vitals | Performance, Frontend | ~50% | ~35 |
| 12-Factor App | Backend, DevOps, CI/CD | ~35% | ~50 |
| Google SEO Guide | SEO & Marketing | ~45% | ~35 |
The Mapping Process: Decomposing High-Level Items
This process revealed a large number of items that always existed but were never spelled out. The pattern was consistent: when a single high-level item is broken down against a framework, it yields multiple actionable items.
Example 1: “HTTP Security Headers”
Originally a single item: “set up security headers.” Under OWASP guidelines:
- [ ] Set Content-Security-Policy header
- [ ] Set X-Content-Type-Options: nosniff
- [ ] Set X-Frame-Options: DENY (or SAMEORIGIN)
- [ ] Set Referrer-Policy: strict-origin-when-cross-origin
- [ ] Restrict browser features with Permissions-Policy
- [ ] Set Strict-Transport-Security (HSTS) + includeSubDomains
1 becomes 6.
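A checklist like this is easy to verify mechanically. The sketch below lists the six headers and reports which ones a response is missing. The CSP and Permissions-Policy values are placeholders (both are site-specific), the one-year HSTS `max-age` is an assumption, and `missing_security_headers` is a hypothetical helper name.

```python
# The six headers from the list above. CSP and Permissions-Policy values are
# placeholders; the HSTS max-age of one year is an assumption.
RECOMMENDED_HEADERS = {
    "Content-Security-Policy": "default-src 'self'",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
}


def missing_security_headers(response_headers: dict[str, str]) -> list[str]:
    """Return recommended header names absent from a response (case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return [name for name in RECOMMENDED_HEADERS if name.lower() not in present]
```

Passing in the headers of any HTTP response returns the unchecked boxes; an empty list means all six items are done.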
Example 2: “Cookie Consent”
A single item — “implement cookie consent” — under GDPR requirements:
- [ ] Implement cookie consent banner
- [ ] Classify cookies by category (essential, analytics, marketing)
- [ ] Allow opt-in/opt-out per category
- [ ] Save consent preferences and reflect them in subsequent requests
- [ ] Do not load tracking scripts before consent is given
1 becomes 5.
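The last three items (per-category opt-in, saved preferences, no tracking before consent) come down to one rule: a script loads only if its category has been granted. A minimal sketch of that rule, where `ConsentState` and its method names are hypothetical and persistence is left out:

```python
from dataclasses import dataclass, field

CATEGORIES = ("essential", "analytics", "marketing")


@dataclass
class ConsentState:
    # Essential cookies need no opt-in; everything else starts as "not granted".
    granted: set[str] = field(default_factory=lambda: {"essential"})

    def update(self, category: str, allowed: bool) -> None:
        if category not in CATEGORIES:
            raise ValueError(f"unknown cookie category: {category}")
        if category == "essential":
            return  # essential cookies cannot be switched off
        if allowed:
            self.granted.add(category)
        else:
            self.granted.discard(category)

    def may_load(self, script_category: str) -> bool:
        """True only if the user has consented to this script's category."""
        return script_category in self.granted
```

Because every non-essential category defaults to "not granted", forgetting to ask for consent fails closed: tracking scripts simply never load.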
Example 3: “Implement Authentication”
A single item — “implement login/sign-up” — under the OWASP Auth guide:
- [ ] Implement email/password authentication
- [ ] Implement OAuth social login (Google, GitHub, etc.)
- [ ] Email verification flow
- [ ] Password strength requirements (minimum length, complexity)
- [ ] Block disposable email addresses (if needed)
- [ ] Detect duplicate accounts (same email, different provider)
- [ ] Send welcome email
- [ ] Account lockout on repeated login failures
- [ ] Log all login attempts (IP, User Agent)
- [ ] Time-limited token for "forgot password"
- [ ] Rate limit password reset requests
- [ ] Notify user on reset completion
- [ ] Choose session strategy (JWT vs server-side)
- [ ] Set token expiration time
- [ ] Refresh token rotation
- [ ] "Log out from all devices" feature
1 becomes 16.
Decomposing high-level items into actionable units like this caused the item count to grow rapidly.
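To show what one of the 16 auth items looks like in practice, here is a sketch of the time-limited “forgot password” token using only the Python standard library. The 15-minute lifetime, the function names, and the in-memory `_reset_tokens` dict (standing in for a database table) are assumptions for illustration, not how any particular auth provider does it.

```python
import hashlib
import secrets
import time
from typing import Optional

RESET_TOKEN_TTL_SECONDS = 15 * 60  # assumed lifetime; choose what fits the product

# Stand-in for a database table: token hash -> (user_id, expiry timestamp).
_reset_tokens: dict[str, tuple[str, float]] = {}


def _digest(token: str) -> str:
    return hashlib.sha256(token.encode()).hexdigest()


def issue_reset_token(user_id: str, now: Optional[float] = None) -> str:
    """Create a random token; only its hash is stored server-side."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _reset_tokens[_digest(token)] = (user_id, now + RESET_TOKEN_TTL_SECONDS)
    return token


def redeem_reset_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user ID if the token is known and unexpired, else None."""
    now = time.time() if now is None else now
    record = _reset_tokens.pop(_digest(token), None)  # pop makes it single-use
    if record is None:
        return None
    user_id, expires_at = record
    return user_id if now <= expires_at else None
```

Storing only the hash means a leaked table does not leak usable tokens, and removing the record on redemption makes each token single-use.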
Decomposition Ratios: Expansion Multipliers by Framework
| Framework | Before (High-Level) | After (Actionable) | Avg Expansion |
|---|---|---|---|
| OWASP Top 10 | 10 | 65 | 6.5x |
| GDPR/CCPA | 8 | 45 | 5.6x |
| WCAG 2.1 | 10 | 55 | 5.5x |
| Core Web Vitals | 6 | 35 | 5.8x |
| 12-Factor App | 12 | 50 | 4.2x |
On average, one high-level item decomposes into roughly 5 actionable items.
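The multipliers are plain division, and the “roughly 5” figure can be checked by pooling all five frameworks. A short script using only the numbers from the table:

```python
# (high-level items, actionable items) per framework, from the table above.
DECOMPOSITION = {
    "OWASP Top 10": (10, 65),
    "GDPR/CCPA": (8, 45),
    "WCAG 2.1": (10, 55),
    "Core Web Vitals": (6, 35),
    "12-Factor App": (12, 50),
}

expansion = {name: after / before for name, (before, after) in DECOMPOSITION.items()}

total_before = sum(before for before, _ in DECOMPOSITION.values())
total_after = sum(after for _, after in DECOMPOSITION.values())
overall = total_after / total_before  # pooled ratio across all five frameworks

for name, ratio in expansion.items():
    print(f"{name}: {ratio:.1f}x")
print(f"Overall: {total_after}/{total_before} = {overall:.1f}x")
```

Pooled, 46 high-level items become 250 actionable ones, a ratio of about 5.4.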
Cross-Validation and Deduplication
After framework mapping, a large number of duplicates emerged. For example:
- “Validate all input on the server side” appeared under both OWASP (injection defense) and 12-Factor App (server-side logic)
- “Encrypt data” appeared under both GDPR (encryption in transit) and OWASP (data exposure prevention)
- “Error logging” appeared under monitoring, security, and DevOps — three categories
The deduplication principle was simple: place each item in the single category where it most naturally belongs. “Input validation” goes in Security, “data encryption” goes in Security, “error logging” goes in Monitoring. Other categories only reference them.
The total before deduplication was roughly 620 items. After deduplication: 534. About 14% were duplicates.
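The “one home category, references elsewhere” rule can be sketched as a small grouping step. The sample data is illustrative: mapping the OWASP and 12-Factor findings to the Security and Backend categories is our assumption for the example, and `HOME` stands for the editorial decision about where each item lives.

```python
from collections import defaultdict

# (item, category where it surfaced) pairs as they might look before dedup.
raw_items = [
    ("Validate all input on the server side", "Security"),
    ("Validate all input on the server side", "Backend"),
    ("Error logging", "Monitoring"),
    ("Error logging", "Security"),
    ("Error logging", "DevOps"),
]

# The single home category chosen for each item (an editorial decision).
HOME = {
    "Validate all input on the server side": "Security",
    "Error logging": "Monitoring",
}


def deduplicate(items):
    """Keep one entry per item in its home category; other categories reference it."""
    found_in = defaultdict(list)
    for text, category in items:
        found_in[text].append(category)
    return {
        text: {"home": HOME[text], "referenced_by": [c for c in cats if c != HOME[text]]}
        for text, cats in found_in.items()
    }


# Share of the pre-dedup list that turned out to be duplicates.
duplicate_share = (620 - 534) / 620
```

Five raw entries collapse to two items, and `duplicate_share` works out to just under 14%, matching the figure above.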
Priority Assignment
Every item received a priority level. The criteria were two questions: “Does missing this block the launch?” and “Does missing this cause problems after launch?”
| Priority | Criteria | Item Ratio | Examples |
|---|---|---|---|
| P0 (Required) | Launch is impossible without this | ~25% (~134 items) | Payment processing, authentication, privacy policy |
| P1 (Important) | Problems likely within 1 week of launch | ~35% (~187 items) | Error monitoring, password reset, refund policy |
| P2 (Recommended) | Affects quality/growth | ~30% (~160 items) | Accessibility, SEO optimization, performance tuning |
| P3 (Conditional) | Only applies to specific stacks/features | ~10% (~53 items) | i18n, MFA, specific payment providers |
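The two questions translate into a small decision function. This is a sketch of the logic as described, not the exact rule set we applied: the function name, the boolean inputs, and in particular the order of the checks (a stack-specific item that blocks launch still counts as P0) are assumptions.

```python
def assign_priority(
    blocks_launch: bool,
    breaks_within_week: bool,
    stack_specific: bool = False,
) -> str:
    """Map the two questions above, plus the stack-specific case, to P0-P3."""
    if blocks_launch:
        return "P0"  # launch is impossible without this
    if breaks_within_week:
        return "P1"  # problems likely within 1 week of launch
    if stack_specific:
        return "P3"  # only applies to specific stacks/features
    return "P2"  # affects quality/growth, but nothing breaks without it


# Approximate item counts implied by the shares in the table.
SHARES = {"P0": 0.25, "P1": 0.35, "P2": 0.30, "P3": 0.10}
counts = {priority: round(534 * share) for priority, share in SHARES.items()}
```

The rounded counts come out to 134, 187, 160, and 53, which add up to 534.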
The 15-Category Structure
The final organization: 15 categories across 3 groups.
graph TB
subgraph BUILD["Product Build (198 items)"]
B1["01 Frontend<br/>35 items"]
B2["02 Backend<br/>46 items"]
B3["03 Auth<br/>42 items"]
B4["04 Billing<br/>36 items"]
B5["11 Testing<br/>39 items"]
end
subgraph LAUNCH["Launch Readiness (194 items)"]
L1["08 SEO & Marketing<br/>58 items"]
L2["09 Legal<br/>35 items"]
L3["06 Security<br/>35 items"]
L4["10 Performance<br/>38 items"]
L5["12 CI/CD<br/>28 items"]
end
subgraph OPS["Operations (182 items)"]
O1["07 Monitoring<br/>35 items"]
O2["14 Analytics<br/>33 items"]
O3["13 Email<br/>30 items"]
O4["15 Accessibility<br/>39 items"]
O5["05 DevOps<br/>45 items"]
end
All 15 Categories Defined
| # | Category | Group | Items | Core Question | Reference Framework |
|---|---|---|---|---|---|
| 01 | Frontend | Product Build | 35 | Does the UI behave correctly in all states? | — |
| 02 | Backend | Product Build | 46 | Is the API stable and scalable? | 12-Factor App |
| 03 | Auth | Product Build | 42 | Is authentication/authorization secure and complete? | OWASP Auth |
| 04 | Billing | Product Build | 36 | Is the payment flow secure and legally compliant? | PCI DSS (indirect) |
| 05 | DevOps | Operations | 45 | Is the infrastructure stable and recoverable? | 12-Factor App |
| 06 | Security | Launch Readiness | 35 | Are all known vulnerabilities defended against? | OWASP Top 10 |
| 07 | Monitoring | Operations | 35 | Can you detect problems immediately when they occur? | — |
| 08 | SEO & Marketing | Launch Readiness | 58 | Is the product discoverable via search and social? | Google SEO Guide |
| 09 | Legal | Launch Readiness | 35 | Are legal requirements met? | GDPR, CCPA |
| 10 | Performance | Launch Readiness | 38 | Are there performance bottlenecks hurting UX? | Core Web Vitals |
| 11 | Testing | Product Build | 39 | Do code changes break existing features? | — |
| 12 | CI/CD | Launch Readiness | 28 | Is deployment safe and repeatable? | — |
| 13 | Email | Operations | 30 | Are emails delivered accurately? | — |
| 14 | Analytics | Operations | 33 | Is user behavior being measured? | — |
| 15 | Accessibility | Operations | 39 | Can all users access the service? | WCAG 2.1 |
Why These 15: How the Categories Were Decided
The number 15 wasn’t arbitrary. It was determined by the intersection of two constraints.
Constraint 1: Each category should be completable in a single work session
If categories are too large (e.g., combining “Backend + DevOps + CI/CD” into one), a single session can’t cover it. If they’re too small (e.g., “HTTP Headers” as a standalone category), management overhead increases.
The sweet spot was 25-58 items per category. Dividing at this granularity naturally produces 15 categories.
Constraint 2: Minimal inter-category dependencies
The criterion for splitting was: “can this be worked on independently from other categories?” For example:
- Auth and Security are closely related, but you don’t need to review all of Security when working on Auth
- Frontend and Performance overlap, but performance optimization is more efficient as a separate session
- Legal and Email overlap on cookie consent/opt-out, but each can be checked independently
We initially started with 12 categories: Email and Analytics were folded into Monitoring, and Accessibility was part of Frontend. As the item count grew, separating these three became clearly better.
Specific rationale for each split:
- Email separation: With Email folded into Monitoring, the combined count reached 65. Email items (deliverability, SPF/DKIM setup, template management) are fundamentally different in nature from monitoring. Splitting yielded Monitoring at 35 and Email at 30, each a manageable size.
- Analytics separation: Error tracking (Monitoring) and user behavior analysis (Analytics) serve different purposes. “Is the service broken?” and “How are users using it?” belong to different work sessions. Post-split: 35 and 33 respectively.
- Accessibility separation: When included in Frontend, accessibility items alone (keyboard navigation, screen reader compatibility, color contrast, ARIA labels) totaled 39. Combined with Frontend’s own 35 items, that’s 74 — too much for one session. The existence of WCAG 2.1 as a clear reference framework also justified an independent category.
Product Build — Where Code Is Written (198 Items)
| Category | Items | Core Sections | Reference |
|---|---|---|---|
| Frontend | 35 | UI states, form validation, responsive design, error states, loading UX | — |
| Backend | 46 | API design, data models, error handling, logging, pagination | 12-Factor App |
| Auth | 42 | Auth/authorization, sessions, MFA, password reset, social login | OWASP Auth |
| Billing | 36 | Payments, subscriptions, webhooks, refunds, taxes, receipts | PCI DSS (indirect) |
| Testing | 39 | Unit, integration, E2E, accessibility testing, test coverage | — |
The highest count in this group is Backend (46 items). API design, data models, error handling, logging, caching, and queue processing — the server-side scope is broad. Auth (42) follows closely, because auth flow branches (sign-up, login, social login, password reset, MFA) each form their own item clusters.
Launch Readiness — Checks Outside the Code (194 Items)
| Category | Items | Core Sections | Reference |
|---|---|---|---|
| SEO & Marketing | 58 | Meta tags, sitemap, OG, structured data, analytics tools | Google SEO Guide |
| Legal | 35 | Privacy policy, terms, cookies, GDPR/CCPA | GDPR, CCPA |
| Security | 35 | OWASP Top 10, HTTP headers, input validation, API security | OWASP Top 10 |
| Performance | 38 | Core Web Vitals, image optimization, caching, CDN, bundle size | Core Web Vitals |
| CI/CD | 28 | Pipelines, environment separation, rollback, secret management | — |
The highest count in this group is SEO & Marketing (58 items). Meta tags, structured data, social sharing, analytics integration, Search Console registration, i18n SEO, OG images — the number of items needed to “be discoverable via search” was larger than expected. Not a single line of feature code, yet directly impacting launch success.
Operations — Post-Launch Maintenance (182 Items)
| Category | Items | Core Sections | Reference |
|---|---|---|---|
| Monitoring | 35 | Error tracking, uptime, log aggregation, alert channels | — |
| Analytics | 33 | Event tracking, funnel analysis, conversion measurement, dashboards | — |
| Email | 30 | Transactional email, templates, deliverability, opt-out, SPF/DKIM | — |
| Accessibility | 39 | Keyboard navigation, screen readers, color contrast, ARIA | WCAG 2.1 |
| DevOps | 45 | Infrastructure, env vars, backups, scaling, log rotation | 12-Factor App |
The highest count in this group is DevOps (45 items). Infrastructure setup, environment separation, backups, monitoring infrastructure, scaling strategy, incident response — everything focused on “keeping the service alive.”
Category Design Principles
Principle 1: Independently Checkable Units
The criterion for splitting categories was: Can this be worked on independently from other categories?
You don’t need to check payments while doing a security review. But reviewing Auth alongside Security does reduce oversights.
Inter-category dependency map:
| Category | Strong Dependency | Weak Dependency |
|---|---|---|
| Auth | Security | Backend, Email |
| Billing | Legal | Backend, Email |
| SEO & Marketing | Frontend | Analytics, Performance |
| DevOps | CI/CD | Monitoring, Backend |
| Legal | Auth, Billing | |
Strongly related category pairs are more efficient to work on together in the same session, but it’s not mandatory.
Principle 2: Gate Checklists vs Blueprints
MMU has two types of checklists. This distinction matters.
graph LR
subgraph GATE["Gate Checklist"]
G1["15-20 items"]
G2["Pass/fail judgment"]
G3["Milestone transition criteria"]
end
subgraph BLUEPRINT["Blueprint"]
B1["20-58 items"]
B2["Implementation depth guide"]
B3["Category-level detail"]
end
GATE -->|"After passing"| NEXT["Next Milestone"]
BLUEPRINT -->|"Reference during implementation"| CODE["Code Work"]
GATE -.->|"extends"| BLUEPRINT
| | Gate Checklist | Blueprint |
|---|---|---|
| Purpose | Determine if phase transition is possible | Guide implementation depth |
| Depth | High-level (15-20 items) | Detailed (20-58 items) |
| When Used | Before milestone transition | During implementation |
| Example | “Is payment working?” | “Webhook signature verification, idempotency, refund handling…” |
| On Failure | Cannot proceed to next phase | Quality degradation, tech debt |
| File Location | docs/checklists/ | docs/blueprints/ |
Gate checklists determine “can we move past this phase?”, and blueprints tell you “specifically what needs to be done in this category?”
MMU’s 6-stage gate structure:
| Gate | Name | Core Question | Gate Items |
|---|---|---|---|
| M0 | Problem Fit | Who is this for, and why build it? | 4 |
| M1 | Build Fit | Do core features work end-to-end? | 5 |
| M2 | Revenue Fit | Can you charge and can you refund? | 4 |
| M3 | Trust Fit | Are legal docs, support channels, and logging in place? | 4 |
| M4 | Growth Fit | Is it discoverable via search? Do share links work? | 4 |
| M5 | Scale Fit | What happens if it goes down at 3 AM? | 5 |
Principle 3: Actionable Items Only
Every checklist item is written to be judgeable as “done/not done.”
# Bad: Abstract, ambiguous criteria
- [ ] Strengthen security
- [ ] Improve user experience
- [ ] Optimize performance
# Good: Specific, actionable
- [ ] Set Content-Security-Policy header
- [ ] Validate all user input on the server side
- [ ] Limit file upload type and size, scan for malware
- [ ] Achieve Lighthouse performance score of 90+
- [ ] LCP (Largest Contentful Paint) under 2.5 seconds
“Strengthen security” can’t be checked off. “Set CSP header” can. This difference determines whether a checklist is actually useful.
Item writing test — every item must pass these checks:
| Test | Question | Pass Criteria |
|---|---|---|
| Binary judgment | Can you answer “done/not done”? | Yes |
| Single session | Can one person complete it in one session? | Yes |
| Verifiable | Can a third party confirm completion? | Yes |
| Independence | Can it be executed without depending on other items? | Mostly Yes |
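Part of the binary-judgment test can be approximated with a lint pass over a checklist file. The sketch below flags items that open with a vague verb. The verb list and the `vague_items` helper are heuristic assumptions, and a clean pass does not prove an item is verifiable; it only catches the obvious offenders.

```python
import re

# Verbs that usually signal an unverifiable item (a heuristic, not a rule).
VAGUE_VERBS = ("strengthen", "improve", "optimize", "enhance", "establish")

# Matches markdown checkbox items such as "- [ ] Set CSP header".
CHECKBOX = re.compile(r"^- \[[ xX]\] (?P<text>.+)$")


def vague_items(markdown: str) -> list[str]:
    """Return checklist items whose first word is a vague verb."""
    flagged = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line.strip())
        if match and match["text"].split()[0].lower() in VAGUE_VERBS:
            flagged.append(match["text"])
    return flagged
```

Run against the bad examples from earlier, it flags “Strengthen security”, “Improve user experience”, and “Optimize performance”, and lets “Set Content-Security-Policy header” through.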
Principle 4: One-Way Dependencies
When cross-category references are needed, the dependency direction is fixed to one-way. For example:
- Auth blueprints can reference Security items, but Security blueprints don’t reference Auth implementation details
- Billing references Legal (refund policy), but Legal doesn’t reference Billing specifics
- Gate checklists reference blueprints, but blueprints don’t reference gates
This prevents circular dependencies regardless of which category you work on first. Builders can set their own priority order.
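One-way references are easy to enforce with a cycle check. The sketch below uses `graphlib` from the Python standard library (3.9+); the edges are the three examples above, and `has_cycle` is a hypothetical helper name.

```python
from graphlib import CycleError, TopologicalSorter

# "X references Y" edges taken from the examples above.
REFERENCES = {
    "Auth": {"Security"},
    "Billing": {"Legal"},
    "Gate checklists": {"Blueprints"},
}


def has_cycle(references: dict[str, set[str]]) -> bool:
    """True if following references can lead back to the starting point."""
    try:
        # static_order() raises CycleError when the graph is not a DAG.
        list(TopologicalSorter(references).static_order())
    except CycleError:
        return True
    return False
```

Adding a back-reference such as Security to Auth makes `has_cycle` return `True`, which is the situation the one-way rule is meant to rule out.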
Principle 5: Conditional Item System
About 100 of the 534 items are marked conditional (<!-- if:flag -->). These don’t apply to every SaaS.
| Condition Flag | Item Count | Description |
|---|---|---|
| has_billing | ~25 | Uses Stripe/payment features |
| has_i18n | ~15 | Supports multiple languages |
| has_mfa | ~8 | Implements MFA |
| targets_eu | ~12 | Targets EU users (GDPR) |
| targets_california | ~6 | Targets California users (CCPA) |
| has_file_upload | ~10 | Allows file uploads |
| has_llm | ~8 | Uses LLM/AI features |
| has_native_mobile | ~12 | Has a native mobile app |
During mmu init, the stack configuration (.mmu/config.toml) auto-sets these flags, and non-applicable items are excluded from score calculations.
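The filtering itself is simple. The sketch below shows the idea with a plain dict of flags. It is not MMU's actual implementation: placing the `<!-- if:flag -->` marker on the same line as the item, the `applicable_items` helper, and passing flags as a dict instead of reading `.mmu/config.toml` are all assumptions.

```python
import re

# Assumed layout: the marker sits on the same line as the checklist item.
MARKER = re.compile(r"<!--\s*if:(?P<flag>\w+)\s*-->")


def applicable_items(lines: list[str], flags: dict[str, bool]) -> list[str]:
    """Drop items whose condition flag is off; unmarked items always apply."""
    kept = []
    for line in lines:
        match = MARKER.search(line)
        if match and not flags.get(match["flag"], False):
            continue  # flag is off or unknown: exclude the item
        kept.append(MARKER.sub("", line).rstrip())
    return kept
```

With `has_i18n` off and `has_billing` on, an hreflang item disappears while a webhook item stays. Unknown flags are treated as off, so a typo in a marker hides the item instead of forcing it on everyone.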
Quality Control: Why 534 Doesn’t Become 535
Checklists tend to grow. Every time someone thinks “this would be nice to have,” the count creeps up and quickly hits 1,000. That’s a checklist nobody reads.
Addition Criteria: Inclusion Conditions
Adding a new item requires satisfying all four:
1. Launch relevance: Does its absence cause launch failure or problems within 1 month post-launch?
2. Actionability: Can it be judged as “done/not done”?
3. Universality: Does it apply to 50%+ of all SaaS products? (If not, mark as conditional)
4. Non-duplication: Is it substantively different from existing items?
The hardest call among the four is #3, universality. “Password reset flow” applies to virtually every SaaS — that’s clear. But “TOTP-based MFA implementation” is near-mandatory for B2B SaaS while being overkill for personal projects. Such items are classified as conditional and only activated when the relevant flag is turned on.
The benchmark we used for universality: “If you randomly picked 50 SaaS products with $1,000+ MRR from Indie Hackers, would this item be needed by at least 25?” This isn’t rigorous statistics; it’s experience-based estimation, but it was effective enough for filtering borderline items.
Removal Criteria: Deletion Conditions
Existing items are periodically reviewed. An item is deleted or merged if any of the following apply:
- The framework/library handles it by default (e.g., Django’s default project template enables clickjacking middleware that sets X-Frame-Options)
- It’s effectively subsumed by another item (e.g., “hash passwords” is included in “set up auth provider”)
- It only becomes meaningful 6+ months post-launch (that’s operations guide territory, not checklist)
- Technological progress has made manual setup unnecessary (e.g., most hosting providers auto-provision HTTPS, so “install SSL certificate” isn’t a separate item)
Items actually removed during Stage 3 systematization:
| Removed Item | Removal Reason | Where Merged |
|---|---|---|
| Install SSL certificate | Vercel, Netlify, etc. handle this automatically | Absorbed into DevOps “force HTTPS redirect” |
| Hash passwords with bcrypt | Auth providers handle this | Included in Auth “provider setup” |
| Update jQuery to latest version | Not applicable to modern frameworks | Deleted |
| Create database indexes | Too specific, schema-dependent | Included in Backend “verify query performance” |
A checklist’s value is determined not by “how many items it includes” but by “the judgment behind what it excludes.” Keeping 534 from becoming 535 is harder than getting to 534 in the first place.
How This Differs from Existing Checklists
“SaaS checklists” already exist. Search GitHub and you’ll find dozens. Three ways MMU is different:
1. Depth
Most existing checklists list high-level items. “Set up billing.” “Secure your app.” “Configure SEO.” Those are “lists of things to do,” not “checklists.”
| Category | Existing Checklists | MMU |
|---|---|---|
| Billing | “Set up billing” (1 item) | Webhook signature verification, idempotency, refund handling, subscription status sync… (36 items) |
| Security | “Secure your app” (1 item) | CSP header, HSTS, input validation, rate limiting, API keys… (35 items) |
| SEO | “Add meta tags” (1 item) | title, description, canonical, OG, sitemap, robots.txt, structured data, hreflang… (58 items) |
2. Actionability
Every item can be judged as “done/not done.” No abstract guidance.
3. Conditional Filtering
Non-applicable items are automatically excluded based on project stack. A project without i18n won’t see hreflang items. A service not targeting the EU won’t be forced through GDPR data access right workflows. Thanks to this conditional system, the 534 items aren’t a burden: each project sees only the items relevant to it.
Most existing checklists say “pick and choose what applies to you.” The problem: beginner builders can’t judge “what applies.” The conditional system replaces that judgment with stack configuration. The builder only needs to answer “does my project have payments?” The system handles the rest.
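A minimal sketch of that idea, not MMU's actual implementation: each item declares the stack flags it depends on, and the builder's answers decide what is shown. The `Item` class, the flag names (`payments`, `i18n`, `eu`), and the item texts are all hypothetical, chosen only to mirror the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    text: str
    # Flags that must ALL be enabled for the item to apply (hypothetical names).
    requires: frozenset = frozenset()


def applicable(items, stack):
    """Return the items whose required flags are all enabled in `stack`."""
    enabled = {flag for flag, on in stack.items() if on}
    return [item for item in items if item.requires <= enabled]


ITEMS = [
    Item("Custom 404 page exists"),
    Item("Payment webhook verifies signatures", frozenset({"payments"})),
    Item("hreflang tags set for each locale", frozenset({"i18n"})),
    Item("GDPR data access request workflow", frozenset({"eu"})),
]

# The builder answers yes/no questions about the stack; no per-item judgment.
stack = {"payments": True, "i18n": False, "eu": False}

for item in applicable(ITEMS, stack):
    print(item.text)
```

With this configuration, only the 404 item and the webhook item are printed. The builder never has to decide whether hreflang or GDPR items matter; answering "no" to i18n and EU targeting removes them.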
What 534 Means: The Convergence Process
534 wasn’t a planned number. It’s where “newly discovered items” and “deduplicated/removed items” reached equilibrium during Stage 3 systematization.
The convergence:
| Timepoint | Total Items | Weekly Additions | Weekly Removals | Net Change |
|---|---|---|---|---|
| Stage 3 Week 1 | 320 | 70 | 0 | +70 |
| Stage 3 Week 2 | 450 | 80 | 12 | +68 |
| Stage 3 Week 3 (dedup) | 530 | 45 | 35 | +10 |
| Stage 3 Week 4 | 538 | 12 | 16 | -4 |
| Final cleanup | 534 | 3 | 7 | -4 |
Starting from Week 3, additions and removals occurred at nearly the same rate. By Week 4, net change turned negative. That’s when collection stopped.
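The stopping rule can be sketched as a small calculation over the weekly additions and removals from the table. The function names are illustrative, and the rule "stop at the first week with non-positive net change" is an assumption about how the decision could be formalized, not a record of how it was made.

```python
# (label, weekly additions, weekly removals), copied from the table above.
WEEKS = [
    ("Stage 3 Week 1", 70, 0),
    ("Stage 3 Week 2", 80, 12),
    ("Stage 3 Week 3 (dedup)", 45, 35),
    ("Stage 3 Week 4", 12, 16),
    ("Final cleanup", 3, 7),
]


def net_changes(weeks):
    """Net change per period: additions minus removals."""
    return [(label, added - removed) for label, added, removed in weeks]


def first_non_growth(weeks):
    """Label of the first period whose net change is zero or negative."""
    for label, net in net_changes(weeks):
        if net <= 0:
            return label
    return None


for label, net in net_changes(WEEKS):
    print(f"{label}: {net:+d}")
print("Collection stops at:", first_non_growth(WEEKS))
```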
534 sounds like a lot, but not all 534 apply to any single SaaS:
- Projects without i18n skip ~15 related items
- Projects not using Stripe skip ~10 Stripe-specific items
- B2B SaaS skips social login, review systems, and other B2C items
- Projects not targeting the EU skip ~12 GDPR items
WICHI’s applicable items were around 430. The remaining ~100 are marked conditional (<!-- if:flag -->) and auto-excluded based on stack settings.
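To illustrate how a marker like `<!-- if:flag -->` could drive the exclusion, here is a rough sketch. Only the marker syntax comes from the text above. The markdown line layout, the flag names, and the `visible_lines` helper are assumptions made for the example.

```python
import re

# Matches a trailing condition marker such as <!-- if:i18n -->.
MARKER = re.compile(r"<!--\s*if:([\w-]+)\s*-->")


def visible_lines(lines, enabled_flags):
    """Keep unmarked lines, plus marked lines whose flag is enabled."""
    kept = []
    for line in lines:
        match = MARKER.search(line)
        if match and match.group(1) not in enabled_flags:
            continue  # conditional item whose flag is off: skip it
        kept.append(MARKER.sub("", line).rstrip())
    return kept


# Hypothetical checklist lines; the real file format may differ.
CHECKLIST = [
    "- [ ] Privacy policy lists third-party services",
    "- [ ] hreflang tags set for each locale <!-- if:i18n -->",
    "- [ ] Stripe webhook signature verified <!-- if:stripe -->",
]

for line in visible_lines(CHECKLIST, {"stripe"}):
    print(line)
```

Keeping the condition in an HTML comment means the checklist still renders as ordinary markdown, since the marker is invisible to a reader who opens the file directly.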
534 isn’t a “magic number.” It’s just the point where completeness plateaued. Adding new items either overlapped with existing ones or failed the universality threshold. That’s where it stopped.
Universality — 80% Are Common Items
Of the 534 items, only about 20% are not universal: roughly 15% are conditional on stack or region, and roughly 5% are specific to WICHI. GEO-analysis-specific items (multi-engine queries, LLM response parsing, etc.) fall into that last bucket.
The remaining 80% are items any SaaS builder should verify.
“Does a password reset flow exist?”
“Does the payment webhook have signature verification?”
“Does the privacy policy list third-party services?”
“Is the 404 page customized?”
“Is the refund policy clearly stated?”
These questions apply equally whether you’re building WICHI, a project management tool, or an e-commerce platform.
Universal vs project-specific distribution:
| Category | Universal | Conditional | Project-Specific |
|---|---|---|---|
| Frontend | 90% | 10% | — |
| Backend | 85% | 10% | 5% |
| Auth | 75% | 20% | 5% |
| Billing | 70% | 25% | 5% |
| SEO & Marketing | 80% | 15% | 5% |
| Security | 95% | 5% | — |
| Legal | 60% | 35% | 5% |
| Performance | 90% | 5% | 5% |
| DevOps | 85% | 10% | 5% |
| Overall Average | ~80% | ~15% | ~5% |
Security has the highest universal rate at 95%. Security items apply to virtually every SaaS. Legal has the lowest at 60%, because regional regulations (GDPR, CCPA) are classified as conditional.
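As a quick arithmetic check, the overall row can be approximated by taking an unweighted mean of each column. This is a sketch under two assumptions: the “—” cells are treated as 0, and every category counts equally. The published averages may instead be weighted by the number of items per category.

```python
# (universal %, conditional %, project-specific %) per category, from the table.
# Cells shown as a dash in the table are assumed to be 0.
ROWS = {
    "Frontend": (90, 10, 0),
    "Backend": (85, 10, 5),
    "Auth": (75, 20, 5),
    "Billing": (70, 25, 5),
    "SEO & Marketing": (80, 15, 5),
    "Security": (95, 5, 0),
    "Legal": (60, 35, 5),
    "Performance": (90, 5, 5),
    "DevOps": (85, 10, 5),
}


def column_means(rows):
    """Unweighted mean of each percentage column, rounded to one decimal."""
    n = len(rows)
    return tuple(
        round(sum(values[i] for values in rows.values()) / n, 1)
        for i in range(3)
    )


universal, conditional, specific = column_means(ROWS)
print(f"universal ~{universal}%, conditional ~{conditional}%, specific ~{specific}%")
```

The unweighted means come out close to the ~80% / ~15% / ~5% split in the last row of the table.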
Next
There was no reason to keep this checklist to ourselves. Why we decided to release it as open source, and why we chose a CLI over a web dashboard — that’s the story of the next post.