Operational review of WICHI's backend on Railway and Supabase, covering real-world incidents like connection pooling and migration conflicts, plus criteria for infrastructure migration.
Why This Stack
WICHI’s backend is a FastAPI + PostgreSQL stack. When choosing infrastructure to host it, we had three criteria:
- Deployment must be simple — a GitHub push should be all it takes to deploy. We didn’t want to spend time configuring a separate CI/CD pipeline.
- DB + auth in one place — PostgreSQL, user authentication, and Row Level Security (RLS) should all be available within a single platform, without combining multiple services.
- Minimal initial cost — at a pre-revenue stage, fixed costs had to be minimized. Free tiers or usage-based billing were mandatory.
Railway satisfied the first and third criteria. Supabase satisfied the second and third. Combining the two covers everything from FastAPI app deployment to PostgreSQL operations, authentication, and RLS — without any additional external services.
This post covers both services in detail based on real operational experience, highlights things to know when running them in combination, and outlines the criteria for knowing when to migrate away from this stack.
Architecture Overview
```mermaid
graph TB
    subgraph Client
        A[Web App - Vercel]
    end
    subgraph Railway
        B[FastAPI Server]
        C[Health Check Endpoint]
    end
    subgraph Supabase
        D[PostgreSQL DB]
        E[Connection Pooler - PgBouncer]
        F[Auth Service]
        G[REST API - PostgREST]
        H[Realtime]
        I[Storage]
    end
    subgraph External
        J[GitHub Repository]
    end
    A -->|HTTPS| B
    B -->|port 6543 - Transaction Mode| E
    E --> D
    A -->|Direct| F
    J -->|push trigger| B
    C -->|DB connection check| E
    style E fill:#f9f,stroke:#333
    style B fill:#bbf,stroke:#333
```
The critical detail in this diagram is that the FastAPI server does not connect directly to the Supabase DB. It routes through the Connection Pooler (PgBouncer). Why this matters is covered in detail in the Supabase section.
Railway: In Detail
Core Feature: GitHub Auto-Deploy
Railway’s greatest strength is GitHub-integrated automatic deployment. Once you connect a repository, every push to the main branch automatically triggers a build and deploy. No separate GitHub Actions workflow required.
```mermaid
sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub
    participant RW as Railway
    participant Svc as Service
    Dev->>GH: git push origin main
    GH->>RW: Webhook trigger
    RW->>RW: Detect Dockerfile / Nixpacks
    RW->>RW: Build image
    RW->>Svc: Deploy new version
    RW->>RW: Health check pass
    RW-->>Svc: Route traffic to new version
    Note over RW,Svc: Zero-downtime rolling deploy
```
The deployment process works as follows:
- Push to GitHub triggers Railway via webhook
- If a Dockerfile exists, Docker build runs; otherwise, Nixpacks auto-configures the build environment
- After image build completes, deployment goes to a new container
- Traffic switches to the new version once health check passes
For a typical FastAPI app, build time (including dependency installation) runs 1-3 minutes, and traffic switchover completes within tens of seconds. Writing your own Dockerfile enables layer caching, which dramatically reduces build time when dependencies haven’t changed.
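A minimal Dockerfile sketch of that layering, for reference (the `app.main:app` module path is an assumption, not WICHI's actual layout; Railway injects a `PORT` variable at runtime):

```dockerfile
FROM python:3.12-slim

WORKDIR /app

# Copy only the dependency manifest first: this layer stays cached
# as long as requirements.txt is unchanged, so rebuilds skip pip install.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code changes invalidate only the layers from here down.
COPY . .

# Shell form so the PORT variable Railway injects is expanded.
CMD uvicorn app.main:app --host 0.0.0.0 --port ${PORT:-8000}
```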
Preview Deploy
Opening a PR automatically creates a Preview Deploy for that branch. It gets its own independent URL, so you can verify real behavior during PR review. This feature is similar to Vercel’s Preview Deploy but applied to backend services, which makes it particularly useful.
One caveat: you need to decide upfront how to manage the DB connection string for Preview Deploys. Using the production DB is risky, and connecting a separate staging DB requires environment variable separation.
Usage-Based Billing
Railway’s billing is calculated on three axes:
| Billing Item | Unit | Notes |
|---|---|---|
| vCPU | Per hour | Based on actual CPU time used |
| Memory | GB-hours | Based on usage, not allocation |
| Network Egress | GB | Only outbound traffic is billed |
| Disk | GB-hours | When persistent storage is used |
The advantage of this structure is low costs when traffic is low. The disadvantage is that it’s hard to know the exact cost until you open the month-end invoice. The dashboard shows a real-time Estimated Cost, but this is a running total, not a month-end projection.
Railway’s billing model is a “pay for what you use” structure. If there’s no traffic, costs are nearly zero — but a traffic spike can generate unexpected charges. Setting up Spending Alerts is not optional; it’s essential.
Developer Experience (DX)
Railway’s DX ranks among the best in its class. Here’s a breakdown by category:
| DX Area | Rating | Details |
|---|---|---|
| Initial setup | Excellent | First deploy within 5 minutes of GitHub connection |
| Dashboard | Excellent | Visual service topology, intuitive |
| Logs | Good | Real-time streaming supported; search and filtering are limited |
| Environment variable management | Excellent | Per-service separation, reference variable (${{}} syntax) support |
| CLI | Good | Basic commands like railway run, railway logs |
| Documentation | Fair | Core content is there, but edge case docs are lacking |
| Community | Good | Active Discord, fast response times |
A note on logging: Railway’s built-in log viewer is sufficient for debugging, but falls short for structured log searching or long-term retention. For production, integrating an external log service (Axiom, Better Stack, etc.) is recommended. Railway supports log drain, so integration isn’t difficult.
Limitations
Limitations observed while using Railway:
- Cold starts: When there’s no traffic, instances enter a sleep state. The first request can take several seconds. For FastAPI apps, this includes Uvicorn startup time, making the perceived delay noticeable.
- Limited regions: Region options are fewer than AWS or GCP. Asia regions exist, but there’s no Seoul region for Korean users. This directly impacts response latency.
- Scaling control: Horizontal scaling (running multiple instances) is possible, but you can’t configure the fine-grained auto-scaling policies available with AWS ECS or Kubernetes. This is a limitation for services with irregular traffic patterns.
- Cron job limitations: You can spin up a separate cron service on Railway, but there's no built-in scheduler. You need to embed a library like APScheduler directly in your app or separate it into a dedicated service (a minimal in-app sketch follows this list).
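For the embedded route, a minimal sketch of APScheduler inside FastAPI's lifespan (the job and interval are illustrative, not WICHI's actual tasks):

```python
from contextlib import asynccontextmanager

from apscheduler.schedulers.asyncio import AsyncIOScheduler
from fastapi import FastAPI

scheduler = AsyncIOScheduler()

async def cleanup_expired_sessions() -> None:
    """Illustrative periodic task; replace with real logic."""
    ...

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Start the scheduler with the app, stop it on shutdown. Note that
    # with multiple Railway instances, each instance runs its own copy
    # of every job.
    scheduler.add_job(cleanup_expired_sessions, "interval", hours=1)
    scheduler.start()
    yield
    scheduler.shutdown()

app = FastAPI(lifespan=lifespan)
```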
Supabase: In Detail
PostgreSQL as a Service
Supabase’s core offering is managed PostgreSQL. Instance provisioning, backups, and upgrades are handled by Supabase, so you rarely need to directly manage DB operations.
Supabase is architecturally a set of services layered on top of PostgreSQL. You don’t have to use all of them. Here’s the breakdown of what WICHI actually uses versus what it doesn’t:
| Feature | Used? | Purpose / Reason for Not Using |
|---|---|---|
| PostgreSQL DB | Yes | Core data store |
| Auth | Yes | Email/OAuth authentication, JWT issuance |
| Row Level Security (RLS) | Yes | Per-user data access control |
| Connection Pooler | Yes | Railway→DB connection management |
| SQL Editor | Yes | Ad-hoc queries, data verification |
| REST API (PostgREST) | No | FastAPI handles queries directly |
| Realtime | No | No real-time features needed currently |
| Storage | No | No file upload functionality |
| Edge Functions | No | Server logic handled by Railway |
Looking at just the features we use, Supabase is effectively “managed PostgreSQL + Auth + RLS.” This combination alone significantly reduces management overhead compared to operating separate authentication (Auth0, Firebase Auth) and database services (RDS, Cloud SQL).
Auth Service
Supabase Auth is a GoTrue-based authentication service. It supports email/password authentication and OAuth (Google, GitHub, etc.) login out of the box, and handles JWT token issuance and refresh automatically.
On the FastAPI server, we verify the JWT sent by the client and execute DB queries based on the user ID in the token. Not having to build the auth service from scratch is the biggest advantage.
There’s an important caveat, though. Supabase Auth’s JWT includes role and sub (user ID) by default. RLS policies can identify the current user via auth.uid(), but this only works automatically when accessing through the Supabase client library. When querying PostgreSQL directly from FastAPI, RLS is not automatically applied — you need to write separate access control logic on the server side.
RLS does not mean "configured and therefore safe." The scope of RLS enforcement depends on how the DB is accessed. Accessing with the `service_role` key bypasses RLS entirely, so managing this key is the crux of security.
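As a concrete sketch of that server-side verification (assumptions: PyJWT is installed, the project's JWT secret is exposed as `SUPABASE_JWT_SECRET`, and the project uses Supabase's default HS256 signing with the `authenticated` audience):

```python
import os

import jwt  # PyJWT
from fastapi import Depends, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

bearer = HTTPBearer()

def current_user_id(
    creds: HTTPAuthorizationCredentials = Depends(bearer),
) -> str:
    """Verify the Supabase-issued JWT and return the user ID (sub claim)."""
    try:
        claims = jwt.decode(
            creds.credentials,
            os.environ["SUPABASE_JWT_SECRET"],
            algorithms=["HS256"],
            audience="authenticated",
        )
    except jwt.PyJWTError:
        raise HTTPException(status_code=401, detail="Invalid or expired token")
    # RLS is not applied on this path, so every query must filter
    # by this user ID explicitly.
    return claims["sub"]
```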
Connection Pooling: A Deep Dive
This is the most important configuration in Supabase operations, warranting its own section.
```mermaid
graph LR
    subgraph Railway
        A1[FastAPI Instance 1]
        A2[FastAPI Instance 2]
        A3[FastAPI Instance 3]
    end
    subgraph Supabase Connection Pooler
        B[PgBouncer<br/>port 6543<br/>Transaction Mode]
    end
    subgraph PostgreSQL
        C[DB Instance<br/>Max Connections Limited]
    end
    A1 -->|conn 1| B
    A1 -->|conn 2| B
    A2 -->|conn 3| B
    A2 -->|conn 4| B
    A3 -->|conn 5| B
    B -->|pooled conn A| C
    B -->|pooled conn B| C
    B -->|pooled conn C| C
    style B fill:#f96,stroke:#333
```
PostgreSQL has a physical limit on concurrent connections. On Supabase’s Free tier, this limit is even stricter. If the FastAPI server creates a new DB connection per request, connection exhaustion occurs under concurrent load.
The Connection Pooler (PgBouncer) solves this. It accepts connection requests from multiple clients and reuses a small number of actual PostgreSQL connections.
Supabase offers two Pooler modes:
| Mode | Port | Behavior | Best For |
|---|---|---|---|
| Transaction | 6543 | Allocates/returns connections per transaction | Serverless patterns, short queries |
| Session | 5432 (Pooler) | Holds connection until session ends | Cases requiring persistent connections |
WICHI’s backend uses Transaction mode. Each FastAPI request is an independent transaction, so Transaction mode — which returns the connection immediately after query execution — is the right fit.
Configuration caveats:
- The `DATABASE_URL` environment variable must use the Pooler address (port 6543). Using a direct connection (port 5432) means no connection pooling, and you'll hit the concurrent connection limit quickly.
- Transaction mode does not support `PREPARE` statements or `LISTEN`/`NOTIFY`. If using SQLAlchemy, you may need to set `prepared_statement_cache_size=0` (see the sketch after this list).
- Migrations must use the direct connection. DDL commands (`CREATE TABLE`, `ALTER TABLE`, etc.) can behave unexpectedly when run through the Transaction mode Pooler.
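A minimal sketch of what this looks like with SQLAlchemy's async engine (the `DATABASE_URL` variable name, pool sizes, and the example host are assumptions):

```python
import os

from sqlalchemy.ext.asyncio import create_async_engine

# DATABASE_URL is assumed to point at the Supabase pooler (port 6543)
# with the prepared statement cache disabled for transaction mode, e.g.:
# postgresql+asyncpg://user:pass@xxx.pooler.supabase.com:6543/postgres?prepared_statement_cache_size=0
engine = create_async_engine(
    os.environ["DATABASE_URL"],
    pool_pre_ping=True,  # detect connections the pooler has already dropped
    pool_size=5,         # keep the client-side pool small; PgBouncer multiplexes
    max_overflow=5,
)
```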
Free Tier vs Pro Plan
Supabase pricing based on publicly available information:
| Item | Free | Pro ($25/mo) |
|---|---|---|
| DB size | 500 MB | 8 GB (expandable) |
| Auth MAU | 50,000 | 100,000 |
| Storage | 1 GB | 100 GB |
| Edge Function invocations | 500K/month | 2M/month |
| Daily backups | Not included | 7-day retention |
| Branching | Not included | Available |
| Pausing | Auto-pause after 7 days of inactivity | None |
The biggest constraint on the free tier is auto-pausing after 7 days of inactivity. A paused project must be manually restored from the dashboard, taking several minutes. This is fine for personal projects or early development, but for any service with real users, upgrading to Pro is mandatory.
The free tier’s 500 MB DB limit is reached faster than you’d expect. If you’re storing log-type data or history tables in the DB, you can hit the ceiling within weeks. It’s worth designing early on which data lives in the DB and which gets offloaded elsewhere.
Migration Management
Supabase supports a migration workflow via its CLI. Commands like `supabase migration new` and `supabase db push` let you manage schema changes.
The temptation is the dashboard’s SQL Editor. “Let me just quickly add one table” via a direct CREATE TABLE in the dashboard creates drift between your local migration files and the actual DB schema. This drift gets harder to resolve the more it accumulates.
Recommended workflow:
- Always write schema changes as local migration files
- Apply to the remote DB with `supabase db push`
- Use the dashboard SQL Editor for data queries only
- Urgent data modifications (DELETE, UPDATE) can go through the dashboard, but schema changes (DDL) should never be done there
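In CLI terms, the loop looks like this (the migration name is illustrative):

```bash
# 1. Create a timestamped migration file under supabase/migrations/
supabase migration new add_profiles_table

# 2. Write the DDL in the generated SQL file, then apply it to the
#    linked remote DB (per the caveat earlier, over the direct connection)
supabase db push
```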
Operational Experience
Deployment Speed and Stability
Railway’s deployments are stable. From GitHub push to the new version receiving traffic, it generally takes 2-4 minutes. Zero-downtime deploy is supported, so there’s no service interruption during deployment.
However, when a build fails, the previous version keeps running, which means a failed deployment can go unnoticed. Without separate deployment failure alerts, you won't know until you manually check the dashboard. Setting up Slack or Discord webhook integration early is recommended.
Uptime
Both Railway and Supabase reserve formal SLAs for their higher-tier plans; in the Free-to-Pro range, observed uptime has been in the high 99% range. We haven't experienced any major service outages, but intermittent delays and brief anomalies did occur.
Incident Log
Key incidents experienced during the operational period:
| # | Type | Symptoms | Root Cause | Detection Method | Resolution Time | Action Taken |
|---|---|---|---|---|---|---|
| 1 | DB connection exhaustion | Spike in API 500 errors | Direct connection (no pooler) under concurrent load | Manual error log review | ~30 min | Switched to Pooler (port 6543) |
| 2 | Cold start delay | First request timeout | Heavy model load during startup after sleep | User report | ~10 min | Lightened startup event |
| 3 | Migration conflict | Deploy failure | Dashboard DDL change conflicted with CLI migration | Deploy log review | ~1 hour | Manual migration file sync |
| 4 | Missing env variable | Service startup failure | Typo in Railway environment variable | Deploy log review | ~5 min | Added env variable validation script |
| 5 | Supabase auto-pause | DB connection failure | Free tier 7-day inactivity auto-pause | Health check failure alert | ~5 min (manual restore) | Upgraded to Pro plan |
The highest-impact incident was #1 (connection exhaustion). Operating with direct connections instead of the Pooler, we hit the concurrent connection limit as users grew. The API intermittently returned 500 errors, and diagnosing the cause took time.
Most incidents were caused by deferring configuration we already knew we should do. Connection Pooler setup, environment variable validation, and alert configuration are things that should be completed alongside the first deployment.
Logging and Monitoring
Railway’s logging system is sufficient for debugging but falls short for production-grade monitoring.
| Item | Railway Built-in | Additional Tooling Needed |
|---|---|---|
| Real-time log streaming | Provided | — |
| Log search/filter | Basic level | When structured log searching is needed |
| Log retention period | Limited | When long-term retention is needed |
| CPU/Memory metrics | Dashboard provided | Alert threshold configuration |
| Custom metrics | Not provided | Prometheus, Datadog, etc. |
| Error tracking | Not provided | Sentry, etc. |
| APM | Not provided | Separate tooling if needed |
The same goes for Supabase. The dashboard shows basic metrics like DB size, connection count, and API request count, but query performance analysis and slow query tracking require separate configuration.
The realistic approach: start with just the Railway and Supabase built-in dashboards in the early stage, then incrementally add Sentry (error tracking) and log drain (long-term log retention) as users grow. Setting up a full monitoring stack from day one is overkill.
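The Sentry step in particular is small enough to do early. A minimal sketch (the `SENTRY_DSN` variable name is an assumption):

```python
import os

import sentry_sdk

# Initialize before the FastAPI app is created; recent sentry-sdk
# versions pick up FastAPI/Starlette automatically.
sentry_sdk.init(
    dsn=os.environ["SENTRY_DSN"],
    traces_sample_rate=0.1,  # sample 10% of requests for performance tracing
)
```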
Performance Benchmarks
Approximate figures measured in the actual production environment. Since exact numbers vary by service characteristics, they’re expressed as ranges.
| Item | Range | Conditions |
|---|---|---|
| API response time (warm) | 50-200 ms | Including DB query, via Pooler |
| API response time (cold start) | 3-8 sec | First request after sleep |
| Deployment time | 2-4 min | Dockerfile build, dependency cache hit |
| DB query (simple SELECT) | 5-20 ms | Indexed table |
| DB query (with JOINs) | 20-100 ms | Varies by table size and index design |
| Pooler overhead | ~5 ms | Additional latency vs. direct connection |
The 3-8 second cold start is not negligible for a production service. However, it only occurs after a period with no traffic, so it’s not an issue during active hours. Users accessing intermittently during nighttime or early morning hours may notice it.
Cost Structure
Specific dollar amounts are not disclosed, but here is the structural overview.
Railway Cost Characteristics
Railway uses usage-proportional billing. Costs are low in months with little traffic and rise in high-traffic months. This unpredictability is the defining characteristic of Railway's billing model.
Methods for controlling costs:
- Spending Alert: Configure notifications when estimated monthly cost exceeds a threshold
- Sleep policy adjustment: Allowing instances to sleep during zero-traffic periods reduces cost (at the trade-off of cold starts)
- Resource caps: Setting CPU and memory ceilings indirectly limits cost ceilings
Supabase Cost Characteristics
Supabase uses tier-based billing. Tiers progress from Free → Pro ($25/mo) → Team ($599/mo) → Enterprise (custom negotiation), with overage charges within each tier.
The typical trigger for upgrading from Free to Pro is one of the following:
- Hitting the 500 MB DB limit
- Needing to avoid the 7-day auto-pause
- Requiring daily backups
- Needing branching functionality
Total Cost Structure Comparison
| Stage | Railway | Supabase | Overall Cost Profile |
|---|---|---|---|
| Development/testing | Free to minimal | Free ($0) | Nearly free |
| Small-scale operations | Usage-proportional | Pro ($25/mo) | Fixed + variable blend |
| Medium-scale operations | Usage-proportional (increasing) | Pro + overage | Growing variability |
| Large-scale | Hard to predict | Team/Enterprise | Migration evaluation zone |
Alternative Comparisons
Server Hosting: Railway vs Render vs Fly.io
| Item | Railway | Render | Fly.io |
|---|---|---|---|
| Deployment method | GitHub push auto | GitHub push auto | CLI (fly deploy) primarily |
| Build | Dockerfile / Nixpacks | Dockerfile / auto-detect | Dockerfile / Buildpacks |
| Cold starts | Yes (on sleep) | Yes (Free tier) | No (minimum 1 machine always on) |
| Pricing model | Usage-based | Instance-based + usage | Instance-based + usage |
| Preview Deploy | Supported | Supported | Not supported (manual setup needed) |
| Region selection | Limited | US/EU | 30+ regions |
| DX | High (intuitive dashboard) | High | Medium (CLI proficiency needed) |
| Docker support | Native | Native | Native |
| Scaling | Vertical + horizontal (limited) | Vertical + horizontal | Vertical + horizontal (fine-grained) |
Why Railway was chosen: It had the best DX, and the fact that deployment was complete with just a GitHub push was decisive. Fly.io is superior in performance and region coverage but its CLI-based workflow increases initial setup cost. Render is similar to Railway, but Railway’s dashboard UX was better at the time of evaluation.
Database: Supabase vs PlanetScale vs Neon
| Item | Supabase | PlanetScale | Neon |
|---|---|---|---|
| DB engine | PostgreSQL | MySQL (Vitess) | PostgreSQL |
| Built-in auth | Auth included | Not included | Not included |
| RLS | Supported | Not supported (MySQL limitation) | Supported (PostgreSQL) |
| Branching | Pro and above | Supported (core feature) | Supported (core feature) |
| Connection pooling | PgBouncer built-in | Custom proxy | Custom proxy |
| Serverless driver | supabase-js | @planetscale/database | @neondatabase/serverless |
| Free tier | 500 MB, 50K MAU | Free tier discontinued in 2025 | 512 MB, 190 compute hours |
| Additional services | Realtime, Storage, Edge Functions | None (DB only) | None (DB only) |
Why Supabase was chosen: Being able to get Auth + RLS + PostgreSQL in one place was decisive. Neon is PostgreSQL-based and similar to Supabase, but lacks Auth, requiring a separate authentication service. PlanetScale is MySQL-based, which excluded it for a PostgreSQL-preferring project.
Migration Decision Criteria
Conditions Where This Stack Can Be Maintained
- Monthly active users (MAU) in the low thousands or fewer
- Traffic is not centered on Korea/Asia (regional latency is not critical)
- Response time requirements are not strict (intermittent cold starts are acceptable)
- Team size is small (1-3 people) — no dedicated infrastructure personnel
- Infrastructure costs are contained within a monthly fixed budget
Signals That Warrant Migration Evaluation
```mermaid
graph TD
    A[Maintain Current Stack] --> B{Cold starts impacting<br/>the business?}
    B -->|No| C{Costs within<br/>predictable range?}
    B -->|Yes| G[Need always-on environment]
    C -->|Yes| D{Multi-region<br/>needed?}
    C -->|No| H[Evaluate fixed-rate plans or<br/>reserved instances]
    D -->|No| E{DB size within<br/>Supabase Pro limits?}
    D -->|Yes| I[Evaluate AWS/GCP migration]
    E -->|Yes| F[Maintain current stack]
    E -->|No| J[Evaluate self-hosted PostgreSQL<br/>or RDS]
    G --> K[Evaluate Fly.io / AWS ECS]
    H --> K
    I --> L[Full cloud migration]
    J --> L
    style F fill:#9f9,stroke:#333
    style K fill:#ff9,stroke:#333
    style L fill:#f99,stroke:#333
```
Specifically, when any one of these conditions applies, it’s time to evaluate migration:
- Cold starts are impacting the business: If real-time response becomes a core requirement, you need an always-on environment.
- Costs become unpredictable: If Railway’s usage-based billing produces high month-to-month variance and cost control becomes difficult, switching to a service with fixed pricing is more rational from a business management perspective.
- Multi-region is needed: If you need to reduce latency for a global user base, a cloud provider with CDN and flexible region selection is advantageous.
- DB size exceeds Supabase Pro limits: If you’re handling data beyond 8 GB or need complex query optimization, self-managed PostgreSQL or AWS RDS should be evaluated.
- Compliance requirements emerge: If regulations apply to data storage location, encryption standards, audit logging, etc., managed services may not provide sufficient control.
Why Leaving Supabase Is Hard
Migrating from Supabase to another PostgreSQL service is not just a data migration problem. Each Supabase feature has a different migration difficulty level.
| Feature | Migration Difficulty | Reason |
|---|---|---|
| PostgreSQL data | Low | Can migrate via pg_dump / pg_restore |
| Schema + RLS policies | Medium | RLS policies depend on Supabase-specific functions (auth.uid()) |
| Auth | High | User tables, JWT structure, and OAuth configuration all need to be rebuilt |
| Storage | Medium | S3-compatible so data transfer is possible, but URL references need updating |
| Realtime | High | Requires building a replacement solution from scratch |
| Edge Functions | Medium | Deno-based; requires rewriting for another serverless platform |
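For scale, the "Low" row is a standard dump-and-restore (a sketch; the env variable names are assumptions, and the dump should run against the direct connection, not the pooler):

```bash
# Dump from Supabase using the direct connection string
pg_dump "$SUPABASE_DIRECT_URL" --format=custom --no-owner -f wichi_backup.dump

# Restore into the new PostgreSQL instance
pg_restore --no-owner --dbname "$NEW_DATABASE_URL" wichi_backup.dump
```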
The key is Auth. If you’re dependent on Supabase Auth, migration means redesigning your entire authentication system. This isn’t a simple infrastructure swap — it’s closer to a product architecture change. It’s important to recognize this lock-in at the outset when choosing Supabase.
Stack migration should not be “we’ll switch when problems arise.” Define migration criteria upfront and check them periodically. When the migration moment arrives, you’re already under operational load — so set the criteria while you still have bandwidth.
Combined Operations Checklist
When setting up Railway + Supabase for the first time, the following items should be completed before the first deployment:
- Set the Supabase Pooler address (port 6543) in the Railway service
- Separate the `service_role` key and `anon` key in environment variables
- Configure Railway Spending Alert
- Set up deployment failure alerts (Slack/Discord webhook)
- Implement health check endpoint (including DB connection status; a sketch follows this list)
- Write RLS policies and verify with test data
- Decide on migration workflow (CLI only, no dashboard DDL)
- Decide on Preview Deploy DB connection strategy (whether to separate a staging DB)
- Confirm Supabase free tier auto-pause policy
- Decide on log retention strategy (built-in vs. log drain)
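For the health check item, a minimal sketch that exercises the pooler path (the `/health` route and the `engine` import path are assumptions; `engine` is the async engine from the pooling section):

```python
from fastapi import FastAPI
from fastapi.responses import JSONResponse
from sqlalchemy import text

from app.db import engine  # assumption: the async engine configured earlier

app = FastAPI()

@app.get("/health")
async def health() -> JSONResponse:
    """Railway polls this endpoint; it verifies DB reachability via the pooler."""
    try:
        async with engine.connect() as conn:
            await conn.execute(text("SELECT 1"))
    except Exception:
        # A non-2xx response fails Railway's health check, so a broken
        # instance never receives traffic. This also surfaces Supabase
        # auto-pause (incident #5) as an alert instead of user errors.
        return JSONResponse(status_code=503, content={"status": "degraded", "db": "unreachable"})
    return JSONResponse(content={"status": "ok", "db": "ok"})
```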
Conclusion
Railway + Supabase is a stack well-suited for early-stage products. It delivers deployment automation, managed PostgreSQL, and built-in authentication with minimal configuration. The trade-off is accepting cold starts, unpredictable costs, and monitoring limitations.
The essence of this stack is “choosing not to spend time on infrastructure so you can focus on the product.” Once the product is validated and traffic stabilizes, evaluate migration — but until then, maximizing this stack’s development speed advantage is the rational approach.
That said, starting with AWS/GCP from day one is over-engineering. Define your migration criteria in advance, but maintain the current stack until real bottlenecks appear.