Railway + Supabase Operational Review

MJ · 13 min read

Operational review of WICHI's backend on Railway and Supabase, covering real-world incidents like connection pooling and migration conflicts, plus criteria for infrastructure migration.

Why This Stack

WICHI’s backend is a FastAPI + PostgreSQL stack. When choosing infrastructure to host it, we had three criteria:

  1. Deployment must be simple — a GitHub push should be all it takes to deploy. We didn’t want to spend time configuring a separate CI/CD pipeline.
  2. DB + auth in one place — PostgreSQL, user authentication, and Row Level Security (RLS) should all be available within a single platform, without combining multiple services.
  3. Minimal initial cost — at a pre-revenue stage, fixed costs had to be minimized. Free tiers or usage-based billing were mandatory.

Railway satisfied the first and third criteria. Supabase satisfied the second and third. Combining the two covers everything from FastAPI app deployment to PostgreSQL operations, authentication, and RLS — without any additional external services.

This post covers both services in detail based on real operational experience, highlights things to know when running them in combination, and outlines the criteria for knowing when to migrate away from this stack.

Architecture Overview

```mermaid
graph TB
    subgraph Client
        A[Web App - Vercel]
    end

    subgraph Railway
        B[FastAPI Server]
        C[Health Check Endpoint]
    end

    subgraph Supabase
        D[PostgreSQL DB]
        E[Connection Pooler - PgBouncer]
        F[Auth Service]
        G[REST API - PostgREST]
        H[Realtime]
        I[Storage]
    end

    subgraph External
        J[GitHub Repository]
    end

    A -->|HTTPS| B
    B -->|port 6543 - Transaction Mode| E
    E --> D
    A -->|Direct| F
    J -->|push trigger| B
    C -->|DB connection check| E

    style E fill:#f9f,stroke:#333
    style B fill:#bbf,stroke:#333
```

The critical detail in this diagram is that the FastAPI server does not connect directly to the Supabase DB. It routes through the Connection Pooler (PgBouncer). Why this matters is covered in detail in the Supabase section.


Railway: In Detail

Core Feature: GitHub Auto-Deploy

Railway’s greatest strength is GitHub-integrated automatic deployment. Once you connect a repository, every push to the main branch automatically triggers a build and deploy. No separate GitHub Actions workflow required.

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub
    participant RW as Railway
    participant Svc as Service

    Dev->>GH: git push origin main
    GH->>RW: Webhook trigger
    RW->>RW: Detect Dockerfile / Nixpacks
    RW->>RW: Build image
    RW->>Svc: Deploy new version
    RW->>RW: Health check pass
    RW-->>Svc: Route traffic to new version
    Note over RW,Svc: Zero-downtime rolling deploy
```

The deployment process works as follows:

  1. Push to GitHub triggers Railway via webhook
  2. If a Dockerfile exists, Docker build runs; otherwise, Nixpacks auto-configures the build environment
  3. After image build completes, deployment goes to a new container
  4. Traffic switches to the new version once health check passes

For a typical FastAPI app, build time (including dependency installation) runs 1-3 minutes, and traffic switchover completes within tens of seconds. Writing your own Dockerfile enables layer caching, which dramatically reduces build time when dependencies haven’t changed.
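A minimal sketch of a layer-cached Dockerfile for a FastAPI app. The file layout (`requirements.txt`, `app/main.py`) is an assumption for illustration, not WICHI's actual structure; the point is the layer ordering, which lets the dependency-install layer be reused whenever `requirements.txt` is unchanged.

```dockerfile
# Sketch of a layer-cached FastAPI Dockerfile. File names are
# illustrative assumptions, not WICHI's actual layout.
FROM python:3.12-slim

WORKDIR /app

# Copy only the dependency manifest first, so this layer (and the
# pip install below) stays cached while requirements.txt is unchanged.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code changes on nearly every push; keep it in a later
# layer so code changes don't invalidate the dependency cache.
COPY . .

# Railway injects PORT at runtime; bind Uvicorn to it.
CMD ["sh", "-c", "uvicorn app.main:app --host 0.0.0.0 --port ${PORT:-8000}"]
```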

Preview Deploy

Opening a PR automatically creates a Preview Deploy for that branch. It gets its own independent URL, so you can verify real behavior during PR review. This feature is similar to Vercel’s Preview Deploy but applied to backend services, which makes it particularly useful.

One caveat: you need to decide upfront how to manage the DB connection string for Preview Deploys. Using the production DB is risky, and connecting a separate staging DB requires environment variable separation.
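One way to handle that separation is to resolve the connection string from the deployment environment at startup. This is a hedged sketch: `RAILWAY_ENVIRONMENT_NAME` and the `*_DATABASE_URL` variable names are assumptions for illustration — check your own Railway project's variables.

```python
# Sketch: pick a DB URL by deployment environment so PR previews
# never touch production. RAILWAY_ENVIRONMENT_NAME and the
# *_DATABASE_URL names are illustrative assumptions.
def resolve_database_url(env: dict) -> str:
    environment = env.get("RAILWAY_ENVIRONMENT_NAME", "production")
    if environment == "production":
        return env["PROD_DATABASE_URL"]
    # Every non-production environment (PR previews included)
    # falls back to a dedicated staging database.
    return env["STAGING_DATABASE_URL"]
```

In the app itself you would call this with `os.environ`; taking a plain dict keeps the function trivially testable.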

Usage-Based Billing

Railway’s billing is calculated along four axes:

| Billing Item | Unit | Notes |
| --- | --- | --- |
| vCPU | Per hour | Based on actual CPU time used |
| Memory | GB-hours | Based on usage, not allocation |
| Network Egress | GB | Only outbound traffic is billed |
| Disk | GB-hours | When persistent storage is used |

The advantage of this structure is low costs when traffic is low. The disadvantage is that it’s hard to know the exact cost until you open the month-end invoice. The dashboard shows a real-time Estimated Cost, but this is a running total, not a month-end projection.

Railway’s billing model is a “pay for what you use” structure. If there’s no traffic, costs are nearly zero — but a traffic spike can generate unexpected charges. Setting up Spending Alerts is not optional; it’s essential.

Developer Experience (DX)

Railway’s DX ranks among the best in its class. Here’s a breakdown by category:

| DX Area | Rating | Details |
| --- | --- | --- |
| Initial setup | Excellent | First deploy within 5 minutes of GitHub connection |
| Dashboard | Excellent | Visual service topology, intuitive |
| Logs | Good | Real-time streaming supported; search and filtering are limited |
| Environment variable management | Excellent | Per-service separation, reference variable (`${{}}` syntax) support |
| CLI | Good | Basic commands like `railway run`, `railway logs` |
| Documentation | Fair | Core content is there, but edge-case docs are lacking |
| Community | Good | Active Discord, fast response times |

A note on logging: Railway’s built-in log viewer is sufficient for debugging, but falls short for structured log searching or long-term retention. For production, integrating an external log service (Axiom, Better Stack, etc.) is recommended. Railway supports log drain, so integration isn’t difficult.

Limitations

Limitations observed while using Railway:

  • Cold starts: When there’s no traffic, instances enter a sleep state. The first request can take several seconds. For FastAPI apps, this includes Uvicorn startup time, making the perceived delay noticeable.
  • Limited regions: Region options are fewer than AWS or GCP. Asia regions exist, but there’s no Seoul region for Korean users. This directly impacts response latency.
  • Scaling control: Horizontal scaling (running multiple instances) is possible, but you can’t configure the fine-grained auto-scaling policies available with AWS ECS or Kubernetes. This is a limitation for services with irregular traffic patterns.
  • Cron job limitations: You can spin up a separate cron service on Railway, but there’s no built-in scheduler. You need to embed a library like APScheduler directly in your app or separate it into a dedicated service.
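For the cron limitation, the lightest-weight option is embedding a periodic job in the app process. This is a stdlib-only sketch of that idea (a stand-in for a library like APScheduler); note it is not durable — jobs die with the instance, and a sleeping instance never fires, so real cron work belongs in a separate always-on service.

```python
import threading
import time

# Minimal stdlib sketch of an in-process periodic job, in place of
# pulling in APScheduler. Started from an app startup event. Not
# durable: jobs die with the instance, and a sleeping instance
# never fires.
def run_every(interval_seconds: float, job, *, iterations=None):
    """Call job() every interval_seconds; iterations limits runs (None = forever)."""
    count = 0
    while iterations is None or count < iterations:
        job()
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval_seconds)

def start_background_job(interval_seconds: float, job) -> threading.Thread:
    # Daemon thread so the job never blocks process shutdown.
    t = threading.Thread(target=run_every, args=(interval_seconds, job), daemon=True)
    t.start()
    return t
```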

Supabase: In Detail

PostgreSQL as a Service

Supabase’s core offering is managed PostgreSQL. Instance provisioning, backups, and upgrades are handled by Supabase, so you rarely need to directly manage DB operations.

Supabase is architecturally a set of services layered on top of PostgreSQL. You don’t have to use all of them. Here’s the breakdown of what WICHI actually uses versus what it doesn’t:

| Feature | Used? | Purpose / Reason for Not Using |
| --- | --- | --- |
| PostgreSQL DB | Yes | Core data store |
| Auth | Yes | Email/OAuth authentication, JWT issuance |
| Row Level Security (RLS) | Yes | Per-user data access control |
| Connection Pooler | Yes | Railway → DB connection management |
| SQL Editor | Yes | Ad-hoc queries, data verification |
| REST API (PostgREST) | No | FastAPI handles queries directly |
| Realtime | No | No real-time features needed currently |
| Storage | No | No file upload functionality |
| Edge Functions | No | Server logic handled by Railway |

Looking at just the features we use, Supabase is effectively “managed PostgreSQL + Auth + RLS.” This combination alone significantly reduces management overhead compared to operating separate authentication (Auth0, Firebase Auth) and database services (RDS, Cloud SQL).

Auth Service

Supabase Auth is a GoTrue-based authentication service. It supports email/password authentication and OAuth (Google, GitHub, etc.) login out of the box, and handles JWT token issuance and refresh automatically.

On the FastAPI server, we verify the JWT sent by the client and execute DB queries based on the user ID in the token. Not having to build the auth service from scratch is the biggest advantage.
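The shape of that server-side verification can be sketched with the standard library alone. This assumes the project's JWTs are HS256-signed with the Supabase JWT secret (the legacy default — newer projects may use asymmetric keys instead); production code would normally use a library like PyJWT and also validate `exp` and `aud`, which this sketch omits.

```python
import base64
import hashlib
import hmac
import json

# Stdlib sketch of HS256 JWT verification — the shape of what the
# FastAPI server does with Supabase's access token. Assumes the
# legacy HS256 setup; real code should use PyJWT and check exp/aud.
def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def verify_supabase_jwt(token: str, jwt_secret: str) -> dict:
    header_b64, payload_b64, signature_b64 = token.split(".")
    expected = hmac.new(
        jwt_secret.encode(),
        f"{header_b64}.{payload_b64}".encode(),
        hashlib.sha256,
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    claims = json.loads(_b64url_decode(payload_b64))
    return claims  # claims["sub"] is the Supabase user ID
```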

There’s an important caveat, though. Supabase Auth’s JWT includes role and sub (user ID) by default. RLS policies can identify the current user via auth.uid(), but this only works automatically when accessing through the Supabase client library. When querying PostgreSQL directly from FastAPI, RLS is not automatically applied — you need to write separate access control logic on the server side.

RLS does not mean “configured and therefore safe.” The scope of RLS enforcement depends on how the DB is accessed. Accessing with the service_role key bypasses RLS entirely, so managing this key is the crux of security.
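For concreteness, this is what a typical owner-only RLS policy looks like; the table and column names here are hypothetical, but `auth.uid()` is the Supabase helper that returns the current JWT's user ID.

```sql
-- Illustrative RLS policy; table and column names are hypothetical.
alter table public.notes enable row level security;

create policy "owners can read their notes"
  on public.notes for select
  using (auth.uid() = user_id);

-- Note: service_role connections bypass policies like this entirely.
```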

Connection Pooling: A Deep Dive

This is the most important configuration in Supabase operations, warranting its own section.

```mermaid
graph LR
    subgraph Railway
        A1[FastAPI Instance 1]
        A2[FastAPI Instance 2]
        A3[FastAPI Instance 3]
    end

    subgraph Supabase Connection Pooler
        B[PgBouncer<br/>port 6543<br/>Transaction Mode]
    end

    subgraph PostgreSQL
        C[DB Instance<br/>Max Connections Limited]
    end

    A1 -->|conn 1| B
    A1 -->|conn 2| B
    A2 -->|conn 3| B
    A2 -->|conn 4| B
    A3 -->|conn 5| B
    B -->|pooled conn A| C
    B -->|pooled conn B| C
    B -->|pooled conn C| C

    style B fill:#f96,stroke:#333
```

PostgreSQL has a physical limit on concurrent connections. On Supabase’s Free tier, this limit is even stricter. If the FastAPI server creates a new DB connection per request, connection exhaustion occurs under concurrent load.

The Connection Pooler (PgBouncer) solves this. It accepts connection requests from multiple clients and reuses a small number of actual PostgreSQL connections.

Supabase offers two Pooler modes:

| Mode | Port | Behavior | Best For |
| --- | --- | --- | --- |
| Transaction | 6543 | Allocates/returns connections per transaction | Serverless patterns, short queries |
| Session | 5432 (Pooler) | Holds connection until session ends | Cases requiring persistent connections |

WICHI’s backend uses Transaction mode. Each FastAPI request is an independent transaction, so Transaction mode — which returns the connection immediately after query execution — is the right fit.

Configuration caveats:

  • The DATABASE_URL environment variable must use the Pooler address (port 6543). Using a direct connection (port 5432) means no connection pooling, and you’ll hit the concurrent connection limit quickly.
  • Transaction mode does not support PREPARE statements or LISTEN/NOTIFY. If using SQLAlchemy, you may need to set prepared_statement_cache_size=0.
  • Migrations must use the direct connection. DDL commands (CREATE TABLE, ALTER TABLE, etc.) can behave unexpectedly when run through the Transaction mode Pooler.
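The pooler-for-runtime, direct-for-migrations split can be captured in a small helper so the two URLs are always derived from one base string and can't drift. A stdlib sketch — the hostname and credentials below are placeholders, not real values:

```python
from urllib.parse import urlsplit, urlunsplit

# Sketch: derive pooler (6543) vs direct (5432) URLs from a single
# base connection string. Hostname/credentials are placeholders.
def with_port(database_url: str, port: int) -> str:
    parts = urlsplit(database_url)
    auth = ""
    if parts.username:
        auth = parts.username
        if parts.password:
            auth += f":{parts.password}"
        auth += "@"
    return urlunsplit(parts._replace(netloc=f"{auth}{parts.hostname}:{port}"))

BASE = "postgresql://app_user:secret@db.example.supabase.co:5432/postgres"
APP_DATABASE_URL = with_port(BASE, 6543)        # runtime traffic via PgBouncer
MIGRATION_DATABASE_URL = with_port(BASE, 5432)  # DDL over the direct connection
```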

Free Tier vs Pro Plan

Supabase pricing based on publicly available information:

| Item | Free | Pro ($25/mo) |
| --- | --- | --- |
| DB size | 500 MB | 8 GB (expandable) |
| Auth MAU | 50,000 | 100,000 |
| Storage | 1 GB | 100 GB |
| Edge Function invocations | 500K/month | 2M/month |
| Daily backups | Not included | 7-day retention |
| Branching | Not included | Available |
| Pausing | Auto-pause after 7 days of inactivity | None |

The biggest constraint on the free tier is auto-pausing after 7 days of inactivity. A paused project must be manually restored from the dashboard, taking several minutes. This is fine for personal projects or early development, but for any service with real users, upgrading to Pro is mandatory.

The free tier’s 500 MB DB limit is reached faster than you’d expect. If you’re storing log-type data or history tables in the DB, you can hit the ceiling within weeks. It’s worth designing early on which data lives in the DB and which gets offloaded elsewhere.

Migration Management

Supabase supports a migration workflow via its CLI. Commands like supabase migration new and supabase db push let you manage schema changes.

The big temptation is the dashboard’s SQL Editor. Adding “just one quick table” with a direct CREATE TABLE in the dashboard creates drift between your local migration files and the actual DB schema, and that drift gets harder to resolve the more it accumulates.

Recommended workflow:

  1. Always write schema changes as local migration files
  2. Apply to the remote DB with supabase db push
  3. Use the dashboard SQL Editor for data queries only
  4. Urgent data modifications (DELETE, UPDATE) can go through the dashboard, but schema changes (DDL) should never be done there

Operational Experience

Deployment Speed and Stability

Railway’s deployments are stable. From GitHub push to the new version receiving traffic, it generally takes 2-4 minutes. Zero-downtime deploy is supported, so there’s no service interruption during deployment.

However, when a build fails, the previous version is retained — which means “not realizing a deployment failed” is possible. Without separate deployment failure alerts, you won’t know until you manually check the dashboard. Setting up Slack or Discord webhook integration early is recommended.

Uptime

Both Railway and Supabase offer SLA-backed plans, but on the Free and Pro tiers, our observed uptime was in the high 99% range. We experienced no major service outages, though intermittent delays and brief anomalies did occur.

Incident Log

Key incidents experienced during the operational period:

| # | Type | Symptoms | Root Cause | Detection Method | Resolution Time | Action Taken |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | DB connection exhaustion | Spike in API 500 errors | Direct connection (no pooler) under concurrent load | Manual error log review | ~30 min | Switched to Pooler (port 6543) |
| 2 | Cold start delay | First request timeout | Heavy model load during startup after sleep | User report | ~10 min | Lightened startup event |
| 3 | Migration conflict | Deploy failure | Dashboard DDL change conflicted with CLI migration | Deploy log review | ~1 hour | Manual migration file sync |
| 4 | Missing env variable | Service startup failure | Typo in Railway environment variable | Deploy log review | ~5 min | Added env variable validation script |
| 5 | Supabase auto-pause | DB connection failure | Free tier 7-day inactivity auto-pause | Health check failure alert | ~5 min (manual restore) | Upgraded to Pro plan |

The highest-impact incident was #1 (connection exhaustion). Operating with direct connections instead of the Pooler, we hit the concurrent connection limit as users grew. The API intermittently returned 500 errors, and diagnosing the cause took time.

Most incidents were caused by deferring configuration we already knew we should do. Connection Pooler setup, environment variable validation, and alert configuration are things that should be completed alongside the first deployment.
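The validation script from incident #4 can be as small as a startup guard that fails fast with a readable message instead of a cryptic runtime error. A sketch — the variable names are illustrative; list whatever your service actually needs:

```python
import os
import sys

# Startup guard against incident #4 (a typo'd environment variable):
# fail fast with a clear message. Variable names are illustrative.
REQUIRED_VARS = ["DATABASE_URL", "SUPABASE_URL", "SUPABASE_ANON_KEY"]

def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the required names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]

def validate_env_or_exit():
    missing = missing_env_vars()
    if missing:
        print(f"Missing environment variables: {', '.join(missing)}", file=sys.stderr)
        sys.exit(1)
```

Call `validate_env_or_exit()` before the app starts accepting traffic; a misconfigured deploy then shows up as an immediate, legible failure in the deploy logs.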

Logging and Monitoring

Railway’s logging system is sufficient for debugging but falls short for production-grade monitoring.

| Item | Railway Built-in | Additional Tooling Needed |
| --- | --- | --- |
| Real-time log streaming | Provided | — |
| Log search/filter | Basic level | When structured log searching is needed |
| Log retention period | Limited | When long-term retention is needed |
| CPU/Memory metrics | Dashboard provided | Alert threshold configuration |
| Custom metrics | Not provided | Prometheus, Datadog, etc. |
| Error tracking | Not provided | Sentry, etc. |
| APM | Not provided | Separate tooling if needed |

The same goes for Supabase. The dashboard shows basic metrics like DB size, connection count, and API request count, but query performance analysis and slow query tracking require separate configuration.

The realistic approach: start with just the Railway and Supabase built-in dashboards in the early stage, then incrementally add Sentry (error tracking) and log drain (long-term log retention) as users grow. Setting up a full monitoring stack from day one is overkill.

Performance Benchmarks

Approximate figures measured in the actual production environment. Since exact numbers vary by service characteristics, they’re expressed as ranges.

| Item | Range | Conditions |
| --- | --- | --- |
| API response time (warm) | 50-200 ms | Including DB query, via Pooler |
| API response time (cold start) | 3-8 sec | First request after sleep |
| Deployment time | 2-4 min | Dockerfile build, dependency cache hit |
| DB query (simple SELECT) | 5-20 ms | Indexed table |
| DB query (with JOINs) | 20-100 ms | Varies by table size and index design |
| Pooler overhead | ~5 ms | Additional latency vs. direct connection |

The 3-8 second cold start is not negligible for a production service. However, it only occurs after a period with no traffic, so it’s not an issue during active hours. Users accessing intermittently during nighttime or early morning hours may notice it.


Cost Structure

Specific dollar amounts are not disclosed, but here is the structural overview.

Railway Cost Characteristics

Railway uses usage-proportional billing. Costs are low in months with little traffic and rise in high-traffic months. This unpredictability is Railway’s billing model’s defining characteristic.

Methods for controlling costs:

  1. Spending Alert: Configure notifications when estimated monthly cost exceeds a threshold
  2. Sleep policy adjustment: Allowing instances to sleep during zero-traffic periods reduces cost (at the trade-off of cold starts)
  3. Resource caps: Setting CPU and memory ceilings indirectly limits cost ceilings

Supabase Cost Characteristics

Supabase uses tier-based billing. Tiers progress from Free → Pro ($25/mo) → Team ($599/mo) → Enterprise (custom negotiation), with overage charges within each tier.

The typical trigger for upgrading from Free to Pro is one of the following:

  • Hitting the 500 MB DB limit
  • Needing to avoid the 7-day auto-pause
  • Requiring daily backups
  • Needing branching functionality

Total Cost Structure Comparison

| Stage | Railway | Supabase | Overall Cost Profile |
| --- | --- | --- | --- |
| Development/testing | Free to minimal | Free ($0) | Nearly free |
| Small-scale operations | Usage-proportional | Pro ($25/mo) | Fixed + variable blend |
| Medium-scale operations | Usage-proportional (increasing) | Pro + overage | Growing variability |
| Large-scale | Hard to predict | Team/Enterprise | Migration evaluation zone |

Alternative Comparisons

Server Hosting: Railway vs Render vs Fly.io

| Item | Railway | Render | Fly.io |
| --- | --- | --- | --- |
| Deployment method | GitHub push auto | GitHub push auto | CLI (`fly deploy`) primarily |
| Build | Dockerfile / Nixpacks | Dockerfile / auto-detect | Dockerfile / Buildpacks |
| Cold starts | Yes (on sleep) | Yes (Free tier) | No (minimum 1 machine always on) |
| Pricing model | Usage-based | Instance-based + usage | Instance-based + usage |
| Preview Deploy | Supported | Supported | Not supported (manual setup needed) |
| Region selection | Limited | US/EU | 30+ regions |
| DX | High (intuitive dashboard) | High | Medium (CLI proficiency needed) |
| Docker support | Native | Native | Native |
| Scaling | Vertical + horizontal (limited) | Vertical + horizontal | Vertical + horizontal (fine-grained) |

Why Railway was chosen: It had the best DX, and the fact that deployment was complete with just a GitHub push was decisive. Fly.io is superior in performance and region coverage but its CLI-based workflow increases initial setup cost. Render is similar to Railway, but Railway’s dashboard UX was better at the time of evaluation.

Database: Supabase vs PlanetScale vs Neon

| Item | Supabase | PlanetScale | Neon |
| --- | --- | --- | --- |
| DB engine | PostgreSQL | MySQL (Vitess) | PostgreSQL |
| Built-in auth | Auth included | Not included | Not included |
| RLS | Supported | Not supported (MySQL limitation) | Supported (PostgreSQL) |
| Branching | Pro and above | Supported (core feature) | Supported (core feature) |
| Connection pooling | PgBouncer built-in | Custom proxy | Custom proxy |
| Serverless driver | `supabase-js` | `@planetscale/database` | `@neondatabase/serverless` |
| Free tier | 500 MB, 50K MAU | Free tier discontinued in 2025 | 512 MB, 190 compute hours |
| Additional services | Realtime, Storage, Edge Functions | None (DB only) | None (DB only) |

Why Supabase was chosen: Being able to get Auth + RLS + PostgreSQL in one place was decisive. Neon is PostgreSQL-based and similar to Supabase, but lacks Auth, requiring a separate authentication service. PlanetScale is MySQL-based, which excluded it for a PostgreSQL-preferring project.


Migration Decision Criteria

Conditions Where This Stack Can Be Maintained

  • Monthly active users (MAU) in the low thousands or fewer
  • Traffic is not centered on Korea/Asia (regional latency is not critical)
  • Response time requirements are not strict (intermittent cold starts are acceptable)
  • Team size is small (1-3 people) — no dedicated infrastructure personnel
  • Infrastructure costs are contained within a monthly fixed budget

Signals That Warrant Migration Evaluation

```mermaid
graph TD
    A[Maintain Current Stack] --> B{Cold starts impacting<br/>the business?}
    B -->|No| C{Costs within<br/>predictable range?}
    B -->|Yes| G[Need always-on environment]
    C -->|Yes| D{Multi-region<br/>needed?}
    C -->|No| H[Evaluate fixed-rate plans or<br/>reserved instances]
    D -->|No| E{DB size within<br/>Supabase Pro limits?}
    D -->|Yes| I[Evaluate AWS/GCP migration]
    E -->|Yes| F[Maintain current stack]
    E -->|No| J[Evaluate self-hosted PostgreSQL<br/>or RDS]

    G --> K[Evaluate Fly.io / AWS ECS]
    H --> K
    I --> L[Full cloud migration]
    J --> L

    style F fill:#9f9,stroke:#333
    style K fill:#ff9,stroke:#333
    style L fill:#f99,stroke:#333
```

Specifically, when any one of these conditions applies, it’s time to evaluate migration:

  1. Cold starts are impacting the business: If real-time response becomes a core requirement, you need an always-on environment.
  2. Costs become unpredictable: If Railway’s usage-based billing produces high month-to-month variance and cost control becomes difficult, switching to a service with fixed pricing is more rational from a business management perspective.
  3. Multi-region is needed: If you need to reduce latency for a global user base, a cloud provider with CDN and flexible region selection is advantageous.
  4. DB size exceeds Supabase Pro limits: If you’re handling data beyond 8 GB or need complex query optimization, self-managed PostgreSQL or AWS RDS should be evaluated.
  5. Compliance requirements emerge: If regulations apply to data storage location, encryption standards, audit logging, etc., managed services may not provide sufficient control.

Why Leaving Supabase Is Hard

Migrating from Supabase to another PostgreSQL service is not just a data migration problem. Each Supabase feature has a different migration difficulty level.

| Feature | Migration Difficulty | Reason |
| --- | --- | --- |
| PostgreSQL data | Low | Can migrate via `pg_dump` / `pg_restore` |
| Schema + RLS policies | Medium | RLS policies depend on Supabase-specific functions (`auth.uid()`) |
| Auth | High | User tables, JWT structure, and OAuth configuration all need to be rebuilt |
| Storage | Medium | S3-compatible, so data transfer is possible, but URL references need updating |
| Realtime | High | Requires building a replacement solution from scratch |
| Edge Functions | Medium | Deno-based; requires rewriting for another serverless platform |

The key is Auth. If you’re dependent on Supabase Auth, migration means redesigning your entire authentication system. This isn’t a simple infrastructure swap — it’s closer to a product architecture change. It’s important to recognize this lock-in at the outset when choosing Supabase.

Stack migration should not be “we’ll switch when problems arise.” Define migration criteria upfront and check them periodically. When the migration moment arrives, you’re already under operational load — so set the criteria while you still have bandwidth.


Combined Operations Checklist

When setting up Railway + Supabase for the first time, the following items should be completed before the first deployment:

  • Set the Supabase Pooler address (port 6543) in the Railway service
  • Separate service_role key and anon key in environment variables
  • Configure Railway Spending Alert
  • Set up deployment failure alerts (Slack/Discord webhook)
  • Implement health check endpoint (including DB connection status)
  • Write RLS policies and verify with test data
  • Decide on migration workflow (CLI only, no dashboard DDL)
  • Decide on Preview Deploy DB connection strategy (whether to separate a staging DB)
  • Confirm Supabase free tier auto-pause policy
  • Decide on log retention strategy (built-in vs. log drain)
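The health check item deserves one concrete note: include DB reachability, not just process liveness, since a paused Supabase project (incident #5) looks healthy to a process-only probe. A framework-agnostic sketch — in the real FastAPI endpoint, `ping_db` would run something like `SELECT 1` through the pooler:

```python
# Sketch of a health check that includes DB reachability. Written as
# a plain function so the FastAPI route and actual DB driver call
# stay out of the way; `ping_db` is injected for testability.
def health_status(ping_db) -> dict:
    try:
        ping_db()  # e.g. run SELECT 1 through the pooler
        return {"status": "ok", "db": "connected"}
    except Exception as exc:
        # Report the failure instead of crashing the probe itself, so
        # the platform sees a clear degraded response.
        return {"status": "degraded", "db": f"error: {exc}"}
```

A Railway health check pointed at an endpoint returning this payload catches both app-level and DB-level failures with one probe.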

Conclusion

Railway + Supabase is a stack well-suited for early-stage products. It delivers deployment automation, managed PostgreSQL, and built-in authentication with minimal configuration. The trade-off is accepting cold starts, unpredictable costs, and monitoring limitations.

The essence of this stack is “choosing not to spend time on infrastructure so you can focus on the product.” Once the product is validated and traffic stabilizes, evaluate migration — but until then, maximizing this stack’s development speed advantage is the rational approach.

That said, starting with AWS/GCP from day one is over-engineering. Define your migration criteria in advance, but maintain the current stack until real bottlenecks appear.
