There is a specific kind of anxiety that comes with deploying to production on a Friday afternoon. Your monitoring dashboard is open in one tab, your terminal in another, and somewhere in the back of your mind you are calculating how long it will take to roll back if something goes wrong.
For the past two years, I have been running a handful of services on a single Hetzner VPS. Nothing glamorous: a Node.js API, a static frontend, a webhook processor, and a PostgreSQL database. The kind of setup where Kubernetes would be overkill but manual docker compose down && docker compose up means a few seconds of downtime every deploy.
Those few seconds add up. More importantly, they add up in customer-visible ways. A failed health check here, a dropped WebSocket connection there. So I built a blue-green deployment pipeline using nothing more than Docker Compose and Nginx.
The Architecture
The core idea is simple: run two identical copies of your application stack behind Nginx. At any given moment, one is "live" (receiving traffic) and the other is "standby" (either idle or running the previous version). When you deploy, you bring up the new version on the standby stack, verify it is healthy, then tell Nginx to switch traffic over.
The trick is that Nginx can reload its configuration without dropping existing connections. An nginx -s reload spins up new worker processes with the updated config while the old workers finish serving their in-flight requests, so traffic shifts from one upstream to the other without a gap. Combined with Docker Compose's ability to run multiple project instances using the -p flag, you get a surprisingly robust deployment pipeline.
Setting Up the Compose Files
I use a single docker-compose.yml with environment variable interpolation to distinguish between the blue and green stacks. The key is the project name and the port bindings.
version: "3.8"

services:
  api:
    image: myapp/api:${TAG:-latest}
    ports:
      - "${API_PORT:-3001}:3000"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=${DATABASE_URL}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s
    restart: unless-stopped

  worker:
    image: myapp/worker:${TAG:-latest}
    environment:
      - REDIS_URL=${REDIS_URL}
    restart: unless-stopped
When I deploy, the script sets API_PORT=3001 for the blue stack and API_PORT=3002 for the green stack. Nginx knows about both ports and routes to whichever is currently active.
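Conceptually, the two stacks are nothing more than two project instances of the same compose file, launched with different project names, ports, and tags. Stripped down to its essence (the tags here are placeholders), it amounts to:

# Two isolated copies of the same compose file, distinguished by project name.
# Compose prefixes container names with the project (blue-api-1, green-api-1, ...),
# so both stacks can run side by side on one host.
API_PORT=3001 TAG=v41 docker compose -p blue up -d    # previous release
API_PORT=3002 TAG=v42 docker compose -p green up -d   # new release

The deploy script below automates exactly this, plus the health check and the traffic swap.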
The Deployment Script
The deploy script is roughly 80 lines of bash. It determines which stack is currently active, brings up the other one with the new image tag, waits for health checks to pass, then swaps the Nginx upstream configuration.
#!/bin/bash
set -euo pipefail

# Determine which stack is currently live
ACTIVE=$(cat /etc/nginx/active-stack 2>/dev/null || echo "blue")

if [ "$ACTIVE" = "blue" ]; then
  TARGET="green"
  TARGET_PORT=3002
else
  TARGET="blue"
  TARGET_PORT=3001
fi

echo "Deploying to $TARGET stack (port $TARGET_PORT)..."

# Pull new images and start the target stack
TAG="$1" API_PORT="$TARGET_PORT" \
  docker compose -p "$TARGET" up -d --pull always

# Wait for the health check; abort (leaving the old stack live) if it never passes
echo "Waiting for health check..."
HEALTHY=0
for i in $(seq 1 30); do
  if curl -sf "http://localhost:$TARGET_PORT/health" > /dev/null; then
    echo "Health check passed on attempt $i"
    HEALTHY=1
    break
  fi
  sleep 2
done

if [ "$HEALTHY" -ne 1 ]; then
  echo "Health check never passed; $ACTIVE stays active." >&2
  exit 1
fi

# Swap the Nginx upstream to the new stack
sed -i "s/server 127.0.0.1:.*/server 127.0.0.1:$TARGET_PORT;/" \
  /etc/nginx/conf.d/upstream.conf
nginx -s reload

echo "$TARGET" > /etc/nginx/active-stack
echo "Deployed. $TARGET is now active."
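Assuming the script is saved as deploy.sh (the name is arbitrary), a deploy is a single command with the image tag as its only argument, and one way to roll back is simply to re-run it with the previous tag:

# Deploy a new tag to the standby stack, then flip traffic to it.
./deploy.sh v1.4.2

# Roll back by re-running with the previous tag: it lands on the other
# (now standby) stack and traffic flips again.
./deploy.sh v1.4.1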
I wrote the first version of this script at 11 PM on a Tuesday, after a deploy had knocked out our webhook processor for 14 seconds and a client noticed before our monitoring did. There is nothing like a customer support email to motivate infrastructure improvements.
Nginx Configuration
The Nginx configuration is minimal. The upstream block points to whichever port is currently active, and the proxy_pass directive forwards all traffic there. The important detail is the upstream definition living in a separate file that the deploy script can modify independently.
upstream app_backend {
    server 127.0.0.1:3001;
}
server {
    listen 443 ssl http2;
    server_name api.example.com;

    ssl_certificate     /etc/letsencrypt/live/api.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.example.com/privkey.pem;

    location / {
        proxy_pass http://app_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
Why Not Use Docker's Built-in Load Balancing?
Docker Compose can scale services and load-balance between them using the built-in round-robin DNS. The problem is that it does not give you control over the transition. You cannot tell Docker "drain connections from the old container before removing it." With Nginx in front, you have explicit control over when traffic shifts and can verify the new stack is healthy before committing.
Handling Database Migrations
The elephant in the room with blue-green deployments is database schema changes. If your new version expects a column that does not exist yet, you cannot simply switch traffic over after the migration runs because the old version (still running on the other stack for rollback purposes) might break.
My approach is borrowed from the Parallel Change pattern:
- Expand: Add the new column or table. Make it nullable or provide a default. Deploy this change first, without any application code that uses it.
- Migrate: Deploy the application code that writes to the new structure. Backfill historical data if needed.
- Contract: Once you are confident the new code is stable and the old stack will not be needed, remove the old column or constraint.
This means every breaking schema change becomes at least two deploys. It is more work up front, but it means you always have a safe rollback path.
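As a purely illustrative example (the table and column names are not from the real app), an expand step that prepares for splitting a full_name column might look like this, deployed on its own before any code that reads the new columns:

#!/bin/bash
# Expand step: add new, nullable columns so the old and new application
# versions can both run against the same schema. Names are illustrative.
psql "$DATABASE_URL" <<'SQL'
ALTER TABLE users ADD COLUMN IF NOT EXISTS first_name text;
ALTER TABLE users ADD COLUMN IF NOT EXISTS last_name  text;
SQL

The migrate step then backfills the new columns and switches writes over; the contract step, a separate deploy later, drops full_name once neither stack touches it.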
A colleague once told me that the best infrastructure is the kind that lets you sleep at night. This pipeline is not elegant. It is not cutting-edge. But I have not been woken up by a failed deploy since I built it, and that counts for more than architectural purity.
Monitoring the Swap
During the switchover window, I log a few metrics to make sure everything is healthy:
- Response time from the new stack's health endpoint (should be under 200ms)
- Active connection count on the old stack (should drain to zero within 30 seconds)
- Error rate from the Nginx access log (any spike triggers automatic rollback)
- Memory and CPU usage of the new containers (catching runaway processes early)
I use a simple bash script that polls these metrics for 60 seconds after the swap. If anything looks off, it automatically reverts the Nginx config and sends me a notification.
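The watcher itself is nothing fancy. A condensed sketch of the health-probe part and the automatic revert might look like this (the ports are passed in as arguments, the paths and 60-second window mirror the setup above, and the real script also watches error rate, connection drain, and container resource usage):

#!/bin/bash
set -euo pipefail

NEW_PORT="$1"   # port of the stack that just went live
OLD_PORT="$2"   # port of the previous stack, kept around for rollback

# Poll for 60 seconds after the swap; revert Nginx if any probe fails.
for i in $(seq 1 12); do
  if ! curl -sf --max-time 1 "http://localhost:$NEW_PORT/health" > /dev/null; then
    echo "Health probe failed after swap; rolling back." >&2
    sed -i "s/server 127.0.0.1:.*/server 127.0.0.1:$OLD_PORT;/" \
      /etc/nginx/conf.d/upstream.conf
    nginx -s reload
    exit 1
  fi
  sleep 5
done

echo "Post-swap checks passed."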
You do not need Kubernetes to achieve zero-downtime deployments. Docker Compose, Nginx, and about 80 lines of bash will get you surprisingly far on a single VPS.
The expand-migrate-contract pattern for database changes is more work per deploy but eliminates the "migration broke the old version" failure mode entirely.
The best infrastructure is not the most sophisticated. It is the kind you can debug at 2 AM without a reference guide.
What I Would Do Differently
If I were starting today, I would use Docker Compose's deploy configuration more aggressively. The update_config section with order: start-first gets you part of the way there without the custom scripting, although it is only honored when the stack runs under Swarm mode. I would also explore Traefik as a replacement for Nginx, since it has native Docker integration and can detect new containers automatically.
But the script works. It has deployed 247 times without a single second of downtime. And sometimes, that is exactly the right amount of engineering.