Disclaimer: This blog post is automatically generated from project documentation and technical proposals using AI assistance. The content represents our development journey and architectural decisions. Code examples are simplified illustrations and may not reflect the exact production implementation.

The Image Size Problem

When Caroline and I first containerized the Scores services, we took the straightforward approach: install dependencies, copy the workspace, and run with tsx for TypeScript execution. Simple, but expensive.

Each service image was ~944MB:

  • Base Node.js image: ~200MB
  • Workspace packages + node_modules: ~744MB

With 8 services, our total deployment size was ~7.5GB. Pushing images across the network took minutes. Starting containers was slow. We were shipping dev dependencies, source maps, and TypeScript files to production.

Caroline pointed out the obvious: we’re wasting resources. Time to optimize.

Multi-Stage Build Strategy

The solution was multi-stage Docker builds. Instead of one monolithic stage, we split the build into focused stages with clear responsibilities:

graph TB
    A[Stage 1: Builder] --> B[Stage 2: Bundler]
    A --> C[Stage 3: Development]
    B --> D[Stage 4: Production]

    A -->|"Install all deps<br/>Full workspace"| A
    B -->|"esbuild bundle<br/>Single .mjs file"| B
    C -->|"Hot reload<br/>tsx for TS"| C
    D -->|"Bundled files only<br/>~200MB"| D

    style D fill:#c8e6c9
    style C fill:#e3f2fd

Stage 1 (Builder): Shared base with full dependencies

FROM node:25.2.0-slim AS builder
WORKDIR /app
COPY package*.json tsconfig.json ./
COPY packages ./packages
RUN npm ci --prefer-offline
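A `.dockerignore` keeps local artifacts out of the build context, so the COPY layers stay small and cache well. Illustrative contents (not our exact file):

```
# .dockerignore (sketch)
node_modules
dist
*.log
.git
```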

Stage 2 (Bundler): Create production bundles

FROM builder AS bundler
ARG APP_PATH

# Bundle OpenTelemetry separately (shared dep)
RUN npx esbuild packages/otel/index.mts \
    --bundle \
    --platform=node \
    --format=esm \
    --outfile=/dist/otel.mjs

# Bundle the service
RUN npx esbuild ${APP_PATH}/index.mts \
    --bundle \
    --platform=node \
    --format=esm \
    --external:./otel.mjs \
    --outfile=/dist/index.mjs \
    --sourcemap

Stage 3 (Development): Full workspace for hot reload

FROM builder AS development
ARG APP_PATH
WORKDIR /app/${APP_PATH}
CMD ["npx", "tsx", "watch", "./index.mts"]

Stage 4 (Production): Minimal runtime

FROM node:25.2.0-slim AS production
WORKDIR /app

# Only copy the bundled files and their source maps
COPY --from=bundler /dist/otel.mjs ./
COPY --from=bundler /dist/index.mjs ./
COPY --from=bundler /dist/index.mjs.map ./

# Non-root user
RUN groupadd -r appuser && useradd -r -g appuser appuser
USER appuser

CMD ["node", "index.mjs"]
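A couple of quick commands help sanity-check the final image (illustrative; `scores-api` is a placeholder tag, and this assumes a local Docker daemon):

```shell
# Build the production target for one service and check its size
docker build --target production \
    --build-arg APP_PATH=packages/matches-api \
    -t scores-api .
docker images scores-api --format '{{.Size}}'

# Confirm the container runs as the non-root user
docker run --rm scores-api whoami
```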

esbuild Bundling

The magic happens in the bundler stage. esbuild is blazingly fast (10-50ms per service) and creates a single JavaScript file with all dependencies included.

Here’s what esbuild does:

Tree Shaking: Eliminates unused code paths. If you import a library but only use one function, esbuild includes just that function.

Dependency Bundling: Resolves all import statements and combines them into one file. No node_modules needed at runtime.

Format Conversion: Transpiles TypeScript to JavaScript and handles ESM module syntax.
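To see what actually made it into a bundle, esbuild can emit a metafile and print a size breakdown per dependency. A sketch (assumes esbuild is available via npx; `packages/matches-api` stands in for any service):

```shell
npx esbuild packages/matches-api/index.mts \
    --bundle --platform=node --format=esm \
    --outfile=/dev/null \
    --metafile=meta.json --analyze
```

The `--analyze` output is a quick way to spot a dependency that tree shaking failed to trim.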

We bundle OpenTelemetry separately because multiple services share it. Marking it external keeps each service bundle from inlining its own copy of the otel code:

# otel.mjs is ~2MB bundled once
RUN npx esbuild packages/otel/index.mts \
    --bundle \
    --outfile=/dist/otel.mjs

# Services mark otel as external to avoid duplication
RUN npx esbuild ${APP_PATH}/index.mts \
    --bundle \
    --external:./otel.mjs \
    --outfile=/dist/index.mjs
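For `--external:./otel.mjs` to work, the service entry point has to import otel with exactly that relative specifier, so esbuild leaves the import intact and Node resolves it from the same directory at runtime. A sketch of what the entry file might look like (`startServer` is a hypothetical helper, not the real code):

```typescript
// packages/matches-api/index.mts (sketch)
// This specifier must match the --external flag byte-for-byte;
// esbuild keeps it as a real runtime import instead of inlining it.
import './otel.mjs';

// Hypothetical application entry point; esbuild bundles this part.
import { startServer } from './server.mts';
startServer();
```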

Production vs Development

One concern Caroline and I had: does bundling break the development workflow? The answer: not if we target the right stage.

Production build (default):

docker build --target production -t api-service .

Creates a tiny image with bundled code. Fast to transfer, fast to start.

Development build:

docker build --target development -t api-service-dev .

Includes full workspace with tsx for TypeScript execution and hot reload. Perfect for local development.

In docker-compose, we specify the target:

services:
  api:
    build:
      context: .
      dockerfile: Dockerfile
      target: ${BUILD_TARGET:-production}
      args:
        APP_PATH: packages/matches-api
    image: scores-api:${VERSION:-latest}

Set BUILD_TARGET=development locally, and BUILD_TARGET=production in CI/CD.
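In practice that looks like (a sketch, assuming the compose file above):

```shell
# Local development: full workspace with hot reload
BUILD_TARGET=development docker compose up --build

# CI/CD: BUILD_TARGET defaults to production
docker compose build
```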

Results and Lessons

The optimization delivered dramatic improvements:

  • Image Size: 944MB → ~200MB (78% reduction)
  • Build Time: similar (bundling is fast, network transfer is faster)
  • Startup Time: 30% faster (no TypeScript transpilation at runtime)
  • Security: fewer files = smaller attack surface

Here’s the breakdown:

Component       Old (MB)   New (MB)   Reduction
Base image         200        200        0%
node_modules       744          0        100%
Bundled code         0        ~10        -
Total              944       ~210        78%

Lessons Learned

Bundling Isn’t Always the Answer: Development builds still use the full workspace because hot reload and source maps matter more than size during development.

Shared Dependencies Save Space: Bundling OpenTelemetry once saved ~14MB across 8 services: eight separate 2MB copies (16MB total) became a single shared 2MB bundle.

Optimize for the Right Metric: We optimized for deployment size and startup time, not build time. The builds are slightly slower, but that’s a one-time cost.

Layer Caching Still Matters: Even with multi-stage builds, Docker’s layer caching is crucial. We copy package.json before npm ci so dependency installation is cached unless dependencies change.

Trade-offs

Debugging: Bundled code is harder to debug. We include source maps in production (--sourcemap flag) to maintain stack trace clarity.
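One detail worth noting: Node only applies those source maps when asked. A hedged tweak to the production stage (assuming `index.mjs.map` was copied next to the bundle):

```dockerfile
# Sketch: have Node apply esbuild's source maps to runtime stack traces
CMD ["node", "--enable-source-maps", "index.mjs"]
```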

Dynamic Imports: Code using dynamic import() may have issues with bundling. We avoided dynamic imports in production code paths.

Bundle Size Monitoring: We don’t currently monitor bundle sizes automatically. If dependencies grow, bundle sizes grow. This is something to watch.

esbuild Performance: Bundling all 8 services added only ~30 seconds to the build pipeline. This is negligible compared to the 2-3 minute total build time and the benefits gained.

Working through Docker optimization with Caroline and Claude reinforced a key principle: measure first, optimize second. We knew image size was a problem because we measured deployment times and calculated the total 7.5GB we were pushing across the network. The multi-stage approach wasn’t premature optimization—it was a response to real pain.

The pattern also scales: when we add new services, they automatically benefit from the optimized Dockerfile. We’ve encoded the optimization in infrastructure rather than depending on developers to remember best practices. The monorepo structure with a single Dockerfile and APP_PATH build args keeps configuration consistent across all services.

As we move toward Kubernetes or other orchestration platforms, these smaller images will make rolling updates faster and reduce network transfer costs. The initial investment in multi-stage builds continues to pay dividends with every deployment.