A Dockerfile is a text file containing instructions for building a Docker image. Instructions that change the filesystem (mainly RUN, COPY, and ADD) each add a layer to the final image; the others record metadata. Writing efficient Dockerfiles is a core Docker skill.
Basic Dockerfile Structure
Here is a Dockerfile for a Node.js application:
# Use an official base image
FROM node:20-alpine
# Set the working directory inside the container
WORKDIR /app
# Copy dependency files first (for better caching)
COPY package.json pnpm-lock.yaml ./
# Install dependencies
RUN npm install -g pnpm && pnpm install --frozen-lockfile
# Copy the rest of the application
COPY . .
# Expose the port the app runs on
EXPOSE 3000
# Define the command to run the app
CMD ["node", "server.js"]
Build and run it:
docker build -t my-node-app .
docker run -d -p 3000:3000 my-node-app
Dockerfile Instructions Reference
| Instruction | Purpose |
|---|---|
| FROM | Set the base image |
| WORKDIR | Set the working directory |
| COPY | Copy files from host to image |
| ADD | Copy files (also handles URLs and tar extraction) |
| RUN | Execute a command during build |
| CMD | Default command when container starts |
| ENTRYPOINT | Fixed command that always runs |
| EXPOSE | Document which ports the app uses |
| ENV | Set environment variables |
| ARG | Build-time variables |
| VOLUME | Define mount points for persistent data |
| USER | Set the user for subsequent commands |
| LABEL | Add metadata to the image |
FROM — Choosing a Base Image
Every Dockerfile starts with FROM (only comments, parser directives, and ARG may come before it). Choose the smallest base image that meets your needs:
# Full Debian-based (large, ~350MB)
FROM node:20
# Slim variant (smaller, ~80MB)
FROM node:20-slim
# Alpine-based (smallest, ~50MB)
FROM node:20-alpine
# Start from scratch (for compiled binaries)
FROM scratch
Alpine images are popular because they are tiny, but they use musl instead of glibc, which can cause compatibility issues with some packages.
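As a rough illustration of the musl caveat: a dependency that only ships prebuilt glibc binaries may have to be compiled from source on Alpine, which needs a toolchain. The sketch below assumes a hypothetical pnpm project with a native addon built by node-gyp; the package list is a common starting point, not a guaranteed fix.

```dockerfile
# Sketch: Alpine base plus a toolchain for compiling native addons.
# Assumes a dependency falls back to building from source with node-gyp.
FROM node:20-alpine
RUN apk add --no-cache python3 make g++
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install --frozen-lockfile
```

If the build still fails, switching the base image to node:20-slim (which uses glibc) is often simpler than working around musl.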
COPY vs ADD
Use COPY for straightforward file copying. Use ADD only when you need its extra features:
# Copy a single file
COPY server.js /app/server.js
# Copy a directory
COPY src/ /app/src/
# ADD can extract tar archives automatically
ADD archive.tar.gz /app/
# ADD can fetch URLs (but curl in RUN is preferred)
ADD https://example.com/file.txt /app/
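A sketch of the curl alternative, assuming a Debian-based image and a placeholder archive URL. Downloading, extracting, and deleting the archive in a single RUN keeps the archive itself out of the image layers, which ADD with a URL cannot do.

```dockerfile
FROM debian:bookworm-slim
# The URL below is a placeholder, not a real download.
RUN apt-get update && \
    apt-get install -y --no-install-recommends ca-certificates curl && \
    mkdir -p /app && \
    curl -fsSL https://example.com/archive.tar.gz -o /tmp/archive.tar.gz && \
    tar -xzf /tmp/archive.tar.gz -C /app && \
    rm -rf /tmp/archive.tar.gz /var/lib/apt/lists/*
```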
RUN — Executing Build Commands
Combine related commands into a single RUN instruction to reduce layers:
# Bad — creates 3 separate layers
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean
# Good — single layer, includes cleanup
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
CMD vs ENTRYPOINT
CMD sets the default command, which can be overridden:
CMD ["node", "server.js"]
# Runs: node server.js
docker run my-app
# Overrides CMD: runs bash instead
docker run my-app bash
ENTRYPOINT sets a fixed command. CMD then supplies default arguments, and any arguments passed to docker run replace them:
ENTRYPOINT ["node"]
CMD ["server.js"]
# Runs: node server.js
docker run my-app
# Runs: node repl.js
docker run my-app repl.js
Always use the exec form (JSON array) instead of the shell form for proper signal handling:
# Good — exec form (PID 1, receives signals correctly)
CMD ["node", "server.js"]
# Avoid — shell form (wraps in /bin/sh -c, signal issues)
CMD node server.js
ENV and ARG
ENV sets variables available at build time and runtime:
ENV NODE_ENV=production
ENV PORT=3000
ARG sets variables available only at build time:
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine
ARG BUILD_DATE
LABEL build-date=$BUILD_DATE
Pass build args with --build-arg:
docker build --build-arg BUILD_DATE=$(date -u +%Y-%m-%d) -t my-app .
Layer Caching and Build Optimization
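Docker caches each layer. If an instruction and the files it uses have not changed, Docker reuses the cached layer; once one layer is rebuilt, every layer after it is rebuilt too. The order of instructions therefore matters: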
FROM node:20-alpine
WORKDIR /app
# Step 1: Copy dependency files (changes rarely)
COPY package.json pnpm-lock.yaml ./
# Step 2: Install dependencies (cached unless package.json or pnpm-lock.yaml changed)
RUN npm install -g pnpm && pnpm install --frozen-lockfile
# Step 3: Copy application code (changes frequently)
COPY . .
# Step 4: Build
RUN pnpm build
CMD ["node", "dist/server.js"]
If you copy everything first and then install dependencies, every code change would invalidate the dependency cache.
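For contrast, here is a sketch of the ordering to avoid, using the same hypothetical pnpm project. Because COPY . . comes first, editing any source file changes that layer and forces the install step to run again.

```dockerfile
# Anti-pattern: source code is copied before dependencies are installed
FROM node:20-alpine
WORKDIR /app
# Any file change invalidates this layer and every layer after it
COPY . .
# Re-runs on every code change, even when dependencies are unchanged
RUN npm install -g pnpm && pnpm install --frozen-lockfile
RUN pnpm build
CMD ["node", "dist/server.js"]
```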
The .dockerignore File
Create a .dockerignore file to exclude files from the build context:
node_modules
.git
.env
*.log
dist
.next
coverage
.DS_Store
Dockerfile
docker-compose.yml
README.md
This speeds up builds and prevents sensitive files from ending up in the image.
Multi-Stage Builds
Multi-stage builds let you use multiple FROM instructions to create smaller final images:
# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install --frozen-lockfile
COPY . .
RUN pnpm build
# Stage 2: Production
FROM node:20-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
# Copy only what we need from the builder
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
EXPOSE 3000
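CMD ["node", "dist/server.js"]
The final image contains only the build output and the installed dependencies: no source code and no build tools. Because node_modules is copied as-is from the builder, it still includes dev dependencies unless they are pruned there first.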
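To keep dev dependencies out of the final image, one option is to prune them in the builder before node_modules is copied. The sketch below assumes pnpm build is the only step that needs them; it is a variation on the example above, not the only way to do this.

```dockerfile
# Stage 1: build, then drop devDependencies from node_modules
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install --frozen-lockfile
COPY . .
RUN pnpm build && pnpm prune --prod

# Stage 2: the runtime image receives the build output and pruned dependencies
FROM node:20-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
EXPOSE 3000
CMD ["node", "dist/server.js"]
```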
Security Best Practices
Run your application as a non-root user:
FROM node:20-alpine
# Create a non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
COPY --chown=appuser:appgroup . .
RUN npm install --omit=dev
# Switch to non-root user
USER appuser
EXPOSE 3000
CMD ["node", "server.js"]
Practical Example: Python Flask App
FROM python:3.12-slim
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application
COPY . .
# Create non-root user
RUN useradd -m appuser
USER appuser
EXPOSE 5000
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]
Build, tag, and run:
docker build -t flask-app:1.0 .
docker run -d -p 5000:5000 flask-app:1.0
curl http://localhost:5000
Build Commands
# Build with a tag
docker build -t my-app:v1 .
# Build from a specific Dockerfile
docker build -f Dockerfile.prod -t my-app:prod .
# Build with no cache (force rebuild all layers)
docker build --no-cache -t my-app .
# Build with build arguments
docker build --build-arg NODE_ENV=production -t my-app .
# Show full, uncollapsed build output (BuildKit)
docker build --progress=plain -t my-app .
Summary
Dockerfiles define how images are built, layer by layer. You learned the key instructions, how layer caching works, how to optimize builds with multi-stage patterns, and security best practices. In the next lesson, you will learn about volumes for persistent data and networks for container communication.