From 342MB to 37MB: The Go Image I Was Shipping Wrong
Our Go backend's Docker Hub page said 342.4 MB. I'd been staring at that number for a week.
I'm relatively new to Go. Image-size optimization was outside my experience. When I searched for what to do, the suggestions came in fast: UPX-compress, switch bases, strip symbols, prune dependencies. Half of them used terms I didn't fully understand — symbol table, DWARF, static linking, libc dependency. I could repeat the words. I couldn't have explained them.
The honest thing to do at that point was to slow down. I didn't. I almost started randomly applying advice from the top three Stack Overflow answers. That impulse — pattern-matching on optimization slogans before understanding the problem — is what this post is about.
When I finally stopped guessing and actually looked at our Dockerfile, the problem turned out to be a sentence I would have been embarrassed to say out loud:
I was shipping the Go toolchain inside the production image.
Not the binary. The compiler. go, the linker, the standard library source, the module cache. All of it. Sitting inside the image that runs in production, doing nothing.
After the fix the image is 37 MB. That's a 9× reduction with no application code changes — just understanding what was actually in the artifact I was shipping.
This post is what each of the four levers actually does, written out the way I wish someone had written it for me when I was searching last week.
What "shipping the toolchain" actually means
The first Dockerfile I'd ever written for a Go service looked something like this. It works. It builds. It runs in production.
FROM golang:1.25-alpine
WORKDIR /app
COPY . .
RUN go build -o /server ./cmd/server
CMD ["/server"]

The base image, golang:1.25-alpine, is around 700 MB. It contains:
- go itself (the compiler)
- The full standard library source tree (/usr/local/go/src/...)
- git (used by go mod download)
- Alpine packages pulled in for build dependencies
- The module cache from go mod download
- Build artifacts left over from compilation
Your server binary is maybe 30 MB of that. The rest is overhead that existed only so the binary could be built. None of it is needed once the binary is sitting on disk.
If you copy a binary on top of this base and docker push, you ship the whole 700 MB. The image runs fine. It pulls slowly, eats disk on every node, and costs registry storage. You don't notice until someone points at the number.
That was me. Someone pointed at the number. I assumed it was about dependencies — pgx, the full OTel stack, Gin, OpenAPI validation. I was wrong about the cause. The deps had nothing to do with it. The toolchain did.
The four levers
The fix is the standard Go production-image recipe. All four levers stack — leave any one out and the image grows back. Below is what each one does and why.
┌──────────────────────────────────────────────────────────────────┐
│ 1. Multi-stage build toolchain stays out of the runtime │
│ 2. Distroless static base no glibc, no shell, no apt │
│ 3. CGO_ENABLED=0 static binary, no glibc dependency │
│ 4. -trimpath -ldflags=-s -w strip symbols + paths │
└──────────────────────────────────────────────────────────────────┘
↓
342 MB → 37 MB
Lever 1: Multi-stage build
A multi-stage Dockerfile uses two FROM statements. Only the last stage gets shipped to the registry. The first stage exists at build time and then disappears.
# Stage 1: the builder — has the full toolchain
FROM golang:1.25-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o /server ./cmd/server
# Stage 2: the runtime — only the binary
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /server /server
ENTRYPOINT ["/server"]

Stage 1 sits on the ~700 MB Go base. It compiles. It produces /server. That's its whole job.
Stage 2 starts from a different (tiny) base and picks only /server out of stage 1 with COPY --from=builder. Stage 1 evaporates — it does not become part of the final image. None of its layers are referenced by the final manifest. They might exist briefly in the local build cache, but docker push never touches them.
A colleague asked me what changed in our deploy: "but what contributed to that high image size earlier?"
My answer was: "golang toolchain. so i was shipping that as well, it was a mistake from my end."
That sentence is the entire first lever. Before: single-stage, ~700 MB base shipped to production. After: multi-stage, ~2 MB base shipped to production.
| What ships in single-stage | What ships in multi-stage |
|---|---|
| Go compiler | — |
| Standard library source | — |
| Alpine packages from build phase | — |
| Module cache | — |
| git | — |
| Your binary | Your binary ✓ |
Everything except the bottom row is build-time-only. Multi-stage is how you tell Docker that.
Lever 2: gcr.io/distroless/static-debian12:nonroot as the runtime base
Once the toolchain is out of the picture, the question becomes: what should the runtime image even contain?
Most base images give you a full Linux user-space — a shell, a package manager, busybox, libc, common tools. That's useful when you're SSH-ing into a server to debug. It's overhead when you're shipping a single Go binary that already has everything it needs baked in.
Distroless is Google's family of base images that ship only what an application actually needs at runtime. The static-debian12:nonroot variant is ~2 MB and contains exactly this:
| What's in it | What's NOT in it |
|---|---|
| CA certificates (for HTTPS calls) | A shell (/bin/sh, bash) |
| /etc/passwd, /etc/group | Package manager (apt, apk) |
| tzdata (time zone data) | Busybox / coreutils |
| nonroot user (UID 65532) | glibc / any libc |
That last row is the important one. static-* distroless is "static" because it has no libc. If your binary needs glibc to load, this base won't run it.
That sounds like a downside until you realize it's what makes the next lever work. (And if you're CGO-clean, you're already producing a binary that doesn't need libc — you just have to tell Go that explicitly. That's lever 3.)
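If you want to see the left-hand column of that table from the binary's point of view, a small Go probe can report what the base image is actually supplying. This is a sketch under assumptions, not code from the backend described here: the names probeBase and baseReport are made up, and the CA bundle path is the Debian-style location, which is an assumption about the image layout rather than something Go guarantees.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// caBundlePath is the Debian-style CA bundle location (an assumption about
// the base image layout, not something the Go toolchain defines).
const caBundlePath = "/etc/ssl/certs/ca-certificates.crt"

// baseReport records what the runtime environment provides to the binary.
type baseReport struct {
	UID         int  // numeric user ID the process runs as
	HasCABundle bool // the CA bundle file exists at caBundlePath
	HasTZData   bool // the requested time zone could be loaded
}

// probeBase checks the things a static Go binary still expects from its base
// image. zone is an IANA zone name such as "Europe/Berlin".
func probeBase(zone string) baseReport {
	r := baseReport{UID: os.Getuid()}
	if _, err := os.Stat(caBundlePath); err == nil {
		r.HasCABundle = true
	}
	if _, err := time.LoadLocation(zone); err == nil {
		r.HasTZData = true
	}
	return r
}

func main() {
	fmt.Printf("%+v\n", probeBase("Europe/Berlin"))
}
```

Run inside the container, the UID field should match the nonroot user from the table. On a completely empty base, loading a named zone would be expected to fail, which is the kind of gap the distroless contents are there to fill.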
Compared to common alternatives:
| Base | Approx size | Trade-off |
|---|---|---|
| ubuntu:22.04 | ~75 MB | Full distro |
| debian:bookworm-slim | ~75 MB | Slimmer Debian |
| alpine:3.20 | ~5 MB | Tiny, but uses musl libc, which can break binaries built against glibc |
| gcr.io/distroless/static-debian12 | ~2 MB | No libc at all; only runs static binaries |
If you ever do docker exec -it <container> sh to poke around — you can't, on distroless. There's no shell to exec. That's the security model (smaller attack surface — no shell for an attacker to land in) and the size model (less stuff to ship) coming from the same design decision.
Debugging on distroless is different. You don't enter the container. You watch logs, traces, and metrics that the container already emits. If you really need to look inside, you can use kubectl debug or docker debug (which attaches an ephemeral debug container with shell tools to a distroless one). Most of the time you don't need it.
Lever 3: CGO_ENABLED=0
This one took me the longest to understand because it links together three things I'd encountered separately: CGO, static linking, and libc.
CGO is Go's bridge to C. Any time you write import "C" in Go, you're using CGO. It lets you call C functions from Go and vice versa. Some standard library packages and many database drivers use it under the hood without you knowing.
The relevant fact is: when CGO is enabled and something in your build actually uses it, the resulting binary is typically dynamically linked against the build system's libc (glibc on Debian-based images, musl on Alpine). That means the binary contains placeholders that say "look up malloc in libc.so.6 at runtime" instead of carrying the code for malloc itself. When you run the binary, the OS dynamic linker resolves those placeholders against whatever libc.so.6 exists on the host.
Three consequences cascade from that single fact:
- You can't use a libc-less base image. distroless/static-* has no libc.so.6 and no dynamic loader to go looking for it, so the container fails at startup, typically with a confusing "no such file or directory" error even though the binary is right there.
- You're locked into a larger runtime. You'd need distroless/base-debian12 (~20 MB) which ships glibc, or Alpine (which has musl, not 100% glibc-compatible), or full Debian.
- You inherit libc version constraints. A binary built against glibc 2.36 may refuse to run on a host with glibc 2.31. Static linking removes that whole class of compatibility problems.
CGO_ENABLED=0 tells the Go compiler: refuse to link against C. If you accidentally try to use CGO, it'll fail at build time. The output is a fully static binary — every dependency, including the Go runtime, baked into a single file. It runs anywhere, including on distroless/static-* which has no libc at all.
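To check whether a given Linux binary really is static, you can read its ELF headers with Go's standard debug/elf package. The sketch below is illustrative only: inspectELF and linkInfo are hypothetical names, and it assumes the file is an ELF binary (on macOS or Windows it will just report a parse error).

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// linkInfo summarizes how an ELF binary expects to be loaded.
type linkInfo struct {
	Interpreter bool     // has a PT_INTERP header, i.e. asks for a dynamic loader
	Libraries   []string // shared libraries listed as DT_NEEDED
}

// inspectELF opens the file at path and reports its dynamic-linking needs.
func inspectELF(path string) (linkInfo, error) {
	f, err := elf.Open(path)
	if err != nil {
		return linkInfo{}, err
	}
	defer f.Close()

	var info linkInfo
	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			info.Interpreter = true
		}
	}
	// For a binary with no dynamic section this returns an empty list.
	libs, err := f.ImportedLibraries()
	if err != nil {
		return linkInfo{}, err
	}
	info.Libraries = libs
	return info, nil
}

func main() {
	// With no argument, inspect this program's own executable.
	path, err := os.Executable()
	if len(os.Args) > 1 {
		path, err = os.Args[1], nil
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	info, err := inspectELF(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Printf("needs dynamic loader: %v\nshared libraries: %v\n", info.Interpreter, info.Libraries)
}
```

A binary that is safe for a libc-less base should show no dynamic loader and an empty library list. If libc.so.6 or a musl library shows up, something in the build still pulled in CGO.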
For most Go services, this is free. You're CGO-clean already. Popular database/sql drivers such as pgx, the net/http stack, and the crypto packages all have pure-Go implementations. You only hit CGO if you specifically opt into something like the CGO-based SQLite driver, or a binding to a C library.
You can check whether you're CGO-clean with one command:
CGO_ENABLED=0 go build ./...

If it builds, you're clean and can use distroless static. If it fails, the error message tells you which package needs CGO, and you decide whether to switch to a pure-Go alternative or accept the larger base.
That was my colleague's question after I explained this: "so just the deps and glibc?" Yes — once you turn off CGO, everything your binary needs is in the binary itself. The runtime image has nothing to provide except the kernel ABI.
Lever 4: -trimpath -ldflags="-s -w"
Three flags passed to go build:
go build -trimpath -ldflags="-s -w" -o /server ./cmd/server

This one I had to actually sit down and read about, because I'd been copy-pasting the flag combination from a tutorial for months without knowing what any of them did. Here's what I learned.
What -ldflags is
-ldflags passes flags to the Go linker (the part of the build that combines compiled packages into a final executable). The string "-s -w" is two separate linker flags.
-s: strip the symbol table
A symbol table is a section of an executable that maps memory addresses to human-readable names — function names, global variable names, type names. When you call main.handleRequest, the compiler produces machine code at some address like 0x4a23c0. The symbol table is the lookup that says "the function at 0x4a23c0 is called main.handleRequest".
Symbols are used by:
- Debuggers (gdb, dlv), so you can break on a function name instead of an address
- Profilers (pprof), so flamegraphs show readable names
- Crash reporters
Symbols are not used by:
- The kernel running your code
- The Go runtime calling your functions
- Network handlers responding to requests
-s strips the symbol table. The compiled instructions are unchanged. The program runs identically. What you lose is the ability for external tools (nm, gdb, profilers attaching to a stripped binary) to map addresses back to names.
Stack traces still work in Go after -s. Go's runtime keeps its own table of function names and line numbers (the pclntab) for every function in the binary, your code included, and -s leaves it alone. So a panic still prints function names. What changes is that external tools attaching to the binary lose visibility.
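You can watch the runtime resolve names on its own with a few lines of Go. This is a minimal sketch with made-up function names (currentFunc, handleRequest); it only uses the standard runtime package, and the idea is to build it once normally and once with -ldflags="-s -w" and compare what it prints.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentFunc returns the fully qualified name of the function that called
// it, as resolved by the Go runtime.
func currentFunc() string {
	pcs := make([]uintptr, 1)
	// Skip two frames: runtime.Callers itself and currentFunc.
	if runtime.Callers(2, pcs) == 0 {
		return ""
	}
	frame, _ := runtime.CallersFrames(pcs).Next()
	return frame.Function
}

// handleRequest stands in for any application function.
func handleRequest() string {
	return currentFunc()
}

func main() {
	fmt.Println(handleRequest())
}
```

Built as a main package, this prints main.handleRequest. The lookup goes through the runtime's own data rather than the ELF symbol table, which is why the name should survive stripping.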
-w: strip DWARF debug information
DWARF (Debugging With Attributed Record Formats) is a separate, much larger section that contains source-level debug information:
- Mapping from machine instructions back to lines of source code
- Variable names, types, and stack offsets
- Inlined function chains
This is what lets a debugger step through your-source-file.go line by line. Without DWARF you can still attach a debugger to a running process, but you can't ask it "show me the source for this line" — only "show me the instructions at this address."
DWARF is often the biggest section in an unstripped binary. -w strips it. Same as -s: program runs identically, runtime stack traces still work, external debug tools lose source-level visibility.
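To confirm what -s and -w removed from a Linux build, you can look for the relevant ELF sections directly. The sketch below uses the standard debug/elf package; debugSections is a hypothetical name, and the section names checked are the conventional ones (.symtab for the symbol table, .debug_info or the older compressed .zdebug_info for DWARF), so treat a negative result as a strong hint rather than proof.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// debugSections reports whether an ELF binary still carries a symbol table
// section and DWARF debug info.
func debugSections(path string) (hasSymtab, hasDWARF bool, err error) {
	f, err := elf.Open(path)
	if err != nil {
		return false, false, err
	}
	defer f.Close()

	hasSymtab = f.Section(".symtab") != nil
	// Older toolchains stored compressed DWARF under ".zdebug_*" names.
	hasDWARF = f.Section(".debug_info") != nil || f.Section(".zdebug_info") != nil
	return hasSymtab, hasDWARF, nil
}

func main() {
	// With no argument, inspect this program's own executable.
	path, err := os.Executable()
	if len(os.Args) > 1 {
		path, err = os.Args[1], nil
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	symtab, dwarf, err := debugSections(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Printf("symbol table: %v\nDWARF: %v\n", symtab, dwarf)
}
```

Pointing it at a stripped and an unstripped build of the same service is a quick way to see that the flags did what the tutorial said they would.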
-trimpath: strip filesystem paths
When you compile, the Go toolchain bakes paths into the binary. These come from:
- runtime.Caller and runtime.Stack: they need to know "this line of code came from /Users/avik/go/src/myproject/main.go:42"
- Module cache paths like /Users/avik/go/pkg/mod/github.com/jackc/pgx/v5@v5.9.2/conn.go
- Stack trace formatting
Without -trimpath, your production binary contains your home directory in dozens of strings. That:
- Leaks your build environment (e.g., your username, the exact path layout of your laptop)
- Makes the binary non-reproducible: building from /home/ci produces a different binary than building from /Users/avik, even if the source is identical
- Adds a small amount of bloat from the long path strings
-trimpath replaces all those absolute paths with ones based on the module or import path, like myproject/main.go. The binary stays functional (Go's stack traces use the trimmed paths and still work) and it gets much closer to reproducible: the same source built with the same Go version and flags no longer differs just because of the directory it was built in.
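The recorded path is easy to inspect from inside the program. Here is a minimal sketch, using only the standard library and a made-up helper name (sourceFile), that prints the path the compiler stored for its own source file.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// sourceFile returns the source path recorded in the binary for this function.
func sourceFile() string {
	_, file, _, ok := runtime.Caller(0)
	if !ok {
		return ""
	}
	return file
}

func main() {
	file := sourceFile()
	fmt.Println("recorded path:", file)
	fmt.Println("absolute:", filepath.IsAbs(file))
}
```

Built without -trimpath, the recorded path should be the absolute location on the build machine. Built with -trimpath, it should no longer contain the build directory, though the exact trimmed form depends on how the package is built.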
Combined impact
All three together:
| Flag | Strips | What you lose | What still works |
|---|---|---|---|
| -s | Symbol table | External tools mapping addresses → function names | Go runtime stack traces, panics, signals |
| -w | DWARF debug info | Source-level debugging in dlv/gdb | All execution; runtime info |
| -trimpath | Absolute paths | Knowing the build environment from the binary | Stack traces with trimmed paths |
Size impact on a typical Go binary: roughly 25–30%. On our backend, the binary went from about 42 MB unstripped to 33 MB stripped, which works out to (42 − 33) / 42 ≈ 21%, a little under that range.
Importantly, you can still debug a production crash. Go's runtime emits goroutine stack dumps on panic, including function names from its own internal tables — not from the symbol table you stripped. You just can't attach a fresh debugger and step through arbitrary source lines. For a server, you're going to read logs, traces, and metrics anyway. The debugger workflow is a development-time concern, not a production one.
If you ever need to debug a specific build, you can produce a non-stripped version locally:
go build -gcflags="all=-N -l" -o ./server-debug ./cmd/server

-N disables optimizations and -l disables inlining. Use this binary in dlv, not the one in your production image.
End-to-end: the four levers in one Dockerfile
Here's the structural pattern, all four levers applied (this is the canonical template — not our actual file):
# ---- Stage 1: build ----
FROM --platform=$BUILDPLATFORM golang:1.25-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod go mod download
COPY . .
ARG TARGETOS
ARG TARGETARCH
RUN --mount=type=cache,target=/go/pkg/mod \
--mount=type=cache,target=/root/.cache/go-build \
CGO_ENABLED=0 GOOS=${TARGETOS} GOARCH=${TARGETARCH} \
go build -trimpath -ldflags="-s -w" -o /server ./cmd/server
# ---- Stage 2: ship ----
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /server /server
EXPOSE 8080
ENTRYPOINT ["/server"]

Twelve instructions doing all four levers. Build and measure:
docker build -t check .
docker images check

REPOSITORY TAG IMAGE ID CREATED SIZE
check latest 413183e2583e 2 min ago 37 MB

The --mount=type=cache lines are BuildKit's persistent cache for go mod and the compile cache. They don't affect image size, only rebuild speed.
The conversation that made me realize I was hand-waving
A colleague asked: "yeah I read what changed but couldn't understand a single bit of it."
I sat down to type out an explanation. I got two sentences in and realized I was about to say something like "distroless is just a smaller base."
That's not an explanation. It's a slogan.
What is distroless? Why is it 2 MB? Why does the static variant matter specifically? Why does CGO_ENABLED=0 interact with the choice of base? What's actually inside the symbol table and why is -s safe to apply to a production binary?
I knew the right flags. I had the right Dockerfile in front of me. I had it from somewhere a year ago and it worked. I had not actually understood any of it.
That afternoon I spent maybe an hour learning each piece properly. I want to be honest about how I did that — because I didn't just sit alone with the docs.
Where AI fit in the loop
Before I touched the Dockerfile, I went to Claude and described our situation in plain English: a 342 MB Go backend image, here's roughly what's in the repo, here's what I was thinking of doing (UPX, dep pruning, the whole flawed plan from earlier in this post). I asked it to walk me through what each option would actually do and whether the order made sense.
The questions I asked were the ones I couldn't ask a coworker without feeling silly:
- "If I drop CGO and switch to distroless static, will my binary load? What does 'static' actually mean here?"
- "What's the difference between Alpine and distroless? Why would I pick one over the other?"
- "What does -s -w strip, and does my server still produce useful crash logs after?"
- "Is UPX worth it for a long-running server, or is the startup cost real?"
The answers gave me a starting model. Terms I'd seen but never understood (symbol table, DWARF, dynamic linker, libc dependency) got concrete meanings. The cascade between CGO and base-image choice clicked. I went from "these flags do something good" to "I know what each one removes and why it's safe."
But — and this is the part I want to be clear about — I didn't stop there.
AI is excellent at producing plausible explanations. It's not always great at noticing when its plausible explanation is missing the thing actually in front of you — like, say, the fact that your repo's Dockerfile already does the optimization you're planning. Or that the load-bearing detail of "static linking" depends on exactly which Go packages you've imported, which Claude can't see from a chat description.
So after I had a starting model, I verified the load-bearing parts:
- The distroless README: what each variant actually contains, in writing from Google
- Go's blog post on binary size: the team's own notes on -s, -w, -trimpath
- The multi-stage docs: to confirm I understood how COPY --from=builder actually works
- Our own Dockerfile, line by line, with the flags I'd just learned about
The workflow that ended up working for me:
1. Describe the problem to AI in plain English
2. Get a starting model with terms I didn't know yet
3. Look up each unfamiliar term in primary docs
4. Cross-check against the actual repo before changing anything
5. Make the change. Measure.
Skipping step 3 is how you end up confidently making changes based on a model that's 80% right and 20% subtly wrong. Skipping step 4 is how you end up "fixing" a problem that doesn't exist. I almost did the second one.
The lesson isn't about AI or about Docker. It's: if you can't explain why each line of your config is there, you don't actually own it. Someone else owns it, you're just a tenant. AI can shorten the path to ownership — it cannot stand in for it. The day something breaks, the model is in your head, not in the chat window.
How I had been thinking about the problem (wrongly)
Before I dug into this, my mental model for "why is the image big" was vibes-based:
- "Lots of dependencies imported in go.mod": half right; deps do add to the binary, but the binary was 33 MB, not 300
- "OTel pulls in gRPC + protobuf which are huge": true in absolute terms, irrelevant to the actual problem
- "I should UPX-compress the binary": would have shaved 20 MB off something that wasn't the problem
- "Audit transitive deps like mongo-driver and quic-go": multi-day investigation chasing a non-issue
Every one of those would have produced a real size reduction. But I'd have been picking up pennies in front of a steamroller. The steamroller was a ~600 MB Go toolchain layered into my image because nobody had ever multi-staged it. Pruning a 25 MB binary down to 18 MB doesn't help when the runtime base is dragging 600 MB of build artifacts.
Order-of-magnitude problems demand order-of-magnitude fixes. Multi-stage was the order-of-magnitude fix. Everything else was rounding error.
The four-lever checklist for your next Go service
You can skip everything above and use just this:
| Lever | Code | What it does |
|---|---|---|
| Multi-stage | Two FROM statements; second COPY --from=builder | Toolchain stays in builder, only binary ships |
| Static base | FROM gcr.io/distroless/static-debian12:nonroot | ~2 MB runtime base, no shell, no libc |
| CGO off | CGO_ENABLED=0 env in the build step | Statically linked binary, works on libc-less base |
| Strip + trimpath | go build -trimpath -ldflags="-s -w" | ~25–30% smaller binary, reproducible builds, runtime still works |
If any one of these is missing, your image will be larger than it needs to be. The fix is mechanical. The understanding is the hard part — and it's the part that lets you debug it the day something goes wrong.
What I'd tell past me
You don't need to optimize the binary. You don't need UPX. You don't need to prune go.mod. You need to not ship the compiler.
Open the Dockerfile. Count the FROM statements. If there is only one, you are shipping the toolchain. That is the problem. Fix that first. Measure. Then decide if you have a second problem.
Most of the time, you don't.
And when someone asks you what each flag does — be honest if you don't know yet. Then go learn it. That hour of reading is the difference between running a recipe and owning it.
If you want to go deeper
- Distroless images: what each variant ships, and why
- Go binary size: the Go team's notes on -s, -w, and -trimpath
- Multi-stage builds: the Docker docs page, short and worth re-reading once a year
- How Nixpacks, buildctl, and BuildKit Actually Fit Together: the layer below this, if you want to understand what docker build is actually doing
Questions or corrections — GitHub or open an issue on my site.