Compiling native applications for alpine containers

Operators strive for ever smaller and more efficient containers in modern environments, with alpine linux being a clear winner for the base image and native compilation producing the smallest and fastest applications of all. Combining both seems like a logical next step, but there is a catch: alpine uses musl libc, while most other linux distributions rely on glibc, making some programs compiled on one incompatible with the other.

What is the problem?

The C programming language comes with a standard library of functions, just like any other language. But there are multiple implementations of this standard library - two of which are glibc and musl. Glibc is the GNU C library, used by most linux distributions today because of its rich feature set and extensions that make writing modern software more efficient. Musl, on the other hand, aims to be a lightweight and standards-compliant implementation, and thus lacks some of the extended features and quirks that glibc has adopted over the years.

Compiling to native binaries defaults to dynamic linking, meaning the standard library isn't compiled into every program, but instead linked against at runtime. This saves resources, as all c programs on the computer can share a single copy of the standard library in memory, which adds up when several hundreds or thousands of programs are running.

But dynamic linking also has the downside that the program now depends on the linked library - normally not a problem, until you switch to a system that uses a different standard library implementation, like alpine with musl.

Installing glibc on alpine

The first solution that comes to mind is to simply switch alpine's c library to glibc, making it compatible with software that has been compiled on other linux systems. There are two ways to accomplish this. The first is to use the official gcompat compatibility package available through apk:

apk add gcompat

Installing the glibc compatibility package will get most glibc-compiled binaries to run, but some will still not work with this shim. In such cases, you can try third-party patched glibc packages, but those have been known to break between alpine linux version upgrades.

Either option is unreliable and not suited for production environments that expect stable and reproducible builds.

Compiling in an alpine container

Instead of trying to make glibc binaries work on alpine, you can simply compile your source code against musl libc. The easiest way in an environment already using containers is to add a build stage to the Dockerfile:

Dockerfile

FROM alpine:3 AS builder
WORKDIR /build
COPY . /build
RUN apk add go
RUN go build -o main main.go


FROM alpine:3
COPY --from=builder /build/main /app
WORKDIR /app
ENTRYPOINT ["./main"]

This simplified example uses a two-stage build: the first stage installs the go toolchain on an alpine image and uses it to compile the source code (against alpine's musl libc). The second stage starts from a fresh alpine image again, copying only the compiled binary from the builder stage. Multi-stage builds have the advantage that build tools like the go toolchain and installed packages aren't present in the resulting image, keeping it minimal.

This approach works for all languages, but not equally well for all of them: projects using package managers that cache outside the working directory, like c/c++ applications using conan or software written in go, effectively bust the cache on every build. The cache is discarded together with the builder stage, so its performance benefits never materialize and builds are slower than they need to be.

The issue can be remedied to some degree with a custom-built builder image that comes preloaded with the required dependencies or previous cache contents. But that adds cost to the build process and to the registry storing all that data, plus the complexity of maintaining and regularly updating the builder image, or build performance degrades over time.
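Such a custom builder image might look like this (a hypothetical sketch for a go project; the image would be rebuilt and pushed to your registry on a schedule):

```dockerfile
# hypothetical prebuilt builder image for a go project
FROM alpine:3
RUN apk add go
WORKDIR /build

# copy only the dependency manifests and download modules,
# so the module cache is baked into the image layers
COPY go.mod go.sum ./
RUN go mod download
```

Build stages can then start FROM this image instead of plain alpine:3, reusing the baked-in module cache; the tradeoff is that the cache slowly goes stale until the image is rebuilt.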

You can safely use this approach with languages like python, or c/c++ projects without a package manager, where dependencies are installed into the source directory on every build anyway. But for projects using go, or c/c++ with conan, there are better ways.

Statically linking binaries

Compiling outside of a container is often preferable for performance reasons, and there is a way to make native binaries work across different standard libraries: compile the library statically into the binary, so it has no external dependencies. Compiling dependencies directly into the binary is called static linking, producing a static executable. To make this work, all you have to do is add a -static flag to the linker arguments (ldflags), telling compilers like gcc to produce a portable standalone program.

Depending on your programming language, this will look different:

c

gcc -o my_program my_program.c -static

c++

g++ -o my_program my_program.cpp -static

golang (with cgo)

CGO_ENABLED=1 go build -ldflags="-extldflags=-static" -o my_program my_program.go

java (with native compile through graalvm)

native-image --static --no-fallback -o my_program Main

Compiling a static binary uses more resources at build time, because all external libraries are compiled in rather than simply linked against. The resulting executable is also larger and uses more resources at runtime, since the embedded library code can no longer be shared between processes in memory. Whether this tradeoff works for you is use-case dependent: if your company produces only a dozen applications with slow update cycles, you may not care about a few MB of bandwidth, disk storage and memory usage. But companies running hundreds or thousands of containers at scale will feel the accumulated cost per container, where even 10MB of overhead quickly sums up to gigabytes of wasted resources.

Remember that size matters in multiple ways: at runtime (memory consumption), during builds (bandwidth to push the image to the registry), at rest (storage inside the registry) and during deployment (bandwidth and storage to pull and run the image). Add to that less visible costs, like backup bandwidth and storage, or security scanners having to process more data when auditing images for vulnerabilities.

Cross-compiling musl binaries from glibc hosts

The last option uses a standard glibc-based host system to compile dynamically linked musl binaries instead. We assume a debian-based system for this example, which needs some additional packages to enable cross-compilation against musl:

sudo apt install musl musl-dev musl-tools

It is important to understand that this does not change your default c standard library (that will still be glibc). All these packages do is add musl as an alternative, with wrappers like musl-gcc that compile executables linked against it instead of glibc.

Now all that is left is to tell the language-specific compiler to use musl-gcc for compiling C code. How exactly that works depends on your language, but here are some examples:

c

musl-gcc -o my_program my_program.c -static

golang (with cgo)

CC=musl-gcc CGO_ENABLED=1 go build -o my_program my_program.go

java (with native compile through graalvm)

native-image --static --libc=musl -o my_program Main

Cross-compiled executables behave just like ones built on a system using musl directly (like alpine), and remain small and optimized for containerized use.

Which option is right for me?

The right approach for you is whatever works best for your chosen language, environment and infrastructure requirements. The easiest option is a multi-stage Dockerfile build. If you need faster builds for projects using go, or c/c++ with conan, look at cross-compiled musl executables instead. If you can afford the size overhead, statically linked executables allow a simpler ci/cd pipeline setup. Making a choice is impossible without knowing your individual circumstances, policies and requirements, so take some time to consider which tradeoff works best for your use case.
