Dockerfiles - Ofer Nave

# Tools : Docker : Dockerfiles

## Overview

- Instructions are case-insensitive, but all caps by convention.
- Comments must start at the beginning of the line.
- Leading whitespace (before either) is ignored but discouraged.
- Dockerfile must begin with `FROM`. (This may be after parser directives, comments, and `ARG`s.)

#### Best Practices

- Use multi-stage builds.
- Leverage build cache.
- Exclude files with `.dockerignore` (patterns similar to `.gitignore`).
- Pin base image versions to specific digests:
  `FROM alpine:3.19@sha256:13b7e62e8df80264dbb747995705a986aa530415763a6c58f84a3ca8af9a5bcd`

#### Multi-Stage Builds

- Multi-stage builds let you reduce the size of your final image by creating a cleaner separation between the building of your image and the final output.
- Split your Dockerfile instructions into distinct stages to make sure that the resulting output only contains the files that are needed to run the application.
- Using multiple stages can also let you build more efficiently by executing build steps in parallel.

## Parser Directives

- Directives must come before any other lines, including empty ones.
- Directives are case-insensitive, but lowercase by convention.
- Each directive may only be used once.
- The directives are `syntax`, `escape`, and (in newer Dockerfile frontends) `check`; none is particularly important.

## ENV Vars

- Variables declared with `ENV` can be used in instructions with `$var` or `${var}`.
- The brace syntax is typically used when the variable name is immediately followed by other characters, with no whitespace in between: `${foo}_bar`
- Also supports a few standard bash modifiers:

```
${var:-word}    if `var` is NOT set, use `word` instead
${var:+word}    if `var` is set, use `word`, else empty string
```

## Commands: CMD, ENTRYPOINT, RUN

#### Shell vs Exec Forms

```
Exec  : INSTRUCTION [ "executable", "param1", "param2" ]   (JSON array) (no wildcards or env vars)
Shell : INSTRUCTION command param1 param2
```

- To use Exec Form with a shell: `[ "sh", "-c", "echo $HOME" ]`
- To use Shell Form but still receive signals: `exec <command> ...`
- Full path not necessary if executable is found in `PATH`.

#### Exec Form

The Exec Form is best used to specify an `ENTRYPOINT` instruction, combined with `CMD` for setting default arguments that can be overridden at runtime.

#### Shell Form

The Shell Form is basically a string, and thus supports escaping newlines to span lines:

```Dockerfile
RUN source $HOME/.bashrc && \
    echo $HOME
```

Also supports heredocs:

```Dockerfile
RUN <<EOF
source $HOME/.bashrc && \
echo $HOME
EOF
```

#### CMD & ENTRYPOINT

- A Dockerfile should specify at least one of `CMD` or `ENTRYPOINT`.
- Each should appear only once. (If it appears multiple times, only the last is used.)

**CMD with or without ENTRYPOINT: Exec Form**

The purpose of `CMD` is to provide defaults for an executing container. These defaults can include an executable, or they can omit the executable, in which case you must also specify an `ENTRYPOINT` instruction.

If `CMD` is used to provide default arguments for `ENTRYPOINT`, use Exec Form for both.

Arguments passed to `docker run` will override `CMD` and be appended after all elements in an Exec Form `ENTRYPOINT`.

**ENTRYPOINT: Shell Form**

The Shell Form of `ENTRYPOINT` prevents `CMD` or CLI args from being used. It also starts your `ENTRYPOINT` as a subcommand of `/bin/sh -c`, which does not pass signals, including `SIGTERM` from `docker stop`.

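
As a rough illustration of the two `ENTRYPOINT` forms, here is a minimal sketch. The `alpine:3.19` base image and the BusyBox `top` flags (`-b`, `-d`) are assumptions chosen for the example, not something these notes prescribe.

```Dockerfile
# Assumed base image, used only for illustration.
FROM alpine:3.19

# Exec Form: `top` runs as PID 1, so it receives SIGTERM from `docker stop`.
ENTRYPOINT [ "top", "-b" ]

# Default arguments appended to ENTRYPOINT; `docker run <image> -n 1`
# would replace them and run `top -b -n 1` instead of `top -b -d 5`.
CMD [ "-d", "5" ]

# Shell Form alternative (commented out): /bin/sh -c becomes PID 1, and CMD and
# CLI args are ignored. Prefixing with `exec` lets `top` replace the shell so it
# still receives signals.
# ENTRYPOINT exec top -b
```

Only one `ENTRYPOINT` takes effect per image, which is why the Shell Form variant is left as a comment.
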
If you need to write a starter script for a single executable, you can ensure that the final executable receives Unix signals by using the `exec` and `gosu` commands.

```bash
#!/usr/bin/env bash
set -e
...
exec "$@"
```

**Conclusion**

```
                      No ENTRYPOINT       ENTRYPOINT bar p2   ENTRYPOINT [ "bar", "p2" ]
                    ┌──────────────────   ─────────────────   ──────────────────────────
No CMD              │ error               /bin/sh -c bar p2   bar p2
CMD [ "foo", "p1" ] │ foo p1              /bin/sh -c bar p2   bar p2 foo p1
CMD foo p1          │ /bin/sh -c foo p1   /bin/sh -c bar p2   bar p2 /bin/sh -c foo p1
```

If you would like to run the same executable every time, use `ENTRYPOINT` + `CMD`. Otherwise, just use `CMD`.

> [!NOTE]
> If `CMD` is defined from the base image, setting `ENTRYPOINT` will reset `CMD` to an empty value. In this scenario, `CMD` must be defined in the current image to have a value.

## Instructions

```Dockerfile
+e ADD          Copy files into image.  (Gen1 - remote sources and autoextraction)
   ARG          Use build-time variables.  (Seems weird and unnecessary.)
   CMD          Specify defaults for container execution.
+e COPY         Copy files into image.  (Gen2 - only actual copying, but also from other images)
   ENTRYPOINT   Specify default executable.
 e ENV          Set environment variables.
 e EXPOSE       Describe which ports your application is listening on.
+e FROM         Create a new build stage from a base image.
   HEALTHCHECK  Check a container's health on startup.
 e LABEL        Add metadata to an image.
   MAINTAINER   Specify the author of an image.  (DEPRECATED: use LABEL instead)
 e ONBUILD      Specify instructions for when the image is used in a build.
+  RUN          Execute build commands.
   SHELL        Set the default shell of an image.
 e STOPSIGNAL   Specify the system call signal for exiting a container.
 e USER         Set user and group ID.
 e VOLUME       Create volume mounts.  (Seems weird and unnecessary.)
 e WORKDIR      Change working directory.
-- -----------  -----------------------------------------------------------
+  adds a layer (otherwise only affects metadata)
 e supports env var substitution

You can also use env vars with CMD/ENTRYPOINT/RUN (in shell form), but that's
actually handled by the command shell, not the builder.
```

## Reference

#### ADD

```Dockerfile
ADD [OPTIONS] <src> ... <dest>
ADD [OPTIONS] [ "<src>", ... "<dest>" ]   # this form required for paths containing whitespace

Options:
  --checksum
  --chmod         see: COPY
  --chown         see: COPY
  --keep-git-dir
```

- Supports remote sources, and auto-extracts local tar archives (optionally compressed). Archives fetched from remote URLs are not extracted.
- `ADD` came first. The `COPY` instruction was added later to only do normal copying from local sources, to avoid unpleasant surprises.

#### CMD

```Dockerfile
CMD [ "executable","param1","param2" ]   # Exec Form
CMD [ "param1","param2" ]                # Exec Form as default parameters to ENTRYPOINT
CMD command param1 param2                # Shell Form
```

#### COPY

```Dockerfile
COPY [OPTIONS] <src1> ... <dest>
COPY [OPTIONS] [ "<src1>", ... "<dest>" ]   # this form required for paths containing whitespace

Options:
  --chmod=<perms>
  --chown=<user>[:<group>]        # UID/GID or names (looked up in /etc/passwd and /etc/group)
                                  # defaults to 0
  --from=<image|stage|context>    # look for <src> in specified image (relative to root)
```

- Preferred over `ADD` unless you need to copy from a remote source or extract from a compressed file.
- Local (host) sources are relative to the build context and support wildcards via Go's `filepath.Match` rules.
- Destinations are absolute or relative to `WORKDIR`.
- Destination directories are auto-created.
- If there are multiple sources, the destination must be a directory and end in "/".

#### ENTRYPOINT

```Dockerfile
ENTRYPOINT [ "executable", "param1", "param2" ]   # Exec Form (preferred)
ENTRYPOINT command param1 param2                  # Shell Form
```

#### ENV

```Dockerfile
ENV k1=v1 ...
```

#### EXPOSE

```Dockerfile
EXPOSE <port> [<port>/<protocol>...]
```

- Doesn't actually publish the port.
- Functions as a type of documentation between the person who builds the image and the person who runs the container about which ports are intended to be published. With `docker run`:
  - Use `-p` to publish and map one or more ports.
  - Use `-P` to publish all exposed ports and map them to high-order ports.
- Defaults to TCP unless you specify otherwise: `80/udp`
- To expose a port on both protocols, specify each separately.

#### FROM

```Dockerfile
FROM <image> [AS <name>]

<image> = <repo>:<tag>[@sha256:<digest>]
ex: alpine:3.19@sha256:13b7e62e8df80264dbb747995705a986aa530415763a6c58f84a3ca8af9a5bcd
```

- Initializes a new build stage and sets the base image for subsequent instructions.
- If the image spec has no tag or digest, assumes tag "latest".
- If a name is given, it can be used in subsequent instructions to refer to the image:

  ```
  FROM <name>
  COPY --from=<name>
  RUN --mount=type=bind,from=<name>
  ```

- Can appear multiple times to create multiple images or use one stage as a dependency for another.

#### HEALTHCHECK

TODO

#### LABEL

```Dockerfile
LABEL k1=v1 ...
```

#### RUN

```Dockerfile
RUN [OPTIONS] [ "<command>", ... ]   # Exec Form
RUN [OPTIONS] <command> ...          # Shell Form

Options:
  --mount       create filesystem mounts that the build can access
  --network     which networking environment the command is run in
```

> [!NOTE]
> Always combine `apt-get update` with `apt-get install -y` in the same `RUN` statement.

#### SHELL

```Dockerfile
SHELL [ "executable", "parameters" ]
```

Must use JSON form. Defaults to: `[ "/bin/sh", "-c" ]`

#### STOPSIGNAL

```Dockerfile
STOPSIGNAL <signal>
```

- Sets the system call signal that will be sent to the container to exit.
- This signal can be a signal name in the format `SIG<NAME>`, for instance `SIGKILL`, or an unsigned number that matches a position in the kernel's syscall table, for instance 9.
- The default is `SIGTERM` if not defined.
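
A minimal sketch of `STOPSIGNAL` in use follows. The `nginx:1.25` image tag is an assumption chosen for illustration; nginx is a convenient case because it treats `SIGQUIT` as a graceful shutdown and `SIGTERM` as a fast one.

```Dockerfile
# Assumed base image and tag, used only for illustration.
FROM nginx:1.25

# Make `docker stop` send SIGQUIT (graceful shutdown in nginx)
# instead of the default SIGTERM (fast shutdown).
STOPSIGNAL SIGQUIT

# Exec Form, so nginx itself is PID 1 and receives the stop signal directly.
CMD [ "nginx", "-g", "daemon off;" ]
```

The signal only helps if the main process handles it, which is another reason to prefer Exec Form for `CMD` and `ENTRYPOINT`.
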
#### USER

```Dockerfile
USER <user>[:<group>]
```

- Determines user/group for `CMD`, `ENTRYPOINT`, and `RUN` commands.
- Can use UID/GID or names (looked up in `/etc/passwd` and `/etc/group`).
- If a group is specified, the user will _only_ have that group membership. If the user has no primary group, the "root" group is used.

#### WORKDIR

```Dockerfile
WORKDIR <path>
```

- Sets the base directory for subsequent `ADD`, `CMD`, `COPY`, `ENTRYPOINT`, and `RUN`.
- If relative, it is resolved against the previous `WORKDIR`.
- Inherits from the base image, so specify your own to be sure.
- If not declared at all (like in a from-scratch image), defaults to "/".
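
The relative-path rule for `WORKDIR` can be sketched as follows. The base image and the `/app` and `logs` paths are hypothetical, picked only to make the resolution visible.

```Dockerfile
# Assumed base image, used only for illustration.
FROM alpine:3.19

# Absolute path: the working directory becomes /app (created if missing).
WORKDIR /app

# Relative path: resolved against the previous WORKDIR, giving /app/logs.
WORKDIR logs

# Runs in /app/logs, so this prints /app/logs during the build.
RUN pwd
```

Note that each comment sits on its own line: a trailing `# ...` after an instruction is not treated as a comment by the Dockerfile parser.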