Creating a Custom Docker Image
Docker containers are created from images. An image contains software that can be launched, including the underlying system. A container is an instance of that software.
When we want to automate MariaDB, creating an image with MariaDB and the desired configuration, we may want to create an image by ourselves, which fulfils our needs.
Contents
Images Architecture
The source code of a Docker image is a Dockerfile. A Dockerfile is written in Docker specific language, and can be compiled into an image by the docker
binary, using the docker build
command.
Most images are based on another image. The base image is specified at the beginning of the Dockerfile, with the FROM
directive. If the base image is not present in the local system, it is downloaded from Docker Hub. For example, we can build a mariadb-rocksdb:10.5
image starting from the debian:13
image. In this way, we'll have all the software included in a standard Debian image, and we'll add MariaDB and its configuration upon that image.
All the following Dockerfile directives are compiled into a new Docker image, identified by an MD5 string. Each of these images is based on the image compiled from the previous directive. A physical compiled image can serve as a base for any number of images. This mechanism saves a lot of disk space, download time and build time.
The following diagram shows the relationship between Dockerfiles, images and containers:
Dockerfile Syntax
Here's a simple Dockerfile example:
FROM ubuntu:20.04 RUN apt-get update RUN apt-get install -y mariadb-server EXPOSE 3306 LABEL version="1.0" LABEL description="MariaDB Server" CMD ["mysqld"]
This example is not very good for practical purposes, but it shows what a Dockerfile looks like.
First, we declare that the base image to use is ubuntu:20.04
.
Then we run some commands to install MariaDB from the Ubuntu default repositories and stop the MariaDB service.
We define some metadata about the image with LABEL
. Any label is valid.
We declare that the port 3306 (MariaDB default port) should be exposed. However, this has no effect if the port is not exposed at container creation.
Finally, we start the container command: mysqld. This command is run when a container based on this image starts. When the process stops or crashes, the container will immediately stop.
Note that, in a container, we normally run mysqld directly, rather than running mysqld_safe or running MariaDB as a service. Containers restart can be handled. by Docker. See automatic restart.
See the documentation links below to learn the syntax allowed in a Dockerfile.
Using Variables
It is possible to use variables in a Dockerfile. This allows us, for example, to install different packages, install different versions of a package, or configure software differently depending on how variables are set, without modifying the Dockerfile itself.
To use a variable, we can do something like this:
FROM ubuntu:20.04 ARG MARIADB_CONFIG_FILE ... RUN mysqld --defaults-file=$MARIADB_CONFIG_FILE
Here ARG
is used after the FROM
directive, thus the variable cannot be used in FROM
. It is also possible to declare a variable before FROM
, so we can use a variable to select the base image to use or its tag, but in this case the variable cannot be used after the FROM
directive. Here is an example:
ARG UBUNTU_VERSION FROM ubuntu:$UBUNTU_VERSION # But this will cause a build error: RUN echo 'Ubuntu version: $UBUNTU_VERSION' > /var/build_log
We'll have to assign variables a value when we build the Dockerfile, in this way:
docker build --build-arg UBUNTU_VERSION=20.04 .
Note that Dockerfile variables are just placeholders for values. Dockerfiles do not support assignment, conditionals or loops.
Versioning and Deploying Images
Dockerfiles are normally versioned, as well as the files that are copied to the images.
Once an image is built, it can be pushed to a Docker registry. Whenever an image is needed on a host to start containers from it, it is pulled from the registry.
Docker registries
The default registry for Docker images is Docker Hub. It contains images created by both the Docker team and the community. Any individual or organization can open an account and push images to Docker Hub. Most Docker images are open source: the Dockerfiles and the needed files to build the images are usually on GitHub.
It is also possible to setup a self-hosted registry. Images can be pushed to that registry and pulled from it, instead of using Docker Hub. If the registry is not publicly accessible, it can be used to store images used by the organization without making them publicly available.
But a self-hosted registry can also be useful for open source images: if an image is available on Docker Hub and also on a self-hosted registry, in case Docker Hub is down or not reachable, it will still be possible to pull images.
Choosing Image Names and Tags
The names of images developed by the community follow this schema:
maintainer/technology
It doesn't matter if the maintainer is an individual or an organization. For images available on Docker Hub, the maintainer is the name of a Docker Hub account.
Official images maintained don't contain the name of the maintainer. For example, the official MariaDB image is called mariadb
.
All images have a tag, which identifies the version or the variant of an image. For example, all MariaDB versions available on Docker are used as image tags. MariaDB 10.5 is called mariadb:10.5
.
By conversion, tags form a hierarchy. So for example, there is a 10.1.1
tag whose meaning will not change over time. 10.5
will always identify the latest version in the 10.5 branch. For some time it was 10.5.1
, then it became 10.5.2
, and so on.
When we pull an image without specifying a tag (ie, docker pull mariadb
), we are implicitly requiring the image with the latest
tag. This is even more mutable: at different periods of time, it pointed to the latest 10.0
version, to the latest 10.1
version, and so on.
In production, it is always better to know for sure which version we are installing/. Therefore it is better to specify a tag whose meaning won't change over time, like 10.5.1
.
Pushing and Pulling Images
To pull an image from Docker Hub or a self-hosted registry, we use the docker pull
command. For example:
docker pull mariadb:10.5
This command downloads the specified image if it is not already present in the system, or if the local version is not up to date.
After modifying a Dockerfile, we can build an image in this way:
docker build .
This step can be automated by services like Docker Hub and GitHub. Check those service's documentation to find out how this feature works.
Once an image is created, it can be pushed to a registry. We can do it in this way:
docker push <image_name>:<tag>
Docker Content Trust
Docker has a feature called Docker Content Trust (DCT). It is a system used to digitally sign images, based on PEM keys. For environments where security is a major concern, it is important to sign images before pushing them. This can be done with both Docker Hub and self-hosted registries.
Good Practices and Caveats
As mentioned, a Dockerfile is built by creating a new image for each directive that follows FROM
. This leads to some considerations.
- Sometimes it can be a good idea to run several shell commands in a single
RUN
directive to avoid creating images that are not useful. - Modifying a directive means that all subsequent directives also need to be rebuilt. When possible, directives that are expected to change often should follow directives that will change seldom.
- Directives like
LABEL
orEXPOSE
should be placed close to the end of Dockerfiles. In this way they will be rebuilt often, but this operation is cheap. On the other side, changing a label should not trigger a long rebuild process. - Variables should be used to avoid Dockerfiles proliferation. But if a variable is used, changing its value should be tested. So, be sure not to use variables without a good reason.
- Writing logic into a Dockerfile is impossible or very hard. Call shell scripts instead, and write your logic into them. For example, in a shell script it is easy to perform a certain operation only if a variable is set to a certain value.
- If you need MariaDB containers with different configurations or different sets of plugins, use the method explained above. Do not create several Dockerfiles, with different tags, for each desired configuration or plugin set. This may lead to undesired code duplication and increased maintenance costs.
References
More details can be found in the Docker documentation:
- Dockerfile reference.
- docker build.
- Repositories.
- Deploy a registry server.
- Content trust in Docker.
See also:
- Privacy-Enhanced Mail on Wikipedia.
Content initially contributed by Vettabase Ltd.