Creating a Custom Docker Image
Docker containers are created from images. An image contains software that can be launched, including the underlying system. A container is an instance of that software.
When we want to automate MariaDB, creating an image with MariaDB and the desired configuration, we may want to create an image by ourselves, which fulfils our needs.
The source code of a Docker image is a Dockerfile. A Dockerfile is written in Docker specific language, and can be compiled into an image by the
docker binary, using the
docker build command.
Most images are based on another image. The base image is specified at the beginning of the Dockerfile, with the
FROM directive. If the base image is not present in the local system, it is downloaded from Docker Hub. For example, we can build a
mariadb-rocksdb:10.5 image starting from the
debian:13 image. In this way, we'll have all the software included in a standard Debian image, and we'll add MariaDB and its configuration upon that image.
All the following Dockerfile directives are compiled into a new Docker image, identified by an MD5 string. Each of these images is based on the image compiled from the previous directive. A physical compiled image can serve as a base for any number of images. This mechanism saves a lot of disk space, download time and build time.
The following diagram shows the relationship between Dockerfiles, images and containers:
Here's a simple Dockerfile example:
FROM ubuntu:20.04 RUN apt-get update RUN apt-get install -y mariadb-server EXPOSE 3306 LABEL version="1.0" LABEL description="MariaDB Server" HEALTHCHECK --start-period=5m \ CMD mariadb -e 'SELECT @@datadir;' || exit 1 CMD ["mysqld"]
This example is not very good for practical purposes, but it shows what a Dockerfile looks like.
First, we declare that the base image to use is
Then we run some commands to install MariaDB from the Ubuntu default repositories and stop the MariaDB service.
We define some metadata about the image with
LABEL. Any label is valid.
We declare that the port 3306 (MariaDB default port) should be exposed. However, this has no effect if the port is not exposed at container creation.
We also define a healthcheck. This is a command that is run to check if the container is healthy. If the return code is 0 the healthcheck succeeds, if it's 1 it fails. In the MariaDB specific case, we want to check that it's running and able to answer a simple query. This is better than just checking that MariaDB process is running, because MariaDB could be running but unable to respond, for example because max_connections was reached or data si corrupted. We read a system variable, because we should not assume that any user-created table exists. We also specify
--start-period to allow some time for MariaDB to start, keeping in mind that restarting it may take some time if some data is corrupted. Note that there can be only one healthcheck: if the command is specified multiple times, only the last occurrence will take effect.
Finally, we start the container command: mysqld. This command is run when a container based on this image starts. When the process stops or crashes, the container will immediately stop.
See the documentation links below to learn the syntax allowed in a Dockerfile.
It is possible to use variables in a Dockerfile. This allows us, for example, to install different packages, install different versions of a package, or configure software differently depending on how variables are set, without modifying the Dockerfile itself.
To use a variable, we can do something like this:
FROM ubuntu:20.04 ARG MARIADB_CONFIG_FILE ... RUN mysqld --defaults-file=$MARIADB_CONFIG_FILE
ARG is used after the
FROM directive, thus the variable cannot be used in
FROM. It is also possible to declare a variable before
FROM, so we can use a variable to select the base image to use or its tag, but in this case the variable cannot be used after the
FROM directive. Here is an example:
ARG UBUNTU_VERSION FROM ubuntu:$UBUNTU_VERSION # But this will cause a build error: RUN echo 'Ubuntu version: $UBUNTU_VERSION' > /var/build_log
We'll have to assign variables a value when we build the Dockerfile, in this way:
docker build --build-arg UBUNTU_VERSION=20.04 .
Note that Dockerfile variables are just placeholders for values. Dockerfiles do not support assignment, conditionals or loops.
Versioning and Deploying Images
Dockerfiles are normally versioned, as well as the files that are copied to the images.
Once an image is built, it can be pushed to a Docker registry. Whenever an image is needed on a host to start containers from it, it is pulled from the registry.
The default registry for Docker images is Docker Hub. It contains images created by both the Docker team and the community. Any individual or organization can open an account and push images to Docker Hub. Most Docker images are open source: the Dockerfiles and the needed files to build the images are usually on GitHub.
It is also possible to setup a self-hosted registry. Images can be pushed to that registry and pulled from it, instead of using Docker Hub. If the registry is not publicly accessible, it can be used to store images used by the organization without making them publicly available.
But a self-hosted registry can also be useful for open source images: if an image is available on Docker Hub and also on a self-hosted registry, in case Docker Hub is down or not reachable, it will still be possible to pull images.
Choosing Image Names and Tags
The names of images developed by the community follow this schema:
It doesn't matter if the maintainer is an individual or an organization. For images available on Docker Hub, the maintainer is the name of a Docker Hub account.
Official images maintained don't contain the name of the maintainer. For example, the official MariaDB image is called
All images have a tag, which identifies the version or the variant of an image. For example, all MariaDB versions available on Docker are used as image tags. MariaDB 10.5 is called
By conversion, tags form a hierarchy. So for example, there is a
10.1.1 tag whose meaning will not change over time.
10.5 will always identify the latest version in the 10.5 branch. For some time it was
10.5.1, then it became
10.5.2, and so on.
When we pull an image without specifying a tag (ie,
docker pull mariadb), we are implicitly requiring the image with the
latest tag. This is even more mutable: at different periods of time, it pointed to the latest
10.0 version, to the latest
10.1 version, and so on.
In production, it is always better to know for sure which version we are installing/. Therefore it is better to specify a tag whose meaning won't change over time, like
Pushing and Pulling Images
To pull an image from Docker Hub or a self-hosted registry, we use the
docker pull command. For example:
docker pull mariadb:10.5
This command downloads the specified image if it is not already present in the system, or if the local version is not up to date.
After modifying a Dockerfile, we can build an image in this way:
docker build .
This step can be automated by services like Docker Hub and GitHub. Check those service's documentation to find out how this feature works.
Once an image is created, it can be pushed to a registry. We can do it in this way:
docker push <image_name>:<tag>
Docker Content Trust
Docker has a feature called Docker Content Trust (DCT). It is a system used to digitally sign images, based on PEM keys. For environments where security is a major concern, it is important to sign images before pushing them. This can be done with both Docker Hub and self-hosted registries.
Good Practices and Caveats
As mentioned, a Dockerfile is built by creating a new image for each directive that follows
FROM. This leads to some considerations.
- Sometimes it can be a good idea to run several shell commands in a single
RUNdirective to avoid creating images that are not useful.
- Modifying a directive means that all subsequent directives also need to be rebuilt. When possible, directives that are expected to change often should follow directives that will change seldom.
- Directives like
EXPOSEshould be placed close to the end of Dockerfiles. In this way they will be rebuilt often, but this operation is cheap. On the other side, changing a label should not trigger a long rebuild process.
- Variables should be used to avoid Dockerfiles proliferation. But if a variable is used, changing its value should be tested. So, be sure not to use variables without a good reason.
- Writing logic into a Dockerfile is impossible or very hard. Call shell scripts instead, and write your logic into them. For example, in a shell script it is easy to perform a certain operation only if a variable is set to a certain value.
- If you need MariaDB containers with different configurations or different sets of plugins, use the method explained above. Do not create several Dockerfiles, with different tags, for each desired configuration or plugin set. This may lead to undesired code duplication and increased maintenance costs.
More details can be found in the Docker documentation:
- Dockerfile reference.
- docker build.
- Deploy a registry server.
- Content trust in Docker.
- Privacy-Enhanced Mail on Wikipedia.
Content initially contributed by Vettabase Ltd.