Build Docker Images With GitLab CI/CD
Make good use of your free compute minutes from GitLab.

If you can write YAML files to configure GitLab CI/CD, you can have it do things for you. It’s like getting a free computer (though only for a few hours every month on the free tier). The first time I used it was to generate and host this website; then I tried to modify the pipeline so that I could earn some crypto with my blog. Now it’s time to do something like a real developer: use GitLab to automatically build Docker container images and push them to the GitLab container registry.

The Official Template

All you need to do is commit a .gitlab-ci.yml file to your repository. An exact copy of the official example for Docker will work. At the time of writing, the content (sans the comments) is:

docker-build: 
  image: docker:latest
  stage: build
  services:
    - docker:dind
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY 
  script:
    - |
      if [[ "$CI_COMMIT_BRANCH" == "$CI_DEFAULT_BRANCH" ]]; then
        tag=""
        echo "Running on default branch '$CI_DEFAULT_BRANCH': tag = 'latest'"
      else
        tag=":$CI_COMMIT_REF_SLUG"
        echo "Running on branch '$CI_COMMIT_BRANCH': tag = $tag"
      fi      
    - docker build --pull -t "$CI_REGISTRY_IMAGE${tag}" .
    - docker push "$CI_REGISTRY_IMAGE${tag}" 
  rules:
    - if: $CI_COMMIT_BRANCH
      exists:
        - Dockerfile

But of course the repository itself must contain everything necessary for building a Docker image: the Dockerfile, plus any additional files. Then, whenever you push a new commit to GitLab, GitLab will try to build an image from the updated files in the repository, tag it according to the branch or tag name, and push the image to the repository’s container registry. If this is all you need, you can stop reading now. The rest of this post is about how to understand this cryptic file, and how I tailor-made one for my specific needs.

Understanding the Pipeline

A pipeline has one or more stages, each of which contains one or more jobs. Jobs in the same stage can execute in parallel, but stages are executed sequentially. When executing a pipeline, GitLab assigns a runner and runs each job in a containerized environment, using the specified image as the base. The official template contains only one stage (‘build’), which has only one job (‘docker-build’).
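For instance, the skeleton of a two-stage pipeline could look like this (a made-up illustration, not part of the official template):

stages:        # stages run sequentially, in this order
  - build
  - test

build-image:   # a job in the 'build' stage
  stage: build
  script:
    - echo "building..."

unit-test:     # jobs in the same stage can run in parallel
  stage: test
  script:
    - echo "testing..."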

To understand the template, you can consult the complete .gitlab-ci.yml reference. But here I’ll try to explain it line by line with my own comments:

docker-build:           # names a 'job' of the pipeline as 'docker-build'
  image: docker:latest  # use the latest image of docker
  stage: build          # this 'job' belongs to the 'build' stage
  services:             # specify additional Docker images to run alongside the job
    - docker:dind       # dind: Docker-in-Docker, a way of using docker to build docker images
  before_script:        # defines the commands that should run before each job's 'script'
    # A Docker command that logs in to the GitLab container registry, using the credentials from predefined variables
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:               # lists the commands that will be executed
    # Shell script for deciding what the image tag should be
    # If the current commit is on the default branch, tag as 'latest'
    - |
      if [[ "$CI_COMMIT_BRANCH" == "$CI_DEFAULT_BRANCH" ]]; then
        tag=""
        echo "Running on default branch '$CI_DEFAULT_BRANCH': tag = 'latest'"
      # If it's not the default branch, tag using the '$CI_COMMIT_REF_SLUG' variable
      else
        tag=":$CI_COMMIT_REF_SLUG"
        echo "Running on branch '$CI_COMMIT_BRANCH': tag = $tag"
      fi
    # Docker command to build the image with the tag decided above
    - docker build --pull -t "$CI_REGISTRY_IMAGE${tag}" .
    # Docker command to push the image to the GitLab registry
    - docker push "$CI_REGISTRY_IMAGE${tag}"
  rules:                # this job 'docker-build' runs only when the following rule matches
    # Run only for branch pipelines, and only if a 'Dockerfile' exists in the repository
    - if: $CI_COMMIT_BRANCH
      exists:
        - Dockerfile

So now it’s clear that the example template builds, tags, and pushes all right. But it does not completely satisfy my needs. For example, for every new commit to a branch, it builds a new image but tags it with only one tag (‘latest’ for the default branch, $CI_COMMIT_REF_SLUG 1 for other branches). This means that whenever there is a new commit, the old image loses its tag, making it difficult to debug an old image related to an old commit. This can be partially avoided by using git tags to mark releases. Pushing a tag will also trigger the building process and produce an image with almost the same tag (except that it goes through $CI_COMMIT_REF_SLUG, meaning special characters are replaced by ‘-’; e.g., the git tag ‘v1.0’ becomes the image tag ‘v1-0’).
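To make the slug behavior concrete, here is roughly what pushing a release tag produces (the project path below is hypothetical):

git tag v1.0
git push origin v1.0
# in the resulting pipeline, $CI_COMMIT_REF_SLUG is 'v1-0' ('.' replaced by '-'),
# so the image is pushed as registry.gitlab.com/me/myproject:v1-0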

My Adapted Template

In my case, I want to build an image whenever there is a commit (to whichever branch) or a tag, push the image to the container registry of the same repository, and tag the image with multiple tags:

  1. The commit short hash, so that I can easily find the corresponding commit
  2. If it’s on the default branch, add the ’latest’ tag, so that I’ll get it by default when I pull without specifying the tag
  3. If it’s not the default branch, add the branch name, so that I know it’s the latest non-master image
  4. If it’s a git tag, add the tag as it is, so that I can keep track of the important versions

So here is the one that I ended up using for a simple project. It’s a pipeline with only one stage (‘build’), which has two jobs (‘build-master’ and ‘build-other’) for different conditions (as specified by the ‘rules’).

image: docker:20 # freeze major version for reproducibility

build-master:
  stage: build
  services:
    - docker:dind
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:
    # tag by short SHA, ref name (branch or tag name), and also 'latest'
    - docker build --pull -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA -t $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME -t $CI_REGISTRY_IMAGE:latest .
    # push all the tags
    - docker image push --all-tags $CI_REGISTRY_IMAGE
  rules:
    # Run this if it's the default branch
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'

build-other:
  stage: build
  services:
    - docker:dind
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:
    # tag by short SHA and ref name (branch or tag name)
    - docker build --pull -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA -t $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME .
    - docker image push --all-tags $CI_REGISTRY_IMAGE
  rules:
    # Run this if it's not the default branch
    - if: '$CI_COMMIT_BRANCH != $CI_DEFAULT_BRANCH'
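Once a pipeline has run, an image can be pulled by any of its tags; the project path below is again hypothetical:

docker login registry.gitlab.com
docker pull registry.gitlab.com/me/myproject:abc1234   # a specific commit, by its short SHA
docker pull registry.gitlab.com/me/myproject:latest    # the latest default-branch build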

Why

Automation boosts productivity. Building a Docker image and pushing it to a registry can be a long wait. As a data analyst who writes Python, it gives me as good an excuse for slacking off as my developer colleagues’ “my code is compiling”. Automatically building images after git commits does not save the build time, but it does save the typing of Docker commands and the overhead of context switching.

[Image: “Compiling”]

Also, my local network environment makes things worse. Whenever there is network IO such as apt update or pip install during the build, timeouts or plain connection failures are expected unless I have configured mirrors or proxies. Changing these settings in the Dockerfile is cumbersome (instead of a simple one-line FROM ubuntu, it now needs an additional line running sed to change the sources.list), and sometimes causes bugs (it once happened that I put the latest version of a package in requirements.txt, but it was so new that my PyPI mirror had not copied it yet). Having GitLab build the images in their cloud, where I can safely assume good access to common resources on the Internet, avoids this headache.
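For reference, the kind of Dockerfile tweak I mean looks roughly like this (the mirror URL is a placeholder; the sed pattern assumes an Ubuntu-style sources.list):

FROM ubuntu:20.04
# point apt at a mirror reachable from the local network (placeholder URL)
RUN sed -i 's|archive.ubuntu.com|mirrors.example.com|g' /etc/apt/sources.list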

Alternatives to Docker in Docker

In the example here we used Docker-in-Docker (remember the docker:dind line under services?) to build images. Apparently Docker-in-Docker is hacky and slow; the more fashionable way is to use Kaniko. But I’ll leave that as an exercise for the reader (or the future me).
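As a starting point for that exercise, a minimal Kaniko job might look roughly like this (a sketch based on GitLab’s documentation at the time of writing, untested here):

build-kaniko:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    # write the registry credentials where Kaniko expects them
    - mkdir -p /kaniko/.docker
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"auth\":\"$(echo -n $CI_REGISTRY_USER:$CI_REGISTRY_PASSWORD | base64)\"}}}" > /kaniko/.docker/config.json
    # build and push without a Docker daemon
    - /kaniko/executor --context "$CI_PROJECT_DIR" --dockerfile "$CI_PROJECT_DIR/Dockerfile" --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"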

An Idea for Some Shady Abuses

This blog has already covered how to use Docker for running any executable, and now we know that GitLab lets you run Docker. Thus it is entirely possible to run a malicious container. Except I’m not going to risk my account trying this.


  1. All these $ variables are called predefined variables; they are accessible during the pipeline as environment variables. Read here for a complete list of predefined variables provided by GitLab. If you need customized environment variables, you can define your own (optionally protected) CI/CD variables.↩︎


Last modified on 2021-12-14