DevSecOps Course Labs

Building Container Images

Images are the packages which containers run from. You'll build an image for each component of your application, and the image has all the pre-requisites installed and configured, ready to run.

You can think of images as the filesystem for the container, plus some metadata which tells Docker which command to run when the container starts.

Reference

CLI overview

You use the docker image commands to work with images. The build and history commands also have top-level aliases, docker build and docker history:

docker image --help

docker build --help

docker history --help

Base images

Images can be built in a hierarchy - you may start with an OS image which sets up a particular Linux distro, then build on top of that to add your application runtime.

Before we build any images we'll set Docker to use the original build engine:

# on macOS or Linux:
export DOCKER_BUILDKIT=0

# OR with PowerShell:
$env:DOCKER_BUILDKIT=0

BuildKit is the new Docker build engine. It produces the same images as the original engine, but the printed output doesn't show the Dockerfile instructions being executed. We're using the original engine so you can see all the steps.

We'll build a really simple base image:

docker build -t courselabs/base ./labs/images/base

📋 List all the images you have - then filter them for images starting with 'courselabs'.

Not sure how?
# list all local images:
docker image ls

# and filter for the courselabs images:
docker image ls 'courselabs/*'

These are the images stored in your local Docker engine cache.

The new base image doesn't add anything to the official Ubuntu image, which is available in lots of different versions.

📋 Pull the main Ubuntu image, then pull the image for Ubuntu version 20.04.

Not sure how?
docker pull ubuntu

# image versions are set in the tag name:
docker pull ubuntu:20.04

List all your Ubuntu images and your own base image:

docker image ls --filter reference=ubuntu --filter reference=courselabs/base

You'll see they all have the same ID - they're actually all aliases for a single image.
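The pull commands above rely on how an image reference splits into a repository and a tag, with Docker using the tag latest when none is given. As a rough illustration, here is a minimal sketch of that split in plain shell. The function name split_image_ref is hypothetical and not part of Docker, and the sketch ignores digest references (those containing @sha256:).

```shell
# Split an image reference into "repository tag".
# Hypothetical helper for illustration only - it does not call Docker.
split_image_ref() {
  ref="$1"
  last="${ref##*/}"   # final path component, e.g. "ubuntu:20.04"
  case "$last" in
    *:*) repo="${ref%:*}"; tag="${last##*:}" ;;   # explicit tag after the last colon
    *)   repo="$ref"; tag="latest" ;;             # no tag given, so the default applies
  esac
  printf '%s %s\n' "$repo" "$tag"
}

split_image_ref ubuntu            # prints: ubuntu latest
split_image_ref ubuntu:20.04      # prints: ubuntu 20.04
split_image_ref courselabs/base   # prints: courselabs/base latest
```

Only the last path component is checked for a colon, so a registry address with a port (e.g. localhost:5000/app) is not mistaken for a tag.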

Commands and entrypoints

The Dockerfile syntax is straightforward to learn.

The Dockerfile in labs/images/curl is a simple example which installs the curl tool.

📋 Build an image called courselabs/curl from the labs/images/curl Dockerfile.

Not sure how?
docker build -t courselabs/curl ./labs/images/curl

Now you can run a container from the image, but it might not behave as you expect:

# just runs curl:
docker run courselabs/curl 

# fails - the URL replaces the curl command instead of being passed to it:
docker run courselabs/curl docker.courselabs.co

# to use curl you need to specify the curl command:
docker run courselabs/curl curl --head docker.courselabs.co

The CMD instruction sets the exact command for a container to run. You can't pass options to the container command - but you can override it completely.
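The lab's Dockerfile isn't reproduced here, so the following is only a sketch of what a CMD-based image could look like. The ubuntu:20.04 base, the apt-get install step and the cmd-sketch folder name are all assumptions, not copied from labs/images/curl.

```shell
# Hypothetical scratch folder, so the lab's own files are left untouched
mkdir -p cmd-sketch

# Write a minimal Dockerfile (assumed content, not the lab's actual file)
cat > cmd-sketch/Dockerfile <<'EOF'
# assumed base image and install step
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y curl
# CMD is the default command - any arguments to docker run replace it entirely
CMD ["curl"]
EOF
```

If you build this with docker build -t cmd-sketch ./cmd-sketch, then docker run cmd-sketch runs curl with no arguments, and anything you add after the image name replaces curl rather than being appended to it.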

The updated Dockerfile, labs/images/curl/Dockerfile.v2, makes a more usable image:

Build a v2 image from that Dockerfile:

docker build -t courselabs/curl:v2 -f labs/images/curl/Dockerfile.v2 labs/images/curl

You can run containers from this image with more logical syntax:

docker run courselabs/curl:v2 --head docker.courselabs.co

The --head argument and URL in the container command get passed to the entrypoint.
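To make the entrypoint behaviour concrete, here is a sketch of a Dockerfile that uses ENTRYPOINT. The contents of Dockerfile.v2 aren't shown in this lab text, so the alpine:3.14 base, the apk install step and the entrypoint-sketch folder name are assumptions.

```shell
# Hypothetical scratch folder, so the lab's own files are left untouched
mkdir -p entrypoint-sketch

# Write a minimal Dockerfile (assumed content, not the lab's actual file)
cat > entrypoint-sketch/Dockerfile <<'EOF'
# assumed base image and install step
FROM alpine:3.14
RUN apk add --no-cache curl
# exec-form ENTRYPOINT - arguments to docker run are appended to this command
ENTRYPOINT ["curl"]
EOF
```

With an image built from this file, docker run <image> --head docker.courselabs.co would execute curl --head docker.courselabs.co inside the container. The exec form (the JSON array) matters here: with the shell form of ENTRYPOINT, the extra arguments are not passed on.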

📋 List all the courselabs/curl images to compare sizes.

Not sure how?
docker image ls courselabs/curl

The v2 image is smaller - which means it has less stuff in the filesystem and a smaller attack surface.

Image hierarchy

You don't typically use OS images as the base image in your FROM instruction. You want as many of your app's pre-requisites as possible already installed for you.

You should use official images - application and runtime images which are maintained by the project teams.

The Dockerfile in labs/images/web bundles some custom HTML content on top of the official Nginx image:

docker build -t courselabs/web ./labs/images/web --pull

The --pull flag tells Docker to download the latest version of the FROM image before it starts the build.

📋 Run a container from your new image, publishing port 80, and browse to it.

Not sure how?
# use any free local port, e.g. 8090:
docker run -d -p 8090:80 courselabs/web

curl localhost:8090

The container serves your HTML document, using the Nginx setup configured in the official image.
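A Dockerfile for this kind of image can be very short. The sketch below is not the lab's file: the web-sketch folder, the index.html name and its content are hypothetical. It uses the nginx:1.21.4-alpine base named later in this lab, and /usr/share/nginx/html is the default content folder in the official Nginx image.

```shell
# Hypothetical scratch folder and sample page
mkdir -p web-sketch

cat > web-sketch/index.html <<'EOF'
<html><body><h1>Hello from a sketch image</h1></body></html>
EOF

# Write a minimal Dockerfile (assumed content, not the lab's actual file)
cat > web-sketch/Dockerfile <<'EOF'
# start from the official Nginx image, which already has the web server configured
FROM nginx:1.21.4-alpine
# copy the page into the folder Nginx serves by default
COPY index.html /usr/share/nginx/html/index.html
EOF
```

There is no CMD or ENTRYPOINT in this sketch because the image inherits the startup command from the Nginx base image, so the only thing added is the content layer.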

Docker images are composed of layers. An image is one logical package, but it's physically stored as multiple small files, which are the layers.

Layers are read-only and they can be shared between images - if you build your images correctly you'll get very efficient use of disk space and network bandwidth.

Inspect your web image and you'll see the layer IDs at the end of the output:

docker inspect courselabs/web

📋 Inspect the image which your web image is based on. Do they have any shared layers? How about the image that image is based on?

Not sure how?

The base image is Nginx running on Alpine:

docker pull nginx:1.21.4-alpine

docker inspect nginx:1.21.4-alpine

And you can check Alpine too:

docker pull alpine:3.14

docker inspect alpine:3.14

You can see all the shared layers - the web image builds on top of the Nginx layers, and the Nginx image builds on top of the Alpine layers.
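Comparing long layer digests by eye is error-prone, so here is a small sketch that does the comparison for you. The helper names layers_of and shared_layers, and the file names in the usage note, are hypothetical. layers_of needs the Docker CLI and a locally available image, while shared_layers only compares two text files.

```shell
# Print the layer digests of a local image, one per line (needs the Docker CLI)
layers_of() {
  docker image inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "$1" \
    | grep '^sha256:'   # drop any blank lines in the output
}

# Print the digests that appear in both files (fixed-string, whole-line match)
shared_layers() {
  grep -Fxf "$1" "$2"
}
```

For example, run layers_of nginx:1.21.4-alpine > nginx-layers.txt and layers_of courselabs/web > web-layers.txt, then shared_layers nginx-layers.txt web-layers.txt lists the layers the two images have in common.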

Here's the full image hierarchy:

alpine:3.14
└─ nginx:1.21.4-alpine
    └─ courselabs/web 

Some images include the full version details in their tag - e.g. golang:1.17.3-alpine3.15 names both the Go version and the Alpine version. The Nginx tag doesn't include the Alpine OS version, so you need to figure that out by trial and error (or by knowing which OS version was current when the image was built).

Lab

Your turn to write a Dockerfile.

There's a simple Java app in this folder which has already been compiled into the file labs/images/java/HelloWorld.class.

Build a Docker image which packages that app, and run a container to confirm it's working. The command your container needs to run is java HelloWorld.

Stuck? Try hints or check the solution.


Cleanup

Clean up by removing all containers (this removes every container on your machine, not just the ones from this lab):

docker rm -f $(docker ps -aq)