Docker for Great Good

What is Docker?

Docker is a container system. It allows you to run code in a predefined environment that will Run Anywhere™.

So how is it different from a virtual machine? To start a VM, you allocate resources up front (X bytes of memory, Y CPU cores, etc.), and these allocations are fixed. That is, if the VM only needs half its allocated memory or just a few CPU cycles, you can't reclaim the excess or add more dynamically.

That creates a lot of waste! It means that your services always reserve their maximum resources, whether they use them or not. In addition, you have the overhead of running a full guest operating system on emulated hardware.

For the most part, as developers, we really want only a few things:

  • Make sure processes can't affect the host operating system. We want our containers to be a jail.

  • Make sure processes can't affect one another. So give me isolated memory addresses and file systems.

  • Give me hard memory/CPU limits: a process uses only what it needs, up to a cap, so it can't starve other processes.
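
That last point has a classic single-process ancestor: POSIX rlimits. Here's a minimal sketch, using only Python's standard library, of what a hard per-process limit looks like; cgroups generalize this idea from one process to whole groups of processes:

```python
import resource

# Classic per-process hard limits (rlimits). cgroups extend the same
# idea to whole groups of processes, which is what containers need.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# Lower our own soft limit on open file descriptors to (at most) 256.
new_soft = 256 if hard == resource.RLIM_INFINITY else min(256, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))

# The kernel now refuses to let this process exceed the limit.
print(resource.getrlimit(resource.RLIMIT_NOFILE))
```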

That's containers in a nutshell. Essentially, Docker provides this by using:

  • cgroups: A Linux kernel feature that isolates and limits the resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes.

  • apparmor: A Linux security module that allows you to restrict per-program network access, raw socket access or file path access.

  • aufs (Another Union File System): Imagine a file system that works off of diffs. So every single change is just a diff layered on top of an existing file system. It allows you to "fork" other containers very easily.

  • Many more cool Linux modules and features
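
The union-file-system idea is easy to model in a few lines of Python. This is purely a conceptual sketch (each layer as a dict of path → contents), not how Docker actually stores bytes on disk:

```python
from collections import ChainMap

# Each "layer" is a diff: a mapping of file path -> contents.
base_image = {"/etc/hostname": "ubuntu", "/bin/echo": "<binary>"}
container_diff = {"/etc/hostname": "my-container", "/code/app.py": "print('hi')"}

# A union filesystem resolves reads through the layers, topmost first.
fs = ChainMap(container_diff, base_image)

print(fs["/etc/hostname"])  # "my-container" -- shadowed by the top layer
print(fs["/bin/echo"])      # "<binary>"     -- falls through to the base
```

Forking a container is then just starting a new, empty top layer over an existing stack.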

Since these are all Linux kernel features, this is why Docker originally ran only on Linux. Only recently can you run it on macOS/Windows without VirtualBox!

Downsides:

  • Difficult to get it working natively on non-Linux host OSs

  • Security issues

  • Limited OS choice. (All official Docker images currently use an Ubuntu/Debian base; soon they'll use Alpine Linux.)

Actually using Docker

Okay, so first you need to install Docker. This installs the Docker daemon, which manages all of the containers running on your machine.

Assuming your Docker daemon is running, you can now pull the base Ubuntu image:

$ docker pull ubuntu
Using default tag: latest
latest: Pulling from library/ubuntu
b3e1c725a85f: Pull complete
4daad8bdde31: Pull complete
63fe8c0068a8: Pull complete
4a70713c436f: Pull complete
bd842a2105a8: Pull complete
Digest: sha256:7a64bc9c8843b0a8c8b8a7e4715b7615e4e1b0d8ca3c7e7a76ec8250899c397a

Images are snapshots of a file system. You can push/pull images to and from a central repository. Docker, being a private company, made its own servers (Docker Hub) the default repository. You'll learn how to push/pull from private container repos later.

Now, we can use the base Ubuntu image to run a command:

$ docker run ubuntu echo 'hello'
hello

That was really fast (slightly longer than a second for me)! So what happened?

  • The Docker CLI parsed our command and realized that we wanted to run "echo 'hello'" on the Ubuntu image. It passed that information to the Docker daemon.

  • The Docker daemon started a process with all of the voodoo magic that isolates it.

  • It made sure that the process had access to a file system that we pulled (the Ubuntu image).

  • That process ran our echo 'hello' command.

We can run any other command! For example, we could use ls to explore the filesystem.

$ docker run ubuntu ls
bin
boot
dev
etc
home
lib
lib64
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
usr
var

Neat, right? So it feels like an actual Linux virtual machine!

What if we want to actually use bash within the container? We can use the '-i' (interactive, keeps STDIN open) and '-t' (pseudo-tty) options:

$ docker run -it ubuntu bash
root@71c819b308a6:/# echo 'hello'
hello
root@71c819b308a6:/# exit

Pulling a prebuilt image and advanced options

Now we can have some fun. First, let's pull the official Docker image for Redis:

$ docker pull redis
Using default tag: latest
latest: Pulling from library/redis

5040bd298390: Pull complete
996f41e871db: Pull complete
a40484248761: Pull complete
a97af2bf2ee7: Pull complete
010c454d55e5: Pull complete
142d4cb3dc08: Pull complete
6666ac0e527e: Pull complete
Digest: sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73
Status: Downloaded newer image for redis

Now we can run Redis:

$ docker run -p 6379:6379 -d redis
ce59c7fa1ed003366d39b93137d3713f286017dd6c264b12fe987b805dc4d067

$ redis-cli
127.0.0.1:6379> set 1 1
OK
127.0.0.1:6379> get 1
"1"

You can see our currently running containers:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
ce59c7fa1ed0        redis               "docker-entrypoint.sh"   7 seconds ago       Up 5 seconds        0.0.0.0:6379->6379/tcp   sad_sinoussi

And you can stop a running container:

$ docker stop ce59c7fa1ed003366d39b93137d3713f286017dd6c264b12fe987b805dc4d067
# You can also use `docker stop sad_sinoussi`, the randomly generated name.

It's worth mentioning that Docker keeps old, stopped containers around! For example:

$ # We can also do docker start ce59c7fa1ed003366d39b93137d3713f286017dd6c264b12fe987b805dc4d067
$ docker start sad_sinoussi

This means two things:

  1. Docker takes up more and more disk space over time. Clean up with docker rm $(docker ps -a -q).

  2. If you assign a name to a container with docker run --name, the stopped container keeps that name and the next run will fail with a name conflict. Passing --rm deletes the container when it exits.

Making our own Docker container

Okay, now things can start to get fun. Let's say we want to make a Docker container that runs a little flask app. First, let's make a directory called test-app and make ourselves a little app:

#!/usr/bin/python -u
# app.py

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    app.run(debug=True, host='0.0.0.0')

Next, let's make a Dockerfile. A Dockerfile describes how to build an image, using a small set of instructions defined by Docker:

# Dockerfile
FROM python

RUN pip install flask

# Sets the working directory for any RUN/CMD/COPY/ADD instructions
WORKDIR /code/
ADD ./app.py /code/

CMD ["python", "./app.py"]

It's worth mentioning that we're using the official "python" image as our base. This comes with pip and other goodies preinstalled.

So now we can build our actual Docker image:

$ docker build -t test-app .
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM python
latest: Pulling from library/python

5040bd298390: Already exists
fce5728aad85: Pull complete
76610ec20bf5: Pull complete
52f3db4b5710: Pull complete
45b2a7e03e44: Pull complete
75ef15b2048b: Pull complete
e41da2f0bac3: Pull complete
Digest: sha256:cba517218b4342514e000557e6e9100018f980cda866420ff61bfa9628ced1dc
Status: Downloaded newer image for python:latest
 ---> 775dae9b960e
Step 2 : RUN pip install flask
 ---> Running in 6d8dae25e191
Collecting flask
  Downloading Flask-0.12-py2.py3-none-any.whl (82kB)
Collecting click>=2.0 (from flask)
  Downloading click-6.7-py2.py3-none-any.whl (71kB)
Collecting itsdangerous>=0.21 (from flask)
  Downloading itsdangerous-0.24.tar.gz (46kB)
Collecting Jinja2>=2.4 (from flask)
  Downloading Jinja2-2.9.4-py2.py3-none-any.whl (274kB)
Collecting Werkzeug>=0.7 (from flask)
  Downloading Werkzeug-0.11.15-py2.py3-none-any.whl (307kB)
Collecting MarkupSafe>=0.23 (from Jinja2>=2.4->flask)
  Downloading MarkupSafe-0.23.tar.gz
Installing collected packages: click, itsdangerous, MarkupSafe, Jinja2, Werkzeug, flask
  Running setup.py install for itsdangerous: started
    Running setup.py install for itsdangerous: finished with status 'done'
  Running setup.py install for MarkupSafe: started
    Running setup.py install for MarkupSafe: finished with status 'done'
Successfully installed Jinja2-2.9.4 MarkupSafe-0.23 Werkzeug-0.11.15 click-6.7 flask-0.12 itsdangerous-0.24
 ---> 2882709cbb5b
Removing intermediate container 6d8dae25e191
Step 3 : WORKDIR /code/
 ---> Running in da733427a997
 ---> 2b0774947e31
Removing intermediate container da733427a997
Step 4 : ADD ./app.py /code/
 ---> a44e95d734c5
Removing intermediate container 4bb2afe5864a
Step 5 : CMD python ./app.py
 ---> Running in 5f12142bf674
 ---> 7ae77de3b7b9
Removing intermediate container 5f12142bf674
Successfully built 7ae77de3b7b9

You'll notice that after each instruction, Docker creates an intermediate container and commits it as a layer. Remember: Docker uses AUFS. As you customize your image's file system, your changes become layers on top of the base image.
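
One way to picture those intermediate layers is as a hash chain: each step's id depends on its parent plus the instruction, which is why changing an early Dockerfile line invalidates every later cached layer. Here's a toy model (Docker's real id scheme also hashes the actual file contents; the ids here are illustrative only):

```python
import hashlib

def layer_id(parent_id: str, instruction: str) -> str:
    # Toy model: derive a layer id from (parent id, instruction).
    # Change one step and everything after it gets a new id, i.e.
    # everything after it must be rebuilt.
    return hashlib.sha256((parent_id + instruction).encode()).hexdigest()[:12]

steps = ["FROM python", "RUN pip install flask", "WORKDIR /code/",
         "ADD ./app.py /code/", 'CMD ["python", "./app.py"]']

image = "scratch"
for step in steps:
    image = layer_id(image, step)
    print(step, "->", image)
```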

Also, we built with the -t option. This "tags" our image so we can reference it more easily. We can see it here:

$ docker images test-app
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
test-app            latest              7ae77de3b7b9        5 minutes ago       697.4 MB

Now we can run it:

$ docker run --rm -p 5000:5000 test-app
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger pin code: 100-366-731
172.17.0.1 - - [20/Jan/2017 17:26:27] "GET / HTTP/1.1" 200 -
172.17.0.1 - - [20/Jan/2017 17:26:27] "GET /favicon.ico HTTP/1.1" 404 -

To make a change, simply run docker build again! But that's an annoying dev cycle...

Okay, let's say we want our Docker container to share files with our host file system:

$ docker run --rm -v `pwd`/app.py:/code/app.py -p 5000:5000 test-app

Ka-pow! Volumes are layered on top of the container's file system, which means the mounted app.py shadows the version baked into the image. Edit the file on the host and the change shows up inside the container immediately.

Now let's try to make our Flask app talk to Redis. First let's run a redis container:

$ docker run --name my-redis -d redis
55a0fb5de6136d3cb4fa772be0a5376a5a33b39d75ed7d08dc158b38ae55165c
$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
55a0fb5de613        redis               "docker-entrypoint.sh"   5 seconds ago       Up 4 seconds        6379/tcp   my-redis

Cool, now let's change app.py to talk to Redis. First add the redis package to the pip install in our Dockerfile:

# Dockerfile
FROM python

RUN pip install flask redis

WORKDIR /code/
ADD ./app.py /code/

CMD ["python", "./app.py"]

Then we build it again:

$ docker build -t test-app .
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM python
 ---> 775dae9b960e
Step 2 : RUN pip install flask redis
 ---> Running in 248d0b09139f
Collecting flask
  Downloading Flask-0.12-py2.py3-none-any.whl (82kB)
Collecting redis
  Downloading redis-2.10.5-py2.py3-none-any.whl (60kB)
Collecting click>=2.0 (from flask)
  Downloading click-6.7-py2.py3-none-any.whl (71kB)
Collecting Jinja2>=2.4 (from flask)
  Downloading Jinja2-2.9.4-py2.py3-none-any.whl (274kB)
Collecting Werkzeug>=0.7 (from flask)
  Downloading Werkzeug-0.11.15-py2.py3-none-any.whl (307kB)
Collecting itsdangerous>=0.21 (from flask)
  Downloading itsdangerous-0.24.tar.gz (46kB)
Collecting MarkupSafe>=0.23 (from Jinja2>=2.4->flask)
  Downloading MarkupSafe-0.23.tar.gz
Installing collected packages: click, MarkupSafe, Jinja2, Werkzeug, itsdangerous, flask, redis
  Running setup.py install for MarkupSafe: started
    Running setup.py install for MarkupSafe: finished with status 'done'
  Running setup.py install for itsdangerous: started
    Running setup.py install for itsdangerous: finished with status 'done'
Successfully installed Jinja2-2.9.4 MarkupSafe-0.23 Werkzeug-0.11.15 click-6.7 flask-0.12 itsdangerous-0.24 redis-2.10.5
 ---> d91f2e311d7e
Removing intermediate container 248d0b09139f
Step 3 : WORKDIR /code/
 ---> Running in 0b54d809edc6
 ---> 2a10bad9b3a6
Removing intermediate container 0b54d809edc6
Step 4 : ADD ./app.py /code/
 ---> 0328f86aa26c
Removing intermediate container 016781d53f5e
Step 5 : CMD python ./app.py
 ---> Running in f2bb51177fd8
 ---> 9ed492fd28ab
Removing intermediate container f2bb51177fd8
Successfully built 9ed492fd28ab

Then we run our Docker container while linking it to the existing redis container (--link makes it reachable inside our container under the hostname "redis"):

$ docker run --rm -v `pwd`/app.py:/code/app.py -p 5000:5000 --link my-redis:redis test-app

Now we can change our app.py to talk to redis:

#!/usr/bin/python -u
# app.py

from flask import Flask
import redis

app = Flask(__name__)

@app.route("/")
def hello():
    r = redis.StrictRedis(host='redis', port=6379, db=0)
    num = r.get('count')
    num = int(num) + 1 if num else 1
    r.set('count', num)
    return "Hello {}!".format(num)

if __name__ == "__main__":
    app.run(debug=True, host='0.0.0.0')

docker-compose

Some people realized that all of this Docker juggling could be made simpler, so they built a Python tool to do it called "fig". It was so successful that it became part of Docker as "Docker Compose".

Essentially, it allows you to run several Docker containers at once:

# docker-compose.yml

web:
    build: .
    command: python app.py
    volumes:
    - ./app.py:/code/app.py
    links:
    - redis
    ports:
    - "5000:5000"
redis:
    image: redis

Now we can run docker-compose up and everything will be running.

Docker in production

Docker containers are powerful for development, but they're a really powerful idea for deployment as well, for a few reasons:

  1. Allows you to make micro-services super easily

  2. Very easy clustering (docker-swarm, ECS, Kubernetes)

  3. Easy ops: blue-green deployment and rollbacks are easy

  4. Autoscaling

There's still no "set" way of doing things, so here's an example of a task definition in AWS ECS (EC2 Container Service):

{
  "family": "logstash-production",
  "taskRoleArn": "arn:aws:iam::317260457025:role/pro-logstash-task",
  "networkMode": "bridge",
  "containerDefinitions": [
    {
      "name": "logstash",
      "image": "317260457025.dkr.ecr.us-east-1.amazonaws.com/search/logstash:1.0.1",
      "cpu": 512,
      "memory": 1000,
      "essential": true,
      "command": ["-f", "/src/logstash.production.conf"],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "logstash-production",
          "awslogs-region": "us-east-1"
        }
      }
    }
  ],
  "placementConstraints": [],
  "volumes": []
}

And the workflow is simply:

  • Build your image: docker build -t $ECR_URL/<my_role_name>:VERSION .

  • Push your image to ECR: docker push $ECR_URL/<my_role_name>:VERSION

  • Update your task definition: aws ecs register-task-definition --cli-input-json file://production.json

  • Tell your AWS cluster to use the new task definition (either in the UI or the CLI).
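
Those four steps wrap up naturally into a small deploy script. Everything below (the account id, role name, and version) is a placeholder for your own setup, and the script only echoes the commands (a dry run) rather than executing them:

```shell
#!/bin/sh
# Hypothetical deploy helper; the account id, role, and version are
# placeholders -- substitute your own values.
ECR_URL="123456789012.dkr.ecr.us-east-1.amazonaws.com"
ROLE="logstash"
VERSION="1.0.1"
IMAGE="$ECR_URL/$ROLE:$VERSION"

# Dry run: echo each command instead of executing it.
echo "docker build -t $IMAGE ."
echo "docker push $IMAGE"
echo "aws ecs register-task-definition --cli-input-json file://production.json"
```

Drop the echoes once you've sanity-checked the commands, and the last step (pointing the cluster at the new task definition revision) can stay manual in the UI or become another CLI call.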

It comes with a few headaches:

  • Making security definitions and IAM roles

  • Making your actual cluster instances (different IAM role for this!)

  • Why on earth aren't there CNAMEs for ECR urls?