Docker is a container system. It allows you to run code in a predefined environment that will Run Anywhere™.
So how is it different from a virtual machine? To start a VM, you allocate resources up front (X bytes of memory, Y CPU cores, etc.) and these allocations are absolute. That is, if the VM only needs half its allocated memory or just a few CPU cycles, you can’t reclaim the rest or add more dynamically.
That creates a lot of waste! It means that your services always reserve their maximum resources, whether they use them or not. In addition, you have the overhead of emulating an entire operating system and its hardware.
For the most part, as developers, we really want only a few things:
Make sure processes can’t affect the host operating system. We want our containers to be a jail.
Make sure processes can’t affect one another. So give me isolated memory addresses and file systems.
Give me hard memory/CPU limits, so a process uses only what it needs, up to a cap, and can’t affect other processes.
That’s containers in a nutshell. Essentially, Docker provides this by using:
cgroups: A Linux kernel feature that isolates and limits the resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes.
apparmor: A Linux security module that allows you to restrict per-program network access, raw socket access or file path access.
aufs (Another Union File System): Imagine a file system that works off of diffs. So every single change is just a diff layered on top of an existing file system. It allows you to “fork” other containers very easily.
Many more cool Linux modules and features
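For instance, the cgroup limits surface directly as `docker run` flags. A sketch (the image and the particular values are arbitrary):

```shell
# Cap the container at 256 MB of RAM and give it half the default
# CPU weight; Docker writes these limits into the container's cgroup
# and the kernel enforces them.
docker run --memory=256m --cpu-shares=512 ubuntu echo 'hello'
```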
This is why Docker originally could only run on Linux. Only recently can you run it on OSX/Windows without VirtualBox!
Difficult to get it working natively on host OSs
Limited OS choice. (Most official Docker images are based on Ubuntu or Debian. Soon they’ll move to Alpine Linux.)
Okay, so first you need to install Docker. This installs the Docker daemon which controls all of the containers running on your computer.
Assuming your Docker daemon is running, you can now pull the base Ubuntu image:
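Assuming Docker Hub’s naming, that’s just:

```shell
# Downloads the image layers and stores them locally.
docker pull ubuntu
```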
Images are snapshots of a filesystem. You can push/pull images to and from a central repository. Docker, being a private company, made its own servers (Docker Hub) the default registry. You’ll learn how to push/pull from private container repos later.
Now, we can use the base Ubuntu image to run a command:
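For example:

```shell
docker run ubuntu echo 'hello'
```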
That was really fast (slightly longer than a second for me)! So what happened?
The Docker CLI parsed our command and realized that we wanted to run “echo ‘hello’” on the Ubuntu image. It passed that information to the Docker daemon.
The Docker daemon started a process with all of the voodoo magic that isolates it.
It made sure that the process had access to a file system that we pulled (the Ubuntu image).
That process ran our echo command and exited.
We can run any other bash command! For example, we could use ls to explore the filesystem.
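Like so:

```shell
# Lists the root of the *image's* filesystem, not the host's.
docker run ubuntu ls
```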
Neat, right? So it feels like an actual Linux virtual machine!
What if we want to actually use bash within the container? We can use the ‘-i’ (interactive, keeps STDIN open) and ‘-t’ (pseudo-tty) options:
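Combined, that’s:

```shell
# Drops us into an interactive bash shell inside the container.
docker run -it ubuntu bash
```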
Now we can have some fun. First, let’s pull the official Docker image for Redis:
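The official image is simply named `redis`:

```shell
docker pull redis
```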
Now we can run Redis:
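One way to do it (the `-d` flag is my addition — it runs the container detached, in the background, so your terminal stays free):

```shell
docker run -d redis
```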
You can see our currently running containers:
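With:

```shell
docker ps
```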
And you can stop a running container:
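Using the container ID (or name) shown by `docker ps`:

```shell
docker stop <container-id>
```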
It’s worth mentioning that Docker keeps old containers around after they exit!
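They show up with the `-a` flag:

```shell
# Lists all containers, including stopped ones.
docker ps -a
```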
This means two things:
Docker takes up more and more disk space. Use `docker rm $(docker ps -a -q)` to remove all stopped containers.
If you assign a name to a container with `docker run --name`, you might see a name conflict on the next run. So use `docker rm <name>` to remove the old container first (or pass `--rm` so the container cleans itself up).
Okay, now things can start to get fun. Let’s say we want to make a Docker container that runs a little flask app. First, let’s make a directory called test-app and make ourselves a little app:
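A minimal `app.py` might look like this (the route, port, and message are my own assumptions):

```python
# test-app/app.py -- a tiny Flask app with a single route.
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return 'Hello from Docker!'

if __name__ == '__main__':
    # Bind to 0.0.0.0 so the server is reachable from outside the container.
    app.run(host='0.0.0.0', port=5000)
```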
Next, let’s make a Dockerfile. Dockerfiles are files that define a container. They use some set commands defined by Docker:
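A sketch of such a Dockerfile (the paths and base tag are assumptions):

```dockerfile
# Official Python image: ships with pip preinstalled.
FROM python

# Copy the app into the image and install its dependencies.
COPY . /src
WORKDIR /src
RUN pip install flask

# Document the port Flask listens on and start the app.
EXPOSE 5000
CMD ["python", "app.py"]
```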
It’s worth mentioning that we’re using the official “Python” image as our base. This installs `pip` and other goodies for us.
So now we can build our actual Docker image:
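Assuming we run this from inside the `test-app` directory (the tag name is arbitrary):

```shell
docker build -t test-app .
```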
You’ll notice that after each command, it makes an intermediate container. Remember: Docker uses AUFS. As you customize your container instances’ file paths, your changes are layered on top of this base image.
Also, we ran with a `-t` option. This “tags” our image so we can reference it more easily. We can see it here:
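With:

```shell
docker images
```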
Now we can run it:
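Publishing the container’s port to the host with `-p` (the port numbers assume the Flask default of 5000):

```shell
docker run -p 5000:5000 test-app
```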
To make a change, simply run docker build again! But that’s an annoying dev cycle…
Okay, let’s say we want our Docker container to link to our host file system:
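`-v host_dir:container_dir` mounts a host directory into the container; the paths here are assumptions:

```shell
# Mount the current directory over /src so edits on the host
# are visible inside the container immediately.
docker run -v $(pwd):/src -p 5000:5000 test-app
```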
Ka-pow! Volumes are layered on top of the container’s filesystem, which means the mounted directory shadows the version of app.py we baked into the image.
Now let’s try to make our Flask app talk to Redis. First let’s run a redis container:
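Naming it so we can refer to it when linking (the name `myredis` is arbitrary):

```shell
docker run -d --name myredis redis
```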
Cool, now let’s change app.py to talk to Redis. First add a pip install for the `redis` Python client in our Dockerfile:
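Assuming dependencies are installed with a single pip line, it becomes something like:

```dockerfile
RUN pip install flask redis
```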
Then we build it again:
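Same command as before (assuming the `test-app` tag):

```shell
docker build -t test-app .
```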
Then we run our Docker container while linking to the existing redis container:
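`--link <container>:<alias>` makes the Redis container reachable from our app under the alias’s hostname (the names here are assumptions):

```shell
docker run -p 5000:5000 --link myredis:redis test-app
```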
Now we can change our app.py to talk to redis:
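A sketch of the change, assuming the `redis` Python client and a link alias of `redis`:

```python
# test-app/app.py -- count page hits in Redis.
from flask import Flask
import redis

app = Flask(__name__)
# 'redis' resolves to the linked Redis container's address.
r = redis.StrictRedis(host='redis', port=6379)

@app.route('/')
def hello():
    hits = r.incr('hits')
    return 'Hello! This page has been viewed %d times.' % hits

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```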
Some people realized that all of this Docker stuff could be made simpler, so they made a Python library to do it called “fig”. It was so successful that it became part of Docker as “Docker Compose”.
Essentially, it allows you to run several Docker containers at once:
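A `docker-compose.yml` for our setup might look like this (service names and paths are assumptions):

```yaml
# docker-compose.yml -- a web container built from our Dockerfile,
# linked to a stock redis container.
web:
  build: .
  ports:
    - "5000:5000"
  volumes:
    - .:/src
  links:
    - redis
redis:
  image: redis
```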
Now we can run `docker-compose up` and everything will be running.
Docker containers are powerful for development, but they’re a really powerful idea for deployment as well, for a couple of reasons:
Allows you to make micro-services super easily
Very easy clustering (docker-swarm, ECS, Kubernetes)
Easy ops: blue-green deployment and rollbacks are easy
There’s still no “set” way of doing things, so here’s an example of a task definition in ECS:
```json
"command": ["-f", "/src/logstash.production.conf"],
```
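In full, a stripped-down task definition carrying that command override might look like this (the family, image, and resource values are illustrative):

```json
{
  "family": "logstash",
  "containerDefinitions": [
    {
      "name": "logstash",
      "image": "<ECR_URL>/logstash:VERSION",
      "cpu": 256,
      "memory": 512,
      "essential": true,
      "command": ["-f", "/src/logstash.production.conf"]
    }
  ]
}
```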
And the workflow is simply:
Build your image:

```shell
docker build -t $ECR_URL/<my_role_name>:VERSION .
```

Push your image to ECR:

```shell
docker push $ECR_URL/<my_role_name>:VERSION
```

Update your task definition:

```shell
aws ecs register-task-definition --cli-input-json file://production.json
```
Tell your AWS cluster to use the new task definition (either in the UI or the CLI).
It comes with a couple headaches:
Making security definitions and IAM roles
Making your actual cluster instances (different IAM role for this!)
Why on earth aren’t there CNAMEs for ECR urls?