Docker and Bitbucket Pipelines

There are two motivations behind this post: the first is that I wanted to learn more about Docker. The second is that Bitbucket recently released a feature called Pipelines, which allows you to add continuous integration to your projects. I already had a test suite backing ProjectNom, so hooking it into a CI tool was an attractive goal.

So let’s start at the beginning: Docker.

I’d already heard enough about Docker to understand the general concept. If you’re new to Docker, they have a good introduction on their website. While my end goal was to use Docker as a container for running ProjectNom tests, I also wanted to see if I could use it as a fully-functional, isolated development environment.

ProjectNom runs on a standard LAMP stack, so the first thing I did was look for an image that could get me as close to that environment as possible. Luckily, such a thing existed in the form of linode/lamp.
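Pulling it down is a single command:

docker pull linode/lamp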

With the core technology in place, I just needed to hook up the site to Apache and the database to MySQL.

Apache
Ideally, the site mounted in Apache should be the latest code from git. Luckily, this is an easy task in Docker: when you start a container from an image, you can mount any local directory into the container’s file system. For example, starting a container like this:

docker run --volume=/Users/perrina/Documents/ProjectNom:/localDebugRepo projectnom

… would mount the local path /Users/perrina/Documents/ProjectNom to the Docker path /localDebugRepo. This allows the Docker container to use whatever code is on the local disk — whether it’s a stable release in master, or an experimental branch that implements a new feature.

That’s all well and good, but there is one complication to this setup: Apache needs to serve /localDebugRepo instead of whatever directory linode/lamp points to by default.

This opens up the first core concept in Docker: images should be lightweight and generic. This allows for a library of standard starting points which can then be customized as needed. That customization is achieved with a Dockerfile — a simple script that describes how to morph a generic image into something better suited for the task at hand. The important note here is that nothing is permanently altered: the original image remains as-is, and is the starting point whenever a container is created from it.
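As a minimal sketch of what a Dockerfile looks like, assuming linode/lamp as the base (the RUN step is a hypothetical example, not ProjectNom’s actual setup):

# Build a project-specific image on top of the generic LAMP image.
FROM linode/lamp

# Hypothetical customization: install an extra package the project needs.
RUN apt-get update && apt-get install -y git

Running docker build -t projectnom . against this file produces a new image layered on top of linode/lamp, leaving the original untouched.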

That said, there are times when you truly do need to bake changes into the image itself — forking from the original to enhance its core functionality. In this case, the Apache configuration needed to be modified to point to the new mounted directory.

First, I had to start the container in a way that let me interact with it directly¹ instead of having it run in the background:

docker run -it --volume=/Users/perrina/Documents/ProjectNom:/localDebugRepo projectnom

Note the -it option added here. That instructs Docker to make the container interactive and attach a TTY. Once inside the container itself, we can navigate to Apache’s configuration, modify the document root, and exit back to the host OS.
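Inside the container, that looks something like the following. The configuration path assumes linode/lamp’s Ubuntu/Apache layout; double-check it against your image:

# inside the container: point Apache at the mounted code
nano /etc/apache2/sites-available/000-default.conf   # change DocumentRoot to /localDebugRepo
exit                                                 # back to the host OS

To save our changes, we commit the modification: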

docker commit -m "Modifying Apache configuration." -a "Andrew Perrin" 984d3c873dc9

There are a couple of things going on here:
  - Commit? Just like with source control, changes can be committed at specific points so you can track history and roll back as necessary.
  - -m: A free-text commit message — again, this is just like source control. Ideally it describes the changes you made so that the image’s history is easier to follow.
  - -a: The author of the changes.
  - 984d3c873dc9: The command docker ps lists currently running containers, each with a unique container ID — in this case it’s 984d3c873dc9. Referencing the container ID tells Docker exactly which container’s changes you want to commit. (If the container has already exited, docker ps -a will still show it. You can also append an image name, such as projectnom, to tag the result.)

¹ Side note: You may have noticed both the term image and container being used. What’s the difference? An image is an executable package that includes everything needed to run a piece of software. A container is a running instance of a particular image. When we commit a change, what we’re actually doing is saving the container’s changes as a new image.
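A quick way to keep the two straight at the command line:

docker images   # lists the images available locally
docker ps       # lists the containers currently running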

Setting up Apache was easy enough. Now for something a bit more complicated: the database.

MySQL
If we want to follow Docker best practices, we’d treat our MySQL database the same way we treat Apache: the actual data is mounted from the outside when the container is started². The primary reason for this is that Docker containers are ephemeral: throw one away and start a fresh one from the image, and any changes made inside the old container are gone (unless committed to an image, as described earlier). In most real-world use cases, you wouldn’t want your site’s files reverting to an earlier state, and you certainly wouldn’t want your database rolling back.

² Side note: Strictly speaking, if we wanted to follow another of Docker’s core concepts, we’d break this linode/lamp container up into separate containers that each handle a single concern — e.g., a container that runs Apache and another that runs MySQL. But having everything in one container is a good way to get familiar with Docker’s environment and usage.
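For illustration, persisting the database externally would look something like this (the host path is hypothetical; /var/lib/mysql is MySQL’s default data directory on Ubuntu):

docker run --volume=/Users/perrina/Documents/pn-mysql-data:/var/lib/mysql projectnom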

My use case is slightly different, though: this image will be used primarily for running tests, and if those tests affect the database, we don’t actually want the changes to persist. So I decided to save the MySQL data in the image itself. Once a representative set of data was added and committed, anything added or removed by tests wouldn’t be preserved, allowing the tests to run predictably even after an earlier run fails midway.

Making this change to our image is similar to modifying the Apache configuration file: start a container in interactive mode, make the necessary changes to MySQL, and then commit them. One additional tool that helps with this process is docker cp, which copies files into a running container. If you already have MySQL data files, or a script to initialize a new MySQL database, you can add them to the container while it’s running. Here’s how you would copy over a SQL script, for example (projectnom is the running container’s name or ID from docker ps; note that docker cp won’t expand ~ on the container side, so use an absolute path):

docker cp /Users/perrina/Documents/pn-db-init.sql projectnom:/root/pn-db-init.sql
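From inside the container, the script can then be loaded into MySQL (the credentials are whatever your image uses; this assumes the MySQL root user):

mysql -u root -p < /root/pn-db-init.sql   # create and populate the test data

Once the data looks right, commit the container just like before so the snapshot becomes part of the image.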

Up to this point, we have a Docker image that lets the site run with sample data. If the database ever gets messed up, starting a fresh container from the image restores everything to how it was.

Bitbucket Pipelines Compatibility
There were a few more steps I had to take to get this image working with my unit tests and Bitbucket Pipelines.

The majority of ProjectNom’s tests are powered by JavaScript, using a combination of PhantomJS and CasperJS to simulate interaction with the site and assert various behaviors. The tests also make some assumptions about their environment that had to be baked into the image. Here’s a summary of how the image was changed to meet these requirements:

  1. Obviously, PhantomJS and CasperJS had to be installed in the image. CasperJS is just a set of scripts, so it copied over easily; PhantomJS ships platform-specific binaries, so the image needed the Linux build rather than the binary from my macOS install.
  2. There are various resources the tests rely on, such as certain images existing on the file system (to test image upload and cached resources). Pipelines allows you to define a bitbucket-pipelines.yml file to describe how tests should be run, so after committing the necessary files to the Docker image, I just ensure those items get copied to the right place before tests are run.
  3. Earlier, we mounted the site’s code to /localDebugRepo. That name actually came from Bitbucket’s “how to test your Docker image locally” guide, but it isn’t what Bitbucket’s build agents use. They mount the repository somewhere else on the file system, so I keep a separate Apache configuration file just for Bitbucket’s build agents; it is copied into place before the server is started and the tests are run.
  4. Since linode/lamp is an Ubuntu-based image, it’s possible to use apt-get to install or update packages. I take advantage of this to install NodeJS and Gulp so that the front-end code can be compiled.
  5. Finally, the yml was updated to start Apache and MySQL before the test script is run; a sketch of the whole file follows this list.
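To tie those pieces together, here is roughly the shape of the resulting bitbucket-pipelines.yml. This is a sketch, not the exact file: the image name, paths, and test command are all placeholders.

image: projectnom   # placeholder: an image Pipelines can pull from a registry

pipelines:
  default:
    - step:
        script:
          # swap in the Apache config for Bitbucket's build agents (placeholder paths)
          - cp bitbucket-apache.conf /etc/apache2/sites-available/000-default.conf
          # put test fixtures (images, etc.) where the tests expect them
          - cp -r tests/fixtures/* /var/www/
          - service mysql start
          - service apache2 start
          # compile the front-end code, then run the suite
          - gulp build
          - casperjs test tests/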

Where to go from here
Setting up this Docker image was exactly what I needed to get familiar with how it works. As someone who spends a lot of time using VMs for development, Docker took some getting used to: containers are more fluid and practically invisible, which is completely counter to how I interact with my VMs. It’s also worth noting that while the distinction between an image and a container makes sense logically, the Docker CLI does a poor job of signaling which commands affect images and which affect containers. As a result, the learning curve is steeper than it needs to be.

I readily admit that my final solution isn’t ideal. Long term, I intend to swap out the linode/lamp image I’m currently using for something more maintainable: a barebones Ubuntu image with a couple of Dockerfiles. Instead of the one container, there would be several: one for Apache, one for MySQL, and one with the dependencies needed to run the CasperJS tests. Each of these containers would be on the same virtual network, allowing them to communicate with each other. It would also be nice for ProjectNom’s tests to not have any external dependencies, but when you’re testing something like image uploads, that can be difficult to achieve.
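The networking piece, at least, is straightforward with Docker’s user-defined networks. A sketch with hypothetical image and container names:

docker network create pn-net                          # private network for the containers
docker run -d --net=pn-net --name pn-mysql pn-mysql   # database container
docker run -d --net=pn-net --name pn-web pn-apache    # web container; reaches the database at pn-mysql:3306

Containers on the same user-defined network can resolve each other by name, which is what makes the pn-mysql hostname work.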

I will be sure to write a follow up post with anything else I learn as I make these changes.
