Managing data in volumes

Kontena 1.2.0 introduced support for managing data in named volumes. This blog post will cover some of the basics around Docker volume concepts and give some details about how new volume support works in Kontena v. 1.2.0 onward.

Want to find out more about this topic? Watch the Introduction to Kontena Volumes Tutorial here.

What is a volume?

Managing persistent data in containers has been somewhat a problem since the inception of the containerization era. Since the early days Docker has provided support for managing data outside of containers by using volumes. A volume is essentially a directory (or a file) that lives outside of the container union file system. As they live outside the union file system they are essentially just normal files and directories on the host. It has been possible to define volumes either in containers, using the -v option or even in images using VOLUME declaration in the Dockerfile.

Data volume containers

Before Docker came up with the concept of named volumes, the only sensible way to decouple the life-cycle of a container and persistent data was to use a pattern called "data volume container". In this pattern there was a dedicated container to only handle the data, in volumes, which could be re-used for multiple containers. The biggest advantage is/was that the life-cycle of application containers is de-coupled from the data. This essentially means that you can upgrade your apps without losing the data they've stored.

Named volumes

Docker 1.8 introduced support for named volumes. A named volume behaves pretty much like the data container, only it's not a container. Essentially it is a re-usable holder for data that can be used in many containers and it naturally de-couples also the life-cycles of containers and data. The cool thing with named volumes is that Docker supports the extensible model where one can use different drivers for volumes. By using different drivers, data persistence can be integrated to different external storage systems, such as AWS S3 for example. This essentially means that data persisted in named volumes is not tied anymore even to a single host and can be possibly used across many hosts.

Volume drivers

In certain cases it's essential that the different containers do not see the same data and the data is replicated at the application level. In some other cases it may be desired that multiple containers in different hosts see the same exact data. To cater to the different needs for volume behaviour, Docker supports a pluggable model for volume drivers. There is a plethora of different volume driver plugins available to be used. Each of the drivers integrate to different external storage systems and thus provide different options. This of course means that it depends on the specific use case which driver can be used and which of the drivers fulfil the requirements.

Volumes with Kontena

From 1.2.0, the Kontena Platform comes with the capability to manage named volumes. It's currently labeled as an experimental feature. Experimental in practice means that some of the details in the APIs and commands might still change a bit. However, we wanted to make the feature available already so we can get feedback from the community and find potential issues and rough corners.

Managing volumes with CLI

Kontena's CLI comes with subcommands to manage volumes. There are the usual commands to create, show details of and to remove a volume. Currently the volumes are managed outside of Kontena Stacks, but the "externally" managed volumes can then be used in different stacks.

Creating a volume

A volume can be created with the kontena volume create --scope instance --driver rexray my-volume command. This creates the needed volume configuration on Kontena Master, which can then be used in services. This does not actually create any Docker named volumes, they are created "on-the-fly" when services using a volume are scheduled to a certain node.

The driver option needs to be specified. It defines the volume driver to be used when actually creating the Docker named volume.

The scope option defines how the volume is created when multiple services and/or service instances use the same volume. Different options for the scopes are described in the documentation in detail.

Using a volume

After a volume has been created with the kontena volume create command, it can be used in services. When a stack wants to use a volume, it needs to declare the use in the volumes section of the stack file:

stack: jussi/redis  
description: Just a simple Redis stack with volume  
version: 0.0.1  
services:  
  redis:
    image: redis:3.2-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis-data:/data

volumes:  
  redis-data:
    external: true

This takes the redis-data named volume into use within the stack.

Alternatively, the volume name within the stack file can be mapped to a different external name using:

volumes:  
  redis-data:
    external:
      name: other-named-vol

Show volume details

kontena volume show my-volume show volume details such as where it is used:

$ kontena volume show redis-data
[EXPERIMENTAL] The `kontena volume` commands are still experimental in Kontena 1.2, and may change in future releases
redis-data:  
  id: test/redis-data
  created: 2017-04-20T12:31:25.199Z
  scope: instance
  driver: local
  driver_opts:
  instances:
    - name: redis.redis-data-1
      node: polished-frost-64
    - name: redis.redis-data-2
      node: proud-breeze-32
    - name: redis.redis-data-3
      node: wandering-rain-51
  services:
    - test/null/redis

Remove a volume

A volume can be removed only after all the services that are using it are removed. This is to prevent accidental data loss.

A volume can be removed with kontena volume rm my-volume.

Scheduling of volumes

When a volume is created with the kontena volume create command, not much actually happens. The volume information is stored on the masters db and that's about it. The actual Docker named volumes are created only when a service using a given volume is deployed to some of the nodes. Naturally, not all the nodes in the grid have all the possible volume drivers installed. When a service using a volume is scheduled, the list of possible nodes is filtered based on the needed volume drivers.

One aspect that also affects volume and service scheduling is the scope of the volume. The scope defines how many instances (actual Docker named volumes) out of the single volume will be created. When the instance scope is used, each of the service instances will get their own volume created from the same definition. This affects the scheduling so that the service instance will be scheduled always on the same node where the related volume instance lives on. This behaves quite similarly to stateful services, but also gives the option to use a specific volume driver for the data.

Future plans

As the experimental flag suggests, this is only the first step towards managed volumes. With the experiences and feedback we get from the volume support, we'll make the necessary adjustments. For example, the Docker volume naming probably needs some fine tuning as there seems to be severe restrictions in certain environments and drivers around the naming of volumes.

One other thing that we have been thinking of, is to provide some kind of option which defines how a volume can, or if it can in the first place, be re-scheduled across nodes. Some of the drivers support cases where a volume can be moved from one node to another. But usually there are some constraints when doing this. For example, with the RexRay driver, AWS EBS volumes can move only within the same AZ where they have been created, so there needs to be some general and flexible enough control options to take these into account while scheduling services and volumes.

Want to find out more about this topic? Watch the Introduction to Kontena Volumes Tutorial here.

Image Credits: Containers by Erwan Hesry.

About Kontena

Want to learn about real life use cases of Kontena, case studies, best practices, tips & tricks? Need some help with your project? Want to contribute to a project or help other people? Join Kontena Forums to discuss more about Kontena Platform, chat with other happy developers on our Slack discussion channel or meet people in person at one of our Meetup groups located all around the world. Check Kontena Community for more details.

Jussi Nummelin

Read more posts by this author.

Tampere, Finland
comments powered by Disqus