This week, one of our testing servers got clogged and stopped responding: the disk was full and some services had crashed. This machine is used only for running Docker containers, so it must be a Docker problem, right?

Kind of.

The Problem

This test server has a 100 GiB disk, and it was full. As it is a “Docker only” server, the obvious first thing to check is how much space Docker is using:

$ docker system df
TYPE             TOTAL    ACTIVE    SIZE       RECLAIMABLE
Images           21       18        16GB       10.07GB (60%)
Containers       18       17        812.8MB    15.24MB (1%)
Local Volumes    69       14        6.428GB    395.9MB (6%)
Build Cache      0        0         0B         0B

So, Docker is using about 23 GB, and the volumes are not to blame: they use only 6.4 GB. Pruning freed some space and, in fact, saved the machine. But a few hours later, the disk was full again. Something was flooding the disk, and it was not in any Docker container or volume.
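The pruning itself is just the usual commands, something along these lines (a sketch, not the exact commands we ran; double-check what each one removes on your Docker version first):

$ docker system prune        # removes stopped containers, unused networks, dangling images and build cache
$ docker volume prune        # removes unused local volumes; be sure no data you need lives in them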

I checked the machine logs, and they were small:

$ du -h -d 0 /var/log
1.6G    /var/log

I dug deeper and saw something weird:

$ du -h -d 0 /var/
99G     /var/

And then, the guilty service appeared:

$ du -h -d 0 /var/lib/
97.6G   /var/lib
$ du -h -d 0 /var/lib/docker
96.7G   /var/lib/docker

So it was Docker the whole time. But what else does Docker have, besides the images, layers, containers, volumes, and build cache reported by docker system df?

Docker has logs.

By default, Docker saves the output of each container to a JSON file at /var/lib/docker/containers/<container_ID>/<container_ID>-json.log:

$ du -h /var/lib/docker/containers/*/*.log

315M    /var/lib/docker/containers/016.../016...-json.log
164K    /var/lib/docker/containers/0d1.../0d1...-json.log
51G     /var/lib/docker/containers/0f7.../0f7...-json.log
6G      /var/lib/docker/containers/0f8.../0f8...-json.log
...

Yeah, there were huge log files…
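By the way, to see the worst offenders first, the same du output can be sorted by human-readable size (assuming GNU sort, which understands sizes like 51G with the -h option):

$ du -h /var/lib/docker/containers/*/*-json.log | sort -rh | head -n 5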

By default:

  • Docker saves the output of each container in JSON format. That is the json-file log driver (you can check which driver a container uses, as shown below).
  • Each *-json.log file can grow without any limit.
  • There can be any number of log files.
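If you want to confirm which log driver and options a given container is using, and where its log file lives, docker inspect can tell you (my-container is just a placeholder name):

$ docker inspect --format '{{.HostConfig.LogConfig.Type}}' my-container
json-file
$ docker inspect --format '{{.LogPath}}' my-container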

Possible Solutions

There are some ways to solve this problem:

  • periodically remove/trim the logs, with a cron job or logrotate.
  • use an external log aggregator like fluentd.
  • configure Docker not to flood the disk.

The first option is nice and generic. The second is the most robust, but it involves setting up another server and configuring more things. For this server, the third option is good enough.
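For reference, the first option could be a logrotate rule roughly like this (a sketch, not something we deployed; copytruncate is needed because Docker keeps the log file open, and the glob assumes the default json-file driver):

/var/lib/docker/containers/*/*.log {
        rotate 3
        daily
        compress
        missingok
        copytruncate
}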

Let’s tell Docker to cap each log file at 1 GB and to keep at most 3 log files per container. To do so, we need to configure the Docker daemon. It is as simple as adding these lines to /etc/docker/daemon.json:

{
        "log-driver": "json-file",
        "log-opts": {"max-size": "1g", "max-file": "3"}
}
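Note that daemon-level log options are defaults applied when a container is created; the same options can also be set for a single container at run time, for example (nginx is just a placeholder image):

$ docker run -d --log-driver json-file --log-opt max-size=1g --log-opt max-file=3 nginx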

Now it is time to restart Docker. This automagically trimmed all the logs, and the machine was usable again. Services were running smoothly.
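For completeness, on a systemd-based host the restart is the usual systemctl call; validating the JSON first is a good habit, since a broken daemon.json keeps the daemon from starting (python3 is just an assumption here, any JSON validator will do):

$ python3 -m json.tool /etc/docker/daemon.json
$ sudo systemctl restart docker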

Conclusion

Always rotate your logs.