docker build is very slow in what I would call a standard setup. The environment is as follows:

  • Virtual Machine
  • Docker CE 18.09.0 (git commit 4d60db4) running on Linux amd64 (Ubuntu 16.04.5 LTS)
  • Backing file system is ext4
  • Storage driver is overlay2
  • The machine has 4 GB of RAM, 2 vCPUs (Xeon E7-8880 v4)

I build from the following Dockerfile:

FROM ubuntu:18.04

ARG uid=1000
ARG gid=1000
RUN echo "${gid} ${uid}"

# Some further ENV stuff

RUN DEBIAN_FRONTEND=noninteractive && \
   apt-get update && apt-get -y install -y tzdata && \
   ln -fs /usr/share/zoneinfo/UTC /etc/localtime && \
   dpkg-reconfigure --frontend noninteractive tzdata && \
   apt-get -y install -y apt-transport-https ca-certificates \
     wget curl unzip vim git less mc jed \
     python python-openstackclient  && \
   apt-get -q autoremove && \
   apt-get -q clean -y && \
   rm -f /var/lib/apt/lists/*

# There are further RUN commands below, but they don't matter

I then build the docker image by calling

$ docker build -t test:latest .

Installing Python (and the other packages) obviously creates a large number of small files in the build container/image. Still, what happens is the following:

  • The FROM/ARG instructions and the first RUN command go through almost instantly (as expected).
  • The second RUN command is expected to take a while, since it installs quite a large number of packages via the package manager. apt and the related tools need roughly two minutes to complete (a runtime I would be totally okay with).

After having written the last line to the console

0 upgraded, 0 newly installed, 0 to remove and 14 not upgraded

the build takes ages to complete. I have already let it run for 4+ hours, but the next layer (the next RUN command) had still not been reached.

In that state, the system load shows that the entire system is idling (loadavg < 0.1). There is more than 1 GB of free RAM. Every once in a while dockerd briefly does something, but with <2% CPU load. Moreover, checking the I/O load with iotop, I see both Total DISK READ and Total DISK WRITE at around 30 K/s (which is nothing). The disk beneath the backing file system is known to be an SSD (so we can safely assume that an I/O throughput of 80+ MB/s should be possible). Also, when I manually test reading from and writing to /var/lib/docker (where the backing file system is mounted), I get write speeds of 200+ MB/s. The backing file system still has more than 10 GB of free space.
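The manual read/write check described above can be reproduced roughly as follows. This is only a sketch: the function names `seq_write_test` and `small_file_test`, the temporary file names, and the default sizes are my own assumptions, not the exact commands that were run. The second function is there because a package installation writes many small files, which is a different access pattern from one large sequential write.

```shell
#!/bin/sh
# Sketch only: function names, test file names and defaults are illustrative.

# Write one large file and print the last line of dd's output
# (with GNU dd this line includes the measured throughput).
seq_write_test() {
    dir=$1
    size_mb=${2:-256}
    file="$dir/.seq-write-test.$$"
    # conv=fsync flushes the data to disk before dd reports its timing
    dd if=/dev/zero of="$file" bs=1M count="$size_mb" conv=fsync 2>&1 | tail -n 1
    rm -f "$file"
}

# Create many 4 KiB files and report how long that took (whole seconds).
small_file_test() {
    dir=$1
    count=${2:-5000}
    work="$dir/.small-file-test.$$"
    mkdir -p "$work" || return 1
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$count" ]; do
        head -c 4096 /dev/zero > "$work/f$i"
        i=$((i + 1))
    done
    sync
    end=$(date +%s)
    rm -rf "$work"
    echo "$count files in $((end - start)) s"
}

# Example (as root, since /var/lib/docker is usually root-only):
#   seq_write_test /var/lib/docker 512
#   small_file_test /var/lib/docker 5000
```

If the first number looks healthy but the second takes minutes, the bottleneck is per-file overhead rather than raw disk throughput.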

Checking ps auxw | grep docker as well, I see that docker build is in a blocked state and that the sub-process

docker-untar /var/lib/docker/overlay2/0475df....07/diff

is running, but in state "stopped". In total, after 2+ hours of wall-clock time, this process has consumed less than 2 CPU-seconds. Looking into the diff directory mentioned by docker-untar above and taking several snapshots of du -sh ., I can see that the data volume in that directory increases very slowly (about 1 MB per 3 minutes, roughly 5.5 kB/s; even ancient floppy disks were faster than this!). At the time of writing, it contained 103 MB. A find . | wc -l returned 5157, likewise going up only very slowly (162 files were written during the 3 minutes it took to generate 1 MB of data).
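The measurements above (process state, growth of the diff directory, resulting write rate) can be collected with a few small helpers. A rough, Linux-only sketch that assumes /proc is mounted; the function names are hypothetical, and the directory path has to be supplied by the caller because the one shown above is truncated.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of any Docker tooling.

# Print the one-letter kernel state of a process, read from /proc/<pid>/status
# (R = running, S = sleeping, D = uninterruptible sleep, T = stopped, Z = zombie).
proc_state() {
    while read -r key value rest; do
        if [ "$key" = "State:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Integer bytes-per-second rate for a byte count and a duration in seconds.
# 1 MB in 3 minutes: 1000000 / 180 = 5555 bytes/s, i.e. about 5.5 kB/s.
rate_bytes_per_sec() {
    echo $(( $1 / $2 ))
}

# Measure size (KiB) and entry count of a directory twice, "interval"
# seconds apart, and print the difference.
sample_growth() {
    dir=$1
    interval=${2:-180}
    kb1=$(du -sk "$dir" | cut -f1)
    n1=$(find "$dir" | wc -l)
    sleep "$interval"
    kb2=$(du -sk "$dir" | cut -f1)
    n2=$(find "$dir" | wc -l)
    echo "grew $((kb2 - kb1)) KiB and $((n2 - n1)) entries in $interval s"
}

# Example (as root):
#   pid=$(pgrep -f docker-untar | head -n 1)
#   proc_state "$pid"
#   sample_growth /path/to/the/layer/diff 180
```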

Running containers on the same machine, and reading from or writing to a container's attached image, works without problems.

I also have access to a very similar virtual machine (with even more CPU/RAM). However, the same behavior can be observed there as well.

Does anyone have an idea what could be misconfigured here, such that Docker does not use the free CPU and I/O capacity of the machine?

  • If you manually run each command, does it behave the same?
    – Ramhound
    Commented Jun 1, 2019 at 14:26
  • Yes, it does. Apparently, it only gets stuck when committing the layer, after all parts of the second RUN command have completed. I can also adjust the Dockerfile (e.g. adding/removing packages), but I end up with the same problem. Only if I make the RUN command trivial is the commit of the layer fast as well. Commented Jun 1, 2019 at 14:29
  • Have you considered just reporting the issue as a bug?
    – Ramhound
    Commented Jun 1, 2019 at 14:32
  • No, I have not considered raising a bug yet. Commented Jun 1, 2019 at 14:34
  • FWIW, it runs in about 3 min on my (native) Linux machine (after fixing the final rm, which fails because there are directories) (Ubuntu 16.04, Docker 18.06.0-ce). The generated layer is 380 MB.
    – xenoid
    Commented Jun 1, 2019 at 21:03

1 Answer

Just for the sake of completeness: after endless hours of debugging, it turned out that the McAfee virus scanner used on the server had a resource leak, which made it slow down in general. I did not notice that at first, as there were no other I/O-intensive operations running on the server. The workaround I applied was to "cron" a restart of the virus scanner every night. Since then, the issue has never surfaced again.
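The nightly restart could look something like the sketch below. The actual service name of the scanner is not given here, so `example-scanner.service` is a placeholder; the 03:00 schedule, the file name under /etc/cron.d, and the use of systemctl are likewise assumptions.

```shell
#!/bin/sh
# Sketch only: service name, schedule and file name are placeholders.

# Print an /etc/cron.d-style entry (fields: minute, hour, day of month,
# month, day of week, user, command) that restarts a service once per night.
restart_cron_line() {
    service=$1
    hour=${2:-3}
    printf '0 %s * * * root systemctl restart %s\n' "$hour" "$service"
}

# Example (as root):
#   restart_cron_line example-scanner.service 3 > /etc/cron.d/restart-scanner
```

Depending on the cron PATH on the machine, the full path to systemctl may be needed in the entry.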
