
We've been experiencing a long-standing networking issue. In short, one container cannot ping (or ssh) another. Does anybody have an extra moment to think along with me?

Our setup:

  • Docker CE 18.06.03 (while trying to fix the issue, we've upgraded from 17.03, but it has not helped)
  • Swarm Classic (Standalone) 1.2.9
  • Consul as a Swarm backend, running with members on five nodes
  • Seven nodes in total, six of which host containers
  • Each container is connected to an overlay network when it is started

What we've tried so far:

This issue has largely stumped us. We've spent a lot of time on it and have done the basic troubleshooting plus some more advanced troubleshooting (happy to elaborate), but I doubt we've exhausted our options, so please don't hesitate to suggest anything you think might work. The problem is inconsistent (it affects different images and different nodes), intermittent, and long-standing (several months).

We've made two changes so far. The first was a workaround for MAC address assignment (explained here: https://github.com/docker/libnetwork/pull/2380; the actual workaround: https://github.com/systemd/systemd/issues/3374#issuecomment-452718898), which improved the situation and removed the MAC address assignment errors from the logs. The second was an upgrade to pick up this fix (https://github.com/docker/libnetwork/pull/1935), which deals with IP reuse. This also reduced the problem (at the time, no containers could communicate). I've also run through some basic tests using the netshoot container (let me know if you want more info on that).

We have a workaround for a given container that is broken: we delete the Consul data for this container and then stop and restart it. From what I can tell, it does not seem to be an issue with the Consul data per se but instead comes from Docker/Swarm resetting several network configurations when the container is started (I can say more if this seems to trigger a thought for anybody reading). Then, the container can often ping other containers, but not always.

Specific question:

It seems like there's a window of time during which this can be worse. It's not necessarily tied to starting several containers at once, but there's a somewhat clear pattern: during some windows of time, containers do not get configured properly to communicate with each other. What troubleshooting steps come to mind for you?

The content below is the output from trying to ping one container (82afb0dccbcc) from two other containers. It fails at first, but then is successful.

The first time I try to ping the container, at 2019-12-10T23:57:52+00:00:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
82afb0dccbcc: user___92397089 crccheck/hello-world
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
PING 82afb0dccbcc (172.24.0.165) 56(84) bytes of data.

--- 82afb0dccbcc ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3033ms

PING 82afb0dccbcc (172.24.0.165) 56(84) bytes of data.
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=2 ttl=64 time=0.083 ms
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=3 ttl=64 time=0.072 ms
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=4 ttl=64 time=0.073 ms

--- 82afb0dccbcc ping statistics ---
4 packets transmitted, 3 received, 25% packet loss, time 3023ms
rtt min/avg/max/mdev = 0.072/0.076/0.083/0.005 ms
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

In this first ping test, above, packet loss is 100% from the first container and 25% from the second.
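Since the failures come and go, it may help to log packet loss with timestamps so the bad windows can be lined up against daemon and Consul logs. The following is only a sketch under assumptions: `loss_percent` and `watch_ping` are hypothetical helper names, the 30-second interval is arbitrary, and the source container is assumed to have `ping` installed.

```shell
#!/bin/sh
# Print the integer packet-loss percentage found in ping output read on stdin.
# Prints nothing if no "N% packet loss" summary line is present.
loss_percent() {
    sed -n 's/.* \([0-9][0-9]*\)% packet loss.*/\1/p'
}

# Hypothetical monitor: ping a target from a source container every 30 seconds
# and print one timestamped line per attempt. Runs until interrupted.
watch_ping() {
    src=$1
    target=$2
    while :; do
        loss=$(docker exec "$src" ping -c 4 "$target" 2>&1 | loss_percent)
        printf '%s %s -> %s loss=%s%%\n' \
            "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$src" "$target" "${loss:-unknown}"
        sleep 30
    done
}
```

For example, `watch_ping ansible-provisioner 82afb0dccbcc >> ping.log` would build a timeline for the container pair shown above.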

A few minutes later (2019-12-10T23:57:52+00:00), however, 82afb0dccbcc can be successfully pinged from both containers:

82afb0dccbcc: user___92397089 crccheck/hello-world
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ping from ansible-provisioner:
PING 82afb0dccbcc (172.24.0.165) 56(84) bytes of data.
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=1 ttl=64 time=0.056 ms
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=2 ttl=64 time=0.073 ms
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=3 ttl=64 time=0.077 ms
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=4 ttl=64 time=0.087 ms

--- 82afb0dccbcc ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3063ms
rtt min/avg/max/mdev = 0.056/0.073/0.087/0.012 ms
ping from ansible_container:
PING 82afb0dccbcc (172.24.0.165) 56(84) bytes of data.
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=1 ttl=64 time=0.055 ms
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=2 ttl=64 time=0.055 ms
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=3 ttl=64 time=0.060 ms
64 bytes from user___92397089.wharf (172.24.0.165): icmp_seq=4 ttl=64 time=0.085 ms

--- 82afb0dccbcc ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3062ms
rtt min/avg/max/mdev = 0.055/0.063/0.085/0.015 ms
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2
  • Do you have any idea why you get "ping": executable file not found in $PATH? It seems more like a filesystem issue, in which the ping executable is not mounted properly, than a networking issue. What do you think?
    Commented Dec 11, 2019 at 5:27
  • @OmarAl-Ithawi: That line was misleading; I removed it.
    – Brian Dant
    Commented Dec 11, 2019 at 12:24

4 Answers

13

You need to create a network and connect both the containers to that network.

The Docker embedded DNS server enables name resolution for containers connected to a given user-defined network. This means that any connected container can ping another container on the same network by its container name.

From within container1, you can ping container2 by name. So it's important to explicitly specify names for the containers; otherwise this would not work.

Create two containers:

docker run -d --name container1 -p 8001:80 test/apache-php
docker run -d --name container2 -p 8002:80 test/apache-php

Now create a network:

docker network create myNetwork

After that connect your containers to the network:

docker network connect myNetwork container1
docker network connect myNetwork container2

Check if your containers are part of the new network:

docker network inspect myNetwork

Now test the connection; you should be able to ping container2 from container1:

docker exec -ti container1 ping container2
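The create and connect steps above can be wrapped so they are safe to run more than once. This is a minimal sketch, not part of the original answer: `connect_all` is a hypothetical helper name, and it treats any failure of `docker network connect` (already connected, or container missing) as a skip rather than distinguishing the cases.

```shell
#!/bin/sh
# Hypothetical helper: attach every named container to one user-defined network.
# Usage: connect_all myNetwork container1 container2
connect_all() (
    network=$1
    shift
    # Create the network only if inspecting it fails, i.e. it does not exist yet.
    docker network inspect "$network" >/dev/null 2>&1 ||
        docker network create "$network" >/dev/null || exit 1
    for name in "$@"; do
        # "docker network connect" fails if the container is already attached
        # or does not exist; report that and carry on with the next one.
        if docker network connect "$network" "$name" 2>/dev/null; then
            echo "connected $name"
        else
            echo "skipped $name"
        fi
    done
)
```

After running `connect_all myNetwork container1 container2`, the `docker network inspect myNetwork` check from the answer still applies.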
4
  • Thanks @shivani , so Docker doesn't provide guarantee for inter-container communications without creating a network explicitly? Commented Dec 11, 2019 at 9:40
  • 1
    @shivani: Thanks for your answer! We do indeed connect all containers to an overlay network. Sorry to not be more clear about this in my question.
    – Brian Dant
    Commented Dec 11, 2019 at 12:26
  • 2
    @OmarAl-Ithawi - Docker creates three networks automatically on install: bridge, none, and host. By default, a container is created with one network attached. If no network is specified, this is the default docker0 network. After the container has been created, more networks can be attached to it using the "docker network connect" command. To make sure that both containers run on the same network, I created a new network and connected both of my containers to it.
    – Shivani
    Commented Dec 12, 2019 at 3:25
  • @OmarAl-Ithawi It is not possible to have inter-container communications without creating a network explicitly
    – New Bee
    Commented Jun 21, 2021 at 23:56
3

If you're reaching this answer because the others didn't help, try rebooting your PC :)

I was experiencing something similar right after reinstalling Docker. To be more precise, I'm running Ubuntu and had Docker installed using Snap. For some reason I needed it to be installed using apt, so I removed the snap and installed it with apt.

  • Started a fresh terminal
  • docker run hello-world runs fine
  • Docker DNS seemed to be working, containers attempted to connect using hostname, correct internal IP was resolved
  • To double check: docker exec -it mycontainer ping othercontainer would show that it tried to ping an internal 172.x.x.x IP
  • I cross referenced the IP with docker inspect - IP correct
  • Ping never came back - timed out

Hail mary: reboot my laptop - works 🎉
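The name-resolution check from the list above (compare the IP that ping resolves with the one Docker reports) can be sketched as follows. The helper names `ping_resolved_ip` and `check_resolution` are hypothetical, and the sketch assumes the source container has `ping` installed. A matching IP only confirms DNS; as in this answer, packets can still fail to come back.

```shell
#!/bin/sh
# Extract the IP that ping resolved, from its "PING name (ip) ..." header line.
ping_resolved_ip() {
    sed -n 's/^PING [^ ]* (\([0-9.]*\)).*/\1/p' | head -n 1
}

# Hypothetical check: does the name resolve, from inside the source container,
# to one of the IPs that "docker inspect" reports for the target container?
check_resolution() {
    src=$1
    target=$2
    resolved=$(docker exec "$src" ping -c 1 "$target" 2>/dev/null | ping_resolved_ip)
    if [ -n "$resolved" ] &&
        docker inspect -f '{{range .NetworkSettings.Networks}}{{println .IPAddress}}{{end}}' "$target" |
        grep -qxF "$resolved"; then
        echo "DNS ok: $resolved"
    else
        echo "DNS mismatch: ${resolved:-none}"
    fi
}
```

Usage with the names from the list above would be `check_resolution mycontainer othercontainer`.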

1
  • This was exactly my scenario! I removed the docker snap, installed via apt, did hours of setup, and then no containers could talk to each other. It's been driving me nuts.
    – Lambart
    Commented Feb 20 at 1:40
0

I ran into this issue at random, but in my case both containers were already on the same network, so I was puzzled as to why one container couldn't ping the other.

Then I ran docker network inspect myNetwork and happened to notice that, for some reason, both containers had been assigned the SAME MAC address. I have no idea why or how that happened. That would obviously prevent pinging, since on a LAN switches use MAC addresses to forward frames.

I had to stop and remove the container, then recreate it, to change the MAC address.
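Rather than spotting a duplicate by eye, the inspect output can be filtered for repeated MAC addresses. This is a sketch under assumptions: `dup_macs` is a hypothetical helper name, and the usage line relies on the `MacAddress` and `Name` fields that `docker network inspect` lists for each attached container.

```shell
#!/bin/sh
# Read "MAC name" lines on stdin and print each MAC address that occurs
# more than once. Blank lines are ignored.
dup_macs() {
    awk 'NF { print $1 }' | sort | uniq -d
}

# Usage, with the network name from this answer:
#   docker network inspect \
#     -f '{{range .Containers}}{{println .MacAddress .Name}}{{end}}' myNetwork | dup_macs
```

No output means every container on that network has a distinct MAC address.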

0

If a web app is running in one of your containers and you want to call one of its endpoints from another container and use the response, you can follow the steps below.

First establish inter-container communications using docker network

1. docker network create dockerContainerCommunication

Now connect containers to network dockerContainerCommunication

2. docker network connect dockerContainerCommunication container1
3. docker network connect dockerContainerCommunication container2

Now start your containers (if not started)

4. docker start container1
5. docker start container2
Inspect your network. Here you can also find the IP addresses of the containers.

6. docker network inspect dockerContainerCommunication

Now attach to the container from which you want to use the web application, then call the other container using curl and the IP address you found in step 6.

7. docker attach container1

or

7. docker attach container2

and then run the curl command

curl http://IP_ADDRESS:PORT_ON_WHICH_APP_IS_RUNNING/api/endpointPath

I hope it helps.
