Two Linux machines (Ubuntu 20.04 and 18.04.4, running on vSphere) behind NAT cannot complete the TLS handshake to https://microsoft.com and https://mcr.microsoft.com, although they can connect to https://docs.microsoft.com and other websites. I manage the firewall and NAT myself (a virtual VyOS router). Another system behind the same NAT can access https://microsoft.com. I have already updated all packages and used curl --resolve to try different Microsoft IPs.

I found this when trying to use the container registry:

 Pulling docker image mcr.microsoft.com/dotnet/core/sdk:3.1 ...
ERROR: Preparation failed: Error response from daemon: Get https://mcr.microsoft.com/v2/: net/http: TLS handshake timeout (executor_docker.go:188:10s)

curl -i https://docs.microsoft.com
HTTP/2 301
location: /en-us/
(...)

curl -i http://microsoft.com
HTTP/1.1 301 Moved Permanently
(...)

curl -vv https://microsoft.com
* Rebuilt URL to: https://microsoft.com/
*   Trying 104.215.148.63...
* TCP_NODELAY set
* Connected to microsoft.com (104.215.148.63) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
^C

Debugging with openssl shows that it hangs.

openssl s_client -connect microsoft.com:443 -debug
CONNECTED(00000003)
write to 0x563308e92000 [0x563308ea2500] (305 bytes => 305 (0x131))
0000 - 16 03 01 01 2c 01 00 01-28 03 03 68 a0 18 e1 33   ....,...(..h...3
0010 - 14 d3 6f ef 02 17 72 c0-8e fd 4a 94 bd 03 38 ba   ..o...r...J...8.
0020 - 7b b3 91 a0 67 6c 85 94-e8 2b fb 20 a6 15 74 4c   {...gl...+. ..tL
0030 - 8c 8a 3f e8 31 05 a1 0d-f5 65 dc 70 f9 96 b0 03   ..?.1....e.p....
0040 - 33 32 12 32 1e 72 29 0f-ef 30 80 19 00 3e 13 02   32.2.r)..0...>..
0050 - 13 03 13 01 c0 2c c0 30-00 9f cc a9 cc a8 cc aa   .....,.0........
0060 - c0 2b c0 2f 00 9e c0 24-c0 28 00 6b c0 23 c0 27   .+./...$.(.k.#.'
0070 - 00 67 c0 0a c0 14 00 39-c0 09 c0 13 00 33 00 9d   .g.....9.....3..
0080 - 00 9c 00 3d 00 3c 00 35-00 2f 00 ff 01 00 00 a1   ...=.<.5./......
0090 - 00 00 00 12 00 10 00 00-0d 6d 69 63 72 6f 73 6f   .........microso
00a0 - 66 74 2e 63 6f 6d 00 0b-00 04 03 00 01 02 00 0a   ft.com..........
00b0 - 00 0c 00 0a 00 1d 00 17-00 1e 00 19 00 18 00 23   ...............#
00c0 - 00 00 00 16 00 00 00 17-00 00 00 0d 00 2a 00 28   .............*.(
00d0 - 04 03 05 03 06 03 08 07-08 08 08 09 08 0a 08 0b   ................
00e0 - 08 04 08 05 08 06 04 01-05 01 06 01 03 03 03 01   ................
00f0 - 03 02 04 02 05 02 06 02-00 2b 00 05 04 03 04 03   .........+......
0100 - 03 00 2d 00 02 01 01 00-33 00 26 00 24 00 1d 00   ..-.....3.&.$...
0110 - 20 ce b9 90 2d 17 37 38-46 47 83 cd 06 b5 82 25    ...-.78FG.....%
0120 - 91 ee c1 5a d5 e2 53 62-26 6d 19 59 48 c8 f0 2f   ...Z..Sb&m.YH../
0130 - 6f                                                o
# Here it hangs for a few minutes
^C

Meanwhile I ran tcpdump 'host microsoft.com'; see this Wireshark view.

What can I do to debug this?

1 Answer

Configuring TCP MSS clamping in the VyOS firewall solves the problem. I suppose the WireGuard tunnel lowers the path MTU and PMTU discovery fails, so the server's larger handshake packets never arrive and the connection hangs right after the Client Hello.

set firewall options interface wg02 adjust-mss '1372'
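As a rough sketch of where a value like 1372 can come from: for IPv4, the TCP MSS is the link MTU minus 20 bytes of IPv4 header and 20 bytes of TCP header, so 1372 would correspond to a tunnel MTU of 1412. That MTU is an assumption inferred from the clamp value, not something stated above, and the helper names below are made up for illustration. The second helper gives the ICMP payload size for a don't-fragment ping (iputils `ping -M do`), which may help confirm the usable MTU if ICMP is not filtered along the path.

```shell
#!/bin/sh
# Hypothetical helpers; the MTU of 1412 is an assumption derived from the
# adjust-mss value 1372, not a figure given in the answer.

# Largest TCP payload (MSS) that fits in one IPv4 packet of the given MTU:
# MTU - 20 bytes IPv4 header - 20 bytes TCP header (no IP/TCP options).
tcp_mss_for_mtu() {
    echo $(( $1 - 20 - 20 ))
}

# ICMP echo payload that yields an IPv4 packet of exactly the given MTU:
# MTU - 20 bytes IPv4 header - 8 bytes ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

tcp_mss_for_mtu 1412        # prints 1372
icmp_payload_for_mtu 1412   # prints 1384

# Probe (not executed here): send pings with the don't-fragment bit set.
# If this size gets through but a larger one does not, the path MTU is
# probably close to the assumed value.
# ping -M do -c 3 -s "$(icmp_payload_for_mtu 1412)" microsoft.com
```

For IPv6 traffic the fixed header is 40 bytes instead of 20, so the matching MSS would be 20 bytes smaller.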

This link is a related question about a PPPoE interface. That discussion helped me find the solution.
