I have device A (10.10.25.1), which is connected to device B (10.10.25.52) via an unmanaged switch. Device A is a computer; device B is a measurement instrument. Device A continuously sends a command (once per second) and device B responds with values. The traffic is light. There are 5 such measurement devices connected to device A.

Occasionally I get an error on device A for a random device B. After making a network capture, I see some retransmissions on the problematic connection. On A's side, the socket send reported no error, but no data was received either, which caused a timeout and halted the transmission.
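For context, the polling pattern is roughly the sketch below. The newline-terminated reply and the `MEAS?` command in the usage note are assumptions for illustration only, not the actual protocol of the B devices.

```python
import socket


def query(sock, command, timeout_s=1.0):
    """Send one command and wait for one newline-terminated reply.

    Returns the reply bytes, or None if nothing complete arrived in time.
    The newline framing is a hypothetical choice for this sketch.
    """
    sock.settimeout(timeout_s)
    # A successful sendall() only means the data was handed to the local
    # TCP stack; it does not prove the peer received it.
    sock.sendall(command)
    buf = b""
    try:
        while not buf.endswith(b"\n"):
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("peer closed the connection")
            buf += chunk
    except socket.timeout:
        # A lost segment that is not retransmitted in time looks exactly
        # like this from the application's point of view.
        return None
    return buf
```

With a connected socket `s`, `query(s, b"MEAS?\n")` returns the reply or `None` on timeout. This is why a clean send followed by a receive timeout does not by itself tell which side lost the data.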

What I am trying to understand is whether the problem is on device A's side or on B's. I've attached the capture file (Wireshark); the problem starts at frame 22186.

https://drive.google.com/file/d/1EA9fCFK7hK0hfA5gSpgwWaEgVozstbVi/view?usp=sharing

  • You neglect to mention the Ethernet speed. FWIW, a NIC discards bad Ethernet frames it receives, so Wireshark cannot report such receive problems. Presumably you are running Wireshark on computer A, which gives you a very limited perspective. A while back I had a similar project: I swapped out the switch for a (real) hub, lowered the speed to 10BASE-T, and attached another PC to the network to capture all the frames transmitted by all hosts. The biggest problem encountered was a NAPI rotting-packet issue/bug in a Linux Ethernet driver.
    – sawdust
    Commented Oct 3, 2023 at 22:11
  • A has a Gb network interface and runs Windows 10. It's an industrial PC. Indeed, the capture was made on A. The switch is also Gb. All B devices are 100 Mb. Tomorrow I'm going to replace the switch, which is also an industrial one with a 24 V power supply, with a regular unmanaged one (Netgear/Cisco) to rule out a faulty switch. If there is no progress, I'll certainly connect 2nd host to capture all traffic.
    – Pablo
    Commented Oct 3, 2023 at 22:37
  • "I'll certainly connect 2nd host to capture all traffic." -- Easier said than done. You may need a managed switch (one that can mirror a port); an unmanaged switch definitely won't do. A 3rd-party host will not receive unicast frames destined for a MAC address that the switch has already learned, whereas a real (& dumb) hub sends every frame out to every other port.
    – sawdust
    Commented Oct 3, 2023 at 23:01
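The point in the last comment can be illustrated with a toy model. This is only a sketch of textbook MAC learning versus hub repeating; the class names and port numbers are made up for illustration and do not model any particular device.

```python
class Hub:
    """Repeats every frame out of every port except the one it came in on."""

    def __init__(self, ports):
        self.ports = ports

    def forward(self, in_port, src_mac, dst_mac):
        return [p for p in range(self.ports) if p != in_port]


class LearningSwitch:
    """Learns which port each source MAC lives on and forwards accordingly."""

    def __init__(self, ports):
        self.ports = ports
        self.mac_table = {}

    def forward(self, in_port, src_mac, dst_mac):
        self.mac_table[src_mac] = in_port
        out = self.mac_table.get(dst_mac)
        if out is None:
            # Unknown destination: flood like a hub would.
            return [p for p in range(self.ports) if p != in_port]
        return [] if out == in_port else [out]


# Port 0 = device A, port 1 = device B, port 2 = capture PC (hypothetical layout).
switch = LearningSwitch(3)
first = switch.forward(0, "A", "B")   # B not learned yet, frame is flooded
reply = switch.forward(1, "B", "A")   # A is known, frame goes only to port 0
later = switch.forward(0, "A", "B")   # B is known, frame goes only to port 1
hub_out = Hub(3).forward(0, "A", "B")
```

Once both MAC addresses are learned, the capture PC on port 2 sees none of the A/B conversation through the switch, while the hub keeps delivering every frame to it.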

1 Answer

That's impossible to say. The segment may have been lost on the source device, on the switch, or even on Device A's NIC. In the last case, the drop is likely only counted on Device A (as a bad frame) and would not show up in the capture.

The most likely cause is a frame dropped because its frame check sequence (FCS) failed verification, and the most likely reason for that is bad cabling or massive interference.
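As a rough illustration of the FCS check (a sketch only; real NICs do this in hardware): the Ethernet FCS is a CRC-32 over the frame, which matches what `zlib.crc32` computes, appended least significant byte first. The helper names and the sample payload below are made up.

```python
import struct
import zlib


def append_fcs(frame: bytes) -> bytes:
    """Return the frame with its 4-byte CRC-32 FCS appended."""
    return frame + struct.pack("<I", zlib.crc32(frame) & 0xFFFFFFFF)


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC over the body and compare it with the trailing FCS."""
    if len(frame_with_fcs) < 4:
        return False
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == fcs


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate a single bit error on the wire."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


good = append_fcs(b"hypothetical frame contents")
bad = flip_bit(good, 10)
```

Here `fcs_ok(good)` is true and `fcs_ok(bad)` is false. A receiving NIC silently discards the second kind of frame, so it never reaches a capture taken on that host; TCP only notices the gap and retransmits.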

With a managed switch you could check whether the port error counters match the retransmissions you see. Another option is to rigorously test your cabling using full-speed, full-duplex transmission, e.g. using iperf3.
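A minimal sketch of scripting such a test, assuming `iperf3` is installed and an `iperf3 -s` server is running on a second PC plugged in where a B device normally sits (the instruments themselves probably cannot run it). The server address is whatever you pass in. As far as I know, the `retransmits` field is only reported on some platforms, so the code treats it as optional.

```python
import json
import subprocess


def run_iperf3(server, seconds=10):
    """Run an iperf3 TCP client test and return the parsed JSON report."""
    result = subprocess.run(
        ["iperf3", "-c", server, "-t", str(seconds), "-J"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def summarize(report):
    """Pull sender throughput and retransmit count out of an iperf3 report."""
    sent = report["end"]["sum_sent"]
    return {
        "mbit_per_s": sent["bits_per_second"] / 1e6,
        # May be missing on platforms that do not expose TCP statistics.
        "retransmits": sent.get("retransmits"),
    }
```

Usage would be along the lines of `summarize(run_iperf3("10.10.25.60"))`, where the address is a placeholder for the test PC. Throughput well below line rate, or a non-zero retransmit count on an otherwise idle link, points at the physical layer or the switch and away from the application.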

  • The cables from A to the switch and from the switch to B are no more than 1 m each. They are tested with professional cable quality tester device. It also happens randomly on all 5 devices. There is no other device nearby that could interfere; all devices sit in a 19" rack cabinet. My suspicions are: 1. the TCP/IP client on device A; 2. a bug in the firmware of the B devices. I will also check with iperf3, that's a good idea.
    – Pablo
    Commented Oct 3, 2023 at 10:18
  • "They are tested with professional cable quality tester device." -- Are you sure? A professional cable tester/certifier costs at least $800. But if you've already checked/swapped the cables, you can likely rule them out.
    – Zac67
    Commented Oct 3, 2023 at 10:32
  • FLUKE liq-100 is what is used for cables here.
    – Pablo
    Commented Oct 3, 2023 at 22:57
