I am trying to capture 10 GBit/s of UDP traffic with as little loss as possible (the measurement device cannot hold that much data in its own cache, so TCP is not an option here).
To test whether I can capture 10 GBit/s of UDP traffic, I bought two Intel X550-T2 NICs and connected them directly with a 2 m Cat7 cable. However, I could not get the receiver to use more than one CPU core to receive the UDP data. Strangely, TCP uses all 8 configured RSS queues, and I get 10 GBit/s of traffic.
Ubuntu Linux 20.04 also works very well (10 GBit/s of UDP traffic, 0% loss).
My question now is: how can I use RSS for UDP as well on Windows 10? And if that is not possible, why not?
Receiver hardware:
Processor: AMD Ryzen 3900X
Motherboard: X570a-pro
RAM: 64 GB
Configuration:
Windows 10 Professional
Hyperthreading / SMT: Disabled (I don't know whether this is required, but Microsoft states that RSS only uses non-SMT processors)
DMA-Coalescing: OFF
Receive buffer: 4096
Flow control: Rx+Tx
Speed and Duplex: Auto
Interrupt reduction: Enabled
Interrupt reduction rate: Adaptive
IPsec Offload: OFF
IPv4 - checksum offload: Rx + Tx
Jumbo packets: 9014
Large-Send-Offload V2 (IPv4+v6): Enabled
Max. RSS queues: 8 (I also tested 16, but that changed nothing)
Packet priority and VLAN: Both deactivated (activating both changed nothing)
RSS: Enabled
TCP - Checksum offload (IPv4+v6): Rx + Tx
UDP - Checksum offload (IPv4+v6): Rx + Tx
Sending buffer: 16384
Connection event log: Enabled
Method of testing:
Since RSS is based on hashing the source/destination IP addresses and source/destination ports, I started 10 iperf3 servers on the receiver, each listening on a different port, and started 10 sending clients on the sender.
Sending UDP packets fully occupies one CPU core on the receiver, but none of the other cores, and only 5 GBit/s could be received.
Sending TCP packets occupies all 8 configured receive queues, and 10 GBit/s could be received.
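
The assumption behind this test, that flows with different ports end up in different RSS queues, can be sketched with the Toeplitz hash that RSS is specified to use. This is only a rough illustration, not the driver's actual behaviour: the key is the sample key published in Microsoft's RSS hash verification documentation (a real NIC may use another key), the IP addresses and port numbers are hypothetical, and the indirection table is a simplified round-robin mapping. The sketch compares hashing the full 4-tuple with hashing only the two IP addresses.

```python
import socket
import struct

# Sample 40-byte key from Microsoft's RSS hash verification documentation.
# Assumption: the real driver may be configured with a different key.
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(key, data):
    """Return the 32-bit Toeplitz hash of data under key."""
    key_bits = len(key) * 8
    if len(data) * 8 + 32 > key_bits:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                # XOR in the 32-bit key window that starts at this bit offset.
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def hash_input(src_ip, dst_ip, src_port=None, dst_port=None):
    """Build the hash input: IPs only (2-tuple) or IPs plus ports (4-tuple)."""
    data = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
    if src_port is not None and dst_port is not None:
        data += struct.pack("!HH", src_port, dst_port)
    return data


def queue_for(hash_value, num_queues=8, table_size=128):
    """Map a hash to a queue via a simplified round-robin indirection table."""
    indirection_table = [i % num_queues for i in range(table_size)]
    return indirection_table[hash_value & (table_size - 1)]


def queues_for_flows(flows, use_ports):
    """Return the queue index chosen for each (src_ip, dst_ip, sport, dport)."""
    queues = []
    for src_ip, dst_ip, sport, dport in flows:
        if use_ports:
            data = hash_input(src_ip, dst_ip, sport, dport)
        else:
            data = hash_input(src_ip, dst_ip)
        queues.append(queue_for(toeplitz_hash(RSS_KEY, data)))
    return queues


# Hypothetical addresses and ports standing in for the 10 iperf3 flows.
FLOWS = [("192.168.1.1", "192.168.1.2", 50000 + i, 5201 + i) for i in range(10)]

if __name__ == "__main__":
    print("4-tuple hash queues:", queues_for_flows(FLOWS, use_ports=True))
    print("2-tuple hash queues:", queues_for_flows(FLOWS, use_ports=False))
```

With IP-only hashing every flow between the same two hosts necessarily yields the same hash and therefore the same queue, no matter how many ports are used. If the driver hashed only the IP addresses for UDP (an assumption, not something confirmed here), that would match the single busy core seen in the UDP test.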