1

I have a cable modem connected to a Linksys AX4200 Mesh WiFi system.

For the past week, we've noticed intermittent loss of internet. I'm fairly sure it's not a WiFi issue - my phone, for example, will show strong WiFi signal but flags the network as having no internet. The intermittent nature makes it hard to resolve with my ISP.

I would like to capture some data to better characterize the frequency and duration of interruptions. AFAIK there is no utility I can run directly on the router or modem for this purpose.

What are some good approaches to figure this out?

4
  • 1
    Perhaps your router has logs. That is how I approach this.
    – anon
    Commented Jul 21, 2023 at 15:21
  • 3
    Router logs may not be particularly useful unless the device is a modem/router combo. Things like signal strength in the various bands, signal-to-noise ratio, negotiated layer 1 protocols, etc. will only be available on the modem itself. Router logs may help you find out WHEN problems occur, but they won't tell you anything about the nature of the problem. One tip I have is to call the ISP while the issue is present, so they can look at the modem telemetry; the devices won't maintain a long log history, and the ISP will not collect/retain them upstream. Commented Jul 21, 2023 at 16:04
  • 1
    Do you have a Raspberry Pi lying around? Or any other Linux system that is always on?
    – Daniel B
    Commented Jul 21, 2023 at 18:44
  • 1
    @DanielB I do have a RasPi but not sure it's functional. Are you thinking to run a script like Preston's answer or have something else in mind?
    – jake
    Commented Jul 21, 2023 at 20:50

2 Answers

2

I am in the process of doing this to document that the circuit I'm on with my ISP is oversubscribed. Intermittently throughout the day, I'll get heavy packet loss that interrupts Zoom calls (and anything else, really; it's just most irritating with low-latency, near-real-time applications like Zoom).

Presently, I have a cron job that runs mtr every minute to 8.8.8.8:

preston@neo:~$ crontab -l | grep 'monitoring' -A 1
# network monitoring
* * * * * /home/preston/netmon.sh
preston@neo:~$ cat /home/preston/netmon.sh 
#!/bin/bash

# -r report mode
# -b both IP and hostname
# -C output CSV form
# -c 30 pings
/usr/bin/mtr -r -b -C -c 30 8.8.8.8 > /home/preston/network-monitoring/$(date +"%Y-%m-%dT%H:%M:%S%:z").csv
preston@neo:~$ find /home/preston/network-monitoring/ -type f -exec cat {} + | grep -v "???" | grep -v "Mtr_Version" | awk -F',' {'print $2","$5","$6","$7'} | grep -v "0.00" | grep "charter" | wc -l
2411
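As a rough sketch (not part of my actual setup), the same filtering can be done in Python. It assumes the column layout implied by the awk command above: field 2 is the start time, field 5 the hop, field 6 the host and field 7 the loss percentage. The function name `lossy_hops` and the `SAMPLE` rows are made up for illustration; only the lag-58 row reuses values from my data. Comparing the loss as a number also avoids a pitfall of the text filter: `grep -v "0.00"` drops rows with `10.00` or `100.00` loss as well.

```python
import csv
import io

# Illustrative sample only: a truncated header and made-up rows in the
# column order assumed above (real mtr CSV files have more columns).
SAMPLE = """\
Mtr_Version,Start_Time,Status,Host,Hop,Ip,Loss%
MTR.X,1689940477,OK,8.8.8.8,1,192.168.1.1,0.00
MTR.X,1689940477,OK,8.8.8.8,2,???,100.00
MTR.X,1689940477,OK,8.8.8.8,3,lag-58.hcr09ftwbtxff.netops.charter.com (96.34.112.164),23.33
MTR.X,1689940537,OK,8.8.8.8,10,dns.google (8.8.8.8),100.00
"""


def lossy_hops(csv_text, min_loss=0.0):
    """Return (start_time, hop, host, loss_pct) for hops with loss above min_loss."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text)):
        # Skip blank lines, short rows and the header row.
        if len(fields) < 7 or fields[0] == "Mtr_Version":
            continue
        # Skip hops that never answered (shown as ???), like grep -v "???".
        if fields[5] == "???":
            continue
        try:
            start, hop, loss_pct = int(fields[1]), int(fields[4]), float(fields[6])
        except ValueError:
            continue  # malformed row
        # Numeric comparison, so 10.00 and 100.00 are kept.
        if loss_pct > min_loss:
            rows.append((start, hop, fields[5], loss_pct))
    return rows
```

Calling `lossy_hops(SAMPLE)` keeps the lag-58 row and the 100.00 row and drops the clean hop and the `???` hop.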

Whenever there is a problem, it can look something like this:

10:38:44 preston@neo:~$ find /home/preston/network-monitoring/ -type f -exec cat {} + | grep -v "???" | grep -v "Mtr_Version" | awk -F',' {'print $2","$5","$6","$7'} | grep -v "0.00" | tail -n 30
1689940356,7,72.14.197.124,36.67
1689940356,9,192.178.44.39,33.33
1689940356,10,dns.google (8.8.8.8),36.67
1689940417,4,lag-21.rcr01ftwptxzp.netops.charter.com (96.34.112.174),96.67
1689940417,5,lag-806.bbr01dllstx.netops.charter.com (96.34.2.32),86.67
1689940417,6,lag-801.prr01dllstx.netops.charter.com (96.34.3.69),93.33
1689940417,8,108.170.225.149,93.33
1689940417,10,dns.google (8.8.8.8),86.67
1689940477,3,lag-58.hcr09ftwbtxff.netops.charter.com (96.34.112.164),23.33
1689940477,4,lag-21.rcr01ftwptxzp.netops.charter.com (96.34.112.174),23.33
1689940477,5,lag-806.bbr01dllstx.netops.charter.com (96.34.2.32),16.67
1689940477,6,lag-801.prr01dllstx.netops.charter.com (96.34.3.69),13.33
1689940477,7,72.14.197.124,16.67
1689940477,8,108.170.225.149,23.33
1689940477,9,192.178.44.39,16.67
1689940477,10,dns.google (8.8.8.8),23.33
1689944136,7,72.14.197.124,3.33
1689946536,7,72.14.197.124,23.33
1689948637,7,72.14.197.124,3.33
1689948696,7,72.14.197.124,36.67
1689949296,7,72.14.197.124,3.33
1689950496,7,72.14.197.124,36.67
1689951036,7,72.14.197.124,6.67
1689951636,7,72.14.197.124,63.33
1689951696,7,72.14.197.124,3.33
1689951936,7,72.14.197.124,53.33
1689952237,7,72.14.197.124,53.33
1689952296,7,72.14.197.124,33.33
1689952836,7,72.14.197.124,73.33
1689953437,7,72.14.197.124,76.67

Not all of these sessions actually indicate problems with Charter Communications specifically, but the 1689940417 and 1689940477 ones certainly do. E.g.,

1689940477,3,lag-58.hcr09ftwbtxff.netops.charter.com (96.34.112.164),23.33
1689940477,4,lag-21.rcr01ftwptxzp.netops.charter.com (96.34.112.174),23.33
1689940477,5,lag-806.bbr01dllstx.netops.charter.com (96.34.2.32),16.67
1689940477,6,lag-801.prr01dllstx.netops.charter.com (96.34.3.69),13.33
1689940477,7,72.14.197.124,16.67
1689940477,8,108.170.225.149,23.33
1689940477,9,192.178.44.39,16.67
1689940477,10,dns.google (8.8.8.8),23.33

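This session shows that packet loss starts at the third hop (lag-58.hcr09ftwbtxff.netops.charter.com) and persists through every later hop to the destination, dns.google (8.8.8.8). If only an intermediate router were rate-limiting its own ICMP replies, the loss would show up at that hop alone and not at the destination. Ergo, it is not merely Control Plane Policing (CoPP).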
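That check can be sketched in a few lines of Python. This is only an illustration of the reasoning, with a hypothetical function name and input shape: a list of (hop, loss percentage) pairs for one mtr run. It returns the first hop from which loss continues unbroken to the final hop, or None when the final hop is clean (the pattern you would expect from CoPP on an intermediate router).

```python
def loss_reaches_destination(session):
    """session: list of (hop, loss_pct) tuples for one mtr run, in any order.

    Returns the first hop of the unbroken run of lossy hops that ends at
    the final hop, or None if the final hop shows no loss.
    """
    ordered = sorted(session)
    # No data, or the destination itself is clean: loss did not carry through.
    if not ordered or ordered[-1][1] <= 0:
        return None
    first = ordered[-1][0]
    # Walk back from the destination until a hop without loss is found.
    for hop, loss in reversed(ordered):
        if loss <= 0:
            break
        first = hop
    return first
```

For the 1689940477 session above, assuming hops 1 and 2 reported 0% loss (they are absent from the filtered output), the result is 3. For a run where only hop 7 shows loss and hop 10 is clean, the result is None.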

EDIT: And thanks to getting nerd-sniped here, I finally got around to writing that python script for analyzing all of the individual CSVs that mtr dumps out :)

https://codeberg.org/aspensmonster/packetloss_analysis
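To get at the "frequency and duration" part of the question, one option is to merge the start times of lossy runs into windows. The sketch below is not the linked script; the function name and the `interval` and `slack` parameters are assumptions for illustration. It treats two runs as part of the same window when they are at most `interval + slack` seconds apart, which matches a cron job firing once a minute with a little jitter.

```python
def outage_windows(lossy_times, interval=60, slack=5):
    """Merge start times (epoch seconds) of lossy runs into windows.

    Returns a list of (first_start, last_start, run_count) tuples.
    Two runs share a window when they are at most interval + slack
    seconds apart.
    """
    windows = []
    for t in sorted(set(lossy_times)):
        if windows and t - windows[-1][1] <= interval + slack:
            # Close enough to the previous run: extend the current window.
            first, _, count = windows[-1]
            windows[-1] = (first, t, count + 1)
        else:
            # Gap too large: start a new window.
            windows.append((t, t, 1))
    return windows
```

The number of windows gives the frequency, and `last_start - first_start + interval` is a rough estimate of each window's duration. Feeding it the three start times at which dns.google showed loss above (1689940356, 1689940417, 1689940477) yields a single window of three runs.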

2

Instead of rolling your own, I recommend using SmokePing, hence my comment question. It helped me in uncovering many occurrences where my internet access was degraded: Packet loss, high latency, IPv6 outage, (local) Cloudflare outage…

To run SmokePing, you need a Linux/Unix system. It could probably also run on an OpenWrt router, if you have that. Otherwise, a Raspberry Pi will do just fine. Setup is a little convoluted because of bespoke config file formats and whatnot, but is absolutely solid.

You can browse the “official” demo instance here.

In my configuration (derived from the default config), I have the following under *** Targets ***:

probe = FPing

menu = Top
title = Network Latency Grapher

+ targets
menu = Targets
title = Targets

++ googledns

menu = Google DNS
title = Google DNS via IPv4
host = 8.8.8.8

++ cfdns

menu = Cloudflare DNS
title = Cloudflare DNS via IPv4
host = 1.1.1.1

++ cfdns6

menu = Cloudflare DNS (IPv6)
title = Cloudflare DNS via IPv6
probe = FPing6
host = 2606:4700:4700::1111

There are many other probes available besides FPing (which uses the fping utility) and FPing6.

I have the SmokePing web UI running as a FastCGI server. It is wired to Nginx like this:

location = /smokeping/ {
    fastcgi_pass unix:/run/smokeping-fcgi.sock;
    include /etc/nginx/fastcgi_params;
}

The server is started using this unit file (which I may have copied from somewhere, I don’t remember):

[Unit]
Description=SmokePing FastCGI Service
After=network.target smokeping.service
Wants=smokeping.service

[Service]
StandardOutput=null
StandardError=journal
ExecStart=/usr/bin/spawn-fcgi -u smokeping -s /run/smokeping-fcgi.sock \
          -M 600 -n -U http -- /srv/http/smokeping/smokeping.fcgi
Restart=always

[Install]
WantedBy=multi-user.target

smokeping.fcgi contains this:

#!/bin/sh
exec /usr/bin/smokeping_cgi /etc/smokeping/config
1
  • 1
    This looks like a really neat way to go about the task, especially if you have a spare Raspberry Pi or other always-on machine lying around. It's certainly more comprehensive than my cron job mtr approach :) Commented Jul 24, 2023 at 15:14
