I have a Raspberry Pi 4 with an SSD running Raspberry Pi OS Lite. I run Docker containers for Home Assistant, Pi-Hole, and NetDaemon. I've been experiencing random hangs where the whole system becomes unresponsive.

Here's a detailed breakdown of what I've tried and observed:

Symptoms Observed:

  • The first sign of an issue is usually the internet going offline, which suggests Pi-Hole has stopped answering DNS queries.

  • My Home Assistant app shows "disconnected," and automations stop working, which suggests that container has stopped responding as well.

  • I cannot log into Pi-Hole.

  • When trying to connect via SSH, I receive a "connection closed/refused" error (I can't remember the exact error message).

  • It is quicker and easier to just pull the power and restart the system.

Initial Suspicion:

  • I initially suspected a brute force attack on SSH might be causing the issue.

  • Action Taken: Installed fail2ban to mitigate potential SSH attacks.

  • Outcome: Seemed to improve initially, but the issue resurfaced after a few weeks.

OS Upgrade:

  • Action Taken: Upgraded the OS to Bullseye.

  • Outcome: Continued to experience the same problem, and additionally encountered issues switching from a 32-bit to a 64-bit kernel.

Recent Activity:

  • I went on holiday, and within three weeks the system hung six times, each time requiring a manual restart.

  • Action Taken: Made a backup of my config and data, then reflashed the SSD with a new image of Bookworm 64-bit Lite.

  • Outcome: Reinstalled everything on Sunday, but the system hung again on Wednesday morning between 4am and 5am for no apparent reason.

Request for Assistance:

I am looking for guidance on how to effectively troubleshoot this issue. From what I understand, Linux systems are usually very stable, and some people run them for years without needing to reboot. Here are some specific questions and areas where I need help:

  • Hardware Issues:

  • Could this be a hardware problem with my Raspberry Pi 4 or SSD?

  • Log Files:

  • Are there specific logs I should check to identify what is causing these hangs?

  • Telemetry and Monitoring:

  • Are there tools or methods to log telemetry data such as CPU temperatures, memory usage, etc., to help narrow down the problem?
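
To make the telemetry question concrete, this is roughly the kind of logging I have in mind. It is only a minimal Python sketch, not something I have running yet. It assumes the CPU temperature is exposed in millidegrees Celsius at /sys/class/thermal/thermal_zone0/temp and that /proc/meminfo has a MemAvailable line, which I believe is the case on Raspberry Pi OS. The output file name telemetry.csv and all function names are placeholders of my own.

```python
import csv
import os
import time
from datetime import datetime
from pathlib import Path

# Assumed locations; adjust if your system exposes these elsewhere.
TEMP_PATH = Path("/sys/class/thermal/thermal_zone0/temp")  # millidegrees C
MEMINFO_PATH = Path("/proc/meminfo")
LOG_PATH = Path("telemetry.csv")  # hypothetical output file


def parse_temp_c(raw: str) -> float:
    """Convert the sysfs reading (e.g. '48312') to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def parse_mem_available_kb(meminfo: str) -> int:
    """Extract the MemAvailable value (in kB) from /proc/meminfo text."""
    for line in meminfo.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


def sample() -> list:
    """Take one timestamped reading of CPU temperature and free memory."""
    return [
        datetime.now().isoformat(timespec="seconds"),
        parse_temp_c(TEMP_PATH.read_text()),
        parse_mem_available_kb(MEMINFO_PATH.read_text()),
    ]


def main(interval_s: int = 60) -> None:
    """Append one CSV row per interval until the process is stopped."""
    new_file = not LOG_PATH.exists()
    with LOG_PATH.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["time", "cpu_temp_c", "mem_available_kb"])
        while True:
            writer.writerow(sample())
            # Flush and fsync so the last rows are more likely to
            # survive a hang followed by pulling the power.
            f.flush()
            os.fsync(f.fileno())
            time.sleep(interval_s)
```

Calling main() (for example from a small service that starts at boot) would begin logging. The idea is that the last few rows before a hang would show whether temperature or memory was trending badly just before the system stopped responding.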

Also posted here: https://www.reddit.com/r/raspberry_pi/comments/1de9q05/rpi_4_keeps_hanging_docker_services_ssh_fail/
