
Something strange is happening: I cannot start sshd via systemctl:

SERVER:~ # systemctl status sshd
● sshd.service - OpenSSH Daemon
   Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: disabled)
   Active: inactive (dead)

May 29 18:31:38 linux-uw9h systemd[1]: Stopped OpenSSH Daemon.
May 29 18:45:19 SERVER systemd[1]: Stopped OpenSSH Daemon.
May 29 18:48:09 SERVER systemd[1]: Stopped OpenSSH Daemon.
May 29 19:04:23 SERVER systemd[1]: Stopped OpenSSH Daemon.
May 29 19:09:51 SERVER systemd[1]: Stopped OpenSSH Daemon.
May 29 19:11:22 SERVER systemd[1]: Stopped OpenSSH Daemon.
May 29 19:12:53 SERVER systemd[1]: Stopped OpenSSH Daemon.
May 29 19:13:58 SERVER systemd[1]: Stopped OpenSSH Daemon.
May 29 19:15:09 SERVER systemd[1]: Stopped OpenSSH Daemon.
May 29 19:24:41 SERVER systemd[1]: Stopped OpenSSH Daemon.
SERVER:~ #
SERVER:~ # systemctl restart sshd

... it just hangs.

But if I manually run "/usr/sbin/sshd", it starts fine!

The question: how can I debug this issue?

SERVER:~ # rpm -qf /usr/sbin/sshd
openssh-7.2p2-74.16.3.x86_64
SERVER:~ # rpm -V openssh-7.2p2-74.16.3.x86_64
SERVER:~ # echo $?
0
SERVER:~ #
  • dmesg shows nothing special
  • /var/log/* shows nothing special
  • journalctl -xe shows nothing special
  • zypper in -f openssh didn't help
  • no filesystem is 100% full
  • the console doesn't show any hardware issues
  • rebooted twice already
  • networks/IPs look OK and work when sshd is running
  • tried "systemctl disable sshd" and then enabling it again; that didn't help

It looks like systemctl cannot start it, but I can start it manually.

SLES 12.3.

UPDATE on 2019 May 30:

The cksum of the sshd.service file is the same as on other, working nodes:

SERVER:~ # cat /usr/lib/systemd/system/sshd.service
[Unit]
Description=OpenSSH Daemon
After=network.target

[Service]
Type=notify
EnvironmentFile=-/etc/sysconfig/ssh
ExecStartPre=/usr/sbin/sshd-gen-keys-start
ExecStartPre=/usr/sbin/sshd -t $SSHD_OPTS
ExecStart=/usr/sbin/sshd -D $SSHD_OPTS
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=always
TasksMax=infinity

[Install]
WantedBy=multi-user.target
SERVER:~ # ls -lah /usr/lib/systemd/system/sshd.service
-rw-r--r-- 1 root root 361 Jan 30 15:46 /usr/lib/systemd/system/sshd.service
SERVER:~ #

In the worst case I will have to set up a cron job that checks sshd every minute and starts it if systemctl cannot.
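A minimal sketch of what such a stopgap watchdog might look like, assuming pgrep (from procps) is available. The script path in the cron comment, the --run switch, the SSHD_BIN variable and the function names are all hypothetical, made up for illustration:

```shell
#!/bin/sh
# Hypothetical stopgap watchdog: start sshd by hand when no sshd process exists.
# Assumed cron entry (the path is made up for illustration):
#   * * * * * /usr/local/sbin/sshd-watchdog.sh --run

# Path of the daemon; overridable so the script can be tried out safely.
SSHD_BIN="${SSHD_BIN:-/usr/sbin/sshd}"

# Succeeds if at least one process named exactly "sshd" exists.
sshd_running() {
    pgrep -x sshd >/dev/null 2>&1
}

# Start sshd directly (bypassing systemctl) only when it is not running.
ensure_sshd() {
    if sshd_running; then
        echo "sshd already running"
    else
        echo "sshd not running, starting it"
        "$SSHD_BIN"
    fi
}

# Only act when asked to, so sourcing the file has no side effects.
if [ "${1:-}" = "--run" ]; then
    ensure_sshd
fi
```

An sshd started this way is not supervised by the sshd.service unit, so this only works around the symptom and does not explain why systemctl hangs.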

UPDATE on 2019 May 31:

SERVER:~ # strace systemctl restart sshd
execve("/usr/bin/systemctl", ["systemctl", "restart", "sshd"], [/* 57 vars */]) = 0
brk(0)                                  = 0x562494677000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=102550, ...}) = 0
...
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1H\0\0\0\3\0\0\0\206\0\0\0\1\1o\0!\0\0\0", 24}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/systemd1/job/22"..., 200}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 200
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\2\1\0012\0\0\0\4\0\0\0\17\0\0\0\5\1u\0\2\0\0\0", 24}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"\10\1g\0\1o\0\0-\0\0\0/org/freedesktop/sys"..., 58}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 58
sendmsg(3, {msg_name(0)=NULL, msg_iov(2)=[{"l\1\4\0019\0\0\0\3\0\0\0\240\0\0\0\1\1o\0-\0\0\0/org/fre"..., 176}, {"\35\0\0\0org.freedesktop.systemd1.Uni"..., 57}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 233
recvmsg(3, 0x7ffc4c442360, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
ppoll([{fd=3, events=POLLIN}], 1, {24, 999977000}, NULL, 8) = 1 ([{fd=3, revents=POLLIN}], left {24, 999901280})
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\2\1\1\10\0\0\0\5\0\0\0\17\0\0\0\5\1u\0\3\0\0\0", 24}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"\10\1g\0\1v\0\0\1b\0\0\0\0\0\0", 16}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 16
recvmsg(3, 0x7ffc4c442410, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
ppoll([{fd=3, events=POLLIN}], 1, NULL, NULL, 8

and it just hangs here. I pressed CTRL+C after a few hours. sshd isn't starting via systemctl, only manually, which is strange.
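The last recvmsg lines mention /org/freedesktop/systemd1/job/22 and the final ppoll has no timeout, which looks like systemctl waiting for a queued job to finish. One way to check what might be holding that job up is to look at systemd's job queue from a second terminal. A minimal sketch, assuming `systemctl list-jobs --no-legend` prints one job per line in the column order JOB UNIT TYPE STATE; the function name running_jobs is hypothetical:

```shell
#!/bin/sh
# Print only the jobs systemd is executing right now. A job in state
# "waiting" (possibly the sshd restart) may be ordered after one of these.
# Assumes the column order JOB UNIT TYPE STATE.
running_jobs() {
    systemctl list-jobs --no-legend --no-pager | awk '$4 == "running"'
}
```

Calling running_jobs while the restart hangs would list any long-running job; "systemctl --no-block restart sshd" queues the restart without waiting for it, so the shell stays usable in the meantime.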

  • Please append the content of the /usr/lib/systemd/system/sshd.service file to your question. Try to execute the sshd server the same way systemd does from the service. Commented May 29, 2019 at 19:54
  • You could try systemd-analyze log-level debug, then try again and look at the log messages; it might help distinguish whether systemd has a problem spawning sshd or sshd has a problem after it is spawned. When the sshd process hangs, does it use 100% CPU in top? If not, you could probably get a kernel backtrace from sudo cat /proc/PID/stack.
    – sourcejedi
    Commented May 29, 2019 at 20:33
  • systemctl cat sshd.service might be a better way to dump the service file, e.g. in case there is a drop-in file that overrides it to do something wrong.
    – sourcejedi
    Commented May 29, 2019 at 20:33
  • Pssst!
    – JdeBP
    Commented May 29, 2019 at 23:44
  • Can you try strace, e.g. strace systemctl restart sshd, and paste at least the last 10-15 lines showing where it gets stuck?
    – asktyagi
    Commented May 30, 2019 at 2:45

1 Answer


You can test this with a self-written service file: place it in /etc/systemd/system, call it my-ssh.service, and give it this content:

# /usr/lib/systemd/system/sshd.service
[Unit]
Description=OpenSSH server daemon
After=network.target

[Service]
Type=notify
#EnvironmentFile=-/etc/sysconfig/sshd
#ExecStart=/usr/sbin/sshd -D $OPTIONS $CRYPTO_POLICY
ExecStart=/usr/sbin/sshd -Dd
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
RestartSec=42s

[Install]
WantedBy=multi-user.target

I took the above service from one of my Fedora stations, replaced the ExecStart line, and added -d for debug output. Create a file called /etc/systemd/system/my-ssh.service, put the above snippet into it, and reload systemd with

systemctl daemon-reload 

and then try to run the service with

systemctl start my-ssh ; journalctl -f --unit=my-ssh

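and watch the log output that journalctl -f --unit=my-ssh prints. If systemctl start hangs here as well, the journalctl after the semicolon never runs, so start it from a second terminal instead.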

  • Actually I did a "zypper up" before trying the sshd debug mode described here, but lol... after the zypper up, sshd restarts via systemctl! So my best guess is that some dependency RPM was corrupted somehow. Many thanks, accepting this as the answer, since it was the only one posted.
    – niving6473
    Commented Jun 3, 2019 at 14:50
