3

I have found a ton of questions answered about debugging why one cannot connect via SSH, but they all seem to require that you can still access the system - or say that without that nothing can be done. In my case, I cannot access the system directly, but I do have access to the filesystem using a recovery console.

So this is the situation: My provider made some kernel update today and in the process also rebooted my server. For some reason, I cannot connect via SSH anymore, but instead get a ssh: connect to host mydomain.de port 22: Connection refused

I do not know whether sshd is just not running, or whether something (e.g. iptables) blocks my ssh connection attempts. I looked at the logfiles: none of the files in /var/log contain any mention of ssh, and /var/log/auth.log is empty. Before the kernel update, I could log in just fine and used certificates so that I would not need a password every time I connect from my local machine.

What I tried so far:

  1. I looked in /etc/rc*.d/ for a link to the /etc/init.d/ssh script and found none, so I expect that sshd is not started properly on boot. Since I cannot run any programs on my system, I cannot use update-rc.d to change this. I tried to make a link manually using ln -s /etc/init.d/ssh /etc/rc6.d/K09sshd and restarted the server - this did not fix the problem. I do not know whether it is possible to do it like this at all, whether it is correct to create the link in rc6.d, and whether the K09 is correct. I just copied that from apache.

  2. I also tried to change my /etc/iptables.rules file to allow everything:

# Generated by iptables-save v1.4.0 on Thu Dec 10 18:05:32 2009
*mangle
:PREROUTING ACCEPT [7468813:1758703692]
:INPUT ACCEPT [7468810:1758703548]
:FORWARD ACCEPT [3:144]
:OUTPUT ACCEPT [7935930:3682829426]
:POSTROUTING ACCEPT [7935933:3682829570]
COMMIT
# Completed on Thu Dec 10 18:05:32 2009
# Generated by iptables-save v1.4.0 on Thu Dec 10 18:05:32 2009
*filter
:INPUT ACCEPT [7339662:1665166559]
:FORWARD ACCEPT [3:144]
:OUTPUT ACCEPT [7935930:3682829426]
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m tcp --dport 25 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 993 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 143 -j ACCEPT
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p tcp -m tcp --dport 80 -j ACCEPT
-A INPUT -p tcp --dport 8080 -s localhost -j ACCEPT
-A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables denied: " --log-level 7
-A INPUT -j ACCEPT
-A FORWARD -j ACCEPT
-A OUTPUT -j ACCEPT
COMMIT
# Completed on Thu Dec 10 18:05:32 2009
# Generated by iptables-save v1.4.0 on Thu Dec 10 18:05:32 2009
*nat
:PREROUTING ACCEPT [101662:5379853]
:POSTROUTING ACCEPT [393275:25394346]
:OUTPUT ACCEPT [393273:25394250]
COMMIT
# Completed on Thu Dec 10 18:05:32 2009

I am not sure this is done correctly or has any effect at all. I also did not find any mention of iptables in any file in /var/log.

So what else can I do? Thank you for your help.

P.S.: After adding the crontab line as suggested by lain, I can find the following in the file /var/log/auth.log:

Mar  7 21:13:58 mysubdomain sshd[64900]: debug1: sshd version OpenSSH_5.3p1 Debian-3ubuntu5
Mar  7 21:13:58 mysubdomain sshd[64900]: debug1: read PEM private key done: type RSA
Mar  7 21:13:58 mysubdomain sshd[64900]: debug1: Checking blacklist file /usr/share/ssh/blacklist.RSA-2048
Mar  7 21:13:58 mysubdomain sshd[64900]: debug1: Checking blacklist file /etc/ssh/blacklist.RSA-2048
Mar  7 21:13:58 mysubdomain sshd[64900]: debug1: private host key: #0 type 1 RSA
Mar  7 21:13:58 mysubdomain sshd[64900]: debug1: read PEM private key done: type DSA
Mar  7 21:13:58 mysubdomain sshd[64900]: debug1: Checking blacklist file /usr/share/ssh/blacklist.DSA-1024
Mar  7 21:13:58 mysubdomain sshd[64900]: debug1: Checking blacklist file /etc/ssh/blacklist.DSA-1024
Mar  7 21:13:58 mysubdomain sshd[64900]: debug1: private host key: #1 type 2 DSA
2
  • Which OS/distro are you using ?
    – user9517
    Commented Mar 7, 2011 at 11:56
  • I am using Ubuntu 10.04 LTS. The newly installed kernel is supposed to be 2.6.18-028stab079.1.
    – olrehm
    Commented Mar 7, 2011 at 13:10

6 Answers

5

Connection refused suggests that sshd is not running. Try running sshd in debug mode from the command line to see if there are any error messages.

/usr/sbin/sshd -f /etc/ssh/sshd_config -D -d

EDIT:

As you don't have access to run the above, try putting it in a @reboot cron job.

Add

@reboot root /bin/mkdir -p -m0755 /var/run/sshd && /usr/sbin/sshd -f /etc/ssh/sshd_config -d >/var/log/sshd_debug 2>&1

into /etc/crontab

Ubuntu sshd requires that the /var/run/sshd directory exists.
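
From a recovery console the crontab has to be edited through the mount point rather than at /etc/crontab. The following is only a sketch under that assumption: the function name add_sshd_debug_job is made up for illustration, and the mount point (e.g. /repair) is passed as an argument. It writes the redirection as `>file 2>&1` because cron usually runs jobs with /bin/sh, where `&>` is not guaranteed to behave as it does in bash.

```shell
# Sketch: append the @reboot debug job to the crontab of a filesystem
# mounted by a recovery system. The mount point is an assumption and is
# passed in as the first argument (for example /repair).
add_sshd_debug_job() (
    root=$1
    crontab="$root/etc/crontab"
    job='@reboot root /bin/mkdir -p -m0755 /var/run/sshd && /usr/sbin/sshd -f /etc/ssh/sshd_config -d >/var/log/sshd_debug 2>&1'

    # Do nothing if exactly this line is already present.
    grep -qxF "$job" "$crontab" 2>/dev/null && return 0

    printf '%s\n' "$job" >> "$crontab"
)
```

It would be called as add_sshd_debug_job /repair, and the line should be removed again once sshd starts normally.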

16
  • I suppose the asker has no access to interactive shell, right Ole? He would have to wrap this command in some dummy script run on system start and log the output. Commented Mar 7, 2011 at 12:49
  • @Karol Piczak: OP says they can read log files and change Firewall rules so why can they not run a command interactively ?
    – user9517
    Commented Mar 7, 2011 at 12:52
  • From what I get he's using a recovery distro, so basically full access to the filesystem while the original system is offline. Maybe I misinterpreted this part though: "but I do have access to the filesystem using a recovery console". Ole could clarify this. Commented Mar 7, 2011 at 12:59
  • @Karol Piczak: You are probably right but it's unclear.
    – user9517
    Commented Mar 7, 2011 at 13:02
  • It is exactly as Karol understood it: I am using some recovery system set up by my provider which mounts my filesystem, so I can read and change files, but I cannot check whether stuff is running on my system or do any manual starting of programs.
    – olrehm
    Commented Mar 7, 2011 at 13:04
1

You can try adding a logging rule for SSH traffic to your firewall rules and restarting the machine. This way you should be able to verify whether your packets reach the destination; if they do, the firewall is not the problem and sshd itself is not working.
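
As a rough sketch of what such a logging rule could look like in an iptables-save style rules file: the helper below inserts a LOG rule directly before the first rule that accepts port 22. The function name add_ssh_log_rule and the prefix "ssh attempt: " are invented for this example, and where the kernel log messages end up depends on the syslog setup of the machine.

```shell
# Sketch: insert a LOG rule for port 22 into an iptables-save style file,
# directly before the first rule that accepts port 22.
# The log prefix "ssh attempt: " is an arbitrary choice for this example.
add_ssh_log_rule() (
    rules=$1

    # Skip if the rule was already added by an earlier run.
    grep -q 'ssh attempt: ' "$rules" && return 0

    tmp=$(mktemp) || exit 1
    awk '
        !inserted && /--dport 22 / && /-j ACCEPT/ {
            print "-A INPUT -p tcp -m tcp --dport 22 -j LOG --log-prefix \"ssh attempt: \" --log-level 4"
            inserted = 1
        }
        { print }
    ' "$rules" > "$tmp" && cat "$tmp" > "$rules"
    rm -f "$tmp"
)
```

With the filesystem mounted under /repair, the call would be add_ssh_log_rule /repair/etc/iptables.rules. The LOG rule has to come before the ACCEPT rule, because a packet that is accepted stops traversing the chain.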

As to this part:

I tried to make a link manually using ln -s /etc/init.d/ssh /etc/rc6.d/K09sshd and restarted the server - this did not fix the problem.

Runlevel 6 is for reboot, so that's not what you're interested in. And K in K09sshd is for kill. ;-) Try using the solution mentioned by odk and see what happens.

1

I tried to make a link manually using ln -s /etc/init.d/ssh /etc/rc6.d/K09sshd and restarted the server - this did not fix the problem

You almost got it right. K* names cause a service to be stopped. S* names cause the service to be started.

rc6.d is for controlling subsystems when entering runlevel 6, which means "reboot".

You want to have the service started at runlevels 3 (normal multiuser) and 5 (multiuser + X). Make /etc/rc3.d/S90sshd and /etc/rc5.d/S90sshd symlinks to /etc/init.d/sshd. This should cause sshd to be started on system boot.
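
A minimal sketch of creating such links on a filesystem that is mounted somewhere other than / (as in a recovery system). The function name make_start_links is hypothetical, and the mount point, the init script name and the runlevels are all parameters, because they differ between systems (Debian and Ubuntu, for instance, traditionally boot into runlevel 2 rather than 3). The link target is written relative to the rcN.d directory so that it resolves the same way whether the tree is mounted at /repair or at /.

```shell
# Sketch: create S90<script> start links for the given runlevels inside a
# filesystem mounted at <root>. All arguments are assumptions supplied by
# the caller: make_start_links <root> <init-script-name> <runlevel>...
make_start_links() (
    root=$1
    script=$2
    shift 2

    for level in "$@"; do
        dir="$root/etc/rc$level.d"
        # Skip runlevels that do not exist on this system.
        [ -d "$dir" ] || continue
        # Relative target: resolved from the link's own directory, so it
        # stays valid regardless of where the filesystem is mounted.
        ln -sf "../init.d/$script" "$dir/S90$script"
    done
)
```

For example, make_start_links /repair ssh 2 3 5 would create /repair/etc/rc2.d/S90ssh and so on, each pointing to ../init.d/ssh.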

3
  • What does the number mean? And my init.d script is named ssh, but still seems to be made for the daemon from what I can tell by looking at it.
    – olrehm
    Commented Mar 7, 2011 at 13:12
  • @olrehm The rc#.d number only has a couple of special numbers, the rest mean whatever you want them to mean (6: reboot 1: single user mode 0:halt, some distributions use "S" for all system scripts like drive encryption and driver loading). Check /etc/inittab for the line reading id:#:initdefault: to see which # you should be using here. The number after S in the symlink is just used to sort the scripts in the order they should be started.
    – DerfK
    Commented Mar 7, 2011 at 13:41
  • @olrehm If you have /etc/init.d/ssh and no /etc/init.d/sshd, then the former seems a good candidate for a symlink target :). When system enters any given runlevel it runs all the K* scripts for this runlevel (which stop services) in the order they are numbered (from 00 to 99), then all S* scripts (which start services). Scripts for runlevel X reside in /etc/rcX.d/ This is done by /etc/rc on Red Hat systems. Commented Mar 7, 2011 at 16:23
1

I got it. First of all, big thanks to everybody that was trying to help!

For those of you browsing the archives with a similar problem: The initial problem was that the /etc/rc*.d/ links were not set, thus sshd was not run on startup. My numerous attempts at fixing this were spoiled by the way ln works: when I created the links with

$ cd /repair/etc/rc3.d/
$ ln -s ../init.d/ssh S20sshd

and similar for all runlevels, the links created looked totally perfect in the recovery mode. However, when I was finally able to log back in using the hack described above, I could see that all the links were broken, i.e.

$ ls /etc/rc3.d/
...
lrwxrwxrwx 1 root root  10 2011-03-08 09:51 S20sshd -> init.d/ssh
...

so the relative link now points somewhere wrong.
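
Broken links of this kind can be spotted from the recovery system before rebooting. The sketch below assumes the filesystem is mounted at some directory such as /repair; the function name find_broken_rc_links is made up for this example. It interprets each link the way the booted system would (absolute targets relative to the real root, relative targets from the link's own directory) and prints the ones that would dangle. It only follows one level of symlink, so chains of links are not fully checked.

```shell
# Sketch: list rc*.d symlinks that will dangle once the filesystem mounted
# at <root> is booted normally as /.
find_broken_rc_links() (
    root=$1

    for link in "$root"/etc/rc*.d/*; do
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        case $target in
            # Absolute target: on the booted system it starts at the real root.
            /*) resolved="$root$target" ;;
            # Relative target: resolved from the directory containing the link.
            *)  resolved="$(dirname "$link")/$target" ;;
        esac
        # Report the link with its path as seen from the booted system.
        [ -e "$resolved" ] || printf '%s -> %s\n' "${link#"$root"}" "$target"
    done
)
```

Running find_broken_rc_links /repair on a tree containing the link shown above would report it, since init.d/ssh is looked up inside rc3.d, where no such directory exists.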

To fix it, I added to the top of another init script the line (ATTENTION: HACK!!!)

/etc/init.d/ssh start

which worked fine and allowed me to log back in. I then removed all the broken links and created new ones using update-rc.d.

Again thank you very much for all the help!

0

I tried to make a link manually using ln -s /etc/init.d/ssh /etc/rc6.d/K09sshd and restarted the server - this did not fix the problem

Try adding link not to the /etc/rc6.d but to the rc3.d ( ln -s /etc/init.d/ssh /etc/rc3.d/S99sshd )

5
  • Okay, thank you, I was just guessing :-) Well now I did it as suggested by @odk and removed my old link, but it is still not working, i.e. I still get the same error message. Is there any way I can verify it started correctly, and make it write to some log?
    – olrehm
    Commented Mar 7, 2011 at 13:06
  • /etc/init.d/ssh is a normal shell script. You could edit it and add something like echo 'starting' >> /home/<you>/sshd_script_log to check whether the start script is executed at all.
    – odk
    Commented Mar 7, 2011 at 13:22
  • I added the line as suggested - and after a reboot the logfile does not exist.. This seems to indicate the script is never run - and I just had an idea why: When I created the link, I was on the repair console, where my filesystem is mounted as /repair, which means it is linking to /repair/etc/init.d/ssh. I don't know exactly how these links work: When the filesystem is mounted normally, i.e. without /repair, does the link then point to /etc/init.d/ssh?
    – olrehm
    Commented Mar 7, 2011 at 20:09
  • Okay, I fixed this by using a relative path (../init.d/ssh). But that does not change anything - it seems the script is not called anyway.
    – olrehm
    Commented Mar 7, 2011 at 20:19
  • I have been trying some more: I have checked rc0.d to rc6.d and made exactly the same softlinks as for nginx. That is S20ssh/K20ssh, they have the same permissions, the same owner and use the same relative paths. I then added, directly below the comment block (provides, Required-Start, Required-Stop...) the same line to write to two different log files in the same folder with 777 permissions. When I start the server, then go back to repair mode, the nginx script has written its line - the ssh script has not. Can it maybe have to do with: this comment block or with using SOFT links?
    – olrehm
    Commented Mar 8, 2011 at 9:14
0

You wrote that you have write access to file system and can change files and modify firewall rules.

I assume that you can also perform a reboot of your system into normal operating mode.

In such a case, I suggest setting aside the SSH server at first and instead exposing a simple remote shell on a high port, using the netcat utility.

You must:

  1. check whether your system has netcat installed at all and which version (e.g. Fedora derived distributions have a completely different version than Debian ones, and they significantly differ in command line options)
  2. if not, statically compile it from sources (a static binary won't require any additional libraries) and upload it to your filesystem. Compile it with GAPING_SECURITY_HOLE compile option so that the "-e" option is available
  3. choose a port number it should listen on (say, 1234) and add a firewall rule that only allows access to that port from your client IP (otherwise you'd be opening a gaping security hole)
  4. make it start from an init script (rc.local?) and run in background exposing the Bash shell, e.g.: nohup nc -l -p 1234 -e '/bin/bash' &

This way, you'll get a simple remote root shell with no authentication whatsoever (dangerous!), to which you can connect remotely using netcat or telnet (nc your.server.address 1234).

Just remember to kill it and remove it from system startup scripts once you get your SSH server working!

2
  • Thanks for this suggestion. I checked, there is a binary nohup in /usr/bin. The version is nohup (GNU coreutils) 7.4. I added a line to rc.local, opened the port using the iptables.rules file and then tried to connect (after booting the system, of course). netcat with the command line options as suggested just returns on the client, without saying anything. How do I have to format the address? I used nc domain.com 7711. Is the -e option maybe not available because I used the installed binary and did not compile it manually?
    – olrehm
    Commented Mar 8, 2011 at 13:04
  • @olrehm, first check on an interactive shell session on your own machine that this netcat binary's nc -l -p 1234 -e '/bin/bash' invocation actually works without errors. I mean copy the nc binary to your host and test it there. Your version of nc might need slightly different arguments. When you get it to work and verify that you indeed can connect from a client, only then add it to rc.local on the remote host. Commented Mar 29, 2011 at 7:32
