I have two AWS EC2 instances, running RHEL 6 and RHEL 8 respectively. I followed this procedure to migrate all the users from the old server to the new one. In the process I had to delete some lines from the .bak files created by the procedure, because they would have conflicted with default users that already exist on the new server. I also had to change the UID and GID of the ssm-user and ec2-user accounts on the new server, since they conflicted with existing users on the old server, and I was afraid that making changes on the old one would risk damaging something.
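For reference, the export step of that kind of migration is roughly like the following. This is only a sketch, not the exact commands from the procedure; the UID cutoff of 1000 and the /tmp/*.bak paths are assumptions (RHEL 6 often starts regular users at UID 500, RHEL 8 at 1000):

```shell
# Assumed cutoff between system accounts and regular users.
UGIDLIMIT=1000

# Export regular accounts (skipping the "nobody" entry, UID 65534)
# from /etc/passwd and /etc/group into .bak files for transfer.
awk -v L="$UGIDLIMIT" -F: '($3 >= L) && ($3 != 65534)' /etc/passwd > /tmp/passwd.bak
awk -v L="$UGIDLIMIT" -F: '($3 >= L) && ($3 != 65534)' /etc/group  > /tmp/group.bak
```

These .bak files are then edited (removing entries that conflict with the new server's defaults) and appended on the target.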
After all of this I rebooted the new server, but I could no longer access it over SSH. After many tries, the only difference I've found is that port 22 no longer seems to be open.
The output of the lsof command on both servers is shown below.
Server before migrating users
# lsof -i -P -n
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
chronyd 719 chrony 6u IPv4 20809 0t0 UDP 127.0.0.1:323
chronyd 719 chrony 7u IPv6 20810 0t0 UDP [::1]:323
NetworkMa 766 root 27u IPv4 22927 0t0 UDP <serverip>:68-><ip>:67
ssm-agent 1302 root 10u IPv4 51028 0t0 TCP <serverip>:41646-><ip>:443 (ESTABLISHED)
ssm-agent 1302 root 15u IPv4 28155 0t0 TCP <serverip>:47220-><ip>:443 (ESTABLISHED)
sshd 10532 root 5u IPv4 48231 0t0 TCP *:22 (LISTEN)
sshd 10532 root 7u IPv6 48233 0t0 TCP *:22 (LISTEN)
Server after migrating users
# lsof -i -P -n
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
chronyd 717 chrony 6u IPv4 20118 0t0 UDP 127.0.0.1:323
chronyd 717 chrony 7u IPv6 20119 0t0 UDP [::1]:323
NetworkMa 764 root 27u IPv4 22164 0t0 UDP <serverip>:68-><ip>:67
ssm-agent 1324 root 10u IPv4 29098 0t0 TCP <serverip>:36630-><ip>:443 (ESTABLISHED)
ssm-agent 1324 root 15u IPv4 27154 0t0 TCP <serverip>:58756-><ip>:443 (ESTABLISHED)
ssm-sessi 1449 root 16u IPv4 27463 0t0 TCP <serverip>:33702-><ip>:443 (ESTABLISHED)
I'm accessing the servers without a key, and I need to repeat this process on at least 8 more servers, so I would like to know how to solve this and how to avoid it when repeating the process on the other servers.
ssm-user is very likely the cause of your access trouble (and perhaps ec2-user as well). The AWS SSM management agent was installed on the EC2 server to run under that UID/GID, and any security-related subsystems (such as SELinux) were configured to expect it to run that way. It's likely the SSM agent could not start after the change to the UID/GID, and that agent is what provides your SSH access to the server.
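For the remaining servers, it is safer to detect UID collisions up front and leave system accounts such as ssm-user alone, renumbering the incoming entries instead. A minimal sketch of such a pre-check (the file names and sample entries here are made up for illustration):

```shell
# Stand-ins for the old server's exported passwd fragment and the
# new server's /etc/passwd (hypothetical sample data).
cat > /tmp/old_users <<'EOF'
appuser:x:1001:1001::/home/appuser:/bin/bash
EOF
cat > /tmp/new_passwd <<'EOF'
ssm-user:x:1001:1001::/home/ssm-user:/bin/bash
EOF

# First pass records the new server's UIDs; second pass reports any
# incoming entry whose UID is already taken.
awk -F: 'NR==FNR {seen[$3]=$1; next}
         ($3 in seen) {print $1 " collides with " seen[$3] " on UID " $3}' \
    /tmp/new_passwd /tmp/old_users
# prints: appuser collides with ssm-user on UID 1001
```

Any account this flags can be given a free UID/GID in the .bak file before appending it, so the new server's own accounts never need to be touched.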