Bug 1585325

Summary: x86, ppc64le: SSH Key Generation fails when run as systemd service on RHEL 7.6 VMs
Product: Red Hat Enterprise Linux 7 Reporter: Karen Mezick <kmezick>
Component: selinux-policyAssignee: Lukas Vrabec <lvrabec>
Status: CLOSED ERRATA QA Contact: Milos Malik <mmalik>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.6CC: crosa, ddepaula, fweimer, jen, ldoktor, lvrabec, mgrepl, mmalik, mrezanin, plautrba, ssekidde, wainersm, xiliang, ymankad
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-30 10:05:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Karen Mezick 2018-06-01 21:09:59 UTC
Description of problem:

An error occurs during SSH key generation (when run as a systemd service only) on a x86 or ppc64le RHEL 7.6 VMs:

  ->  [[1;31mFAILED[0m] Failed to start OpenSSH Server Key Generation. See 'systemctl status sshd-keygen.service' for details.

Commands:   (1) systemctl status sshd-keygen.service
            (2) systemctl status sshd.service       
                
This issue is causing all avocado-vt tests to fail on x86 and ppc64le RHEL 7.6 VMs with qemu-kvm-rhev-2.12.0* installed. 

Please Note: This issue is *not* occurring on aarch64 RHEL 7.6 VMs.

Please Note: If the  /usr/sbin/sshd-keygen command is run first to create the keys then execution of the 'systemctl status sshd-keygen.service' command does not produce the above error.

Versions:
    Red Hat Enterprise Linux Server release 7.6 Beta (Maipo)
    KVM version: 3.10.0-889.el7.x86_64
    KVM userspace version: 2.12.0 (qemu-kvm-rhev-2.12.0-2.el7.kmezick201805301602)
    openssh-server-7.4p1-16.el7.x86_64

$ systemctl status sshd-keygen.service
  • sshd-keygen.service - OpenSSH Server Key Generation
    Loaded: loaded (/usr/lib/systemd/systeMsshd-keygen.service_ static_vendor preset: disabled)
    Active: failed (Result: exit-code) since Thu 2018-05-31 19:23:37 EDT_3min 3s ago
    Process: 25639 EiecStart=/usr/sbin/sshd-keygen (code=exited, status=1/FAILURE)
    Main PID: 25639 (code=exited, status=1/FAILURE)

 0y 31 19:23:37 localhost.localdomain systemd[1]: Starting OpenSSH Server Key Generation...
 0y 31 19:23:37 localhost.localdomain sshd-keygen[25639]: Generating SSH2 RSA host key: /usr/sbin/sshd-keygen: line 52: 25643 Killed  $KEYGEN -q -t rsa -f $RSA_KEY -C '' -N '' &>/dev/null
 0y 31 19:23:37 localhost.localdomain sshd-keygen[25639]: [FAILED]
 0y 31 19:23:37 localhost.localdomain systemd[1]: sshd-keygen.service: main process exited, code=exited, status=1/FAILURE
 0y 31 19:23:37 localhost.localdomain systemd[1]: Failed to start OpenSSH Server Key Generation.
 0y 31 19:23:37 localhost.localdomain systemd[1]: Unit sshd-keygen.service entered failed state.
 0y 31 19:23:37 localhost.localdomain systemd[1]: sshd-keygen.service failed.

However, when running this manually (/usr/sbin/sshd-keygen):

Generating SSH2 RSA host key:                              [  OK   ]
Generating SSH2 ECDSA host key:                            [  OK   ]
Generating SSH2 ED25519 host key:                          [  OK   ]

After that the error does not occur:

$ systemctl start sshd.service

 • sshd.service - OpenSSH server daemon
    Loaded: loaded (/usr/lib/systemd/systeMsshd.service_ enabled_ vendor preset: enabled)
    Active: active (running) since Thu 2018-05-31 19:30:28 EDT_ 20s ago
      Docs: man:sshd(8)
               man:sshd_config(5)
  Main PID: 25756 (sshd)
     Tasks: 1
    CGroup: /system.slice/sshd.service
           L25756 /usr/sbin/sshd _D

  0y 31 19:30:28 localhost.localdomain sshd[25756]: debug3: oom_adjust_setup
  0y 31 19:30:28 localhost.localdomain sshd[25756]: debug1: Set /proc/self/oom_score_adj from 0 to -1000
  0y 31 19:30:28 localhost.localdomain sshd[25756]: debug2: fd 3 setting O_NONBLOCK
  0y 31 19:30:28 localhost.localdomain sshd[25756]: debug1: Bind to port 22 on 0.0.0.0.
  0y 31 19:30:28 localhost.localdomain sshd[25756]: Server listening on 0.0.0.0 port 22.
  0y 31 19:30:28 localhost.localdomain sshd[25756]: debug2: fd 4 setting O_NONBLOCK
  0y 31 19:30:28 localhost.localdomain sshd[25756]: debug3: sock_set_v6only: set socket 4 IPlJ6_lJ6ONLY
  0y 31 19:30:28 localhost.localdomain sshd[25756]: debug1: Bind to port 22 on ::.
  0y 31 19:30:28 localhost.localdomain sshd[25756]: Server listening on :: port 22.
  0y 31 19:30:28 localhost.localdomain systemd[1]: Started OpenSSH server daemon.


Version-Release number of selected component (if applicable):

    Red Hat Enterprise Linux Server release 7.6 Beta (Maipo)
    KVM version: 3.10.0-889.el7.x86_64
    KVM userspace version: 2.12.0 (qemu-kvm-rhev-2.12.0-2.el7.kmezick201805301602)
    openssh-server-7.4p1-16.el7.x86_64

How reproducible: Consistently reproducible


Steps to Reproduce:

1. With Virt-Manager create a VM using the latest x86 rhel7 compose (rhel7.6): http://download-node-02.eng.bos.redhat.com/composes/latest-RHEL7/compose/Server/x86_64/os/

2. Run these commands to see the errors described above:

       systemctl status sshd-keygen.service

       systemctl status sshd.service

3. Run /usr/sbin/sshd-keygen to create the keys then run 'systemctl status sshd-keygen.service' again and observe that the error no longer occurs.

Actual results:

   Error: [[1;31mFAILED[0m] Failed to start OpenSSH Server Key Generation

Expected results: 

   SSH keys are generated and OpenSSH server daemon is started

Additional info:

Comment 2 Jakub Jelen 2018-06-04 08:29:54 UTC
> If the  /usr/sbin/sshd-keygen command is run first to create the keys then execution of the 'systemctl status sshd-keygen.service' command does not produce the above error.

If the sshd-keygen is executed manually, the keys are created and the service is not ran at all from systemd (see the content of the service file):

  ConditionFileNotEmpty=|!/etc/ssh/ssh_host_rsa_key

We did not update OpenSSH in RHEL7.6 so this must be something changed in systemd, in openssl or elsewhere.

Also I can not reproduce it on my virtual machine with the latest repo with the following test:

/CoreOS/openssh/Sanity/autocreate-server-keys-sanity-test

I have the following configuration of my VM:

Linux 3.10.0-862.3.2.el7.x86_64 #1 SMP Tue May 15 18:22:15 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
openssh-7.4p1-16.el7.x86_64
systemd-219-57.el7.x86_64
openssl-1.0.2k-12.el7.x86_64

Can you modify the sshd-keygen in your installation to produce more verbose logs (set -x) and rerun the test to get some more debug information?

Comment 3 Frank Liang 2018-06-05 06:23:03 UTC
Looks like I hit the same problem on my xen hvm guest.

I installed RHEL-7.6-20180528.n.0 and sshd did not start too.
After ran 'sshd-keygen' manually, then sshd can start normally.

[root@dhcp-3-19 ~]# systemctl status sshd -l
● sshd.service - OpenSSH server daemon
   Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Tue 2018-06-05 14:08:49 HKT; 7s ago
     Docs: man:sshd(8)
           man:sshd_config(5)
  Process: 3797 ExecStart=/usr/sbin/sshd -D $OPTIONS (code=exited, status=1/FAILURE)
 Main PID: 3797 (code=exited, status=1/FAILURE)

Jun 05 14:08:49 dhcp-3-19.nay.redhat.com systemd[1]: sshd.service: main[ 5447.398309] serial8250: too much work for irq4
 process exited, code=exited, status=1/FAILURE
Jun 05 14:08:49 dhcp-3-19.nay.redhat.com sshd[3797]: Could not load host key: /etc/ssh/ssh_host_ecdsa_key
Jun 05 14:08:49 dhcp-3-19.nay.redhat.com sshd[3797]: Could not load host key: /etc/ssh/ssh_host_ed25519_key
Jun 05 14:08:49 dhcp-3-19.nay.redhat.com sshd[3797]: sshd: no hostkeys available -- exiting.
Jun 05 14:08:49 dhcp-3-19.nay.redhat.com systemd[1]: Failed to start OpenSSH server daemon.
Jun 05 14:08:49 dhcp-3-19.nay.redhat.com systemd[ 5447.504361] serial8250: too much work for irq4
[1]: Unit sshd.service entered failed state.
Jun 05 14:08:49 dhcp-3-19.nay.redhat.com systemd[1]: sshd.service failed.
[root@dhcp-3-19 ~]# sshd-keygen 
Generating SSH2 RSA host key: [  OK  ]
Generating SSH2 ECDSA host key: [  OK  ]
Generating SSH2 ED25519 host key: [  OK  ]
[root@dhcp-3-19 ~]# systemctl restart sshd
[root@dhcp-3-19 ~]# systemctl status sshd -l
● sshd.service - OpenSSH server daemon
   Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2018-06-05 14:12:18 HKT; 3min 0s ago
     Docs: man:sshd(8)
           man:sshd_config(5)
 Main PID: 3896 (sshd)
    Tasks: 1
   CGroup: /system.slice/sshd.service
           └─3896 /usr/sbin/sshd -D

Jun 05 14:12:18 dhcp-3-19.nay.redhat.com systemd[1]: Starting OpenSSH server daemon...
Jun 05 14:12:18 dhcp[ 5828.566147] serial8250: too much work for irq4
-3-19.nay.redhat.com sshd[3896]: Server listening on 0.0.0.0 port 22.
Jun 05 14:12:18 dhcp-3-19.nay.redhat.com sshd[3896]: Server listening on :: port 22.
Jun 05 14:12:18 dhcp-3-19.nay.redhat.com systemd[1]: Started OpenSSH server daemon.
[root@dhcp-3-19 ~]# uname -a
Linux dhcp-3-19.nay.redhat.com 3.10.0-893.el7.x86_64 #1 SMP Thu May 24 21:37:14 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Comment 4 Jakub Jelen 2018-06-05 08:15:41 UTC
Can somebody attach some useful debug information or provide me a reproducer (for example with beaker)? Does it work with SELinux permissive? Are there any AVCs?

Comment 6 Karen Mezick 2018-06-05 13:58:21 UTC
Jakub - In the meantime, I am trying to set up a system so that you can troubleshoot this issue. Please stay tuned. Thanks - Karen

Comment 9 Jakub Jelen 2018-06-06 09:21:36 UTC
The SELinux is preventing us to do that. Additionally, even the auditd was not successfully started when I connected to the testing machine provided by Xiao (thanks!).

So after going to permissive, I was able to start both auditd and sshd. Then removing the keys in /etc/sshd/ssh_host_*, switching back to the enforcing and trying to start the service again, the following AVC appear.

# ausearch -m AVC
----
time->Wed Jun  6 05:15:34 2018
type=PROCTITLE msg=audit(1528276534.738:2065): proctitle="(null)"
type=SYSCALL msg=audit(1528276534.738:2065): arch=c000003e syscall=59 success=no exit=-13 a0=1dd0480 a1=1d91f50 a2=1dcbc90 a3=7ffe941ceae0 items=0 ppid=14720 pid=14724 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="ssh-keygen" exe="/usr/bin/ssh-keygen" subj=system_u:system_r:sshd_keygen_t:s0 key=(null)
type=AVC msg=audit(1528276534.738:2065): avc:  denied  { map } for  pid=14724 comm="ssh-keygen" path="/usr/bin/ssh-keygen" dev="dm-0" ino=12845996 scontext=system_u:system_r:sshd_keygen_t:s0 tcontext=system_u:object_r:ssh_keygen_exec_t:s0 tclass=file permissive=0
----
time->Wed Jun  6 05:15:34 2018
type=PROCTITLE msg=audit(1528276534.738:2066): proctitle=2F62696E2F62617368002F7573722F7362696E2F737368642D6B657967656E
type=SYSCALL msg=audit(1528276534.738:2066): arch=c000003e syscall=269 success=no exit=-13 a0=ffffffffffffff9c a1=1dd47b0 a2=1 a3=7ffe941ce160 items=0 ppid=1 pid=14720 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="sshd-keygen" exe="/usr/bin/bash" subj=system_u:system_r:sshd_keygen_t:s0 key=(null)
type=AVC msg=audit(1528276534.738:2066): avc:  denied  { execute } for  pid=14720 comm="sshd-keygen" name="plymouth" dev="dm-0" ino=12846391 scontext=system_u:system_r:sshd_keygen_t:s0 tcontext=system_u:object_r:plymouth_exec_t:s0 tclass=file permissive=0


I saw this issue in bug #1583531, but it was missing reproducer. Lukas, what can we try to figure what is going on there further?

Comment 10 Milos Malik 2018-06-07 12:58:07 UTC
# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.5 (Maipo)
# grep /functions `which sshd-keygen `
. /etc/rc.d/init.d/functions
# grep plymouth /etc/rc.d/init.d/functions
    if [ -x /bin/plymouth ]; then
        /bin/plymouth --update="$1"
    [ -x /bin/plymouth ] && /bin/plymouth --details
#

Comment 15 errata-xmlrpc 2018-10-30 10:05:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3111