Bug 1406666

Summary: [x86_64] sshd dies with SIGSYS
Product: [Fedora] Fedora
Reporter: Shawn Starr <shawn.starr>
Component: openssh
Assignee: Jakub Jelen <jjelen>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 26
CC: extras-qa, jjelen, mattias.ellert, mgrepl, mjuszkie, pbrobinson, plautrba, rjones, tmraz
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: x86_64
OS: Linux
Clone Of: 1197051
Last Closed: 2017-04-28 12:48:49 UTC
Type: Bug
Bug Depends On: 1197051, 1398370
Attachments:
audit logs from sshd failing sig 31 (flags: none)
sshd with LogLevel DEBUG output from /var/log/secure from sshd run (flags: none)
Kickstart file (flags: none)

Description Shawn Starr 2016-12-21 07:46:46 UTC
+++ This bug was initially created as a clone of Bug #1197051 +++

Description of problem:

With the latest sshd in Rawhide, you can no longer log in
over ssh.

The client side dies with:

[spstarr@fedora-qa ~]$ ssh -v 127.0.0.1
OpenSSH_7.3p1, OpenSSL 1.1.0c-fips  10 Nov 2016
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Reading configuration data /etc/ssh/ssh_config.d/05-redhat.conf
debug1: /etc/ssh/ssh_config.d/05-redhat.conf line 2: include /etc/crypto-policies/back-ends/openssh.txt matched no files
debug1: /etc/ssh/ssh_config.d/05-redhat.conf line 8: Applying options for *
debug1: Connecting to 127.0.0.1 [127.0.0.1] port 22.
debug1: Connection established.
[...]
debug1: Next authentication method: publickey
debug1: Trying private key: /home/spstarr/.ssh/id_rsa
debug1: Trying private key: /home/spstarr/.ssh/id_dsa
debug1: Trying private key: /home/spstarr/.ssh/id_ecdsa
debug1: Trying private key: /home/spstarr/.ssh/id_ed25519
debug1: Next authentication method: password
spstarr@127.0.0.1's password: 
debug1: Authentication succeeded (password).
Authenticated to 127.0.0.1 ([127.0.0.1]:22).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions
debug1: Entering interactive session.
debug1: pledge: network
packet_write_wait: Connection to 127.0.0.1 port 22: Broken pipe

/var/log/secure shows:

Dec 21 02:41:05 fedora-qa sshd[3458]: Accepted password for spstarr from 127.0.0.1 port 59216 ssh2
Dec 21 02:41:05 fedora-qa sshd[3458]: fatal: privsep_preauth: preauth child terminated by signal 31
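
For readers unsure what "signal 31" in the log line above refers to, here is a minimal Python sketch that maps a signal number to its symbolic name on the machine where it runs. The helper name describe_signal is made up for this illustration. Signal numbering is platform-specific; on x86_64 Linux, the platform of this report, 31 is SIGSYS, the signal a seccomp filter delivers when it kills a process for a disallowed syscall.

```python
import signal


def describe_signal(signum: int) -> str:
    """Return the symbolic name of a signal number on the running platform."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        # The number is not a named signal on this platform.
        return f"unknown signal {signum}"


# On x86_64 Linux this is expected to report SIGSYS; other platforms
# may number their signals differently.
print(describe_signal(31))
```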


I straced the server, and the sshd subprocess dies with SIGSYS:

[...]
[pid  3480] writev(2, [{iov_base="Inconsistency detected by ld.so:"..., iov_len=33}, {iov_base="dl-close.c", iov_len=10}, {iov_base=": ", iov_len=2}, {iov_base="811", iov_len=3}, {iov_base=": ", iov_len=2}, {iov_base="_dl_close", iov_len=9}, {iov_base=": ", iov_len=2}, {iov_base="Assertion `", iov_len=11}, {iov_base="map->l_init_called", iov_len=18}, {iov_base="' failed!\n", iov_len=10}], 10) = ?
[pid  3479] <... read resumed> "", 4)   = 0
[pid  3480] +++ killed by SIGSYS +++
[pid  3479] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=3480, si_uid=74, si_status=SIGSYS, si_utime=1, si_stime=0} ---
[pid  3479] close(7)                    = 0
[pid  3479] close(6)                    = 0
[pid  3479] close(-1)                   = -1 EBADF (Bad file descriptor)
[pid  3479] mmap(NULL, 1310720, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = 0x7fa98f235000
[pid  3479] munmap(0x7fa9985b0000, 65536) = 0
[pid  3479] wait4(3480, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSYS}], 0, NULL) = 3480
[pid  3479] getpid()                    = 3479
[pid  3479] socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4
[pid  3479] connect(4, {sa_family=AF_UNIX, sun_path="/dev/log"}, 110) = 0
[pid  3479] sendto(4, "<82>Dec 21 02:42:26 sshd[3479]: "..., 93, MSG_NOSIGNAL, NULL, 0) = 93
[pid  3479] close(4)                    = 0
[pid  3479] getpid()                    = 3479
[pid  3479] getuid()                    = 0
[pid  3479] socket(AF_NETLINK, SOCK_RAW, NETLINK_AUDIT) = -1 EPROTONOSUPPORT (Protocol not supported)
[pid  3479] socket(AF_NETLINK, SOCK_RAW, NETLINK_AUDIT) = -1 EPROTONOSUPPORT (Protocol not supported)
[pid  3479] socket(AF_NETLINK, SOCK_RAW, NETLINK_AUDIT) = -1 EPROTONOSUPPORT (Protocol not supported)
[pid  3479] exit_group(255)             = ?
[pid  3323] <... select resumed> )      = 1 (in [6])
[pid  3323] close(6)                    = 0
[pid  3479] +++ exited with 255 +++
select(7, [3 4], NULL, NULL, NULL)      = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3479, si_uid=0, si_status=255, si_utime=2, si_stime=2} ---
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 255}], WNOHANG, NULL) = 3479
wait4(-1, 0x7ffd3aff7484, WNOHANG, NULL) = -1 ECHILD (No child processes)
rt_sigaction(SIGCHLD, NULL, {sa_handler=0x5569c51596b0, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f4fa4c91b20}, 8) = 0
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)

Version-Release number of selected component (if applicable):

openssh-7.3p1-7.fc26.x86_64
openssh-clients-7.3p1-7.fc26.x86_64
openssh-server-7.3p1-7.fc26.x86_64

What's unclear to me is why this works fine on Fedora 25 with the same RPMs; I even downgraded the kernel to match what Fedora 25 ships with.

How reproducible:

100%

Steps to Reproduce:
1. Install openssh-server
2. Try to ssh to the machine from another or from localhost

Comment 1 Shawn Starr 2016-12-21 07:47:20 UTC
Created attachment 1234254 [details]
audit logs from sshd failing sig 31

Comment 2 Shawn Starr 2016-12-21 07:47:54 UTC
Both kernels tested:

kernel-4.8.6-300.fc25.x86_64
kernel-4.10.0-0.rc0.git4.1.fc26.x86_64

Comment 3 Shawn Starr 2016-12-21 07:53:24 UTC
In Fedora 25 in: 

/etc/crypto-policies/back-ends

I do not have openssh.config symlinked to 
/usr/share/crypto-policies/DEFAULT/openssh.txt


Rawhide: 
crypto-policies-20161111-1.gita2363ce.fc26.noarch

Fedora 25 does not have openssh.config packaged
crypto-policies-20160921-2.git75b9b04.fc25.noarch

Comment 4 Shawn Starr 2016-12-21 07:57:47 UTC
*** Bug 1406665 has been marked as a duplicate of this bug. ***

Comment 5 Petr Lautrbach 2016-12-21 08:07:50 UTC
Works for me with openssh-7.3p1-7.fc26.x86_64 glibc-2.24.90-24.fc26.x86_64 kernel-4.10.0-0.rc0.git2.2.fc26.x86_64

Please attach a relevant part of /var/log/secure from the server with LogLevel at least DEBUG.

Comment 6 Shawn Starr 2016-12-21 08:18:24 UTC
Created attachment 1234260 [details]
sshd with LogLevel DEBUG output from /var/log/secure from sshd run

SELinux is disabled on both the Fedora 25 and the Rawhide servers. I've also disabled audit in the kernel for now, but I can turn it back on.

Comment 7 Marcin Juszkiewicz 2016-12-21 08:24:07 UTC
(In reply to Shawn Starr from comment #1)
> Created attachment 1234254 [details]
> audit logs from sshd failing sig 31

type=SECCOMP msg=audit(1482303545.271:344): auid=4294967295 uid=74 gid=74 ses=4294967295 pid=4564 comm="sshd" exe="/usr/sbin/sshd" sig=31 arch=c000003e syscall=20 compat=0 ip=0x7f89e7135328 code=0x0

According to my syscalls table [1], this is the writev() syscall.


1. https://fedora.juszkiewicz.com.pl/syscalls.html
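
The lookup above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the syscall table is a small hand-picked excerpt of the x86_64 numbering rather than the full list, and the record is the one quoted in this comment. The value 0xC000003E is the kernel's AUDIT_ARCH_X86_64 constant, which is why syscall=20 is resolved against the x86_64 table.

```python
import re

# Record copied from the audit log quoted in this comment.
RECORD = (
    'type=SECCOMP msg=audit(1482303545.271:344): auid=4294967295 uid=74 '
    'gid=74 ses=4294967295 pid=4564 comm="sshd" exe="/usr/sbin/sshd" sig=31 '
    'arch=c000003e syscall=20 compat=0 ip=0x7f89e7135328 code=0x0'
)

AUDIT_ARCH_X86_64 = 0xC000003E

# Assumption: a tiny excerpt of the x86_64 syscall table, for illustration only.
X86_64_SYSCALLS = {0: "read", 1: "write", 19: "readv", 20: "writev"}


def parse_audit_fields(record: str) -> dict:
    """Split the key=value pairs of an audit record into a dict of strings."""
    return dict(re.findall(r'(\w+)=("[^"]*"|\S+)', record))


def describe_seccomp(record: str) -> str:
    """Summarise which command was killed and on which syscall."""
    fields = parse_audit_fields(record)
    arch = int(fields["arch"], 16)
    number = int(fields["syscall"])
    if arch != AUDIT_ARCH_X86_64:
        return f"syscall {number} on unhandled arch {arch:#x}"
    name = X86_64_SYSCALLS.get(number, f"syscall {number} (not in excerpt)")
    command = fields["comm"].strip('"')
    return f"{command}: {name} killed by signal {fields['sig']}"


print(describe_seccomp(RECORD))
```

For anything beyond these few entries, a complete syscall table such as the one linked in [1] is the better source.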

Comment 8 Shawn Starr 2016-12-21 21:25:40 UTC
What's also interesting is that the same problem is known for Fedora 25 as bug #1398370, but there I do not get SECCOMP triggering this signal kill.

So, what is different in rawhide vs Fedora 25 where the bug above does not cause this to trip?

Comment 9 Shawn Starr 2016-12-23 01:27:26 UTC
Reproduced this with today's and yesterday's Rawhide on the same machine and in a 64-bit KVM instance.

I've attached my kickstart file to reproduce this.

Comment 10 Shawn Starr 2016-12-23 01:30:49 UTC
Created attachment 1234943 [details]
Kickstart file

Comment 11 Shawn Starr 2016-12-23 01:31:35 UTC
If you disable the custom repos such as rpmfusion/google the problem remains.

Comment 12 Shawn Starr 2016-12-23 01:35:08 UTC
Some packages in the kickstart are missing/obsolete/custom and can be skipped to complete the installation.

Comment 13 Jakub Jelen 2016-12-23 11:16:13 UTC
Fedora Rawhide and 25 are the same in git now. The only significant difference is the OpenSSL version each is built against (1.1.0 and 1.0.2 or so), which use different code paths. But that is not related to this bug.

I can reproduce the same when I install the gssntlmssp package (which is also the trigger for bug #1389881, a duplicate of your referenced bug #1398370).

I don't think there is anything we can do about it in OpenSSH. We have a workaround (disable GSSAPIAuthentication and GSSAPIKeyExchange in sshd_config; uninstall gssntlmssp), but it needs to get resolved in glibc.
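
To check whether the sshd_config part of the workaround is in place, something like the following Python sketch could be used. It is a rough illustration under stated assumptions, not an sshd_config parser: the function names and the SAMPLE text are invented here, it only looks at the global section (it stops at the first Match line), it does not follow Include directives, and it relies on sshd's rule that the first value obtained for a keyword is the one used. An option that is absent is reported as not explicitly disabled.

```python
# Option names as given in the workaround above.
WORKAROUND_KEYS = ("GSSAPIAuthentication", "GSSAPIKeyExchange")

# Hypothetical sshd_config excerpt with the workaround applied.
SAMPLE = """\
# Workaround until glibc is fixed
GSSAPIAuthentication no
GSSAPIKeyExchange no
"""


def first_values(config_text, keywords=WORKAROUND_KEYS):
    """Return the first value seen for each keyword in the global section."""
    wanted = {k.lower(): k for k in keywords}
    found = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # keywords are case-insensitive
        if keyword == "match":
            break  # only global settings are inspected in this sketch
        if keyword in wanted and wanted[keyword] not in found and len(parts) == 2:
            found[wanted[keyword]] = parts[1].strip().lower()
    return found


def workaround_applied(config_text):
    """True only if both options are explicitly set to 'no'."""
    values = first_values(config_text)
    return all(values.get(key) == "no" for key in WORKAROUND_KEYS)


print(workaround_applied(SAMPLE))
```

In practice the text would come from reading /etc/ssh/sshd_config on the affected server; the sample string only stands in for that file.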

Comment 14 Shawn Starr 2016-12-23 20:46:58 UTC
Thanks, this workaround will do until GNU libc is fixed up.

Comment 15 Shawn Starr 2016-12-26 20:49:25 UTC
Confirmed on Rawhide this PASSES, no error using:

glibc-2.24.90-25.fc26.x86_64 

This is now resolved in rawhide.

Comment 16 Fedora End Of Life 2017-02-28 10:49:58 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle.
Changing version to '26'.