Bug 1102324

Summary: test fails when gnutls is compiled by gcc-4.9.0-1.fc21
Product: [Fedora] Fedora
Reporter: Dan Horák <dan>
Component: gcc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: jakub, jorton, law, mtoman, nmavrogi, tmraz
Target Milestone: ---   
Target Release: ---   
Hardware: s390   
OS: Unspecified   
Whiteboard:
Fixed In Version: gcc-4.9.1-2.fc21.1
Doc Type: Bug Fix
Last Closed: 2014-07-26 21:05:30 UTC
Type: Bug
Bug Blocks: 467765    
Attachments:
Preprocessed gnutls_str.c (flags: none)
gnutls.c (flags: none)

Description Dan Horák 2014-05-28 18:32:11 UTC
The hostname-check test fails on s390 (32-bit) when gnutls is compiled with gcc-4.9.0-1.fc21 (or gcc-4.9.0-0.10.fc21); the test passes when gcc-4.8.2-14.fc21 is used. I'll add more details, including the result when built with the latest gcc 4.9, when I get them. No need to react now.


Version-Release number of selected component (if applicable):
gcc-4.9.0-1.fc21

Comment 1 Dan Horák 2014-06-04 07:00:52 UTC
The same problem occurs with gcc-4.9.0-6.fc21, and also when -fno-delete-null-pointer-checks is added to CFLAGS. The test passes when -O0 or -O1 is used.

Comment 2 Jeff Law 2014-06-04 07:02:01 UTC
Probably something passing an invalid (NULL) argument to one of the mem* or str* functions.

Comment 3 Jeff Law 2014-06-04 07:03:07 UTC
Meaning the bug is likely in the gnutls code.  Just to be clear.  If you've got a box provisioned, I can take a quick looksie.
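
For illustration, a minimal sketch of the kind of bug suspected here. The helper names are hypothetical and the code is not taken from gnutls. Passing NULL to memcpy is undefined behavior even when the length is 0, so a compiler such as gcc 4.9 may conclude at -O2 that the pointer is non-null and drop a later NULL check:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from gnutls: the NULL check comes too late. */
size_t copy_name(char *dst, const char *src, size_t len)
{
    memcpy(dst, src, len);  /* undefined behavior if src == NULL, even for len == 0 */
    if (src == NULL)        /* an optimizer may assume this is never true and remove it */
        return 0;
    return len;
}

/* Safer ordering: test the pointer before it ever reaches memcpy. */
size_t copy_name_checked(char *dst, const char *src, size_t len)
{
    if (src == NULL || len == 0)
        return 0;
    memcpy(dst, src, len);
    return len;
}
```

If code of the first shape were the cause, building with -fno-delete-null-pointer-checks would be expected to change the result, which is why that flag is a common first experiment.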

Comment 4 Dan Horák 2014-06-04 07:08:00 UTC
I guess comment https://bugzilla.redhat.com/show_bug.cgi?id=1094975#c2 also applies here as a guideline for how to debug the problem. We will continue narrowing the problem down.

Comment 5 Jeff Law 2014-06-04 07:11:30 UTC
There's actually a newer memstomp that can catch some (but certainly not all) of the null pointers going into mem* and str* routines.  I don't know if the builders have picked it up for s390, but the source bits are certainly in rawhide.

Getting that latest version installed and running the test with libmemstomp.so preloaded (I like to do it via /etc/ld.so.preload) might cut down the debugging time considerably.

Comment 6 Dan Horák 2014-06-04 08:14:19 UTC
thanks a lot Jeff, I think we are there :-)

<mock-chroot>[root@devel6 tests]# LD_PRELOAD=/usr/lib/libmemstomp.so ./hostname-check 
memstomp: Application appears to be compiled without -rdynamic. It might be a
memstomp: good idea to recompile with -rdynamic enabled since this produces more
memstomp: useful stack traces.

memstomp: 0.1.4 successfully initialized for process hostname-check (pid 65466).
Segmentation fault (core dumped)


(gdb) where
#0  0x7cfbfb34 in strcasestr () from /lib/libc.so.6
#1  0x7d0fe582 in _gnutls_fbase64_decode (header=0x7d1a3d8a "CERTIFICATE", 
    data=0x409a72 <wildcards> "-----BEGIN CERTIFICATE-----MIICwDCCAimgAwIBAgICPd8wDQYJKoZIhvcNAQELBQAwVTEOMAwGA1UEAwwFKi5jb20xETAPBgNVBAsTCENBIGRlcHQuMRIwEAYDVQQKEwlLb2tvIGluYy4xDzANBgNVBAgTBkF0dGlraTELMAkGA1UEBhMCR1IwIhgPMjAxNDAzM"..., data_size=996, result=0x7fa238ec) at x509_b64.c:296
#2  0x7d154f66 in gnutls_x509_crt_import (cert=0x4b75c8, data=data@entry=0x7fa23964, format=format@entry=GNUTLS_X509_FMT_PEM) at x509.c:183
#3  0x00400ea2 in doit () at hostname-check.c:687
#4  0x00401b5e in main (argc=<optimized out>, argv=0x7fa23b84) at utils.c:146

Comment 7 Jeff Law 2014-06-04 08:25:02 UTC
Just looking over that function I don't immediately see how the data or pem_header could be NULL.

Comment 8 Jeff Law 2014-06-04 08:26:41 UTC
memstomp is reporting this failure as strcasestr, but it's really memmem.  Cut-n-paste error in memstomp which I'll fix momentarily.

Comment 9 Nikos Mavrogiannopoulos 2014-06-04 08:35:03 UTC
x509_b64.c:296 contains a call to:
memmem(data, data_size, pem_header, strlen(pem_header));

and none of its arguments are null (based on the arguments I see in #1). However, what is suspicious is the fact that the crash is at strcasestr() which is not called at all (and shouldn't as memmem has different semantics).

Neither address sanitizer nor valgrind reports anything there.
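
As a rough sketch of why the two functions are not interchangeable: memmem takes explicit lengths and compares bytes exactly, whereas strcasestr expects NUL-terminated strings and ignores case. The helper find_marker and the sample markers are assumptions made for this example, not code from x509_b64.c:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* memmem is a GNU extension declared in <string.h> */
#endif
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte offset of marker inside buf, or -1 if absent.
   The search is bounded by buf_len, so buf need not be NUL-terminated. */
long find_marker(const void *buf, size_t buf_len, const char *marker)
{
    const unsigned char *hit = memmem(buf, buf_len, marker, strlen(marker));

    if (hit == NULL)
        return -1;
    return (long)(hit - (const unsigned char *)buf);
}
```

Because the lengths are explicit, the search keeps working across embedded NUL bytes and never reads past data_size, which matters when the input is an arbitrary buffer rather than a C string.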

Comment 10 Jeff Law 2014-06-04 08:42:39 UTC
With a fixed memstomp, it no longer fails.  Sorry for the wild goose chase.

Comment 11 Dan Horák 2014-06-04 12:29:26 UTC
I see a similar problem (a failing test on s390 with gcc 4.9) in openssl-1.0.1g-1.fc21, so I'm leaning toward switching this bug back to gcc.

Comment 12 Dan Horák 2014-06-04 12:34:09 UTC
(In reply to Dan Horák from comment #11)

It is probably also worth mentioning explicitly that this happens only on 32-bit s390, not on 64-bit s390x.

Comment 13 Nikos Mavrogiannopoulos 2014-06-05 07:40:58 UTC
Hello,
 There is nothing in the report that indicates a bug in gnutls. Please reassign it.

Comment 14 Jeff Law 2014-06-16 19:58:30 UTC
Dan,

Can you get the most recent GCC SRPM build for s390/s390x?  There's a pretty good chance this bug is fixed in the more recent compilers.

While I can reproduce the failure when gnutls is built with the -6 build, I cannot reproduce it using the current gcc-4.9 branch.

I'll note the -9 build failed for s390 due to some texlive lameness:

http://s390.koji.fedoraproject.org/koji/buildinfo?buildID=245158

http://s390.koji.fedoraproject.org/kojifiles/work/tasks/7789/1427789/root.log

Comment 15 Dan Horák 2014-06-16 20:10:23 UTC
Jeff,

(In reply to Jeff Law from comment #14)

Sure, will do once I get the texlive issue resolved (or temporarily worked around). Unfortunately, the required texlive version failed to build on s390, which I still need to investigate.

Comment 16 Jakub Jelinek 2014-06-24 16:06:08 UTC
Any news on this?

Comment 17 Dan Horák 2014-06-25 20:31:25 UTC
No change when gcc-4.9.0-12.fc21 is used, the test is still failing. BTW the texlive problem is also gcc 4.9 related ...

Comment 18 Michal Toman 2014-07-01 13:11:21 UTC
The problem seems to be in function _gnutls_hostname_compare in lib/gnutls_str.c
Prefixing it with __attribute__((optimize (0))) fixes the build.

http://s390.koji.fedoraproject.org/koji/taskinfo?taskID=1434518
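
A minimal sketch of how such a per-function override looks in source. The function compare_unoptimized is a made-up stand-in, not the gnutls code; the optimize attribute is GCC-specific, so it is guarded here for other compilers:

```c
/* GCC only: compile this one function at -O0 regardless of the
   optimization level given on the command line. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_OPTIMIZE __attribute__((optimize(0)))
#else
#define NO_OPTIMIZE   /* attribute not available; no effect */
#endif

/* Hypothetical stand-in: returns 1 if the two strings are equal, else 0. */
NO_OPTIMIZE
int compare_unoptimized(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return *a == *b;
}
```

Used this way the attribute is a diagnostic aid: if the failure disappears, the miscompilation is very likely inside that function, but the underlying compiler bug is still there.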

Comment 19 Jakub Jelinek 2014-07-01 13:21:27 UTC
Is _gnutls_hostname_compare inlined?  If not, can you find out with what arguments it is being called when it misbehaves and supply preprocessed source with the _gnutls_hostname_compare definition?

Comment 20 Michal Toman 2014-07-02 11:46:31 UTC
The function is not inlined and is called with

_gnutls_hostname_compare("*.example.net", 13, "www.example.net", 0);

With -O0 and -O1 it returns 1 (expected) while with -O2 it returns 0.
Attaching the preprocessed source.
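
To make the expected result concrete, here is a simplified sketch of a wildcard hostname comparison together with an abort-style check. Everything in it (wildcard_match, run_check, the matching rules) is an assumption for illustration; it is neither the gnutls implementation nor the attached source, and it omits the last argument of the real call:

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified matcher: returns 1 on match, 0 otherwise.
   A leading "*." in the pattern matches exactly one leftmost host label. */
__attribute__((noinline))
int wildcard_match(const char *pattern, size_t pattern_len, const char *host)
{
    size_t host_len = strlen(host);
    size_t i;

    if (pattern_len >= 2 && pattern[0] == '*' && pattern[1] == '.') {
        const char *dot = strchr(host, '.');

        if (dot == NULL || dot == host)
            return 0;                      /* no label for '*' to match */
        pattern += 1;                      /* skip '*', keep ".example.net" */
        pattern_len -= 1;
        host_len -= (size_t)(dot - host);  /* compare from the first dot on */
        host = dot;
    }
    if (pattern_len != host_len)
        return 0;
    for (i = 0; i < host_len; i++) {
        if (tolower((unsigned char)pattern[i]) != tolower((unsigned char)host[i]))
            return 0;
    }
    return 1;
}

/* Abort-style check: a correct build returns normally at every -O level. */
void run_check(void)
{
    if (wildcard_match("*.example.net", 13, "www.example.net") != 1)
        abort();
}
```

A check of this shape gives the same answer at every optimization level when the compiler is correct, so a difference between -O1 and -O2 points at the compiler rather than at the caller.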

Comment 21 Michal Toman 2014-07-02 11:47:32 UTC
Created attachment 914138
Preprocessed gnutls_str.c

Comment 22 Jakub Jelinek 2014-07-02 12:05:28 UTC
Created attachment 914143
gnutls.c

Do you also get an abort when you compile this self-contained testcase with -O2 (and not with -O0/-O1)?

Comment 23 Michal Toman 2014-07-02 12:11:33 UTC
Confirmed, -O2 aborts while -O0/-O1 does not.

Comment 24 Jakub Jelinek 2014-07-02 12:54:44 UTC
Thanks.  Can you please also try different -march/-mtune options for -O2?
Say -march=g5 (or g6, z900, z990, z9-109), and for -mtune= also z9-ec, z10, z196 and zEC12?

Comment 25 Jakub Jelinek 2014-07-02 15:20:53 UTC
Untested fix in upstream PR61673.

Comment 26 Dan Horák 2014-07-02 19:22:35 UTC
scratch build with the patch applied is at
http://s390.koji.fedoraproject.org/koji/taskinfo?taskID=1435212 

I have another crypto library whose test fails on s390 (libtomcrypt), so I'll retry it with this build.

Comment 27 Dan Horák 2014-07-02 19:44:48 UTC
And I can confirm both gnutls and libtomcrypt now build and pass their test-suites successfully on s390. Thanks guys.