tests are failing on s390(x) ...

Executing(%check): /bin/sh -e /var/tmp/rpm-tmp.piSV6w
+ umask 022
+ cd /builddir/build/BUILD
+ cd Net-SSLeay-1.58
+ make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'inc', 'blib/lib', 'blib/arch')" t/local/*.t t/handle/local/*.t
t/handle/local/05_use.t ................ Failed 1/1 subtests
t/local/01_pod.t ....................... ok
t/local/02_pod_coverage.t .............. skipped: these tests are for only for release candidate testing. Enable with RELEASE_TESTING=1
t/local/03_use.t ....................... Failed 1/1 subtests
t/local/04_basic.t ..................... Failed 6/6 subtests
t/local/05_passwd_cb.t ................. Failed 13/13 subtests
t/local/06_tcpecho.t ................... No subtests run
t/local/07_sslecho.t ................... No subtests run
t/local/08_pipe.t ...................... No subtests run
t/local/15_bio.t ....................... Failed 7/7 subtests
t/local/20_autoload.t .................. No subtests run
t/local/21_constants.t ................. No subtests run
t/local/30_error.t ..................... No subtests run
t/local/31_rsa_generate_key.t .......... No subtests run
t/local/32_x509_get_cert_info.t ........ Failed 1243/1243 subtests
t/local/33_x509_create_cert.t .......... Failed 124/124 subtests
t/local/34_x509_crl.t .................. Failed 41/41 subtests
t/local/35_ephemeral.t ................. Failed 3/3 subtests
t/local/36_verify.t .................... Failed 25/25 subtests
t/local/37_asn1_time.t ................. Failed 10/10 subtests
t/local/38_priv-key.t .................. Failed 10/10 subtests
t/local/39_pkcs12.t .................... Failed 19/19 subtests
t/local/40_npn_support.t ............... No subtests run
t/local/41_alpn_support.t .............. No subtests run
t/local/50_digest.t .................... Failed 230/230 subtests
t/local/61_threads-cb-crash.t .......... No subtests run
t/local/62_threads-ctx_new-deadlock.t .. No subtests run
t/local/kwalitee.t ..................... skipped: these tests are for only for release candidate testing. Enable with RELEASE_TESTING=1

Test Summary Report
-------------------
t/handle/local/05_use.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 1 tests but ran 0.
t/local/03_use.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 1 tests but ran 0.
t/local/04_basic.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 6 tests but ran 0.
t/local/05_passwd_cb.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 13 tests but ran 0.
t/local/06_tcpecho.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
t/local/07_sslecho.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
t/local/08_pipe.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
t/local/15_bio.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 7 tests but ran 0.
t/local/20_autoload.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
t/local/21_constants.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
t/local/30_error.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
t/local/31_rsa_generate_key.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
t/local/32_x509_get_cert_info.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 1243 tests but ran 0.
t/local/33_x509_create_cert.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 124 tests but ran 0.
t/local/34_x509_crl.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 41 tests but ran 0.
t/local/35_ephemeral.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 3 tests but ran 0.
t/local/36_verify.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 25 tests but ran 0.
t/local/37_asn1_time.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 10 tests but ran 0.
t/local/38_priv-key.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 10 tests but ran 0.
t/local/39_pkcs12.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 19 tests but ran 0.
t/local/40_npn_support.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
t/local/41_alpn_support.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
t/local/50_digest.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: Bad plan. You planned 230 tests but ran 0.
t/local/61_threads-cb-crash.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
t/local/62_threads-ctx_new-deadlock.t (Wstat: 11 Tests: 0 Failed: 0)
  Non-zero wait status: 11
  Parse errors: No plan found in TAP output
Files=28, Tests=2, 1 wallclock secs ( 0.07 usr 0.02 sys + 0.68 cusr 0.12 csys = 0.89 CPU)
Result: FAIL
Failed 25/28 test programs. 0/2 subtests failed.
make: *** [test_dynamic] Error 255
error: Bad exit status from /var/tmp/rpm-tmp.piSV6w (%check)

RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.piSV6w (%check)
Child return code was: 1
EXCEPTION: Command failed. See logs for output.
# ['bash', '--login', '-c', 'rpmbuild -bb --target s390x --nodeps builddir/build/SPECS/perl-Net-SSLeay.spec']
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 70, in trace
    result = func(*args, **kw)
  File "/usr/lib/python2.7/site-packages/mockbuild/util.py", line 376, in do
    raise mockbuild.exception.Error, ("Command failed. See logs for output.\n # %s" % (command,), child.returncode)
Error: Command failed. See logs for output.
# ['bash', '--login', '-c', 'rpmbuild -bb --target s390x --nodeps builddir/build/SPECS/perl-Net-SSLeay.spec']
LEAVE do --> EXCEPTION RAISED

for full logs please see http://s390.koji.fedoraproject.org/koji/taskinfo?taskID=1351832

Version-Release number of selected component (if applicable):
perl-Net-SSLeay-1.58-1.fc21

Setting as urgent as it currently blocks any progress on Rawhide for s390(x)
What was the last version of perl-Net-SSLeay/openssl that built successfully on s390(x)?
the previous perl-Net-SSLeay-1.57-1.fc21 (with openssl-1.0.1e-37.fc21.s390x in buildroot) was OK, for full history please see http://s390.koji.fedoraproject.org/koji/packageinfo?packageID=5993
and FWIW perl-Net-SSLeay-1.58-1.fc21 rebuilds fine in F-20
Does perl-Net-SSLeay-1.57-1.fc21 still build OK? It looks like the module is failing to load at all, which suggests a toolchain issue, but there's very little difference between the buildroots.
it doesn't - http://s390.koji.fedoraproject.org/koji/taskinfo?taskID=1351908 :-(

And thanks for the hint. The buildroot for perl-Net-SSLeay-1.58-1.fc21 uses the same NVRs as the build on primary, with the exception of pcre, which was rebuilt with a larger stack for its tests.
Can you try with the regular version of pcre?
pcre-8.34-3.fc21 is used instead of pcre-8.34-2.fc21; the only difference is http://pkgs.fedoraproject.org/cgit/pcre.git/commit/?id=e73104aed3ff90f784f8ee2d04ede2a94c34e412, which just enlarges the stack for %check
Is there a way of testing with pcre-8.34-2.fc21, to try to isolate if that's what's causing the failure? Or otherwise bisecting the buildroot changes that caused a previously-working build to fail?
Created attachment 869010 [details] diff between good and bad buildroots
And the prime suspect is glibc: after downgrading it, the test suite passes again.
and fails too with glibc-2.18.90-22.fc21
and also with glibc-2.18.90-27.fc21
the last working version is glibc-2.18.90-20.fc21
and fails even with glibc-2.19.90-3.fc21, so help is needed
In the past I've had little luck getting an s390 box with Rawhide on it. Does someone have a box already set up so I could just log in and run the rpmbuild to see what's failing in the build?
Currently I can provide a rawhide mock chroot where I tried all the various glibc versions during the build. I haven't tried upgrading the F-20 guest to rawhide yet.
The upstream resync from 2.18.90-20 to -21 was essentially glibc-2.18-753-gd5780fe..glibc-2.18-788-g497b1e6. Below are the S/390 specific commits:

commit 87ded0c382b835e5d7ca8b5e059a8a044a6c3976
Author: Andreas Krebbel <krebbel.ibm.com>
Date:   Tue Jan 7 09:40:39 2014 +0100

    S/390: Remove __tls_get_addr argument cast.

commit c5eebdd084b77b0b581a3aa02213fa7cc5851216
Author: Andreas Krebbel <krebbel.ibm.com>
Date:   Tue Jan 7 09:40:00 2014 +0100

    S/390: Get rid of unused variable warning in dl-machine.h

commit 05d138ef07481b16f1aaee648798cc51182ec65e
Author: Andreas Krebbel <krebbel.ibm.com>
Date:   Tue Jan 7 09:37:31 2014 +0100

    S/390: Make ucontext_t extendible.

commit 93a45ff1ca6d459618bb0cf93580c4b2809a4b61
Author: Andreas Krebbel <krebbel.ibm.com>
Date:   Tue Jan 7 09:36:31 2014 +0100

    S/390: Make jmp_buf extendible.
Some new info: the module must be built with the bad glibc for the tests to fail; upgrading/downgrading glibc after the build doesn't affect the results.
and I can confirm it is one (or more) of the patches from comment 17, glibc-2.18.90-21.fc21 is bad, glibc-2.18.90-21.fc21 with those 4 patches reverted is good
and

commit 93a45ff1ca6d459618bb0cf93580c4b2809a4b61
Author: Andreas Krebbel <krebbel.ibm.com>
Date:   Tue Jan 7 09:36:31 2014 +0100

    S/390: Make jmp_buf extendible.

is the problem ...
reduced reproducer:
- install F-20
- update glibc from http://fedora.danny.cz/s390/glibc-2.18.90-20.fc21.dh.1/ - it is glibc-2.18.90-20.fc21 + commit 93a45ff1
- rpmbuild --rebuild http://fedora.danny.cz/s390/perl-Net-SSLeay-1.58-1.fc21.src.rpm

for every failed test the following info appears in the kernel log:

[ 6672.505145] User process fault: interruption code 0x6003B in SSLeay.so[3fff6650000+89000]
[ 6672.505155] failing address: 0
[ 6672.505159] CPU: 0 PID: 16420 Comm: perl Not tainted 3.13.6-200.fc20.s390x #1
[ 6672.505162] task: 0000000072333c98 ti: 000000005859c000 task.ti: 000000005859c000
[ 6672.505176] User PSW : 0705000180000000 000003fff66b682c (0x3fff66b682c)
[ 6672.505178] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:0 PM:0 EA:3
               User GPRS: 0000000088af88e8 0000000000000000 00000000887cc010 0000000000000000
[ 6672.505185]            000003fffcedc360 0000000000000001 0000000000000020 000003fff66db0e8
[ 6672.505188]            0000000000000000 0000000000000001 0000000088aca5a8 000003fffd1ed3a0
[ 6672.505192]            000003fffcebb000 000003fff66ceff0 000003fff66b6826 000003ffff852bd8
[ 6672.505202] User Code: 000003fff66b681a: e320b0000016  llgf %r2,0(%r11)
                          000003fff66b6820: c0e5fffd273c  brasl %r14,3fff665b698
                         #000003fff66b6826: e31026b80004  lg %r1,1720(%r2)
                         >000003fff66b682c: e31010100004  lg %r1,16(%r1)
                          000003fff66b6832: e32010000002  ltg %r2,0(%r1)
                          000003fff66b6838: e320b0000016  llgf %r2,0(%r11)
                          000003fff66b683e: a78402ed      brc 8,3fff66b6e18
                          000003fff66b6842: c0e5fffd272b  brasl %r14,3fff665b698
[ 6672.505324] Last Breaking-Event-Address:
[ 6672.505328]  [<000003fffcecf64e>] 0x3fffcecf64e
(In reply to Dan Horák from comment #20)
> and
>
> commit 93a45ff1ca6d459618bb0cf93580c4b2809a4b61
> Author: Andreas Krebbel <krebbel.ibm.com>
> Date: Tue Jan 7 09:36:31 2014 +0100
>
> S/390: Make jmp_buf extendible.
>
> is the problem ...

I've contacted Andreas upstream and asked him for help looking into this since he is the author of the patch.
(In reply to Carlos O'Donell from comment #22)
> (In reply to Dan Horák from comment #20)
> > and
> >
> > commit 93a45ff1ca6d459618bb0cf93580c4b2809a4b61
> > Author: Andreas Krebbel <krebbel.ibm.com>
> > Date: Tue Jan 7 09:36:31 2014 +0100
> >
> > S/390: Make jmp_buf extendible.
> >
> > is the problem ...
>
> I've contacted Andreas upstream and asked him for help looking into this
> since he is the author of the patch.

oh, I did the same a couple of days ago, but forgot to mention it here :-)
Created attachment 879767 [details] output when running the test with LD_DEBUG=versions
Running the build step by step I've found that a simple 'use Net::SSLeay' causes a segfault when used with the newer glibc. Attaching a backtrace.
Created attachment 879794 [details] backtrace
this function from SSLeay.xs

UV get_my_thread_id(void) /* returns threads->tid() value */
{
    dSP;
    UV tid = 0;
    int count = 0;
#ifdef USE_ITHREADS
    ENTER;
    SAVETMPS;
    PUSHMARK(SP);
    XPUSHs(sv_2mortal(newSVpv("threads", 0)));
    PUTBACK;
    count = call_method("tid", G_SCALAR|G_EVAL);
    SPAGAIN;
    if (SvTRUE(ERRSV) || count != 1)
        /* if threads not loaded or an error occurs return 0 */
        tid = 0;
    else
        tid = (UV)POPi;
    PUTBACK;
    FREETMPS;
    LEAVE;
#endif
    return tid;
}

expands to

UV get_my_thread_id(void)
{
SV **sp = (((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Istack_sp);
UV tid = 0;
int count = 0;
Perl_push_scope(((PerlInterpreter *)pthread_getspecific(PL_thr_key)));
Perl_save_int(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), (int*)&(((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Itmps_floor)), (((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Itmps_floor) = (((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Itmps_ix);
(void)( { if (++(((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Imarkstack_ptr) == (((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Imarkstack_max)) Perl_markstack_grow(((PerlInterpreter *)pthread_getspecific(PL_thr_key))); *(((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Imarkstack_ptr) = (I32)((sp) - (((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Istack_base)); } );
((void)(__builtin_expect((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Istack_max) - sp < (int)(1),0) && (sp = Perl_stack_grow(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), sp,sp,(int) (1)))), *++sp = (Perl_sv_2mortal(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), Perl_newSVpv(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), "threads",0))));
(((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Istack_sp) = sp;
count = Perl_call_method(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), "tid",2|8);
sp = (((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Istack_sp);
if ((((*((0+((((PerlInterpreter
*)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv)))) && ((((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_flags & 0x00200000) ? Perl_sv_2bool_flags(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), (*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))),2) : ( !(((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_flags & (0x00000100|0x00000200|0x00000400|0x00000800| 0x00001000|0x00002000|0x00004000|0x00008000) || (((svtype)(((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? 
&((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_flags & 0xff)) == SVt_REGEXP || (((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_flags & (0xff|0x00004000|0x00008000|0x01000000)) == (SVt_PVLV|0x01000000))) ? 0 : (((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_flags & 0x00000400) ? ( ((XPV*)(((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv)))))->sv_any) && ( ((XPV*)(((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? 
&((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv)))))->sv_any)->xpv_cur > 1 || ( ((XPV*)(((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv)))))->sv_any)->xpv_cur && *((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_u.svu_pv != '0' ) ) ) : (((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_flags & (0x00000100|0x00000200)) ? ( ((((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? 
&((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_flags & 0x00000100) && ((XPVIV*) ((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_any)->xiv_u.xivu_iv != 0) || ((((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_flags & 0x00000200) && ((XPVNV*) ((*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? &((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))))->sv_any)->xnv_u.xnv_nv != 0.0)) : (Perl_sv_2bool_flags(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), (*((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv ? 
&((0+((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv))->sv_u.svu_gp)->gp_sv) : &((0+(Perl_gv_add_by_type(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Ierrgv)),SVt_NULL))->sv_u.svu_gp)->gp_sv))),0))))) || count != 1) tid = 0; else tid = (UV)((IV)({SV *_sv = ((SV *)({ void *_p = ((*sp--)); _p; })); ((((_sv)->sv_flags & (0x00000100|0x00200000)) == 0x00000100) ? ((XPVIV*) (_sv)->sv_any)->xiv_u.xivu_iv : Perl_sv_2iv_flags(((PerlInterpreter *)pthread_getspecific(PL_thr_key)), _sv,2)); })); (((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Istack_sp) = sp; if ((((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Itmps_ix) > (((PerlInterpreter *)pthread_getspecific(PL_thr_key))->Itmps_floor)) Perl_free_tmps(((PerlInterpreter *)pthread_getspecific(PL_thr_key))); Perl_pop_scope(((PerlInterpreter *)pthread_getspecific(PL_thr_key))); return tid; }
The change of the jmpbuf size requires that all packages that exchange jmpbufs are upgraded at once. The sequence you describe above:

"reduced reproducer
- install F-20
- update glibc from http://fedora.danny.cz/s390/glibc-2.18.90-20.fc21.dh.1/ - it is glibc-2.18.90-20.fc21 + commit 93a45ff1
- rpmbuild --rebuild http://fedora.danny.cz/s390/perl-Net-SSLeay-1.58-1.fc21.src.rpm"

fails since you have the perl-Net* package built with the new jmpbuf and perl itself with the old. They both expect different sizes for the same data type.

I'm not sure where the original failure came from but it probably has to do with the order in which you've upgraded the packages.

I haven't figured out all the details yet but I can confirm that just commenting out the additional fields in /usr/include/bits/setjmp.h makes the problem disappear.
Thanks, Andreas, your explanation makes sense. I'm going to dig into the perl itself first.
Andreas,

Let me make sure I understand. You're saying that code which exchanges jmp_bufs has to be upgraded in lock-step or it will (possibly silently) fail? In effect what we've got here is an ABI/API break across the glibc version, right?
Yes. This is an expected result of the jmpbuf extension. I've tried to minimize the effect by versioning all the accessor functions but symbol versioning is not available for data structures. In the end there is just one header file defining the structure. Code which uses the old header file is not compatible with code using the new header file in case jmpbufs are transferred between the two. This happened in the past already. E.g. for Power with the introduction of Altivec or with the long double 64->128 bit extension.
(In reply to Andreas Krebbel from comment #31)
> Yes.
>
> This is an expected result of the jmpbuf extension. I've tried to minimize
> the effect by versioning all the accessor functions but symbol versioning is
> not available for data structures. In the end there is just one header file
> defining the structure. Code which uses the old header file is not
> compatible with code using the new header file in case jmpbufs are
> transferred between the two.
>
> This happened in the past already. E.g. for Power with the introduction of
> Altivec or with the long double 64->128 bit extension.

We can do better though :-)

All of this could have been handled by using the compiler to generate a .gnu.attribute entry for the new ABI when such a structure was used. Then the static linker could generate a warning when linking mixed ABI objects (undefined + new ABI) or an error (old ABI + new ABI). This results in a much better user experience, and the .gnu.attributes track which ABI components are in use (look at ARM, which tracks the size of wchar_t).

Nobody likes to do this because it's work, and nobody has yet extended the compiler to do this kind of suppression of the "don't care" state to make objects as interoperable as possible.

Background reading:

Binutils documentation on attributes:
https://sourceware.org/binutils/docs-2.21/as/GNU-Object-Attributes.html#GNU-Object-Attributes

Discussion around "don't care attributes":
https://www.sourceware.org/ml/libc-alpha/2011-02/msg00130.html
(In reply to Carlos O'Donell from comment #32)
> All of this could have been handled by using the compiler to generate a
> .gnu.attribute entry for the new ABI when such a structure was used. Then
> the static linker could generate a warning when linking mixed ABI objects
> (undefined + new ABI) or an error (old ABI + new ABI). This results in a
> much better user experience and the .gnu.attributes track which ABI
> components are in use (look at ARM which tracks the size of wchar_t).

So far this has been used solely for indicating ABI relevant changes inflicted by compiler options. What you propose would be the first use for changes of Glibc data structures. It probably requires some more work to either
detect all usages of such data structures and compare their definitions within GCC to emit the proper flags
- or -
to provide a language level type attribute to put an abi tag on data structures which is then translated by GCC to the .gnu.attr... stuff (after tracking down all its embedded uses).

While I think that mechanism would have been useful for static linking, the situation with dynamic linking and Glibc data structures is a bit better since we have the accessor functions under control. Of course there might be somebody directly accessing a jmpbuf but that's hopefully a very rare case. Due to the symbol versioning of the accessor functions there are only few cases left where this is actually a problem. In general you can dynamically link two objects using different jmpbuf versions. They would use different sets of setjmp/longjmp symbols in glibc and all should be fine. Problems only occur if they pass jmpbuf objects to each other. So the mechanism above would trigger in too many cases to be useful I think.

Note: In fact even passing jmpbufs between .so's isn't a problem currently since the reserved fields are never accessed. The only problem we have right now is if:
1. a jmpbuf is embedded in another data structure (not being the last element)
2. that data structure is shared among modules assuming different jmpbuf sizes
(In reply to Dan Horák from comment #29)
> Thanks, Andreas, your explanation makes sense. I'm going to dig into the
> perl itself first.

To my understanding the problem is that a sigjmp_buf is embedded into the main perl interpreter structure.

cop.h:

struct jmpenv {
    struct jmpenv * je_prev;
    Sigjmp_buf      je_buf;        <---- jmpbuf
    int             je_ret;
    bool            je_mustcatch;
};
typedef struct jmpenv JMPENV;

intrpvar.h:

...
PERLVAR(I, top_env, JMPENV *)
PERLVAR(I, start_env, JMPENV)      <---- !!!
PERLVARI(I, errors, SV *, NULL)
...

The struct interpreter is passed to many .so's involved with perl via the my_perl argument. In one of the examples I've debugged, the problem arose from having perl-version built with the old glibc headers and perl itself with the new version. So the /usr/lib64/perl5/vendor_perl/auto/version/vxs/vxs.so module coming from perl-version used different offsets into the my_perl structure than perl itself.

If all the required perl .so files come from RPMs, rebuilding all of them at once should help. What I don't know is whether perl .so files dealing with struct interpreter might come in from other sources as well, like CPAN?!
Andreas,

I've written up "Packaging Changes" notes for this in upstream:
https://sourceware.org/glibc/wiki/Release/2.19#Packaging_Changes

Could you please check in a note to the 2.19 section of the NEWS file upstream stating that there is an ABI change for s390/s390x, and please also backport that to the active 2.19 branch (requires Allan McRae to sign off). This way we've covered our bases and made it clear in NEWS and release notes that there is a potential ABI issue coming down the pipe. I will work within Red Hat to get this information to all of our customers.

(In reply to Andreas Krebbel from comment #33)
> (In reply to Carlos O'Donell from comment #32)
> > All of this could have been handled by using the compiler to generate a
> > .gnu.attribute entry for the new ABI when such a structure was used. Then
> > the static linker could generate a warning when linking mixed ABI objects
> > (undefined + new ABI) or an error (old ABI + new ABI). This results in a
> > much better user experience and the .gnu.attributes track which ABI
> > components are in use (look at ARM which tracks the size of wchar_t).
>
> So far this has been used solely for indicating ABI relevant changes
> inflicted by compiler options. What you propose would be the first use for
> changes of Glibc data structures. It probably requires some more work to
> either
> detect all usages of such data structures and compare their definitions
> within GCC to emit the proper flags
> - or -
> to provide a language level type attribute to put an abi tag on data
> structures which is then translated by GCC to the .gnu.attr... stuff (after
> tracking down all its embedded uses).

That is correct. Nobody wants to be the first to attempt this :-) Worse is that this only works when building your application.
At runtime, if the library is updated, you need to use an ELF header flag (e_flags) bit or 2 bits to annotate the ABI change; this allows ldconfig to correctly discover and handle allowing old binaries to load new modules with the new ABI. Note that this is ABI markup at the object file level for runtime diagnostics, but we really want that data to live at the function and variable level and trickle up. Keeping the ABI markup at the function level for the runtime is probably too costly. Imagine the dynamic loader comparing function ABIs as it resolves PLT entries!

> While I think that mechanism would have been useful for static linking the
> situation with dynamic linking and Glibc data structures is a bit better
> since we have the accessor functions under control. Of course there might be
> somebody directly accessing a jmpbuf but that's hopefully a very rare case.
> Due to the symbol versioning of the accessor functions there are only few
> cases left where this is actually a problem. In general you can dynamically
> link two objects using different jmpbuf versions. They would use different
> sets of setjmp/longjmp symbols in glibc and all should be fine. Problems
> only occur if they pass jmpbuf objects to each other. So the mechanism above
> would trigger in too many cases to be useful I think.

That is correct, but this issue shows that it's actually common to run into these problems when changing the size of any of the structures exported for public use by glibc. Fixing the accessor macros never works perfectly. Too many applications simply embed the jmpbuf directly into another structure that is eventually used by newer compiled object code which expects the new size, and it fails. I expect Ruby is going to fail also since it embeds jmp_buf similarly.

> Note: In fact even passing jmpbufs between .so's isn't a problem currently
> since the reserved fields are never accessed. The only problem we have right
> now is if:
> 1. a jmpbuf is embedded in another data structure (not being the last
> element)
> 2. that data structure is shared among modules assuming different jmpbuf
> sizes

That is correct. Unfortunately this is much more common than you think.

Either way, if we need to extend jmp_buf and struct ucontext we need to do it. Our primary goals should be:
* Clear communication to our customers of both the benefits and the problems.
* Better diagnostics for mixing code that could result in an ABI breakage.

I think we can and should be doing better on that second bullet point.
(In reply to Andreas Krebbel from comment #34) > If all the required perl .so files come from RPMs rebuilding all of them at > once should help. What I don't know is whether perl .so files dealing with > struct interpreter might come in from other sources as well like CPAN?! We can only support those modules we build ourselves and distribute with RHEL. In that case we can make sure everything is rebuilt and works. What we can't guarantee is that an old module built by a user works correctly. So any user upgrading to say RHEL8 (hypothetical) will need to rebuild all of their perl modules because of the ABI breakage.
I have restarted rawhide builds and the change seems to be more severe than I thought originally. The perl stack is mixing old and rebuilt modules too often ...
(In reply to Dan Horák from comment #37)
> I have restarted rawhide builds and the change seems to be more severe than
> I thought originally. The perl stack is mixing old and rebuilt modules too
> often ...

Do you talk about perl.spec itself or about building a Perl package in general?

In the first case, this should not happen in minimal build root.

In the second case, you have to do the bootstrap. I.e. to rebuild the packages in dependency order and with defined perl_bootstrap spec macro and with changed rebuild_from_scratch macro in perl.spec and you have to treat dual-living packages specially.
(In reply to Petr Pisar from comment #38)
> (In reply to Dan Horák from comment #37)
> > I have restarted rawhide builds and the change seems to be more severe than
> > I thought originally. The perl stack is mixing old and rebuilt modules too
> > often ...
>
> Do you talk about perl.spec itself or about building a Perl package in
> general?

The perl-5.18.2-297.fc21 build went fine, thanks for the fix. The problem lies in apps that use perl (e.g. automake) or in additional perl modules.

> In the first case, this should not happen in minimal build root.
>
> In the second case, you have to do the bootstrap. I.e. to rebuild the
> packages in dependency order and with defined perl_bootstrap spec macro and
> with changed rebuild_from_scratch macro in perl.spec and you have to treat
> dual-living packages specially.

Yeah, I'm thinking about some kind of bootstrap. Unfortunately doing such a thing solely for a secondary arch is difficult, so I'm considering the options.
This is now fixed, as IBM have reverted their patches and we've synchronized with upstream.