Created attachment 1335333 [details]
full boot log

I have prepared a F-27 installer image (from https://kojipkgs.fedoraproject.org/compose/branched/Fedora-27-20171004.n.0/compose/Everything/s390x/os/) and now I get a segfault when kernel switches to user-space.

================================================================================
[ 3.786004] Key type big_key registered
[ 3.787436] Key type encrypted registered
[ 3.787720] Freeing unused kernel memory: 664K
[ 3.787725] Write protected read-only-after-init data: 20k
[ 3.787728] rodata_test: all tests were successful
[ 3.790779] User process fault: interruption code 0013 ilc:3 in libpthread-2.26.so[3ff93c00000+1b000]
[ 3.790786] CPU: 2 PID: 1 Comm: /init Not tainted 4.13.4-300.fc27.s390x #1
[ 3.790788] Hardware name: IBM 2827 H43 400 (z/VM 6.4.0)
[ 3.790789] task: 00000000fafc8000 task.stack: 00000000fafc4000
[ 3.790791] User PSW : 0705200180000000 000003ff93c14e70
[ 3.790792]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3
[ 3.790794] User GPRS: 0000000000000077 000003ff00000000 000003ff93144d48 000003ff93144d5e
[ 3.790795]            0000000000000000 0000000000000002 0000000000000000 000003ff00000000
[ 3.790796]            0000000000000000 0000000000000418 0000000000000000 000003ffcc9fe770
[ 3.790797]            000003ff93d28f50 000003ff9310acf0 000003ff92b0319a 000003ffcc9fe6d0
[ 3.790805] User Code: 000003ff93c14e62: 60e0b030      std    %f14,48(%r11)
[ 3.790805]            000003ff93c14e66: 60f0b038      std    %f15,56(%r11)
[ 3.790805]           #000003ff93c14e6a: e5600000ff0e  tbegin 0,65294
[ 3.790805]           >000003ff93c14e70: a7740006      brc    7,3ff93c14e7c
[ 3.790805]            000003ff93c14e74: a7080000      lhi    %r0,0
[ 3.790805]            000003ff93c14e78: a7f40023      brc    15,3ff93c14ebe
[ 3.790805]            000003ff93c14e7c: b2220000      ipm    %r0
[ 3.790805]            000003ff93c14e80: 8800001c      srl    %r0,28
[ 3.790819] Last Breaking-Event-Address:
[ 3.790821] [<000003ff93c14de4>] 0x3ff93c14de4
[ 3.790950] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[ 3.790950]
[ 3.790952] CPU: 2 PID: 1 Comm: /init Not tainted 4.13.4-300.fc27.s390x #1
[ 3.790953] Hardware name: IBM 2827 H43 400 (z/VM 6.4.0)
[ 3.790953] Call Trace:
RUNNING DEVEL2

I have a guest running F-27 without problem, but what changed is that z/VM 6.4 hypervisor (updated last weekend to 6.4) now exposes the Transactional Execution bit (TE) to z/VM guests. It wasn't the case with the previously installed z/VM (6.1 or 6.3?)

Version-Release number of selected component (if applicable):
glibc-2.26-8.fc27.s390x
I find it suspicious that this is after a 'tbegin' instruction has started executing a transactional region. Exactly what hardware is this and does it claim to support HWCAP_S390_TE?
(In reply to Carlos O'Donell from comment #1)
> I find it suspicious that this is after a 'tbegin' instruction has started
> executing a transactional region.
>
> Exactly what hardware is this and does it claim to support HWCAP_S390_TE?

It's RH zEC12 which supports TE, and I read in z/VM 6.4 news that it brings "Guest Transactional Execution support". That's a change in our environment since last week.
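One way to see whether a running s390x guest reports TE is to look at the "features" line of /proc/cpuinfo, where the kernel lists hardware capabilities by name and "te" corresponds to HWCAP_S390_TE. The following is only a minimal sketch: the helper name has_te_feature and the SAMPLE text are hypothetical illustrations, and the sample line was not captured from the machine in this report.

```python
def has_te_feature(cpuinfo_text):
    """Return True if the 'features' line of the given cpuinfo text lists 'te'.

    Assumes the s390x /proc/cpuinfo layout, where hardware capabilities
    appear as space-separated names after 'features :'.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "features":
            # Compare whole words so that e.g. 'stfle' does not match 'te'.
            return "te" in value.split()
    return False


# Illustrative sample only; not taken from the reported system.
SAMPLE = "features\t: esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te\n"

print(has_te_feature(SAMPLE))
```

On a real guest the function would be fed the contents of /proc/cpuinfo instead of SAMPLE.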
(In reply to Dan Horák from comment #2)
> (In reply to Carlos O'Donell from comment #1)
> > I find it suspicious that this is after a 'tbegin' instruction has started
> > executing a transactional region.
> >
> > Exactly what hardware is this and does it claim to support HWCAP_S390_TE?
>
> It's RH zEC12 which supports TE, and I read in z/VM 6.4 news that it brings
> "Guest Transactional Execution support". That's a change in our environment
> since last week.

Is there any way to disable TE at the hardware level so the kernel doesn't report it and then see if this fixes the boot issue?

Otherwise I will have to rebuild F27 glibc for s390x with elision turned off until I get the upstream tunables in place.
(In reply to Carlos O'Donell from comment #3)
> (In reply to Dan Horák from comment #2)
> > (In reply to Carlos O'Donell from comment #1)
> > > I find it suspicious that this is after a 'tbegin' instruction has started
> > > executing a transactional region.
> > >
> > > Exactly what hardware is this and does it claim to support HWCAP_S390_TE?
> >
> > It's RH zEC12 which supports TE, and I read in z/VM 6.4 news that it brings
> > "Guest Transactional Execution support". That's a change in our environment
> > since last week.
>
> Is there any way to disable TE at the hardware level so the kernel doesn't
> report it and then see if this fixes the boot issue?
>
> Otherwise I will have to rebuild F27 glibc for s390x with elision turned off
> until I get the upstream tunables in place.

... if that's the issue.
So with a glibc build that correctly disables lock elision (https://koji.fedoraproject.org/koji/taskinfo?taskID=22348456), the boot of the installation image continues correctly, without the segfault ...
The installation then succeeds: it installs glibc-2.26-8.fc27.s390x from the Fedora repos, and the installed system boots without an issue.
For the record, the Rawhide compose has the same problem (https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20171011.n.0/compose/Server/s390x/os/images/).
Fixed scratch build with --disable-lock-elision: https://koji.fedoraproject.org/koji/taskinfo?taskID=22424738
It goes without saying that we are very interested in the root cause analysis of this issue with input from IBM, since this should "just work" (tm).
Scratch build passes and final libpthread.so.0 has no tbegin/tend. Final F27 build here: https://koji.fedoraproject.org/koji/taskinfo?taskID=22435457
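A check like "has no tbegin/tend" could be scripted by scanning a disassembly listing for the transactional-execution mnemonics. The sketch below assumes the listing comes from something like `objdump -d libpthread.so.0` and uses a hypothetical helper name, find_te_instructions; it is not necessarily how the build above was verified. The sample lines are adapted from the "User Code" section of the boot log in the original report.

```python
import re

# Transaction begin/end mnemonics; tbeginc is the constrained variant.
TE_MNEMONICS = re.compile(r"\b(tbegin|tbeginc|tend)\b")


def find_te_instructions(disassembly_text):
    """Return the disassembly lines that contain a tbegin/tbeginc/tend mnemonic."""
    return [
        line.strip()
        for line in disassembly_text.splitlines()
        if TE_MNEMONICS.search(line)
    ]


# Sample adapted from the boot log; a build without lock elision
# would be expected to yield an empty list for libpthread.
SAMPLE_DISASSEMBLY = """\
3ff93c14e66: 60f0b038      std    %f15,56(%r11)
3ff93c14e6a: e5600000ff0e  tbegin 0,65294
3ff93c14e70: a7740006      brc    7,3ff93c14e7c
"""

for hit in find_te_instructions(SAMPLE_DISASSEMBLY):
    print(hit)
```

An empty result for the whole library would match the "no tbegin/tend" observation.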
(In reply to Carlos O'Donell from comment #10)
> Scratch build passes and final libpthread.so.0 has no tbegin/tend.
>
> Final F27 build here:
> https://koji.fedoraproject.org/koji/taskinfo?taskID=22435457

Dan,

Can you please check these builds and see if they work and get back to me quickly? The sooner I hear back the faster I'll put this into a Bodhi update for F27.

Thanks!
glibc-2.26-14.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2017-0d3fdd3d1f
There is (only) one difference I'm aware of between the working and failing scenarios: the installer boot is initiated from the CMS shell using the virtual card reader, while the installed system starts from CP using a DASD.
Proposed as a Freeze Exception for 27-final by Fedora user sharkcz using the blocker tracking app because: The installer doesn't boot on an s390x system when glibc with lock elision is used.
(In reply to Carlos O'Donell from comment #11)
> (In reply to Carlos O'Donell from comment #10)
> > Scratch build passes and final libpthread.so.0 has no tbegin/tend.
> >
> > Final F27 build here:
> > https://koji.fedoraproject.org/koji/taskinfo?taskID=22435457
>
> Dan,
>
> Can you please check these builds and see if they work and get back to me
> quickly? The sooner I hear back the faster I'll put this into a Bodhi update
> for F27.

Thanks for the update, the installer image boots with your glibc build, karma will follow.
glibc-2.26-14.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-0d3fdd3d1f
+1 FE
glibc-2.26-15.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2017-0d3fdd3d1f
glibc-2.26-15.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-0d3fdd3d1f
Discussed during the 2017-10-23 blocker review meeting: [1] The decision to classify this bug as an AcceptedFreezeException was made as this breaks install boot on a non-blocking arch and can't be fixed via an update. [1] https://meetbot.fedoraproject.org/fedora-blocker-review/2017-10-23/f27-blocker-review.2017-10-23-16.00.txt
glibc-2.26-15.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.
Went stable, closing bug. Please re-open if anything somehow still needs doing here.