Bug 2057391 - kexec-tools built with gcc 12 will fail kexec/kdump jumping to 2nd kernel with kexec_load interface
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kexec-tools
Version: 36
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Baoquan He
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 2056876
Depends On:
Blocks:
Reported: 2022-02-23 10:21 UTC by Baoquan He
Modified: 2022-05-07 04:18 UTC (History)

Fixed In Version: kexec-tools-2.0.23-6.fc36
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-07 04:18:42 UTC
Type: Bug
Embargoed:
bcotton: fedora_prioritized_bug-


Links
System                  ID      Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   FC-398  0        None      None    None     2022-02-23 10:28:31 UTC

Description Baoquan He 2022-02-23 10:21:05 UTC
Description of problem:
kdump fails to jump to the 2nd kernel when the kexec_load interface is used, while it works well with kexec_file_load. Still debugging to see what's going on. kexec reboot also fails to jump and just goes through firmware to reboot.

[root@dell-pet610-01 ~]# 
[root@dell-pet610-01 ~]# [ 1284.865822] sysrq: Trigger a crash
[ 1284.869383] Kernel panic - not syncing: sysrq triggered crash
[ 1284.875163] CPU: 9 PID: 1348 Comm: bash Kdump: loaded Tainted: G          I      --------- ---  5.17.0-0.rc5.038101e6b2cd.103.test.fc36.x86_61
[ 1284.888182] Hardware name: Dell Inc. PowerEdge T610/0N028H, BIOS 6.4.0 07/23/2013
[ 1284.895651] Call Trace:
[ 1284.898093]  <TASK>
[ 1284.900191]  dump_stack_lvl+0x5d/0x78
[ 1284.903853]  panic+0x111/0x32d
[ 1284.906930]  sysrq_handle_crash+0x18/0x20
[ 1284.910936]  __handle_sysrq+0x17d/0x1e0
[ 1284.914775]  write_sysrq_trigger+0x44/0x50
[ 1284.918870]  proc_reg_write+0x47/0xa0
[ 1284.922534]  vfs_write+0x108/0x360
[ 1284.925936]  ? lock_release+0x2eb/0x410
[ 1284.929779]  ? syscall_enter_from_user_mode+0x2e/0x1c0
[ 1284.934924]  ksys_write+0x5b/0xb0
[ 1284.938246]  do_syscall_64+0x43/0x90
[ 1284.941823]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1284.946869] RIP: 0033:0x7f617ab63027
[ 1284.950451] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 4
[ 1284.969190] RSP: 002b:00007ffc3060d998 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 1284.976752] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f617ab63027
[ 1284.983875] RDX: 0000000000000002 RSI: 0000563a892fbda0 RDI: 0000000000000001
[ 1284.990998] RBP: 0000563a892fbda0 R08: 0000000000000000 R09: 0000000000000073
[ 1284.998120] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
[ 1285.005242] R13: 00007f617ac555a0 R14: 0000000000000002 R15: 00007f617ac55780
[ 1285.012397]  </TASK>




Comment 1 Baoquan He 2022-03-08 02:14:01 UTC
After investigation, kdump failing to switch into the 2nd kernel is caused by the gcc 12 upgrade. In Fedora Rawhide, gcc has been upgraded from 11 to 12, and kexec-tools built with gcc 12 fails the switch when using the kexec_load interface. With the kexec_file_load interface, kdump and kexec reboot both work well. The kexec reboot failure with kexec_load has the same cause.

The root cause of why kexec-tools built with gcc 12 fails to jump into the 2nd kernel is not yet clear.

Comment 2 Baoquan He 2022-03-08 02:14:55 UTC
By the way, whether the kernel itself is built with gcc 12 does not affect the kdump/kexec jump.

Comment 3 Baoquan He 2022-03-08 02:39:44 UTC
*** Bug 2056876 has been marked as a duplicate of this bug. ***

Comment 4 Dave Young 2022-03-30 07:43:22 UTC
Note: kexec reboot also fails, resetting to BIOS, with the test steps below:


kexec -l /boot/vmlinuz-`uname -r` --initrd /boot/initramfs-`uname -r`.img --reuse-cmdline
reboot

Expected result: reboot into the new kernel without going through BIOS.
Actual result: the machine resets to BIOS and a hard reboot happens.

An upstream kexec-tools patch was posted this week and should be merged soon.
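
Since only the kexec_load path is reported as failing, it may help to pin the interface explicitly when repeating the test steps above. The following is a minimal sketch, not part of the original report: kexec_load_cmd is a hypothetical helper that only prints the command line rather than running it, and it assumes the -c (--kexec-syscall) and -s (--kexec-file-syscall) options described in kexec(8), plus the same /boot paths used in the steps above.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): print, but do not run,
# the kexec load command for one syscall interface.
#   $1 = "load" (legacy kexec_load) or "file" (kexec_file_load)
#   $2 = kernel release; defaults to the running kernel
kexec_load_cmd() {
    iface=$1
    krel=${2:-$(uname -r)}
    case "$iface" in
        load) flag=-c ;;   # --kexec-syscall: use kexec_load only
        file) flag=-s ;;   # --kexec-file-syscall: use kexec_file_load only
        *)
            echo "usage: kexec_load_cmd load|file [kernel-release]" >&2
            return 1
            ;;
    esac
    printf 'kexec %s -l /boot/vmlinuz-%s --initrd /boot/initramfs-%s.img --reuse-cmdline\n' \
        "$flag" "$krel" "$krel"
}

# Show both variants for the running kernel.
kexec_load_cmd load
kexec_load_cmd file
```

Running one of the printed commands as root and then rebooting, as in the steps above, should show whether the failure is specific to the kexec_load path on a given build.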

Comment 5 Dave Young 2022-03-30 07:52:09 UTC
Patch link: http://lists.infradead.org/pipermail/kexec/2022-March/024408.html

Comment 6 Matthew Miller 2022-04-06 14:26:38 UTC
Hi Dave! Can you clarify what action would be helpful as a prioritized bug? Is it concern with getting that patch into the Fedora package, or with testing it? Does this affect F36 in addition to Rawhide (and if so, would a freeze exception be a good idea?)?

Comment 7 Ben Cotton 2022-04-06 15:07:01 UTC
In today's Prioritized Bugs meeting, we agreed to defer a decision on this bug pending the input requested in comment 6
https://meetbot.fedoraproject.org/fedora-meeting-1/2022-04-06/fedora_prioritized_bugs_and_issues.2022-04-06-14.01.log.html#l-72

Comment 8 Dave Young 2022-04-07 08:42:39 UTC
Hi Matthew,

I do not know the exact Fedora process; I just added the flag so that people are aware of this bug, and I hope we can fix it in F36 :)

Coiby, the kexec-tools Fedora maintainer, said he has merged the fixes into the Fedora 36 branch and made a build today.

Yes, it affects F36 as well as rawhide, so I moved the bz to F36. For rawhide we can do a rebase later to pick up the fixes automatically.

An exception is good if the process requires it to be added in F36.

Thanks
Dave

Comment 9 Coiby 2022-04-07 09:00:05 UTC
Yes, kexec-tools-2.0.23-6.fc36 [1] has been released to fix this bug.

[1] https://koji.fedoraproject.org/koji/buildinfo?buildID=1941936

Comment 10 Fedora Update System 2022-04-07 09:15:20 UTC
FEDORA-2022-c7080eb130 has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-c7080eb130

Comment 11 Fedora Update System 2022-04-07 18:01:27 UTC
FEDORA-2022-c7080eb130 has been pushed to the Fedora 36 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-c7080eb130`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-c7080eb130

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
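
To confirm that an installed kexec-tools is at least the build named in "Fixed In Version", something like the following could be used. This is only a sketch: has_fix is a hypothetical helper, and it assumes GNU sort -V orders version-release strings closely enough to RPM's own comparison for this check, which may not hold for every release string.

```shell
#!/bin/sh
# Build that carries the fix, per "Fixed In Version" in this bug.
FIXED="2.0.23-6.fc36"

# Hypothetical helper: succeed when version-release string $1 sorts at or
# after FIXED. Assumes GNU `sort -V`; this approximates RPM ordering.
has_fix() {
    lowest=$(printf '%s\n%s\n' "$FIXED" "$1" | sort -V | head -n 1)
    [ "$lowest" = "$FIXED" ]
}

# Query the installed package; fall back to empty if rpm or the package is missing.
installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' kexec-tools 2>/dev/null) || installed=""

if [ -z "$installed" ]; then
    echo "kexec-tools not found via rpm"
elif has_fix "$installed"; then
    echo "kexec-tools $installed is at or after $FIXED"
else
    echo "kexec-tools $installed predates $FIXED"
fi
```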

Comment 12 Ben Cotton 2022-04-20 14:37:32 UTC
In today's Prioritized Bugs meeting, we agreed to reject this as a Prioritized Bug as a fix is already in the updates-testing repo and it does not seem to affect a large number of Fedora Linux users. 

https://meetbot.fedoraproject.org/fedora-meeting-1/2022-04-20/fedora_prioritized_bugs_and_issues.2022-04-20-14.01.log.html#l-52

Comment 13 Fedora Update System 2022-05-07 04:18:42 UTC
FEDORA-2022-c7080eb130 has been pushed to the Fedora 36 stable repository.
If problem still persists, please make note of it in this bug report.

