Kdump failed for kernel 6.5.0-0.rc1.11.fc39.ppc64le.

[ 20.681017] systemd[1]: Starting kdump-capture.service - Kdump Vmcore Save Service...
[ 20.875696] kdump.sh[429]: kdump: saving to /sysroot/var/crash/127.0.0.1-2023-07-12-03:35:02/
[ 20.935902] kdump.sh[429]: kdump: saving vmcore-dmesg.txt to /sysroot/var/crash/127.0.0.1-2023-07-12-03:35:02/
[ 20.938392] kdump.sh[474]: Cannot open /proc/vmcore: No such file or directory
[ 20.940384] kdump.sh[429]: kdump: saving vmcore-dmesg.txt failed
[ 20.940709] kdump.sh[429]: kdump: saving vmcore
[ 20.989322] kdump.sh[476]: open_dump_memory: Can't open the dump memory(/proc/vmcore). No such file or directory
[ 20.996785] kdump.sh[476]: makedumpfile Failed.
[ 20.997576] kdump.sh[429]: kdump: saving vmcore failed, exitcode:1
[ 20.997868] kdump.sh[429]: kdump: saving vmcore failed
[ 21.038209] kdump.sh[429]: kdump: saving the /run/initramfs/kexec-dmesg.log to /sysroot/var/crash/127.0.0.1-2023-07-12-03:35:02///
[ 21.046453] systemd[1]: kdump-capture.service: Main process exited, code=exited, status=1/FAILURE
[ 21.046806] systemd[1]: kdump-capture.service: Failed with result 'exit-code'.

Reproducible: Always

Steps to Reproduce:
1. dnf install kexec-tools kernel-modules -y
2. reboot
3. systemctl start kdump
4. trigger kernel crash

Actual Results:
kdump failed to save the kernel coredump.

Expected Results:
kdump successfully saves the kernel coredump.

Originally reported by the CoreOS team: https://github.com/coreos/fedora-coreos-tracker/issues/1523
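For anyone triaging similar console logs, the failure signature above can be picked out mechanically. The following is only a minimal sketch: the function name `summarize_kdump_log` and the dictionary keys are made up for illustration, and the patterns are copied from the kdump.sh messages quoted in this report, so they may not match other kexec-tools versions.

```python
import re

# Patterns copied from the kdump.sh messages quoted in this report.
# They are assumptions about the message format, not a stable interface.
SAVE_DIR_RE = re.compile(r"kdump: saving to (\S+)")
EXIT_RE = re.compile(r"kdump: saving vmcore failed, exitcode:(\d+)")
MISSING_VMCORE = "Cannot open /proc/vmcore: No such file or directory"


def summarize_kdump_log(text):
    """Summarize a kdump-capture console log (hypothetical helper).

    Returns the target directory, the exit code reported by kdump.sh,
    and whether /proc/vmcore was reported as missing.
    """
    save_dir = SAVE_DIR_RE.search(text)
    exit_code = EXIT_RE.search(text)
    return {
        "save_dir": save_dir.group(1) if save_dir else None,
        "exit_code": int(exit_code.group(1)) if exit_code else None,
        "vmcore_missing": MISSING_VMCORE in text,
    }


# Excerpt of the log from this report.
SAMPLE = (
    "[ 20.875696] kdump.sh[429]: kdump: saving to "
    "/sysroot/var/crash/127.0.0.1-2023-07-12-03:35:02/\n"
    "[ 20.938392] kdump.sh[474]: Cannot open /proc/vmcore: "
    "No such file or directory\n"
    "[ 20.997576] kdump.sh[429]: kdump: saving vmcore failed, exitcode:1\n"
)

if __name__ == "__main__":
    print(summarize_kdump_log(SAMPLE))
```

A missing /proc/vmcore together with a non-zero exit code points at the capture kernel never exposing the dump, rather than at a disk or permission problem in the target directory.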
I saw a similar problem in several CKI failure reports. The logs also suggest it may be caused by a corrupted elfcorehdr, as shown below:

[ 0.148565] Warning: Core image elf header is not sane
[ 0.148570] Kdump: vmcore not initialized

Please see the test_console.log from one failed CKI case:
https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/930627185/4650966145/redhat%3A930627185/build_ppc64le_redhat%3A930627185-ppc64le-kernel/tests/2/results_0001/job.01/recipes/14221455/tasks/7/logs/test_console.log

Thanks
Baoquan
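A quick way to check other console logs for the same symptom is to look for the two early-boot messages quoted in this comment. This is a rough sketch only; `vmcore_init_failed` is a hypothetical helper name, and the message strings are taken verbatim from the log excerpt in this comment.

```python
# The two messages quoted in this comment; their exact wording is an
# assumption that may not hold for other kernel versions.
ELFCOREHDR_MESSAGES = (
    "Warning: Core image elf header is not sane",
    "Kdump: vmcore not initialized",
)


def vmcore_init_failed(console_log):
    """Return the subset of the known messages found in the log text."""
    return [msg for msg in ELFCOREHDR_MESSAGES if msg in console_log]


if __name__ == "__main__":
    sample = (
        "[ 0.148565] Warning: Core image elf header is not sane\n"
        "[ 0.148570] Kdump: vmcore not initialized\n"
    )
    # Both messages present means the capture kernel rejected the
    # ELF core header and never created /proc/vmcore.
    print(vmcore_init_failed(sample))
```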
I did a git bisection using kernel-auto-bisect [1], and the first bad commit is 606787fed7268feb256957872586370b56af697a "powerpc/64s: Remove support for ELFv1 little endian userspace".

[1] https://gitlab.com/redhat/centos-stream/src/kernel/utils/tools/-/tree/main/kernel-auto-bisect
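For readers unfamiliar with bisection, the idea is a binary search over an ordered commit history. The sketch below shows only that general idea; it is not how kernel-auto-bisect is implemented (this thread does not describe its internals), and the commit labels other than the abbreviated bad commit are placeholders invented for the example.

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    Assumes `commits` is ordered oldest to newest, the last entry is bad,
    and once a commit is bad every later commit is bad as well.
    `is_bad` is a caller-supplied test, e.g. "does kdump fail on this build?".
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[hi]


if __name__ == "__main__":
    # Placeholder history; only "606787fed726" comes from this report.
    history = ["good-a", "good-b", "good-c", "606787fed726", "later-a", "later-b"]
    bad_from = history.index("606787fed726")
    print(first_bad(history, lambda c: history.index(c) >= bad_from))
```

Each step halves the remaining range, which is why bisecting a kernel release candidate needs only a modest number of build-and-crash cycles even across thousands of commits.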
Created attachment 1976289
untested patch

Thanks Coiby for bisecting. If any of you has a machine available, could you try the untested patch and see if it works?
I tried Fedora 38, but it has a significant bug that panics the kernel while compiling the kernel. I then reinstalled the bare-metal machine with RHEL-9 and tested the latest upstream kernel, but the kdump kernel hits a different kind of panic:

[ 21.619230] usb 2-4: new SuperSpeed USB device number 2 using xhci_hcd [ 21.671227] usb 2-4: New USB device found, idVendor=0451, idProduct=8140, bcdDevice= 1.00 [ 21.671248] usb 2-4: New USB device strings: Mfr=0, Product=0, SerialNumber=0 [ 37.349138] watchdog: CPU 68 detected hard LOCKUP on other CPUs 0 [ 37.349156] watchdog: CPU 68 TB:10136783097847, last SMP heartbeat TB:10128591098849 (15999ms ago) [ 37.349307] watchdog: CPU 0 Hard LOCKUP [ 37.349310] watchdog: CPU 0 TB:10136783184586, last heartbeat TB:10126477817591 (20127ms ago) [ 37.349313] Modules linked in: [ 37.349317] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.5.0-rc2+ #1 [ 37.349322] Hardware name: 9006-22P POWER9 (raw) 0x4e1202 opal:skiboot-v6.0.23 PowerNV [ 37.349324] NIP: 0000000030005104 LR: c0000000080cea00 CTR: c0000000080d7360 [ 37.349327] REGS: c000000107be3d60 TRAP: 0100 Not tainted (6.5.0-rc2+) [ 37.349330] MSR: 9000000000081002 <SF,HV,ME,RI> CR: 22004484 XER: 0000005b [ 37.349340] CFAR: 000000003000510c IRQMASK: 3 [ 37.349340] GPR00: 0000000000000009 c000000107e63cd0 0000000030000000 00000000000ffff6 [ 37.349340] GPR04: c000000107e63e20 0000000000000040 3ffffffff1ae9700 000000000000000e [ 37.349340] GPR08: c00000000e516950 0000000000000000 0000000000000000 0000000000000001 [ 37.349340] GPR12: 0000000031ee0000 c000000107fef480 0000000000000000 0000000000000000 [ 37.349340] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 37.349340] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 37.349340] GPR24: 0000000000000009 0000000000000003 0000000000000000 0000000000000000 [ 37.349340] GPR28: c00000000e516950 000000000000000e 3ffffffff1ae9700 c000000107e63e20 [ 37.349394] NIP [0000000030005104] 
0x30005104 [ 37.349399] LR [c0000000080cea00] opal_return+0x0/0x30 [ 37.349406] Call Trace: [ 37.349407] [c000000107e63cd0] [c0000000080cc024] opal_call+0xe4/0x1c0 (unreliable) [ 37.349416] [c000000107e63d90] [c0000000080cc468] opal_handle_interrupt+0x28/0x40 [ 37.349423] [c000000107e63e00] [c0000000080d739c] opal_interrupt+0x3c/0xa0 [ 37.349430] [c000000107e63e30] [c0000000082029f8] __handle_irq_event_percpu+0x88/0x230 [ 37.349437] [c000000107e63ed0] [c000000008202cb4] handle_irq_event+0x74/0x130 [ 37.349444] [c000000107e63f00] [c00000000820a86c] handle_fasteoi_irq+0xbc/0x350 [ 37.349450] [c000000107e63f40] [c000000008200910] generic_handle_irq+0x50/0x80 [ 37.349456] [c000000107e63f60] [c000000008017318] __do_irq+0xb8/0x230 [ 37.349462] [c000000107e63fe0] [c000000008017c68] __do_IRQ+0x88/0xe0 [ 37.349468] [c00000000e733b10] [0000000000000000] 0x0 [ 37.349472] [c00000000e733b50] [c000000008017d10] do_IRQ+0x50/0xb0 [ 37.349478] [c00000000e733b80] [c00000000800b63c] h_virt_irq_common_virt+0x28c/0x290 [ 37.349486] --- interrupt: ea0 at arch_local_irq_restore.part.0+0x188/0x190 [ 37.349492] NIP: c000000008038098 LR: c000000008fc380c CTR: c0000000080d26e0 [ 37.349494] REGS: c00000000e733bb0 TRAP: 0ea0 Not tainted (6.5.0-rc2+) [ 37.349497] MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28004484 XER: 0000005b [ 37.349514] CFAR: 0000000000000000 IRQMASK: 0 [ 37.349514] GPR00: c000000008fc380c c00000000e733e50 c000000009572c00 0000000000000000 [ 37.349514] GPR04: 0000000000000000 0000000000000000 c00000000ac42a80 c000000107fef480 [ 37.349514] GPR08: 00000000f8c90000 0000000000000000 0000000000008002 0000000028002822 [ 37.349514] GPR12: c0000000080d26e0 c000000107fef480 c000001ff4663f90 0000000000000000 [ 37.349514] GPR16: 0000000000000000 c00000000002d18c c00000000002d164 c0000000020100e4 [ 37.349514] GPR20: 0000000000000006 c000001ff4660000 0000000000000000 0000000000000001 [ 37.349514] GPR24: 0000000000000000 0000000030942298 0000000031ee00b0 
c00000000ac56760 [ 37.349514] GPR28: 0000000000000002 0000000000000003 0000000000000000 fcffffffffffffff [ 37.349568] NIP [c000000008038098] arch_local_irq_restore.part.0+0x188/0x190 [ 37.349574] LR [c000000008fc380c] default_idle_call+0x6c/0x140 [ 37.349579] --- interrupt: ea0 [ 37.349580] [c00000000e733e50] [c00000000e733e90] 0xc00000000e733e90 (unreliable) [ 37.349585] [c00000000e733e90] [c000000008fc380c] default_idle_call+0x6c/0x140 [ 37.349591] [c00000000e733eb0] [c0000000081ce0bc] cpuidle_idle_call+0x1bc/0x260 [ 37.349596] [c00000000e733f10] [c0000000081ce268] do_idle+0x108/0x1c0 [ 37.349601] [c00000000e733f60] [c0000000081ce558] cpu_startup_entry+0x38/0x40 [ 37.349606] [c00000000e733f90] [c00000000805f88c] start_secondary+0x24c/0x250 [ 37.349613] [c00000000e733fe0] [c00000000800e058] start_secondary_prolog+0x10/0x14 [ 37.349619] Code: 4c006c81 00000b2c 3c00e241 02000b2c 0c008240 feff6038 00010048 48006c81 ffff6b39 48006c91 780b217c 78fbff7f <4c006c81> 01000b2c f8ff8241 7813427c
On this bare-metal machine (ibm-p9b-26.ibm2.lab.eng.bos.redhat.com), I tried to reproduce this bug by checking out the first bad commit 606787fed7268feb256957872586370b56af697a "powerpc/64s: Remove support for ELFv1 little endian userspace". However, the compiled kernel panics during boot:

[ OK ] Finished Load/Save Random Seed. [ OK ] Finished Create Static Device Nodes in /dev. Starting Rule-based Manage…for Device Events and Files... [ OK ] Finished Monitoring of LVM… dmeventd or progress polling. [ OK ] Started Rule-based Manager for Device Events and Files. Starting Load Kernel Module configfs... [ OK ] Finished Load Kernel Module configfs. Starting Load Kernel Module fuse... [ OK ] Finished Load Kernel Module fuse. [ 10.694120] IPMI message handler: version 39.2 [ 10.771425] ipmi device interface [ 10.830605] ipmi-powernv ibm,opal:ipmi: IPMI message handler: Found new BMC (man_id: 0x002a7c, prod_id: 0x0985, dev_id: 0x20) [ 10.870743] at24 0-0050: 16384 byte 24c128 EEPROM, writable, 1 bytes/write [ 10.917802] at24 2-0050: 32768 byte 24c256 EEPROM, writable, 1 bytes/write [ 24.024955] watchdog: CPU 4 detected hard LOCKUP on other CPUs 6 [ 24.024980] watchdog: CPU 4 TB:19395148891436, last SMP heartbeat TB:19386956899153 (15999ms ago) [ 24.025121] watchdog: CPU 6 Hard LOCKUP [ 24.025123] watchdog: CPU 6 TB:19395148977304, last heartbeat TB:19386956898656 (16000ms ago) [ 24.025126] Modules linked in: at24 ipmi_powernv ofpart regmap_i2c ipmi_devintf powernv_flash opal_prd ibmpowernv ipmi_msghandler mtd xfs libcrc32c sd_mod t10_pi sg ast drm_kms_helper syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_shmem_helper drm i40e vmx_crypto aacraid drm_panel_orientation_quirks fuse [ 24.025157] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 6.4.0-rc2+ #2 [ 24.025161] Hardware name: 9006-22P POWER9 0x4e1202 opal:skiboot-v6.0.23 PowerNV [ 24.025162] NIP: 0000000030005104 LR: c0000000000cf300 CTR: c0000000000d7ad0 [ 24.025165] REGS: c000001fff3bbd60 TRAP: 0100 Not tainted (6.4.0-rc2+) [ 24.025168] 
MSR: 9000000000081002 <SF,HV,ME,RI> CR: 22004822 XER: 20040000 [ 24.025175] CFAR: 000000003000510c IRQMASK: 3 [ 24.025175] GPR00: 0000000000000009 c000001fff657850 0000000030000000 00000000000ffff6 [ 24.025175] GPR04: c000001fff6579a0 0000000000000000 0000000000000000 c00000000400cbe8 [ 24.025175] GPR08: c00000000400cb08 0000000000000000 c00000000400cbe0 0000000000000001 [ 24.025175] GPR12: 0000000031c30000 c000001fff6bc880 0000000000000000 0000000000000000 [ 24.025175] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 24.025175] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 24.025175] GPR24: 0000000000000009 0000000000000003 c00000000400cbe0 0000000000000000 [ 24.025175] GPR28: c00000000400cb08 c00000000400cbe8 0000000000000000 c000001fff6579a0 [ 24.025216] NIP [0000000030005104] 0x30005104 [ 24.025221] LR [c0000000000cf300] opal_return+0x0/0x30 [ 24.025227] Call Trace: [ 24.025228] [c000001fff657850] [c0000000000cc8a4] opal_call+0xe4/0x1c0 (unreliable) [ 24.025235] [c000001fff657910] [c0000000000ccce8] opal_handle_interrupt+0x28/0x40 [ 24.025240] [c000001fff657980] [c0000000000d7b0c] opal_interrupt+0x3c/0xa0 [ 24.025246] [c000001fff6579b0] [c000000000203808] __handle_irq_event_percpu+0x88/0x230 [ 24.025251] [c000001fff657a50] [c000000000203ac4] handle_irq_event+0x74/0x130 [ 24.025256] [c000001fff657a80] [c00000000020b3ac] handle_fasteoi_irq+0xbc/0x300 [ 24.025261] [c000001fff657ac0] [c0000000002018d0] generic_handle_irq+0x50/0x80 [ 24.025266] [c000001fff657ae0] [c000000000017f98] __do_irq+0xb8/0x230 [ 24.025271] [c000001fff657b60] [c000000000018918] __do_IRQ+0xb8/0xe0 [ 24.025275] [c000001fff657ba0] [c000000000018990] do_IRQ+0x50/0xb0 [ 24.025280] [c000001fff657bd0] [c00000000000b63c] h_virt_irq_common_virt+0x28c/0x290 [ 24.025286] --- interrupt: ea0 at arch_local_irq_restore.part.0+0x188/0x190 [ 24.025291] NIP: c0000000000386f8 LR: c000000000fbfb98 CTR: c000000000029310 [ 24.025293] REGS: c000001fff657c00 
TRAP: 0ea0 Not tainted (6.4.0-rc2+) [ 24.025295] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 22004822 XER: 20040000 [ 24.025305] CFAR: 0000000000000000 IRQMASK: 0 [ 24.025305] GPR00: c000000000fbfb98 c000001fff657ea0 c000000001552900 0000000000000000 [ 24.025305] GPR04: c000001ffa050400 ffffffffffffffff 0005f5e100000000 000000000083126f [ 24.025305] GPR08: 0000001ff7ef0000 0000000000000000 0000000000008002 0000000000004000 [ 24.025305] GPR12: c000000000029310 c000001fff6bc880 0000000000000000 0000000000000000 [ 24.025305] GPR16: 0000000000000001 c000000002ba2a80 0000000000000000 00000000ffff8f15 [ 24.025305] GPR20: c000000002167888 000000000000000a c0000000021f2000 0000000000000000 [ 24.025305] GPR24: 0000000000000000 0000001ff7ef0000 c000000003831680 c000000002bb61e0 [ 24.025305] GPR28: 0000000000000002 0000000000000003 c000000002160400 fcffffffffffffff [ 24.025346] NIP [c0000000000386f8] arch_local_irq_restore.part.0+0x188/0x190 [ 24.025350] LR [c000000000fbfb98] __do_softirq+0xe8/0x3dc [ 24.025355] --- interrupt: ea0 [ 24.025356] [c000001fff657ea0] [c000000003831680] 0xc000000003831680 (unreliable) [ 24.025360] [c000001fff657ee0] [c000000000fbfb98] __do_softirq+0xe8/0x3dc [ 24.025365] [c000001fff657fe0] [c000000000018a30] do_softirq_own_stack+0x40/0x60 [ 24.025370] [c0000000038b39f0] [c00000000015a268] __irq_exit_rcu+0x158/0x190 [ 24.025376] [c0000000038b3a20] [c00000000015adc0] irq_exit+0x20/0x40 [ 24.025381] [c0000000038b3a40] [c0000000000297c4] timer_interrupt+0x174/0x320 [ 24.025386] [c0000000038b3aa0] [c000000000009f8c] decrementer_common_virt+0x28c/0x290 [ 24.025391] --- interrupt: 900 at arch_local_irq_restore.part.0+0x110/0x190 [ 24.025396] NIP: c000000000038680 LR: c000000000038658 CTR: c0000000000291d0 [ 24.025398] REGS: c0000000038b3ad0 TRAP: 0900 Not tainted (6.4.0-rc2+) [ 24.025400] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24004822 XER: 00000000 [ 24.025410] CFAR: 0000000000000000 IRQMASK: 0 [ 24.025410] GPR00: c000000000038658 
c0000000038b3d70 c000000001552900 000000028a2a36f3 [ 24.025410] GPR04: 0000000000000001 ffffffffffffffff 0000000000000004 0000001ff7ef0000 [ 24.025410] GPR08: c000001ffa0decf8 0000000000000000 0000000000008002 0000000000004000 [ 24.025410] GPR12: c0000000000291d0 c000001fff6bc880 c000001ff44cff90 0000000000000000 [ 24.025410] GPR16: 0000000000000000 c00000000002d18c c00000000002d164 c0000000020100e4 [ 24.025410] GPR20: 0000000000000006 c000001ff44cc000 c000000002010030 0000000000000001 [ 24.025410] GPR24: 0000000000000000 0000000000000004 000000028aff879e 0000000000000004 [ 24.025410] GPR28: 0000000000000002 0000000000000003 0000000000000004 fcffffffffffffff [ 24.025450] NIP [c000000000038680] arch_local_irq_restore.part.0+0x110/0x190 [ 24.025454] LR [c000000000038658] arch_local_irq_restore.part.0+0xe8/0x190 [ 24.025458] --- interrupt: 900 [ 24.025459] [c0000000038b3db0] [c000000000fb3bf8] cpuidle_enter_state+0xf8/0x5d8 [ 24.025463] [c0000000038b3e50] [c000000000bd951c] cpuidle_enter+0x4c/0x70 [ 24.025468] [c0000000038b3e90] [c0000000001c778c] call_cpuidle+0x4c/0xa0 [ 24.025473] [c0000000038b3eb0] [c0000000001ceda8] cpuidle_idle_call+0x168/0x260 [ 24.025478] [c0000000038b3f10] [c0000000001cefa8] do_idle+0x108/0x1c0 [ 24.025483] [c0000000038b3f60] [c0000000001cf29c] cpu_startup_entry+0x3c/0x40 [ 24.025489] [c0000000038b3f90] [c00000000005feec] start_secondary+0x24c/0x250 [ 24.025494] [c0000000038b3fe0] [c00000000000e058] start_secondary_prolog+0x10/0x14 [ 24.025498] Code: 4c006c81 00000b2c 3c00e241 02000b2c 0c008240 feff6038 00010048 48006c81 ffff6b39 48006c91 780b217c 78fbff7f <4c006c81> 01000b2c f8ff8241 7813427c [ 70.924966] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 70.924990] rcu: 6-...0: (1 GPs behind) idle=a024/1/0x4000000000000002 softirq=177/179 fqs=2994 [ 70.925018] rcu: (detected by 16, t=6002 jiffies, g=-67, q=10478 ncpus=80) [ 70.925031] Sending NMI from CPU 16 to CPUs 6: [ 76.514210] CPU 6 didn't respond to backtrace IPI, inspecting paca. 
[ 76.514228] irq_soft_mask: 0x03 in_mce: 0 in_nmi: 0 current: 0 (swapper/6) [ 76.514249] Back trace of paca->saved_r1 (0xc0000000038b3c50) (possibly stale): [ 76.514262] Call Trace: [ 76.514270] rcu: rcu_sched kthread starved for 558 jiffies! g-67 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=5 [ 76.514295] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior. [ 76.514327] rcu: RCU grace-period kthread stack dump: [ 76.514344] task:rcu_sched state:I stack:0 pid:15 ppid:2 flags:0x00000000 [ 76.514369] Call Trace: [ 76.514375] [c0000000038f7a40] [c000001ff9fa2900] 0xc000001ff9fa2900 (unreliable) [ 76.514409] [c0000000038f7bf0] [c00000000001fcd0] __switch_to+0x130/0x220 [ 76.514443] [c0000000038f7c50] [c000000000fb4d58] __schedule+0x258/0x6d0 [ 76.514475] [c0000000038f7d20] [c000000000fb5244] schedule+0x74/0x140 [ 76.514506] [c0000000038f7d90] [c000000000fbdb34] schedule_timeout+0xa4/0x1d0 [ 76.514540] [c0000000038f7e60] [c000000000224eac] rcu_gp_fqs_loop+0x40c/0x540 [ 76.514574] [c0000000038f7f00] [c000000000229bd0] rcu_gp_kthread+0x190/0x200 [ 76.514608] [c0000000038f7f90] [c00000000018b018] kthread+0x138/0x140 [ 76.514640] [c0000000038f7fe0] [c00000000000dd58] start_kernel_thread+0x14/0x18 [ 76.514673] rcu: Stack dump where RCU GP kthread last ran: [ 76.514691] Sending NMI from CPU 16 to CPUs 5: [ 76.514712] NMI backtrace for cpu 5 [ 76.514733] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 6.4.0-rc2+ #2 [ 76.514772] Hardware name: 9006-22P POWER9 0x4e1202 opal:skiboot-v6.0.23 PowerNV [ 76.514821] NIP: c0000000000383bc LR: c0000000000386c8 CTR: c0000000000291d0 [ 76.514870] REGS: c0000000038f3be8 TRAP: 0a00 Not tainted (6.4.0-rc2+) [ 76.514906] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24004424 XER: 00000000 [ 76.514962] CFAR: 0000000000000000 IRQMASK: 0 [ 76.514962] GPR00: c0000000000386c8 c0000000038f3d70 c000000001552900 c0000000038f3bb8 [ 76.514962] GPR04: 00000011cfbdb9e1 ffffffffffffffff 002887fa00000000 0000000000000018 [ 
76.514962] GPR08: 0000000000003b08 0000000000000043 0000001ff7e50000 00000000000026f9 [ 76.514962] GPR12: c0000000000291d0 c000001fff7fc680 c000001ff44cbf90 0000000000000000 [ 76.514962] GPR16: 0000000000000000 c00000000002d18c c00000000002d164 c0000000020100e4 [ 76.514962] GPR20: 0000000000000006 c000001ff44c8000 c000000002010030 0000000000000001 [ 76.514962] GPR24: 0000000000000000 0000000000000004 00000011d09b71c4 0000000000000004 [ 76.514962] GPR28: 0000000000000002 0000000000000003 0000000000000004 fcffffffffffffff [ 76.515291] NIP [c0000000000383bc] __replay_soft_interrupts+0x3c/0x160 [ 76.515332] LR [c0000000000386c8] arch_local_irq_restore.part.0+0x158/0x190 [ 76.515371] Call Trace: [ 76.515390] [c0000000038f3d70] [c0000000000386c8] arch_local_irq_restore.part.0+0x158/0x190 (unreliable) [ 76.515441] [c0000000038f3db0] [c000000000fb3bf8] cpuidle_enter_state+0xf8/0x5d8 [ 76.515482] [c0000000038f3e50] [c000000000bd951c] cpuidle_enter+0x4c/0x70 [ 76.515520] [c0000000038f3e90] [c0000000001c778c] call_cpuidle+0x4c/0xa0 [ 76.515556] [c0000000038f3eb0] [c0000000001ceda8] cpuidle_idle_call+0x168/0x260 [ 76.515604] [c0000000038f3f10] [c0000000001cefa8] do_idle+0x108/0x1c0 [ 76.515645] [c0000000038f3f60] [c0000000001cf29c] cpu_startup_entry+0x3c/0x40 [ 76.515684] [c0000000038f3f90] [c00000000005feec] start_secondary+0x24c/0x250 [ 76.515734] [c0000000038f3fe0] [c00000000000e058] start_secondary_prolog+0x10/0x14 [ 76.515787] Code: 60000000 7c0802a6 f8010010 f821fe51 e92d0af8 f92101a8 39200000 38610028 892d0933 61290040 992d0933 48044359 <60000000> 39200000 e9410130 f9210160
(In reply to Dave Young from comment #3)
> Created attachment 1976289
> untested patch
>
> Thanks Coiby for bisecting. If any of you have the machine, could you try
> the untested patch see if it works?

Tested on ibm-p9z-06-lp9.khw3.lab.eng.bos.redhat.com. Before this patch, kdump does not work with the bad commit 606787fed7268feb256957872586370b56af697a "powerpc/64s: Remove support for ELFv1 little endian userspace". After this patch, the vmcore can be saved.
I have opened an upstream bug: https://bugzilla.kernel.org/show_bug.cgi?id=217702
This issue has been fixed upstream by 106ea7ffd56b ("Revert "powerpc/64s: Remove support for ELFv1 little endian userspace"").
Thanks Pingfan. Feel free to reopen this if needed.
We are seeing this issue again. We weren't running the `kdump.crash` test in FCOS Rawhide because of an unrelated SELinux policy issue, so we didn't catch this earlier. The kernel version transition that appears to have introduced it is `kernel-6.6.0-0.rc0.20230829git1c59d383390f.59.fc40` -> `kernel-doc-6.6.0-0.rc0.20230830git6c1b980a7e79.1.fc40`.

In short, the `kdump.crash` test in Rawhide:
Passes with `kernel-6.6.0-0.rc0.20230829git1c59d383390f.59.fc40`
Fails with `kernel-doc-6.6.0-0.rc0.20230830git6c1b980a7e79.1.fc40`
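The two package versions above appear to embed a snapshot date and an abbreviated upstream commit hash (the `<date>git<hash>` part), which would narrow the suspect range to one day of upstream commits. The sketch below extracts that range. It assumes this reading of the Fedora snapshot naming is correct, and the helper names `snapshot_info` and `suspect_range` are made up for illustration.

```python
import re

# Assumed layout of the release field: ".<YYYYMMDD>git<abbrev-hash>."
SNAPSHOT_RE = re.compile(r"\.(\d{8})git([0-9a-f]{7,40})\.")


def snapshot_info(nvr):
    """Return (date, abbreviated commit) from a package version string, or None."""
    match = SNAPSHOT_RE.search(nvr)
    if match is None:
        return None
    return match.group(1), match.group(2)


def suspect_range(good_nvr, bad_nvr):
    """Build a "good..bad" revision range usable with git log or git bisect."""
    good = snapshot_info(good_nvr)
    bad = snapshot_info(bad_nvr)
    if good is None or bad is None:
        raise ValueError("no git snapshot found in version string")
    return f"{good[1]}..{bad[1]}"


if __name__ == "__main__":
    good = "kernel-6.6.0-0.rc0.20230829git1c59d383390f.59.fc40"
    bad = "kernel-doc-6.6.0-0.rc0.20230830git6c1b980a7e79.1.fc40"
    print(suspect_range(good, bad))
```

If the hashes do refer to upstream commits, the resulting range could be the starting point for a new bisection.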
Let's open a new BZ. I think this is probably a new regression.