Description of problem:
During automated podman integration testing on F31 (and, I believe, also during beta) I've been tripping a kernel panic. It's happening on cloud VMs, so I've been procrastinating on the debugging, hoping it would clear up upon release. It has not, but it is reproducible, and I can modify or set up the VMs in whatever way is helpful for debugging.
Version-Release number of selected component (if applicable):
How reproducible:
Within about 20-25 minutes, using libpod CI automation setup
Steps to Reproduce:
1. Use the libpod repo from PR #3901, with proper Google Cloud credentials
2. $ hack/get_ci_vm.sh fedora-31-libpod-6322976592494592
3. # contrib/cirrus/integration_test.sh
Expected results:
No panic, even if one or more integration tests fail
Additional info:
Prior to the F31 release, our automated testing of libpod w/ CGroupsV2 (and crun) was limited to a temporary F30 setup. Upstream wants to migrate testing to the latest Fedora release to support ongoing libpod development.
I have full control over these VMs, can describe their current setup precisely, extract a live VM from the test environment as needed, and instrument them however needed to assist debugging.
Created attachment 1631145 [details]
Panic message from serial console
Created attachment 1631157 [details]
Integration tests ginkgo debug output
Created attachment 1631160 [details]
Created attachment 1631161 [details]
System journal from relevant boot
I'm setting up kexec on the VM that reproduced this, and will try to reproduce and capture a kernel core. Unless anyone has a better/easier idea.
Okay, tripped another panic, it has the exact same 'RIP: 0010:rb_erase+0x1b1/0x370' and similar call trace on the serial console. The VM appears to hang here, and doesn't automatically boot the dump kernel. I tried feeding a [break]-C to it but there is no response.
What am I forgetting?
No luck getting a core, something broken with kexec or my /etc/kdump.conf setup:
core_collector makedumpfile -l --message-level 1 -d 31
I created and formatted the sda2 partition as ext4 and have it mounted as /var/crash from fstab. The kdump service is enabled/active after reporting success building its special ramdisk. The pre/post scripts simply echo some text to stdout. I turned on /proc/sys/kernel/sysrq, then tried to manually test dumping:
Send [BREAK]c over serial console -> *bam* VM reboots kernel -> panics in ramdisk:
...cut kernel messages...
[ 2.168485] Freeing unused kernel image memory: 2272K
[ 2.169448] Write protecting the kernel read-only data: 20480k
[ 2.171306] Freeing unused kernel image memory: 2016K
[ 2.173010] Freeing unused kernel image memory: 1580K
[ 2.182054] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 2.183605] rodata_test: all tests were successful
[ 2.184715] x86/mm: Checking user space page tables
[ 2.194015] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 2.195655] Run /init as init process
[ 2.197587] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[ 2.198892] CPU: 0 PID: 1 Comm: init Not tainted 5.3.7-301.fc31.x86_64 #1
[ 2.200215] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 2.202507] Call Trace:
[ 2.203085] dump_stack+0x5c/0x80
[ 2.204165] panic+0x101/0x2d7
[ 2.204933] do_exit.cold+0x1a/0xd1
[ 2.205748] ? __do_sys_newstat+0x48/0x70
[ 2.206422] do_group_exit+0x3a/0xa0
[ 2.207206] __x64_sys_exit_group+0x14/0x20
[ 2.208367] do_syscall_64+0x5f/0x1a0
[ 2.209117] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2.210386] RIP: 0033:0x7f01408f118e
[ 2.210978] Code: Bad RIP value.
[ 2.211912] RSP: 002b:00007ffcd0e7f358 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[ 2.213386] RAX: ffffffffffffffda RBX: 00007f01400c3bb0 RCX: 00007f01408f118e
[ 2.215020] RDX: 000000000000007f RSI: 000000000000003c RDI: 000000000000007f
[ 2.216459] RBP: 00007ffcd0e7ffe0 R08: 00000000000000e7 R09: 00007ffcd0e7f268
[ 2.217832] R10: 0000000000000000 R11: 0000000000000206 R12: 00007f01408fa570
[ 2.219167] R13: 0000000000000000 R14: 0000000000000019 R15: 00007f01400c3be0
[ 2.220720] Kernel Offset: disabled
[ 2.221871] Rebooting in 10 seconds..
[ 12.224262] ACPI MEMORY or I/O RESET_REG.
...cut system reboots from bios...
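For reference, the `exitcode=0x00007f00` in the "Attempted to kill init!" line above is a wait-style status word, so the exit status that init returned sits in the second byte. A minimal sketch of the decoding follows; the function name `decode_exitcode` is only illustrative and does not come from any tool.

```shell
#!/bin/sh
# Decode the wait-style status the kernel prints in
# "Attempted to kill init! exitcode=0x...".
# Bits 8-15 hold the exit status; bits 0-6 hold a terminating signal, if any.
decode_exitcode() {
    code=$(( $1 ))
    status=$(( (code >> 8) & 0xff ))
    signal=$(( code & 0x7f ))
    echo "exit_status=$status signal=$signal"
}

decode_exitcode 0x00007f00
```

This prints `exit_status=127 signal=0`. Status 127 is what a shell conventionally reports when a command cannot be found or executed, so it may hint that /init in the kdump ramdisk could not start its interpreter or a required binary. That is only a guess from the number, not something the log confirms.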
Paolo, have you seen a trace like this before?
When the problem occurs, the system hangs. I've now set `kernel.panic_on_oops = 1` and increased the verbosity of makedumpfile, rebuilt the ramdisk, rebooted, and will see if any of that helps...
...did not make any difference, the system still hangs and requires a manual panic via the serial console.
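Since the hang does not escalate to a panic on its own, it may be worth checking which panic-on-X knobs are actually active on the VM. The sketch below only reads values. `kernel.panic_on_oops` is the setting mentioned above; `kernel.softlockup_panic` and `kernel.hung_task_panic` are assumptions about what else might be relevant, and they exist only if the kernel was built with the corresponding detectors. The helper name `sysctl_path` is hypothetical.

```shell
#!/bin/sh
# Map a sysctl key such as kernel.panic_on_oops to its /proc/sys path.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Report knobs that turn an oops or lockup into a panic,
# which is what kdump needs in order to capture a core.
for key in kernel.panic_on_oops kernel.softlockup_panic kernel.hung_task_panic; do
    path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
        echo "$key = $(cat "$path")"
    else
        echo "$key is not available on this kernel"
    fi
done
```

A value of 0 for a lockup knob would mean the kernel only logs the lockup and carries on, which would be consistent with a VM that hangs without ever booting the dump kernel.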
Note: I switched our CI system to use the deadline scheduler for Fedora 31, and initial runs appear promising. In other words, the BFQ elevator seems to be specifically required for, or at least involved in, the panic. BFQ is the default scheduler, and ext4 the default filesystem, for the F31 cloud image.
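For anyone wanting to try the same workaround, here is a rough sketch of how the active I/O scheduler can be inspected through sysfs. The helper name `active_scheduler` and the device name `sda` in the comment are illustrative assumptions, not details of the CI setup. On blk-mq kernels such as 5.3, the deadline scheduler is exposed as `mq-deadline`.

```shell
#!/bin/sh
# Print the active scheduler from a sysfs line such as
# "mq-deadline kyber [bfq] none" (the active one is in brackets).
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# List the active scheduler for every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    echo "$f: $(active_scheduler "$(cat "$f")")"
done

# To switch one device at runtime (as root), for example a hypothetical sda:
#   echo mq-deadline > /sys/block/sda/queue/scheduler
```

A runtime switch like this does not survive a reboot, so a CI image would need it applied at boot (for example via a udev rule or a boot script) for the workaround to stick.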
(In reply to Jeff Moyer from comment #8)
> Paolo, have you seen a trace like this before?
Nope, but it looks like you found work for me :) I'm about to share a debugging patch for BFQ.
(In reply to Chris Evich from comment #5)
> I'm setting up kexec on the VM that reproduced this, and will try to
> reproduce and capture a kernel core. Unless anyone has a better/easier idea.
I'm about to attach a kernel debugging patch for BFQ. Could you apply it and retry? The patch is for 5.3.0, so it should be ok for your kernel.
The goal of the patch is to hunt the cause of this crash, through a lot of invariant checks (BUG_ONs). If a BUG_ON triggers, the OOPS will hopefully tell us something useful.
Created attachment 1633036 [details]
Debug patch for BFQ
Sure happy to...but it's been years since I've built a kernel, is there a quick-reference somewhere?
(In reply to Chris Evich from comment #15)
> Sure happy to...but it's been years since I've built a kernel, is there a
> quick-reference somewhere?
Unless you happen to be rather proficient in Italian, I don't have any resource I know well to suggest to you :)
But I had a look at this Fedora wiki page, which seems good:
I'm also willing to build and install a modified kernel for you, if you can give me access to the affected system.
Oh even easier, yes happy to give you access. Do you have a ssh key I can add?
Note: There's a /root/repro.sh that will trigger the panic after some time. However, serial-console and hard-reset access needs a bigger list of permissions. There's also a chance that the IP address will change on hard-reset. So best let me run that and copy-paste the details for ya (assuming kdump/kexec can't be made to work).
or...come find me (cevich) on Freenode IRC and I can set a root password for you.
(In reply to Chris Evich from comment #17)
> Oh even easier, yes happy to give you access. Do you have a ssh key I can
> add?
> Note: There's a /root/repro.sh that will trigger the panic after some time.
> However, serial-console and hard-reset access needs a bigger list of
> permissions. There's also a chance that the IP address will change on
> hard-reset. So best let me run that and copy-paste the details for ya
> (assuming kdump/kexec can't be made to work).
Great, I'll send you my key by email. Then I guess we can proceed privately for a little while, and get back to this thread as we have some progress.
sounds good. I'm just starting my day now, will grab your key and install it...
I may have found the bug. I've attached a tentative fix, to be applied on top of the default branch of my dev-bfq repo.
Created attachment 1634784 [details]
Tentative fix patch, to be applied on top of my dev-bfq branch
Fix accepted for mainline:
Is this in 5.3.11 release?
(In reply to Pavel Raiskup from comment #24)
> Is this in 5.3.11 release?
I guess so.
The previous fix lacked a check. Fixed fix here:
Is this in 5.3.12 release?
This is not in any Fedora kernels that I can find. I'm trying to build Paolo's patch into v5.3.10 source (after I reproduced the issue) to see if it fixes it...
Created attachment 1650442 [details]
serial output from kernel soft-locks then hung VM
...nope :( It seems to have made the problem worse. Now my reproducer starts spitting out tons of CPU soft-lockups before grinding to a halt. Log attached.
(In reply to Paolo from comment #26)
> The previous fix lacked a check. Fixed fix here:
I only applied this patch. Should I have also applied anything else?
In any case, please let me know what you need. I have a fresh VM and knowledge of how to add a patch and build the Fedora kernel package...and you're pinging me on IRC now...
...work is being tracked now in https://bugzilla.kernel.org/show_bug.cgi?id=205447
All: We have a fully tested and blessed fix for this bug now. I've tested this using the F31 5.4 kernel source and confirmed my reproducer no longer triggers the hang & OOPS.
I will also attach the patches, but this is the final LKML thread for ref:
Created attachment 1657370 [details]
Tarball of V2 set of patches
Please double-check the patch files vs LKML content, as I'm fairly new at this kind of thing.
Created attachment 1657371 [details]
Tarball of V2 set of patches
Anything else to do from my end? I preserved my original reproducer and am happy to help test if needed.
Can the latest patch for this be submitted to 'stable'? I believe that's the condition needed for it to be accepted for inclusion in Fedora.
(In reply to Chris Evich from comment #37)
> Can the latest patch for this be submitted to 'stable'? I believe that's
> the condition needed for it to be accepted for inclusion in Fedora.
Absolutely! Unfortunately, I don't know the process for that. In my workflow I submit stuff only for mainline, and, after it is accepted, I see it being progressively selected and applied to stable.
I've asked some colleagues in Linaro, and they are willing to help me with this. But, first, which stable kernel(s) do you need these fix commits to be added to?
So typically, if there is a Fixes tag, it could be picked up for stable, though that can take time because it is done by a bot.
If you know on submission that it needs to go to stable, you can include a Cc: stable.org in the sign-off area and it will go to stable without you needing to do anything else, as long as it applies cleanly.
For patches that have already been pulled to mainline without such tags:
Send the patch, after verifying that it follows the above rules, to
stable.org. You must note the upstream commit ID in the
changelog of your submission, as well as the kernel version you wish
it to be applied to. If the patch deviates from the original
upstream patch (for example because it had to be backported) this must be very
clearly documented and justified in the patch description.
In this case it would be 5.4 and 5.5
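As a rough illustration of the "Fixes tag" mentioned above, the sketch below checks whether a commit message carries a `Fixes:` trailer in the usual kernel form (an abbreviated commit hash of at least 12 hex digits, then the quoted subject). The helper name `has_fixes_tag` is hypothetical, and the commit message, hash, and author are made up for the example.

```shell
#!/bin/sh
# Succeeds if a commit message read from stdin carries a trailer of the form
#   Fixes: <12 to 40 hex digits> ("subject")
has_fixes_tag() {
    grep -Eq '^Fixes: [0-9a-f]{12,40} \(".*"\)$'
}

# Hypothetical commit message; the hash and subject are invented.
msg='block, bfq: example subject

Explain the bug and the fix here.

Fixes: 0123456789ab ("block, bfq: earlier example commit")
Signed-off-by: A Developer <dev@example.com>'

if printf '%s\n' "$msg" | has_fixes_tag; then
    echo "Fixes: tag present"
else
    echo "no Fixes: tag"
fi
```

In a real tree the message would come from something like `git log -1 --format=%B <commit>` piped into the same function.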
*** Bug 1768092 has been marked as a duplicate of this bug. ***
(In reply to Justin M. Forbes from comment #40)
> In this case it would be 5.4 and 5.5
Upstream would like to know if 5.5.6 is okay for inclusion in F31 and beyond?
If so, I still have my original reproducer VMs available. All I need is the ability to pull down the 5.5.6 kernel with patches applied, using 'fedpkg'. Then I can build + test it fairly quickly (hours).
The email thread by which I requested these fixes to be ported to stable branches is now archived:
As you can see, these fixes are now available for 5.4-stable and 5.5-stable.
If someone needs to drop in, I can send an email with them in CC.
So now that it's in 5.4-stable and beyond, is there anything needed to have this picked up in F30 and F31?
My motivations come from wanting to remove a workaround I have in place, deep within some automation machinery I'm maintaining. Though I'm sure humans will also appreciate not ever hitting this difficult-to-diagnose kernel panic :D
This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 31 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
Fedora 31 changed to end-of-life (EOL) status on 2020-11-24. Fedora 31 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.
Thank you for reporting this bug and we are sorry it could not be fixed.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days