Description of problem:
After upgrading from F30 to F31, mock periodically hangs the system completely, and the only way to recover is to reboot.

Version-Release number of selected component (if applicable):
mock-1.4.19-1.fc31

How reproducible:
Start a mock build. It does not happen every time (about 1:30).

Actual results:
System hangs.

Expected results:
Mock does not hang the system.

Additional info:
At least three people (including me) confirm this issue on Fedora 31.
I can confirm this. The problem is present with all mock configs: rawhide, f31, epel7...
I thought the recent systemd update might fix this, but unfortunately it did not. It happened again. I noticed that this time it hung at:
- Enable HW Info plugin... ☠️
Hmm, I tried to reproduce this by running
  while true; do mock init; done
and got no errors. But it happens to my colleague (praiskup), and in his case it seems to happen when mock is run after podman.
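The reproduction loop from the comment above can also be sketched in Python. This is only an illustration and not part of mock: the function name `run_until_failure` and the log file name are made up here. Since a full system hang leaves no chance to report anything, the sketch syncs the attempt number to disk before each run, so the last recorded attempt can be read back after a reset.

```python
import os
import subprocess


def run_until_failure(cmd, max_attempts, log_path):
    """Run cmd up to max_attempts times; return the number of successful runs.

    The attempt number is written and fsync'ed before each run so that it
    survives a hard reset if the machine freezes in the middle of a run.
    """
    successes = 0
    with open(log_path, "a") as log:
        for attempt in range(1, max_attempts + 1):
            log.write(f"attempt {attempt}\n")
            log.flush()
            os.fsync(log.fileno())
            result = subprocess.run(cmd)
            if result.returncode != 0:
                break  # stop at the first failing run
            successes += 1
    return successes


# Hypothetical usage, with the command taken from the comment above:
# run_until_failure(["mock", "init"], 100, "mock-attempts.log")
```

If the box freezes, the last `attempt N` line in the log gives the number of the run during which the hang occurred.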
I do not use podman, and it is not installed on my system.
I also noticed that, seemingly every time it hangs, I am running mock with these options:
$ mock -n --offline <foo>
In my case the box doesn't hang, but I cannot run podman anymore. I suspect something is broken between cgroups v2 and systemd-nspawn (which is used by mock). Have you tried the --old-chroot option (it turns systemd-nspawn off)? Anyway, as I'm not able to reproduce this, having the --verbose or --verbose + --trace output attached would help.
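Since the suspicion above is about cgroups v2, it may help to state in a report which cgroup layout the host actually uses. A minimal sketch follows; the function name `cgroup_mode` is hypothetical, and the check relies on the assumption that a cgroup v2 mount exposes a `cgroup.controllers` file at its root and that systemd's hybrid layout mounts v2 under a `unified` subdirectory.

```python
import os


def cgroup_mode(cgroup_root="/sys/fs/cgroup"):
    """Return "unified" (v2 only), "hybrid", or "legacy" (v1) for a cgroup mount point."""
    # A cgroup v2 hierarchy has cgroup.controllers at its root.
    if os.path.exists(os.path.join(cgroup_root, "cgroup.controllers")):
        return "unified"
    # Assumption: in the hybrid layout, v2 is mounted at <root>/unified.
    if os.path.exists(os.path.join(cgroup_root, "unified", "cgroup.controllers")):
        return "hybrid"
    return "legacy"
```

Calling `cgroup_mode()` without arguments inspects the running host; the parameter exists only so the function can be pointed at another directory.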
Yep, yesterday I started trying with --old-chroot and will keep watching. Another bug I am struggling with because of cgroups v2 is this one: https://bugzilla.redhat.com/show_bug.cgi?id=1751120
What does "hangs" mean? A kernel crash, swap frenzy, something else? We need some details. Please start by explaining what "hangs" means and provide the logs (journalctl -b -1) from around that time.
Created attachment 1625512 [details]
mock journalctl log

> What does "hangs" mean?

Almost right after a mock build starts, it silently stops and the system becomes completely unresponsive. There is no I/O activity, and I can't switch to tty3-7 from graphical mode. I literally can't do anything; only pressing the Reset button helps.

* Attached journal.

Note: with --old-chroot there are no hangs. I did tons of builds with it and it never hung.
I suspect this could be a duplicate of bug 1756972, or (unlikely) some glitch caused by the bug fixed by [1], or it could be related to the cgroups v2 bug [2].

Please try mock from 'dnf copr enable praiskup/mock-fixes' (v1.4.20-1.git.8.2feb615). If the problem persists, it is likely [2] or even something else.

[1] https://github.com/rpm-software-management/mock/commit/c4eccaed8b41dedf11cc90a94481bcefc4ead2dc
[2] https://github.com/rpm-software-management/mock/issues/374
*** This bug has been marked as a duplicate of bug 1756972 ***
After this update https://bodhi.fedoraproject.org/updates/FEDORA-2019-755583cbdf and after ~60 successful builds, mock hung the system again. Another person confirms this as well.
Created attachment 1633037 [details] mock journalctl log #2
Can you please provide more info?
- full mock configuration
- fedora version is 31?
- full mock command-line when it hanged
- what have you built when mock hanged the system?
- is kernel dead, does it react on sysrq?
- can you try with disabled swap?
- your / filesystem?
- are you using tmpfs plugin?

If this is caused by the fact that mock (or anything below) eats too much RAM, you need to debug _what_ causes that. It could be anything.
> full mock configuration

It is the default, except:
config_opts['rpmbuild_networking'] = True

> fedora version is 31?

F31. It never happened before, only right after the upgrade to F31.

> full mock command-line when it hanged

Command from my attached log:
/usr/libexec/mock/mock -n -r fedora-rawhide-x86_64 --rebuild --sources /home/tim/rpmbuild/SOURCES/ --spec rust-maildir.spec

> is kernel dead, does it react on sysrq?

sysrq was disabled. I can test this the next time it happens.

> your / filesystem?

ext4. Enough free space.

> are you using tmpfs plugin?

No. But if I remember correctly, I tried with tmpfs and it still hung sometimes.

> If this is caused by the fact that mock (or anything below) eats too
> much RAM, you need to debug _what_ causes that. It could be anything.

No, I assure you. It can happen with a tiny project as well, one that does not require a lot of RAM.

---
Note: keep in mind that with the --old-chroot workaround everything is fine.
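Because the --old-chroot workaround keeps coming up, it may be convenient to make it persistent in the mock configuration, which is a Python fragment. The sketch below is an assumption rather than something confirmed in this thread: it relies on the `use_nspawn` option being the configuration counterpart of --old-chroot, and it would typically go into a per-user mock config file. The `try`/`except` stand-in exists only so the fragment also runs outside mock.

```python
# Stand-in so this fragment also runs outside mock; mock itself provides
# config_opts before it evaluates the config file.
try:
    config_opts
except NameError:
    config_opts = {}

# The only non-default setting mentioned in the comment above.
config_opts['rpmbuild_networking'] = True

# Assumed equivalent of passing --old-chroot on every run: build in a plain
# chroot instead of a systemd-nspawn container.
config_opts['use_nspawn'] = False
```

This only avoids the code path that appears to trigger the hang; it does not address the underlying cause.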
I'm afraid we (the mock maintainers) cannot help here. If this is about the F30 -> F31 move, it is unlikely to be caused by mock. We don't have any F31-specific patches. From mock's point of view I'd have to close this as INSUFFICIENT_DATA. Could you try with an up-to-date systemd on F30 to confirm that this really doesn't happen there? I'm reassigning this to systemd, which is more likely to have issues with cgroups v2.
> When starting mock build almost right from start it silently stops and system completely unresponsive. No I/O activity, can't switch to tty3-7 from graphic mode. Literally can't do anything, just pressing Reset button helps.

That sounds more like a kernel bug. Or it could simply be a hardware issue, e.g. bad memory, that just happens to be triggered in a specific scenario. Right now there simply isn't enough information to figure out what is going on here.
> Or it could simply be a hardware issue, e.g. bad memory, that just happens to be triggered in a specific scenario.

There have been literally zero issues with this hardware for 2+ years. Only mock (maybe not mock itself but something related to mock/nspawn) hangs the system. Three more people have exactly the same issue. They are too lazy to write about it here on RHBZ, but they write about it in the group chat every day.

> is kernel dead, does it react on sysrq?

Update: the kernel is dead; sysrq does not react when this happens.
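A missing SysRq reaction is only conclusive if the key was enabled at the time of the hang, so it can be worth recording the value of `/proc/sys/kernel/sysrq` beforehand. The sketch below decodes that value; the function names are hypothetical, and the bit meanings are listed as I recall them from the kernel's SysRq documentation, so treat them as an assumption to verify.

```python
# Bit meanings of kernel.sysrq (assumed from the kernel SysRq documentation).
SYSRQ_BITS = {
    2: "console log level",
    4: "keyboard control",
    8: "process debugging dumps",
    16: "sync",
    32: "remount read-only",
    64: "signalling of processes",
    128: "reboot/poweroff",
    256: "nicing of real-time tasks",
}


def describe_sysrq(value):
    """Translate a kernel.sysrq value into a list of enabled function groups."""
    if value == 0:
        return ["disabled"]
    if value == 1:
        return ["all functions enabled"]
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def read_sysrq(path="/proc/sys/kernel/sysrq"):
    """Read the current setting from procfs (Linux only)."""
    with open(path) as fh:
        return int(fh.read().strip())
```

On a Linux host, `describe_sysrq(read_sysrq())` shows which functions would have been available when the freeze occurred.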
It would be good to connect a serial console or netconsole to capture some debug messages when this happens.
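For the netconsole route, the receiving machine only needs something that listens for UDP datagrams. A minimal sketch of such a receiver follows; the function names are made up for illustration, and port 6666 is used on the assumption that it is netconsole's default target port. Configuring netconsole on the affected box is a separate step that is not shown.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket; 6666 is assumed to be netconsole's default target port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect_datagrams(sock, count, bufsize=4096):
    """Receive `count` datagrams and return them as decoded strings."""
    messages = []
    for _ in range(count):
        data, _addr = sock.recvfrom(bufsize)
        messages.append(data.decode("utf-8", errors="replace"))
    return messages
```

In practice the collected messages would be appended to a file on the receiving machine, so the last kernel messages before the freeze are preserved.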
I have the same issue here on an up-to-date Fedora 31. The hang happens from time to time, and when it does, it is always when systemd gets installed into the chroot.
> when it happens: Always when systemd gets installed into the chroot.

Right after the package install? Or when executing some RPM scriptlet? I tried to rebuild a package with 'BuildRequires: systemd' overnight, about 320 attempts, and no problem appeared. Which systemd is installed inside the chroot (which chroot do you build against)?

Do your affected boxes have anything in common?
It happened when building siril for rawhide using fedpkg mockbuild, so fedora-rawhide-x86_64 was used. The freeze happens in ~1 of 10 build attempts for me.

The point of freeze:
  Running scriptlet: systemd-243-4.gitef67743.fc32.x86_64    343/373
  Installing       : systemd-243-4.gitef67743.fc32.x86_64    343/373

Version of mock packages:
mock-1.4.21-1.fc31.noarch
mock-core-configs-31.7-1.fc31.noarch

Kernel: 5.3.8-300.fc31.x86_64 x86_64
Host systemd: systemd-243-4.gitef67743.fc31.x86_64

My Fedora installation is Fedora 31, but it is not a fresh installation; I upgraded it from 30 some weeks ago.
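The two observations in this thread, a freeze in roughly 1 of 10 builds on one box and about 320 clean attempts on another, can be compared with a little arithmetic. The sketch below assumes that builds are independent and that the freeze rate is constant, which is a simplification; the function name is hypothetical.

```python
def chance_of_clean_streak(freeze_rate, builds):
    """Probability that `builds` independent builds all finish without a freeze."""
    return (1.0 - freeze_rate) ** builds


# Rate and attempt count taken from the comments above.
streak = chance_of_clean_streak(0.10, 320)
print(f"chance of 320 clean builds at a 10% freeze rate: {streak:.2e}")
```

Under these assumptions the chance is practically zero (on the order of 1e-15), which suggests that the boxes differ in some relevant way rather than that the clean run was luck.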
In addition, some information about my box:

# inxi -SCI
System:  Host: r2d2 Kernel: 5.3.8-300.fc31.x86_64 x86_64 bits: 64 Console: tty 2 Distro: Fedora release 31 (Thirty One)
Machine: Type: Desktop System: MSI product: MS-7798 v: 1.0 serial: N/A
         Mobo: MSI model: B75MA-P45 (MS-7798) v: 1.0 serial: N/A BIOS: American Megatrends v: 1.3 date: 07/30/2012
CPU:     Topology: Quad Core model: Intel Core i7-2600K bits: 64 type: MT MCP L2 cache: 8192 KiB
         Speed: 1596 MHz min/max: 1600/3800 MHz
         Core speeds (MHz): 1: 1596 2: 1596 3: 1597 4: 1596 5: 1596 6: 1596 7: 1596 8: 1596
Info:    Processes: 214 Uptime: 1h 09m Memory: 15.31 GiB used: 1.57 GiB (10.2%) Shell: bash inxi: 3.0.36
There's a kernel trace in bug 1767097.
I have observed these hangs too, but they didn't appear to be related to the systemd rpm. Instead they mostly occurred during the launch of the mock root, but sometimes also later on.
The problem still happens from time to time on my system.

System: Host: vascom Kernel: 5.3.11-300.fc31.x86_64 x86_64 bits: 64 Desktop: KDE Plasma 5.16.5 Distro: Fedora release 31 (Thirty One)
CPU:    Topology: Quad Core model: Intel Core i7-4770 bits: 64 type: MT MCP L2 cache: 8192 KiB
I have the same problem.

System: Host: desktop.local Kernel: 5.3.11-300.fc31.x86_64 x86_64 bits: 64 Desktop: KDE Plasma 5.16.5 Distro: Fedora release 31 (Thirty One)
CPU:    Topology: Quad Core model: AMD Phenom II X4 B40 bits: 64 type: MCP L2 cache: 2048 KiB

But if you add systemd.unified_cgroup_hierarchy=0 to the kernel parameters, the problem disappears (at least for me).
systemd.unified_cgroup_hierarchy=0 can be used as a workaround until this issue is fixed upstream.
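To confirm that the workaround is actually active after a reboot, the running kernel's command line can be inspected. A minimal sketch follows; the function names are hypothetical, and treating a bare `systemd.unified_cgroup_hierarchy` (without a value) as enabled is an assumption based on how systemd documents its boolean kernel parameters.

```python
def unified_hierarchy_setting(cmdline):
    """Return the last systemd.unified_cgroup_hierarchy value on a kernel
    command line, or None if the parameter is absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "systemd.unified_cgroup_hierarchy":
            # Assumption: a bare parameter counts as "enabled".
            value = val if sep else "1"
    return value


def running_kernel_setting(path="/proc/cmdline"):
    """Check the command line of the running kernel (Linux only)."""
    with open(path) as fh:
        return unified_hierarchy_setting(fh.read())
```

A return value of `"0"` from `running_kernel_setting()` means the workaround from the comment above is in effect; `None` means the distribution default applies.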
*********** MASS BUG UPDATE **************

We apologize for the inconvenience. There are a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 31 kernel bugs.

Fedora 31 has now been rebased to 5.5.7-200.fc31. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 32, and are still experiencing this issue, please change the version to Fedora 32.

If you experience different issues, please open a new bug report for those.
> Fedora 31 has now been rebased to 5.5.7-200.fc31. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

I just tried the latest 5.5.7 on F31, and unfortunately the problem still persists.

> If you have moved on to Fedora 32, and are still experiencing this issue, please change the version to Fedora 32.

I'll upgrade to F32 soon and will test this for sure.
Still experiencing this issue on F32.
- kernel version: 5.6.0-0.rc5.git0.2.fc32
- mock version: mock-2.1-1.fc32

Another person said that after completely reinstalling Fedora he no longer has this issue. But I am not sure whether we can consider that a fix.
This message is a reminder that Fedora 31 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 31 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 31 changed to end-of-life (EOL) status on 2020-11-24. Fedora 31 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.