Description of problem:
If raid-check.timer is started, the system ends up unusable with systemd no longer responding and loads of zombie processes. This seems like it may be triggered by a specific date.

Version-Release number of selected component (if applicable):
systemd-246.10-1.fc33.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set date to Mar 21st, 2021 (I think it's date related?)
2. Run `systemctl start raid-check.timer`

Actual results:
The system becomes unusable, with any systemctl commands hanging and a long list of zombie processes.

Expected results:
The system boots as normal

Additional info:
I'm assigning this to systemd because it seems to be a problem with how systemd is handling the timer file. Starting raid-check.service works with no problems whatsoever, so it doesn't seem to be a problem with mdadm at all.

When tracking down the bug, I attempted to do a clean install of Fedora on one of my systems, and it turns out that raid-check.timer is enabled by default, which froze the new install. This means that the problem affects the version of systemd on F33 GA as well as the latest updates.

I suspect that it has something to do with the day/date since I couldn't find any indication of anyone else seeing this bug before today.

A simple workaround is to boot into single user mode and disable raid-check.timer. Unfortunately, this requires a root password.
Created attachment 1765087 [details] raid-check.timer
Additional experimentation has confirmed that this is date related. If the date is after Mar 3rd, 2021 at 1:00AM, this bug will be triggered. Possibly an overflow issue?
The bug is causing people's systems to fail to boot without a clear cause, leading to several posts on reddit and discussion on IRC. I was hit too; after being pointed to this bugzilla entry I was able to recover and will now post some instructions on reddit. Because of the massive user impact, I have upgraded severity and urgency to maximum available values.
A workaround is to add "systemd.mask=raid-check.timer" to the kernel command line when booting which should allow the machine to boot after which "systemctl disable raid-check.timer" can be used to prevent a recurrence.
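After booting with the workaround above, it can be useful to confirm that the parameter actually reached the kernel command line. The following is a minimal sketch, not part of the original report: the function name `cmdline_masks_raid_check` and its optional file argument are hypothetical, and it only matches the exact single-unit form `systemd.mask=raid-check.timer`.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): succeed if the kernel command
# line contains the exact argument systemd.mask=raid-check.timer.
# The optional first argument is a file to read instead of /proc/cmdline,
# which makes the check easy to try without rebooting.
cmdline_masks_raid_check() {
    file="${1:-/proc/cmdline}"
    # Put each space-separated argument on its own line, then look for an
    # exact, whole-line, fixed-string match.
    tr ' ' '\n' < "$file" | grep -qxF 'systemd.mask=raid-check.timer'
}
```

Called with no argument, as in `cmdline_masks_raid_check && echo masked`, it reads /proc/cmdline of the running system.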
Created attachment 1765096 [details] Journal from affected system This is from a system with a clean install of F33.
(In reply to Tom Hughes from comment #4)
> A workaround is to add "systemd.mask=raid-check.timer" to the kernel command
> line when booting which should allow the machine to boot after which
> "systemctl disable raid-check.timer" can be used to prevent a recurrence.

It looks like systemd won't allow you to disable a masked service, even if it's masked in the kernel command line. If using the above workaround, you'll need to run the following to manually remove the timer:

rm /etc/systemd/system/timers.target.wants/raid-check.timer
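The manual removal can be wrapped so that it reports whether the symlink was there in the first place. This is only a sketch under assumptions: the function name `remove_raid_check_link` and its optional root-directory argument are hypothetical additions for illustration; the path itself is the one given in the comment above.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): remove the enablement symlink
# for raid-check.timer and say what happened.
# The optional first argument is an alternate root directory, so the
# function can be tried against a scratch tree instead of the live system.
remove_raid_check_link() {
    root="${1:-/}"
    # Strip a trailing slash so "/" and "/some/dir" both join cleanly.
    link="${root%/}/etc/systemd/system/timers.target.wants/raid-check.timer"
    if [ -L "$link" ] || [ -e "$link" ]; then
        rm -f -- "$link" && echo "removed $link"
    else
        echo "not present: $link"
    fi
}
```

On an affected machine this would be run as root with no argument, i.e. `remove_raid_check_link`.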
Some good news: as chrisawi pointed out on IRC, it looks like this is tied to the Europe/Dublin time zone. Switching to Etc/UTC, Europe/London, or other time zones fixes the problem.
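Since `timedatectl` itself can hang on an affected system, one way to see which zone a machine is configured for is to read the /etc/localtime symlink directly. A minimal sketch, assuming /etc/localtime is a symlink into a zoneinfo directory (the usual layout, but an assumption here); the function name `zone_from_localtime` and its optional path argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): print the zone name that a
# localtime symlink points at, e.g. "Europe/Dublin".
# Fails if the path is not a symlink or does not point under .../zoneinfo/.
zone_from_localtime() {
    link="${1:-/etc/localtime}"
    target=$(readlink "$link") || return 1
    case "$target" in
        */zoneinfo/*) echo "${target##*/zoneinfo/}" ;;
        *) return 1 ;;
    esac
}
```

A check such as `[ "$(zone_from_localtime)" = "Europe/Dublin" ] && echo "affected zone"` then avoids talking to systemd at all.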
Unfortunately I can't reproduce this here... The most likely explanation is some infinite loop in the timer handling code. Could someone who is affected provide a stack trace (with 'gdb -p1' or 'pstack 1'), or maybe a core file ('kill -ABRT 1' and then look in the journal for information in the core file and upload it here)?
I see the issue.
This also triggers an issue

[egreshko@f33g ~]$ date
Mon Mar 22 06:57:03 CST 2021
[egreshko@f33g ~]$ sudo systemctl --now disable raid-check.timer
Removed /etc/systemd/system/timers.target.wants/raid-check.timer.
[egreshko@f33g ~]$ sudo systemctl status raid-check.timer
● raid-check.timer - Weekly RAID setup health check
Loaded: loaded (/usr/lib/systemd/system/raid-check.timer; disabled; vendor p>
Active: inactive (dead)
Trigger: n/a
Triggers: ● raid-check.service

Mar 22 06:56:47 f33g.greshko.com systemd[1]: Started Weekly RAID setup health che>
Mar 22 07:17:56 f33g.greshko.com systemd[1]: raid-check.timer: Succeeded.
Mar 22 07:17:56 f33g.greshko.com systemd[1]: Stopped Weekly RAID setup health che>
[egreshko@f33g ~]$ timedatectl status | grep zone
Time zone: Asia/Taipei (CST, +0800)
[egreshko@f33g ~]$ sudo timedatectl set-timezone Europe/Dublin
[egreshko@f33g ~]$ sudo timedatectl set-timezone Asia/Taipei

Note there is no problem. And then,

[egreshko@f33g ~]$ sudo systemctl --now enable raid-check.timer
Created symlink /etc/systemd/system/timers.target.wants/raid-check.timer → /usr/lib/systemd/system/raid-check.timer.
[egreshko@f33g ~]$ sudo timedatectl set-timezone Europe/Dublin
[egreshko@f33g ~]$ sudo timedatectl set-timezone Asia/Taipei
Failed to set time zone: Connection timed out
Is this affecting all DST transitions?
https://github.com/systemd/systemd/pull/19075
FEDORA-2021-ea92e5703f has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-ea92e5703f
FEDORA-2021-1c1a870ceb has been submitted as an update to Fedora 33. https://bodhi.fedoraproject.org/updates/FEDORA-2021-1c1a870ceb
FEDORA-2021-ea92e5703f has been pushed to the Fedora 34 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-ea92e5703f` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-ea92e5703f See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
*** Bug 1942298 has been marked as a duplicate of this bug. ***
FEDORA-2021-ea92e5703f has been pushed to the Fedora 34 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2021-1c1a870ceb has been pushed to the Fedora 33 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-1c1a870ceb` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-1c1a870ceb See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2021-1c1a870ceb has been pushed to the Fedora 33 stable repository. If problem still persists, please make note of it in this bug report.