Bug 1941335
| Field | Value |
|---|---|
| Summary | Starting raid-check.timer renders system unusable |
| Product | [Fedora] Fedora |
| Component | systemd |
| Version | 33 |
| Hardware | Unspecified |
| OS | Unspecified |
| Status | CLOSED ERRATA |
| Severity | urgent |
| Priority | urgent |
| Type | Bug |
| Reporter | Jonathan Dieter <jonathan> |
| Assignee | systemd-maint |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | aireilly, bugzilla, cleaver-redhat, cramerd, dominik, ed.greshko, ego.cordatus, fedoraproject, filbranden, flepied, fweimer, gryan, jen, john.kissane, kasong, kevin, lnykryn, mramendi, msekleta, murphy.john69, ol+redhat, przemo, rjones, samuel-rhbugs, scorreia, sgraf, ssahani, s, systemd-maint, tom, yuwatana, zbyszek |
| Fixed In Version | systemd-248~rc4-3.fc34 systemd-246.13-1.fc33 |
| Last Closed | 2021-03-25 00:18:55 UTC |

Description
Jonathan Dieter
2021-03-21 17:08:53 UTC
Created attachment 1765087 [details]
raid-check.timer
Additional experimentation has confirmed that this is date related. If the date is after Mar 3rd, 2021 at 1:00AM, this bug will be triggered. Possibly an overflow issue?

The bug is causing people's systems to fail to boot without a clear cause, leading to several posts on reddit and discussion on IRC. I was hit too; after being pointed to this bugzilla entry I was able to recover and will now post some instructions on reddit. Because of the massive user impact, I have upgraded severity and urgency to maximum available values.

A workaround is to add "systemd.mask=raid-check.timer" to the kernel command line when booting which should allow the machine to boot after which "systemctl disable raid-check.timer" can be used to prevent a recurrence.

Created attachment 1765096 [details]
Journal from affected system
This is from a system with a clean install of F33.
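
For anyone trying the kernel command line workaround mentioned above, it can be worth confirming that the mask argument really made it into the booted command line. A minimal sketch follows; the helper name `has_mask_arg` is made up for illustration, and it only does a string match on whatever command line text it is given.

```shell
# Hypothetical helper (not part of systemd): succeeds when the given kernel
# command line string contains the systemd.mask=raid-check.timer argument
# as a whole, space-separated word.
has_mask_arg() {
    case " $1 " in
        *" systemd.mask=raid-check.timer "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a running system this could be called as `has_mask_arg "$(cat /proc/cmdline)" && echo "timer is masked for this boot"`.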
(In reply to Tom Hughes from comment #4)
> A workaround is to add "systemd.mask=raid-check.timer" to the kernel command
> line when booting which should allow the machine to boot after which
> "systemctl disable raid-check.timer" can be used to prevent a recurrence.

It looks like systemd won't allow you to disable a masked service, even if it's masked in the kernel command line. If using the above workaround, you'll need to run the following to manually remove the timer:

    rm /etc/systemd/system/timers.target.wants/raid-check.timer

Some good news: as chrisawi pointed out on IRC, it looks like this is tied to the Europe/Dublin time zone. Switching to Etc/UTC, Europe/London, or other time zones fixes the problem.

Unfortunately I can't reproduce this here... The most likely explanation is some infinite loop in the timer handling code. Could someone who is affected provide a stack trace (with 'gdb -p1' or 'pstack 1'), or maybe a core file ('kill -ABRT 1' and then look in the journal for information in the core file and upload it here).

I see the issue.

This also triggers an issue

    [egreshko@f33g ~]$ date
    Mon Mar 22 06:57:03 CST 2021
    [egreshko@f33g ~]$ sudo systemctl --now disable raid-check.timer
    Removed /etc/systemd/system/timers.target.wants/raid-check.timer.
    [egreshko@f33g ~]$ sudo systemctl status raid-check.timer
    ● raid-check.timer - Weekly RAID setup health check
      Loaded: loaded (/usr/lib/systemd/system/raid-check.timer; disabled; vendor p>
      Active: inactive (dead)
      Trigger: n/a
      Triggers: ● raid-check.service

    Mar 22 06:56:47 f33g.greshko.com systemd[1]: Started Weekly RAID setup health che>
    Mar 22 07:17:56 f33g.greshko.com systemd[1]: raid-check.timer: Succeeded.
    Mar 22 07:17:56 f33g.greshko.com systemd[1]: Stopped Weekly RAID setup health che>
    [egreshko@f33g ~]$ timedatectl status | grep zone
    Time zone: Asia/Taipei (CST, +0800)
    [egreshko@f33g ~]$ sudo timedatectl set-timezone Europe/Dublin
    [egreshko@f33g ~]$ sudo timedatectl set-timezone Asia/Taipei

Note there is no problem. And then,

    [egreshko@f33g ~]$ sudo systemctl --now enable raid-check.timer
    Created symlink /etc/systemd/system/timers.target.wants/raid-check.timer → /usr/lib/systemd/system/raid-check.timer.
    [egreshko@f33g ~]$ sudo timedatectl set-timezone Europe/Dublin
    [egreshko@f33g ~]$ sudo timedatectl set-timezone Asia/Taipei
    Failed to set time zone: Connection timed out

Is this affecting all DST transitions?

FEDORA-2021-ea92e5703f has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-ea92e5703f

FEDORA-2021-1c1a870ceb has been submitted as an update to Fedora 33. https://bodhi.fedoraproject.org/updates/FEDORA-2021-1c1a870ceb

FEDORA-2021-ea92e5703f has been pushed to the Fedora 34 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-ea92e5703f`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-ea92e5703f
See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

*** Bug 1942298 has been marked as a duplicate of this bug. ***

FEDORA-2021-1c1a870ceb has been submitted as an update to Fedora 33. https://bodhi.fedoraproject.org/updates/FEDORA-2021-1c1a870ceb

FEDORA-2021-ea92e5703f has been pushed to the Fedora 34 stable repository.
If problem still persists, please make note of it in this bug report.

FEDORA-2021-1c1a870ceb has been pushed to the Fedora 33 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-1c1a870ceb`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-1c1a870ceb
See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

FEDORA-2021-1c1a870ceb has been pushed to the Fedora 33 stable repository.
If problem still persists, please make note of it in this bug report.
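
For systems that cannot take the update yet, the manual cleanup step described earlier in this thread (removing the `timers.target.wants` symlink after booting with the timer masked) could be wrapped in a small guard so it only acts when the symlink is actually present. This is only a sketch: the function name and the optional root-prefix argument are assumptions added for illustration, so the function can be tried against a scratch directory before touching the real `/etc`.

```shell
# Hypothetical helper: remove the enablement symlink for raid-check.timer.
# $1 is an optional root prefix (an assumption for safe dry runs); when it
# is empty the real /etc/systemd/system path from this bug is used.
remove_raid_check_symlink() {
    root="${1:-}"
    link="$root/etc/systemd/system/timers.target.wants/raid-check.timer"
    if [ -L "$link" ]; then
        # Only the symlink is removed; the unit file it points to is untouched.
        rm -- "$link" && echo "removed $link"
    else
        echo "no symlink at $link"
    fi
}
```

Run as root with no argument, `remove_raid_check_symlink` acts on the same path as the `rm` command quoted above. The timer can be enabled again with `systemctl enable raid-check.timer` once a fixed systemd build is installed.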