Bug 1474200
| Field | Value |
|---|---|
| Summary | kdump error handler was triggered twice |
| Product | [Fedora] Fedora |
| Component | dracut |
| Version | rawhide |
| Status | CLOSED RAWHIDE |
| Severity | medium |
| Priority | unspecified |
| Hardware | Unspecified |
| OS | Linux |
| Type | Bug |
| Reporter | Xunlei Pang <xlpang> |
| Assignee | dracut-maint-list |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | bhe, dracut-maint-list, harald, jonathan, kdump-team-bugs, ruyang, xlpang, zbyszek |
| Last Closed | 2017-08-14 07:51:12 UTC |
Description
Xunlei Pang
2017-07-24 06:41:41 UTC
The cause might be an ordering cycle in the dracut systemd services. The following is part of the journalctl output from the kdump emergency shell:

    [ 233.200240] systemd[1]: basic.target: Found ordering cycle on basic.target/start
    [ 233.201450] systemd[1]: basic.target: Found dependency on sysinit.target/start
    [ 233.203159] systemd[1]: basic.target: Found dependency on emergency.target/stop
    [ 233.203920] systemd[1]: basic.target: Found dependency on dracut-pre-pivot.service/start
    [ 233.205337] systemd[1]: basic.target: Found dependency on initrd.target/start
    [ 233.206198] systemd[1]: basic.target: Found dependency on basic.target/start
    [ 233.206919] systemd[1]: basic.target: Breaking ordering cycle by deleting job dracut-pre-pivot.service/start

We have a basic => sysinit => emergency => dracut-pre-pivot => initrd => basic ordering cycle, which makes emergency depend on emergency itself.

Changing the Before directive in dracut-pre-pivot.service from

    Before=shutdown.target emergency.target

to

    Before=shutdown.target

would eliminate the systemd warning message.

I suspect the cause of the problem is that the kdump-error-handler service gets started twice through systemctl in emergency.service.

I just found that if the kdump error handler is triggered by a makedumpfile failure, as in

https://bugzilla.redhat.com/show_bug.cgi?id=1474706

the problem described here does not happen. It seems that the problem is triggered only during certain phases. Will investigate more.

(In reply to Ziyue Yang from comment #2)
> Just found that if kdump error handler was triggered by a makedumpfile
> failure like in
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1474706
>
> The problem described here would not happen. Seems that the problem would be
> triggered only during certain phases. Will investigate more.

I guess it's due to a different systemd stage: the kdump capture service starts late, after sysinit.target, so there should be no such ordering cycle issue.

However, with your fix in Comment 1, the problem still exists with the following reproducing steps:

====================
[root@ ~]$ grep -v ^# /etc/kdump.conf
path /root/xxx
core_collector makedumpfile -l --message-level 1 -d 31
default shell
====================
(dm-1 is the swap partition)
swapoff /dev/dm-1 &>/dev/null
mkfs.ext4 /dev/dm-1 -F
mount /dev/dm-1 /root/xxx
mkdir -p /root/xxx/var/crash
touch /etc/kdump.conf
kdumpctl restart
umount /root/xxx
mkswap /dev/dm-1
sync
sync
sleep 1
echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger

So there is still some other issue that is not clear to us. Or maybe we can solve it from a different angle: for example, we allow for multiple failures (i.e. the error handler being triggered multiple times) and avoid starting it again if one instance is already running.

Is this a kdump-only issue, or a general issue in dracut?

It seems to be a dracut bug:

https://github.com/dracutdevs/dracut/commit/f24d205537b094939379440ee013cca88c7582ac

I've compiled and installed the newest version of dracut from GitHub, and the error handler bug seems to no longer exist.

(In reply to Ziyue Yang from comment #5)
> It seems to be a dracut bug:
>
> https://github.com/dracutdevs/dracut/commit/f24d205537b094939379440ee013cca88c7582ac
>
> I've compiled and installed the newest version of dracut from GitHub, and
> the error handler bug seems to no longer exist.

That's great, it passed all my tests.

(In reply to Xunlei Pang from comment #6)
> (In reply to Ziyue Yang from comment #5)
> > It seems to be a dracut bug:
> >
> > https://github.com/dracutdevs/dracut/commit/f24d205537b094939379440ee013cca88c7582ac
> >
> > I've compiled and installed the newest version of dracut from GitHub, and
> > the error handler bug seems to no longer exist.
>
> That's great, it passed all my tests.

BTW, for this dracut commit, there seems to be something wrong with modules.d/98dracut-systemd/dracut-pre-mount.service. It now has "After=basic.target cryptsetup.target" instead of the original "After=dracut-initqueue.service cryptsetup.target", which leads to the following ordering cycle:

    systemd[1]: local-fs.target: Found ordering cycle on local-fs.target/start
    systemd[1]: local-fs.target: Found dependency on sysroot.mount/start
    systemd[1]: local-fs.target: Found dependency on dracut-pre-mount.service/start
    systemd[1]: local-fs.target: Found dependency on basic.target/start
    systemd[1]: local-fs.target: Found dependency on sysinit.target/start
    [ SKIP ] Ordering cycle found, skipping Local File Systems

The kdump error handler will be triggered again in case of any ordering cycle, so I think we should use the original dependency instead.

(In reply to Xunlei Pang from comment #7)
> (In reply to Xunlei Pang from comment #6)
> > (In reply to Ziyue Yang from comment #5)
> > > It seems to be a dracut bug:
> > >
> > > https://github.com/dracutdevs/dracut/commit/f24d205537b094939379440ee013cca88c7582ac
> > >
> > > I've compiled and installed the newest version of dracut from GitHub, and
> > > the error handler bug seems to no longer exist.
> >
> > That's great, it passed all my tests.
>
> BTW, for this dracut commit, there seems to be something wrong with
> modules.d/98dracut-systemd/dracut-pre-mount.service. It now has
> "After=basic.target cryptsetup.target" instead of the original
> "After=dracut-initqueue.service cryptsetup.target", which leads to the
> following ordering cycle:

Also, this service already contains "DefaultDependencies=no", which means it has no implicit dependency on "basic.target".

> systemd[1]: local-fs.target: Found ordering cycle on local-fs.target/start
> systemd[1]: local-fs.target: Found dependency on sysroot.mount/start
> systemd[1]: local-fs.target: Found dependency on
> dracut-pre-mount.service/start
> systemd[1]: local-fs.target: Found dependency on basic.target/start
> systemd[1]: local-fs.target: Found dependency on sysinit.target/start
> [ SKIP ] Ordering cycle found, skipping Local File Systems
>
> The kdump error handler will be triggered again in case of any ordering cycle,
> so I think we should use the original dependency instead.

Hi Xunlei, can you make a pull request on GitHub so that Harald can take a look?

(In reply to Dave Young from comment #9)
> Hi Xunlei, can you make a pull request on GitHub so that Harald can take a
> look?

Sure, will do.

(In reply to Dave Young from comment #9)
> Hi Xunlei, can you make a pull request on GitHub so that Harald can take a
> look?

I synced to the latest dracut git repository, and it has already been fixed there.

So all the issues will be fixed if we backport the following two dracut upstream commits:

1)
commit c000a21c25bd436f2b3cc2076cb7025cc82d2807
Author: Harald Hoyer <harald>
Date: Wed Jun 22 18:12:19 2016 +0200

    dracut-systemd/*.service: conflict with shutdown target

    make reboot/poweroff/halt work

    also conflict with emergency.target

2)
commit b1ae591945acc4e9a962bd817e6e6e40e8587d2c
Author: Harald Hoyer <harald>
Date: Fri Jul 28 11:57:07 2017 +0200

    dracut-systemd: add back missing dependencies

    otherwise TEST-20-NFS fails

(In reply to Xunlei Pang from comment #11)
> (In reply to Dave Young from comment #9)
> > Hi Xunlei, can you make a pull request on GitHub so that Harald can take a
> > look?
>
> I synced to the latest dracut git repository, and it has already been fixed there.
>
> So all the issues will be fixed if we backport the following two dracut
> upstream commits:
> 1)
> commit c000a21c25bd436f2b3cc2076cb7025cc82d2807
> Author: Harald Hoyer <harald>
> Date: Wed Jun 22 18:12:19 2016 +0200
>
> dracut-systemd/*.service: conflict with shutdown target
>
> make reboot/poweroff/halt work
>
> also conflict with emergency.target

Sorry, this one is the problematic commit. The fix should be:

commit f24d205537b094939379440ee013cca88c7582ac
Author: Harald Hoyer <harald>
Date: Fri Jul 28 09:05:34 2017 +0200

    dracut-systemd: fixed dependencies

    try to break an ordering cycle.

    https://github.com/dracutdevs/dracut/issues/259

>
> 2)
> commit b1ae591945acc4e9a962bd817e6e6e40e8587d2c
> Author: Harald Hoyer <harald>
> Date: Fri Jul 28 11:57:07 2017 +0200
>
> dracut-systemd: add back missing dependencies
>
> otherwise TEST-20-NFS fails

Is this still broken?

(In reply to Harald Hoyer from comment #13)
> Is this still broken?

Hi Harald,

We tested on the latest Fedora rawhide, and it is still broken there. It works properly after we apply the two fixes below:

1)
commit f24d205537b094939379440ee013cca88c7582ac
Author: Harald Hoyer <harald>
Date: Fri Jul 28 09:05:34 2017 +0200

    dracut-systemd: fixed dependencies

    try to break an ordering cycle.

    https://github.com/dracutdevs/dracut/issues/259

2)
commit b1ae591945acc4e9a962bd817e6e6e40e8587d2c
Author: Harald Hoyer <harald>
Date: Fri Jul 28 11:57:07 2017 +0200

    dracut-systemd: add back missing dependencies

    otherwise TEST-20-NFS fails

dracut-046-2.git20170811.fc27.x86_64
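
For anyone re-testing a dracut build against this bug, the ordering cycles quoted in this report can be pulled out of the journal mechanically. The following is only a minimal sketch: `print_ordering_cycles` is a hypothetical helper name, not part of dracut or kexec-tools, and it assumes the systemd message wording matches the "Found ordering cycle on" and "Found dependency on" lines quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of dracut or kexec-tools): read systemd log
# text on stdin and print each reported ordering cycle as one line of the
# form "a => b => c". Assumes the message wording quoted in this report.
print_ordering_cycles() {
    awk '
        /Found ordering cycle on / {
            if (chain != "") print chain   # flush the previous cycle, if any
            chain = $NF                    # last field is "unit/job"
            next
        }
        /Found dependency on / && chain != "" {
            chain = chain " => " $NF       # append the next job in the cycle
            next
        }
        END { if (chain != "") print chain }
    '
}
```

In the kdump emergency shell it could be used as `journalctl -b --no-pager | print_ordering_cycles`. No output would suggest that systemd reported no ordering cycle during that boot, which is the expected result once the two upstream fixes are applied.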