Bug 741115 - mdmonitor-takeover.service should use DefaultDependencies=no
mdmonitor-takeover.service should use DefaultDependencies=no
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: mdadm (Show other bugs)
16
Unspecified Unspecified
unspecified Severity unspecified
: ---
: ---
Assigned To: Doug Ledford
Fedora Extras Quality Assurance
:
Depends On:
Blocks: 740703
  Show dependency treegraph
 
Reported: 2011-09-25 11:49 EDT by Michal Schmidt
Modified: 2011-10-24 23:30 EDT (History)
5 users (show)

See Also:
Fixed In Version: mdadm-3.2.2-12.fc16
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 744226 (view as bug list)
Environment:
Last Closed: 2011-10-24 23:30:47 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Michal Schmidt 2011-09-25 11:49:36 EDT
Description of problem:

The current version of mdmonitor-takeover.service says:
  [Unit]
  Description=Software RAID Monitor Takeover
  After=syslog.target

  [Service]
  Type=forking
  RemainAfterExit=yes
  ExecStart=/sbin/mdmon --takeover --all

  [Install]
  WantedBy=sysinit.target

I do not know the exact intentions of the author of the unit file, but the fact
that this unit installs itself into sysinit.target suggests that it is meant to run early during boot.
However, since it defaults to DefaultDependencies=yes, it automatically gets a dependency After=basic.target (and a few more dependencies, see "man systemd.service" for details).
If the intent is indeed to run this service in early boot before ordinary services, DefaultDependencies=no should be specified in the unit file.

So far this inconsistency did not seem to cause actual problems, because it was hidden by a bug in systemd.
When there is a requirement dependency between a target and another unit (such as: sysinit.target Wants mdmonitor-takeover.service), and both units have DefaultDependencies=yes, the dependency is supposed to be complemented automatically with ordering (sysinit.target After mdmonitor-takeover.service).
This is documented in "man systemd.target", but it did not work until systemd-36. When systemd-36-2.fc16 with the fix was pushed to updates-testing, the problem with mdmonitor-takeover.service appeared. An ordering cycle was the result:
 sysinit.target -> mdmonitor-takeover.service -> basic.target -> sysinit.target
I reverted the fix in systemd-36-3.fc16, but the revert is meant to be only temporary. The ordering inconsistency in mdmonitor-takeover.service should be fixed.

Version-Release number of selected component (if applicable):
mdadm-3.2.2-9.fc16.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install systemd-36-2.fc16, make sure mdmonitor-takeover.service is enabled.
2. Reboot.
  
Actual results:
You will see bug 741078 or similar. Some services will not be started because
they will be eliminated by ordering cycle resolution. mdmonitor-takeover.service will be mentioned in the ordering cycles in the logs.

Expected results:
A clean boot without ordering cycles.

Additional info:
Untested, but this should work:

  [Unit]
  Description=Software RAID Monitor Takeover
  After=syslog.target
  Before=sysinit.target shutdown.target
  Conflicts=shutdown.target
  DefaultDependencies=no

  [Service]
  Type=forking
  RemainAfterExit=yes
  ExecStart=/sbin/mdmon --takeover --all

  [Install]
  WantedBy=sysinit.target
Comment 1 Milan Broz 2011-09-25 13:51:48 EDT
> So far this inconsistency did not seem to cause actual problems, because it 
> was hidden by a bug in systemd.

And that's why it was not found when testing mdadm unit files. It seems like simple bug in new service file.
DefaultDependencies=no makes perfect sense here.

(Doug, do you plan some mdadm rebuild? I think this is trivial change to add...)
Comment 2 Doug Ledford 2011-09-26 11:36:10 EDT
Yes, there is likely to be an mdadm rebuild once a fix for another bug is identified.
Comment 3 John Ellson 2011-09-29 13:08:44 EDT
Why doesn't systemd detect cycles like this?    It seems very fragile if this kind of bug in an unrelated service can make the system near unrecoverable because services like dbus and NetworkManager fail to start.
Comment 4 Michal Schmidt 2011-09-29 16:10:10 EDT
(In reply to comment #3)
> Why doesn't systemd detect cycles like this?

It does. It breaks cycles by dropping a (pretty much randomly selected) member of the cycle. In https://bugzilla.redhat.com/show_bug.cgi?id=741078#c3 I showed an example where dbus.socket was the unfortunate victim.
Comment 5 John Ellson 2011-09-30 09:38:25 EDT
Isn't it a cycle in a directed graph?   Shouldn't the strategy be to drop the lowest node in the cycle, rather than a random node ?    Or perhaps drop the node with the least dependencies up to that point?

Somehow I think it should cause less damage than it did in this case.
Comment 6 Michal Schmidt 2011-09-30 10:09:45 EDT
Let's keep this BZ about the mdadm dependencies. I opened bug 742546 for the discussion about cycle breaking in systemd.
Comment 7 Fedora Update System 2011-10-20 04:20:47 EDT
mdadm-3.2.2-11.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/mdadm-3.2.2-11.fc16
Comment 8 Fedora Update System 2011-10-20 18:15:00 EDT
Package mdadm-3.2.2-11.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing mdadm-3.2.2-11.fc16'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2011-14682
then log in and leave karma (feedback).
Comment 9 Fedora Update System 2011-10-22 11:15:50 EDT
mdadm-3.2.2-12.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/mdadm-3.2.2-12.fc16
Comment 10 Fedora Update System 2011-10-24 23:30:47 EDT
mdadm-3.2.2-12.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Note You need to log in before you can comment on or make changes to this bug.