Description of problem:
Running oadm diagnostics on 3.3.0.32 always returns 3 errors for non-existent systemd units:

ERROR: [DS1004 from controller openshift/origin/pkg/diagnostics/systemd/locate_units.go]
       Unable to run `systemctl show origin-master`: exit status 1
       Cannot analyze systemd units.
ERROR: [DS1004 from controller openshift/origin/pkg/diagnostics/systemd/locate_units.go]
       Unable to run `systemctl show origin-node`: exit status 1
       Cannot analyze systemd units.
ERROR: [DS1004 from controller openshift/origin/pkg/diagnostics/systemd/locate_units.go]
       Unable to run `systemctl show kubernetes`: exit status 1
       Cannot analyze systemd units.

The correct units in my HA config are atomic-openshift-master-controllers, atomic-openshift-master-api and atomic-openshift-node.

Version-Release number of selected component (if applicable):
3.3.0.32

How reproducible:
Always

Steps to Reproduce:
1. On an HA (multi-master) master, run: oadm diagnostics --master-config=/etc/origin/master/master-config.yaml

Actual results:
ERROR: [DS1004 from controller openshift/origin/pkg/diagnostics/systemd/locate_units.go]
       Unable to run `systemctl show origin-master`: exit status 1
       Cannot analyze systemd units.
ERROR: [DS1004 from controller openshift/origin/pkg/diagnostics/systemd/locate_units.go]
       Unable to run `systemctl show origin-node`: exit status 1
       Cannot analyze systemd units.
ERROR: [DS1004 from controller openshift/origin/pkg/diagnostics/systemd/locate_units.go]
       Unable to run `systemctl show kubernetes`: exit status 1
       Cannot analyze systemd units.

Expected results:
Status/diagnostics for the actual systemd units.
This is a regression. It's supposed to quietly skip units that aren't actually there; I think perhaps systemctl is returning an error code where it didn't before, but in any case, this needs a fix. (BTW it *is* checking for atomic-openshift-node but needs to be updated to handle the split in the master units.)
systemd.x86_64 219-30.el7
Red Hat hasn't released systemd-219-30.el7 AFAICS. Should we expect to see this in the wild soon? I don't see this with released version systemd-219-19.el7_2.13. If I run `systemctl show something-bogus`, I get back a unit description (for a non-existent unit) and no error.
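For reference, the behavior described above (a property listing and no error for a non-existent unit) can be checked by reading the `LoadState` property from the `systemctl show` output rather than relying on the exit code alone. The following is only a minimal sketch of that idea, not the code in locate_units.go; the helper names `parseLoadState` and `unitLoadState` are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
	"strings"
)

// parseLoadState extracts the value of the LoadState property from the
// KEY=VALUE lines printed by `systemctl show`. It returns "" if the
// property is absent. (Hypothetical helper, for illustration only.)
func parseLoadState(output string) string {
	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		parts := strings.SplitN(scanner.Text(), "=", 2)
		if len(parts) == 2 && parts[0] == "LoadState" {
			return parts[1]
		}
	}
	return ""
}

// unitLoadState runs `systemctl show <name>` and returns the unit's
// LoadState. A non-nil error means systemctl itself failed, which is
// the symptom reported in this bug. (Hypothetical helper.)
func unitLoadState(name string) (string, error) {
	out, err := exec.Command("systemctl", "show", name).Output()
	if err != nil {
		return "", err
	}
	return parseLoadState(string(out)), nil
}

func main() {
	state, err := unitLoadState("something-bogus")
	if err != nil {
		fmt.Println("systemctl show failed:", err)
		return
	}
	fmt.Println("LoadState:", state)
}
```

On a systemd version without the regression, a non-existent unit is expected to report `LoadState=not-found` with a zero exit status; with the affected version the command fails instead, so the error branch is taken.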
That version is from the ops mirror repo: https://mirror.openshift.com/enterprise/rhel/rhel7next/os
It should be in the wild in RHEL 7.3.
Looks like docker-1.12 brings this level of systemd along as well.
I think this is being counted as a regression in systemd and addressed in https://bugzilla.redhat.com/show_bug.cgi?id=1380259 - as such I'm inclined not to try to work around the changed systemctl behavior.
*** Bug 1432221 has been marked as a duplicate of this bug. ***
The original report was due to a regression in systemctl which was eventually addressed in a systemd update. Checking for both Origin and OCP units is normal; it was just the error received in doing so that was a problem. However I still needed to update the master unit names for the split into -controllers and -api. https://github.com/openshift/origin/pull/18542 does that.
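To illustrate the intended behavior (check every known unit name, including the split master units, and quietly skip the ones that are not present), here is a rough sketch. It is not the code from the pull request: the function `discoverUnits`, the injected `loadState` lookup, and the exact candidate list are assumptions for illustration, using only unit names mentioned in this bug. Treating a lookup error as "skip" is a simplification; a real diagnostic would likely report it.

```go
package main

import "fmt"

// candidateUnits lists the unit names mentioned in this bug report.
// The real diagnostic may check a different or longer list.
var candidateUnits = []string{
	"origin-master",
	"origin-node",
	"kubernetes",
	"atomic-openshift-node",
	"atomic-openshift-master-api",
	"atomic-openshift-master-controllers",
}

// discoverUnits returns the candidates whose load state is "loaded".
// Units in any other state, or whose lookup fails, are skipped without
// output. The loadState function is injected so the loop can be
// exercised without a running systemd. (Hypothetical helper.)
func discoverUnits(candidates []string, loadState func(string) (string, error)) []string {
	var found []string
	for _, name := range candidates {
		state, err := loadState(name)
		if err != nil || state != "loaded" {
			continue // quietly skip units that are not present
		}
		found = append(found, name)
	}
	return found
}

func main() {
	// Fake lookup simulating an HA master running the split master units.
	present := map[string]bool{
		"atomic-openshift-node":               true,
		"atomic-openshift-master-api":         true,
		"atomic-openshift-master-controllers": true,
	}
	fake := func(name string) (string, error) {
		if present[name] {
			return "loaded", nil
		}
		return "not-found", nil
	}
	for _, unit := range discoverUnits(candidateUnits, fake) {
		fmt.Println("found unit:", unit)
	}
}
```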
Backports:
3.8: https://github.com/openshift/origin/pull/18555
3.7: https://github.com/openshift/origin/pull/18556
3.6: https://github.com/openshift/origin/pull/18557
It does not seem worth filing bugs for earlier releases, nor backporting earlier than 3.6, but both can be done as needed.
Not reproduced on oc/openshift v3.9.0-0.47.0. Systemd units are quietly skipped; the only output is the hint "[Note] Performing systemd discovery".
Checked the backports. The issue is not reproduced; the result is the same as on 3.9. Versions checked: oc/openshift v3.6.173.0.104 and oc/openshift v3.7.31.
Checked on oc/openshift v3.8.32 as well; it also shows only the note "[Note] Performing systemd discovery". This bug can be verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489