Description of problem:
When using crm_resource to list systemd resources, the list does not contain all resources available on a machine. On the other hand it contains resources that should probably not be listed (like systemd-fsck@dev-disk-by\x2duuid-95f6f325\x2d6149\x2d4e8f\x2daadf\x2de78bc53c4a86).

Version-Release number of selected component (if applicable):
pacemaker-1.1.15-10.el7.x86_64

How reproducible:
always, easily

Steps to Reproduce:
# systemctl list-unit-files | grep haproxy
haproxy.service disabled
# crm_resource --list-agents systemd | grep haproxy

Actual results:
crm_resource does not list haproxy service

Expected results:
crm_resource lists haproxy service

Additional info:
It seems the issue can be fixed in lib/services/systemd.c where systemd_unit_listall function gets list of services by sending "ListUnits" commands to systemd instead of "ListUnitFiles". This is reproducible on command line:

# systemctl list-units | grep haproxy
# systemctl list-unit-files | grep haproxy
haproxy.service

Or you can get the dbus call result like this (omitting the actual output as it is too long):

# dbus-send --system --print-reply --dest=org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager.ListUnits 2>/dev/null
# dbus-send --system --print-reply --dest=org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager.ListUnitFiles 2>/dev/null

This may be only reproducible when the service is disabled and not running. Which is exactly the status you have before creating a resource from the service. This leads to a situation that you want to check if a service exists and what parameters it takes before you create a resource and pacemaker just tells you there is no such service.

This was reported upstream today on irc.freenode.net #clusterlabs:

[09:52] <rbjorklin> I can't find a resource for haproxy, I also can't find documentation on how to add services myself. Does someone know if this is possible?
[09:54] <rbjorklin> Preferably I would like to use the systemd standard as systemd is used in Centos 7
[10:05] <krig> rbjorklin: just use systemd:haproxy for haproxy
[10:05] <krig> rbjorklin: don't need an agent
[10:05] <rbjorklin> So that works even though it doesn't show up in: pcs resource agents systemd ?
[10:06] <krig> ah, well. If there's no systemd service for haproxy, then it won't work.
[10:06] <rbjorklin> There is a systemd service for haproxy
[10:06] <krig> in that case, it will work, yes.
[10:06] <rbjorklin> It just doesn't get listed with the rest
[10:06] <krig> that's odd.. but it should still work as long as it's there.
[10:07] <rbjorklin> I'll give it a go and report back, thanks for your assistance!
[10:07] <krig> no problem!
[10:10] <rbjorklin> krig: seems to be working although it's unlisted. Cheers mate!
[10:15] <krig> rbjorklin: great! yeah, must be a pcs issue that it doesn't show up. pacemaker itself talks directly to systemd so if systemd knows about it, it should work fine

They think it is a pcs issue. That is, however, not true: pcs gets the list of agents from crm_resource --list-agents.
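
To make the gap between the two listings concrete, here is a minimal shell sketch that reports service unit files missing from the agent list. The function name missing_agents and the use of saved output files are illustrative assumptions, not part of pacemaker or pcs. It also assumes that crm_resource prints one agent name per line without the .service suffix.

```shell
#!/bin/sh
# Hypothetical helper (not part of pacemaker): print services that have a
# unit file but do not appear in the agent list.
#   $1: file with output of `systemctl list-unit-files --type=service --no-legend`
#   $2: file with output of `crm_resource --list-agents systemd`
missing_agents() {
    tmp_units=$(mktemp) || return 1
    tmp_agents=$(mktemp) || { rm -f "$tmp_units"; return 1; }

    # First column is the unit file name; drop the .service suffix so the
    # names are comparable with the agent list (assumed format).
    awk 'NF { sub(/\.service$/, "", $1); print $1 }' "$1" \
        | LC_ALL=C sort -u > "$tmp_units"
    LC_ALL=C sort -u "$2" > "$tmp_agents"

    # comm -23 keeps the lines that occur only in the first (unit file) list.
    LC_ALL=C comm -23 "$tmp_units" "$tmp_agents"

    rm -f "$tmp_units" "$tmp_agents"
}

# Possible usage on a cluster node:
#   systemctl list-unit-files --type=service --no-legend > units.txt
#   crm_resource --list-agents systemd > agents.txt
#   missing_agents units.txt agents.txt
```

On an affected build, a disabled and stopped service such as haproxy would be expected to show up in the output of this comparison.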
Confirmed -- this was recently reported upstream, too.
Recently a bug was found in pcs upstream in parsing systemd: (and service:) resource agent names: bz1419661. The same bug is in crm_resource as well:

[root@rh73-node1:~]# crm_resource --show-metadata systemd:nonexistent@some:thing
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="thing" version="0.1">
  <version>1.0</version>
  <longdesc lang="en">
    thing.service
  </longdesc>
  <shortdesc lang="en">systemd unit file for thing</shortdesc>
  <parameters>
  </parameters>
  <actions>
    <action name="start" timeout="100" />
    <action name="stop" timeout="100" />
    <action name="status" timeout="100" />
    <action name="monitor" timeout="100" interval="60"/>
    <action name="meta-data" timeout="5" />
  </actions>
  <special tag="systemd">
  </special>
</resource-agent>

Agent name should be "nonexistent@some:thing" not just "thing".

This however may become irrelevant once pacemaker switches to ListUnitFiles.
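
The expected parsing can be illustrated with a short shell sketch that splits the specification at the first colon only, so any later colon stays in the agent name. This is not pacemaker's code (which is C), and the function name split_agent_spec is hypothetical. It applies only to two-part specifications such as systemd: and service:, since OCF agents use the three-part ocf:provider:agent form.

```shell
#!/bin/sh
# Hypothetical helper: split "standard:agent" at the FIRST colon only.
# Prints the standard on the first line and the agent name on the second.
split_agent_spec() {
    spec=$1
    case $spec in
        *:*) ;;
        *) echo "no standard prefix in: $spec" >&2; return 1 ;;
    esac
    standard=${spec%%:*}   # longest ':*' suffix removed: text before first colon
    agent=${spec#*:}       # shortest '*:' prefix removed: text after first colon
    printf '%s\n%s\n' "$standard" "$agent"
}

# Possible usage:
#   split_agent_spec 'systemd:nonexistent@some:thing'
```

Splitting at the last colon instead would yield "thing", which matches the incorrect name shown in the metadata output above.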
I don't think we're going to make the 7.4 deadline for this, so bumping to 7.5.
QA: To test, run:

crm_resource --list-agents systemd

Before the fix, the output will only show active systemd units, not those that are disabled and thus available for cluster management. Also, the output will be unsorted.

After the fix, the output will show all systemd units that can be managed by the cluster, and they will be sorted alphabetically.
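
The two checks described above could be scripted roughly as follows. The helper names is_sorted and lists_agent are hypothetical, and the sketch assumes that "sorted alphabetically" means plain bytewise (C locale) order, which this report does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the lines on stdin are already in
# bytewise (C locale) order. The ordering rule is an assumption.
is_sorted() {
    LC_ALL=C sort -c 2>/dev/null
}

# Hypothetical helper: succeeds when stdin contains a line equal to $1.
lists_agent() {
    grep -qxF -- "$1"
}

# Possible usage on a cluster node, with a disabled and stopped service:
#   crm_resource --list-agents systemd | is_sorted && echo "sorted"
#   crm_resource --list-agents systemd | lists_agent postfix && echo "listed"
```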
(In reply to Tomas Jelinek from comment #3)
> Recently a bug was found in pcs upstream in parsing systemd: (and service:)
> resource agent names: bz1419661. The same bug is in crm_resource as well:
>
> [root@rh73-node1:~]# crm_resource --show-metadata
> systemd:nonexistent@some:thing
> <?xml version="1.0"?>
> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
> <resource-agent name="thing" version="0.1">
> <version>1.0</version>
> <longdesc lang="en">
> thing.service
> </longdesc>
> <shortdesc lang="en">systemd unit file for thing</shortdesc>
> <parameters>
> </parameters>
> <actions>
> <action name="start" timeout="100" />
> <action name="stop" timeout="100" />
> <action name="status" timeout="100" />
> <action name="monitor" timeout="100" interval="60"/>
> <action name="meta-data" timeout="5" />
> </actions>
> <special tag="systemd">
> </special>
> </resource-agent>
>
> Agent name should be "nonexistent@some:thing" not just "thing".
>
> This however may become irrelevant once pacemaker switches to ListUnitFiles.

The above issue is also fixed in the same build.
Marking Verified in version pacemaker-1.1.18-11.el7.

1) BEFORE fix - crm_resource does not list available resources which are disabled and not running
=============

# rpm -q pacemaker
pacemaker-1.1.16-12.el7_4.7.x86_64

## crm_resource lists postfix service when it is enabled
# systemctl list-unit-files | grep postfix
postfix.service enabled
# crm_resource --list-agents systemd | grep postfix
postfix

## postfix service is no longer in crm_resource list after it was disabled and stopped
# systemctl disable postfix
Removed symlink /etc/systemd/system/multi-user.target.wants/postfix.service.
# systemctl stop postfix
# systemctl list-unit-files | grep postfix
postfix.service disabled
# crm_resource --list-agents systemd | grep postfix
# echo $?
1

2) AFTER fix - crm_resource lists available resources which are disabled and not running
=============

# rpm -q pacemaker
pacemaker-1.1.18-11.el7.x86_64

# systemctl list-unit-files | grep postfix
postfix.service enabled
# crm_resource --list-agents systemd | grep postfix
postfix

# systemctl disable postfix
Removed symlink /etc/systemd/system/multi-user.target.wants/postfix.service.
# systemctl stop postfix
# systemctl list-unit-files | grep postfix
postfix.service disabled
# crm_resource --list-agents systemd | grep postfix
postfix

## units in list are sorted alphabetically
# crm_resource --list-agents systemd | head
abrt-ccpp
abrt-oops
abrt-pstoreoops
abrt-vmcore
abrt-xorg
abrtd
arp-ethers
atd
auditd
auth-rpcgss-module
# crm_resource --list-agents systemd | tail
systemd-update-utmp
systemd-update-utmp-runlevel
systemd-user-sessions
systemd-vconsole-setup
tcsd
teamd@
tuned
unbound-anchor
usb_modeswitch@
wpa_supplicant
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:0860