Bug 1346014

Summary:	debug-* actions don't work with all resource classes
Product:	Red Hat Enterprise Linux 8	Reporter:	Ken Gaillot <kgaillot>
Component:	pacemaker	Assignee:	Ken Gaillot <kgaillot>
Status:	CLOSED ERRATA	QA Contact:	cluster-qe <cluster-qe>
Severity:	low	Docs Contact:
Priority:	low
Version:	8.0	CC:	cfeist, cluster-maint, idevat, msmazova, omular, tojeline
Target Milestone:	pre-dev-freeze	Keywords:	Reopened, Triaged
Target Release:	8.6	Flags:	pm-rhel: mirror+
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	pacemaker-2.1.2-1.el8	Doc Type:	No Doc Update
Doc Text:	This is a self-explanatory change that will not affect most users	Story Points:	---
Clone Of:		Environment:
Last Closed:	2022-05-10 14:09:46 UTC	Type:	Enhancement
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:	2.1.2
Embargoed:

Description Ken Gaillot 2016-06-13 16:30:14 UTC

Description of problem: The various "pcs resource debug-" commands work fine with OCF resources, partially with LSB resources, and not at all with systemd resources (I didn't test other classes like nagios, but they're not really important for this).

Version-Release number of selected component (if applicable): all

How reproducible: Trivially

Steps to Reproduce:
1. Configure a cluster with non-OCF resources such as LSB and systemd.
2. Run "pcs resource debug-monitor" with those resources.

Actual results:

Example for an LSB resource:
Error performing operation: No such process
Operation monitor for lsb-dummy (lsb::/usr/share/pacemaker/tests/cts/LSBDummy) returned 3
> stdout: Dummy LSB service is stopped
> stdout: LSBDummy status : 3

Example for a systemd resource:
Error performing operation: Argument list too long

I get the "Argument list too long" message with OCF agents as well, but with the right output after that.

Expected results:

For LSB and OCF resources, no "Error performing operation" message.

For systemd resources, maybe an "Error: systemd resources not supported" message (this is already used for fencing resources: "Error: stonith devices are not supported", though at least debug-monitor might be supportable for stonith). Alternatively, it could call "systemctl status", but that would not be exactly comparable to what the cluster does, so it might be better to say it's unsupported.

Other info:

The full list of classes supported by Pacemaker is: lsb, nagios, ocf, service, stonith, systemd, upstart. With "service", Pacemaker will look for lsb, systemd and upstart agents, and use the first one it finds.
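The "first one it finds" behaviour of the "service" class can be sketched roughly as below. This is only an illustration, not Pacemaker's code: the lookup order is taken from the order given in the sentence above, and `resolve_service_class`, `SERVICE_LOOKUP_ORDER`, and the probe functions are hypothetical names invented for the example.

```python
from typing import Callable, Mapping, Optional

# Lookup order as listed in the description above; treat it as illustrative.
SERVICE_LOOKUP_ORDER = ("lsb", "systemd", "upstart")


def resolve_service_class(
    agent: str,
    agent_exists: Mapping[str, Callable[[str], bool]],
) -> Optional[str]:
    """Return the first class in SERVICE_LOOKUP_ORDER that provides `agent`.

    `agent_exists` maps a class name to a probe function (hypothetical);
    classes without a probe are skipped. Returns None if nothing matches.
    """
    for standard in SERVICE_LOOKUP_ORDER:
        probe = agent_exists.get(standard)
        if probe is not None and probe(agent):
            return standard
    return None


# Hypothetical probes standing in for real checks of init scripts and units.
probes = {
    "lsb": lambda name: name in {"LSBDummy"},
    "systemd": lambda name: name in {"postfix"},
}

print(resolve_service_class("LSBDummy", probes))
print(resolve_service_class("postfix", probes))
```

With these made-up probes, a "service" resource named LSBDummy would be treated as LSB and one named postfix as systemd, which is why the two can behave differently under the debug commands.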

Comment 2 Tomas Jelinek 2016-06-14 07:52:05 UTC

Pcs basically just calls "crm_resource -r <resource> --force-<action>" and displays its output. See https://github.com/ClusterLabs/pcs/blob/359e8f5c71fd6d3e3563df63efa15cfe211119da/pcs/resource.py#L2226 for details. So I am not sure what can be done in pcs to fix this. Certainly we can check for the resource's class and say systemd resources are not supported. On the other hand the message about unsupported stonith resources comes from crm_resource and pcs just catches it. It might be better to stick to this approach.
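For illustration, the translation from a pcs debug command to a crm_resource call might look like the sketch below. It is not the pcs source: `build_crm_resource_argv` and `DEBUG_TO_FORCE` are hypothetical names, and only the `-r`, `-V`, `--force-check`, `--force-start`, and `--force-stop` options that appear in this report are used.

```python
from typing import List

# debug-monitor -> --force-check is visible in the transcripts of this report;
# the start and stop pairs are assumed to follow the same naming.
DEBUG_TO_FORCE = {
    "debug-monitor": "--force-check",
    "debug-start": "--force-start",
    "debug-stop": "--force-stop",
}


def build_crm_resource_argv(
    debug_action: str, resource_id: str, verbose: bool = False
) -> List[str]:
    """Build the argument vector for the matching crm_resource call."""
    try:
        force_option = DEBUG_TO_FORCE[debug_action]
    except KeyError:
        raise ValueError(f"unsupported debug action: {debug_action}") from None
    argv = ["crm_resource", "-r", resource_id, force_option]
    if verbose:
        argv.append("-V")
    return argv


print(build_crm_resource_argv("debug-monitor", "postfix", verbose=True))
```

Because the wrapper only builds a command line, any per-class behaviour (supported or not) has to come from crm_resource itself, which matches the point made above about where the stonith message originates.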

It looks like start and stop work just fine for systemd, monitor is broken:
[root@rh72-node1:~/devel/pcs/pcs]# crm_resource -r postfix --force-check -V
Error performing operation: Argument list too long
[root@rh72-node1:~/devel/pcs/pcs]# crm_resource -r postfix --force-start -V
Operation start for postfix (systemd::postfix) returned 0
[root@rh72-node1:~/devel/pcs/pcs]# crm_resource -r postfix --force-stop -V
Operation stop for postfix (systemd::postfix) returned 0
[root@rh72-node1:~/devel/pcs/pcs]# pcs resource show postfix
 Resource: postfix (class=systemd type=postfix)
  Meta Attrs: target-role=Stopped 
  Operations: monitor interval=60s (postfix-monitor-interval-60s)

The "Error performing operation: No such process" message comes from crm_resource's stderr. The catch is that pcs mixes stdout and stderr together. This could be fixed in pcs, but we would need to figure out when stderr is not relevant. We probably cannot tell from crm_resource's exit code:
[root@rh68-node1:~]# crm_resource -r postfix --force-check -V
Operation monitor for postfix (lsb::postfix) returned 3
 >  stdout: master is stopped
Error performing operation: No such process
[root@rh68-node1:~]# echo $?
3
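One way a caller could keep the two streams apart is sketched below, assuming plain `subprocess` from the Python standard library. `run_split` and `CommandResult` are hypothetical names, and a small Python child process stands in for crm_resource so the example runs without a cluster.

```python
import subprocess
import sys
from typing import NamedTuple, Sequence


class CommandResult(NamedTuple):
    returncode: int
    stdout: str
    stderr: str


def run_split(argv: Sequence[str]) -> CommandResult:
    """Run a command and capture stdout and stderr separately."""
    completed = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=False,  # a non-zero exit is a result here, not an exception
    )
    return CommandResult(completed.returncode, completed.stdout, completed.stderr)


# Stand-in for crm_resource: agent output on stdout, error text on stderr,
# exit status 3 as in the transcript above.
child_code = (
    "import sys; "
    "print('master is stopped'); "
    "print('Error performing operation: No such process', file=sys.stderr); "
    "sys.exit(3)"
)
result = run_split([sys.executable, "-c", child_code])
print("exit:", result.returncode)
print("stdout:", result.stdout.strip())
print("stderr:", result.stderr.strip())
```

Separating the streams only makes the choice possible. Deciding when the stderr text is noise still needs a rule, and as the transcript shows, the exit status alone does not provide one.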

Can you describe in more detail how you would like this to be fixed?

Comment 3 Ken Gaillot 2016-06-14 13:57:58 UTC

I feel silly now :) Of course this should be against pacemaker.

Comment 4 Andrew Beekhof 2016-06-15 02:58:50 UTC

Agree. It would be nice to support all the other non-fencing classes

Comment 5 Ken Gaillot 2017-01-10 21:54:39 UTC

This will not be addressed in the 7.4 timeframe

Comment 6 Ken Gaillot 2017-07-18 15:48:03 UTC

Due to the short 7.5 cycle and limited resources, bumping this to 7.6.

Comment 9 Ken Gaillot 2020-10-02 21:22:43 UTC

Due to developer time constraints, this issue has been opened as an upstream bug report, and this report will be closed. If developer time becomes available, this report will be reopened.

Comment 10 Ken Gaillot 2021-08-17 19:49:34 UTC

Fixed upstream as of commit b4e426a0

The crm_resource --force-* options support OCF and LSB resources (including LSB resources configured as the "service" class). All other resource classes (such as systemd) will now give an error message about being unsupported. LSB monitor results will be reported as OCF results (in particular, an LSB exit status of 3 for --force-check will be mapped to "not running", which is OCF exit status 7).

Trying to use one of the options with a bundle will also give an error message.
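The LSB-to-OCF mapping described above can be pictured with a small sketch. Only the two pairs visible in this report are encoded (LSB status 0 reported as 0 "ok", and LSB status 3 reported as 7 "not running"); the fallback to the generic OCF error code 1 for every other status is an assumption made for the example, and `lsb_status_to_ocf` is a hypothetical name, not a Pacemaker function.

```python
OCF_OK = 0
OCF_UNKNOWN_ERROR = 1  # generic OCF error code
OCF_NOT_RUNNING = 7

LSB_STATUS_OK = 0
LSB_STATUS_NOT_RUNNING = 3


def lsb_status_to_ocf(lsb_status: int) -> int:
    """Translate an LSB "status" exit code into an OCF monitor result."""
    if lsb_status == LSB_STATUS_OK:
        return OCF_OK
    if lsb_status == LSB_STATUS_NOT_RUNNING:
        return OCF_NOT_RUNNING
    # Assumption: anything not covered by this report becomes a generic error.
    return OCF_UNKNOWN_ERROR


for status in (0, 3):
    print(status, "->", lsb_status_to_ocf(status))
```

Reporting LSB results in OCF terms means a stopped LSB service and a stopped OCF resource give the same "not running" answer from --force-check.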

Comment 18 Markéta Smazová 2022-01-21 17:02:02 UTC

>   [root@virt-524 ~]# rpm -q pacemaker
>   pacemaker-2.1.2-2.el8.x86_64
>   [root@virt-524 ~]# rpm -q pacemaker-cts
>   pacemaker-cts-2.1.2-2.el8.noarch

Configure a cluster with LSB, OCF, systemd and service class resources. The LSBDummy resource is included in the pacemaker-cts package
and is used in this bug to test the LSB class.

>   [root@virt-524 ~]# pcs resource create lsb1 lsb:LSBDummy op monitor interval=10
>   [root@virt-524 ~]# pcs resource create lsb2 service:LSBDummy op monitor interval=10
>   [root@virt-524 ~]# pcs resource create dummy ocf:pacemaker:Dummy
>   [root@virt-524 ~]# pcs resource create postfix systemd:postfix op monitor interval=60s
>   [root@virt-524 ~]# pcs resource create postfix2 service:postfix op monitor interval=60s

>   [root@virt-524 ~]# pcs status
>   Cluster name: STSRHTS9364
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-525 (version 2.1.2-2.el8-ada5c3b36e2) - partition with quorum
>     * Last updated: Fri Jan 21 16:37:20 2022
>     * Last change:  Fri Jan 21 16:37:12 2022 by hacluster via crmd on virt-524
>     * 2 nodes configured
>     * 7 resource instances configured

>   Node List:
>     * Online: [ virt-524 virt-525 ]

>   Full List of Resources:
>     * fence-virt-524	(stonith:fence_xvm):	 Started virt-524
>     * fence-virt-525	(stonith:fence_xvm):	 Started virt-525
>     * lsb1	(lsb:LSBDummy):	 Started virt-525
>     * lsb2	(service:LSBDummy):	 Started virt-524
>     * postfix	(systemd:postfix):	 Started virt-524
>     * postfix2	(service:postfix):	 Started virt-525
>     * dummy	(ocf::pacemaker:Dummy):	 Started virt-524

>   Daemon Status:
>     corosync: active/enabled
>     pacemaker: active/enabled
>     pcsd: active/enabled

Run `pcs resource debug-monitor` (or `crm_resource --force-*`) with the resources and check the output:

>   [root@virt-524 ~]# crm_resource -r lsb1 --force-check
>   Operation force-check for lsb1 (lsb:LSBDummy) returned 0 (ok)
>   Running OK
>   LSBDummy status : 0

>   [root@virt-524 ~]# crm_resource -r lsb1 --force-stop
>   Operation force-stop for lsb1 (lsb:LSBDummy) returned 0 (ok)
>   LSBDummy stop : 0B service: [  OK  ]


>   [root@virt-524 ~]# pcs resource debug-monitor lsb2
>   Operation force-check for lsb2 (service:LSBDummy) returned 0 (ok)
>   Running OK
>   LSBDummy status : 0

>   [root@virt-524 ~]# crm_resource -r lsb2 --force-start
>   Operation force-start for lsb2 (service:LSBDummy) returned 0 (ok)
>   LSBDummy start : 0 service: [  OK  ]


>   [root@virt-524 ~]# pcs resource debug-monitor dummy
>   Operation force-check for dummy (ocf:pacemaker:Dummy) returned 0 (ok)

>   [root@virt-524 ~]# crm_resource -r postfix --force-stop
>   Operation force-stop for postfix (systemd:postfix) could not be executed (Error: Manual execution of this standard is unsupported)
>   crm_resource: Error performing operation: Unimplemented

>   [root@virt-524 ~]# crm_resource -r postfix2 --force-check
>   Operation force-check for postfix2 (service:postfix) could not be executed (Error: Manual execution of this standard is unsupported)
>   crm_resource: Error performing operation: Unimplemented


Disable the resources:

>   [root@virt-524 ~]# pcs resource disable lsb1
>   [root@virt-524 ~]# pcs resource disable lsb2
>   [root@virt-524 ~]# pcs resource disable postfix
>   [root@virt-524 ~]# pcs resource disable postfix2
>   [root@virt-524 ~]# pcs resource disable dummy

>   [root@virt-524 ~]# pcs status
>   Cluster name: STSRHTS9364
>   Cluster Summary:
>     * Stack: corosync
>     * Current DC: virt-525 (version 2.1.2-2.el8-ada5c3b36e2) - partition with quorum
>     * Last updated: Fri Jan 21 17:38:56 2022
>     * Last change:  Fri Jan 21 17:38:49 2022 by hacluster via crmd on virt-524
>     * 2 nodes configured
>     * 7 resource instances configured (5 DISABLED)

>   Node List:
>     * Online: [ virt-524 virt-525 ]

>   Full List of Resources:
>     * fence-virt-524	(stonith:fence_xvm):	 Started virt-524
>     * fence-virt-525	(stonith:fence_xvm):	 Started virt-525
>     * lsb1	(lsb:LSBDummy):	 Stopped (disabled)
>     * lsb2	(service:LSBDummy):	 Stopped (disabled)
>     * postfix	(systemd:postfix):	 Stopped (disabled)
>     * postfix2	(service:postfix):	 Stopped (disabled)
>     * dummy	(ocf::pacemaker:Dummy):	 Stopped (disabled)

>   Daemon Status:
>     corosync: active/enabled
>     pacemaker: active/enabled
>     pcsd: active/enabled

Run `pcs resource debug-monitor` (or `crm_resource --force-check`) on the resources and check the output:

>   [root@virt-524 ~]# pcs resource debug-monitor lsb1
>   crm_resource: Error performing operation: Not running
>   Operation force-check for lsb1 (lsb:LSBDummy) returned 7 (not running)
>   Dummy LSB service is stopped
>   LSBDummy status : 3


>   [root@virt-524 ~]# crm_resource -r lsb2 --force-check
>   Operation force-check for lsb2 (service:LSBDummy) returned 7 (not running)
>   Dummy LSB service is stopped
>   LSBDummy status : 3

>   crm_resource: Error performing operation: Not running


>   [root@virt-524 ~]# pcs resource debug-monitor dummy
>   crm_resource: Error performing operation: Not running
>   Operation force-check for dummy (ocf:pacemaker:Dummy) returned 7 (not running)

>   [root@virt-524 ~]# crm_resource -r postfix --force-check
>   Operation force-check for postfix (systemd:postfix) could not be executed (Error: Manual execution of this standard is unsupported)
>   crm_resource: Error performing operation: Unimplemented

>   [root@virt-524 ~]# pcs resource debug-monitor postfix2
>   Operation force-check for postfix2 (service:postfix) could not be executed (Error: Manual execution of this standard is unsupported)
>   crm_resource: Error performing operation: Unimplemented


The `pcs resource debug-*` (`crm_resource --force-*`) commands are supported for OCF and LSB resources, including LSB resources
configured as the "service" class (resources lsb1, lsb2, dummy).
Other systemd and service class resources (postfix and postfix2) report an error saying that they are unsupported.


Verified in pacemaker-2.1.2-2.el8.

Comment 20 errata-xmlrpc 2022-05-10 14:09:46 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1885