Bug 1384109

Summary:	It takes about 1 second to get stonith agent metadata using crm_resource
Product:	Red Hat Enterprise Linux 7	Reporter:	Tomas Jelinek <tojeline>
Component:	pacemaker	Assignee:	Jan Pokorný [poki] <jpokorny>
Status:	CLOSED WONTFIX	QA Contact:	cluster-qe <cluster-qe>
Severity:	low	Docs Contact:
Priority:	high
Version:	7.3	CC:	cluster-maint, jpokorny, kgaillot, mnovacek, phagara
Target Milestone:	rc
Target Release:	7.9
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	No Doc Update
Doc Text:	undefined	Story Points:	---
Clone Of:
Clones:	1552654		Environment:
Last Closed:	2020-02-21 16:56:40 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1552654

Description Tomas Jelinek 2016-10-12 14:56:07 UTC

Description of problem:
It takes about 1 second to get stonith agent metadata using crm_resource --show-metadata.

Recently (bz1262001), pcs switched to reading all information about resource and stonith agents from pacemaker. As a result, listing all fence agents using pcs takes quite a long time when many fence agents are installed.

Also, whenever pcs needs to work with metadata, there is a noticeable slowdown compared to the previous version.

Since pacemaker changes bits and pieces in the metadata, we prefer that pcs not get metadata directly from the agents.


Version-Release number of selected component (if applicable):
[root@rh68-node1:~]# rpm -q pacemaker
pacemaker-1.1.15-1.el6.x86_64


How reproducible:
always, easily


Steps to Reproduce:
[root@rh68-node1:~]# time fence_apc -o metadata > /dev/null

real    0m0.053s
user    0m0.037s
sys     0m0.012s
[root@rh68-node1:~]# time crm_resource --show-metadata stonith:fence_apc > /dev/null

real    0m1.017s
user    0m0.070s
sys     0m0.007s

# listing using pcs, one agent per line
[root@rh68-node1:~]# time pcs stonith list | wc
     44     337    2087

real    0m44.872s
user    0m2.318s
sys     0m0.437s


Actual results:
It takes about 1 second to get metadata.


Expected results:
It should take about the same time as getting it from the agent directly.
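
The comparison from the steps to reproduce can also be scripted, which makes it easier to repeat the measurement. The following is only a minimal sketch: the helper names time_command and best_of are hypothetical, and taking the best of several runs is an assumption made here to reduce noise, not something the report itself did.

```python
import subprocess
import time


def time_command(argv):
    """Return the wall-clock seconds taken to run argv, discarding its stdout."""
    start = time.monotonic()
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    return time.monotonic() - start


def best_of(argv, runs=5):
    """Return the shortest wall-clock time observed over several runs of argv."""
    return min(time_command(argv) for _ in range(runs))
```

On a cluster node, best_of(["fence_apc", "-o", "metadata"]) and best_of(["crm_resource", "--show-metadata", "stonith:fence_apc"]) would correspond to the two single-agent measurements above. With roughly 1 second per agent, the 44 agents listed by pcs account for the roughly 45 seconds observed.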

Comment 2 Ken Gaillot 2016-10-12 21:45:21 UTC

This does seem odd.

Reassigning to RHEL7, as RHEL6 is only getting high-priority bugfixes now, and the behavior is present on RHEL7.

Comment 3 Jan Pokorný [poki] 2017-01-26 19:39:16 UTC

If I am not mistaken, part of the issue may be that there are two
round trips hidden in the stonith query (left), as opposed to only
one in the resource query (right):

lrmd API client             lrmd API client
(crm_resource)              (crm_resource)
                            
  |        ^                  |        ^
  |        |                  |        |
  v        |                  v        |
                            
lrmd API handler            lrmd API handler
    (lrmd)                      (lrmd)
                   
  |        ^       
  |        |       
  v        |       
                   
 stonith-ng API
    handler
  (stonithd)

Comment 4 Jan Pokorný [poki] 2017-01-27 13:38:53 UTC

Sorry, there's in fact no message routing round trip at all in the
context of pcs' use of crm_resource.

Results in RHEL 7.3 VM:

- "/usr/sbin/fence_apc -o metadata" takes around 0.084s

- "crm_resource --show-metadata stonith:fence_apc" around 1.022s

Using strace with timeouts, I can see that there is a significant
pause (750-800 ms) after the forked process that execs fence_apc has
exited and before the WNOHANG wait resumes.
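
A pause of this kind is what a coarse polling loop around a non-blocking wait would produce. The sketch below is an illustration only and is not pacemaker's code: the function name run_with_polling and the interval values are assumptions chosen to mirror the observed timings.

```python
import subprocess
import time


def run_with_polling(argv, poll_interval):
    """Run argv and reap it with non-blocking waits, sleeping between polls.

    Returns (exit_status, elapsed_seconds).
    """
    start = time.monotonic()
    child = subprocess.Popen(argv, stdout=subprocess.DEVNULL)
    # Popen.poll() does not block; on POSIX it checks the child with
    # waitpid(..., WNOHANG) and returns None while the child is still running.
    while child.poll() is None:
        # If the child exits right after this sleep starts, the exit is
        # not noticed until the full interval has elapsed.
        time.sleep(poll_interval)
    return child.returncode, time.monotonic() - start
```

With a child that finishes in a few tens of milliseconds, a poll_interval of 1.0 makes the call take about one second in total, while an interval of 0.1 brings it down to a few tenths of a second at most. The cost is determined by the polling granularity rather than by the work the child does.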

Comment 5 Jan Pokorný [poki] 2017-01-27 15:42:36 UTC

Looks like commit 12cf7b901733a96e4a7844e9f596430c5e8c2a3c introduced
an unnecessary block-for-a-second penalty.

Proposed and tested fix (speedup by a factor of 10):
https://github.com/ClusterLabs/pacemaker/pull/1214

Unfortunately, it currently fails an lrmd test
(more investigation pending).

Comment 6 Ken Gaillot 2017-05-10 14:50:03 UTC

This will not be ready for 7.4; bumping to 7.5.

Comment 7 Ken Gaillot 2017-10-18 22:33:50 UTC

This will not make it in time for 7.5.

Comment 9 Ken Gaillot 2020-02-21 16:56:40 UTC

Due to developer time constraints, this is unlikely to be done in the 7.9 time frame, so it will be fixed for RHEL 8 only (Bug 1552654).