Bug 1461377 - Make manual fencing via stonith_admin use same timeout as configured in CIB [RHEL 7]
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pacemaker
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: low
Target Milestone: pre-dev-freeze
Target Release: 7.9
Assignee: Chris Lumens
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1693256
Reported: 2017-06-14 10:31 UTC by Klaus Wenninger
Modified: 2020-02-21 17:09 UTC (History)
CC List: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1693256
Environment:
Last Closed: 2020-02-21 17:09:33 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 4540611 0 None None None 2019-10-30 02:33:07 UTC

Description Klaus Wenninger 2017-06-14 10:31:48 UTC
Description of problem:
When pengine creates a transition containing a fencing request, it includes the stonith-timeout value, which tengine then passes to the stonith API.
When fencing is triggered manually via pcs, it seems to use the default timeout implemented in stonith_admin (120s) instead.

Version-Release number of selected component (if applicable):
Version     : 0.9.158
Release     : 2.el7

How reproducible:
100%

Steps to Reproduce:
1. Configure a fencing device that takes a long and reproducible time to fence (e.g. 90s)
2. 'pcs property set stonith-timeout=60s'
3. Fence a node using pcs ('pcs stonith fence node3') --> success
4. Pull the power cord so that pengine triggers fencing --> timeout

Actual results:

Manually triggered fencing succeeds, while fencing triggered by pengine times out.

Expected results:

Both cases should time out if stonith-timeout is set too low.


Additional info:
A timeout can be handed over to stonith_admin.
This could be raised against pacemaker as well, but stonith_admin should remain (as I guess it is at the moment) a tool that can be used to test stonithd standalone.
When the pcmk_reboot_timeout attribute (or the similar attribute for another action) is used, the behaviour is already consistent, because the timeout passed via the stonith API is overruled.
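
The mismatch described above can be illustrated with a toy model. This is only a sketch under assumptions: the function names are hypothetical, the 90s device duration and 60s stonith-timeout come from the reproduction steps, and real pacemaker timeout handling is more involved than a single comparison.

```python
from typing import Optional

# Default timeout of stonith_admin as reported in this bug.
STONITH_ADMIN_DEFAULT_TIMEOUT_S = 120


def effective_timeout(requested_s: int, device_timeout_s: Optional[int] = None) -> int:
    """Return the timeout that applies to one fencing action.

    A per-device timeout (e.g. pcmk_reboot_timeout) overrules the
    timeout passed in with the request, as noted in the report.
    """
    return device_timeout_s if device_timeout_s is not None else requested_s


def fencing_succeeds(device_duration_s: int, requested_s: int,
                     device_timeout_s: Optional[int] = None) -> bool:
    """Toy check: fencing succeeds if the device finishes within the timeout."""
    return device_duration_s <= effective_timeout(requested_s, device_timeout_s)


if __name__ == "__main__":
    device_duration_s = 90   # slow but reproducible device (step 1)
    stonith_timeout_s = 60   # cluster property (step 2)

    # Manual fencing: stonith_admin default is used, so the device finishes in time.
    print("manual :", fencing_succeeds(device_duration_s, STONITH_ADMIN_DEFAULT_TIMEOUT_S))
    # pengine-triggered fencing: stonith-timeout from the CIB is used, so it times out.
    print("pengine:", fencing_succeeds(device_duration_s, stonith_timeout_s))
```

With a per-device timeout set, both paths resolve to the same value in this model, which matches the observation that the behaviour is already consistent in that case.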

Comment 2 Tomas Jelinek 2017-06-14 12:06:28 UTC
Klaus,

Can you describe more precisely what you propose pcs should do?

A timeout can be set on specific stonith resources. But the user only specifies a node to be fenced in the "pcs stonith fence" command. In that case pcs should not look for a specific fencing device, take its timeout and pass it to stonith_admin, as pcs does not know which device will be used to fence the node.

To me it makes sense that pacemaker should figure out and use whatever timeout is set in the CIB, unless the timeout is overridden from the command line (stonith_admin --timeout), which is functionality pcs currently does not provide.

Comment 3 Klaus Wenninger 2017-06-14 12:30:35 UTC
(In reply to Tomas Jelinek from comment #2)
> Klaus,
> 
> Can you describe more precisely what do you propose pcs should do?
> 
> Timeout can be set to specific stonith resources. But the user only
> specifies a node to be fenced in the "pcs stonith fence" command. In that
> case pcs should not look for a specific fencing device, take its timeout and
> pass it to stonith_admin as pcs does not know what device will be used to
> fence the node.
> 
> For me it makes sense that pacemaker should figure out and use whatever
> timeout is set in the CIB unless the timeout is overridden from the command
> line (stonith_admin --timeout) which is functionality pcs currently does not
> provide.

Well, as described under additional info, one could discuss where to solve this issue (pacemaker or pcs)...
When stonithd gets the individual timeouts from the devices, it already works as desired.
Only the case where pengine uses the stonith-timeout property while stonith_admin uses the hardcoded 120s needs to be considered. Actually, no distinction has to be made, because stonithd already handles the overruling by an individual timeout.
As stonith_admin currently uses only the stonith API, it can't get the stonith-timeout property from the CIB without using other APIs, which might break the standalone capability.
Thus I would suggest that pcs gets the timeout from the CIB as the default behaviour, with maybe --timeout to overrule that.
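
A minimal sketch of that pcs-side suggestion, under assumptions: all function names are hypothetical, the CIB properties are represented as a plain dict rather than read from a real cluster, only plain seconds with an optional "s" suffix are parsed, and the exact stonith_admin command line pcs would run is assumed for illustration. This is not how pcs is implemented.

```python
import re
from typing import Dict, List, Optional


def parse_seconds(value: str) -> int:
    """Parse a value such as '60' or '60s' into seconds.

    Simplified: other time units are not handled in this sketch.
    """
    match = re.fullmatch(r"\s*(\d+)\s*s?\s*", value)
    if match is None:
        raise ValueError(f"unsupported timeout value: {value!r}")
    return int(match.group(1))


def choose_timeout(cli_timeout_s: Optional[int],
                   cib_properties: Dict[str, str]) -> Optional[int]:
    """An explicit --timeout wins; otherwise fall back to stonith-timeout from the CIB."""
    if cli_timeout_s is not None:
        return cli_timeout_s
    raw = cib_properties.get("stonith-timeout")
    return parse_seconds(raw) if raw is not None else None


def build_fence_command(node: str, timeout_s: Optional[int]) -> List[str]:
    """Build an illustrative stonith_admin argument list."""
    cmd = ["stonith_admin", "--fence", node]
    if timeout_s is not None:
        cmd += ["--timeout", str(timeout_s)]
    return cmd


if __name__ == "__main__":
    props = {"stonith-timeout": "60s"}
    print(build_fence_command("node3", choose_timeout(None, props)))
    print(build_fence_command("node3", choose_timeout(30, props)))
```

If neither source provides a value, the sketch passes no timeout at all and leaves the default to stonith_admin.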

But I'm not familiar with the standalone capabilities of stonithd.
So I'm adding Ken.
I could imagine stonith_admin adding CIB access when compiled with pacemaker as well. Or, maybe even nicer, we could pass 0 (or -1) as the timeout via the stonith API to tell stonithd to insert the value from the CIB.

Comment 4 Ken Gaillot 2017-06-14 14:27:55 UTC
I do like the idea of using -1 to indicate "use the configured default". Stonithd already maintains a local copy of the entire CIB, so it should be easy to grab stonith-timeout from it when needed.
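
The sentinel idea could look roughly like this. It is a sketch under assumptions, not stonithd code: the names are hypothetical, the cached CIB value is passed in as a plain integer, and the fallback used when no stonith-timeout is configured is an assumption.

```python
from typing import Optional

# Sentinel meaning "use the configured default", as proposed in this thread.
USE_CONFIGURED_DEFAULT = -1

# Assumed fallback when the CIB copy has no stonith-timeout set.
FALLBACK_TIMEOUT_S = 60


def resolve_request_timeout(requested_s: int,
                            cached_cib_timeout_s: Optional[int]) -> int:
    """Resolve the timeout for an incoming fencing request.

    An explicit timeout from the caller is kept as is. The sentinel is
    replaced by stonith-timeout from the daemon's local CIB copy, or by
    the fallback if that property is not set.
    """
    if requested_s != USE_CONFIGURED_DEFAULT:
        return requested_s
    if cached_cib_timeout_s is not None:
        return cached_cib_timeout_s
    return FALLBACK_TIMEOUT_S


if __name__ == "__main__":
    # A caller such as stonith_admin sends the sentinel; the CIB says 60s.
    print(resolve_request_timeout(USE_CONFIGURED_DEFAULT, 60))
    # A caller that passes an explicit timeout keeps it.
    print(resolve_request_timeout(120, 60))
```

Resolving the sentinel inside the daemon keeps the caller free of any CIB access, which is the standalone concern raised in comment 3.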

Comment 5 Klaus Wenninger 2017-06-14 14:30:30 UTC
(In reply to Ken Gaillot from comment #4)
> I do like the idea of using -1 to indicate "use the configured default".
> Stonithd already maintains a local copy of the entire CIB, so it should be
> easy to grab stonith-timeout from it when needed.

yes, my favourite as well - just didn't think of it till I wrote this last comment ... let's do it in pacemaker

Comment 7 Ken Gaillot 2017-10-09 17:45:38 UTC
Due to time constraints, this will not make 7.5

Comment 8 Ken Gaillot 2019-03-18 17:45:05 UTC
Bumping to 7.8 due to time constraints

Comment 10 Ken Gaillot 2020-02-21 17:09:33 UTC
Due to developer time constraints, this is unlikely to be fixed in the 7.9 time frame and so will be fixed for RHEL 8 only (Bug 1693256)

