Bug 1619253

Summary: premature standby status reported by pcs while standby is in progress
Product: Red Hat Enterprise Linux 7
Reporter: Klaus Wenninger <kwenning>
Component: pcs
Assignee: Ondrej Mular <omular>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: unspecified
Docs Contact:
Priority: medium
Version: 7.4
CC: cfeist, cluster-maint, cluster-qe, idevat, jruemker, mnovacek, nhostako, omular, pzimek, sbradley, tojeline
Target Milestone: rc
Target Release: 7.7
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: pcs-0.9.168-1.el7
Doc Type: Bug Fix
Doc Text:
Cause: A node being put into standby may still be running resources while the standby transition is in progress. `pcs status` reports such a node as standby (with running resources), but `pcs status nodes` makes no such distinction.
Consequence: Standby nodes that are still running resources are prematurely listed as standby by the `pcs status nodes` command.
Fix: `pcs status nodes` prints a separate list of standby nodes with running resources.
Result: Standby nodes and standby nodes with running resources are listed separately in the output of the `pcs status nodes` command.
Story Points: ---
Clone Of: 1419548
Environment:
Last Closed: 2020-03-31 19:09:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1420851

Comment 2 Klaus Wenninger 2018-08-20 13:01:04 UTC
xml-schema used for status between crm_mon and pcs already has
running_resources-counter pcs might use to signal that a node
is in standby but still with active resources.

Comment 3 Tomas Jelinek 2018-08-20 13:16:19 UTC
(In reply to Klaus Wenninger from comment #2)
> xml-schema used for status between crm_mon and pcs already has
> running_resources-counter pcs might use to signal that a node
> is in standby but still with active resources.

How exactly does that help to display nodes as "standby with active resources" in crm_mon output? Pcs is not going to modify crm_mon output. And as far as I can tell, the issue is that standby nodes with running resources are not displayed as such in "pcs status", i.e. "crm_mon". Am I missing something?

Can you elaborate on why this was cloned for pcs and what we are supposed to do in pcs? Thanks.

Comment 4 Klaus Wenninger 2018-08-20 13:28:48 UTC
pcs wouldn't have to modify anything it gets from crm_mon.
In the modes where it creates output for textual/HTML display, crm_mon creates a single string for the node mode; in XML output mode it creates a list of flags for each node. pcs then uses these flags to create the output.
Where crm_mon creates the mode string itself, it is being modified to distinguish the cases properly, but where it just passes the flags, there is actually nothing to be changed: it already has the flag for 'standby' and it has the running-resources counter.
I haven't looked at the code that creates output for pcs, but as the flags indicating the mode are already not mutually exclusive, there already has to be some logic there, which would have to be extended in a similar way to the crm_mon code for textual/HTML output.
Please correct me if I'm mixing things up.
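
For illustration, the per-node flags described above could be read roughly as follows. This is a minimal sketch and not pcs code: the attribute names standby and resources_running are the ones mentioned in this bug, while the sample document, the other attributes shown, the node names and the helper read_node_flags are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only. The attribute names follow the flags discussed in
# this bug; the surrounding structure, node names and values are assumptions.
SAMPLE_XML = """
<crm_mon>
  <nodes>
    <node name="node1" online="true" standby="true" maintenance="false" resources_running="1"/>
    <node name="node2" online="true" standby="false" maintenance="false" resources_running="0"/>
  </nodes>
</crm_mon>
"""


def read_node_flags(xml_text):
    """Return {node name: flags} from crm_mon-style XML (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    flags = {}
    for node in root.iter("node"):
        flags[node.get("name")] = {
            # The boolean flags are independent of each other, so a node can
            # be online and standby at the same time.
            "online": node.get("online") == "true",
            "standby": node.get("standby") == "true",
            "maintenance": node.get("maintenance") == "true",
            # Counter of resources still active on the node.
            "resources_running": int(node.get("resources_running", "0")),
        }
    return flags


node_flags = read_node_flags(SAMPLE_XML)
```

Because the flags are not mutually exclusive, whatever consumes them has to decide how to combine them into a single displayed mode.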

Comment 5 Tomas Jelinek 2018-08-20 13:58:53 UTC
'pcs status' displays textual output of crm_mon. So for CLI we should already be covered by changes in pacemaker. Maybe this was meant for GUI?

Comment 6 Klaus Wenninger 2018-08-20 14:34:19 UTC
If there are cases where pcs doesn't derive output from xml we should probably re-add the dependency.
If there are other cases where the xml-data is being used for creating the node-mode we should be consistent there as well.

Comment 7 Klaus Wenninger 2018-08-24 12:14:15 UTC
If it is assured that the mode of a node is taken exclusively from crm_mon's textual display (never generated from XML data), I guess this bz can be closed for pcs, as the fix in crm_mon would already cover a correction in pcs output.
Just wanted to make sure that there are no surprises ;-)

Comment 8 Tomas Jelinek 2019-02-07 16:50:55 UTC
* 'pcs status' displays output from crm_mon, so we are covered by the pacemaker fix for bz1419548
* 'pcs status nodes' displays output based on the XML status, so we should fix that in pcs. We'll probably add a new category "Standby with active resources" and put nodes in there based on standby = true and resources_running > 0
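
The proposed rule could be sketched like this. It is a hedged illustration rather than the actual pcs change: the functions group_nodes and render are hypothetical, the order in which offline, standby and maintenance are checked is an assumption, and the category labels are the ones that appear in the verified output in comment 11.

```python
def group_nodes(nodes):
    """Sort node records into display categories (hypothetical helper)."""
    groups = {
        "Online": [],
        "Standby": [],
        "Standby with resource(s) running": [],
        "Maintenance": [],
        "Offline": [],
    }
    for node in nodes:
        # Assumed precedence: offline first, then standby, then maintenance.
        if not node["online"]:
            label = "Offline"
        elif node["standby"]:
            # The rule from this comment: standby = true and
            # resources_running > 0 puts the node in the new category.
            if node["resources_running"] > 0:
                label = "Standby with resource(s) running"
            else:
                label = "Standby"
        elif node["maintenance"]:
            label = "Maintenance"
        else:
            label = "Online"
        groups[label].append(node["name"])
    return groups


def render(groups):
    """Format one line per category, leaving empty categories blank."""
    lines = []
    for label, names in groups.items():
        lines.append((" %s: %s" % (label, " ".join(names))).rstrip())
    return "\n".join(lines)


# Example records; the flag values are assumed for illustration.
nodes = [
    {"name": "lion76", "online": True, "standby": False,
     "maintenance": False, "resources_running": 0},
    {"name": "kid76", "online": True, "standby": True,
     "maintenance": False, "resources_running": 1},
]
groups = group_nodes(nodes)
print(render(groups))
```

Once the counter for a standby node drops to zero, the same rule moves it to the plain Standby list without any further state tracking.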

Comment 11 Ivan Devat 2019-08-05 11:14:18 UTC
After Fix:

[kid76 ~] $ rpm -q pcs
pcs-0.9.168-1.el7.x86_64

[kid76 ~] $ pcs resource create D1 ocf:heartbeat:Delay stopdelay=120
[kid76 ~] $ pcs resource
 D1     (ocf::heartbeat:Delay): Started kid76 (Monitoring)
[kid76 ~] $ pcs node standby kid76
[kid76 ~] $ pcs status nodes
Pacemaker Nodes:
 Online: lion76
 Standby:
 Standby with resource(s) running: kid76
 Maintenance:
 Offline:
Pacemaker Remote Nodes:
 Online:
 Standby:
 Standby with resource(s) running:
 Maintenance:
 Offline:

Comment 15 errata-xmlrpc 2020-03-31 19:09:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0996