Bug 1384172

Summary: crm_attribute should detect remote node name correctly when different from hostname
Product: Red Hat Enterprise Linux 8
Reporter: Ken Gaillot <kgaillot>
Component: pacemaker
Assignee: Chris Lumens <clumens>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: low
Docs Contact:
Priority: high
Version: 8.0
CC: cluster-maint, jrehova, mnovacek, msmazova, phagara
Target Milestone: rc
Keywords: Reopened, Triaged
Target Release: 8.7
Flags: pm-rhel: mirror+
Hardware: All
OS: All
Whiteboard:
Fixed In Version: pacemaker-2.1.3-2.el8
Doc Type: Bug Fix
Doc Text:
Cause: crm_attribute, when run from the command line on Pacemaker Remote nodes, would use the local hostname as the node name.
Consequence: If a Pacemaker Remote node's name in the cluster differed from its local hostname, crm_attribute run from the command line would not manage attributes correctly for it. (crm_attribute run from resource agents was not affected.)
Fix: crm_attribute now contacts the cluster to learn the local node name.
Result: crm_attribute works as expected when run from the command line on Pacemaker Remote nodes.
Last Closed: 2022-11-08 09:42:25 UTC
Type: Bug

Description Ken Gaillot 2016-10-12 17:40:16 UTC
Description of problem: If a tool such as attrd_updater is used to query or update a node attribute without specifying a node name, it will use the local host uname as the node name. However, the node name can be different. This can particularly be a problem for resource agents that manage node attributes.


Version-Release number of selected component (if applicable): 1.1.15


How reproducible: Easy


Steps to Reproduce:
1. Configure and start a cluster with a cluster node or remote node whose node name is different from its uname.
2. Run "attrd_updater -n attrtest -U 1" on the node.
3. Check which node name was used for the attribute.

Actual results: The uname has been used


Expected results: The configured node name has been used


Additional info: The fix for Bug 1374175 should also be usable for this.

Comment 1 Ken Gaillot 2017-01-10 22:10:05 UTC
This will not be ready in the 7.4 timeframe

Comment 3 Ken Gaillot 2017-10-18 22:31:56 UTC
This will not make it in time for 7.5

Comment 4 Ken Gaillot 2018-06-18 22:19:58 UTC
Detailed history for the record:

The affected tools, when a node name is not explicitly given and thus should default to the local node, are crm_attribute, crm_standby, crm_failcount, crm_master, and attrd_updater. The scenarios of interest are:

* Whether the tool is called from a cluster node or a Pacemaker Remote node

* Whether the tool is called by a resource agent executed by the cluster, or not (i.e. manually on the command line, or by a script not executed by the cluster)

* Whether the calling host's node name in the cluster is the same as its local hostname

The tools have always worked when called in any manner from a full cluster node whose node name is the same as its local hostname, and since upstream version 1.1.9, from a full cluster node whose node name is different from its local hostname. (RHEL has always had newer versions than that.)

They have always worked when called from a Pacemaker Remote node whose node name is the same as its local hostname, with the exception of attrd_updater, which was fixed for that case in upstream 1.1.14 (and RHEL 7.2, which was based on 1.1.13 but included some of 1.1.14).

That same attrd_updater fix also handled the case where the Pacemaker Remote node name was different from its local hostname (but for attrd_updater only). The other tools were fixed for this case, but only when they are called by a resource agent executed by the cluster, partially in RHEL 7.4 as Bug 1417936 (and its 7.3 z-stream Bug 1417936) and completely in RHEL 7.5 as part of the fix for Bug 1489728 (and its 7.4 z-stream Bug 1497602).

When the tools are called other than by a resource agent, crm_standby and crm_failcount are fixed for such nodes by Bug 1374175. That leaves a single situation covered by this bz: when crm_attribute is called other than by a resource agent from a Pacemaker Remote node whose node name is different from its local hostname.

Comment 5 Ken Gaillot 2018-06-18 22:30:40 UTC
QA: The reproducer in the Description is not correct. This bz only covers crm_attribute now. The reproducer is:

1. Configure and start a cluster with a remote node whose name in the cluster is different from its local hostname.

2. Set a permanent node attribute for the remote node, from the remote node's command line, without specifying an explicit node name:

crm_attribute -n foo -v bar -l forever

Before the fix, you will get an error like "Could not map name=... to a UUID", and the attribute will not appear in the CIB. After the fix, you will not get an error, and the attribute will appear in the CIB under the remote node's correct node name.
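
For convenience, the reproducer above could be wrapped in a small shell helper along these lines. This is only a sketch: repro_remote_attr is a hypothetical name, and it assumes that "crm_node -n" reports the cluster's name for the node when run on the remote node. It only defines a function, so nothing is changed until it is called.

```shell
# Sketch only: intended to be run on the Pacemaker Remote node itself.
# repro_remote_attr is a hypothetical helper name, not part of Pacemaker.
repro_remote_attr() {
    attr_name="${1:-foo}"
    attr_value="${2:-bar}"

    # Assumption: crm_node -n prints the node name as the cluster knows it.
    cluster_name="$(crm_node -n)" || return 1
    local_name="$(hostname)"

    if [ "$cluster_name" = "$local_name" ]; then
        echo "node name equals hostname ($local_name); not the case covered by this bz" >&2
        return 2
    fi

    # No -N/--node here: crm_attribute has to work out the local node name itself.
    crm_attribute -n "$attr_name" -v "$attr_value" -l forever || return 1

    # Query with the explicit cluster node name to see where the value landed.
    crm_attribute -N "$cluster_name" -n "$attr_name" -l forever -G -q
}
```

Calling "repro_remote_attr foo bar" on a fixed build should print the value that was just set; on an affected build the first crm_attribute call is expected to fail with the UUID mapping error described above.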

Comment 7 Ken Gaillot 2020-02-21 16:58:58 UTC
Due to developer time constraints, I am moving this to RHEL 8 only

Comment 11 Ken Gaillot 2020-10-13 22:07:00 UTC
Due to developer time prioritization constraints, an upstream bug report has been filed for this issue, and this report will be closed. If time becomes available, this can be reopened.

Comment 13 Ken Gaillot 2022-05-24 15:54:36 UTC
Fixed in upstream main and 2.1 branches as of commit 97ce57a0

Comment 17 jrehova 2022-08-04 14:42:53 UTC
* 3-node cluster

Version of pacemaker:

> [root@virt-041 ~]# rpm -q pacemaker
> pacemaker-2.1.4-4.el8.x86_64

Status of cluster:
 
> [root@virt-041 ~]# pcs status
> Cluster name: STSRHTS9402
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: node-01 (version 2.1.4-4.el8-dc6eb4362e) - partition with quorum
>   * Last updated: Thu Aug  4 15:43:44 2022
>   * Last change:  Thu Aug  4 15:39:39 2022 by root via cibadmin on node-01
>   * 3 nodes configured
>   * 4 resource instances configured
> 
> Node List:
>   * Online: [ node-01 node-02 ]
>   * RemoteOnline: [ node-03 ]
> 
> Full List of Resources:
>   * fence-node-01	(stonith:fence_xvm):	 Started node-02
>   * fence-node-02	(stonith:fence_xvm):	 Started node-02
>   * fence-node-03	(stonith:fence_xvm):	 Started node-01
>   * node-03	(ocf::pacemaker:remote):	 Started node-01
> 
> Daemon Status:
>   corosync: inactive/disabled
>   pacemaker: inactive/disabled
>   pacemaker_remote: active/enabled
>   pcsd: active/enabled

> [root@virt-041 ~]# hostname
> virt-041
 
Setting a permanent node attribute for the remote node:

> [root@virt-041 ~]# crm_attribute -n foo -v bar -l forever
> [root@virt-041 ~]# pcs node attribute
> Node Attributes:
>  node-03: foo=bar
> [root@virt-041 ~]# cibadmin --query --scope nodes
> <nodes>
>   <node id="1" uname="node-01"/>
>   <node id="2" uname="node-02"/>
>   <node type="remote" id="node-03" uname="node-03">
>     <instance_attributes id="nodes-node-03">
>       <nvpair id="nodes-node-03-foo" name="foo" value="bar"/>
>     </instance_attributes>
>   </node>
> </nodes>
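
The check against the CIB above can also be scripted. The following is a rough sketch, not part of Pacemaker: attr_owner is a hypothetical helper that reads the nodes section on stdin and prints the uname of the node element containing the named nvpair. It is line-oriented and assumes one XML element per line, as in the cibadmin output shown above.

```shell
# attr_owner NAME: print the uname of the <node> element holding nvpair NAME.
# Hypothetical helper; expects "cibadmin --query --scope nodes" output on stdin,
# with one XML element per line.
attr_owner() {
    awk -v want="$1" '
        /<node / {
            # Remember the uname of the most recently opened <node> element.
            owner = ""
            if (match($0, /uname="[^"]*"/)) {
                owner = substr($0, RSTART + 7, RLENGTH - 8)
            }
        }
        /<nvpair / && index($0, "name=\"" want "\"") { print owner }
    '
}
```

With the output above, "cibadmin --query --scope nodes | attr_owner foo" would print node-03, i.e. the remote node's name in the cluster rather than the local hostname virt-041.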

Comment 19 errata-xmlrpc 2022-11-08 09:42:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7573