Description of problem:
If a tool such as attrd_updater is used to query or update a node attribute without specifying a node name, it uses the local uname as the node name. However, the node's configured cluster name can be different. This is particularly a problem for resource agents that manage node attributes.

Version-Release number of selected component (if applicable):
1.1.15

How reproducible:
Easy

Steps to Reproduce:
1. Configure and start a cluster with a cluster node or remote node whose node name is different from its uname.
2. Run "attrd_updater -n attrtest -U 1" on the node.
3. Check which node name was used for the attribute.

Actual results:
The uname was used.

Expected results:
The configured node name was used.

Additional info:
The fix for Bug 1374175 should also be usable for this.
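The check in step 3 can be done by querying the attribute back with attrd_updater itself; a minimal sketch, assuming a running cluster and that the attribute was just set as in step 2 (the attribute name "attrtest" matches the reproducer above):

```shell
# Set the attribute without an explicit node name
# (defaults to the local node on an affected version)
attrd_updater -n attrtest -U 1

# Query it back; the host field in the output shows which node name
# the attribute manager recorded the value under
attrd_updater -n attrtest -Q

# On an affected version, the host shown matches the local uname
# rather than the node name configured in the cluster
uname -n
```

These commands require a running Pacemaker cluster, so run them on the node from step 1 of the reproducer.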
This will not be ready in the 7.4 timeframe
This will not make it in time for 7.5
Detailed history for the record:

The affected tools, when a node name is not explicitly given and thus should default to the local node, are crm_attribute, crm_standby, crm_failcount, crm_master, and attrd_updater. The scenarios of interest are:

* Whether the tool is called from a cluster node or a Pacemaker Remote node
* Whether the tool is called by a resource agent executed by the cluster, or not (i.e. manually on the command line, or by a script not executed by the cluster)
* Whether the calling host's node name in the cluster is the same as its local hostname

The tools have always worked when called in any manner from a full cluster node whose node name is the same as its local hostname, and since upstream version 1.1.9, from a full cluster node whose node name is different from its local hostname. (RHEL has always had newer versions than that.)

They have always worked when called from a Pacemaker Remote node whose node name is the same as its local hostname, with the exception of attrd_updater, which was fixed for that case in upstream 1.1.14 (and RHEL 7.2, which was based on 1.1.13 but included some of 1.1.14). That same attrd_updater fix also handled the case where the Pacemaker Remote node name was different from its local hostname (but for attrd_updater only).

The other tools were fixed for this case, but only when they are called by a resource agent executed by the cluster: partially in RHEL 7.4 as Bug 1417936 (and its 7.3 z-stream Bug 1417936) and completely in RHEL 7.5 as part of the fix for Bug 1489728 (and its 7.4 z-stream Bug 1497602). When the tools are called other than by a resource agent, crm_standby and crm_failcount are fixed for such nodes by Bug 1374175.

That leaves a single situation covered by this bz: when crm_attribute is called other than by a resource agent from a Pacemaker Remote node whose node name is different from its local hostname.
QA: The reproducer in the Description is not correct. This bz only covers crm_attribute now. The reproducer is:

1. Configure and start a cluster with a remote node whose name in the cluster is different from its local hostname.
2. Set a permanent node attribute for the remote node, from the remote node's command line, without specifying an explicit node name:

   crm_attribute -n foo -v bar -l forever

Before the fix, you will get an error like "Could not map name=... to a UUID", and the attribute will not appear in the CIB. After the fix, you will not get an error, and the attribute will appear in the CIB under the remote node's correct node name.
Due to developer time constraints, I am moving this to RHEL 8 only
Due to developer time constraints and prioritization, an upstream bug report has been filed for this issue, and this report will be closed. If time becomes available, it can be reopened.
Fixed in upstream main and 2.1 branches as of commit 97ce57a0
* 3-node cluster

Version of pacemaker:
> [root@virt-041 ~]# rpm -q pacemaker
> pacemaker-2.1.4-4.el8.x86_64

Status of cluster:
> [root@virt-041 ~]# pcs status
> Cluster name: STSRHTS9402
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: node-01 (version 2.1.4-4.el8-dc6eb4362e) - partition with quorum
>   * Last updated: Thu Aug  4 15:43:44 2022
>   * Last change:  Thu Aug  4 15:39:39 2022 by root via cibadmin on node-01
>   * 3 nodes configured
>   * 4 resource instances configured
>
> Node List:
>   * Online: [ node-01 node-02 ]
>   * RemoteOnline: [ node-03 ]
>
> Full List of Resources:
>   * fence-node-01 (stonith:fence_xvm): Started node-02
>   * fence-node-02 (stonith:fence_xvm): Started node-02
>   * fence-node-03 (stonith:fence_xvm): Started node-01
>   * node-03 (ocf::pacemaker:remote): Started node-01
>
> Daemon Status:
>   corosync: inactive/disabled
>   pacemaker: inactive/disabled
>   pacemaker_remote: active/enabled
>   pcsd: active/enabled

> [root@virt-041 ~]# hostname
> virt-041

Setting a permanent node attribute for the remote node:
> [root@virt-041 ~]# crm_attribute -n foo -v bar -l forever

> [root@virt-041 ~]# pcs node attribute
> Node Attributes:
>   node-03: foo=bar

> [root@virt-041 ~]# cibadmin --query --scope nodes
> <nodes>
>   <node id="1" uname="node-01"/>
>   <node id="2" uname="node-02"/>
>   <node type="remote" id="node-03" uname="node-03">
>     <instance_attributes id="nodes-node-03">
>       <nvpair id="nodes-node-03-foo" name="foo" value="bar"/>
>     </instance_attributes>
>   </node>
> </nodes>
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:7573