Bug 1759555

Summary:	After a cluster upgrade, cannot run pcs/crm_attribute on offline CIB on pacemaker remotes
Product:	Red Hat Enterprise Linux 8	Reporter:	Damien Ciabrini <dciabrin>
Component:	pacemaker	Assignee:	Ken Gaillot <kgaillot>
Status:	CLOSED MIGRATED	QA Contact:	cluster-qe <cluster-qe>
Severity:	medium	Docs Contact:
Priority:	high
Version:	8.0	CC:	cluster-maint, lmiksik, michele, pkomarov
Target Milestone:	rc	Keywords:	MigratedToJIRA, Reopened, Triaged
Target Release:	8.10	Flags:	pm-rhel: mirror+
Hardware:	All
OS:	All
Whiteboard:
Fixed In Version:		Doc Type:	No Doc Update
Doc Text:	Most users would not encounter this	Story Points:	---
Clone Of:		Environment:
Last Closed:	2023-09-22 18:36:44 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:	2.1.7
Embargoed:

Description Damien Ciabrini 2019-10-08 13:35:09 UTC

Description of problem:
I want to update a running pacemaker cluster that includes a pacemaker remote node.

[root@ratester3 ~]# crm_mon -1
Stack: corosync
Current DC: ratester1 (version 2.0.1-4.el8_0.4-0eb7991564) - partition with quorum
Last updated: Tue Oct  8 06:02:14 2019
Last change: Tue Oct  8 05:49:02 2019 by root via cibadmin on ratester2
 
3 nodes configured
1 resource configured
 
Online: [ ratester1 ratester2 ]
RemoteOnline: [ ratester3 ]
 
Active resources:
 
 ratester3      (ocf::pacemaker:remote):        Started ratester1
 


I'm updating only the real cluster nodes (ratester1 and ratester2) to a newer pacemaker
version:

[root@ratester3 ~]# crm_mon -1
Stack: corosync
Current DC: ratester2 (version 2.0.2-3.el8-744a30d655) - partition with quorum
Last updated: Tue Oct  8 07:36:09 2019
Last change: Tue Oct  8 06:58:22 2019 by ratester3 via crm_attribute on ratester2

3 nodes configured
1 resource configured

Online: [ ratester1 ratester2 ]
RemoteOnline: [ ratester3 ]

Active resources:

 ratester3      (ocf::pacemaker:remote):        Started ratester2



At this point, setting an attribute in the live CIB still works fine:

[root@ratester3 ~]# pcs node attribute ratester2 foo=foo_value



However, the same operation fails when I run it against an offline CIB:

[root@ratester3 ~]# pcs cluster cib > cib.xml
[root@ratester3 ~]# pcs -f cib.xml node attribute ratester2 bar=bar_value
Error: unable to set attribute bar
Error performing operation: Protocol not supported
Error setting bar=bar_value (section=nodes, set=nodes-2): Protocol not supported


With debug output enabled, the cause appears to be that the feature set version
has been bumped in the cluster, even though the pacemaker remote hasn't been
upgraded yet.

[root@ratester3 ~]# PCMK_debug=yes PCMK_logfile=/dev/stdout pcs -f cib.xml node attribute ratester2 bar=bar_value
Error: unable to set attribute bar
Set r/w permissions for uid=189, gid=189 on /dev/stdout
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (crm_log_args)    notice: Invoked: /usr/sbin/crm_attribute -t nodes --node ratester2 --name bar --update bar_value 
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (validate_with_relaxng)   info: Creating RNG parser context
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (cib_file_signon)         debug: crm_attribute: Opened connection to local file 'cib.xml'
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (cib_file_perform_op_delegate)    info: cib_query on /cib/configuration/nodes/node[translate(@uname,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ='ratester2']|/cib/configuration/resources/primitive[@class='ocf'][@provider='pacemaker'][@type='remote'][translate(@id,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ='ratester2']|/cib/configuration/resources/primitive/meta_attributes/nvpair[@name='remote-node'][translate(@value,'ABCDEF
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (cib_process_xpath)       debug: Processing cib_query op for /cib/configuration/nodes/node[translate(@uname,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ='ratester2']|/cib/configuration/resources/primitive[@class='ocf'][@provider='pacemaker'][@type='remote'][translate(@id,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ='ratester2']|/cib/configuration/resources/primitive/meta_attributes/nvpair[@name='remote-node'][translate(@value,'A
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (query_node_uuid)         info: Mapped node name 'ratester2' to UUID 2
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (cib_file_perform_op_delegate)    info: cib_query on //cib/configuration/nodes//node[@id='2']//instance_attributes//nvpair[@name='bar']
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (cib_process_xpath)       debug: cib_query: //cib/configuration/nodes//node[@id='2']//instance_attributes//nvpair[@name='bar'] does not exist
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (cib_file_perform_op_delegate)    info: cib_modify on nodes
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (cib_perform_op)  error: Discarding update with feature set '3.2.0' greater than our own '3.1.0'
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (update_attr_delegate)    info: Update   <node id="2">
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (update_attr_delegate)    info: Update     <instance_attributes id="nodes-2">
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (update_attr_delegate)    info: Update       <nvpair id="nodes-2-bar" name="bar" value="bar_value"/>
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (update_attr_delegate)    info: Update     </instance_attributes>
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (update_attr_delegate)    info: Update   </node>
Error performing operation: Protocol not supported
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (cib_file_signoff)        debug: Disconnecting from the CIB manager
Oct 08 08:59:06 ratester3 crm_attribute       [26653] (crm_xml_cleanup)         info: Cleaning up memory from libxml2
Error setting bar=bar_value (section=nodes, set=nodes-2): Protocol not supported
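
The rejection logged above can be illustrated with a small sketch. It assumes the dumped CIB carries the cluster's feature set in the crm_feature_set attribute of its root <cib> element, and it uses a simplified numeric comparison rather than pacemaker's own C implementation. The function names and the sample CIB are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CIB as dumped from the upgraded cluster.
SAMPLE_CIB = (
    '<cib crm_feature_set="3.2.0" validate-with="pacemaker-3.2">'
    "<configuration/><status/></cib>"
)


def version_tuple(version, width=3):
    """Turn '3.2.0' into (3, 2, 0), padding missing components with zeros."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def cib_feature_set(cib_xml):
    """Return the crm_feature_set attribute of the root <cib> element, or None."""
    return ET.fromstring(cib_xml).get("crm_feature_set")


def update_would_be_discarded(cib_xml, local_feature_set):
    """True if the CIB's feature set is newer than the local tool's.

    Assumption: a CIB without the attribute is treated as acceptable here.
    """
    cib_version = cib_feature_set(cib_xml)
    if cib_version is None:
        return False
    return version_tuple(cib_version) > version_tuple(local_feature_set)


# The remote still runs tools with feature set 3.1.0, the CIB says 3.2.0.
print(update_would_be_discarded(SAMPLE_CIB, "3.1.0"))  # prints True
```

With the values from the log (CIB at 3.2.0, local tools at 3.1.0) the sketch reports that the update would be discarded, which matches the "Protocol not supported" error.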



This is problematic for our use of pacemaker and pcs in OpenStack, for
several reasons:

1. operators can upgrade their cluster nodes in a random order, so we
   can't guarantee that they will upgrade all their pacemaker remotes
   before upgrading the real cluster nodes.

2. likewise, we are using bundles, which run pacemaker remotes, and we
   can't guarantee that operators will restart all containers with
   up-to-date container images before upgrading the real cluster
   nodes.

3. in OpenStack we have an idiomatic way of calling pcs with offline
   CIB, because we drive the creation of pcs resources from puppet and
   we have to implement a means of checking for resource differences
   between two puppet runs.



Version-Release number of selected component (if applicable):
pacemaker-2.0.1-4.el8_0.4.x86_64

How reproducible:
Always

Steps to Reproduce:
1. create a cluster with a pacemaker remote node (with e.g. pacemaker-2.0.1-4.el8_0.4.x86_64)
2. upgrade the real cluster node to a pacemaker rpm that ships a different feature set (e.g. pacemaker-2.0.2-3.el8.x86_64)
3. from the non-upgraded remote node, try to update a node attribute in an offline CIB

Actual results:
No attribute can be updated in the offline CIB because the node's feature set lags behind.

Expected results:
Adding/updating attributes in the offline CIB should still work even if the real cluster nodes have been upgraded.

Additional info:

Comment 1 Ken Gaillot 2019-10-08 13:44:24 UTC

This is a known issue (Bug 1603613), but since we don't yet have a RHEL 8 clone for it, we can use this bz for that purpose.

Comment 2 Andrew Beekhof 2019-10-08 21:18:44 UTC

A bug that prevents updates, even if it is "just" in a layered, is surely higher than medium severity.

Comment 3 Ken Gaillot 2019-10-09 16:42:50 UTC

(In reply to Andrew Beekhof from comment #2)
> A bug that prevents updates, even if it is "just" in a layered, is surely
> higher than medium severity.

Do OSP updates require running configuration commands on saved CIBs on remote nodes before they've been updated? As I understood it, only manual commands were affected, and the workaround would be to run them on an updated node.

BTW I've been using "medium" to indicate "not in the next point release", 8.2 in this case, which is pretty locked down due to QA capacity. Since this was internally reported, we could maybe get it in via the rebase bz and close this CURRENTRELEASE when it comes out, but 8.2 dev freeze is in less than 2 months, so that might be a stretch on our end.

Comment 4 Damien Ciabrini 2019-10-10 19:03:29 UTC

(In reply to Ken Gaillot from comment #3)
> Do OSP updates require running configuration commands on saved CIBs on
> remote nodes before they've been updated? As I understood it, only manual
> commands were affected, and the workaround would be to run them on an
> updated node.

Unfortunately, the commands run on the offline CIB are part of the update process and are automated in a workflow.

Whenever the user runs a "stack update action" on a host, the update mechanism (in puppet) checks whether the OpenStack services have to be updated.
For every OpenStack service that is managed via a pacemaker bundle (e.g. galera, rabbitmq, redis...), the check consists of:
  . dumping the current live cib to a file
  . applying the service configuration that comes with the "stack update action" (i.e. running a series of pcs -f offline-cib.xml <bundle-create/update>)
  . comparing the resulting offline cib against the live cib
  . if there's a change, applying it to the live cib
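
The check described in the list above could be sketched roughly as follows. This is not the actual puppet code; it is a minimal illustration that assumes "pcs cluster cib" prints the live CIB, that "pcs cluster cib-push <file>" applies a file to the live cluster, and that comparing only the <configuration> section is enough to detect a change. The function names and the injectable runner are hypothetical.

```python
import subprocess
import xml.etree.ElementTree as ET


def run_command(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def canonical_configuration(cib_xml):
    """Return the <configuration> section of a CIB in canonical XML form."""
    config = ET.fromstring(cib_xml).find("configuration")
    if config is None:
        raise ValueError("CIB has no <configuration> section")
    return ET.canonicalize(ET.tostring(config, encoding="unicode"), strip_text=True)


def check_and_apply(pcs_commands, offline_path, runner=run_command):
    """Apply pcs_commands to an offline copy of the CIB; push only on change.

    pcs_commands is a list of argument lists, each appended to
    "pcs -f <offline_path>". Returns True if the live CIB was updated.
    """
    # 1. Dump the current live CIB to a file.
    live_cib = runner(["pcs", "cluster", "cib"])
    with open(offline_path, "w") as handle:
        handle.write(live_cib)

    # 2. Apply the desired configuration to the offline copy only.
    for args in pcs_commands:
        runner(["pcs", "-f", offline_path, *args])

    # 3. Compare the resulting offline CIB against the live one.
    with open(offline_path) as handle:
        offline_cib = handle.read()
    if canonical_configuration(offline_cib) == canonical_configuration(live_cib):
        return False

    # 4. Something changed, so apply it to the live CIB.
    runner(["pcs", "cluster", "cib-push", offline_path])
    return True
```

Step 2 is where this bug bites: on a non-upgraded remote, every "pcs -f" call against the dumped CIB fails, so the comparison in step 3 is never reached.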

For the record, we need to use an offline cib because pcs offers no way [1] to update a specific property of a bundle (e.g. change a bind-mount). So we need to calculate ahead of time the potential changes in the offline file, and compute the diffs with the live cib ourselves.


In parallel, as a workaround for this bz, we're investigating ways of splitting our update process to first check whether the host we're running the update process on has the latest feature set, so we could bail out or run an appropriate fallback action if that's the case. But this is only a mitigation, as in OpenStack, we cannot constrain the operators to run the pacemaker upgrade on all the pacemaker remotes first, and ultimately on the real cluster nodes (for a variety of reasons that I'm not discussing here).
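
A pre-flight check along these lines might look like the sketch below. It assumes the local feature set can be read from the output of "pacemakerd --features", on a line of the form " Supporting v3.1.0: ...", and that the cluster's feature set is taken from the dumped CIB. The exact output format may differ between pacemaker versions, and the function names and sample text are hypothetical.

```python
import re


def local_feature_set(features_output):
    """Extract the feature set from 'pacemakerd --features' style output.

    Assumption: the output contains a 'Supporting v<x.y.z>' marker.
    """
    match = re.search(r"Supporting v(\d+(?:\.\d+)*)", features_output)
    if match is None:
        raise ValueError("no feature set found in output")
    return match.group(1)


def host_lags_behind(local, cluster):
    """True if the local feature set is older than the cluster's."""
    a = [int(part) for part in local.split(".")]
    b = [int(part) for part in cluster.split(".")]
    width = max(len(a), len(b))
    a += [0] * (width - len(a))  # pad so that '3.2' compares equal to '3.2.0'
    b += [0] * (width - len(b))
    return a < b


# Hypothetical sample of what the non-upgraded remote might report.
sample = (
    "Pacemaker 2.0.1-4.el8_0.4 (Build: 0eb7991564)\n"
    " Supporting v3.1.0:  generated-manpages agent-manpages\n"
)
if host_lags_behind(local_feature_set(sample), "3.2.0"):
    print("local tools are older than the cluster; bail out or fall back")
```

Such a guard only detects the condition early; it does not make offline CIB edits work on the lagging host.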


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1598197#c7

Comment 5 Ken Gaillot 2019-10-10 20:02:27 UTC

(In reply to Damien Ciabrini from comment #4)
> For the record, we need to use an offline cib because pcs offers no way [1]
> to update a specific property of a bundle (e.g. change a bind-mount). So we
> need to calculate ahead of time the potential changes in the offline file,
> and compute the diffs with the live cib ourselves.
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1598197#c7

Looks like that was fixed in 7.7, so that might be a good alternative in the meantime

> In parallel, as a workaround for this bz, we're investigating ways of
> splitting our update process to first check whether the host we're running
> the update process on has the latest feature set, so we could bail out or
> run an appropriate fallback action if that's the case. But this is only a
> mitigation, as in OpenStack, we cannot constrain the operators to run the
> pacemaker upgrade on all the pacemaker remotes first, and ultimately on the
> real cluster nodes (for a variety of reasons that I'm not discussing here).

It's considered best practice to update the cluster nodes first anyway.

If I'm following correctly, the update being discussed here is a configuration update, not a software update. In that case, do we support running different software versions on different nodes outside of a rolling upgrade? Upgrading the software on the remote node before doing the configuration update would be a workaround for this issue.

Comment 6 Damien Ciabrini 2019-10-10 20:45:53 UTC

(In reply to Ken Gaillot from comment #5)
> (In reply to Damien Ciabrini from comment #4)
> > For the record, we need to use an offline cib because pcs offers no way [1]
> > to update a specific property of a bundle (e.g. change a bind-mount). So we
> > need to calculate ahead of time the potential changes in the offline file,
> > and compute the diffs with the live cib ourselves.
> > 
> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1598197#c7
> 
> Looks like that was fixed in 7.7, so that might be a good alternative in the
> meantime

Oh thanks for the pointer, I didn't know that. We'll look into it!

> 
> > In parallel, as a workaround for this bz, we're investigating ways of
> > splitting our update process to first check whether the host we're running
> > the update process on has the latest feature set, so we could bail out or
> > run an appropriate fallback action if that's the case. But this is only a
> > mitigation, as in OpenStack, we cannot constrain the operators to run the
> > pacemaker upgrade on all the pacemaker remotes first, and ultimately on the
> > real cluster nodes (for a variety of reasons that I'm not discussing here).
> 
> It's considered best practice to update the cluster nodes first anyway.
> 
> If I'm following correctly, the update being discussed here is a
> configuration update, not a software update. In that case, do we support
> running different software versions on different nodes outside of a rolling
> upgrade? Upgrading the software on the remote node before doing the
> configuration update would be a workaround for this issue.

It's a mix of both actually. When operators want to update their stack, they run
an update command and pass a parameter pointing to new container images (that ship
with an up-to-date pacemaker_remote rpm). The update command then 1) stops pacemaker
locally, 2) updates container images, 3) runs yum update, and 4) restarts the cluster.

Under normal circumstances, pacemaker always restarts locally with updated rpms on
the host and in the images.

Now if the user didn't tell the update command to update the container images, we
can end up with a pacemaker version discrepancy between host and containers. This is
usually not an issue, because pacemaker code usually stays compatible within the same
RHEL release and feature sets are quite stable, so the next update command run by
the operator eventually fixes the discrepancy. But sometimes an issue like this one
arises, and it breaks the OpenStack update process.

Another typical problem is when the operator runs the update with the proper
container images, but the update runs first on all pacemaker nodes, and then
on the remaining pacemaker remote nodes (e.g. for complex deployments that
run the DB on dedicated pacemaker remote nodes). In such a case, the same problem
can break our update process.

Comment 18 RHEL Program Management 2021-05-18 07:32:29 UTC

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 19 Ken Gaillot 2021-05-18 13:38:42 UTC

This is still desirable and intended, but no time frame is available. This bz will be reopened once developer time becomes available.

Comment 21 RHEL Program Management 2023-09-22 18:34:58 UTC

Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 22 RHEL Program Management 2023-09-22 18:36:44 UTC

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.