This is based on the discussion in bz1305130.

How the CIB upgrade process works in pacemaker:
1. Pacemaker keeps track of which pacemaker version is running on each node.
2. Pacemaker elects the DC in such a way that the DC is always the node with the oldest pacemaker version in the cluster.
3. When the "cibadmin --upgrade" command is run, the request is sent to the DC.
4. The DC bumps the CIB schema version to the newest version supported by the DC itself and tells the other nodes about the change.
5. As a result, the CIB never gets upgraded to a schema version which is not supported by all nodes.

If the CIB upgrade is requested on a file, there is no communication in the cluster. The CIB schema version simply gets bumped to the newest version supported by that particular node.

When moving to the new pcs architecture, the CIB upgrade process was moved to the pcs library. In order to get rid of the side effect, we switched the live CIB upgrade to a file-based CIB upgrade. That way we bypass all the checking done in pacemaker (see steps 1-5 above).

What we need to do is switch back to the live upgrade (unless -f was specified on the command line) to ensure the correct upgrade procedure is used, and deal with the resulting side effect.
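To illustrate why the file-based upgrade is unsafe in a mixed-version cluster, here is a minimal Python sketch of the two upgrade paths described above. It is a toy model, not pacemaker or pcs code: the Node class and the functions elect_dc, live_upgrade and file_upgrade are hypothetical names, and the version numbers are the ones seen in the test clusters in this report (pacemaker 1.1.13 with schema 2.3, pacemaker 1.1.15 with schema 2.5).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical model of a cluster node (not a pacemaker structure)."""
    name: str
    pacemaker_version: tuple  # e.g. (1, 1, 13)
    newest_schema: tuple      # newest CIB schema this node supports, e.g. (2, 3)


def elect_dc(nodes):
    """Model of step 2: the DC is the node with the oldest pacemaker version."""
    return min(nodes, key=lambda node: node.pacemaker_version)


def live_upgrade(nodes, current_schema):
    """Model of steps 3-5: the DC bumps the schema to the newest one it supports."""
    dc = elect_dc(nodes)
    return max(current_schema, dc.newest_schema)


def file_upgrade(local_node, current_schema):
    """Model of the file-based upgrade: only the local node's schemas matter."""
    return max(current_schema, local_node.newest_schema)


cluster = [
    Node("rh72-node1", (1, 1, 15), (2, 5)),
    Node("rh72-node2", (1, 1, 13), (2, 3)),
    Node("rh72-node3", (1, 1, 13), (2, 3)),
]

# The live upgrade stays at a schema every node supports.
print(live_upgrade(cluster, (2, 3)))
# The file-based upgrade on the upgraded node jumps ahead of the other nodes.
print(file_upgrade(cluster[0], (2, 3)))
```

In this model the live upgrade cannot leave the schema range shared by all nodes, because the node performing the bump is always the oldest one. The file-based path has no such guard.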
Created attachment 1218962 [details]
proposed fix

Test:

1) Setup:
- Have a cluster with no pacemaker alerts support (RHEL7.2).
- Upgrade pcs and pacemaker on one node to RHEL7.3 version with alerts support.

[root@rh72-node1:~]# rpm -q pcs
pcs-0.9.152-10.el7.x86_64
[root@rh72-node1:~]# rpm -q pacemaker
pacemaker-1.1.15-11.el7.x86_64
[root@rh72-node2:~]# rpm -q pacemaker
pacemaker-1.1.13-10.el7.x86_64
[root@rh72-node3:~]# rpm -q pacemaker
pacemaker-1.1.13-10.el7.x86_64

[root@rh72-node1:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh72-node2:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh72-node3:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"

2) Before fix:
[root@rh72-node1:~]# pcs alert create path=/some/path
CIB has been upgraded to the latest schema version.
[root@rh72-node1:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.5"
[root@rh72-node2:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh72-node3:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"

3) After fix:
[root@rh72-node1:~]# pcs alert create path=/some/path
Error: Upgrading of CIB to the latest schema failed: Call cib_upgrade failed (-62): Timer expired
[root@rh72-node1:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh72-node2:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh72-node3:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"

Note pacemaker 1.1.13 exits with an error (timer expired) if current CIB schema version matches the latest available version. This is fixed in newer builds of pacemaker.
It does not have any effect on the pcs bug or its behavior, other than pcs printing a different error message instead of saying the CIB is already at the newest schema available.

Only "pcs alert" and "pcs acl" commands are affected. The bug is in the new pcs library, which other commands do not use yet. ACLs require schema version 2.0, which is quite old, so the bug may not manifest there.

The bug was introduced in pcs-0.9.152-3.el7.
Created attachment 1230027 [details]
proposed fix

Since we now update the CIB in the live cluster prior to making the actual requested changes, we need to report the CIB has been upgraded even if we do not make any changes due to an error.

Before fix:
[root@rh73-node1:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh73-node1:~]# pcs alert create path=/some/path id=b@d
Error: invalid alert-id 'b@d', '@' is not a valid character for a alert-id
[root@rh73-node1:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.5"

After fix:
[root@rh73-node1:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh73-node1:~]# pcs alert create path=/some/path id=b@d
CIB has been upgraded to the latest schema version.
Error: invalid alert-id 'b@d', '@' is not a valid character for a alert-id
[root@rh73-node1:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.5"
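The ordering issue in this comment can be sketched as follows. This is only an illustrative Python model, not the pcs library code: create_alert, the dict-based CIB, the simplified id check and the required schema (2, 5) for alerts are all assumptions made for the example. The point it shows is that the upgrade message has to be emitted as soon as the live upgrade happens, before any validation that may abort the command.

```python
import re

UPGRADE_MESSAGE = "CIB has been upgraded to the latest schema version."


def create_alert(cib, alert_id, path, required_schema=(2, 5)):
    """Hypothetical model of an alert-creating command.

    `cib` is a plain dict standing in for the live CIB. Returns the list of
    messages the command would print.
    """
    messages = []

    # Step 1: the live CIB is upgraded before any requested change is made.
    if cib["schema"] < required_schema:
        cib["schema"] = required_schema  # stands in for the live upgrade
        # Report right away: the cluster has changed even if we fail below.
        messages.append(UPGRADE_MESSAGE)

    # Step 2: validate the request (simplified id check, an assumption).
    if not re.fullmatch(r"[A-Za-z_][\w.-]*", alert_id):
        messages.append(f"Error: invalid alert-id '{alert_id}'")
        return messages

    # Step 3: make the requested change.
    cib["alerts"].append({"id": alert_id, "path": path})
    return messages


cib = {"schema": (2, 3), "alerts": []}
for line in create_alert(cib, "b@d", "/some/path"):
    print(line)
```

With the report placed after validation instead, the invalid id would end the command before the user learned that the schema version had already been changed.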
Setup:
> Have a cluster with a sufficiently old pacemaker.
[vm-rhel72-1 ~] $ rpm -q pacemaker
pacemaker-1.1.13-10.el7.x86_64
[vm-rhel72-1 ~] $ pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[vm-rhel72-3 ~] $ rpm -q pacemaker
pacemaker-1.1.13-10.el7.x86_64
[vm-rhel72-3 ~] $ pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"

> Upgrade pacemaker on one node.
[vm-rhel72-1 ~] $ rpm -q pacemaker
pacemaker-1.1.16-2.el7.x86_64

After Fix:
[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.156-1.el7.x86_64
[vm-rhel72-1 ~] $ pcs alert create path=/some/path
Error: Upgrading of CIB to the latest schema failed: Call cib_upgrade failed (-62): Timer expired
[vm-rhel72-1 ~] $ pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[vm-rhel72-3 ~] $ pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
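The "tr | grep validate-with" check used throughout this report could also be scripted. The following is a small Python sketch under the assumption that the CIB XML (for example the output of "pcs cluster cib") is available as a string; cib_schema_version is a hypothetical helper name, and it only handles values of the form "pacemaker-X.Y" as seen in the output above.

```python
import xml.etree.ElementTree as ET


def cib_schema_version(cib_xml):
    """Return the schema version from the root element's validate-with
    attribute as a (major, minor) tuple, e.g. (2, 3) for "pacemaker-2.3"."""
    root = ET.fromstring(cib_xml)
    value = root.get("validate-with", "")
    prefix = "pacemaker-"
    if not value.startswith(prefix):
        raise ValueError(f"unexpected validate-with value: {value!r}")
    major, minor = value[len(prefix):].split(".")
    return int(major), int(minor)


sample = '<cib validate-with="pacemaker-2.3" epoch="1"><configuration/></cib>'
print(cib_schema_version(sample))
```

Comparing the returned tuples across nodes gives the same information as the per-node grep output, in a form that can be asserted on in an automated test.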
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1958