+++ This bug was initially created as a clone of Bug #1389443 +++

This is based on the discussion in bz1305130.

How the CIB upgrade process works in pacemaker:
1. Pacemaker keeps track of what pacemaker version is running on each node.
2. Pacemaker elects the DC in such a way that DC is always the node with the oldest pacemaker version in a cluster.
3. When the "cibadmin --upgrade" command is run, the request is sent to the DC.
4. DC bumps the CIB schema version to the newest version supported by that very DC and tells other nodes about the change.
5. As a result, the CIB never gets upgraded to a schema version which is not supported by all nodes.

If the CIB upgrade is requested on a file, then there is no communication in a cluster. CIB schema version simply gets bumped to the newest version supported by that particular node.

When moving to the new pcs architecture, the CIB upgrade process was moved to the pcs library. In order to get rid of the side effect, we switched the live CIB upgrade to file-based CIB upgrade. That way we bypass all the checking done in pacemaker (see steps 1-5 above).

What we need to do is switch back to the live upgrade (unless -f was specified on the command line) to ensure the correct upgrade procedure is used and deal with the resulting side effect.

--- Additional comment from Tomas Jelinek on 2016-11-09 08:42 EST ---

Test:

1) Setup:
- Have a cluster with no pacemaker alerts support (RHEL7.2).
- Upgrade pcs and pacemaker on one node to RHEL7.3 version with alerts support.
[root@rh72-node1:~]# rpm -q pcs
pcs-0.9.152-10.el7.x86_64
[root@rh72-node1:~]# rpm -q pacemaker
pacemaker-1.1.15-11.el7.x86_64
[root@rh72-node2:~]# rpm -q pacemaker
pacemaker-1.1.13-10.el7.x86_64
[root@rh72-node3:~]# rpm -q pacemaker
pacemaker-1.1.13-10.el7.x86_64
[root@rh72-node1:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh72-node2:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh72-node3:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"

2) Before fix:
[root@rh72-node1:~]# pcs alert create path=/some/path
CIB has been upgraded to the latest schema version.
[root@rh72-node1:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.5"
[root@rh72-node2:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh72-node3:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"

3) After fix:
[root@rh72-node1:~]# pcs alert create path=/some/path
Error: Upgrading of CIB to the latest schema failed: Call cib_upgrade failed (-62): Timer expired
[root@rh72-node1:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh72-node2:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"
[root@rh72-node3:~]# pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.3"

Note pacemaker 1.1.13 exits with an error (timer expired) if current CIB schema version matches the latest available version. This is fixed in newer builds of pacemaker. It does not have any effect on pcs bug / behavior other than pcs printing different error message instead of saying the CIB is already at the newest schema available.

Only "pcs alert" and "pcs acl" commands are affected. The bug is in the new pcs library which other commands do not use yet.
ACLs require schema version 2.0, which is quite old, so the bug may not manifest there. The bug was introduced in pcs-0.9.152-3.el7.
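
For illustration, the difference between the live upgrade (steps 1-5 of the description) and the file-based upgrade can be modeled in a few lines of Python. This is only a sketch, not pacemaker or pcs code: the Node class and the function names are hypothetical, and the "newest schema" values are read off the test transcript above (2.5 for pacemaker 1.1.15, 2.3 for pacemaker 1.1.13).

```python
from dataclasses import dataclass


def parse_version(text):
    """Turn '1.1.13' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Node:
    # Hypothetical model of a cluster node, for illustration only.
    name: str
    pacemaker: str       # pacemaker version running on the node
    newest_schema: str   # newest CIB schema that version supports


def elect_dc(nodes):
    """Step 2: the DC is the node with the oldest pacemaker version."""
    return min(nodes, key=lambda node: parse_version(node.pacemaker))


def live_upgrade(nodes):
    """Steps 3-5: the request goes to the DC, which picks its own newest schema."""
    return elect_dc(nodes).newest_schema


def file_upgrade(local_node):
    """File-based upgrade: no cluster communication, only the local node counts."""
    return local_node.newest_schema


cluster = [
    Node("rh72-node1", "1.1.15", "2.5"),
    Node("rh72-node2", "1.1.13", "2.3"),
    Node("rh72-node3", "1.1.13", "2.3"),
]

print(live_upgrade(cluster))     # 2.3, a schema every node supports
print(file_upgrade(cluster[0]))  # 2.5, a schema only rh72-node1 supports
```

In this model the file-based path on rh72-node1 yields 2.5, which matches the "before fix" output, while the live path stays at 2.3 because the DC is one of the 1.1.13 nodes.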
Created attachment 1222736
proposed fix
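
The attachment itself is not reproduced in this report. As a rough sketch of the decision described above (live upgrade unless -f was given), the following Python builds the command and environment without running anything. The function name and the cib_file parameter are hypothetical; the sketch assumes the file-based mode is selected through pacemaker's CIB_file environment variable, and it is not the actual pcs change.

```python
import os


def build_upgrade_call(cib_file=None):
    """Return (argv, env) for a CIB schema upgrade without running anything.

    cib_file stands in for the value of a "-f" option (hypothetical parameter);
    None means the live CIB should be upgraded.
    """
    argv = ["cibadmin", "--upgrade"]
    env = dict(os.environ)
    if cib_file is None:
        # Live upgrade: the request reaches the DC, so pacemaker's checks apply.
        env.pop("CIB_file", None)
    else:
        # File upgrade: assumed to be selected via the CIB_file variable.
        env["CIB_file"] = cib_file
    return argv, env


argv, env = build_upgrade_call()
print(argv, "CIB_file" in env)
```

Keeping the call construction separate from its execution makes the choice between the two modes easy to check without a running cluster.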
Setup:

> Have a cluster with a sufficiently old pacemaker.
[vm-rhel67-1 ~] $ rpm -q pacemaker
pacemaker-1.1.12-8.el6.x86_64
[vm-rhel67-1 ~] $ pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.0"
[vm-rhel67-2 ~] $ rpm -q pacemaker
pacemaker-1.1.12-8.el6.x86_64
[vm-rhel67-2 ~] $ pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.0"

> Upgrade pacemaker on one node.
[vm-rhel67-1 ~] $ rpm -q pacemaker
pacemaker-1.1.15-3.el6.x86_64

Before Fix:
[vm-rhel67-1 ~] $ rpm -q pcs
pcs-0.9.155-1.el6.x86_64
[vm-rhel67-1 ~] $ pcs alert create path=/some/path
CIB has been upgraded to the latest schema version.
[vm-rhel67-1 ~] $ pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.5"
[vm-rhel67-2 ~] $ pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.0"

After Fix:
[vm-rhel67-1 ~] $ rpm -q pcs
pcs-0.9.155-2.el6.x86_64
[vm-rhel67-1 ~] $ pcs alert create path=/some/path
Error: Upgrading of CIB to the latest schema failed: Call cib_upgrade failed (-62): Timer expired
[vm-rhel67-1 ~] $ pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.0"
[vm-rhel67-2 ~] $ pcs cluster cib | tr ' ' '\n' | grep validate-with
validate-with="pacemaker-2.0"
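
The per-node check done above with tr and grep can also be scripted. The following is a minimal sketch that reads the validate-with attribute from the CIB root element and reports whether all nodes agree; the helper names are hypothetical and the XML strings are heavily trimmed stand-ins for real "pcs cluster cib" output.

```python
import xml.etree.ElementTree as ET


def schema_of(cib_xml):
    """Return the validate-with attribute of the CIB root element."""
    return ET.fromstring(cib_xml).get("validate-with")


def schemas_agree(cib_by_node):
    """True when every node reports the same schema version."""
    return len({schema_of(xml) for xml in cib_by_node.values()}) == 1


# Hypothetical, trimmed CIB documents mirroring the two states shown above.
before_fix = {
    "vm-rhel67-1": '<cib validate-with="pacemaker-2.5"><configuration/></cib>',
    "vm-rhel67-2": '<cib validate-with="pacemaker-2.0"><configuration/></cib>',
}
after_fix = {
    "vm-rhel67-1": '<cib validate-with="pacemaker-2.0"><configuration/></cib>',
    "vm-rhel67-2": '<cib validate-with="pacemaker-2.0"><configuration/></cib>',
}

print(schemas_agree(before_fix))  # False
print(schemas_agree(after_fix))   # True
```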
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0707.html