Bug 1647451
| Summary: | [OSP14] Pacemaker resource constraints cause API outage during maintenance | |||
|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Lukas Bezdicka <lbezdick> | |
| Component: | openstack-tripleo-heat-templates | Assignee: | Damien Ciabrini <dciabrin> | |
| Status: | CLOSED ERRATA | QA Contact: | pkomarov | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | high | |||
| Version: | 14.0 (Rocky) | CC: | agurenko, dciabrin, emacchi, mburns, pkomarov, rheslop | |
| Target Milestone: | zstream | Keywords: | Triaged, ZStream | |
| Target Release: | 14.0 (Rocky) | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | openstack-tripleo-heat-templates-9.2.1-0.20190119154868.el7ost | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | 1647449 | |||
| : | 1647452 1680280 (view as bug list) | Environment: | ||
| Last Closed: | 2019-04-30 17:51:14 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1647449 | |||
| Bug Blocks: | 1647452, 1647453, 1680280 | |||
Description
Lukas Bezdicka
2018-11-07 13:54:15 UTC
Merged in upstream Master [1] and Stable/Rocky [2]
[1] https://review.openstack.org/#/c/635492/
[2] https://review.openstack.org/#/c/636891/

Verified,
(undercloud) [stack@undercloud-0 ~]$
(undercloud) [stack@undercloud-0 ~]$ rhos-release -L
Installed repositories (rhel-7.6):
14
ceph-3
ceph-osd-3
rhel-7.6
(undercloud) [stack@undercloud-0 ~]$ cat core_puddle_version
2019-04-12.1(undercloud) [stack@undercloud-0 ~]$
Perform the uc/oc (undercloud/overcloud) update.
Check that the VIPs were moved before pacemaker was stopped:
(undercloud) [stack@undercloud-0 ~]$ grep -A 5 'Moving VIP' overcloud_update_run_Controller.log|tail
2019-04-16 12:04:44 | u'Tuesday 16 April 2019 12:03:25 -0400 (0:00:09.002) 0:12:22.797 ********* ',
2019-04-16 12:04:44 | u'changed: [controller-1] => {"changed": true, "out": "offline"}']
2019-04-16 12:04:44 | [u'',
--
2019-04-16 12:16:13 | u'changed: [controller-0] => {"changed": true, "cmd": "CLUSTER_NODE=$(crm_node -n)\\n echo \\"Retrieving all the VIPs which are hosted on this node\\"\\n VIPS_TO_MOVE=$(crm_mon --as-xml | xmllint --xpath \'//resource[@resource_agent = \\"ocf::heartbeat:IPaddr2\\" and @role = \\"Started\\" and @managed = \\"true\\" and ./node[@name = \\"\'${CLUSTER_NODE}\'\\"]]/@id\' - | sed -e \'s/id=//g\' -e \'s/\\"//g\')\\n for v in ${VIPS_TO_MOVE}; do\\n echo \\"Moving VIP $v on another node\\"\\n pcs resource move $v --wait=300\\n done\\n echo \\"Removing the location constraints that were created to move the VIPs\\"\\n for v in ${VIPS_TO_MOVE}; do\\n echo \\"Removing location ban for VIP $v\\"\\n ban_id=$(cibadmin --query | xmllint --xpath \'string(//rsc_location[@rsc=\\"\'${v}\'\\" and @node=\\"\'${CLUSTER_NODE}\'\\" and @score=\\"-INFINITY\\"]/@id)\' -)\\n if [ -n \\"$ban_id\\" ]; then\\n pcs constraint remove ${ban_id}\\n else\\n echo \\"Could not retrieve and clear location constraint for VIP $v\\" 2>&1\\n fi\\n done", "delta": "0:00:08.860774", "end": "2019-04-16 16:14:47.100904", "rc": 0, "start": "2019-04-16 16:14:38.240130", "stderr": "", "stderr_lines": [], "stdout": "Retrieving all the VIPs which are hosted on this node\\nMoving VIP ip-192.168.24.6 on another node\\nWarning: Creating location constraint cli-ban-ip-192.168.24.6-on-controller-0 with a score of -INFINITY for resource ip-192.168.24.6 on node controller-0.\\nThis will prevent ip-192.168.24.6 from running on controller-0 until the constraint is removed. This will be the case even if controller-0 is the last node in the cluster.\\nResource \'ip-192.168.24.6\' is running on node controller-1.\\nMoving VIP ip-172.17.1.20 on another node\\nWarning: Creating location constraint cli-ban-ip-172.17.1.20-on-controller-0 with a score of -INFINITY for resource ip-172.17.1.20 on node controller-0.\\nThis will prevent ip-172.17.1.20 from running on controller-0 until the constraint is removed. 
This will be the case even if controller-0 is the last node in the cluster.\\nResource \'ip-172.17.1.20\' is running on node controller-1.\\nMoving VIP ip-172.17.1.11 on another node\\nWarning: Creating location constraint cli-ban-ip-172.17.1.11-on-controller-0 with a score of -INFINITY for resource ip-172.17.1.11 on node controller-0.\\nThis will prevent ip-172.17.1.11 from running on controller-0 until the constraint is removed. This will be the case even if controller-0 is the last node in the cluster.\\nResource \'ip-172.17.1.11\' is running on node controller-2.\\nRemoving the location constraints that were created to move the VIPs\\nRemoving location ban for VIP ip-192.168.24.6\\nRemoving location ban for VIP ip-172.17.1.20\\nRemoving location ban for VIP ip-172.17.1.11", "stdout_lines": ["Retrieving all the VIPs which are hosted on this node", "Moving VIP ip-192.168.24.6 on another node", "Warning: Creating location constraint cli-ban-ip-192.168.24.6-on-controller-0 with a score of -INFINITY for resource ip-192.168.24.6 on node controller-0.", "This will prevent ip-192.168.24.6 from running on controller-0 until the constraint is removed. This will be the case even if controller-0 is the last node in the cluster.", "Resource \'ip-192.168.24.6\' is running on node controller-1.", "Moving VIP ip-172.17.1.20 on another node", "Warning: Creating location constraint cli-ban-ip-172.17.1.20-on-controller-0 with a score of -INFINITY for resource ip-172.17.1.20 on node controller-0.", "This will prevent ip-172.17.1.20 from running on controller-0 until the constraint is removed. 
This will be the case even if controller-0 is the last node in the cluster.", "Resource \'ip-172.17.1.20\' is running on node controller-1.", "Moving VIP ip-172.17.1.11 on another node", "Warning: Creating location constraint cli-ban-ip-172.17.1.11-on-controller-0 with a score of -INFINITY for resource ip-172.17.1.11 on node controller-0.", "This will prevent ip-172.17.1.11 from running on controller-0 until the constraint is removed. This will be the case even if controller-0 is the last node in the cluster.", "Resource \'ip-172.17.1.11\' is running on node controller-2.", "Removing the location constraints that were created to move the VIPs", "Removing location ban for VIP ip-192.168.24.6", "Removing location ban for VIP ip-172.17.1.20", "Removing location ban for VIP ip-172.17.1.11"]}',
2019-04-16 12:16:13 | u'',
2019-04-16 12:16:13 | u'TASK [Stop pacemaker cluster] **************************************************',
2019-04-16 12:16:13 | u'Tuesday 16 April 2019 12:14:47 -0400 (0:00:09.236) 0:23:44.881 ********* ',
2019-04-16 12:16:13 | u'changed: [controller-0] => {"changed": true, "out": "offline"}']
2019-04-16 12:16:13 | [u'',
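For readability, here is the "cmd" string from the controller-0 task output above with the log escaping removed. This is a sketch for reading the log, not necessarily the exact text shipped in the templates: the function wrapper and the name move_vips_off_node are hypothetical, added only so the snippet can be sourced without touching a cluster, and the indentation is added. The commands themselves are copied from the logged output.

```shell
# Unescaped from the "cmd" field logged for controller-0.
# The wrapper function and its name are hypothetical; the body is the logged command.
move_vips_off_node() {
    CLUSTER_NODE=$(crm_node -n)
    echo "Retrieving all the VIPs which are hosted on this node"
    # IDs of all started, managed IPaddr2 resources currently hosted on this node
    VIPS_TO_MOVE=$(crm_mon --as-xml | xmllint --xpath '//resource[@resource_agent = "ocf::heartbeat:IPaddr2" and @role = "Started" and @managed = "true" and ./node[@name = "'${CLUSTER_NODE}'"]]/@id' - | sed -e 's/id=//g' -e 's/"//g')
    for v in ${VIPS_TO_MOVE}; do
        echo "Moving VIP $v on another node"
        # Creates a -INFINITY location constraint (cli-ban-...) for this node
        pcs resource move $v --wait=300
    done
    echo "Removing the location constraints that were created to move the VIPs"
    for v in ${VIPS_TO_MOVE}; do
        echo "Removing location ban for VIP $v"
        # Look up the id of the ban constraint created by the move above
        ban_id=$(cibadmin --query | xmllint --xpath 'string(//rsc_location[@rsc="'${v}'" and @node="'${CLUSTER_NODE}'" and @score="-INFINITY"]/@id)' -)
        if [ -n "$ban_id" ]; then
            pcs constraint remove ${ban_id}
        else
            # As logged: 2>&1 leaves this message on stdout
            echo "Could not retrieve and clear location constraint for VIP $v" 2>&1
        fi
    done
}
```

The first loop explains the "Creating location constraint cli-ban-..." warnings in stdout, and the second loop removes those bans so the VIPs are free to return to the node once it rejoins the cluster.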
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0878