Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1680280

Summary: [OSP13] Pacemaker resource constraints cause API outage during maintenance
Product: Red Hat OpenStack
Reporter: Damien Ciabrini <dciabrin>
Component: openstack-tripleo-heat-templates
Assignee: Emilien Macchi <emacchi>
Status: CLOSED ERRATA
QA Contact: pkomarov
Severity: urgent
Docs Contact:
Priority: high
Version: 13.0 (Queens)
CC: agurenko, dciabrin, emacchi, lbezdick, mburns, pkomarov
Target Milestone: zstream
Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-8.2.0-13.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1647451
Environment:
Last Closed: 2019-04-30 17:27:36 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1647449, 1647451
Bug Blocks: 1647452, 1647453

Comment 8 pkomarov 2019-04-16 21:30:31 UTC
Verified.

[stack@undercloud-0 ~]$ rhos-release -L
Installed repositories (rhel-7.6):
  13
  ceph-3
  ceph-osd-3
  rhel-7.6
[stack@undercloud-0 ~]$ cat core_puddle_version 
2019-04-10.1

Perform the undercloud/overcloud (uc/oc) update.

Check that the VIPs were moved before Pacemaker was stopped:

[stack@undercloud-0 ~]$ grep -A 5 'Moving VIP' overcloud_update_run_Controller.log|tail 
2019-04-16 12:59:26 | TASK [Stop pacemaker cluster] **************************************************
2019-04-16 12:59:26 | Tuesday 16 April 2019  12:57:50 -0400 (0:00:09.990)       0:14:06.332 ********* 
2019-04-16 12:59:26 | changed: [controller-1] => {"changed": true, "out": "offline"}
--
2019-04-16 13:13:38 | changed: [controller-0] => {"changed": true, "cmd": "CLUSTER_NODE=$(crm_node -n)\n echo \"Retrieving all the VIPs which are hosted on this node\"\n VIPS_TO_MOVE=$(crm_mon --as-xml | xmllint --xpath '//resource[@resource_agent = \"ocf::heartbeat:IPaddr2\" and @role = \"Started\" and @managed = \"true\" and ./node[@name = \"'${CLUSTER_NODE}'\"]]/@id' - | sed -e 's/id=//g' -e 's/\"//g')\n for v in ${VIPS_TO_MOVE}; do\n echo \"Moving VIP $v on another node\"\n pcs resource move $v --wait=300\n done\n echo \"Removing the location constraints that were created to move the VIPs\"\n for v in ${VIPS_TO_MOVE}; do\n echo \"Removing location ban for VIP $v\"\n ban_id=$(cibadmin --query | xmllint --xpath 'string(//rsc_location[@rsc=\"'${v}'\" and @node=\"'${CLUSTER_NODE}'\" and @score=\"-INFINITY\"]/@id)' -)\n if [ -n \"$ban_id\" ]; then\n pcs constraint remove ${ban_id}\n else\n echo \"Could not retrieve and clear location constraint for VIP $v\" 2>&1\n fi\n done", "delta": "0:00:09.357201", "end": "2019-04-16 17:11:58.955588", "rc": 0, "start": "2019-04-16 17:11:49.598387", "stderr": "", "stderr_lines": [], "stdout": "Retrieving all the VIPs which are hosted on this node\nMoving VIP ip-192.168.24.9 on another node\nWarning: Creating location constraint cli-ban-ip-192.168.24.9-on-controller-0 with a score of -INFINITY for resource ip-192.168.24.9 on node controller-0.\nThis will prevent ip-192.168.24.9 from running on controller-0 until the constraint is removed. This will be the case even if controller-0 is the last node in the cluster.\nResource 'ip-192.168.24.9' is running on node controller-1.\nMoving VIP ip-172.17.1.16 on another node\nWarning: Creating location constraint cli-ban-ip-172.17.1.16-on-controller-0 with a score of -INFINITY for resource ip-172.17.1.16 on node controller-0.\nThis will prevent ip-172.17.1.16 from running on controller-0 until the constraint is removed. 
This will be the case even if controller-0 is the last node in the cluster.\nResource 'ip-172.17.1.16' is running on node controller-1.\nMoving VIP ip-172.17.1.15 on another node\nWarning: Creating location constraint cli-ban-ip-172.17.1.15-on-controller-0 with a score of -INFINITY for resource ip-172.17.1.15 on node controller-0.\nThis will prevent ip-172.17.1.15 from running on controller-0 until the constraint is removed. This will be the case even if controller-0 is the last node in the cluster.\nResource 'ip-172.17.1.15' is running on node controller-2.\nRemoving the location constraints that were created to move the VIPs\nRemoving location ban for VIP ip-192.168.24.9\nRemoving location ban for VIP ip-172.17.1.16\nRemoving location ban for VIP ip-172.17.1.15", "stdout_lines": ["Retrieving all the VIPs which are hosted on this node", "Moving VIP ip-192.168.24.9 on another node", "Warning: Creating location constraint cli-ban-ip-192.168.24.9-on-controller-0 with a score of -INFINITY for resource ip-192.168.24.9 on node controller-0.", "This will prevent ip-192.168.24.9 from running on controller-0 until the constraint is removed. This will be the case even if controller-0 is the last node in the cluster.", "Resource 'ip-192.168.24.9' is running on node controller-1.", "Moving VIP ip-172.17.1.16 on another node", "Warning: Creating location constraint cli-ban-ip-172.17.1.16-on-controller-0 with a score of -INFINITY for resource ip-172.17.1.16 on node controller-0.", "This will prevent ip-172.17.1.16 from running on controller-0 until the constraint is removed. 
This will be the case even if controller-0 is the last node in the cluster.", "Resource 'ip-172.17.1.16' is running on node controller-1.", "Moving VIP ip-172.17.1.15 on another node", "Warning: Creating location constraint cli-ban-ip-172.17.1.15-on-controller-0 with a score of -INFINITY for resource ip-172.17.1.15 on node controller-0.", "This will prevent ip-172.17.1.15 from running on controller-0 until the constraint is removed. This will be the case even if controller-0 is the last node in the cluster.", "Resource 'ip-172.17.1.15' is running on node controller-2.", "Removing the location constraints that were created to move the VIPs", "Removing location ban for VIP ip-192.168.24.9", "Removing location ban for VIP ip-172.17.1.16", "Removing location ban for VIP ip-172.17.1.15"]}
2019-04-16 13:13:38 | 
2019-04-16 13:13:38 | TASK [Stop pacemaker cluster] **************************************************
2019-04-16 13:13:38 | Tuesday 16 April 2019  13:11:59 -0400 (0:00:09.694)       0:28:14.844 ********* 
2019-04-16 13:13:38 | changed: [controller-0] => {"changed": true, "out": "offline"}
2019-04-16 13:13:38 |
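
For readability, the script carried in the escaped "cmd" string of the controller-0 task output above can be unescaped roughly as follows. This is a sketch transcribed from that log line, not the template source itself: the wrapping function name move_vips_off_node is hypothetical and was added only so the steps can be called as a unit, and the indentation and the quoting of "$v" and "$ban_id" are editorial. The commands themselves (crm_node, crm_mon, xmllint, pcs, cibadmin) are the ones that appear in the log.

```shell
# Hypothetical wrapper name; the logged task runs these steps inline.
move_vips_off_node() {
    CLUSTER_NODE=$(crm_node -n)

    echo "Retrieving all the VIPs which are hosted on this node"
    # Select started, managed IPaddr2 resources running on this node
    # and strip the id="..." decoration that xmllint prints.
    VIPS_TO_MOVE=$(crm_mon --as-xml | xmllint --xpath '//resource[@resource_agent = "ocf::heartbeat:IPaddr2" and @role = "Started" and @managed = "true" and ./node[@name = "'${CLUSTER_NODE}'"]]/@id' - | sed -e 's/id=//g' -e 's/"//g')

    # Move each VIP away and wait up to 300 seconds for it to settle.
    for v in ${VIPS_TO_MOVE}; do
        echo "Moving VIP $v on another node"
        pcs resource move "$v" --wait=300
    done

    echo "Removing the location constraints that were created to move the VIPs"
    for v in ${VIPS_TO_MOVE}; do
        echo "Removing location ban for VIP $v"
        # "pcs resource move" leaves a -INFINITY location constraint
        # behind; look up its id so it can be removed again.
        ban_id=$(cibadmin --query | xmllint --xpath 'string(//rsc_location[@rsc="'${v}'" and @node="'${CLUSTER_NODE}'" and @score="-INFINITY"]/@id)' -)
        if [ -n "$ban_id" ]; then
            pcs constraint remove "${ban_id}"
        else
            # As logged, the redirection is 2>&1, so this message stays on stdout.
            echo "Could not retrieve and clear location constraint for VIP $v" 2>&1
        fi
    done
}
```

In the log, this step completes with rc 0 on controller-0 immediately before the "Stop pacemaker cluster" task, which is the ordering the verification is checking for.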

Comment 11 errata-xmlrpc 2019-04-30 17:27:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0939