Bug 1435271 - [UPDATES] Run 'clustercheck' during minor update just on database nodes
Summary: [UPDATES] Run 'clustercheck' during minor update just on database nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 11.0 (Ocata)
Assignee: Yurii Prokulevych
QA Contact: Yurii Prokulevych
URL:
Whiteboard:
Depends On:
Blocks: 1394025
Reported: 2017-03-23 13:44 UTC by Yurii Prokulevych
Modified: 2017-05-17 20:12 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-6.0.0-0.10.el7ost
Doc Type: Bug Fix
Doc Text:
With this release, 'clustercheck' will only run on nodes specified in the 'wsrep_cluster_address' option of Galera. This change was implemented to take into account use cases where Galera is run on a dedicated node (as is made possible with composable roles). Previously, during minor updates 'clustercheck' ran on all nodes running pacemaker, assuming Galera was also on the same node.
Clone Of:
Environment:
Last Closed: 2017-05-17 20:12:55 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 451467 | 0 | None | None | None | 2017-03-29 15:58:50 UTC
Red Hat Product Errata | RHEA-2017:1245 | 0 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory | 2017-05-17 23:01:50 UTC

Description Yurii Prokulevych 2017-03-23 13:44:57 UTC
Description of problem:
-----------------------
Minor update of composable deployment failed because 'clustercheck' failed to get status from Galera. 
[root@messaging-0 ~]# clustercheck 
HTTP/1.1 503 Service Unavailable
Content-Type: text/plain
Connection: close
Content-Length: 36

Galera cluster node is not synced.
[root@messaging-0 ~]# echo $?
1

Right now 'clustercheck' (https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/tasks/yum_update.sh#L117) is run on all nodes that have pcs active (https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/tasks/yum_update.sh#L100), despite Galera not necessarily running on those nodes.
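
For illustration only, here is a minimal sketch of the kind of guard that would keep 'clustercheck' off non-database nodes. It is not the code from the actual fix. The function name node_in_wsrep_cluster is hypothetical, and the sketch assumes the config file has a line of the form wsrep_cluster_address = gcomm://host1,host2,... with host names written exactly as the node name passed in (no ports).

```shell
#!/bin/bash
# Hypothetical helper (not part of yum_update.sh): return 0 only if the
# given node name is listed in the wsrep_cluster_address line of the
# given config file.
#   $1 - path to a Galera/MySQL config file
#   $2 - node name to look for (exact match, assumption)
node_in_wsrep_cluster() {
    local conf_file=$1 node=$2 line addresses host
    local -a hosts

    # Take the last wsrep_cluster_address assignment in the file, if any.
    line=$(grep -E '^[[:space:]]*wsrep_cluster_address[[:space:]]*=' \
        "$conf_file" 2>/dev/null | tail -n 1)
    [[ -n $line && $line == *gcomm://* ]] || return 1

    addresses=${line#*gcomm://}     # drop everything up to the host list
    addresses=${addresses%%\?*}     # drop trailing ?option=value part
    addresses=$(printf '%s' "$addresses" | tr -d "\"' \t")  # drop quotes/blanks

    # Split the comma-separated list and compare each entry.
    IFS=, read -r -a hosts <<< "$addresses"
    for host in "${hosts[@]}"; do
        [[ $host == "$node" ]] && return 0
    done
    return 1
}

# Example use (the config path is an assumption, adjust to the deployment):
#   if node_in_wsrep_cluster /etc/my.cnf.d/galera.cnf "$(hostname -s)"; then
#       clustercheck
#   fi
```

With a guard like this, a node such as messaging-0 that is not listed in the cluster address would skip the check instead of failing the update with the 503 shown above.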





Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-tripleo-heat-templates-6.0.0-0.20170307170102.3134785.0rc2.el7ost.noarch

Steps to Reproduce:
1. Deploy RHOS-11 (2017-03-14.2) with dedicated roles (database, messaging, controller, networker, compute, ceph)
2. Set up the latest repo on the undercloud (uc) and overcloud (oc)
3. Update the undercloud
4. Start the overcloud update


Actual results:
---------------
The update fails.

Additional info:
----------------
Virtual setup: 3 ceph + 2 compute + 3 controller + 3 galera + 3 messaging + 2 networker

Comment 6 Yurii Prokulevych 2017-04-07 13:36:40 UTC
Verified with openstack-tripleo-heat-templates-6.0.0-3.el7ost.noarch


openstack stack list
+--------------------------------------+------------+-----------------+----------------------+----------------------+
| ID                                   | Stack Name | Stack Status    | Creation Time        | Updated Time         |
+--------------------------------------+------------+-----------------+----------------------+----------------------+
| a906c7a0-3e20-49f4-acfc-585dfefe9452 | overcloud  | UPDATE_COMPLETE | 2017-04-07T08:08:02Z | 2017-04-07T10:58:01Z |
+--------------------------------------+------------+-----------------+----------------------+----------------------+

nova list
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks               |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| 469c7b43-b4e8-44fb-8c2c-3e41da55aa3f | ceph-0       | ACTIVE | -          | Running     | ctlplane=192.168.24.10 |
| 4f397ab2-8e9d-429c-b839-930bfe506b22 | ceph-1       | ACTIVE | -          | Running     | ctlplane=192.168.24.19 |
| e66945fa-9441-4815-a00d-259bedb9f34e | ceph-2       | ACTIVE | -          | Running     | ctlplane=192.168.24.12 |
| 3e7ce3c9-1b1b-4869-adf7-11968cfcf549 | compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.24.7  |
| 038bb09b-4a0a-4d84-adf8-2f465a4d16a9 | compute-1    | ACTIVE | -          | Running     | ctlplane=192.168.24.9  |
| 409fd440-527c-46f0-9295-71d0e0810493 | controller-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.22 |
| 3635c8e6-ecbb-4d23-9f23-cd1dc52d868b | controller-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.15 |
| 5f35c248-a1b4-473b-be2a-e991f1a44c70 | controller-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.16 |
| 6df65fb3-fcaa-4c59-b8b2-68e3af0aca1c | galera-0     | ACTIVE | -          | Running     | ctlplane=192.168.24.24 |
| 9d75252c-c6e9-4a66-9bbd-6c22db341580 | galera-1     | ACTIVE | -          | Running     | ctlplane=192.168.24.17 |
| b7dc4b7c-0c9f-4e94-b9b4-d403b47b3063 | galera-2     | ACTIVE | -          | Running     | ctlplane=192.168.24.18 |
| 4626ab11-0983-4991-bd1d-c6691908a3fe | messaging-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.11 |
| f931298e-4927-4bee-bcad-296d0da5bb92 | messaging-1  | ACTIVE | -          | Running     | ctlplane=192.168.24.23 |
| c0f4407c-89f5-4e25-8c0c-129f6698cc9c | messaging-2  | ACTIVE | -          | Running     | ctlplane=192.168.24.8  |
| d8587085-ddf8-40d7-8236-b8923f7ef0ff | networker-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.6  |
| a39935f4-76fe-453d-9a33-de5657582473 | networker-1  | ACTIVE | -          | Running     | ctlplane=192.168.24.20 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+

Comment 8 errata-xmlrpc 2017-05-17 20:12:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

