Bug 1328621

Summary: Pacemaker constantly fails at documented step 10.4 upgrading the overcloud
Product: Red Hat OpenStack
Component: openstack-tripleo-heat-templates
Version: 8.0 (Liberty)
Status: CLOSED NOTABUG
Severity: urgent
Priority: high
Reporter: Andreas Karis <akaris>
Assignee: Jiri Stransky <jstransk>
QA Contact: Arik Chernetsky <achernet>
CC: aschultz, mburns, mcornea, michele, rhel-osp-director-maint, slinaber
Keywords: Triaged, ZStream
Target Milestone: ---
Target Release: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2017-10-03 15:31:23 UTC

Description Andreas Karis 2016-04-19 21:17:22 UTC
Description of problem:
Pacemaker constantly fails during documented step 9.3, upgrading the overcloud:
https://access.redhat.com/documentation/en/red-hat-openstack-platform/8/director-installation-and-usage/93-upgrading-the-overcloud

The documentation contains the following warning:
Important
If the Overcloud stack failed during this step, log into one of your Controller nodes, run sudo pcs cluster start, then rerun openstack overcloud deploy on the director. 

Unfortunately, even if I restart pcs, I constantly run into the same issue. pcs stops on every run.




Version-Release number of selected component (if applicable):
Upgrade from 7.3 to 8.0

How reproducible:
Physical lab, fresh install, with a few instances. The following commands were run in chronological order until the issue occurred.

 

. stackrc

[stack@undercloud-physical ~]$ nova list

+--------------------------------------+------------------------+--------+------------+-------------+---------------------+

| ID                                   | Name                   | Status | Task State | Power State | Networks            |

+--------------------------------------+------------------------+--------+------------+-------------+---------------------+

| 610d5b42-ba23-42e3-887f-1f98588279c3 | overcloud-compute-0    | ACTIVE | -          | Running     | ctlplane=192.0.2.13 |

| 23b05a42-a739-42f4-b8e1-1f7115548173 | overcloud-compute-1    | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |

| f178081e-29d3-41d5-a866-276add618f28 | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |

| 51787f3d-154b-4b1c-adb9-07a71dd61124 | overcloud-controller-1 | ACTIVE | -          | Running     | ctlplane=192.0.2.15 |

| 783ae7bf-62d2-4e44-8817-63377db991a7 | overcloud-controller-2 | ACTIVE | -          | Running     | ctlplane=192.0.2.14 |

+--------------------------------------+------------------------+--------+------------+-------------+---------------------+

 

[stack@undercloud-physical ~]$ nova list

+--------------------------------------+--------------+--------+------------+-------------+---------------------------------+

| ID                                   | Name         | Status | Task State | Power State | Networks                        |

+--------------------------------------+--------------+--------+------------+-------------+---------------------------------+

| d4aa8469-f418-48ea-bd7e-0b9411ee9ec7 | cirros-test1 | ACTIVE | -          | Running     | private=192.168.0.5, 10.0.0.204 |

| 792e9fe1-36fc-484e-b843-e9906815da6a | cirros-test2 | ACTIVE | -          | Running     | private=192.168.0.6, 10.0.0.205 |

| 7903192b-4039-445c-a5e4-4f797054b399 | cirros-test3 | ACTIVE | -          | Running     | private=192.168.0.7, 10.0.0.206 |

| 24e06489-7c7c-4733-bb49-3abbe84d8b07 | cirros-test4 | ACTIVE | -          | Running     | private=192.168.0.8, 10.0.0.207 |

+--------------------------------------+--------------+--------+------------+-------------+---------------------------------+

 

[stack@undercloud-physical ~]$ neutron net-list

+--------------------------------------+----------------------------------------------------+-------------------------------------------------------+

| id                                   | name                                               | subnets                                               |

+--------------------------------------+----------------------------------------------------+-------------------------------------------------------+

| 998aaf0a-9d19-487d-b1c9-972373f0d8e1 | public                                             | b4b5ac61-f233-4efa-bd90-70dba61992f0 10.0.0.0/24      |

| db69e4e1-4afe-4e18-a80e-8e917e0e741e | private                                            | 30456489-0e82-4274-9a9c-2c95d59b7ea8 192.168.0.0/24   |

| b79b608f-da1f-4bb5-b656-4f9f0897d295 | HA network tenant 7b9789132feb465ca0bf22e96f91849e | d1107b5e-aa7f-44f6-8efc-24950dbbd174 169.254.192.0/18 |

+--------------------------------------+----------------------------------------------------+-------------------------------------------------------+

 

[stack@undercloud-physical ~]$ for i in 204 205 206 207;do ping -W1 -c1 10.0.0.$i;done

PING 10.0.0.204 (10.0.0.204) 56(84) bytes of data.

64 bytes from 10.0.0.204: icmp_seq=1 ttl=63 time=1.57 ms

 

--- 10.0.0.204 ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 0ms

rtt min/avg/max/mdev = 1.571/1.571/1.571/0.000 ms

PING 10.0.0.205 (10.0.0.205) 56(84) bytes of data.

64 bytes from 10.0.0.205: icmp_seq=1 ttl=63 time=0.766 ms

 

--- 10.0.0.205 ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 0ms

rtt min/avg/max/mdev = 0.766/0.766/0.766/0.000 ms

PING 10.0.0.206 (10.0.0.206) 56(84) bytes of data.

64 bytes from 10.0.0.206: icmp_seq=1 ttl=63 time=0.863 ms

 

--- 10.0.0.206 ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 0ms

rtt min/avg/max/mdev = 0.863/0.863/0.863/0.000 ms

PING 10.0.0.207 (10.0.0.207) 56(84) bytes of data.

64 bytes from 10.0.0.207: icmp_seq=1 ttl=63 time=0.836 ms

 

--- 10.0.0.207 ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 0ms

rtt min/avg/max/mdev = 0.836/0.836/0.836/0.000 ms

 

Upgrade according to:

Chapter 9. Upgrading the Environment - Red Hat Customer Portal

 

$ sudo subscription-manager repos --disable=rhel-7-server-openstack-7.0-rpms --disable=rhel-7-server-openstack-7.0-director-rpms

$ sudo subscription-manager repos --enable=rhel-7-server-openstack-8-rpms --enable=rhel-7-server-openstack-8-director-rpms

$ sudo yum upgrade

openstack undercloud upgrade

sudo systemctl list-units openstack-*

 

[stack@undercloud-physical ~]$ openstack server list

+--------------------------------------+------------------------+--------+---------------------+

| ID                                   | Name                   | Status | Networks            |

+--------------------------------------+------------------------+--------+---------------------+

| f178081e-29d3-41d5-a866-276add618f28 | overcloud-controller-0 | ACTIVE | ctlplane=192.0.2.16 |

| 51787f3d-154b-4b1c-adb9-07a71dd61124 | overcloud-controller-1 | ACTIVE | ctlplane=192.0.2.15 |

| 783ae7bf-62d2-4e44-8817-63377db991a7 | overcloud-controller-2 | ACTIVE | ctlplane=192.0.2.14 |

| 610d5b42-ba23-42e3-887f-1f98588279c3 | overcloud-compute-0    | ACTIVE | ctlplane=192.0.2.13 |

| 23b05a42-a739-42f4-b8e1-1f7115548173 | overcloud-compute-1    | ACTIVE | ctlplane=192.0.2.12 |

+--------------------------------------+------------------------+--------+---------------------+

[stack@undercloud-physical ~]$ ironic node-list

+--------------------------------------+-----------------+--------------------------------------+-------------+--------------------+-------------+

| UUID                                 | Name            | Instance UUID                        | Power State | Provisioning State | Maintenance |

+--------------------------------------+-----------------+--------------------------------------+-------------+--------------------+-------------+

| 90727673-99cf-4177-aa94-6b2aa53e0b06 | overcloud-node1 | 783ae7bf-62d2-4e44-8817-63377db991a7 | power on    | active             | False       |

| 2b2c477b-74a6-4ef1-9b45-185ce864dec8 | overcloud-node2 | None                                 | power off   | available          | True        |

| 7923b610-3a17-49e5-961f-b93eec682eb8 | overcloud-node3 | 51787f3d-154b-4b1c-adb9-07a71dd61124 | power on    | active             | False       |

| 9ae5c9af-ab12-4ebf-bb75-866a93f459e3 | overcloud-node4 | 23b05a42-a739-42f4-b8e1-1f7115548173 | power on    | active             | False       |

| c3f2f89c-1113-4f5c-8d3d-dd6a4e8a67d4 | overcloud-node5 | 610d5b42-ba23-42e3-887f-1f98588279c3 | power on    | active             | False       |

| 58a08b02-8c74-46d4-90b0-36fcafb94e17 | overcloud-node6 | f178081e-29d3-41d5-a866-276add618f28 | power on    | active             | False       |

| c79239d5-be2d-45ff-ba1b-bdc25bc53e9e | overcloud-node7 | None                                 | power off   | available          | False       |

| b8312b72-6d63-49b8-8352-f591a59837cc | overcloud-node8 | None                                 | power off   | available          | False       |

+--------------------------------------+-----------------+--------------------------------------+-------------+--------------------+-------------+

[stack@undercloud-physical ~]$ heat stack-list

+--------------------------------------+------------+-----------------+---------------------+--------------+

| id                                   | stack_name | stack_status    | creation_time       | updated_time |

+--------------------------------------+------------+-----------------+---------------------+--------------+

| 53212f0c-c85a-4839-b42d-1899755d8ff3 | overcloud  | CREATE_COMPLETE | 2016-04-19T15:44:36 | None         |

+--------------------------------------+------------+-----------------+---------------------+--------------+

[stack@undercloud-physical ~]$

 

9.2 updating overcloud images

 

$ sudo yum install rhosp-director-images rhosp-director-images-ipa

 

[stack@undercloud-physical ~]$ rm -rf ~/images/*

[stack@undercloud-physical ~]$ cp /usr/share/rhosp-director-images/overcloud-full-latest-8.0.tar ~/images/.

[stack@undercloud-physical ~]$ cp /usr/share/rhosp-director-images/ironic-python-agent-latest-8.0.tar ~/images/.

[stack@undercloud-physical ~]$ cd ~/images

[stack@undercloud-physical images]$ for tarfile in *.tar; do tar -xf $tarfile; done

 

[stack@undercloud-physical images]$ openstack image list

+--------------------------------------+--------------------------------+

| ID                                   | Name                           |

+--------------------------------------+--------------------------------+

| deb0b5ac-c52b-4e24-be0d-718b96456fc7 | overcloud-full                 |

| e5d5ad4b-b369-4ddc-9317-679ab6e236d0 | bm-deploy-kernel               |

| 105a2fd2-5aa6-44a2-bb1c-0548513e49cb | bm-deploy-ramdisk              |

| a0d89537-2a95-4891-a4a0-dfb61cd7d8e8 | overcloud-full_20160418T203555 |

| fa01d67d-fd3a-4e67-ab22-c16a91456d54 | overcloud-full-initrd          |

| a59ad9aa-efd2-4593-a3b3-e46ad1c1de75 | overcloud-full-vmlinuz         |

+--------------------------------------+--------------------------------+

[stack@undercloud-physical images]$ openstack image list | awk '{print $2}' | xargs -I {} openstack image delete {}

No image with a name or ID of 'ID' exists.
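The "No image with a name or ID of 'ID' exists" message appears because the awk filter also passes the table header (and empty fields from the border rows) to xargs. A minimal sketch of a stricter filter follows; `image_ids` is a hypothetical helper name, and it assumes image IDs are printed as lowercase UUIDs, as in the listing above.

```shell
# Print only the image UUIDs found in `openstack image list` table output
# read on stdin. Header and border rows contain no UUID and are skipped.
image_ids() {
  grep -Eo '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
}
```

It could be used as `openstack image list | image_ids | xargs -I {} openstack image delete {}`. Where the installed client supports the output formatter options, `openstack image list -f value -c ID` should give the same list without any text filtering.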

[stack@undercloud-physical images]$ cd ~/images

[stack@undercloud-physical images]$ openstack overcloud image upload --update-existing

Image "overcloud-full-vmlinuz" was uploaded.

+--------------------------------------+------------------------+-------------+---------+--------+

|                  ID                  |          Name          | Disk Format |   Size  | Status |

+--------------------------------------+------------------------+-------------+---------+--------+

| 57000c02-955f-4732-bd98-d61241a51a52 | overcloud-full-vmlinuz |     aki     | 5153408 | active |

+--------------------------------------+------------------------+-------------+---------+--------+

Image "overcloud-full-initrd" was uploaded.

+--------------------------------------+-----------------------+-------------+----------+--------+

|                  ID                  |          Name         | Disk Format |   Size   | Status |

+--------------------------------------+-----------------------+-------------+----------+--------+

| a3f8e2dc-c466-4e6d-b3b6-1b27912088d8 | overcloud-full-initrd |     ari     | 40324659 | active |

+--------------------------------------+-----------------------+-------------+----------+--------+

Image "overcloud-full" was uploaded.

+--------------------------------------+----------------+-------------+------------+--------+

|                  ID                  |      Name      | Disk Format |    Size    | Status |

+--------------------------------------+----------------+-------------+------------+--------+

| 0c942b36-4b97-4a79-83ea-da960af65066 | overcloud-full |    qcow2    | 1030608384 | active |

+--------------------------------------+----------------+-------------+------------+--------+

Image "bm-deploy-kernel" was uploaded.

+--------------------------------------+------------------+-------------+---------+--------+

|                  ID                  |       Name       | Disk Format |   Size  | Status |

+--------------------------------------+------------------+-------------+---------+--------+

| 248813e9-8cf9-4d39-ad7d-39ead3ac3440 | bm-deploy-kernel |     aki     | 5153408 | active |

+--------------------------------------+------------------+-------------+---------+--------+

Image "bm-deploy-ramdisk" was uploaded.

+--------------------------------------+-------------------+-------------+-----------+--------+

|                  ID                  |        Name       | Disk Format |    Size   | Status |

+--------------------------------------+-------------------+-------------+-----------+--------+

| 4e7699f7-a316-4f09-a677-d13af29cef9f | bm-deploy-ramdisk |     ari     | 344412915 | active |

+--------------------------------------+-------------------+-------------+-----------+--------+

[stack@undercloud-physical images]$ openstack baremetal configure boot

[stack@undercloud-physical images]$ openstack image list

+--------------------------------------+------------------------+

| ID                                   | Name                   |

+--------------------------------------+------------------------+

| 4e7699f7-a316-4f09-a677-d13af29cef9f | bm-deploy-ramdisk      |

| 248813e9-8cf9-4d39-ad7d-39ead3ac3440 | bm-deploy-kernel       |

| a3f8e2dc-c466-4e6d-b3b6-1b27912088d8 | overcloud-full-initrd  |

| 0c942b36-4b97-4a79-83ea-da960af65066 | overcloud-full         |

| 57000c02-955f-4732-bd98-d61241a51a52 | overcloud-full-vmlinuz |

+--------------------------------------+------------------------+

[stack@undercloud-physical images]$ ls -l /httpboot

total 496980

-rwxr-xr-x. 1 root   root     5153408 Apr 19 15:23 agent.kernel

-rw-r--r--. 1 root   root   344412915 Apr 19 15:23 agent.ramdisk

-rw-r--r--. 1 ironic ironic       258 Apr 19 11:45 boot.ipxe

-rw-r--r--. 1 ironic ironic       239 Apr 19 13:44 discoverd.ipxe

-rwxr-xr-x. 1 ironic ironic   5153184 Apr 18 16:36 discovery.kernel

-rw-r--r--. 1 ironic ironic 154164260 Apr 18 16:36 discovery.ramdisk

-rw-r--r--. 1 root   root         286 Apr 19 13:45 inspector.ipxe

drwxr-xr-x. 2 ironic ironic         6 Apr 19 11:52 pxelinux.cfg

 

9.3 upgrading the overcloud

 

===> This part of the documentation could be clearer:

 

~~~

Important
If using custom NIC templates from Red Hat OpenStack Platform 7, add the ManagementSubnetIp parameter to the parameters section of your NIC templates. For example:

 

parameters:
  ManagementIpSubnet: # Only populated when including environments/network-management.yaml
    default: ''
    description: IP address/subnet on the management network
    type: string

~~~

 

openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/templates-examples/templates-simple-7/network-environment.yaml --control-flavor control --compute-flavor compute --control-scale 3 --compute-scale 2 --ntp-server 10.5.26.10 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker-init.yaml

(...)

Stack overcloud UPDATE_COMPLETE

Overcloud Endpoint: http://10.0.0.4:5000/v2.0

Overcloud Deployed

 

[stack@undercloud-physical ~]$ !$

templates-examples/templates-simple-7/upgrade.yaml

Running ...

openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/templates-examples/templates-simple-7/network-environment.yaml --control-flavor control --compute-flavor compute --control-scale 3 --compute-scale 2 --ntp-server 10.5.26.10 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml

Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates

2016-04-19 19:49:30 [overcloud]: UPDATE_IN_PROGRESS  Stack UPDATE started

(...)

2016-04-19 19:55:57 [NetworkDeployment]: SIGNAL_COMPLETE  Unknown

Stack overcloud UPDATE_FAILED

Heat Stack update failed.

[stack@undercloud-physical ~]$ heat stack-list

+--------------------------------------+------------+---------------+---------------------+---------------------+

| id                                   | stack_name | stack_status  | creation_time       | updated_time        |

+--------------------------------------+------------+---------------+---------------------+---------------------+

| 53212f0c-c85a-4839-b42d-1899755d8ff3 | overcloud  | UPDATE_FAILED | 2016-04-19T15:44:36 | 2016-04-19T19:49:30 |

+--------------------------------------+------------+---------------+---------------------+---------------------+

[stack@undercloud-physical ~]$ nova list

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

| ID                                   | Name                    | Status | Task State | Power State | Networks            |

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

| f178081e-29d3-41d5-a866-276add618f28 | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |

| 51787f3d-154b-4b1c-adb9-07a71dd61124 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.0.2.15 |

| 783ae7bf-62d2-4e44-8817-63377db991a7 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.0.2.14 |

| 610d5b42-ba23-42e3-887f-1f98588279c3 | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.13 |

| 23b05a42-a739-42f4-b8e1-1f7115548173 | overcloud-novacompute-1 | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.16

The authenticity of host '192.0.2.16 (192.0.2.16)' can't be established.

ECDSA key fingerprint is 01:08:aa:94:0e:8b:62:85:06:b5:c5:f4:df:45:c2:9d.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added '192.0.2.16' (ECDSA) to the list of known hosts.

[heat-admin@overcloud-controller-0 ~]$ sudo -i

[root@overcloud-controller-0 ~]# pcs cluster status

Error: cluster is not currently running on this node

[root@overcloud-controller-0 ~]# pcs cluster start

Starting Cluster...

[root@overcloud-controller-0 ~]# pcs cluster status

Cluster Status:

Last updated: Tue Apr 19 20:02:06 2016        Last change: Tue Apr 19 19:55:32 2016 by root via crm_resource on overcloud-controller-0

Stack: corosync

Current DC: NONE

3 nodes and 112 resources configured

OFFLINE: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

 

PCSD Status:

  overcloud-controller-0: Online

  overcloud-controller-1: Online

  overcloud-controller-2: Online

[root@overcloud-controller-0 ~]# exit

logout

[heat-admin@overcloud-controller-0 ~]$

 

[stack@undercloud-physical ~]$ templates-examples/templates-simple-7/upgrade.yaml

Running ...

openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/templates-examples/templates-simple-7/network-environment.yaml --control-flavor control --compute-flavor compute --control-scale 3 --compute-scale 2 --ntp-server 10.5.26.10 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml

Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates

(...)

2016-04-19 20:05:47 [0]: SIGNAL_IN_PROGRESS  Signal: deployment succeeded

2016-04-19 20:05:47 [0]: UPDATE_COMPLETE  state changed

2016-04-19 20:05:47 [NovaComputeDeployment]: SIGNAL_COMPLETE  Unknown

2016-04-19 20:05:47 [0]: SIGNAL_COMPLETE  Unknown

2016-04-19 20:05:48 [NetworkDeployment]: SIGNAL_COMPLETE  Unknown

Stack overcloud UPDATE_FAILED

Heat Stack update failed.

(...)

[stack@undercloud-physical ~]$ nova list

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

| ID                                   | Name                    | Status | Task State | Power State | Networks            |

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

| f178081e-29d3-41d5-a866-276add618f28 | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |

| 51787f3d-154b-4b1c-adb9-07a71dd61124 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.0.2.15 |

| 783ae7bf-62d2-4e44-8817-63377db991a7 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.0.2.14 |

| 610d5b42-ba23-42e3-887f-1f98588279c3 | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.13 |

| 23b05a42-a739-42f4-b8e1-1f7115548173 | overcloud-novacompute-1 | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.15

The authenticity of host '192.0.2.15 (192.0.2.15)' can't be established.

ECDSA key fingerprint is 1e:fd:96:6c:c0:7c:70:24:b6:01:8d:c2:ec:04:3a:28.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added '192.0.2.15' (ECDSA) to the list of known hosts.

Last login: Tue Apr 19 16:15:37 2016 from 192.0.2.1

[heat-admin@overcloud-controller-1 ~]$ sudo -i

[root@overcloud-controller-1 ~]# pcs cluster status

Error: cluster is not currently running on this node

[root@overcloud-controller-1 ~]# pcs cluster start

Starting Cluster...

[root@overcloud-controller-1 ~]# exit

logout

[heat-admin@overcloud-controller-1 ~]$ exit

logout

Connection to 192.0.2.15 closed.

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.14

The authenticity of host '192.0.2.14 (192.0.2.14)' can't be established.

ECDSA key fingerprint is b6:43:ab:d1:ea:fa:a5:61:5e:4b:07:65:b3:ad:d0:f5.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added '192.0.2.14' (ECDSA) to the list of known hosts.

[heat-admin@overcloud-controller-2 ~]$ pcs cluster status

Error: cluster is not currently running on this node

[heat-admin@overcloud-controller-2 ~]$ pcs cluster start

Please authenticate yourself to the local pcsd

Username: ^CTraceback (most recent call last):

  File "/usr/sbin/pcs", line 219, in <module>

    main(sys.argv[1:])

  File "/usr/sbin/pcs", line 204, in main

    orig_argv, True

  File "/usr/lib/python2.7/site-packages/pcs/utils.py", line 782, in call_local_pcsd

    username = get_terminal_input('Username: ')

  File "/usr/lib/python2.7/site-packages/pcs/utils.py", line 1599, in get_terminal_input

    return raw_input("")

KeyboardInterrupt

[heat-admin@overcloud-controller-2 ~]$ sudo -i

[root@overcloud-controller-2 ~]# pcs cluster status

Error: cluster is not currently running on this node

[root@overcloud-controller-2 ~]# pcs cluster start

Starting Cluster...

[root@overcloud-controller-2 ~]# pcs cluster status

Cluster Status:

Last updated: Tue Apr 19 20:08:47 2016        Last change: Tue Apr 19 19:55:32 2016 by root via crm_resource on overcloud-controller-0

Stack: corosync

Current DC: NONE

3 nodes and 112 resources configured

OFFLINE: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

 

PCSD Status:

  overcloud-controller-0: Online

  overcloud-controller-1: Online

  overcloud-controller-2: Online

[root@overcloud-controller-2 ~]# pcs cluster status

Cluster Status:

Last updated: Tue Apr 19 20:08:51 2016        Last change: Tue Apr 19 19:55:32 2016 by root via crm_resource on overcloud-controller-0

Stack: corosync

Current DC: overcloud-controller-0 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum

3 nodes and 112 resources configured

Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

 

PCSD Status:

  overcloud-controller-0: Online

  overcloud-controller-1: Online

  overcloud-controller-2: Online
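In the two status outputs above, the cluster first reports "Current DC: NONE" with all nodes OFFLINE and only a few seconds later shows a DC and all nodes Online. Rerunning the deploy before that point is likely to hit the same "cluster nodes being offline" check. A minimal sketch of a readiness test follows; `cluster_ready` is a hypothetical helper name, and it only inspects the text format shown in this report.

```shell
# Succeed only if `pcs cluster status` output (read on stdin) shows an
# Online node list, an elected DC, and no OFFLINE node list.
cluster_ready() {
  awk '
    /^[[:space:]]*Current DC: NONE/ { bad = 1 }
    /^[[:space:]]*OFFLINE:/         { bad = 1 }
    /^[[:space:]]*Online:/          { online = 1 }
    END { exit !(online && !bad) }
  '
}
```

On a controller, as root, this could be polled with `until pcs cluster status | cluster_ready; do sleep 5; done` before going back to the director.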

(...)

[stack@undercloud-physical ~]$ templates-examples/templates-simple-7/upgrade.yaml

Running ...

openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/templates-examples/templates-simple-7/network-environment.yaml --control-flavor control --compute-flavor compute --control-scale 3 --compute-scale 2 --ntp-server 10.5.26.10 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml

Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates

 

Step 9.3: I needed to run pcs cluster start on _all_ controllers, not just on one of them.
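Starting the cluster on every controller could be scripted roughly as follows. This is a sketch only: it assumes the controller ctlplane addresses from the `nova list` output in this report and passwordless sudo for heat-admin. `CONTROLLERS`, `DRY_RUN`, `start_cluster_on`, and `start_cluster_on_all` are hypothetical names introduced for illustration.

```shell
# Controller ctlplane addresses, taken from the `nova list` output in this report.
CONTROLLERS="${CONTROLLERS:-192.0.2.16 192.0.2.15 192.0.2.14}"

# Start the cluster services on one controller.
# With DRY_RUN=1 the command is printed instead of executed.
start_cluster_on() {
  ip="$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "ssh heat-admin@${ip} sudo pcs cluster start"
  else
    ssh "heat-admin@${ip}" sudo pcs cluster start
  fi
}

# Run the start command against every address in CONTROLLERS.
start_cluster_on_all() {
  for ip in $CONTROLLERS; do
    start_cluster_on "$ip"
  done
}
```

Running `DRY_RUN=1; start_cluster_on_all` prints the three ssh commands without touching the nodes. pcs also accepts `pcs cluster start --all` from a single node, which may be simpler where pcsd authentication between the nodes is intact.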

 

Overcloud nodes must be registered and must be able to reach the package repositories. This isn't mentioned in the documentation!

 

[stack@undercloud-physical ~]$ heat deployment-output-show 28735c62-5c80-40d4-90d7-b5daea975ae3 --all

{

  "deploy_stdout": "active\nactive\ninactive\nLoaded plugins: product-id, search-disabled-repos, subscription-manager\nThis system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.\n",

  "deploy_stderr": "There are no enabled repos.\n Run \"yum repolist all\" to see the repos you have.\n You can enable repos with yum-config-manager --enable <repo>\n",

  "deploy_status_code": 1

}
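When checking several deployments, it can help to pull just the exit status out of this JSON. A minimal sketch follows, assuming the output layout shown above with one field per line; `deploy_status_code` is a hypothetical helper name.

```shell
# Print the numeric deploy_status_code from the JSON that
# `heat deployment-output-show <id> --all` writes, read on stdin.
deploy_status_code() {
  sed -n 's/.*"deploy_status_code"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}
```

For example, `heat deployment-output-show 28735c62-5c80-40d4-90d7-b5daea975ae3 --all | deploy_status_code` would print the status code of the deployment shown above.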

 

[root@undercloud-physical ~]# iptables -I FORWARD --src 10.0.0.0/24 --j ACCEPT

[root@undercloud-physical ~]# iptables -t nat -I POSTROUTING --src 10.0.0.0/24 --j MASQUERADE

 

 

[root@overcloud-controller-0 ~]# subscription-manager register

Registering to: subscription.rhn.redhat.com:443/subscription

Username: akaris

Password:

The system has been registered with ID: ff75f964-1d5c-4e00-9f84-c0fa6cbe612e

[root@overcloud-controller-0 ~]# subcription-manager list --all --available | egrep -i 'employee|pool' | egrep -i employee -A1

-bash: subcription-manager: command not found

[root@overcloud-controller-0 ~]# subscription-manager list --all --available | egrep -i 'employee|pool' | egrep -i employee -A1

Subscription Name:   Red Hat Satellite Employee Subscription

Pool ID:             8a85f9863f14fed3013f82b2c7b33615

--

Subscription Name:   Employee SKU

Pool ID:             8a85f98144844aff014488d058bf15be

[root@overcloud-controller-0 ~]# subscription-manager attach --pool=8a85f9863f14fed3013f82b2c7b33615

Successfully attached a subscription for: Red Hat Satellite Employee Subscription

[root@overcloud-controller-0 ~]#

 

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.15

Last login: Tue Apr 19 20:07:45 2016 from 192.0.2.1

[heat-admin@overcloud-controller-1 ~]$ sudo -i

[root@overcloud-controller-1 ~]# subscription-manager register

Registering to: subscription.rhn.redhat.com:443/subscription

Username: akaris

Password:

The system has been registered with ID: bb0f4cef-36b3-4c69-acb3-e5d666876d73

[root@overcloud-controller-1 ~]# subscription-manager attach --pool=8a85f9863f14fed3013f82b2c7b33615

Successfully attached a subscription for: Red Hat Satellite Employee Subscription

[root@overcloud-controller-1 ~]# exit

logout

[heat-admin@overcloud-controller-1 ~]$ exit

logout

Connection to 192.0.2.15 closed.

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.14

Last login: Tue Apr 19 20:08:19 2016 from 192.0.2.1

[heat-admin@overcloud-controller-2 ~]$ sudo -i

[root@overcloud-controller-2 ~]# subscription-manager register

Registering to: subscription.rhn.redhat.com:443/subscription

Username: akaris

Password:

The system has been registered with ID: 5a7e6813-e724-49d7-9ee9-b4bb5434d079

[root@overcloud-controller-2 ~]# subscription-manager attach --pool=8a85f9863f14fed3013f82b2c7b33615

Successfully attached a subscription for: Red Hat Satellite Employee Subscription

[root@overcloud-controller-2 ~]# exit

logout

[heat-admin@overcloud-controller-2 ~]$ exit

logout

 

 

[stack@undercloud-physical ~]$ templates-examples/templates-simple-7/upgrade.yaml

Running ...

openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/templates-examples/templates-simple-7/network-environment.yaml --control-flavor control --compute-flavor compute --control-scale 3 --compute-scale 2 --ntp-server 10.5.26.10 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml

Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates

 

2016-04-19 20:27:57 [NetworkDeployment]: SIGNAL_COMPLETE  Unknown

2016-04-19 20:27:57 [NetworkDeployment]: SIGNAL_COMPLETE  Unknown

Stack overcloud UPDATE_FAILED

Heat Stack update failed.

[stack@undercloud-physical ~]$ heat stack-list

+--------------------------------------+------------+---------------+---------------------+---------------------+

| id                                   | stack_name | stack_status  | creation_time       | updated_time        |

+--------------------------------------+------------+---------------+---------------------+---------------------+

| 53212f0c-c85a-4839-b42d-1899755d8ff3 | overcloud  | UPDATE_FAILED | 2016-04-19T15:44:36 | 2016-04-19T20:24:40 |

+--------------------------------------+------------+---------------+---------------------+---------------------+

[stack@undercloud-physical ~]$ heat resource-list -n5 overcloud | grep -iv comple

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

| resource_name                              | physical_resource_id                          | resource_type                                     | resource_status | updated_time        | stack_name                                                                                    |

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

| UpdateWorkflow                             | 8f01e2a7-aeef-4b15-b1e3-f2b14e6ca591          | OS::TripleO::Tasks::UpdateWorkflow                | UPDATE_FAILED   | 2016-04-19T20:26:56 | overcloud                                                                                     |

| ControllerPacemakerUpgradeDeployment_Step1 | d9f96402-3668-46c7-b0c2-4cc8be763b96          | OS::Heat::SoftwareDeploymentGroup                 | UPDATE_FAILED   | 2016-04-19T20:26:58 | overcloud-UpdateWorkflow-tswg5622fcyz                                                         |

| 1                                          | e0c1c1b3-2282-4fef-8ccf-8240d0f2bde7          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T20:27:03 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

| 0                                          | 65f8d24d-359f-4ffe-a4d7-6719ee0b7e56          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T20:27:09 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

| 2                                          | 039b29c4-9672-4ede-9be4-f9b946a19d76          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T20:27:10 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

[stack@undercloud-physical ~]$ heat deployment-output-show 039b29c4-9672-4ede-9be4-f9b946a19d76 --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n",

  "deploy_stderr": "",

  "deploy_status_code": 1

}

 

 

 

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.16

Last login: Tue Apr 19 20:17:50 2016 from 192.0.2.1

[heat-admin@overcloud-controller-0 ~]$ sudo -i

[root@overcloud-controller-0 ~]# pcs cluster status

Error: cluster is not currently running on this node

[root@overcloud-controller-0 ~]# pcs cluster start

Starting Cluster...

 

2016-04-19 20:39:36 [1]: SIGNAL_COMPLETE  Unknown

2016-04-19 20:39:37 [NetworkDeployment]: SIGNAL_COMPLETE  Unknown

Stack overcloud UPDATE_FAILED

Heat Stack update failed.

[stack@undercloud-physical ~]$ heat resource-list -n5 overcloud

^C... terminating heat client

[stack@undercloud-physical ~]$ heat resource-list -n5 overcloud | grep -iv comple

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

| resource_name                              | physical_resource_id                          | resource_type                                     | resource_status | updated_time        | stack_name                                                                                    |

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

| UpdateWorkflow                             | 8f01e2a7-aeef-4b15-b1e3-f2b14e6ca591          | OS::TripleO::Tasks::UpdateWorkflow                | UPDATE_FAILED   | 2016-04-19T20:38:35 | overcloud                                                                                     |

| ControllerPacemakerUpgradeDeployment_Step1 | d9f96402-3668-46c7-b0c2-4cc8be763b96          | OS::Heat::SoftwareDeploymentGroup                 | UPDATE_FAILED   | 2016-04-19T20:38:38 | overcloud-UpdateWorkflow-tswg5622fcyz                                                         |

| 2                                          | 01b6c9b4-6375-474b-91e8-8df9b5b20dbe          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T20:38:44 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

| 1                                          | bea42ca1-62c1-410e-88d7-e91e5e70cfb8          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T20:38:46 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

| 0                                          | e2743d51-47a1-4ec3-b7ef-5c13363b6913          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T20:38:48 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

[stack@undercloud-physical ~]$ heat resource^C

[stack@undercloud-physical ~]$ heat deployment-output-show e2743d51-47a1-4ec3-b7ef-5c13363b6913 --all

{

  "deploy_stdout": "OFFLINE: [ overcloud-controller-1 overcloud-controller-2 ]\nERROR: upgrade cannot start with some cluster nodes being offline\n",

  "deploy_stderr": "",

  "deploy_status_code": 1

}

 

[stack@undercloud-physical ~]$ nova list

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

| ID                                   | Name                    | Status | Task State | Power State | Networks            |

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

| f178081e-29d3-41d5-a866-276add618f28 | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |

| 51787f3d-154b-4b1c-adb9-07a71dd61124 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.0.2.15 |

| 783ae7bf-62d2-4e44-8817-63377db991a7 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.0.2.14 |

| 610d5b42-ba23-42e3-887f-1f98588279c3 | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.13 |

| 23b05a42-a739-42f4-b8e1-1f7115548173 | overcloud-novacompute-1 | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.15

Last login: Tue Apr 19 20:22:50 2016 from 192.0.2.1

[heat-admin@overcloud-controller-1 ~]$ sudo -i

[root@overcloud-controller-1 ~]# pcs status

Error: cluster is not currently running on this node

[root@overcloud-controller-1 ~]# pcs cluster start

Starting Cluster...

[root@overcloud-controller-1 ~]#

[root@overcloud-controller-1 ~]# exit

logout

[heat-admin@overcloud-controller-1 ~]$ exit

logout

Connection to 192.0.2.15 closed.

[stack@undercloud-physical ~]$ ssh 192.0.2.14

Permission denied (publickey,gssapi-keyex,gssapi-with-mic).

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.14

Last login: Tue Apr 19 20:23:37 2016 from 192.0.2.1

[heat-admin@overcloud-controller-2 ~]$ sudo pcs cluster start

Starting Cluster...

[heat-admin@overcloud-controller-2 ~]$ exit

logout

Connection to 192.0.2.14 closed.

(reverse-i-search)`upda': ironic node-^Cdate overcloud-node6 replace properties/capabilities='profile:control,boot_option:local'

[stack@undercloud-physical ~]$ templates-examples/templates-simple-7/upgrade.yaml

Running ...

openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/templates-examples/templates-simple-7/network-environment.yaml --control-flavor control --compute-flavor compute --control-scale 3 --compute-scale 2 --ntp-server 10.5.26.10 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml

Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates

 

 

 

The documentation lacks information about how to troubleshoot this!

 

 

2016-04-19 20:50:27 [2]: SIGNAL_COMPLETE  Unknown

2016-04-19 20:50:28 [0]: SIGNAL_COMPLETE  Unknown

Stack overcloud UPDATE_FAILED

Heat Stack update failed.

[stack@undercloud-physical ~]$ heat resource-list -n5 overcloud | grep -iv complete

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

| resource_name                              | physical_resource_id                          | resource_type                                     | resource_status | updated_time        | stack_name                                                                                    |

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

| UpdateWorkflow                             | 8f01e2a7-aeef-4b15-b1e3-f2b14e6ca591          | OS::TripleO::Tasks::UpdateWorkflow                | UPDATE_FAILED   | 2016-04-19T20:49:22 | overcloud                                                                                     |

| ControllerPacemakerUpgradeDeployment_Step1 | d9f96402-3668-46c7-b0c2-4cc8be763b96          | OS::Heat::SoftwareDeploymentGroup                 | UPDATE_FAILED   | 2016-04-19T20:49:24 | overcloud-UpdateWorkflow-tswg5622fcyz                                                         |

| 2                                          | 1da40f42-0f50-429d-bf67-060ba3527123          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T20:49:31 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

| 1                                          | 39f6966f-9da8-472a-9bb6-4b5852bc7261          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T20:49:33 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

| 0                                          | 20e64cc4-bd91-411a-bfba-6cba2d891254          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T20:49:35 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

[stack@undercloud-physical ~]$ heat deployment-output-show 20e64cc4-bd91-411a-bfba-6cba2d891254 --all

{

  "deploy_stdout": "httpd has stopped\n Clone Set: openstack-keystone-clone [openstack-keystone]\nopenstack-keystone has stopped\nredis has stopped\nmongod has stopped\nrabbitmq has stopped\nmemcached has stopped\ngalera has stopped\novercloud-controller-0: Stopping Cluster (pacemaker)...\novercloud-controller-2: Stopping Cluster (pacemaker)...\novercloud-controller-1: Stopping Cluster (pacemaker)...\novercloud-controller-1: Stopping Cluster (corosync)...\novercloud-controller-0: Stopping Cluster (corosync)...\novercloud-controller-2: Stopping Cluster (corosync)...\ninactive\nLoaded plugins: product-id, search-disabled-repos, subscription-manager\nNo package python-zaqarclient available.\n",

  "deploy_stderr": "Error: Nothing to do\n",

  "deploy_status_code": 1

}

[stack@undercloud-physical ~]$ heat deployment-output-show 39f6966f-9da8-472a-9bb6-4b5852bc7261 --all

{

  "deploy_stdout": "inactive\nLoaded plugins: product-id, search-disabled-repos, subscription-manager\nNo package python-zaqarclient available.\n",

  "deploy_stderr": "Error: Nothing to do\n",

  "deploy_status_code": 1

}

[stack@undercloud-physical ~]$ heat deployment-output-show 1da40f42-0f50-429d-bf67-060ba3527123 --all

{

  "deploy_stdout": "inactive\nLoaded plugins: product-id, search-disabled-repos, subscription-manager\nNo package python-zaqarclient available.\n",

  "deploy_stderr": "Error: Nothing to do\n",

  "deploy_status_code": 1

}

 

 

[stack@undercloud-physical ~]$ nova list

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

| ID                                   | Name                    | Status | Task State | Power State | Networks            |

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

| f178081e-29d3-41d5-a866-276add618f28 | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |

| 51787f3d-154b-4b1c-adb9-07a71dd61124 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.0.2.15 |

| 783ae7bf-62d2-4e44-8817-63377db991a7 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.0.2.14 |

| 610d5b42-ba23-42e3-887f-1f98588279c3 | overcloud-novacompute-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.13 |

| 23b05a42-a739-42f4-b8e1-1f7115548173 | overcloud-novacompute-1 | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.16

Last login: Tue Apr 19 20:35:40 2016 from 192.0.2.1

[heat-admin@overcloud-controller-0 ~]$ sudo -

sudo: -: command not found

[heat-admin@overcloud-controller-0 ~]$ pcs status

Error: cluster is not currently running on this node

[heat-admin@overcloud-controller-0 ~]$ pcs cluster start

Please authenticate yourself to the local pcsd

Username: ^CTraceback (most recent call last):

  File "/usr/sbin/pcs", line 219, in <module>

    main(sys.argv[1:])

  File "/usr/sbin/pcs", line 204, in main

    orig_argv, True

  File "/usr/lib/python2.7/site-packages/pcs/utils.py", line 782, in call_local_pcsd

    username = get_terminal_input('Username: ')

  File "/usr/lib/python2.7/site-packages/pcs/utils.py", line 1599, in get_terminal_input

    return raw_input("")

KeyboardInterrupt

[heat-admin@overcloud-controller-0 ~]$ sudo pcs cluster start

Starting Cluster...

[heat-admin@overcloud-controller-0 ~]$ exit

logout

Connection to 192.0.2.16 closed.

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.15

Last login: Tue Apr 19 20:46:26 2016 from 192.0.2.1

[heat-admin@overcloud-controller-1 ~]$ sudo pcs cluster status

Error: cluster is not currently running on this node

[heat-admin@overcloud-controller-1 ~]$ sudo pcs cluster start

Starting Cluster...

[heat-admin@overcloud-controller-1 ~]$ exit

logout

Connection to 192.0.2.15 closed.

[stack@undercloud-physical ~]$ ssh heat-admin@192.0.2.14

Last login: Tue Apr 19 20:46:45 2016 from 192.0.2.1

[heat-admin@overcloud-controller-2 ~]$ sudo pcs cluster status

Error: cluster is not currently running on this node

[heat-admin@overcloud-controller-2 ~]$ sudo pcs cluster start

Starting Cluster...

[heat-admin@overcloud-controller-2 ~]$ exit

logout

Connection to 192.0.2.14 closed.

(reverse-i-search)`de': heat ^Cployment-output-show 1da40f42-0f50-429d-bf67-060ba3527123 --all

[stack@undercloud-physical ~]$ templates-examples/templates-simple-7/upgrade.yaml

Running ...

openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /home/stack/templates-examples/templates-simple-7/network-environment.yaml --control-flavor control --compute-flavor compute --control-scale 3 --compute-scale 2 --ntp-server 10.5.26.10 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml

Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates

 

 

2016-04-19 21:10:47 [overcloud-ControllerAllNodesValidationDeployment-b3rcop24qnp7]: UPDATE_COMPLETE  Stack UPDATE completed successfully

2016-04-19 21:10:47 [0]: SIGNAL_COMPLETE  Unknown

2016-04-19 21:10:47 [ControllerDeployment]: SIGNAL_COMPLETE  Unknown

2016-04-19 21:10:47 [0]: SIGNAL_COMPLETE  Unknown

2016-04-19 21:10:48 [NetworkDeployment]: SIGNAL_COMPLETE  Unknown

Stack overcloud UPDATE_FAILED

Heat Stack update failed.

 

 

[stack@undercloud-physical ~]$ heat resource-list -n5 overcloud | grep -iv complet

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

| resource_name                              | physical_resource_id                          | resource_type                                     | resource_status | updated_time        | stack_name                                                                                    |

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

| UpdateWorkflow                             | 8f01e2a7-aeef-4b15-b1e3-f2b14e6ca591          | OS::TripleO::Tasks::UpdateWorkflow                | UPDATE_FAILED   | 2016-04-19T21:09:40 | overcloud                                                                                     |

| ControllerPacemakerUpgradeDeployment_Step1 | d9f96402-3668-46c7-b0c2-4cc8be763b96          | OS::Heat::SoftwareDeploymentGroup                 | UPDATE_FAILED   | 2016-04-19T21:09:42 | overcloud-UpdateWorkflow-tswg5622fcyz                                                         |

| 0                                          | 495f2f6d-0cc2-4c02-9b26-8d7015eb2a31          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T21:09:50 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

| 1                                          | 120d1d09-9b82-48db-8f9f-1f176e6974ee          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T21:09:53 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

| 2                                          | 2b8d1ebb-3004-4c17-8fb8-0f1d62bba8d4          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-19T21:09:56 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |

+--------------------------------------------+-----------------------------------------------+---------------------------------------------------+-----------------+---------------------+-----------------------------------------------------------------------------------------------+

[stack@undercloud-physical ~]$ heat deployment-output-show 2b8d1ebb-3004-4c17-8fb8-0f1d62bba8d4 --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n",

  "deploy_stderr": "",

  "deploy_status_code": 1

}

[stack@undercloud-physical ~]$ heat deployment-output-show 120d1d09-9b82-48db-8f9f-1f176e6974ee --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n",

  "deploy_stderr": "",

  "deploy_status_code": 1

}

[stack@undercloud-physical ~]$ heat deployment-output-show 495f2f6d-0cc2-4c02-9b26-8d7015eb2a31 --all

{

  "deploy_stdout": "httpd has stopped\n Clone Set: openstack-keystone-clone [openstack-keystone]\nopenstack-keystone has stopped\nredis has stopped\nmongod has stopped\nrabbitmq has stopped\nmemcached has stopped\ngalera has stopped\novercloud-controller-0: Stopping Cluster (pacemaker)...\novercloud-controller-2: Stopping Cluster (pacemaker)...\novercloud-controller-1: Stopping Cluster (pacemaker)...\novercloud-controller-1: Stopping Cluster (corosync)...\novercloud-controller-0: Stopping Cluster (corosync)...\novercloud-controller-2: Stopping Cluster (corosync)...\ninactive\nLoaded plugins: product-id, search-disabled-repos, subscription-manager\nNo package python-zaqarclient available.\n",

  "deploy_stderr": "Error: Nothing to do\n",

  "deploy_status_code": 1

}

[stack@undercloud-physical ~]$
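The lookup repeated by hand in the transcript above (run heat resource-list, pick out the failed OS::Heat::SoftwareDeployment rows, then pass each ID to heat deployment-output-show) could be scripted. The following is only a minimal sketch: the function name failed_deployment_ids is hypothetical, and it assumes the table layout shown above, where the second column is physical_resource_id, the third is resource_type and the fourth is resource_status.

```python
def failed_deployment_ids(table_text):
    """Return physical_resource_id values of failed SoftwareDeployment rows.

    Assumes the column order printed by `heat resource-list` above:
    resource_name | physical_resource_id | resource_type | resource_status | ...
    """
    ids = []
    for line in table_text.splitlines():
        line = line.strip()
        # Data rows start with '|'; border rows start with '+'.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) < 4:
            continue
        _name, resource_id, resource_type, status = cells[:4]
        # Skip the UpdateWorkflow and SoftwareDeploymentGroup parents;
        # only the per-node deployments carry deploy_stdout/deploy_stderr.
        if resource_type == "OS::Heat::SoftwareDeployment" and status.endswith("_FAILED"):
            ids.append(resource_id)
    return ids


# Two rows copied from the last resource-list output above.
sample = """
| UpdateWorkflow | 8f01e2a7-aeef-4b15-b1e3-f2b14e6ca591 | OS::TripleO::Tasks::UpdateWorkflow | UPDATE_FAILED | 2016-04-19T21:09:40 | overcloud |
| 0 | 495f2f6d-0cc2-4c02-9b26-8d7015eb2a31 | OS::Heat::SoftwareDeployment | CREATE_FAILED | 2016-04-19T21:09:50 | overcloud-UpdateWorkflow-tswg5622fcyz-ControllerPacemakerUpgradeDeployment_Step1-olmu2ngax7zn |
"""
print(failed_deployment_ids(sample))
```

Each returned ID is what the transcript feeds to heat deployment-output-show <id> --all.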

Comment 2 Andreas Karis 2016-04-20 23:02:06 UTC
Running into this in a virtual lab as well https://bugzilla.redhat.com/show_bug.cgi?id=1328621


10.4. Upgrading the Overcloud - Red Hat Customer Portal


Important
If the Overcloud stack failed during this step, log into one of your Controller nodes, run sudo pcs cluster start, then rerun openstack overcloud deploy on the director.


Still in a constant loop: update kills cluster -> manual restart of cluster -> update kills cluster -> ...
Contrary to what's documented, running sudo pcs cluster start on only one Controller node doesn't help:

[stack@undercloud ~]$ heat resource-list -n5 overcloud | grep -i failed

| UpdateWorkflow                             | 5105adcb-562d-4702-ae28-f6cb25f021c5          | OS::TripleO::Tasks::UpdateWorkflow                | UPDATE_FAILED   | 2016-04-20T22:03:08 | overcloud                                                                                     |

| ControllerPacemakerUpgradeDeployment_Step1 | b87ec833-35a1-412f-8c66-385d46cb67ae          | OS::Heat::SoftwareDeploymentGroup                 | UPDATE_FAILED   | 2016-04-20T22:03:13 | overcloud-UpdateWorkflow-bfb6swfof4jd                                                         |

| 0                                          | d767a806-c96a-4ede-b18f-0f60c7c60099          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-20T22:03:15 | overcloud-UpdateWorkflow-bfb6swfof4jd-ControllerPacemakerUpgradeDeployment_Step1-7mmjddgrjqvc |

| 2                                          | 821b5f05-ca81-4a4e-9524-a486dd53f0ad          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-20T22:03:17 | overcloud-UpdateWorkflow-bfb6swfof4jd-ControllerPacemakerUpgradeDeployment_Step1-7mmjddgrjqvc |

| 1                                          | 6195bdaa-3414-429c-9f44-92a1b97c94b3          | OS::Heat::SoftwareDeployment                      | CREATE_FAILED   | 2016-04-20T22:03:24 | overcloud-UpdateWorkflow-bfb6swfof4jd-ControllerPacemakerUpgradeDeployment_Step1-7mmjddgrjqvc |

[stack@undercloud ~]$ heat deployment-output-show 6195bdaa-3414-429c-9f44-92a1b97c94b3 --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n",

  "deploy_stderr": "",

  "deploy_status_code": 1

}

[stack@undercloud ~]$ heat deployment-output-show 821b5f05-ca81-4a4e-9524-a486dd53f0ad --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n",

  "deploy_stderr": "",

  "deploy_status_code": 1

}

[stack@undercloud ~]$ heat deployment-output-show d767a806-c96a-4ede-b18f-0f60c7c60099 --all

{

  "deploy_stdout": "OFFLINE: [ overcloud-controller-1 overcloud-controller-2 ]\nERROR: upgrade cannot start with some cluster nodes being offline\n",

  "deploy_stderr": "",

  "deploy_status_code": 1

}
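The last output above names the nodes that still need the cluster started. As a rough sketch only (the helper name offline_nodes is hypothetical, and the parsing assumes the "OFFLINE: [ ... ]" line format exactly as it appears in deploy_stdout here), the node list could be pulled out of the JSON like this:

```python
import json
import re


def offline_nodes(deployment_output_json):
    """List node names from 'OFFLINE: [ ... ]' lines in deploy_stdout."""
    output = json.loads(deployment_output_json)
    # A deployment without output prints `null`, which json.loads turns into None.
    stdout = (output or {}).get("deploy_stdout") or ""
    nodes = []
    for match in re.finditer(r"^OFFLINE: \[(.*?)\]", stdout, flags=re.MULTILINE):
        nodes.extend(match.group(1).split())
    return nodes


# JSON copied from the deployment-output-show result above.
sample = r'''{
  "deploy_stdout": "OFFLINE: [ overcloud-controller-1 overcloud-controller-2 ]\nERROR: upgrade cannot start with some cluster nodes being offline\n",
  "deploy_stderr": "",
  "deploy_status_code": 1
}'''
print(offline_nodes(sample))
```

An empty list does not prove the cluster is healthy: the other two outputs above report "cluster is not currently running on this node" without any OFFLINE line.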



So I restarted it on all 3 controllers, with the following result:

[stack@undercloud ~]$ heat deployment-output-show 7018206d-af70-4469-ab76-83ccfab56c33 --all

{

  "deploy_stdout": "httpd has stopped\n Clone Set: openstack-keystone-clone [openstack-keystone]\nopenstack-keystone has stopped\nredis has stopped\nmongod has stopped\nrabbitmq has stopped\nmemcached has stopped\ngalera has stopped\novercloud-controller-2: Stopping Cluster (pacemaker)...\novercloud-controller-1: Stopping Cluster (pacemaker)...\novercloud-controller-0: Stopping Cluster (pacemaker)...\novercloud-controller-1: Stopping Cluster (corosync)...\novercloud-controller-0: Stopping Cluster (corosync)...\novercloud-controller-2: Stopping Cluster (corosync)...\ninactive\nLoaded plugins: product-id, search-disabled-repos, subscription-manager\nThis system is registered to Red Hat Subscription Management, but is not receiving updates. You can use subscription-manager to assign subscriptions.\n",

  "deploy_stderr": "There are no enabled repos.\n Run \"yum repolist all\" to see the repos you have.\n You can enable repos with yum-config-manager --enable <repo>\n",

  "deploy_status_code": 1

}

[stack@undercloud ~]$ heat deployment-output-show aa435d18-1d75-44a5-8a0d-2bfbe8b69ef4 --all

{

  "deploy_stdout": "active\nactive\nactive\nactive\nactive\nactive\ninactive\nLoaded plugins: product-id, search-disabled-repos, subscription-manager\nThis system is registered to Red Hat Subscription Management, but is not receiving updates. You can use subscription-manager to assign subscriptions.\n",

  "deploy_stderr": "There are no enabled repos.\n Run \"yum repolist all\" to see the repos you have.\n You can enable repos with yum-config-manager --enable <repo>\n",

  "deploy_status_code": 1

}

[stack@undercloud ~]$ heat deployment-output-show 2358656c-c3fc-4633-b947-61375e4f177d --all

{

  "deploy_stdout": "active\nactive\nactive\ninactive\nLoaded plugins: product-id, search-disabled-repos, subscription-manager\nThis system is registered to Red Hat Subscription Management, but is not receiving updates. You can use subscription-manager to assign subscriptions.\n",

  "deploy_stderr": "There are no enabled repos.\n Run \"yum repolist all\" to see the repos you have.\n You can enable repos with yum-config-manager --enable <repo>\n",

  "deploy_status_code": 1

}



I fixed the missing repos like this:

[root@overcloud-controller-0 ~]# subscription-manager repos --enable=rhel-7-server-openstack-8-rpms --enable=rhel-7-server-openstack-8-director-rpms

Error: rhel-7-server-openstack-8-rpms is not a valid repository ID. Use --list option to see valid repositories.

Error: rhel-7-server-openstack-8-director-rpms is not a valid repository ID. Use --list option to see valid repositories.

[root@overcloud-controller-0 ~]# subscription-manager attach --pool=8a85f9814d368e7d014d44509b5e4eef

Successfully attached a subscription for: Employee SKU

[root@overcloud-controller-0 ~]# subscription-manager repos --enable=rhel-7-server-openstack-8-rpms --enable=rhel-7-server-openstack-8-director-rpms

Repository 'rhel-7-server-openstack-8-director-rpms' is enabled for this system.

Repository 'rhel-7-server-openstack-8-rpms' is enabled for this system.

[root@overcloud-controller-0 ~]#




The next upgrade run then leads again to:



[stack@undercloud ~]$ heat deployment-output-show 2eb51934-ea2c-425b-9e7f-8902ec10ea2d --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n",

  "deploy_stderr": "",

  "deploy_status_code": 1

}

[stack@undercloud ~]$ heat deployment-output-show 5c78bf87-575d-4dac-888a-8f92e0c3f0a5 --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n",

  "deploy_stderr": "",

  "deploy_status_code": 1

}

[stack@undercloud ~]$ heat deployment-output-show aec131cd-6cee-4e02-b5f3-13e9c0785808 --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n",

  "deploy_stderr": "",

  "deploy_status_code": 1

}

[stack@undercloud ~]$




===> On controller-0, ran pcs cluster start as suggested in the doc, then started a new upgrade deployment:



[stack@undercloud ~]$ heat deployment-output-show fb5b093c-0b80-4e40-8315-8f9bcf959187 --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n", 

  "deploy_stderr": "", 

  "deploy_status_code": 1

}

[stack@undercloud ~]$ heat deployment-output-show 45a711f1-fd27-45fd-9284-e03b1d6f5f36 --all

{

  "deploy_stdout": "OFFLINE: [ overcloud-controller-1 overcloud-controller-2 ]\nERROR: upgrade cannot start with some cluster nodes being offline\n", 

  "deploy_stderr": "", 

  "deploy_status_code": 1

}

[stack@undercloud ~]$ heat deployment-output-show d134deac-35d8-416f-969c-add72812a876 --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n", 

  "deploy_stderr": "", 

  "deploy_status_code": 1

}



Next update run:


[stack@undercloud ~]$ heat deployment-output-show e2ddc8c5-af63-425e-840b-e760bd413681 --all

{

  "deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n", 

  "deploy_stderr": "", 

  "deploy_status_code": 1

}

[stack@undercloud ~]$ heat deployment-output-show 71512745-2e0b-4e39-9ede-fd49535449fe --all

null

[stack@undercloud ~]$ heat deployment-output-show 9181865e-9542-43b7-9065-e7e5e90f6760 --all

null

[stack@undercloud ~]$ 


And yet again, the cluster is no longer running on any of the nodes.
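The outputs in this comment and in the description show three distinct failure signatures: cluster nodes offline, no enabled repos, and yum finding nothing to install. A small sketch for telling them apart follows. The labels and the classify helper are hypothetical; the marker strings are copied from the deploy_stdout and deploy_stderr values above, and this is not an exhaustive list of ways the step can fail.

```python
import json

# (label, marker) pairs; labels are illustrative, markers come from the outputs above.
SIGNATURES = [
    ("cluster_offline", "upgrade cannot start with some cluster nodes being offline"),
    ("no_enabled_repos", "There are no enabled repos."),
    ("yum_nothing_to_do", "Error: Nothing to do"),
]


def classify(deployment_output_json):
    """Map one `heat deployment-output-show <id> --all` result to a label."""
    output = json.loads(deployment_output_json)
    if output is None:
        # The command printed `null`: the deployment produced no output.
        return "no_output"
    text = (output.get("deploy_stdout") or "") + (output.get("deploy_stderr") or "")
    for label, marker in SIGNATURES:
        if marker in text:
            return label
    return "unknown"


offline = r'{"deploy_stdout": "Error: cluster is not currently running on this node\nERROR: upgrade cannot start with some cluster nodes being offline\n", "deploy_stderr": "", "deploy_status_code": 1}'
no_repos = r'{"deploy_stdout": "active\ninactive\n", "deploy_stderr": "There are no enabled repos.\n", "deploy_status_code": 1}'
nothing = r'{"deploy_stdout": "inactive\nNo package python-zaqarclient available.\n", "deploy_stderr": "Error: Nothing to do\n", "deploy_status_code": 1}'

for result in (offline, no_repos, nothing, "null"):
    print(classify(result))
```

Only the first signature is addressed by sudo pcs cluster start; the other two point at repository or subscription configuration on the Controller nodes.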

Comment 3 Andreas Karis 2016-04-21 16:08:30 UTC
This is a 3-controller, 2-compute-node scenario.

Comment 5 Michele Baldessari 2017-10-03 15:24:14 UTC
Hi Andreas,

Apologies for the late answer here. Can you still reproduce this with an updated OSP8? The link at c#4 is a 404. Am I right to presume that we are talking about section 3.4.4: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/8/html-single/upgrading_red_hat_openstack_platform/#sect-Major-Upgrading_the_Overcloud-Controller

So you ran:
 The deploy command with major-upgrade-pacemaker-init.yaml environment file.
 Then upgrade-non-controller.sh on each Object Storage node.
 Then with major-upgrade-pacemaker.yaml environment file, it fails.

Do you have an environment where this happens that we can take a look at?

Note that it seems you have not configured any repos on the overcloud.

Comment 6 Andreas Karis 2017-10-03 15:31:23 UTC
Hi,

"2016-04-21" ... I think this happened somewhere in a lab, no case attached. I am closing this as not a bug :-)

- Andreas