OSP11 -> OSP12 upgrade: pre-upgrade validations are preventing a re-run of the upgrade-non-controller.sh script to upgrade a compute node after a failed attempt
Re-running a non-controller upgrade after a failed attempt can fail during service validation if the earlier attempt left services stopped. To avoid this failure, skip service validation by passing the option "--skip-tags validation" to the Ansible invocation through --ansible-opts.
For example:
upgrade-non-controller.sh --upgrade compute-0 --ansible-opts "--skip-tags validation"
Description of problem:
OSP11 -> OSP12 upgrade: unable to re-run the upgrade-non-controller.sh script to upgrade a compute node after a failed attempt because the pre-upgrade validations are failing.
Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-7.0.3-16.el7ost.noarch
openstack-tripleo-common-7.6.3-6.el7ost.noarch
How reproducible:
100%
Steps to Reproduce:
1. Deploy OSP11
2. Upgrade to OSP12
3. Complete major-upgrade-composable-steps-docker.yaml step successfully
4. Run upgrade-non-controller.sh --upgrade compute-0 which fails due to unreachable repositories:
TASK [Upgrade os-net-config] ******************************************************************************************************************************************************************************************************************
fatal: [192.168.24.11]: FAILED! => {"changed": true, "failed": true, "msg": "http://rhos-qe-mirror-brq.usersys.redhat.com/rcm-guest/puddles/OpenStack/12.0-RHEL-7/latest/RH7-RHOS-12.0/x86_64/os/Packages/python2-pbr-3.1.1-1.el7ost.noarch.rpm: [Errno -1] Package does not match intended download. Suggestion: run yum --enablerepo=rhelosp-12.0-puddle clean metadata\nTrying other mirror.\nhttp://rhos-qe-mirror-brq.usersys.redhat.com/rcm-guest/puddles/OpenStack/12.0-RHEL-7/latest/RH7-RHOS-12.0/x86_64/os/Packages/os-net-config-7.3.1-1.el7ost.noarch.rpm: [Errno -1] Package does not match intended download. Suggestion: run yum --enablerepo=rhelosp-12.0-puddle clean metadata\nTrying other mirror.\n\n\nError downloading packages:\n os-net-config-7.3.1-1.el7ost.noarch: [Errno 256] No more mirrors to try.\n python2-pbr-3.1.1-1.el7ost.noarch: [Errno 256] No more mirrors to try.\n\n", "rc": 1, "results": ["Loaded plugins: product-id, search-disabled-repos, subscription-manager\nThis system is not registered with an entitlement server. 
You can use subscription-manager to register.\nResolving Dependencies\n--> Running transaction check\n---> Package os-net-config.noarch 0:6.1.0-2.el7ost will be updated\n---> Package os-net-config.noarch 0:7.3.1-1.el7ost will be an update\n--> Processing Dependency: python-pbr >= 2.0.0 for package: os-net-config-7.3.1-1.el7ost.noarch\n--> Running transaction check\n---> Package python-pbr.noarch 0:1.10.0-2.el7ost will be obsoleted\n---> Package python2-pbr.noarch 0:3.1.1-1.el7ost will be obsoleting\n--> Finished Dependency Resolution\n\nDependencies Resolved\n\n================================================================================\n Package Arch Version Repository Size\n================================================================================\nInstalling:\n python2-pbr noarch 3.1.1-1.el7ost rhelosp-12.0-puddle 191 k\n replacing python-pbr.noarch 1.10.0-2.el7ost\nUpdating:\n os-net-config noarch 7.3.1-1.el7ost rhelosp-12.0-puddle 260 k\n\nTransaction Summary\n================================================================================\nInstall 1 Package\nUpgrade 1 Package\n\nTotal download size: 451 k\nDownloading packages:\nDelta RPMs disabled because /usr/bin/applydeltarpm not installed.\n"]}
5. Fix the repository issue
6. Re-run upgrade-non-controller.sh --upgrade compute-0
Actual results:
The command fails the pre-upgrade validation check for the neutron-openvswitch-agent service:
TASK [PreUpgrade step0,validation: Check service neutron-openvswitch-agent is running] ********************************************************************************************************************************************************
fatal: [192.168.24.11]: FAILED! => {"changed": true, "cmd": "/usr/bin/systemctl show 'neutron-openvswitch-agent' --property ActiveState | grep '\\bactive\\b'", "delta": "0:00:00.008534", "end": "2017-11-29 12:26:33.244232", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2017-11-29 12:26:33.235698", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
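The failing check above pipes `systemctl show` output through grep, so a stopped service produces no match and a non-zero return code. A minimal sketch of that behavior, applying the same grep pattern to sample ActiveState lines rather than to a live system (the is_active helper and the sample values are illustrative, not part of the upgrade tooling; GNU grep is assumed for \b):

```shell
#!/usr/bin/env bash
# Illustrative helper: applies the grep from the validation task to one line
# of the form printed by `systemctl show <unit> --property ActiveState`.
is_active() {
    printf '%s\n' "$1" | grep -q '\bactive\b'
}

# Sample values only; on a real node these come from systemctl.
for state in "ActiveState=active" "ActiveState=inactive" "ActiveState=failed"; do
    if is_active "$state"; then
        echo "$state -> validation passes"
    else
        echo "$state -> validation fails (non-zero return code)"
    fi
done
```

The word boundaries matter: without them, "inactive" would match because it contains "active", and the check would pass for a stopped service.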
Expected results:
We should have a way to disable the pre-upgrade validation checks for the upgrade-non-controller script to allow re-running it after a failed attempt.
Additional info:
Workaround: ssh to the compute node and manually start the services which were stopped during the failed upgrade attempt:
systemctl start neutron-openvswitch-agent
Then re-run the upgrade-non-controller.sh script.
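If more than one service was left stopped, the manual workaround above can be scripted. This is only a sketch, assuming it is run as root on the compute node and that the affected unit names are passed as arguments; the start_if_inactive function is hypothetical and not part of the upgrade tooling:

```shell
#!/usr/bin/env bash
# Sketch: start each unit named on the command line only if it is not
# already active. With no arguments, nothing is started.
start_if_inactive() {
    local unit="$1"
    if systemctl is-active --quiet "$unit"; then
        echo "$unit: already active"
    else
        echo "$unit: starting"
        systemctl start "$unit"
    fi
}

for unit in "$@"; do
    start_if_inactive "$unit"
done
```

For the failure in this report, the only argument needed would be neutron-openvswitch-agent.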
FWIW, I don't think this is a blocker for 12, but it would be nice to get it into the first async release, please. The upstream patch posted today needs some more testing.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:2331