Bug 1477962
Summary: | OSP11 -> OSP12 upgrade: Ensure non-controller are usable after upgrade and before converge. | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> |
Component: | openstack-tripleo-heat-templates | Assignee: | Marios Andreou <mandreou> |
Status: | CLOSED ERRATA | QA Contact: | Marius Cornea <mcornea> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 12.0 (Pike) | CC: | dbecker, jschluet, lyarwood, mandreou, mbracho, mbultel, mburns, morazi, rhel-osp-director-maint, sathlang, sclewis |
Target Milestone: | beta | Keywords: | Triaged |
Target Release: | 12.0 (Pike) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-tripleo-heat-templates-7.0.1-0.20170928105409.el7ost python-tripleoclient-7.3.1-0.20170925220840.f114a61.el7ost openstack-tripleo-common-7.6.1-0.20170926174320.el7ost | Doc Type: | If docs needed, set a value |
Last Closed: | 2017-12-13 21:48:30 UTC | Type: | Bug |
Bug Blocks: | 1399762, 1477770 |
Description
Marius Cornea
2017-08-03 10:35:20 UTC
Current proposal (duplicated from the upstream bug for convenience); the reviews are also added to the trackers above.

With the help of a utility function in https://review.openstack.org/#/c/491749/ (python-tripleoclient), we can use the upgrade_tasks playbook generated by tripleo-heat-templates at https://review.openstack.org/#/c/490848/ (note: this depends on a few shardy tht reviews, see the shortlog). So, in the upgrade-non-controller.sh script, we add the download and execution of both the upgrade_tasks and deploy_steps playbooks with https://review.openstack.org/#/c/490847/ (tripleo-common).

The generated playbooks look like https://paste.fedoraproject.org/paste/gUi5Ckq2qoTT~ed5kItxRw/raw (while it lasts). It seems that most of what we need for the compute and swift nodes is in the upgrade_tasks (e.g. stopping openstack-nova-compute, which we recently had to add to tripleo_upgrade_node.sh).

Reviews:

(tripleo-common): https://review.openstack.org/#/c/490847/ "Download and run upgrade/deploy_steps_playbooks for upgrade"
Depends-On: --> (tripleo-heat-templates): https://review.openstack.org/#/c/490848/ "Also write an upgrade_(batch)_tasks playbook" (& see shortlog!)
Depends-On: --> (python-tripleo-client): https://review.openstack.org/#/c/491749/ "Adds when in upgrade_tasks playbook written by config download"

I have also just posted https://review.openstack.org/498776 to disable the puppet config run and the related workarounds in the tripleo_upgrade_node.sh script. If testing, you'll also need to apply this to your tripleo-heat-templates before running the major-upgrade-composable-steps-docker.yaml stage of the overcloud upgrade.
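For anyone testing, here is a minimal sketch of applying a Gerrit review such as 498776 to an installed tree. It assumes the Gerrit patch endpoint returns the change base64-encoded and that GNU patch (for --dry-run) is available. The function names gerrit_patch_url and apply_gerrit_change are hypothetical and not part of any TripleO tooling.

```shell
#!/bin/bash
# Hypothetical helpers for testing a review; not part of TripleO.

gerrit_patch_url() {
    # $1: Gerrit change number, e.g. 498776
    printf 'https://review.openstack.org/changes/%s/revisions/current/patch?download\n' "$1"
}

apply_gerrit_change() {
    # $1: Gerrit change number, $2: directory to patch
    local change="$1" target="$2" patch_file
    patch_file="$(mktemp)" || return 1

    # Gerrit serves the patch base64-encoded, so decode it before use.
    curl -sf "$(gerrit_patch_url "$change")" | base64 -d > "$patch_file"
    if [ ! -s "$patch_file" ]; then
        echo "could not download change $change" >&2
        rm -f "$patch_file"
        return 1
    fi

    # Dry run first so a patch that does not apply leaves the tree untouched.
    if sudo patch --dry-run -d "$target" -p1 < "$patch_file" > /dev/null; then
        sudo patch -d "$target" -p1 < "$patch_file"
    else
        echo "change $change does not apply cleanly to $target" >&2
        rm -f "$patch_file"
        return 1
    fi
    rm -f "$patch_file"
}
```

Usage would be along the lines of: apply_gerrit_change 498776 /usr/share/openstack-tripleo-heat-templates/. The dry run is an addition for safety; nothing in this report says it is required.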
Adding to trackers, and for testing:

# tripleo-heat-templates: https://review.openstack.org/#/c/498776/ "Remove puppet run and workarounds from tripleo_upgrade_node.sh"
curl https://review.openstack.org/changes/498776/revisions/current/patch?download | base64 -d | sudo patch -d /usr/share/openstack-tripleo-heat-templates/ -p1

So I managed to get the RoleConfig output after applying the following patch and running the deploy command with the --setup-heat-outputs option. I think we should include this in the major-upgrade-composable-steps-docker.yaml step so we don't have to add an extra step to the upgrade procedure.

curl -4 https://review.openstack.org/changes/495658/revisions/current/patch?download | base64 -d | sudo patch -d /usr/lib/python2.7/site-packages/ -p1 -f

#!/bin/bash
timeout 180m openstack overcloud deploy \
--setup-heat-outputs \
--templates /usr/share/openstack-tripleo-heat-templates \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/services-docker/sahara.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/hostnames.yml \
-e /home/stack/virt/debug.yaml \
-e /home/stack/virt/nodes_data.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
-e /home/stack/docker-osp12.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml \

After this I was able to run upgrade-non-controller.sh --upgrade compute-0, which failed with the error below.

A quick note here: /usr/bin/tripleo-ansible-inventory --list takes around 2 minutes for a basic 1 controller + 1 compute deployment, so you get the impression that the command is stuck at:

Wed Aug 30 11:03:04 EDT 2017 upgrade-non-controller.sh Starting the upgrade steps playbook run for compute-0 from compute-0/tripleo-bVAAT_-config/

In the end the playbook fails with the following error:

TASK [Ensure empty directory: emptying.] ****************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: ('2.5.0-14' in '{{ovs_version.stdout}}' or ovs_packaging_issue|succeeded) and (step == 2)
fatal: [192.168.24.13]: FAILED! => {"failed": true, "msg": "The conditional check '('2.5.0-14' in '{{ovs_version.stdout}}' or ovs_packaging_issue|succeeded) and (step == 2)' failed. The error was: error while evaluating conditional (('2.5.0-14' in '{{ovs_version.stdout}}' or ovs_packaging_issue|succeeded) and (step == 2)): 'dict object' has no attribute 'stdout'\n\nThe error appears to have been in '/home/stack/compute-0/tripleo-bVAAT_-config/Compute/upgrade_tasks.yaml': line 42, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n- block:\n - file:\n ^ here\n"}
to retry, use: --limit @/home/stack/compute-0/tripleo-bVAAT_-config/upgrade_steps_playbook.retry

This is the complete output. You can see that 'Check openvswitch version' is skipped, hence the "'dict object' has no attribute 'stdout'" error regarding ovs_version.stdout:

PLAY [overcloud] ****************************************

TASK [Gathering Facts] ****************************************
ok: [192.168.24.13]

TASK [include] ****************************************
included: /home/stack/compute-0/tripleo-bVAAT_-config/upgrade_steps_tasks.yaml for 192.168.24.13
included: /home/stack/compute-0/tripleo-bVAAT_-config/upgrade_steps_tasks.yaml for 192.168.24.13
included: /home/stack/compute-0/tripleo-bVAAT_-config/upgrade_steps_tasks.yaml for 192.168.24.13
included: /home/stack/compute-0/tripleo-bVAAT_-config/upgrade_steps_tasks.yaml for 192.168.24.13
included: /home/stack/compute-0/tripleo-bVAAT_-config/upgrade_steps_tasks.yaml for 192.168.24.13

TASK [include] ****************************************
skipping: [192.168.24.13]

TASK [include] ****************************************
included: /home/stack/compute-0/tripleo-bVAAT_-config/Compute/upgrade_tasks.yaml for 192.168.24.13

TASK [Check if neutron_ovs_agent is deployed] ****************************************
changed: [192.168.24.13]

TASK [Check yum for rpm-python present] ****************************************
skipping: [192.168.24.13]

TASK [Fail when rpm-python wasn't present] ****************************************
skipping: [192.168.24.13]

TASK [PreUpgrade step0,validation: Check service neutron-openvswitch-agent is running] ****************************************
skipping: [192.168.24.13]

TASK [Stop neutron_ovs_agent service] ****************************************
skipping: [192.168.24.13]

TASK [Stop snmp service] ****************************************
skipping: [192.168.24.13]

TASK [Check openvswitch version.] ****************************************
skipping: [192.168.24.13]

TASK [Check openvswitch packaging.] ****************************************
skipping: [192.168.24.13]

TASK [Ensure empty directory: emptying.] ****************************************
[WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: ('2.5.0-14' in '{{ovs_version.stdout}}' or ovs_packaging_issue|succeeded) and (step == 2)
fatal: [192.168.24.13]: FAILED! => {"failed": true, "msg": "The conditional check '('2.5.0-14' in '{{ovs_version.stdout}}' or ovs_packaging_issue|succeeded) and (step == 2)' failed. The error was: error while evaluating conditional (('2.5.0-14' in '{{ovs_version.stdout}}' or ovs_packaging_issue|succeeded) and (step == 2)): 'dict object' has no attribute 'stdout'\n\nThe error appears to have been in '/home/stack/compute-0/tripleo-bVAAT_-config/Compute/upgrade_tasks.yaml': line 42, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n- block:\n - file:\n ^ here\n"}
to retry, use: --limit @/home/stack/compute-0/tripleo-bVAAT_-config/upgrade_steps_playbook.retry

PLAY RECAP ****************************************
192.168.24.13 : ok=8 changed=1 unreachable=0 failed=1

We also need https://review.openstack.org/#/c/499540/

mcornea ++ adding to trackers

Adding another review for allowing the upgrade tasks to run between steps: https://review.openstack.org/#/c/499517/

Also, I filed a BZ for tripleo-inventory being too slow: https://bugzilla.redhat.com/show_bug.cgi?id=1487759

Remaining issues that we need to track in this bug:

- set up the RoleConfig output during major-upgrade-composable-steps so we don't have to run an additional step with the --setup-heat-outputs option

- cache the tripleo-ansible-inventory so we don't waste 5 minutes per non-controller node waiting for the output of tripleo-ansible-inventory

(In reply to Marius Cornea from comment #13)
> Remaining issues that we need to track in this bug:
>
> - set up RoleConfig output during major-upgrade-composable-steps so we
> don't have to run an additional step with --setup-heat-outputs option
>
> - cache the tripleo-ansible-inventory so we don't waste 5 minutes per non
> controller node waiting for the output of tripleo-ansible-inventory

The slow inventory issue was addressed by https://review.openstack.org/#/c/501603/

In addition, we need to address upgrading non-controller nodes for split stack deployments.

The RoleConfig output issue is being tracked in bug 1490425.

Remaining issues to be addressed by this bug:

- upgrading non-controller nodes on split stack deployments

(In reply to Marius Cornea from comment #15)
> RoleConfig output issue is being tracked in bug 1490425
>
> Remaining issues to be addressed by this bug:
>
> - upgrading non controller nodes on split stack deployments

We actually have a different BZ (bug 1474697) filed for split stack deployments, so I think this bug can be moved to POST, as all the patches attached to it are merged to stable/pike.

(In reply to Marius Cornea from comment #16)
> (In reply to Marius Cornea from comment #15)
> > RoleConfig output issue is being tracked in bug 1490425
> >
> > Remaining issues to be addressed by this bug:
> >
> > - upgrading non controller nodes on split stack deployments
>
> We actually have a different BZ (bug 1474697) filed for split stack
> deployments so I think this bug can be moved to POST as all the patches
> attached to it are merged to stable/pike.
Thanks mcornea. I updated the trackers to point to stable/pike (the last two merged before pike was branched, and I checked that they are in stable/pike tripleo-heat-templates and tripleo-common for https://review.openstack.org/#/c/490848/ and https://review.openstack.org/#/c/490847/ respectively). I'll bring this up on our call later and we can move to POST.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462
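For reference, the Ansible warning quoted in this bug ("when statements should not include jinja2 templating delimiters") can be checked for in a generated playbook before running it. The following is a minimal sketch, assuming each condition sits on the same line as its "when:" key; the function name find_templated_when is hypothetical and not part of any TripleO or Ansible tooling.

```shell
#!/bin/bash
# Hypothetical helper: list "when:" lines that still contain Jinja2 delimiters.

find_templated_when() {
    # $1: path to a generated tasks file, e.g. Compute/upgrade_tasks.yaml
    local playbook="$1"
    # -n prints the line number of each match; the pattern matches "when:"
    # followed later on the same line by "{{" or "{%".
    # grep exits non-zero when nothing matches, which callers can test for.
    grep -nE 'when:.*(\{\{|\{%)' "$playbook"
}
```

For example, find_templated_when compute-0/tripleo-bVAAT_-config/Compute/upgrade_tasks.yaml would print any offending lines with their line numbers. A flagged condition can usually be written without the delimiters, for instance '2.5.0-14' in ovs_version.stdout|default(''), which also tolerates a skipped registering task. This report does not state whether that matches the fix that merged.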