Bug 1506283
| Summary: | OSP10 minor update fails when using no custom nic profiles | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Gregory Charot <gcharot> |
| Component: | openstack-tripleo-heat-templates | Assignee: | anil venkata <vkommadi> |
| Status: | CLOSED ERRATA | QA Contact: | Marius Cornea <mcornea> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 10.0 (Newton) | CC: | dbecker, emacchi, mbultel, mburns, mcornea, morazi, rhel-osp-director-maint, sathlang, slinaber, vkommadi, yprokule |
| Target Milestone: | z7 | Keywords: | Triaged, ZStream |
| Target Release: | 10.0 (Newton) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-heat-templates-5.3.8-1.el7ost | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-02-27 16:50:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Gregory said that commit https://github.com/openstack/tripleo-heat-templates/commit/bce61783bc175e98b535c678d90829344dab5c47#diff-002d345b79ce06e07a34abc8da5ade5f fixes the issue. This commit is part of openstack-tripleo-heat-templates-5.3.3-1.el7ost.

Hi, so here is the whole story. Hold tight.

In the current update_network function (which does not work in this case), the os-net-config command is run unconditionally, whether or not the package was upgraded. This is the root cause of the problem here, as /etc/os-net-config/config.json is empty.

The change mentioned as solving the problem makes it work because it has the side effect of removing the special os-net-config handling from the update_network function[2], which prevents any unconditional run of os-net-config. But as I said, it is a side effect: the special os-net-config treatment introduced by the original change[3] has been erased, which may not be a good thing. In the patch that "solves" the problem[4], the os-net-config special treatment is kept only for the osp9->osp10 upgrade, and only on the controllers.

So I think we should re-include the special handling of os-net-config for everything, using the new function[5], as it checks whether the package has been upgraded. To be on the safe side, we should also check that the configuration file is non-empty, even when an upgrade happened. I am going to post a review in that direction.

[1] https://github.com/openstack/tripleo-heat-templates/blob/9f8ba2c052e04c1ba8db756a48181a54c9cd8f68/extraconfig/tasks/pacemaker_common_functions.sh#L334
[2] https://github.com/openstack/tripleo-heat-templates/blob/bce61783bc175e98b535c678d90829344dab5c47/extraconfig/tasks/pacemaker_common_functions.sh#L375-L378
[3] https://github.com/openstack/tripleo-heat-templates/commit/9f8ba2c052e04c1ba8db756a48181a54c9cd8f68#diff-002d345b79ce06e07a34abc8da5ade5fR326
[4] https://github.com/openstack/tripleo-heat-templates/commit/bce61783bc175e98b535c678d90829344dab5c47#diff-002d345b79ce06e07a34abc8da5ade5f
[5] https://github.com/openstack/tripleo-heat-templates/blob/bce61783bc175e98b535c678d90829344dab5c47/extraconfig/tasks/pacemaker_common_functions.sh#L350-L373

This review should be applied on top of the review referenced in [1], namely [2].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1434621
[2] https://review.openstack.org/#/c/474967/

Verified on openstack-tripleo-heat-templates-5.3.8-1.el7ost.noarch. Minor update successfully completed on a deployment without network isolation.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0364
Description of problem:
When deploying OSP10 with no network customisation, the minor update fails.

Version-Release number of selected component (if applicable):

    cat /etc/rhosp-release
    Red Hat OpenStack Platform release 10.0 (Newton)

    (undercloud) rpm -qa | grep tripleo
    openstack-tripleo-0.0.8-0.2.4de13b3git.el7ost.noarch
    openstack-tripleo-heat-templates-compat-2.0.0-58.el7ost.noarch
    openstack-tripleo-validations-5.1.2-1.el7ost.noarch
    openstack-tripleo-image-elements-5.3.0-3.el7ost.noarch
    openstack-tripleo-puppet-elements-5.3.0-1.el7ost.noarch
    openstack-tripleo-heat-templates-5.3.0-6.el7ost.noarch
    python-tripleoclient-5.4.3-1.el7ost.noarch
    puppet-tripleo-5.6.1-4.el7ost.noarch
    openstack-tripleo-ui-1.2.0-1.el7ost.noarch
    openstack-tripleo-common-5.4.2-4.el7ost.noarch

    (undercloud) rpm -qa | grep director
    rhosp-director-images-ipa-10.0-20170920.1.el7ost.noarch
    rhosp-director-images-10.0-20170920.1.el7ost.noarch

    rhos-release osp10

How reproducible:
Always, in my environment.

Steps to Reproduce:
1. openstack overcloud deploy --templates --ntp-server x.x.x.x --control-scale 1 --compute-scale 2 --neutron-tunnel-types vxlan --neutron-network-type vxlan --control-flavor control --compute-flavor compute
2. openstack overcloud update stack -i overcloud

Actual results:
Update fails on the compute node(s).

    openstack stack failures list overcloud
    overcloud.Controller.0:
      resource_type: OS::TripleO::Controller
      physical_resource_id: 0f6f365f-f37a-43d6-810a-a309a5f29883
      status: UPDATE_FAILED
      status_reason: |
        UPDATE aborted
    overcloud.Compute.1.UpdateDeployment:
      resource_type: OS::Heat::SoftwareDeployment
      physical_resource_id: f71762af-c166-4396-a725-638e58ed5ede
      status: UPDATE_FAILED
      status_reason: |
        Error: resources.UpdateDeployment: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
      deploy_stdout: |
        Started yum_update.sh on server 5db9b0c7-513f-493a-9781-9d628be6bdb0 at Tue Oct 17 18:36:28 UTC 2017
        Checking openstack-nova-migration is installed
        Loaded plugins: product-id, search-disabled-repos, subscription-manager
        This system is not registered with an entitlement server. You can use subscription-manager to register.
        Metadata Cache Created
        Checking for ceph-osd dependency issues
        ceph-osd package is available from an enabled repo
        Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
        yum update os-net-config return code: 0
        ERROR: os-net-config configuration failed
      deploy_stderr: |
        [2017/10/17 06:37:00 PM] [INFO] Using config file at: /etc/os-net-config/config.json
        [2017/10/17 06:37:00 PM] [INFO] Using mapping file at: /etc/os-net-config/mapping.yaml
        [2017/10/17 06:37:00 PM] [INFO] Ifcfg net config provider created.
        Traceback (most recent call last):
          File "/usr/bin/os-net-config", line 10, in <module>
            sys.exit(main())
          File "/usr/lib/python2.7/site-packages/os_net_config/cli.py", line 157, in main
            iface_array = yaml.load(cf.read()).get("network_config")
        AttributeError: 'NoneType' object has no attribute 'get'
    overcloud.Compute.0.UpdateDeployment:
      resource_type: OS::Heat::SoftwareDeployment
      physical_resource_id: ae63b2e7-afc4-40e6-86bc-cfe38e8b8f59
      status: UPDATE_FAILED
      status_reason: |
        Error: resources.UpdateDeployment: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
      deploy_stdout: |
        Started yum_update.sh on server 4a796a30-986f-4858-806f-6e3a9bd94f93 at Tue Oct 17 18:05:29 UTC 2017
        Checking openstack-nova-migration is installed
        Loaded plugins: product-id, search-disabled-repos, subscription-manager
        This system is not registered with an entitlement server. You can use subscription-manager to register.
        Metadata Cache Created
        Checking for ceph-osd dependency issues
        ceph-osd package is available from an enabled repo
        Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
        yum update os-net-config return code: 0
        ERROR: os-net-config configuration failed
      deploy_stderr: |
        [2017/10/17 06:06:03 PM] [INFO] Using config file at: /etc/os-net-config/config.json
        [2017/10/17 06:06:03 PM] [INFO] Using mapping file at: /etc/os-net-config/mapping.yaml
        [2017/10/17 06:06:03 PM] [INFO] Ifcfg net config provider created.
        Traceback (most recent call last):
          File "/usr/bin/os-net-config", line 10, in <module>
            sys.exit(main())
          File "/usr/lib/python2.7/site-packages/os_net_config/cli.py", line 157, in main
            iface_array = yaml.load(cf.read()).get("network_config")
        AttributeError: 'NoneType' object has no attribute 'get'

The relevant error is "ERROR: os-net-config configuration failed", caused by yaml.load(cf.read()).get("network_config"): yaml.load returns None for an empty file, so the .get call raises the AttributeError.

Expected results:
Update completes successfully.

Additional info:
On the compute nodes /etc/os-net-config/config.json is empty; the file exists on the controller. os-net-config fails to run because config.json is empty.

Looking at yum_update.sh, it calls update_network, which is declared in pacemaker_common_functions.sh:

    os-net-config -c /etc/os-net-config/config.json -v --detailed-exit-codes
    RETVAL=$?
    if [[ $RETVAL == 2 ]]; then
        echo "os-net-config: interface configuration files updated successfully"
    elif [[ $RETVAL != 0 ]]; then
        echo "ERROR: os-net-config configuration failed"
        exit $RETVAL
    fi
    set -e

We can see the same error message, "ERROR: os-net-config configuration failed", in the stack failures list.

If I instead use
https://github.com/openstack/tripleo-heat-templates/blob/stable/newton/extraconfig/tasks/yum_update.sh
and
https://github.com/openstack/tripleo-heat-templates/blob/stable/newton/extraconfig/tasks/pacemaker_common_functions.sh
the problem "goes away": config.json is still empty on the computes, but update_network does not call os-net-config.

If using a custom network config, the problem does not appear, as the config.json files are not empty.
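
The guard proposed in the comments (run os-net-config only when the package was actually upgraded, and only when the configuration file is non-empty) could be sketched roughly as follows. This is a minimal sketch, not the upstream patch: the function names config_is_usable and guarded_update_network, the two version arguments, and the OS_NET_CONFIG override variable are hypothetical names introduced for illustration. Only the os-net-config invocation and its exit-code handling are taken from the update_network snippet quoted in this report.

```shell
#!/bin/sh
# Sketch only: function names, arguments and OS_NET_CONFIG are hypothetical,
# not the upstream tripleo-heat-templates code.

# Succeeds when the file exists, is non-empty, and holds more than whitespace.
config_is_usable() {
    [ -s "$1" ] && grep -q '[^[:space:]]' "$1"
}

# $1: os-net-config package version before the yum update
# $2: os-net-config package version after the yum update
# $3: configuration file (defaults to the path from this report)
# OS_NET_CONFIG: optional override of the binary to run (assumption, for dry runs)
guarded_update_network() {
    old_version="$1"
    new_version="$2"
    config_file="${3:-/etc/os-net-config/config.json}"

    # Guard 1: nothing to do if the package was not upgraded.
    if [ "$old_version" = "$new_version" ]; then
        echo "os-net-config was not upgraded, skipping"
        return 0
    fi

    # Guard 2: an empty config is what triggers the AttributeError traceback.
    if ! config_is_usable "$config_file"; then
        echo "WARNING: $config_file is missing or empty, skipping os-net-config"
        return 0
    fi

    retval=0
    "${OS_NET_CONFIG:-os-net-config}" -c "$config_file" -v --detailed-exit-codes || retval=$?
    if [ "$retval" -eq 2 ]; then
        echo "os-net-config: interface configuration files updated successfully"
    elif [ "$retval" -ne 0 ]; then
        echo "ERROR: os-net-config configuration failed"
        return "$retval"
    fi
    return 0
}
```

One way to feed it, again as an assumption rather than what the templates do, is to capture the output of `rpm -q os-net-config` before and after the yum update and pass both values as the first two arguments. With an empty config.json on the computes the function then returns 0 without ever invoking os-net-config.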