Bug 1324691
| Summary: | rhel-osp-director: deploy or scaling operations implicitly call update after upgrade | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Alexander Chuzhoy <sasha> |
| Component: | python-tripleoclient | Assignee: | Jiri Stransky <jstransk> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Arik Chernetsky <achernet> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 8.0 (Liberty) | CC: | aschultz, augol, dbecker, dmacpher, hbrock, jcoufal, jschluet, jslagle, jstransk, kbasil, mburns, mcornea, morazi, rhel-osp-director-maint, sathlang |
| Target Milestone: | async | | |
| Target Release: | 8.0 (Liberty) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Known Issue |
| Doc Text: |
Cause: The upgrade process currently sets a variable in the heat stack which does not get cleared on completion.
Consequence: All deploy or scale commands after an upgrade will trigger the update workflow, which runs yum updates and various other processes that are not expected in deploy/scale workflows.
Workaround (if any): On the first deploy or scale attempt after an upgrade, pass an environment file containing:
parameter_defaults:
  UpdateIdentifier:
Result: The update will not be triggered on that or future deploy/scale operations.
|
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-05-02 17:47:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
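A quick way to confirm the cause described in the Doc Text above is to check whether the overcloud stack still carries a non-empty UpdateIdentifier parameter after the upgrade. A minimal sketch, assuming the Liberty-era heat CLI on the undercloud and the default stack name "overcloud":

source ~/stackrc
heat stack-show overcloud | grep UpdateIdentifier

If the parameter shows a timestamp-like value rather than an empty string, the next deploy or scale run will trigger the update workflow.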
WIP patches (not tested manually yet) submitted to tripleo-common and python-tripleoclient. We'll also need a tripleo-heat-templates patch, which I'm working on now, and I'll try to test those together.

Adding my test here as it seems to have the same cause:

Running openstack overcloud deploy after the upgrade finished resulted in the following error.
The upgrade command at step 6:
stack@instack:~>>> cat deploy.ha.ceph.ipv6.ssl.8.0-step6
export THT=~/templates/my-overcloud-8.0
openstack overcloud deploy --templates $THT \
-e $THT/environments/network-isolation-v6.yaml \
-e ~/templates/network-environment-8.0-v6.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/enable-tls.yaml \
-e ~/templates/inject-trust-anchor.yaml \
-e $THT/environments/major-upgrade-pacemaker-converge.yaml \
--control-scale 3 \
--compute-scale 1 \
--ceph-storage-scale 2 \
--ntp-server clock.redhat.com \
--libvirt-type qemu
Removing major-upgrade-pacemaker-converge.yaml and running deploy:
stack@instack:~>>> cat deploy.ha.ceph.ipv6.ssl.8.0
export THT=~/templates/my-overcloud-8.0
openstack overcloud deploy --templates $THT \
-e $THT/environments/network-isolation-v6.yaml \
-e ~/templates/network-environment-8.0-v6.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/enable-tls.yaml \
-e ~/templates/inject-trust-anchor.yaml \
--control-scale 3 \
--compute-scale 1 \
--ceph-storage-scale 2 \
--ntp-server clock.redhat.com \
--libvirt-type qemu
The deploy ended with:
{
"status": "FAILED",
"server_id": "9e1aca53-99b3-4d00-a825-678ef7c9348c",
"config_id": "cd969547-ca04-44b8-b819-1adf729e63fd",
"output_values": {
"deploy_stdout": "Started yum_update.sh on server 9e1aca53-99b3-4d00-a825-678ef7c9348c at Thu Apr 7 12:52:02 UTC 2016\nDumping Pacemaker config\nChecking for missing constraints\n start openstack-nova-novncproxy-clone then start openstack-nova-api-clone (kind:Mandatory)\n start rabbitmq-clone then start openstack-keystone-clone (kind:Mandatory)\n promote galera-master then start openstack-keystone-clone (kind:Mandatory)\n Clone Set: haproxy-clone [haproxy]\n start haproxy-clone then start openstack-keystone-clone (kind:Mandatory)\n start memcached-clone then start openstack-keystone-clone (kind:Mandatory)\n promote redis-master then start openstack-ceilometer-central-clone (kind:Mandatory) (Options: require-all=false)\n start neutron-server-clone then start neutron-openvswitch-agent-clone (kind:Mandatory)\nresource-stickiness: INFINITY\nSetting resource start/stop timeouts\nMaking sure rabbitmq has the notify=true meta parameter\nApplying new Pacemaker config\nERROR failed to apply new pacemaker config\n",
"deploy_stderr": "Error: unable to push cib\nCall cib_replace failed (-205): Update was older than existing configuration\n\n",
"update_managed_packages": "false",
"deploy_status_code": 1
},
"creation_time": "2016-04-06T10:47:19",
"updated_time": "2016-04-07T12:52:21",
"input_values": {},
"action": "UPDATE",
"status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1",
"id": "37bb7abe-f286-44d3-8089-0e5f798815b0"
}
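For reference, details like the JSON above can be pulled from the undercloud after a failed run; a minimal sketch, assuming the Liberty-era heat CLI (the deployment UUID is the "id" field from the output above):

source ~/stackrc
# list nested resources and narrow down to the failed ones
heat resource-list --nested-depth 5 overcloud | grep -i failed
# dump stdout/stderr of a specific failed software deployment
heat deployment-show 37bb7abe-f286-44d3-8089-0e5f798815b0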
I'm able to scale down from the failed scale up just fine. Tried several times already.

No blocker, there is a revert path; a quick 0-day (or close to that) async is fine.

T-h-t part submitted too [1], but to allow the upgrades job to pass upstream, 2 more patches need to be merged first [2][3]. The upstream upgrades job doesn't test full upgrades just yet, but it tests a stack-update, so it does test the involved code at least partially.

[1] https://review.openstack.org/#/c/304094
[2] https://review.openstack.org/#/c/296592
[3] https://review.openstack.org/#/c/304592

After adding "UpdateIdentifier:" to the parameter_defaults section of the included environment file, I was able to re-run the deployment command successfully post upgrade (see the sketch below).
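For anyone else hitting this, a minimal sketch of that workaround; the file name cleanup-update-identifier.yaml is arbitrary:

stack@instack:~>>> cat ~/templates/cleanup-update-identifier.yaml
# Clear the identifier left behind by the upgrade so deploy/scale
# runs stop triggering the package-update workflow.
parameter_defaults:
  UpdateIdentifier:

Then append -e ~/templates/cleanup-update-identifier.yaml to the first deploy or scale command run after the upgrade.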
rhel-osp-director: Scaling of computes post upgrade 7.3->8.0 fails with error: yum -y update returned 1 instead of one of [0]

Environment:
openstack-tripleo-heat-templates-0.8.14-5.el7ost.noarch
openstack-puppet-modules-7.0.17-1.el7ost.noarch
instack-undercloud-2.2.7-2.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-5.el7ost.noarch

Steps to reproduce:
1. Deploy overcloud 7.3 with 1 compute and populate it with objects.
2. Upgrade the setup to 8.0.
3. Attempt to scale computes from 1 to 3.

Result:
Stack overcloud UPDATE_FAILED
Heat Stack update failed.

Could not retrieve fact='apache_version', resolution='<anonymous>': undefined method `[]' for nil:NilClass
Could not retrieve fact='apache_version', resolution='<anonymous>': undefined method `[]' for nil:NilClass
Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::vncproxy::host'; class ::nova::vncproxy has not been evaluated
Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::vncproxy::vncproxy_protocol'; class ::nova::vncproxy has not been evaluated
Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::vncproxy::port'; class ::nova::vncproxy has not been evaluated
Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::vncproxy::vncproxy_path'; class ::nova::vncproxy has not been evaluated
Warning: Scope(Class[Ceilometer::Agent::Compute]): This class is deprecated. Please use ceilometer::agent::polling with compute namespace instead.
Warning: The package type's allow_virtual parameter will be changing its default value from false to true in a future release. If you do not want to allow virtual packages, please explicitly set allow_virtual to false. (at /usr/share/ruby/vendor_ruby/puppet/type.rb:816:in `set_default')
Error: yum -y update returned 1 instead of one of [0]
Error: /Stage[main]/Tripleo::Packages/Exec[package-upgrade]/returns: change from notrun to 0 failed: yum -y update returned 1 instead of one of [0]
Warning: /Stage[main]/Vswitch::Ovs/Service[openvswitch]: Skipping because of failed dependencies
Warning: /Stage[main]/Ntp::Service/Service[ntp]: Skipping because of failed dependencies
Warning: /Stage[main]/Ntp/Anchor[ntp::end]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron::Plugins::Ovs::Bridge[datacentre:br-ex]/Vs_bridge[br-ex]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Package[neutron-ovs-agent]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[ovs/local_ip]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[agent/extensions]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[agent/enable_distributed_routing]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[ovs/tunnel_bridge]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[agent/prevent_arp_spoofing]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[ovs/enable_tunneling]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[ovs/integration_bridge]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[securitygroup/firewall_driver]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Service[ovs-cleanup-service]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[ovs/bridge_mappings]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[agent/arp_responder]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[agent/polling_interval]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[agent/drop_flows_on_start]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[agent/vxlan_udp_port]: Skipping because of failed dependencies
Warning: /Stage[main]/Snmp/Service[snmptrapd]: Skipping because of failed dependencies
Warning: /Stage[main]/Nova::Compute::Libvirt/Service[messagebus]: Skipping because of failed dependencies
Warning: /Stage[main]/Nova::Compute::Libvirt/Service[libvirt]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[agent/l2_population]: Skipping because of failed dependencies
Warning: /Stage[main]/Snmp/Service[snmpd]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Neutron_agent_ovs[agent/tunnel_types]: Skipping because of failed dependencies
Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Service[neutron-ovs-agent-service]: Skipping because of failed dependencies
Warning: /Stage[main]/Nova::Compute/Nova::Generic_service[compute]/Service[nova-compute]: Skipping because of failed dependencies
Warning: /Stage[main]/Nova/Exec[networking-refresh]: Skipping because of failed dependencies
Warning: /Stage[main]/Nova::Compute::Rbd/File[/etc/nova/secret.xml]: Skipping because of failed dependencies
Warning: /Stage[main]/Nova::Compute::Rbd/Exec[get-or-set virsh secret]: Skipping because of failed dependencies
Warning: /Stage[main]/Nova::Compute::Rbd/Exec[set-secret-value virsh]: Skipping because of failed dependencies
Warning: /Stage[main]/Ceilometer::Agent::Compute/Service[ceilometer-agent-compute]: Skipping because of failed dependencies

"deploy_status_code": 6

The issue reproduces. When I attempt to run "yum update" on the new computes, I get:

Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
There are no enabled repos.
 Run "yum repolist all" to see the repos you have.
 You can enable repos with yum-config-manager --enable <repo>

And the exit code from the above command is "1". But should it fail the scale up?
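Since the failure above boils down to the new node not being registered, that can be confirmed from the undercloud before retrying the scale-up. A minimal sketch; the compute IP 192.0.2.10 and the heat-admin user are the usual TripleO defaults, used here as placeholders:

source ~/stackrc
# find the ctlplane IP of the newly added compute
nova list | grep compute
# check registration and repo state on the node
ssh heat-admin@192.0.2.10 'sudo subscription-manager status; sudo yum repolist all'

If subscription-manager reports the system as unregistered, the node needs to be registered and its repos enabled before yum -y update can succeed.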