Summary: | OSP11 -> OSP12 upgrade: libvirtd service is running on host after upgrade and nova_libvirt container keeps restarting | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> |
Component: | openstack-tripleo-heat-templates | Assignee: | Emilien Macchi <emacchi> |
Status: | CLOSED ERRATA | QA Contact: | Marius Cornea <mcornea> |
Severity: | urgent | Docs Contact: | |
Priority: | high | ||
Version: | 12.0 (Pike) | CC: | dbecker, jfrancoa, mandreou, mbultel, mburns, morazi, owalsh, rhel-osp-director-maint, sathlang, tvignaud |
Target Milestone: | beta | Keywords: | Triaged |
Target Release: | 12.0 (Pike) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-tripleo-heat-templates-7.0.0-0.20170805163046.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-12-13 21:48:30 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Bug Depends On: | |||
Bug Blocks: | Red Hat1399762 |
Description
Marius Cornea
2017-08-01 08:46:33 UTC
o/ @mcornea - marking as triaged and first pass here, questions please:

1. Can you confirm what is in your roles_data.yaml? In particular, do you have disable_upgrade_deployment set, and for which roles please?

2. Can you confirm your upgrade workflow and env files (i.e. environments/major-upgrade-composable-steps-docker.yaml, then upgrade-non-controller.sh for computes, which afaics from comment 0 is when this happens/is seen)?

So I see on master we still have the disable_upgrade_deployment flag [1] and the tripleo_upgrade_node.sh [2] is still being delivered [3] to the nodes based on that flag. There _is_ an appropriate "stop and disable libvirtd service" ansible task @ [4] but it isn't being executed during the upgrade, again because of that flag.

I have just posted [5] (and adding to trackers & the upstream bug for it) which adds the systemctl stop and disable into the tripleo_upgrade_node.sh. Not sure that is all that is needed though, but it's a start. In particular I'm concerned that only puppet is being executed in that tripleo_upgrade_node.sh [2] and not the docker tasks (I guess those are happening on converge?) but let's see after testing with [5].

thanks, marios

[1] https://github.com/openstack/tripleo-heat-templates/blob/5f313f27c9120b0e3bac905d155c2b6d234d27bb/roles/Compute.yaml#L13
[2] https://github.com/openstack/tripleo-heat-templates/blob/29a8a46d9833f095d503941d32ec500f63abf675/extraconfig/tasks/tripleo_upgrade_node.sh
[3] https://github.com/openstack/tripleo-heat-templates/blob/c54e9b681b44ab962c4503cf1d88c44b683a972e/puppet/major_upgrade_steps.j2.yaml#L41
[4] https://github.com/openstack/tripleo-heat-templates/blob/a8442ba386082cef7188c3ff8001f8995b1d7ff7/docker/services/nova-libvirt.yaml#L181-L184
[5] https://review.openstack.org/489619

(In reply to marios from comment #1)
> o/ @mcornea - marking as triaged and first pass here, questions please:
>
> 1. can you confirm what is in your roles_data.yaml? in particular do you
> have disable_upgrade_deployment set and for which roles please?

I was using the default roles_data.yaml provided by tht, so disable_upgrade_deployment was set for the compute and object store roles.

> 2. can you confirm your upgrade workflow and env files (i.e.
> environments/major-upgrade-composable-steps-docker.yaml , then
> upgrade-non-controller.sh for computes afaics from comment 0 which is when
> this happens/is seen.

1st - the major-upgrade-composable-steps-docker:

openstack overcloud deploy \
  --templates /usr/share/openstack-tripleo-heat-templates \
  --libvirt-type kvm \
  --ntp-server clock.redhat.com \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /home/stack/virt/network/network-environment.yaml \
  -e /home/stack/virt/hostnames.yml \
  -e /home/stack/virt/debug.yaml \
  -e /home/stack/virt/nodes_data.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps-docker.yaml \
  -e /home/stack/docker-osp12.yaml

then compute upgrade:

upgrade-non-controller.sh --upgrade compute-0

> So I see on master we still have the disable_upgrade_deployment flag [1] and
> the tripleo_upgrade_node.sh [2] is still being delivered [3] to the nodes
> based on that flag. There _is_ an appropriate "stop and disable libvirtd
> service" ansible task @ [4] but it isn't being executed during the upgrade,
> again because of that flag.
>
> I have just posted [5] (and adding to trackers & the upstream bug for it)
> which adds the systemctl stop and disable into the tripleo_upgrade_node.sh.
> Not sure that is all that is needed though, but its a start. In particular
> I'm concerned that only puppet is being executed in that
> tripleo_upgrade_node.sh [2] and not the docker tasks (I guess those are
> happening on converge?) but lets see after testing with [5]

With the patch applied I wasn't able to reproduce the initial error anymore, so it looks good.

Code merged upstream.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462
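
For readers who want to see the shape of the workaround discussed in this thread, here is a minimal sketch of a "stop and disable the host service" step of the kind that could sit in a script such as tripleo_upgrade_node.sh. It is an illustration only and is not the content of review 489619: the function name stop_and_disable_host_service is hypothetical, and the only real commands assumed are the standard systemctl verbs is-active, is-enabled, stop and disable.

```shell
#!/bin/bash
# Hypothetical helper (assumption, not taken from the merged patch):
# stop and disable a host-level systemd unit so that it cannot compete
# with the containerized service (here, the nova_libvirt container).
stop_and_disable_host_service() {
    local svc="$1"
    # Act only when the unit is currently running or enabled at boot,
    # so that rerunning the upgrade script is harmless.
    if systemctl is-active --quiet "$svc" || systemctl is-enabled --quiet "$svc"; then
        systemctl stop "$svc"
        systemctl disable "$svc"
    fi
}

# Example call on a compute node (left commented out so that sourcing
# this file has no side effects):
# stop_and_disable_host_service libvirtd
```

The guard keeps the step idempotent: on a node where libvirtd is already stopped and disabled, the function does nothing.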