Description of problem:
minor upgrade OSP12 failed: nova_compute service is stuck in restarting state.

Version-Release number of selected component (if applicable):
$ cat sh_-c_rpm_--nodigest_-qa_--qf_NAME_-_VERSION_-_RELEASE_._ARCH_INSTALLTIME_date_awk_-F_printf_-59s_s_n_1_2_sort_-f | grep nova*
novnc-0.6.1-1.el7ost.noarch  Thu Dec 7 01:32:09 2017
openstack-nova-api-16.0.2-3.el7ost.noarch  Thu Dec 7 02:00:01 2017
openstack-nova-common-16.0.2-3.el7ost.noarch  Thu Dec 7 01:59:15 2017
openstack-nova-compute-16.0.2-3.el7ost.noarch  Thu Dec 7 01:59:56 2017
openstack-nova-conductor-16.0.2-3.el7ost.noarch  Thu Dec 7 02:00:01 2017
openstack-nova-console-16.0.2-3.el7ost.noarch  Thu Dec 7 02:00:00 2017
openstack-nova-migration-16.0.2-3.el7ost.noarch  Thu Dec 7 02:00:00 2017
openstack-nova-novncproxy-16.0.2-3.el7ost.noarch  Thu Dec 7 02:00:00 2017
openstack-nova-placement-api-16.0.2-3.el7ost.noarch  Thu Dec 7 02:00:01 2017
openstack-nova-scheduler-16.0.2-3.el7ost.noarch  Thu Dec 7 02:00:00 2017
puppet-nova-11.4.0-2.el7ost.noarch  Thu Dec 7 02:09:00 2017
python-nova-16.0.2-3.el7ost.noarch  Thu Dec 7 01:55:05 2017
python-novaclient-9.1.1-1.el7ost.noarch  Thu Dec 7 01:47:00 2017

How reproducible:
NA

Steps to Reproduce:
NA

Actual results:
# docker restart nova_compute
nova_compute status is stuck in Restarting...
-------------
[heat-admin@os1-prd-nova09 nova]$ sudo docker ps | grep nova
5aa3306eb77e 10.7.103.10:8787/rhosp12/openstack-nova-libvirt:12.0-20180104.2 "kolla_start" 12 days ago Up 15 hours nova_libvirt
cae8d51fbf93 10.7.103.10:8787/rhosp12/openstack-nova-compute:12.0-20180104.2 "kolla_start" 6 weeks ago Up 15 hours (unhealthy) nova_migration_target
59f92983cab7 10.7.103.10:8787/rhosp12/openstack-nova-compute:12.0-20180104.2 "kolla_start" 6 weeks ago Restarting (1) About an hour ago nova_compute

nova-compute.log file
--------------
2018-09-09 13:27:16.088 1 ERROR nova.service [-] Service error occurred during cleanup_host: ValueError: Field value network-vif-plugged-cd5fcd76-53b1-47a9-9b95 is invalid
2018-09-09 13:27:16.088 1 ERROR nova.service Traceback (most recent call last):
2018-09-09 13:27:16.088 1 ERROR nova.service   File "/usr/lib/python2.7/site-packages/nova/service.py", line 266, in stop
2018-09-09 13:27:16.088 1 ERROR nova.service     self.manager.cleanup_host()
2018-09-09 13:27:16.088 1 ERROR nova.service   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1174, in cleanup_host
2018-09-09 13:27:16.088 1 ERROR nova.service     self.instance_events.cancel_all_events()
2018-09-09 13:27:16.088 1 ERROR nova.service   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 414, in cancel_all_events
2018-09-09 13:27:16.088 1 ERROR nova.service     tag=tag, data={})
2018-09-09 13:27:16.088 1 ERROR nova.service   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 307, in __init__
2018-09-09 13:27:16.088 1 ERROR nova.service     setattr(self, key, kwargs[key])
2018-09-09 13:27:16.088 1 ERROR nova.service   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 72, in setter
2018-09-09 13:27:16.088 1 ERROR nova.service     field_value = field.coerce(self, name, value)
2018-09-09 13:27:16.088 1 ERROR nova.service   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/fields.py", line 195, in coerce
2018-09-09 13:27:16.088 1 ERROR nova.service     return self._type.coerce(obj, attr, value)
2018-09-09 13:27:16.088 1 ERROR nova.service   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/fields.py", line 317, in coerce
2018-09-09 13:27:16.088 1 ERROR nova.service     raise ValueError(msg)
2018-09-09 13:27:16.088 1 ERROR nova.service ValueError: Field value network-vif-plugged-cd5fcd76-53b1-47a9-9b95 is invalid
2018-09-09 13:27:16.088 1 ERROR nova.service
(END)

Expected results:
nova_compute service should get started so that minor upgrade can be carried out.

Additional info:
This seems similar to upstream https://bugs.launchpad.net/nova/+bug/1760303 and fixed in openstack/nova 18.0.0.0b1
Can we backport this to RHOSP12? Or can we have some workaround to get the nova-compute service started?
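For readers following the traceback: the rejected value, network-vif-plugged-cd5fcd76-53b1-47a9-9b95, looks like an event name with most of a port UUID still attached. The sketch below shows one way such a value could arise. It assumes, and this report does not confirm it, that cancel_all_events keeps each pending event under a single "name-tag" string and recovers the two halves by splitting on the last hyphen. The function names and the set of event names are hypothetical and are not nova APIs, and the last UUID segment is invented because the log only shows the first four.

```python
# Hypothetical reconstruction of the failure; none of these names are nova APIs.

# Illustrative subset of event names that a name field might accept (assumption).
VALID_EVENT_NAMES = {"network-vif-plugged", "network-vif-unplugged", "network-changed"}


def split_on_last_hyphen(event_key):
    """Naive split: only correct when the tag itself contains no hyphen."""
    name, tag = event_key.rsplit("-", 1)
    return name, tag


def split_on_known_name(event_key):
    """Safer split: match a known event name as prefix, the remainder is the tag."""
    for name in sorted(VALID_EVENT_NAMES, key=len, reverse=True):
        prefix = name + "-"
        if event_key.startswith(prefix):
            return name, event_key[len(prefix):]
    raise ValueError("Unknown event key: %s" % event_key)


# The final 12 hex digits are made up; the log truncates the port UUID.
key = "network-vif-plugged-cd5fcd76-53b1-47a9-9b95-000000000000"

bad_name, bad_tag = split_on_last_hyphen(key)
print(bad_name)                        # same string the ValueError complains about
print(bad_name in VALID_EVENT_NAMES)   # False, so field validation would reject it

good_name, good_tag = split_on_known_name(key)
print(good_name, good_tag)
```

If that assumption holds, any tag containing a hyphen (such as a port UUID) would trigger the error during service shutdown.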
This issue should now be resolved in the above build. Would it be possible to test this and ensure it resolves the issue highlighted?
According to our records, this should be resolved by openstack-nova-16.1.5-3.el7ost. This build is available now.
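To check whether a given node already carries that build, the version strings printed by yum or rpm can be compared against 16.1.5-3.el7ost. The sketch below is a simplified comparison, not rpm's own algorithm: it drops any epoch prefix (the build above is quoted without one), assumes the version consists only of dot-separated integers, and looks only at the leading number of the release. The names FIXED_BUILD, version_key and has_fix are hypothetical.

```python
import re

# Build named in this report as containing the fix.
FIXED_BUILD = "16.1.5-3.el7ost"


def version_key(evr):
    """Turn '[epoch:]version-release' into a sortable tuple.

    Simplification: the epoch is ignored and only the leading digits of
    the release are used, so this does not replace rpm's comparison.
    """
    rest = evr.split(":", 1)[-1]                  # drop an optional epoch
    version, _, release = rest.partition("-")
    release_num = re.match(r"\d+", release)
    return (
        tuple(int(part) for part in version.split(".")),
        int(release_num.group()) if release_num else 0,
    )


def has_fix(installed, fixed=FIXED_BUILD):
    """True when the installed version-release is at or above the fixed build."""
    return version_key(installed) >= version_key(fixed)


# Version strings taken from the listings in this report.
for evr in ("16.0.2-3.el7ost", "1:16.1.4-6.el7ost", "1:16.1.5-3.el7ost"):
    print(evr, has_fix(evr))
```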
Moving to verified. After an update to the right version of nova, the container stayed running. The main steps from the update logs are below:

[root@compute-0 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9514ff506bdb 192.168.24.1:8787/rhosp12/openstack-cron:2018-11-03.1 "kolla_start" 17 minutes ago Up 17 minutes logrotate_crond
8accb565c682 192.168.24.1:8787/rhosp12/openstack-nova-compute:2018-11-03.1 "kolla_start" 18 minutes ago Up 17 minutes (healthy) nova_migration_target
4f899ab7ee85 192.168.24.1:8787/rhosp12/openstack-ceilometer-compute:2018-11-03.1 "kolla_start" 18 minutes ago Up 18 minutes ceilometer_agent_compute
fed2d16fbd1c 192.168.24.1:8787/rhosp12/openstack-nova-compute:2018-11-03.1 "kolla_start" 18 minutes ago Up 18 minutes (healthy) nova_compute
0fc5db71bb9f 192.168.24.1:8787/rhosp12/openstack-iscsid:2018-11-03.1 "kolla_start" 23 minutes ago Up 23 minutes iscsid
163ad80f87ac 192.168.24.1:8787/rhosp12/openstack-nova-libvirt:2018-11-03.1 "kolla_start" 23 minutes ago Up 23 minutes nova_libvirt
e4a443c91298 192.168.24.1:8787/rhosp12/openstack-nova-libvirt:2018-11-03.1 "kolla_start" 23 minutes ago Up 23 minutes nova_virtlogd

[root@compute-0 ~]# docker exec -u root -it nova_compute /bin/bash
()[root@compute-0 /]# yum list installed | grep nova
openstack-nova-common.noarch 1:16.1.4-6.el7ost @rhos-12.0
openstack-nova-compute.noarch 1:16.1.4-6.el7ost @rhos-12.0
openstack-nova-migration.noarch 1:16.1.4-6.el7ost @rhos-12.0
python-nova.noarch 1:16.1.4-6.el7ost @rhos-12.0
python-novaclient.noarch 1:9.1.2-1.el7ost @rhos-12.0

(undercloud) [stack@undercloud-0 ~]$ openstack undercloud upgrade
#############################################################################
Undercloud upgrade complete.
The file containing this installation's passwords is at /home/stack/undercloud-passwords.conf.
There is also a stackrc file at /home/stack/stackrc.
These files are needed to interact with the OpenStack services, and should be secured.
#############################################################################

(undercloud) [stack@undercloud-0 ~]$ sudo reboot

(undercloud) [stack@undercloud-0 ~]$ yum list installed | grep nova
openstack-nova-api.noarch 1:16.1.5-3.el7ost @rhelosp-12.0-puddle
openstack-nova-common.noarch 1:16.1.5-3.el7ost @rhelosp-12.0-puddle
openstack-nova-compute.noarch 1:16.1.5-3.el7ost @rhelosp-12.0-puddle
openstack-nova-conductor.noarch
openstack-nova-placement-api.noarch
openstack-nova-scheduler.noarch
puppet-nova.noarch 11.5.0-6.el7ost @rhelosp-12.0-puddle
python-nova.noarch 1:16.1.5-3.el7ost @rhelosp-12.0-puddle
python-novaclient.noarch 1:9.1.2-1.el7ost @rhelosp-12.0-puddle

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud container image upload --verbose --config-file my-raw-images.yaml 2>&1 | tee my-upload.log

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud update stack --init-minor-update --container-registry-file local-docker-img.yaml 2>&1 | tee oc-update-init.log
2018-12-19 20:45:37Z [AllNodesDeploySteps]: UPDATE_COMPLETE state changed
2018-12-19 20:45:52Z [overcloud]: UPDATE_COMPLETE Stack UPDATE completed successfully
Stack overcloud UPDATE_COMPLETE
Heat stack update init on overcloud complete.
Started Mistral Workflow tripleo.package_update.v1.get_config. Execution ID: 1ab7e71b-8c39-4aae-a9a5-0745fa35b481
Success
Init minor update on stack overcloud complete.

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud update stack --nodes Controller 2>&1 | tee oc-update-Controller.log
u'PLAY RECAP *********************************************************************',
u'192.168.24.13 : ok=131 changed=65 unreachable=0 failed=0 ',
u'']
Success

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud update stack --nodes Compute 2>&1 | tee oc-update-Compute.log
u'PLAY RECAP *********************************************************************',
u'192.168.24.16 : ok=60 changed=14 unreachable=0 failed=0 ',
u'']
Success

(undercloud) [stack@undercloud-0 ~]$ sudo reboot

[root@compute-0 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b9e9068eb4a2 192.168.24.1:8787/rhosp12/openstack-cron:2018-11-28.1 "kolla_start" 10 minutes ago Up 10 minutes logrotate_crond
c8b0e4aface9 192.168.24.1:8787/rhosp12/openstack-nova-compute:2018-11-28.1 "kolla_start" 10 minutes ago Up 10 minutes (healthy) nova_migration_target
58e4af1f0940 192.168.24.1:8787/rhosp12/openstack-ceilometer-compute:2018-11-28.1 "kolla_start" 10 minutes ago Up 10 minutes ceilometer_agent_compute
266bbfdc0eba 192.168.24.1:8787/rhosp12/openstack-nova-compute:2018-11-28.1 "kolla_start" 10 minutes ago Up 10 minutes (healthy) nova_compute
307a3310ca86 192.168.24.1:8787/rhosp12/openstack-iscsid:2018-11-28.1 "kolla_start" 10 minutes ago Up 10 minutes iscsid
92a48f1afe78 192.168.24.1:8787/rhosp12/openstack-nova-libvirt:2018-11-28.1 "kolla_start" 10 minutes ago Up 10 minutes nova_libvirt
4a2d2f147dde 192.168.24.1:8787/rhosp12/openstack-nova-libvirt:2018-11-28.1 "kolla_start" 10 minutes ago Up 10 minutes nova_virtlogd

[root@compute-0 ~]# docker exec -u root -it nova_compute /bin/bash
()[root@compute-0 /]# yum list installed | grep nova
openstack-nova-common.noarch 1:16.1.5-3.el7ost @rhos-12.0
openstack-nova-compute.noarch 1:16.1.5-3.el7ost @rhos-12.0
openstack-nova-migration.noarch 1:16.1.5-3.el7ost @rhos-12.0
python-nova.noarch 1:16.1.5-3.el7ost @rhos-12.0
python-novaclient.noarch 1:9.1.2-1.el7ost @rhos-12.0
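The docker ps checks in this thread were read by eye. As a rough sketch, the same check can be scripted: the function below flags containers whose status starts with "Restarting" or contains "(unhealthy)". It assumes the input is one name and status per line separated by a tab, which is what docker ps --format '{{.Names}}\t{{.Status}}' prints. The function name flagged_containers is hypothetical, and the sample statuses are copied from the original report.

```python
def flagged_containers(ps_output):
    """Return {name: status} for containers that are restarting or unhealthy.

    Expects one 'name<TAB>status' pair per line (assumed input format).
    """
    flagged = {}
    for line in ps_output.splitlines():
        if not line.strip():
            continue                      # skip blank lines
        name, _, status = line.partition("\t")
        status = status.strip()
        if status.startswith("Restarting") or "(unhealthy)" in status:
            flagged[name.strip()] = status
    return flagged


# Statuses as reported on the failing compute node.
sample = "\n".join([
    "nova_libvirt\tUp 15 hours",
    "nova_migration_target\tUp 15 hours (unhealthy)",
    "nova_compute\tRestarting (1) About an hour ago",
])

print(flagged_containers(sample))
```

Run against the post-update listing, where every nova container is Up and (healthy) or plain Up, the function would return an empty dictionary.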