Description of problem:

OSP10 -> OSP11 upgrade: the upgrade process is stuck for ~13 minutes waiting for the nova services to start after the packages were updated.

We can see in yum.log the difference between the time openstack-nova-api got updated (16:47:00) and the time the following package got updated (16:59:53):

Feb 26 16:47:00 Updated: 1:openstack-nova-common-15.0.1-0.20170222170803.10a32dd.el7ost.noarch
Feb 26 16:47:00 Updated: 1:openstack-nova-novncproxy-15.0.1-0.20170222170803.10a32dd.el7ost.noarch
Feb 26 16:47:00 Updated: 1:openstack-nova-console-15.0.1-0.20170222170803.10a32dd.el7ost.noarch
Feb 26 16:47:00 Updated: 1:openstack-nova-cert-15.0.1-0.20170222170803.10a32dd.el7ost.noarch
Feb 26 16:47:00 Updated: 1:openstack-nova-conductor-15.0.1-0.20170222170803.10a32dd.el7ost.noarch
Feb 26 16:47:00 Updated: 1:openstack-nova-scheduler-15.0.1-0.20170222170803.10a32dd.el7ost.noarch
Feb 26 16:47:00 Updated: 1:openstack-nova-compute-15.0.1-0.20170222170803.10a32dd.el7ost.noarch
Feb 26 16:47:00 Updated: 1:openstack-nova-api-15.0.1-0.20170222170803.10a32dd.el7ost.noarch
Feb 26 16:59:53 Updated: python-oslo-serialization-2.16.0-0.20170208094041.ac5787d.el7ost.noarch

During this timeframe:

[root@overcloud-controller-0 heat-admin]# systemctl list-units | grep nova
openstack-nova-api.service         loaded activating start   start OpenStack Nova API Server
openstack-nova-consoleauth.service loaded activating start   start OpenStack Nova VNC console auth Server
openstack-nova-novncproxy.service  loaded active     running       OpenStack Nova NoVNC Proxy Server
openstack-nova-scheduler.service   loaded activating start   start OpenStack Nova Scheduler Server

[root@overcloud-controller-0 heat-admin]# tail -5 /var/log/nova/nova-api.log
2017-02-26 16:56:12.351 383452 ERROR nova     value = self._do_get(name, group, namespace)
2017-02-26 16:56:12.351 383452 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2715, in _do_get
2017-02-26 16:56:12.351 383452 ERROR nova     % (opt.name, str(ve)))
2017-02-26 16:56:12.351 383452 ERROR nova ConfigFileValueError: Value for option osapi_compute_workers is not valid: Should be greater than or equal to 1
2017-02-26 16:56:12.351 383452 ERROR nova

[root@overcloud-controller-0 heat-admin]# tail -5 /var/log/nova/nova-consoleauth.log
2017-02-26 16:56:17.420 383492 ERROR nova     value = self._do_get(name, group, namespace)
2017-02-26 16:56:17.420 383492 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2715, in _do_get
2017-02-26 16:56:17.420 383492 ERROR nova     % (opt.name, str(ve)))
2017-02-26 16:56:17.420 383492 ERROR nova ConfigFileValueError: Value for option metadata_workers is not valid: Should be greater than or equal to 1
2017-02-26 16:56:17.420 383492 ERROR nova

[root@overcloud-controller-0 heat-admin]# tail -5 /var/log/nova/nova-scheduler.log
2017-02-26 16:55:20.597 382652 WARNING oslo_db.sqlalchemy.engines [req-40dc07a2-5c32-429d-8b9a-57254463e473 - - - - -] SQL connection failed. 6 attempts left.
2017-02-26 16:55:33.608 382652 WARNING oslo_db.sqlalchemy.engines [req-40dc07a2-5c32-429d-8b9a-57254463e473 - - - - -] SQL connection failed. 5 attempts left.
2017-02-26 16:55:46.619 382652 WARNING oslo_db.sqlalchemy.engines [req-40dc07a2-5c32-429d-8b9a-57254463e473 - - - - -] SQL connection failed. 4 attempts left.
2017-02-26 16:55:59.637 382652 WARNING oslo_db.sqlalchemy.engines [req-40dc07a2-5c32-429d-8b9a-57254463e473 - - - - -] SQL connection failed. 3 attempts left.
2017-02-26 16:56:12.648 382652 WARNING oslo_db.sqlalchemy.engines [req-40dc07a2-5c32-429d-8b9a-57254463e473 - - - - -] SQL connection failed. 2 attempts left.

We can see the nova processes consuming CPU during this timeframe, but the overall upgrade process doesn't show any progress at all:

[root@overcloud-controller-0 heat-admin]# pidstat 1
Linux 3.10.0-514.6.1.el7.x86_64 (overcloud-controller-0.localdomain) 02/26/2017 _x86_64_ (4 CPU)

04:58:48 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:49 PM     0         1    0.00    0.99    0.00    0.99     3  systemd
04:58:49 PM     0         9    0.00    0.99    0.00    0.99     0  rcu_sched
04:58:49 PM   996    139233    0.99    0.00    0.00    0.99     2  neutron-server
04:58:49 PM     0    349039    6.93    0.00    0.00    6.93     0  ansible-playboo
04:58:49 PM   162    384699   93.07    6.93    0.00  100.00     2  nova-consoleaut
04:58:49 PM     0    384706    0.00    0.99    0.00    0.99     0  pidstat
04:58:49 PM   162    384709   59.41    7.92    0.00   67.33     3  nova-scheduler
04:58:49 PM   162    384710   46.53    7.92    0.00   54.46     1  nova-api

04:58:49 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:50 PM     0       544    0.00    1.00    0.00    1.00     0  systemd-journal
04:58:50 PM     0      1045    0.00    1.00    0.00    1.00     0  ovs-vswitchd
04:58:50 PM     0    349039    9.00    1.00    0.00   10.00     1  ansible-playboo
04:58:50 PM   162    384699   99.00    0.00    0.00   99.00     2  nova-consoleaut
04:58:50 PM   162    384709   94.00    6.00    0.00  100.00     3  nova-scheduler
04:58:50 PM   162    384710   92.00    6.00    0.00   98.00     1  nova-api

04:58:50 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:51 PM     0       990    1.00    0.00    0.00    1.00     2  ovsdb-server
04:58:51 PM     0    349039    9.00    1.00    0.00   10.00     0  ansible-playboo
04:58:51 PM     0    384706    0.00    1.00    0.00    1.00     0  pidstat
04:58:51 PM   162    384709   99.00    0.00    0.00   99.00     3  nova-scheduler
04:58:51 PM   162    384710  100.00    0.00    0.00  100.00     2  nova-api
04:58:51 PM   162    384731   47.00    8.00    0.00   55.00     1  nova-consoleaut

04:58:51 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:52 PM     0    349039    9.00    0.00    0.00    9.00     3  ansible-playboo
04:58:52 PM   162    384709    1.00    0.00    0.00    1.00     3  nova-scheduler
04:58:52 PM   162    384710   94.00    7.00    0.00  100.00     2  nova-api
04:58:52 PM   162    384731   92.00    7.00    0.00   99.00     0  nova-consoleaut

04:58:52 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:53 PM     0      1045    1.00    0.00    0.00    1.00     0  ovs-vswitchd
04:58:53 PM     0    349039    8.00    1.00    0.00    9.00     2  ansible-playboo
04:58:53 PM   162    380921    1.00    0.00    0.00    1.00     1  nova-novncproxy
04:58:53 PM     0    384706    1.00    0.00    0.00    1.00     1  pidstat
04:58:53 PM   162    384731  100.00    0.00    0.00  100.00     0  nova-consoleaut
04:58:53 PM   162    384741   49.00    6.00    0.00   55.00     3  nova-api

04:58:53 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:54 PM     0    349039    9.00    1.00    0.00   10.00     3  ansible-playboo
04:58:54 PM     0    384706    0.00    1.00    0.00    1.00     1  pidstat
04:58:54 PM   162    384741   92.00    7.00    0.00   99.00     0  nova-api
04:58:54 PM   162    384749   70.00   10.00    0.00   80.00     2  nova-consoleaut

04:58:54 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:55 PM     0    349039    9.00    0.00    0.00    9.00     0  ansible-playboo
04:58:55 PM     0    384706    0.00    1.00    0.00    1.00     2  pidstat
04:58:55 PM   162    384741   98.00    3.00    0.00  100.00     3  nova-api
04:58:55 PM   162    384749   95.00    5.00    0.00  100.00     1  nova-consoleaut

04:58:55 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:56 PM     0         1    0.00    1.00    0.00    1.00     2  systemd
04:58:56 PM     0         9    0.00    1.00    0.00    1.00     0  rcu_sched
04:58:56 PM    81       690    1.00    0.00    0.00    1.00     1  dbus-daemon
04:58:56 PM   996    139236    1.00    0.00    0.00    1.00     2  neutron-server
04:58:56 PM     0    349039    8.00    1.00    0.00    9.00     0  ansible-playboo
04:58:56 PM   162    384741   98.00    1.00    0.00   99.00     3  nova-api
04:58:56 PM   162    384761    5.00    0.00    0.00    5.00     1  nova-consoleaut

04:58:56 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:57 PM     0    349039    9.00    0.00    0.00    9.00     3  ansible-playboo
04:58:57 PM     0    383753    0.00    1.00    0.00    1.00     2  systemctl
04:58:57 PM   162    384761   90.00   10.00    0.00  100.00     0  nova-consoleaut
04:58:57 PM   162    384769   24.00    5.00    0.00   29.00     1  nova-api

04:58:57 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:58 PM     0    349039    9.00    1.00    0.00   10.00     0  ansible-playboo
04:58:58 PM     0    384706    1.00    1.00    0.00    2.00     1  pidstat
04:58:58 PM   162    384761   96.00    4.00    0.00  100.00     2  nova-consoleaut
04:58:58 PM   162    384769   92.00    8.00    0.00  100.00     3  nova-api

04:58:58 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:58:59 PM     0       544    0.00    1.00    0.00    1.00     1  systemd-journal
04:58:59 PM     0       687    1.00    0.00    0.00    1.00     2  systemd-logind
04:58:59 PM     0    349039    9.00    1.00    0.00   10.00     0  ansible-playboo
04:58:59 PM   162    384769   98.00    2.00    0.00  100.00     3  nova-api
04:58:59 PM   162    384782    5.00    1.00    0.00    6.00     2  nova-consoleaut

04:58:59 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:59:00 PM     0    349039    9.00    1.00    0.00   10.00     2  ansible-playboo
04:59:00 PM     0    384706    0.00    1.00    0.00    1.00     1  pidstat
04:59:00 PM   162    384769   96.00    5.00    0.00  100.00     3  nova-api
04:59:00 PM   162    384782   88.00   10.00    0.00   98.00     0  nova-consoleaut

04:59:00 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:59:01 PM     0         9    0.00    1.00    0.00    1.00     3  rcu_sched
04:59:01 PM     0    349039    9.00    0.00    0.00    9.00     2  ansible-playboo
04:59:01 PM   162    384782   96.00    4.00    0.00  100.00     1  nova-consoleaut
04:59:01 PM   162    384792   25.00    5.00    0.00   30.00     0  nova-api

04:59:01 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:59:02 PM     0         1    1.00    0.00    0.00    1.00     2  systemd
04:59:02 PM     0       655    0.00    1.00    0.00    1.00     3  auditd
04:59:02 PM     0    349039    8.00    2.00    0.00   10.00     2  ansible-playboo
04:59:02 PM     0    384706    0.00    1.00    0.00    1.00     1  pidstat
04:59:02 PM   162    384792   91.00    8.00    0.00   99.00     0  nova-api
04:59:02 PM   162    384800    5.00    1.00    0.00    6.00     3  nova-consoleaut

04:59:02 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
04:59:03 PM     0    349039    9.00    0.00    0.00    9.00     0  ansible-playboo
04:59:03 PM   162    384792   99.00    1.00    0.00  100.00     3  nova-api
04:59:03 PM   162    384800   87.00   12.00    0.00   99.00     2  nova-consoleaut

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-6.0.0-0.20170222195630.46117f4.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Upgrade OSP10 -> OSP11
2. watch openstack stack list --nested | grep PROGRESS

Actual results:
At some point the nested stacks stop changing status. Checking the controller for upgrade-related activity, we can see in yum.log that openstack-nova-api was the last package updated and the upgrade doesn't go any further.

Expected results:
I'm not sure exactly what the yum update is waiting for, but it looks like the upgrade process is stuck during this large timeframe.

Additional info:
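For anyone hitting the same stall, here is a minimal diagnostic sketch (it assumes shell access to the controller; the unit names, option names and log paths are the ones shown above) to confirm the nova services are crash-looping on invalid worker-count options rather than making progress:

# 1. Are the nova units stuck in "activating", i.e. being restarted in a loop?
systemctl list-units 'openstack-nova-*' --no-legend

# 2. Do the worker options carry an invalid value (< 1), matching the
#    ConfigFileValueError seen in nova-api.log and nova-consoleauth.log?
grep -E '^(osapi_compute_workers|metadata_workers) *=' /etc/nova/nova.conf

# 3. Correlate with yum.log: a long gap right after openstack-nova-api
#    is the symptom described in this report.
grep -E 'openstack-nova|python-oslo-serialization' /var/log/yum.log | tail -n 15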
The reordering of the step should fix the issue here. Waiting for Marius for confirmation.
(In reply to Sofer Athlan-Guyot from comment #1)
> The reordering of the step should fix the issue here. Waiting for Marius
> for confirmation.

This is fixed in the latest build:

Mar 22 10:21:42 Updated: 1:openstack-nova-common-15.0.0-2.el7ost.noarch
Mar 22 10:21:43 Updated: 1:openstack-nova-scheduler-15.0.0-2.el7ost.noarch
Mar 22 10:21:43 Installed: 1:openstack-nova-placement-api-15.0.0-2.el7ost.noarch
Mar 22 10:21:43 Updated: 1:openstack-nova-conductor-15.0.0-2.el7ost.noarch
Mar 22 10:21:43 Updated: 1:openstack-nova-compute-15.0.0-2.el7ost.noarch
Mar 22 10:21:43 Updated: 1:openstack-nova-cert-15.0.0-2.el7ost.noarch
Mar 22 10:21:43 Updated: 1:openstack-nova-console-15.0.0-2.el7ost.noarch
Mar 22 10:21:43 Updated: 1:openstack-nova-api-15.0.0-2.el7ost.noarch
Mar 22 10:21:43 Updated: 1:openstack-nova-novncproxy-15.0.0-2.el7ost.noarch
Mar 22 10:22:00 Updated: python-oslo-serialization-2.16.0-0.20170208094041.ac5787d.el7ost.noarch
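For completeness, a quick way to re-check this on an upgraded controller (a sketch only; the package and unit names are taken from the logs in this report):

# Check that the nova package updates are no longer followed by a long gap
# (compare with the 16:47:00 -> 16:59:53 gap in the original report).
grep -E 'openstack-nova|python-oslo-serialization' /var/log/yum.log | tail -n 15

# Confirm the installed build and that the services settled into "active"
# instead of cycling through "activating".
rpm -q openstack-nova-api
systemctl is-active openstack-nova-api openstack-nova-scheduler openstack-nova-consoleauth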
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1245