rhel-osp-director: 7.3->8.0 upgrade fails with ERROR: Timed out waiting for a reply to message ID 84a44ca3ed724eda991ba689cc364852.

Environment:
openstack-tripleo-heat-templates-kilo-0.8.9-1.el7ost.noarch
instack-undercloud-2.2.4-1.el7ost.noarch
openstack-puppet-modules-7.0.12-1.el7ost.noarch
openstack-tripleo-heat-templates-0.8.9-1.el7ost.noarch

Steps to reproduce:
1. Deploy 7.3 (3 controllers +2 computes) with network isolation. Deployment command:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 2 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server x.x.x.x --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml
2. Upgrade the undercloud to 8.0
3. Attempt to update the overcloud with:
openstack overcloud deploy --templates tripleo-heat-templates -e tripleo-heat-templates/overcloud-resource-registry-puppet.yaml -e tripleo-heat-templates/environments/puppet-pacemaker.yaml -e tripleo-heat-templates/environments/network-isolation.yaml -e tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e network-environment.yaml -e tripleo-heat-templates/environments/major-upgrade-script-delivery.yaml

Result:
2016-03-07 15:31:50 [NodeTLSData]: UPDATE_COMPLETE state changed
2016-03-07 15:31:51 [ControllerConfig]: UPDATE_IN_PROGRESS state changed
2016-03-07 15:31:52 [NetworkConfig]: UPDATE_COMPLETE state changed
2016-03-07 15:31:52 [NodeTLSCAData]: UPDATE_IN_PROGRESS state changed
2016-03-07 15:31:53 [ControllerConfig]: CREATE_IN_PROGRESS state changed
2016-03-07 15:31:54 [ControllerConfig]: CREATE_COMPLETE state changed
2016-03-07 15:31:55 [NodeTLSCAData]: UPDATE_COMPLETE state changed
2016-03-07 15:31:55 [NodeTLSData]: UPDATE_IN_PROGRESS state changed
2016-03-07 15:31:55 [ControllerDeployment]: UPDATE_IN_PROGRESS state changed
2016-03-07 15:31:57 [NodeTLSData]: UPDATE_COMPLETE state changed
2016-03-07 15:31:57 [ControllerConfig]: UPDATE_IN_PROGRESS state changed
2016-03-07 15:31:58 [ControllerConfig]: CREATE_IN_PROGRESS state changed
2016-03-07 15:31:59 [ControllerConfig]: CREATE_COMPLETE state changed
2016-03-07 15:32:19 [UpdateDeployment]: SIGNAL_IN_PROGRESS Signal: deployment succeeded
2016-03-07 15:32:19 [UpdateDeployment]: UPDATE_COMPLETE state changed
2016-03-07 15:32:20 [ControllerDeployment]: UPDATE_IN_PROGRESS state changed

Broadcast message from systemd-journald (Mon 2016-03-07 12:48:12 EST):
haproxy[27435]: proxy ironic has no server available!

ERROR: Timed out waiting for a reply to message ID 84a44ca3ed724eda991ba689cc364852

Checking the os-collect-config for errors - (repeating messages):
Mar 07 19:04:10 overcloud-controller-0.localdomain os-collect-config[3829]: 2016-03-07 19:04:10.710 3829 WARNING os_collect_config.ec2 [-] 500 Server Error: Internal Server Error
Mar 07 19:04:41 overcloud-controller-0.localdomain os-collect-config[3829]: 2016-03-07 19:04:41.352 3829 WARNING os_collect_config.ec2 [-] 500 Server Error: Internal Server Error
Mar 07 19:05:12 overcloud-controller-0.localdomain os-collect-config[3829]: 2016-03-07 19:05:12.036 3829 WARNING os_collect_config.ec2 [-] 500 Server Error: Internal Server Error
Mar 07 19:05:42 overcloud-controller-0.localdomain os-collect-config[3829]: 2016-03-07 19:05:42.642 3829 WARNING os_collect_config.ec2 [-] 500 Server Error: Internal Server Error

Expected result:
Successful update of the overcloud.
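
For anyone trying to confirm the same symptom, the repeating os-collect-config warnings can be counted with a small filter. This is only a sketch: the function name count_ec2_500 is made up for illustration, and the suggestion to feed it from the journal assumes the agent runs as a systemd unit named os-collect-config on the controller.

```shell
#!/bin/sh
# Sketch only: count the repeating "500 Server Error" warnings that
# os-collect-config's ec2 collector logs (as quoted in the report).
# count_ec2_500 is a hypothetical helper name, not part of any package.

count_ec2_500() {
    # Reads log lines on stdin and prints how many match the warning.
    grep -c 'os_collect_config\.ec2.*500 Server Error'
}
```

On a controller this could be fed with something like `journalctl -u os-collect-config --no-pager | count_ec2_500` (unit name assumed). A count that keeps growing between runs would match the behaviour described in the report.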
Created attachment 1133904: nova-api.log
I can confirm that I've hit this many times while testing the upgrades in a virt environment. The fix discussed on IRC yesterday, restarting openstack-nova-api after upgrading the undercloud, seems to fix it for me. I've added the restart to the tripleoclient undercloud upgrade at https://review.openstack.org/#/c/293960/
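
Until a fixed client is installed, the restart can be done by hand on the undercloud after upgrading it. The following is a minimal sketch, assuming the service runs under systemd with the unit name openstack-nova-api; the function name restart_nova_api and the DRY_RUN switch are hypothetical and exist only in this sketch. It is not the code from the review linked above.

```shell
#!/bin/sh
# Sketch only: manual form of the workaround, run on the undercloud
# after it has been upgraded to 8.0.
# Assumptions: systemd unit is named openstack-nova-api;
# restart_nova_api and DRY_RUN are illustrative names.

restart_nova_api() {
    unit="openstack-nova-api"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "systemctl restart ${unit}"
        return 0
    fi
    # Restart the API service, then report whether it came back up.
    sudo systemctl restart "${unit}" && systemctl is-active "${unit}"
}
```

Calling `restart_nova_api` performs the restart; setting `DRY_RUN=1` first only prints what would be run. After the restart, the overcloud update command from step 3 can be retried.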
Verified.
Environment: python-tripleoclient-0.3.4-2.el7ost.noarch
Was able to upgrade OC 7.3 to 8.0.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-0604.html