Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1371274

Summary: Osp-Director-10 : Undercloud upgrade OSP9 -> OSP10 fails (Error: /etc/systemd/system.conf:#DefaultTimeoutStartSec=90s too slow? )
Product: Red Hat OpenStack
Reporter: Omri Hochman <ohochman>
Component: rhosp-director
Assignee: Angus Thomas <athomas>
Status: CLOSED NOTABUG
QA Contact: Omri Hochman <ohochman>
Severity: high
Docs Contact:
Priority: high
Version: 10.0 (Newton)
CC: augol, dbecker, jcoufal, mandreou, mburns, mcornea, morazi, ohochman, rhel-osp-director-maint, sathlang
Target Milestone: ---
Keywords: Triaged
Target Release: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-10-14 15:48:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1337794
Attachments: Full-Nova-Compute.log

Description Omri Hochman 2016-08-29 18:24:44 UTC
Osp-Director-10 :  Undercloud  upgrade  OSP9 -> OSP10 fails (IncompatibleObjectVersion: Version 1.19 of ServiceList is not supported)

Environment: 
------------
instack-5.0.0-0.20160802165724.5aabf5c.el7ost.noarch
instack-undercloud-5.0.0-0.20160818065636.41ef775.el7ost.noarch
puppet-heat-9.1.0-0.20160815142726.d364553.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-33.el7ost.noarch
openstack-heat-api-7.0.0-0.20160822053245.7c70288.el7ost.noarch
openstack-heat-engine-7.0.0-0.20160822053245.7c70288.el7ost.noarch
python-heatclient-1.3.0-0.20160802194627.44dfe53.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-0.20160820164503.6c537d2.1.el7ost.noarch
openstack-heat-api-cfn-7.0.0-0.20160822053245.7c70288.el7ost.noarch
openstack-heat-common-7.0.0-0.20160822053245.7c70288.el7ost.noarch
openstack-heat-templates-0.0.1-0.20160802165947.051822a.el7ost.noarch
python-heat-tests-7.0.0-0.20160822053245.7c70288.el7ost.noarch
heat-cfntools-1.3.0-2.el7ost.noarch


Steps:
-------
(1) Deploy Latest puddle of OSP9 

(2) Follow the Guide to upgrade undercloud to OSP10 https://gitlab.cee.redhat.com/sathlang/ospd-9-to-10-upgrade

yum repolist -v enabled
sudo systemctl list-units 'openstack-*'

# doesn't need 10-director anymore.
sudo rhos-release -P 10 -d
sudo yum-config-manager --disable 'rhelosp-9.0*'
yum repolist -v enabled

sudo yum update -y

openstack undercloud upgrade

---------
Results:
---------
Undercloud upgrade from OSP9 to OSP10 fails.

Console view (suggests an issue with nova-compute):
---------------------------------------------------

2016-08-24 00:12:03 - Error: Could not start Service[nova-compute]: Execution of '/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed because the control process exited with error code. See "systemctl status openstack-nova-compute.service" and "journalctl -xe" for details.
2016-08-24 00:12:03 - Wrapped exception:
2016-08-24 00:12:03 - Execution of '/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed because the control process exited with error code. See "systemctl status openstack-nova-compute.service" and "journalctl -xe" for details.
2016-08-24 00:12:03 - Error: /Stage[main]/Nova::Compute/Nova::Generic_service[compute]/Service[nova-compute]/ensure: change from stopped to running failed: Could not start Service[nova-compute]: Execution of '/bin/systemctl start openstack-nova-compute' returned 1: Job for openstack-nova-compute.service failed because the control process exited with error code. See "systemctl status openstack-nova-compute.service" and "journalctl -xe" for details.
2016-08-24 00:12:03 - Notice: /Stage[main]/Nova::Compute/Nova::Generic_service[compute]/Service[nova-compute]: Triggered 'refresh' from 1 events
2016-08-24 00:12:03 - Notice: /Stage[main]/Nova::Deps/Anchor[nova::service::end]: Dependency Service[nova-compute] has failures: true
2016-08-24 00:12:03 - Warning: /Stage[main]/Nova::Deps/Anchor[nova::service::end]: Skipping because of failed dependencies
2016-08-24 00:12:03 - Notice: /Stage[main]/Nova::Deps/Anchor[nova::service::end]: Triggered 'refresh' from 5 events
2016-08-24 00:12:03 - Notice: /Stage[main]/Nova/Exec[networking-refresh]: Dependency Service[nova-compute] has failures: true
2016-08-24 00:12:03 - Warning: /Stage[main]/Nova/Exec[networking-refresh]: Skipping because of failed dependencies
2016-08-24 00:12:13 - Notice: /Stage[main]/Zaqar::Keystone::Auth/Keystone::Resource::Service_identity[zaqar]/Keystone_user_role[zaqar@service]/ensure: created
2016-08-24 00:12:18 - Notice: /Stage[main]/Mistral::Keystone::Auth/Keystone::Resource::Service_identity[mistral]/Keystone_user_role[mistral@service]/ensure: created
2016-08-24 00:12:23 - Notice: /Stage[main]/Zaqar::Keystone::Auth_websocket/Keystone::Resource::Service_identity[zaqar-websocket]/Keystone_user_role[zaqar-websocket@service]/ensure: created
2016-08-24 00:12:46 - Notice: /Stage[main]/Keystone::Roles::Admin/Keystone_user[admin]/password: changed password
2016-08-24 00:13:07 - Notice: /Stage[main]/Glance::Deps/Anchor[glance::service::begin]: Triggered 'refresh' from 3 events
2016-08-24 00:13:07 - Notice: /Stage[main]/Glance::Registry/Service[glance-registry]: Triggered 'refresh' from 1 events
2016-08-24 00:13:08 - Notice: /Stage[main]/Glance::Api/Service[glance-api]: Triggered 'refresh' from 1 events
2016-08-24 00:13:08 - Notice: /Stage[main]/Glance::Deps/Anchor[glance::service::end]: Triggered 'refresh' from 2 events
2016-08-24 00:13:08 - Notice: /Stage[main]/Swift::Storage::Object/Swift::Service[swift-object-updater]/Service[swift-object-updater]: Triggered 'refresh' from 1 events
2016-08-24 00:13:08 - Notice: /Stage[main]/Swift::Storage::Container/Swift::Service[swift-container-updater]/Service[swift-container-updater]: Triggered 'refresh' from 1 events
2016-08-24 00:13:08 - Notice: /Stage[main]/Swift::Deps/Anchor[swift::service::end]: Triggered 'refresh' from 13 events
2016-08-24 00:13:11 - Notice: /Stage[main]/Neutron::Deps/Anchor[neutron::config::end]: Triggered 'refresh' from 27 events
2016-08-24 00:14:08 - Notice: /Stage[main]/Neutron::Db::Sync/Exec[neutron-db-sync]/returns: executed successfully
2016-08-24 00:14:11 - Notice: /Stage[main]/Neutron::Db::Sync/Exec[neutron-db-sync]: Triggered 'refresh' from 1 events
2016-08-24 00:14:11 - Notice: /Stage[main]/Neutron::Deps/Anchor[neutron::dbsync::end]: Triggered 'refresh' from 2 events
2016-08-24 00:14:11 - Notice: /Stage[main]/Neutron::Deps/Anchor[neutron::service::begin]: Triggered 'refresh' from 2 events
2016-08-24 00:14:49 - Notice: /Stage[main]/Neutron::Server/Service[neutron-server]: Triggered 'refresh' from 1 events
2016-08-24 00:14:50 - Notice: /Stage[main]/Neutron::Agents::Dhcp/Service[neutron-dhcp-service]: Triggered 'refresh' from 1 events
2016-08-24 00:16:21 - Notice: /Stage[main]/Neutron::Agents::Ml2::Ovs/Service[neutron-ovs-agent-service]: Triggered 'refresh' from 1 events
2016-08-24 00:16:21 - Notice: /Stage[main]/Neutron::Deps/Anchor[neutron::service::end]: Triggered 'refresh' from 3 events
2016-08-24 00:16:21 - Notice: /Stage[main]/Heat::Deps/Anchor[heat::config::end]: Triggered 'refresh' from 17 events
2016-08-24 00:17:09 - Notice: /Stage[main]/Heat::Db::Sync/Exec[heat-dbsync]/returns: executed successfully
2016-08-24 00:17:11 - Notice: /Stage[main]/Heat::Db::Sync/Exec[heat-dbsync]: Triggered 'refresh' from 1 events
2016-08-24 00:17:11 - Notice: /Stage[main]/Heat::Deps/Anchor[heat::dbsync::end]: Triggered 'refresh' from 2 events
2016-08-24 00:17:11 - Notice: /Stage[main]/Heat::Deps/Anchor[heat::service::begin]: Triggered 'refresh' from 2 events
2016-08-24 00:17:11 - Notice: /Stage[main]/Heat::Api_cfn/Service[heat-api-cfn]: Triggered 'refresh' from 1 events
2016-08-24 00:17:12 - Notice: /Stage[main]/Heat::Engine/Service[heat-engine]: Triggered 'refresh' from 1 events
2016-08-24 00:17:12 - Notice: /Stage[main]/Heat::Api/Service[heat-api]: Triggered 'refresh' from 1 events
2016-08-24 00:17:12 - Notice: /Stage[main]/Heat::Deps/Anchor[heat::service::end]: Triggered 'refresh' from 3 events
2016-08-24 00:17:13 - Notice: /Firewall[998 log all]: Dependency Service[nova-compute] has failures: true
2016-08-24 00:17:13 - Warning: /Firewall[998 log all]: Skipping because of failed dependencies
2016-08-24 00:17:13 - Notice: /Firewall[999 drop all]: Dependency Service[nova-compute] has failures: true
2016-08-24 00:17:13 - Warning: /Firewall[999 drop all]: Skipping because of failed dependencies
2016-08-24 00:17:14 - Notice: Finished catalog run in 921.03 seconds
+ rc=6
+ set -e
+ echo 'puppet apply exited with exit code 6'
puppet apply exited with exit code 6
+ '[' 6 '!=' 2 -a 6 '!=' 0 ']'
+ exit 6
[2016-08-24 00:17:19,468] (os-refresh-config) [ERROR] during configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status 1]

[2016-08-24 00:17:19,468] (os-refresh-config) [ERROR] Aborting...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1069, in install
    _run_orc(instack_env)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 947, in _run_orc
    _run_live_command(args, instack_env, 'os-refresh-config')
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 509, in _run_live_command
    raise RuntimeError('%s failed. See log for details.' % name)
RuntimeError: os-refresh-config failed. See log for details.
Command 'instack-install-undercloud' returned non-zero exit status 1
[stack@undercloud72 ~]$


Checking the service (looks up and running):
--------------------------------------------------


[root@undercloud72 log]# systemctl status openstack-nova-compute.service
● openstack-nova-compute.service - OpenStack Nova Compute Server
   Loaded: loaded (/usr/lib/systemd/system/openstack-nova-compute.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-08-24 00:12:08 EDT; 2h 53min ago
 Main PID: 23156 (nova-compute)
   CGroup: /system.slice/openstack-nova-compute.service
           └─23156 /usr/bin/python2 /usr/bin/nova-compute

Aug 24 00:12:03 undercloud72.localdomain systemd[1]: Starting OpenStack Nova Compute Server...
Aug 24 00:12:08 undercloud72.localdomain systemd[1]: Started OpenStack Nova Compute Server.


less /var/log/nova/nova-compute.log:
-------------------------------------
2016-08-23 23:15:24.260 6256 CRITICAL nova [req-2a6607b8-9829-413c-a79d-45122babd9e8 - - - - -] RemoteError: Remote error: IncompatibleObjectVersion Version 1.19 of ServiceList is not suppor
ted
[u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply\n', u'  File "/usr/lib/python2.7/site
-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch\n    :param in
coming: incoming message\n', u'  File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 92, in object_class_action_versions\n    objclass = nova_object.NovaObject.obj_class_
from_name(\n', u'  File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 374, in obj_class_from_name\n', u'IncompatibleObjectVersion: Version 1.19 of ServiceList is not
 supported\n'].
2016-08-23 23:15:24.260 6256 ERROR nova Traceback (most recent call last):
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/bin/nova-compute", line 10, in <module>
2016-08-23 23:15:24.260 6256 ERROR nova     sys.exit(main())
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/cmd/compute.py", line 60, in main
2016-08-23 23:15:24.260 6256 ERROR nova     db_allowed=CONF.conductor.use_local)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/service.py", line 218, in create
2016-08-23 23:15:24.260 6256 ERROR nova     db_allowed=db_allowed)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/service.py", line 92, in __init__
2016-08-23 23:15:24.260 6256 ERROR nova     self.manager = manager_class(host=self.host, *args, **kwargs)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 535, in __init__
2016-08-23 23:15:24.260 6256 ERROR nova     self.driver = driver.load_compute_driver(self.virtapi, compute_driver)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/virt/driver.py", line 1618, in load_compute_driver
2016-08-23 23:15:24.260 6256 ERROR nova     virtapi)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_utils/importutils.py", line 44, in import_object
2016-08-23 23:15:24.260 6256 ERROR nova     return import_class(import_str)(*args, **kwargs)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 149, in __init__
2016-08-23 23:15:24.260 6256 ERROR nova     self._refresh_hash_ring(nova_context.get_admin_context())
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 526, in _refresh_hash_ring
2016-08-23 23:15:24.260 6256 ERROR nova     ctxt, self._get_hypervisor_type())
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 177, in wrapper
2016-08-23 23:15:24.260 6256 ERROR nova     args, kwargs)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 236, in object_class_action_versions
2016-08-23 23:15:24.260 6256 ERROR nova     args=args, kwargs=kwargs)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call
2016-08-23 23:15:24.260 6256 ERROR nova     retry=self.retry)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 96, in _send
2016-08-23 23:15:24.260 6256 ERROR nova     timeout=timeout, retry=retry)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 464, in send
2016-08-23 23:15:24.260 6256 ERROR nova     retry=retry)
2016-08-23 23:15:24.260 6256 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 455, in _send
2016-08-23 23:15:24.260 6256 ERROR nova     raise result
2016-08-23 23:15:24.260 6256 ERROR nova RemoteError: Remote error: IncompatibleObjectVersion Version 1.19 of ServiceList is not supported
2016-08-23 23:15:24.260 6256 ERROR nova [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch\n    :param incoming: incoming message\n', u'  File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 92, in object_class_action_versions\n    objcsion: Version 1.19 of ServiceList is not supported\n'].
2016-08-23 23:15:24.260 6256 ERROR nova 
2016-08-23 23:15:30.143 6274 WARNING oslo_reports.guru_meditation_report [-] Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2016-08-23 23:15:30.144 6274 WARNING oslo_config.cfg [-] Option "use_local" from group "conductor" is deprecated for removal.  Its value may be silently ignored in the future.
2016-08-23 23:15:30.146 6274 WARNING oslo_config.cfg [-] Option "compute_manager" from group "DEFAULT" is deprecated for removal.  Its value may be silently ignored in the future.
2016-08-23 23:15:30.218 6274 INFO oslo_service.periodic_task [-] Skipping periodic task _sync_power_states because its interval is negative
2016-08-23 23:15:30.240 6274 INFO nova.virt.driver [-] Loading compute driver 'ironic.IronicDriver'
2016-08-23 23:15:30.291 6274 WARNING oslo_config.cfg [req-491a231f-4868-41e6-83ef-c5a8cf67d01a - - - - -] Option "rabbit_virtual_host" from group "oslo_messaging_rabbit" is deprecated for removal.  Its value may be silently ignored in the future.
2016-08-23 23:15:30.292 6274 WARNING oslo_config.cfg [req-491a231f-4868-41e6-83ef-c5a8cf67d01a - - - - -] Option "rabbit_hosts" from group "oslo_messaging_rabbit" is deprecated for removal.  Its value may be silently ignored in the future.
2016-08-23 23:15:30.293 6274 WARNING oslo_config.cfg [req-491a231f-4868-41e6-83ef-c5a8cf67d01a - - - - -] Option "rabbit_userid" from group "oslo_messaging_rabbit" is deprecated for removal.  Its value may be silently ignored in the future.
2016-08-23 23:15:30.293 6274 WARNING oslo_config.cfg [req-491a231f-4868-41e6-83ef-c5a8cf67d01a - - - - -] Option "rabbit_password" from group "oslo_messaging_rabbit" is deprecated for removal.  Its value may be silently ignored in the future.
2016-08-23 23:15:30.332 6274 DEBUG oslo_messaging._drivers.amqpdriver [req-491a231f-4868-41e6-83ef-c5a8cf67d01a - - - - -] CALL msg_id: 8375ae4dcd454994a56a5c723a80bf6d exchange 'nova' topic 'conductor' _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:448
2016-08-23 23:15:30.347 6274 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: 8375ae4dcd454994a56a5c723a80bf6d __call__ /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:296
2016-08-23 23:15:30.349 6274 CRITICAL nova [req-491a231f-4868-41e6-83ef-c5a8cf67d01a - - - - -] RemoteError: Remote error: IncompatibleObjectVersion Version 1.19 of ServiceList is not supported
[u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch\n    :param incoming: incoming message\n', u'  File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 92, in object_class_action_versions\n    objclass = nova_object.NovaObject.obj_class_from_name(\n', u'  File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 374, in obj_class_from_name\n', u'IncompatibleObjectVersion: Version 1.19 of ServiceList is not supported\n'].
2016-08-23 23:15:30.349 6274 ERROR nova Traceback (most recent call last):
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/bin/nova-compute", line 10, in <module>
2016-08-23 23:15:30.349 6274 ERROR nova     sys.exit(main())
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/cmd/compute.py", line 60, in main
2016-08-23 23:15:30.349 6274 ERROR nova     db_allowed=CONF.conductor.use_local)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/service.py", line 218, in create
2016-08-23 23:15:30.349 6274 ERROR nova     db_allowed=db_allowed)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/service.py", line 92, in __init__
2016-08-23 23:15:30.349 6274 ERROR nova     self.manager = manager_class(host=self.host, *args, **kwargs)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 535, in __init__
2016-08-23 23:15:30.349 6274 ERROR nova     self.driver = driver.load_compute_driver(self.virtapi, compute_driver)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/virt/driver.py", line 1618, in load_compute_driver
2016-08-23 23:15:30.349 6274 ERROR nova     virtapi)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_utils/importutils.py", line 44, in import_object
2016-08-23 23:15:30.349 6274 ERROR nova     return import_class(import_str)(*args, **kwargs)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 149, in __init__
2016-08-23 23:15:30.349 6274 ERROR nova     self._refresh_hash_ring(nova_context.get_admin_context())
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 526, in _refresh_hash_ring
2016-08-23 23:15:30.349 6274 ERROR nova     ctxt, self._get_hypervisor_type())
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 177, in wrapper
2016-08-23 23:15:30.349 6274 ERROR nova     args, kwargs)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 236, in object_class_action_versions
2016-08-23 23:15:30.349 6274 ERROR nova     args=args, kwargs=kwargs)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call
2016-08-23 23:15:30.349 6274 ERROR nova     retry=self.retry)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 96, in _send
2016-08-23 23:15:30.349 6274 ERROR nova     timeout=timeout, retry=retry)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 464, in send
2016-08-23 23:15:30.349 6274 ERROR nova     retry=retry)
2016-08-23 23:15:30.349 6274 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 455, in _send
2016-08-23 23:15:30.349 6274 ERROR nova     raise result
2016-08-23 23:15:30.349 6274 ERROR nova RemoteError: Remote error: IncompatibleObjectVersion Version 1.19 of ServiceList is not supported
2016-08-23 23:15:30.349 6274 ERROR nova [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch\n', u'  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch\n    :param incoming: incoming message\n', u'  File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 92, in object_class_action_versions\n    objclass = nova_object.NovaObject.obj_class_from_name(\n', u'  File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 374, in obj_class_from_name\n', u'IncompatibleObjectVersion: Version 1.19 of ServiceList is not supported\n'].
2016-08-23 23:15:30.349 6274 ERROR nova

Comment 2 Omri Hochman 2016-08-29 18:47:55 UTC
Created attachment 1195461 [details]
Full-Nova-Compute.log

Comment 3 Omri Hochman 2016-08-29 18:50:30 UTC
The error was : Error: Duplicate declaration: Ironic_config[DEFAULT/my_ip] is already declared in file /etc/puppet/modules/ironic/manifests/init.pp:417; cannot redeclare at /tmp/tmp.M48wkq881P/puppet-stack- 
(01:48:34 PM) ohochman: undercloud72

Comment 4 Sofer Athlan-Guyot 2016-08-29 20:10:47 UTC
Hi,

When logged into the server the nova-compute service was running fine.

Then, running /usr/share/instack-undercloud/puppet-stack-config/os-refresh-config/configure.d/50-puppet-stack-config  a second time on the server worked well.


Here is the timeline of events:


- 2016-08-23 23:44:00.437:
  - restart of nova-compute with the wrong value for scheduler_host_manager
  - puppet has not changed it yet, but the package is updated
- 2016-08-24 00:12:03.283 5383 ERROR nova
  - last message from nova before restart
- 2016-08-24 00:12:03
  - systemctl gives up on restarting nova-compute, and puppet fails accordingly
- 2016-08-24 00:12:08.525
  - nova-compute starts successfully (this is 5s too late!)

So this really looks like a timeout issue. By default this is
90s, if I interpret this line correctly:

/etc/systemd/system.conf:#DefaultTimeoutStartSec=90s

As for the error mentioned in the title: it happens at 23:15 and could be related to the packages being updated but not completely. This error disappeared later once nova-compute was updated. The timeline for this one is:

2016-08-23 23:15:24.260 6256 CRITICAL -> first error
2016-08-23 23:15:40.875 6391 CRITICAL -> last error
2016-08-23 23:15:45.369 -> restart of nova-compute
no more errors, only the ConfigFileValueError seen above. This proves that nova-compute is updated and matches the ServiceList version, but since puppet has not run yet, the value of scheduler_host_manager is wrong.

All in all:
 1. the error in the title is a transient error due to packages being updated; it disappears once nova-compute is updated (this is the yum upgrade -y step)
 2. the nova-compute service takes more than 90s to restart, but it does restart.

We should try increasing DefaultTimeoutStartSec from its 90s default using:

crudini --set /etc/systemd/system.conf Manager DefaultTimeoutStartSec 180s

Let me know if you still hit the error with this workaround; in that case we will take another look at why nova-compute takes so long to restart.
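The same edit can be sketched without crudini. The script below is a sketch only: it works on a scratch copy that mimics the relevant lines of /etc/systemd/system.conf rather than the real file, and applying the change for real additionally requires a `systemctl daemon-reexec` (or a reboot) before the manager picks up the new timeouts.

```shell
#!/bin/sh
# Sketch only: raise both systemd default timeouts to 180s.
# Works on a scratch copy that mimics /etc/systemd/system.conf;
# point "conf" at the real file (as root) to actually apply it.
set -e
conf=$(mktemp)
cat > "$conf" <<'EOF'
[Manager]
#DefaultTimeoutStartSec=90s
#DefaultTimeoutStopSec=90s
EOF

# Uncomment the directives and set 180s, equivalent to the crudini call above
# (plus the DefaultTimeoutStopSec variant suggested in the next comment).
sed -i -e 's/^#\{0,1\}DefaultTimeoutStartSec=.*/DefaultTimeoutStartSec=180s/' \
       -e 's/^#\{0,1\}DefaultTimeoutStopSec=.*/DefaultTimeoutStopSec=180s/' "$conf"

result=$(grep '^DefaultTimeout' "$conf")
echo "$result"
rm -f "$conf"
```

Running it prints the two rewritten directives, confirming the substitution before you touch the real file.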

Comment 5 Sofer Athlan-Guyot 2016-08-29 20:23:18 UTC
Maybe #DefaultTimeoutStopSec=90s should be increased as well. I cannot tell which of the two was too slow and made the restart fail. Maybe a systemd specialist could help debug this.

Comment 6 Sofer Athlan-Guyot 2016-08-29 20:28:33 UTC
As a side note, to help with debugging, you could modify this file:

/usr/share/instack-undercloud/puppet-stack-config/os-refresh-config/configure.d/50-puppet-stack-config
19c19
< puppet_apply puppet apply --detailed-exitcodes /etc/puppet/manifests/puppet-stack-config.pp
---
> puppet_apply puppet apply --debug --detailed-exitcodes /etc/puppet/manifests/puppet-stack-config.pp


after "yum upgrade -y" and before "openstack undercloud upgrade".

We should then see the systemctl restart command used by puppet, when it ran, and how long it took.
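The diff above amounts to a one-line sed. Here is a sketch that runs against a scratch copy of the script; point it at the real path to actually apply the change.

```shell
#!/bin/sh
# Sketch only: inject --debug into the puppet apply call, as in the diff above.
# Works on a scratch copy; the real script lives at
# /usr/share/instack-undercloud/puppet-stack-config/os-refresh-config/configure.d/50-puppet-stack-config
set -e
target=$(mktemp)
echo 'puppet_apply puppet apply --detailed-exitcodes /etc/puppet/manifests/puppet-stack-config.pp' > "$target"

# Same substitution the diff describes: add --debug before --detailed-exitcodes.
sed -i 's/puppet apply --detailed-exitcodes/puppet apply --debug --detailed-exitcodes/' "$target"

patched=$(cat "$target")
echo "$patched"
rm -f "$target"
```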

Comment 7 Omri Hochman 2016-08-30 18:09:55 UTC
The workaround seems to be valid: 

Execute before upgrade: 
------------------------
sudo crudini --set /etc/systemd/system.conf Manager DefaultTimeoutStartSec 180s 
sudo crudini --set /etc/systemd/system.conf Manager DefaultTimeoutStopSec  180s


Result: 

#############################################################################
Undercloud install complete.

The file containing this installation's passwords is at
/home/stack/undercloud-passwords.conf.

There is also a stackrc file at /home/stack/stackrc.

These files are needed to interact with the OpenStack services, and should be
secured.

#############################################################################

Comment 8 Omri Hochman 2016-08-31 17:31:42 UTC
After re-running the whole test from the beginning and applying the workaround, the undercloud upgrade still fails on the first attempt.

It then succeeds after running 'openstack undercloud upgrade' a second time.

Comment 9 Sofer Athlan-Guyot 2016-08-31 18:49:56 UTC
Hi,

Digging more into the logs, we can see that:

    2016-08-31 14:10:27.515 11426 ERROR oslo.messaging._drivers.impl_rabbit [-] [94a8d6a7-0238-4408-9ad3-c352015e5f21] AMQP server on 192.0.2.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 6 seconds
    . Client port: None
    2016-08-31 14:10:33.883 11426 INFO oslo.messaging._drivers.impl_rabbit [-] [94a8d6a7-0238-4408-9ad3-c352015e5f21] Reconnected to AMQP server on 192.0.2.1:5672 via [amqp] clientwith port 41228.
    2016-08-31 14:20:07.279 11426 WARNING nova.conductor.api [req-438297e9-4d36-49c8-855e-fc329f13d1f3 - - - - -] Timed out waiting for nova-conductor.  Is it running? Or did this service start before nova-conductor?  Reattempting establishment of nova-conductor connection...
    2016-08-31 14:20:07.289 11426 DEBUG oslo_messaging._drivers.amqpdriver [req-438297e9-4d36-49c8-855e-fc329f13d1f3 - - - - -] CALL msg_id: 0cfa0ef8c553448d9163fea06c927c75 exchange 'nova' topic 'conductor' _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:448
    2016-08-31 14:20:07.292 11426 ERROR oslo.messaging._drivers.impl_rabbit [req-438297e9-4d36-49c8-855e-fc329f13d1f3 - - - - -] [a2241e20-b695-4f38-9855-ba31addec64f] AMQP server on 192.0.2.1:5672 is unreachable: [Errno 32] Broken pipe. Trying again in 1 seconds. Client port: 40514
    2016-08-31 14:20:08.314 11426 INFO oslo.messaging._drivers.impl_rabbit [req-438297e9-4d36-49c8-855e-fc329f13d1f3 - - - - -] [a2241e20-b695-4f38-9855-ba31addec64f] Reconnected


In case the formatting doesn't come through:

   14:10:33 -> message from nova-compute
   14:20:07 -> the warning below:

WARNING nova.conductor.api ... Timed out waiting for nova-conductor.
Is it running? Or did this service start before nova-conductor?
Reattempting establishment of nova-conductor connection...

and then it works fine.


Meanwhile, nova-conductor had restarted just before and was fine.

Comment 10 Marios Andreou 2016-10-12 14:20:19 UTC
Since we are now adding the service stop before upgrading the undercloud, as per https://bugzilla.redhat.com/show_bug.cgi?id=1372040#c6 - is this still a valid bug?

I suspect you may not hit this if you add the service stop first?

Comment 11 Marios Andreou 2016-10-14 15:48:50 UTC
I am going to go ahead and close this as NOTABUG for now (assuming comment #10 holds). Omri, please reopen if you disagree or are still hitting this.

thanks

Comment 14 Amit Ugol 2018-05-02 10:54:06 UTC
closed, no need for needinfo.