Description of problem:
Completed all steps up to major-upgrade-pacemaker-converge.yaml. Running the command comes back with a failure, whereas before I have had success.

Version-Release number of selected component (if applicable):

How reproducible:
This has happened twice now.

Steps to Reproduce:
1. Complete the Minor Update
2. Follow the Upgrade instructions
3. Shut down all non-migrated instances and complete the compute/storage upgrade
4. Run the command per the guide for major-upgrade-pacemaker-converge.yaml

Actual results:
Update failed.

Expected results:
Update complete.

---------------
Error from deployment-show

[rlp@paisley-dir ~]$ heat deployment-show ed372707-bdb5-4707-8136-2025d6a5dbb0
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
{ "status": "FAILED", "server_id": "f8e038e1-7e52-418c-9fc6-18f2debdd902", "config_id": "0b5f1d99-bfa6-4135-bc49-26c869c2d689", "output_values": { "deploy_stdout": "\u001b[mNotice: Compiled catalog for overcloud-controller-0.localdomain in environment production in 14.42 seconds\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Keys/Ceph::Key[client.openstack]/Exec[ceph-key-client.openstack]/returns: + ceph-authtool /etc/ceph/ceph.client.openstack.keyring --name client.openstack --add-key '' --cap mon 'allow r' --cap osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics'\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Keys/Ceph::Key[client.openstack]/Exec[ceph-key-client.openstack]/returns: executed successfully\u001b[0m\nam72XCbNtTCnkNX2PRF8knWdA\npassword\nregionOne\n-1\nTrue\nrabbit\nrcFby9wdYpMbrptBbF49uxh7K\n192.168.140.104,192.168.140.103,192.168.140.105\nredis://:gGydTWau2zVUcKWj3Y7d6KcTg.140.100:6379/\n600\nnotifications\n0.0.0.0\nDefault\nDefault\nTrue\ndatabase\nFalse\nhttp://192.168.140.101:5000/v2.0\ndatabase\n4952\nhttp://192.168.140.101:5000\nhttp://192.168.140.101:35357\n\u001b[mNotice: /Stage[main]/Gnocchi::Storage::Ceph/Package[python-cradox]/ensure: created\u001b[0m\n/var/log/ceilometer\n192.168.140.104\n\u001b[mNotice: /Stage[main]/Aodh::Client/Package[python-aodhclient]/ensure: created\u001b[0m\nservice\nceilometer\n/\n60\nservice\nguest\n2\nrcFby9wdYpMbrptBbF49uxh7K\nceilometer\n-1\nmongodb://192.168.140.104:27017,192.168.140.103:27017,192.168.140.105:27017/ceilometer?replicaSet=tripleo\nFalse\n8777\nservice\nhttp://192.168.140.101:8041\ngnocchi_resources.yaml\nlow\n\u001b[mNotice: /Stage[main]/Swift::Proxy/Swift::Service[swift-proxy-server]/Service[swift-proxy-server]/enable: enable changed 'true' to 'false'\u001b[0m\nGnmeJpcZTBha7R7D3qVNQ9MrY\ninternalURL\n\u001b[mNotice: /Stage[main]/Main/Exec[galera-ready]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:42:17.808 4211 INFO gnocchi.cli [-] Upgrading indexer <gnocchi.indexer.sqlalchemy.SQLAlchemyIndexer object at 0x3e34ad0>\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:42:20.840 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 10 attempts left.\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:42:33.847 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 
9 attempts left.\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:42:46.863 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 8 attempts left.\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:42:59.878 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 7 attempts left.\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:43:12.893 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 6 attempts left.\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:43:25.909 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 5 attempts left.\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:43:38.923 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 4 attempts left.\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:43:51.940 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 3 attempts left.\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:04.955 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 2 attempts left.\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:17.971 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 1 attempts left.\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 CRITICAL gnocchi [-] DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi Traceback (most recent call last):\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/bin/gnocchi-upgrade\", line 10, in <module>\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi sys.exit(upgrade())\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib/python2.7/site-packages/gnocchi/cli.py\", line 56, in upgrade\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi create_legacy_resource_types=conf.create_legacy_resource_types)\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib/python2.7/site-packages/gnocchi/indexer/sqlalchemy.py\", line 249, in upgrade\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi with self.facade.writer_connection() as connection:\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib64/python2.7/contextlib.py\", line 17, in __enter__\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi return self.gen.next()\u001b[0m\n\u001b[mNotice: 
/Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py\", line 759, in _transaction_scope\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi allow_async=self._allow_async) as resource:\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib64/python2.7/contextlib.py\", line 17, in __enter__\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi return self.gen.next()\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py\", line 461, in _connection\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi mode=self.mode)\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py\", line 262, in _create_connection\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi self._start()\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py\", line 338, in _start\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi engine_args, maker_args)\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py\", line 362, in _setup_for_connection\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi sql_connection=sql_connection, **engine_kwargs)\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py\", line 152, in create_engine\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi test_conn = _test_connection(engine, max_retries, retry_interval)\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py\", line 334, in _test_connection\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi six.reraise(type(de_ref), de_ref)\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi File \"<string>\", line 2, in reraise\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during 
query')\u001b[0m\n\u001b[mNotice: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 ERROR gnocchi \u001b[0m\n\u001b[mNotice: /Stage[main]/Keystone::Deps/Anchor[keystone::service::end]: Triggered 'refresh' from 2 events\u001b[0m\n\u001b[mNotice: Finished catalog run in 162.66 seconds\u001b[0m\n", "deploy_stderr": "\u001b[1;31mWarning: Scope(Class[Mongodb::Server]): Replset specified, but no replset_members or replset_config provided.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Swift]): swift_hash_suffix has been deprecated and should be replaced with swift_hash_path_suffix, this will be removed as part of the N-cycle\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Keystone]): Execution of db_sync does not depend on $enabled anymore. Please use sync_db instead.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Glance::Api]): The known_stores parameter is deprecated, use stores instead\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Glance::Api]): default_store not provided, it will be automatically set to glance.store.http.Store\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Glance::Registry]): Execution of db_sync does not depend on $manage_service or $enabled anymore. Please use sync_db instead.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Nova::Api]): ec2_listen_port, ec2_workers and keystone_ec2_url are deprecated and have no effect. Deploy openstack/ec2-api instead.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_host'; class ::nova::compute has not been evaluated\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_protocol'; class ::nova::compute has not been evaluated\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_port'; class ::nova::compute has not been evaluated\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_path'; class ::nova::compute has not been evaluated\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Server]): identity_uri, auth_tenant, auth_user, auth_password, auth_region configuration options are deprecated in favor of auth_plugin and related options\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::Dhcp]): The dhcp_delete_namespaces parameter was removed in Mitaka, it does not take any affect\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::L3]): parameter external_network_bridge is deprecated\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::L3]): parameter router_delete_namespaces was removed in Mitaka, it does not take any affect\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::Metadata]): The auth_password parameter is deprecated and was removed in Mitaka release.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::Metadata]): The auth_tenant parameter is deprecated and was removed in Mitaka release.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::Metadata]): The auth_url parameter is deprecated and was removed in Mitaka release.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Ceilometer::Api]): The keystone_auth_uri parameter is deprecated. Please use auth_uri instead.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Ceilometer::Api]): The keystone_identity_uri parameter is deprecated. 
Please use identity_uri instead.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Heat]): \"admin_user\", \"admin_password\", \"admin_tenant_name\" configuration options are deprecated in favor of auth_plugin and related options\u001b[0m\n\u001b[1;31mWarning: You cannot collect exported resources without storeconfigs being set; the collection will be ignored on line 123 in file /etc/puppet/modules/gnocchi/manifests/api.pp\u001b[0m\n\u001b[1;31mWarning: Not collecting exported resources without storeconfigs\u001b[0m\n\u001b[1;31mWarning: Not collecting exported resources without storeconfigs\u001b[0m\n\u001b[1;31mWarning: Scope(Haproxy::Config[haproxy]): haproxy: The $merge_options parameter will default to true in the next major release. Please review the documentation regarding the implications.\u001b[0m\n\u001b[1;31mWarning: Not collecting exported resources without storeconfigs\u001b[0m\n\u001b[1;31mWarning: Not collecting exported resources without storeconfigs\u001b[0m\n\u001b[1;31mWarning: Not collecting exported resources without storeconfigs\u001b[0m\n\u001b[1;31mError: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]: Failed to call refresh: gnocchi-upgrade --config-file /etc/gnocchi/gnocchi.conf --skip-storage --create-legacy-resource-types returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mError: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]: gnocchi-upgrade --config-file /etc/gnocchi/gnocchi.conf --skip-storage --create-legacy-resource-types returned 1 instead of one of [0]\u001b[0m\n", "deploy_status_code": 6 }, "creation_time": "2017-01-16T15:40:59", "updated_time": "2017-01-16T15:44:35", "input_values": { "step": 3, "update_identifier": { "deployment_identifier": 1484578322, "controller_config": { "1": "os-apply-config deployment 4be8e48c-ac35-4f01-9391-93f4c531592d completed,Root CA cert injection not enabled.,TLS not enabled.,None,", "0": "os-apply-config deployment e149bd9d-92ad-4e03-b833-3ec14c8b900b completed,Root CA cert injection not enabled.,TLS not enabled.,None,", "2": "os-apply-config deployment df40fd45-7bb8-41e9-bdd6-4f0e3dbf6227 completed,Root CA cert injection not enabled.,TLS not enabled.,None," }, "allnodes_extra": "none" } }, "action": "CREATE", "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 6", "id": "ed372707-bdb5-4707-8136-2025d6a5dbb0" Additional info:
Hi,

So the error is a failure during /usr/bin/gnocchi-upgrade:

/Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:42:17.808 4211 INFO gnocchi.cli [-] Upgrading indexer <gnocchi.indexer.sqlalchemy.SQLAlchemyIndexer object at 0x3e34ad0>
/Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:42:20.840 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 10 attempts left.

... attempts to connect ...

/Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:17.971 4211 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 1 attempts left.

... and it finally appears to connect, but got the (2013, 'Lost connection to MySQL server during query'):

/Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]/returns: 2017-01-16 15:44:27.982 4211 CRITICAL gnocchi [-] DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')

To start debugging we would need, from all three controllers:
- /var/log/mariadb/mariadb.log
- the output of journalctl -u mariadb
- the output of journalctl -u os-collect-config
- /var/log/gnocchi/gnocchi-upgrade.log, which should be on the bootstrap controller.
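For convenience, one way to gather those on each controller could be something like the following (just a sketch; the output file names are arbitrary, only the log paths and units come from the list above):

# run on each overcloud controller, as root or via sudo
journalctl -u mariadb --no-pager > /tmp/$(hostname)-mariadb-journal.log
journalctl -u os-collect-config --no-pager > /tmp/$(hostname)-os-collect-config.log
tar czf /tmp/$(hostname)-gnocchi-debug.tar.gz \
    /var/log/mariadb/mariadb.log \
    /var/log/gnocchi/gnocchi-upgrade.log \
    /tmp/$(hostname)-mariadb-journal.log \
    /tmp/$(hostname)-os-collect-config.log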
The install this was done on is no longer available, but you should be able to replicate the issue very easily in your lab.
I have replicated the error again. Pulling an SOS report for you.
Created attachment 1243026 [details] sosreport partc
Created attachment 1243027 [details] sosreport part b
Created attachment 1243028 [details] sosreport part a
Error from this run was: "deploy_stderr": "\u001b[1;31mWarning: Scope(Class[Mongodb::Server]): Replset specified, but no replset_members or replset_config provided.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Swift]): swift_hash_suffix has been deprecated and should be replaced with swift_hash_path_suffix, this will be removed as part of the N-cycle\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Keystone]): Execution of db_sync does not depend on $enabled anymore. Please use sync_db instead.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Glance::Api]): The known_stores parameter is deprecated, use stores instead\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Glance::Api]): default_store not provided, it will be automatically set to glance.store.http.Store\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Glance::Registry]): Execution of db_sync does not depend on $manage_service or $enabled anymore. Please use sync_db instead.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Nova::Api]): ec2_listen_port, ec2_workers and keystone_ec2_url are deprecated and have no effect. Deploy openstack/ec2-api instead.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_host'; class ::nova::compute has not been evaluated\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_protocol'; class ::nova::compute has not been evaluated\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_port'; class ::nova::compute has not been evaluated\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::compute::vncproxy_path'; class ::nova::compute has not been evaluated\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Server]): identity_uri, auth_tenant, auth_user, auth_password, auth_region configuration options are deprecated in favor of auth_plugin and related options\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::Dhcp]): The dhcp_delete_namespaces parameter was removed in Mitaka, it does not take any affect\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::L3]): parameter external_network_bridge is deprecated\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::L3]): parameter router_delete_namespaces was removed in Mitaka, it does not take any affect\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::Metadata]): The auth_password parameter is deprecated and was removed in Mitaka release.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::Metadata]): The auth_tenant parameter is deprecated and was removed in Mitaka release.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Neutron::Agents::Metadata]): The auth_url parameter is deprecated and was removed in Mitaka release.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Ceilometer::Api]): The keystone_auth_uri parameter is deprecated. Please use auth_uri instead.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Ceilometer::Api]): The keystone_identity_uri parameter is deprecated. 
Please use identity_uri instead.\u001b[0m\n\u001b[1;31mWarning: Scope(Class[Heat]): \"admin_user\", \"admin_password\", \"admin_tenant_name\" configuration options are deprecated in favor of auth_plugin and related options\u001b[0m\n\u001b[1;31mWarning: You cannot collect exported resources without storeconfigs being set; the collection will be ignored on line 123 in file /etc/puppet/modules/gnocchi/manifests/api.pp\u001b[0m\n\u001b[1;31mWarning: Not collecting exported resources without storeconfigs\u001b[0m\n\u001b[1;31mWarning: Not collecting exported resources without storeconfigs\u001b[0m\n\u001b[1;31mWarning: Scope(Haproxy::Config[haproxy]): haproxy: The $merge_options parameter will default to true in the next major release. Please review the documentation regarding the implications.\u001b[0m\n\u001b[1;31mWarning: Not collecting exported resources without storeconfigs\u001b[0m\n\u001b[1;31mWarning: Not collecting exported resources without storeconfigs\u001b[0m\n\u001b[1;31mWarning: Not collecting exported resources without storeconfigs\u001b[0m\n\u001b[1;31mError: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]: Failed to call refresh: gnocchi-upgrade --config-file /etc/gnocchi/gnocchi.conf --skip-storage --create-legacy-resource-types returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mError: /Stage[main]/Gnocchi::Db::Sync/Exec[gnocchi-db-sync]: gnocchi-upgrade --config-file /etc/gnocchi/gnocchi.conf --skip-storage --create-legacy-resource-types returned 1 instead of one of [0]\u001b[0m\n",
Command used to upgrade (I added the --force-postconfig flag):

openstack overcloud deploy --log-file ~/pilot/upgrade_converge_deployment.log -t 120 --force-postconfig --templates ~/pilot/templates/overcloud -e ~/pilot/templates/overcloud/environments/network-isolation.yaml -e ~/pilot/templates/overcloud/environments/storage-environment.yaml -e ~/pilot/templates/overcloud/environments/puppet-pacemaker.yaml -e ~/pilot/templates/overcloud/environments/major-upgrade-pacemaker-converge.yaml -e ~/pilot/templates/dell-environment.yaml -e ~/pilot/templates/network-environment.yaml --control-flavor control --compute-flavor compute --ceph-storage-flavor ceph-storage --swift-storage-flavor swift-storage --block-storage-flavor block-storage --neutron-public-interface bond1 --neutron-network-type vlan --neutron-disable-tunneling --control-scale 3 --compute-scale 3 --ceph-storage-scale 3 --ntp-server 10.127.1.3 --neutron-network-vlan-ranges physint:201:220,physext --neutron-bridge-mappings physint:br-tenant,physext:br-ex
Hi,

The internal CI and I have tested an OSP8 upgrade and we couldn't reproduce the error.

In the logs I could find a lot of errors, not just the gnocchi-db-upgrade error, so I can't explain why the deployment went on and didn't stop at the first error. Here are the errors from Jan 20 (there are others from the days before, but I'll focus on the latest):

Jan 20 19:22:18: [0] - exit 1: ControllerLoadBalancerDeployment_Step1: Duplicate declaration: User[hacluster]
Jan 20 19:22:34: [1] - exit 1: ControllerOvercloudServicesDeployment_Step4: Duplicate declaration: User[hacluster]

This duplicate declaration error is very strange. It indicates that we use this code from OSP8 https://github.com/openstack/tripleo-heat-templates/blob/b0ba9e8e09d70cb5871a6f343a698e3b481ac297/puppet/manifests/overcloud_controller_pacemaker.pp#L71-L73 but it's not present in mitaka (OSP9).

In the log, from this point on, we can see that the db is not available on controller0 and that pacemaker is moving resources around, which indicates a messed-up pcs cluster status. Then we have the gnocchi-db-upgrade error, and the duplicate declaration error again:

Jan 20 19:25:45: [2] - exit 6: ControllerOvercloudServicesDeployment_Step4: gnocchi-db-upgrade error, cannot reach db.
Jan 20 19:25:50: [3] - exit 1: ControllerOvercloudServicesDeployment_Step5: duplicate declaration

Here we have another error with exactly the same resource name: "ControllerOvercloudServicesDeployment_Step5"

Jan 20 19:29:29: [4] - exit 6: ControllerOvercloudServicesDeployment_Step5: gnocchi-db-upgrade

And so on:

Jan 20 19:29:34: [5] - exit 1: ControllerOvercloudServicesDeployment_Step6: duplicate declaration
Jan 20 19:54:24: [6] - exit 6: ControllerOvercloudServicesDeployment_Step6: gnocchi-db-upgrade
Jan 20 19:54:40: [7] - exit 1: ControllerServicesBaseDeployment_Step2: duplicate declaration
Jan 20 20:08:20: [8] - exit 6: ControllerOvercloudServicesDeployment_Step4: gnocchi-db-upgrade

Eventually we have this last gnocchi-db-upgrade error:

Jan 20 21:23:54: [9] - exit 6: ControllerOvercloudServicesDeployment_Step4: gnocchi-db-upgrade

I don't really get why the deployment didn't stop at the first error, or how we can have the duplicate declaration error, which would indicate the use of OSP8 templates on the undercloud. So maybe I'm missing something here.

I hope that it helps (please have a look at the PPS).

PS: All those errors should be available from the sos report:

grep -E 'deploy_status_code[^0-9]+[1-9]' sosreport-overcloud-controller-0.localdomain-20170120212759/sos_commands/pacemaker/crm_report/overcloud-controller-0.localdomain/journal.log

On the controller node you can use:

journalctl -u os-collect-config | grep -E 'deploy_status_code[^0-9]+[1-9]'

PPS: As a side note, while digging into the log I noticed that this point was not done: in https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/paged/upgrading-red-hat-openstack-platform/chapter-3-director-based-environments-performing-upgrades-to-major-versions in section 3.4.1. Pre-Upgrade Notes for the Overcloud, you have to generate a ceph key:

key=$(ssh heat-admin@${ceph_node} ceph-authtool --gen-print-key)
cat > ceph-client-key.yaml <<EOF
parameter_defaults:
  CephClientKey: '${key}'
EOF

The ceph key I saw in the logs was empty.
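For reference, once ceph-client-key.yaml has been generated on the undercloud as described in the PPS above, it would be passed to the deploy/converge run as an additional environment file. A hedged sketch based on the deploy command quoted earlier in this bug (only the extra -e option is new, and its path is an assumption about where the file was written):

openstack overcloud deploy --force-postconfig --templates ~/pilot/templates/overcloud \
  -e ~/pilot/templates/overcloud/environments/major-upgrade-pacemaker-converge.yaml \
  -e ~/ceph-client-key.yaml \
  ... (all the other -e files and flavor/scale/network options exactly as in the command quoted above)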
==== ERRORS ==== [0]: Jan 20 19:22:18 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:22:18,003] (heat-config) [DEBUG] [2017-01-20 19:22:12,238] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/e83b313e-d02f-41a5-a2ec-4a61243b2829" FACTER_fqdn="overcloud-controller-0.localdomain" FACTER_deploy_config_name="ControllerLoadBalancerDeployment_Step1" puppet apply --detailed-exitcodes /var/lib/heat-config/heat-config-puppet/e83b313e-d02f-41a5-a2ec-4a61243b2829.pp Jan 20 19:22:18 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:22:17,997] (heat-config) [INFO] Return code 1 Jan 20 19:22:18 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:22:17,997] (heat-config) [INFO] Error: Duplicate declaration: User[hacluster] is already declared in file /var/lib/heat-config/heat-config-puppet/e83b313e-d02f-41a5-a2ec-4a61243b2829.pp:91; cannot redeclare at /etc/puppet/modules/pacemaker/manifests/corosync.pp:121 on node overcloud-controller-0.localdomain [1]: Jan 20 19:22:34 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:22:34,452] (heat-config) [DEBUG] [2017-01-20 19:22:30,590] (heat-config) [DEBUG] Running FACTER_heat _outputs_path="/var/run/heat-config/heat-config-puppet/f7e4dfa7-9634-4e19-9669-df33a19bce40" FACTER_fqdn="overcloud-controller-0.localdomain" FACTER_deploy_config_name="ControllerOvercloudS ervicesDeployment_Step4" puppet apply --detailed-exitcodes /var/lib/heat-config/heat-config-puppet/f7e4dfa7-9634-4e19-9669-df33a19bce40.pp Jan 20 19:22:34 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:22:34,446] (heat-config) [INFO] Return code 1 [2]: Jan 20 19:25:45 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:25:45,462] (heat-config) [DEBUG] [2017-01-20 19:22:35,150] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/c64676fa-0b3b-4ae3-b8a7-f1cab5ed6fa5" FACTER_fqdn="overcloud-controller-0.localdomain" FACTER_deploy_config_name="ControllerOvercloudServicesDeployment_Step4" puppet apply --detailed-exitcodes /var/lib/heat-config/heat-config-puppet/c64676fa-0b3b-4ae3-b8a7-f1cab5ed6fa5.pp Jan 20 19:25:45 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:25:45,456] (heat-config) [INFO] Return code 6 [3]: Jan 20 19:25:50 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:25:50,022] (heat-config) [DEBUG] [2017-01-20 19:25:46,227] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/5efb6561-178f-4e15-9767-abb53e2da51f" FACTER_fqdn="overcloud-controller-0.localdomain" FACTER_deploy_config_name="ControllerOvercloudServicesDeployment_Step5" puppet apply --detailed-exitcodes /var/lib/heat-config/heat-config-puppet/5efb6561-178f-4e15-9767-abb53e2da51f.pp Jan 20 19:25:50 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:25:50,017] (heat-config) [INFO] Return code 1 [4]: Jan 20 19:29:29 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:29:29,938] (heat-config) [DEBUG] [2017-01-20 19:25:50,834] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/558bc8c7-90f1-4f74-b489-5bedb84c94c5" FACTER_fqdn="overcloud-controller-0.localdomain" FACTER_deploy_config_name="ControllerOvercloudServicesDeployment_Step5" puppet apply --detailed-exitcodes 
/var/lib/heat-config/heat-config-puppet/558bc8c7-90f1-4f74-b489-5bedb84c94c5.pp Jan 20 19:29:29 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:29:29,933] (heat-config) [INFO] Return code 6 [5]: Jan 20 19:29:34 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:29:34,445] (heat-config) [DEBUG] [2017-01-20 19:29:30,691] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/318a872d-7ef7-4d26-877c-d87d800e4dbb" FACTER_fqdn="overcloud-controller-0.localdomain" FACTER_deploy_config_name="ControllerOvercloudServicesDeployment_Step6" puppet apply --detailed-exitcodes /var/lib/heat-config/heat-config-puppet/318a872d-7ef7-4d26-877c-d87d800e4dbb.pp Jan 20 19:29:34 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:29:34,441] (heat-config) [INFO] Return code 1 [6]: Jan 20 19:54:24 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:54:24,942] (heat-config) [DEBUG] [2017-01-20 19:29:35,134] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/46062b89-3f61-4bd5-9c0b-313e03fc3682" FACTER_fqdn="overcloud-controller-0.localdomain" FACTER_deploy_config_name="ControllerOvercloudServicesDeployment_Step6" puppet apply --detailed-exitcodes /var/lib/heat-config/heat-config-puppet/46062b89-3f61-4bd5-9c0b-313e03fc3682.pp Jan 20 19:54:24 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:54:24,928] (heat-config) [INFO] Return code 6 [7] Jan 20 19:54:40 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:54:40,319] (heat-config) [DEBUG] [2017-01-20 19:54:36,412] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/5a012e7a-726e-41f1-a273-bf355886703a" FACTER_fqdn="overcloud-controller-0.localdomain" FACTER_deploy_config_name="ControllerServicesBaseDeployment_Step2" puppet apply --detailed-exitcodes /var/lib/heat-config/heat-config-puppet/5a012e7a-726e-41f1-a273-bf355886703a.pp Jan 20 19:54:40 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 19:54:40,313] (heat-config) [INFO] Return code 1 [8] Jan 20 20:08:20 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 20:08:20,372] (heat-config) [DEBUG] [2017-01-20 20:05:12,642] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/ef993960-bd4e-47a4-84b4-fd62c2e7abb7" FACTER_fqdn="overcloud-controller-0.localdomain" FACTER_deploy_config_name="ControllerOvercloudServicesDeployment_Step4" puppet apply --detailed-exitcodes /var/lib/heat-config/heat-config-puppet/ef993960-bd4e-47a4-84b4-fd62c2e7abb7.pp Jan 20 20:08:20 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 20:08:20,366] (heat-config) [INFO] Return code 6 [9] Jan 20 21:23:54 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 21:23:54,286] (heat-config) [DEBUG] [2017-01-20 21:20:47,794] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/8aff46f5-6f7b-4fda-acbf-0fd02ab3df84" FACTER_fqdn="overcloud-controller-0.localdomain" FACTER_deploy_config_name="ControllerOvercloudServicesDeployment_Step4" puppet apply --detailed-exitcodes /var/lib/heat-config/heat-config-puppet/8aff46f5-6f7b-4fda-acbf-0fd02ab3df84.pp Jan 20 21:23:54 overcloud-controller-0.localdomain os-collect-config[5189]: [2017-01-20 21:23:54,280] (heat-config) [INFO] Return code 6
I will look into the Ceph key. Not sure what to do about the rest.
Hi,

If you still have the platform, or plan to deploy it again, I would really like to have the output of:

journalctl -u os-collect-config

right at the end of the deployment, on the controller nodes (all the nodes would be even better).

The logs I've seen in the sos report span several days, so maybe I'm missing something about the way it's handled. The idea here would be to check whether that exact same error pattern reproduces, which would indicate a problem in the way the upgrade is done and/or in the environment.

Regards,
Created attachment 1243938 [details] OS-Collect-Config cntl0
Created attachment 1243939 [details] OS-Collect-Config cntl1
Created attachment 1243940 [details] OS-Collect-Config cntl2
Hi,

Thanks for the logs. The logs for cntl0 are the same as the ones I've got, so nothing new here. In cntl1 we have those errors:

Jan 20 19:25:29 overcloud-controller-1.localdomain os-collect-config[4761]: [2017-01-20 19:25:29,577] (heat-config) [INFO] {"deploy_stdout": "", "deploy_stderr": "\u001b[1;31mError: Duplicate declaration: User[hacluster] is already declared in file /var/lib/heat-config/heat-config-puppet/e989f9ea-c90b-4ebf-bd20-53dfaa9f4b53.pp:91; cannot redeclare at /etc/puppet/modules/pacemaker/manifests/corosync.pp:121 on node overcloud-controller-1.localdomain\u001b[0m\n\u001b[1;31mError: Duplicate declaration: User[hacluster] is already declared in file /var/lib/heat-config/heat-config-puppet/e989f9ea-c90b-4ebf-bd20-53dfaa9f4b53.pp:91; cannot redeclare at /etc/puppet/modules/pacemaker/manifests/corosync.pp:121 on node overcloud-controller-1.localdomain\u001b[0m\n", "deploy_status_code": 1}

which is the same duplicate hacluster error as seen on cntl0. This again could indicate the use of the wrong templates on the undercloud during the upgrade. If we take a look at /var/lib/heat-config/heat-config-puppet/e989f9ea-c90b-4ebf-bd20-53dfaa9f4b53.pp we should see that it matches an OSP8 puppet configuration.

As another check, I would like to have a look at /var/lib/heat-config/ on cntl0 and cntl1 (we can ignore cntl2, it should be the same as cntl1):

tar cfJ ctnlX.tar.xz /var/lib/heat-config/

The idea here would be to confirm that we are polluted with OSP8 configuration scripts during the upgrade. Next would be to redo the deployment using the cephclientkey configuration, as it will put that out of the way.
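As a quicker spot check on cntl1 (just a sketch; the UUID path is the one from the error quoted above), looking at the manifest around line 91, where the error says User[hacluster] is declared, would show whether the OSP8-style pacemaker manifest is being applied:

# where is hacluster declared in the generated manifest?
grep -n "hacluster" /var/lib/heat-config/heat-config-puppet/e989f9ea-c90b-4ebf-bd20-53dfaa9f4b53.pp
# show the lines around the declaration reported at :91
sed -n '85,95p' /var/lib/heat-config/heat-config-puppet/e989f9ea-c90b-4ebf-bd20-53dfaa9f4b53.pp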
I have taken the cluster down and am working on installing again. Has anyone else made it to this step and been successful?
We have cluster ready for testing here.
Just recapping a bit what we discussed/observed in the call today.

The symptom that was observed after the convergence step is that a bunch of pacemaker resources did not start. The reason for this was that there was an /etc/my.cnf.d/server.cnf file generated by puppet that contained bind-address = 127.0.0.1.

The investigation will continue tomorrow and we will try to figure out why, and at which step, such a file would be created by puppet. The odd thing is that such a file should not even be created/managed. This would only happen if the hiera variable "enable_galera" was set to false (which we observed being set to true on the controllers).

We will do another session where the setup has been through the upgrades up to step 6 (but not including it). Before doing step 6 (the major-upgrade-pacemaker step), we will verify what state the controllers are in (is server.cnf present, services, os-collect-config state, etc.).

We observed, both today and in the sosreport attached to this case, that server.cnf is dated before the galera.cnf file (in today's lab by 30 mins and in the sosreports by a day), which suggests that server.cnf gets created in a step before the convergence one (although we can't yet be 100% sure). We tried looking at the logs in these sosreports, but they actually got rotated after the server.cnf creation date, so we cannot infer much as to what created it.
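For the pre-step-6 verification, a minimal check on each controller might look like the following (a sketch, assuming the enable_galera hiera key quoted above and that the hiera CLI picks up the node's hiera configuration as it normally does on overcloud nodes):

# what value would puppet see for the hiera variable?
sudo hiera enable_galera
# compare timestamps of the mysql config files (server.cnf vs galera.cnf)
ls -l --time-style=full-iso /etc/my.cnf.d/
# is anything currently forcing a loopback bind-address?
grep -rn "bind-address" /etc/my.cnf.d/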
I have asked for a listing of /etc/my.cnf.d between each step during this install and, if a server.cnf appears, we will stop the process.
Can we do something quick short term? Like removing the file from puppet control? Or adding a manual step, which we can automate in the upgrade script, to change an entry in a file. We should do the right solution, including pushing it upstream correctly, but this is a release blocker and we need a workaround for the release yesterday... Why has QE not bumped into it?
FYI - server.cnf is installed as part of OSP8. Here are the contents, and as you can see most settings are commented out:

[heat-admin@overcloud-controller-0 ~]$ cat /etc/my.cnf.d/server.cnf
#
# These groups are read by MariaDB server.
# Use it for options that only the server (but not clients) should see
#
# See the examples of server my.cnf files in /usr/share/mysql/
#

# this is read by the standalone daemon and embedded servers
[server]

# this is only for the mysqld standalone daemon
[mysqld]

#
# * Galera-related settings
#
[galera]
# Mandatory settings
#wsrep_provider=
#wsrep_cluster_address=
#binlog_format=row
#default_storage_engine=InnoDB
#innodb_autoinc_lock_mode=2
#bind-address=0.0.0.0
#
# Optional setting
#wsrep_slave_threads=1
#innodb_flush_log_at_trx_commit=0

# this is only for embedded server
[embedded]

# This group is only read by MariaDB-5.5 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mysqld-5.5]

# These two groups are only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid

[mariadb-5.5]

[heat-admin@overcloud-controller-0 ~]$
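On the short-term workaround question above: a minimal sketch of the kind of manual/scripted step being discussed, assuming the problematic entry is the bind-address = 127.0.0.1 line that puppet wrote into /etc/my.cnf.d/server.cnf (illustrative only, not a validated or supported procedure):

# on each controller, comment out the bad bind-address before the converge step restarts mysqld
sudo sed -i 's/^bind-address\s*=\s*127\.0\.0\.1/#&/' /etc/my.cnf.d/server.cnf
# double-check nothing under /etc/my.cnf.d still forces 127.0.0.1
grep -rn "bind-address" /etc/my.cnf.d/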
Hi,

So the root of the problem is the patch for having VM migration working between controller upgrade and convergence. It was created in https://bugzilla.redhat.com/show_bug.cgi?id=1385143 . This would explain why QA didn't bump into it, as it's not yet officially released.

Part of it is applied during controller upgrade. It creates the default /etc/my.cnf.d/server.cnf, but as mysqld is not restarted at that time it stays unnoticed. At convergence time, when mysqld is restarted, the new bind-address is taken into account and breaks the haproxy/mysql link.

Adding a new review to:
- fix this bug (adding the associated upstream bug)
- ensure working VM migration/creation during all stages.

The code is still currently WIP; I will update the bugzilla when it is ready for consumption. So currently the best course of action would be to not apply the patches for having VM migration working between controller upgrade and convergence.

Furthermore, from the sosreport, it appears that you bump into the error during keystone-migration due to the installed version of the puppet modules. The relevant bug is described in https://bugzilla.redhat.com/show_bug.cgi?id=1414784 . The solution here is to remove the point 2.a/2.b where "yum -y update openstack-puppet-modules" is done.
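To confirm on a given controller whether mysqld actually picked up the 127.0.0.1 bind-address after the restart, a diagnostic sketch (not part of the official upgrade procedure) could be:

# what address is mysqld/galera listening on?
ss -tlnp | grep 3306
# what does the running server think its bind_address is?
mysql -e "SHOW VARIABLES LIKE 'bind_address'"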
Sofer, thank you for getting to the bottom of it.

So how do we deliver fixes for this BZ and https://bugzilla.redhat.com/show_bug.cgi?id=1385143 at the same time? We need both to deliver an upgrade that minimizes disruption on the data plane. Without VM migration, the experience gets worse as the number of compute nodes increases.

Solving one without the other is not very useful.
Hi,

Arkady, yes, the goal is to be able to manipulate VMs in all those states:
- controllers upgraded / compute not upgraded
- controllers upgraded / part of compute upgraded
- controllers upgraded / all computes upgraded / no convergence

Randy, this is still WIP. It looks like it works for VM creation in all those states. VM migration hasn't been tested yet, and it still requires manual intervention, but it's very close. With the latest revision of the patch the server.cnf is not updated anymore, but it still misses a parameter for the nova_api database. I'm going to continue the work and should have most of it covered by Monday. The relevant patch is https://review.openstack.org/#/c/428093/. When it's working I will update here about how to apply it on OSP8 cleanly.
Hi,

I have updated a new version that should solve the last manual trick. It is still not fully tested. Nevertheless, here is how you would apply it:

# BZ 1413686
curl https://review.openstack.org/changes/408669/revisions/current/patch?download | \
  base64 -d | \
  sudo patch -d /usr/share/openstack-tripleo-heat-templates -p1

curl https://review.openstack.org/changes/422837/revisions/current/patch?download | \
  base64 -d | \
  sudo patch -d /usr/share/openstack-tripleo-heat-templates -p1

curl https://review.openstack.org/changes/428093/revisions/current/patch?download | \
  base64 -d | \
  sudo patch -d /usr/share/openstack-tripleo-heat-templates -p1

This is done on the undercloud, and before the controller upgrade step.

Regards,
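A cautious variant (my suggestion, not part of the instructions above) would be to dry-run each patch first, so a partial apply does not leave the templates half-modified; for example, for the first review:

curl https://review.openstack.org/changes/408669/revisions/current/patch?download | \
  base64 -d | \
  sudo patch -d /usr/share/openstack-tripleo-heat-templates -p1 --dry-run

If the dry run reports no rejects, rerun the same pipeline without --dry-run.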
(In reply to arkady kanevsky from comment #23)
> Sofer, thank you for getting to the bottom of it.
>
> So how do we deliver fixes for this BZ and
> https://bugzilla.redhat.com/show_bug.cgi?id=1385143 at the same time?
> We need both to deliver an upgrade that minimizes disruption on the data
> plane. Without VM migration, the experience gets worse as the number of
> compute nodes increases.
>
> Solving one without the other is not very useful.

These should be able to coexist (and Sofer has indicated that he is performing his testing with one layered on top of the other, as described in https://bugzilla.redhat.com/show_bug.cgi?id=1413686#c25).
Thank you for the update. We are now testing the series of patches in sequence.
Thank you, Sofer and Mike. We much appreciate that it was done in such a short time.

One correction to Sofer's instructions for documentation/scripting: the last patch, https://review.openstack.org/#/c/428093, has not been merged yet and hence can change before the final merge. I suggest that for our release we do not add a dependency on unknown code. Instead, use the current version that Sofer submitted: https://review.openstack.org/#/c/428093/11/

This is consistent with our philosophy of locked bits and controlling what code is being used by the customer.
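For the scripted version, one way to pin the download to that exact patch set rather than "current" would be to put the patch-set number into the Gerrit revisions URL (a sketch, assuming Gerrit's REST API accepts the patch-set number as the revision id, which it normally does):

curl https://review.openstack.org/changes/428093/revisions/11/patch?download | \
  base64 -d | \
  sudo patch -d /usr/share/openstack-tripleo-heat-templates -p1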
The tests are in and now we are receiving an error with CephStorage: { "status": "FAILED", "server_id": "87c1ee96-8cb1-46ae-ac19-9391c325ce59", "config_id": "1dd17b4c-01e0-48dd-a5fd-60d823020397", "output_values": { "deploy_stdout": "\u001b[mNotice: Compiled catalog for overcloud-cephstorage-0.localdomain in environment production in 0.96 seconds\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Conf/Ceph_config[client.radosgw.gateway/rgw_keystone_revocation_interval]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Conf/Ceph_config[client.radosgw.gateway/rgw_keystone_url]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Conf/Ceph_config[client.radosgw.gateway/rgw_s3_auth_use_keystone]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Conf/Ceph_config[client.radosgw.gateway/rgw_init_timeout]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Conf/Ceph_config[client.radosgw.gateway/rgw_keystone_admin_token]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Keys/Ceph::Key[client.openstack]/Exec[ceph-key-client.openstack]/returns: + ceph-authtool /etc/ceph/ceph.client.openstack.keyring --name client.openstack --add-key '' --cap mon 'allow r' --cap osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics'\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Keys/Ceph::Key[client.openstack]/Exec[ceph-key-client.openstack]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Conf/Ceph_config[client.radosgw.gateway/rgw_keystone_make_new_tenants]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Conf/Ceph_config[client.radosgw.gateway/rgw_keystone_accepted_roles]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Conf/Ceph_config[client.radosgw.gateway/rgw_keystone_token_cache size]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Ntp::Config/File[/etc/ntp.conf]/content: content changed '{md5}04ef455e1ab8ac186bb2055a3ae65754' to '{md5}895e208998c1be1ae515236df50aef64'\u001b[0m\n\u001b[mNotice: /Stage[main]/Ntp::Service/Service[ntp]: Triggered 'refresh' from 1 events\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-prepare-/dev/sdm]/returns: + test -b /dev/sdm\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-prepare-/dev/sdm]/returns: + ceph-disk prepare /dev/sdm /dev/sdc\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-prepare-/dev/sdm]/returns: WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-prepare-/dev/sdm]/returns: Could not create partition 2 from 34 to 20480033\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-prepare-/dev/sdm]/returns: Unable to set partition 2's name to 'ceph journal'!\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-prepare-/dev/sdm]/returns: Could not change partition 2's type code to 45b0969e-9b03-4f30-b4c6-b4b80ceff106!\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-prepare-/dev/sdm]/returns: Error encountered; not saving changes.\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-prepare-/dev/sdm]/returns: ceph-disk: Error: Command '['/usr/sbin/sgdisk', '--new=2:0:+10000M', '--change-name=2:ceph journal', 
'--partition-guid=2:ca5ffc4f-c062-427e-afd7-37c49ad73ab5', '--typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106', '--mbrtogpt', '--', '/dev/sdc']' returned non-zero exit status 4\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sde]/Exec[ceph-osd-activate-/dev/sde]/returns: + test -b /dev/sde\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sde]/Exec[ceph-osd-activate-/dev/sde]/returns: + test -b /dev/sde\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sde]/Exec[ceph-osd-activate-/dev/sde]/returns: + test -b /dev/sde1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sde]/Exec[ceph-osd-activate-/dev/sde]/returns: + test -f /usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sde]/Exec[ceph-osd-activate-/dev/sde]/returns: + test -b /dev/sde1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sde]/Exec[ceph-osd-activate-/dev/sde]/returns: + ceph-disk activate /dev/sde1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sde]/Exec[ceph-osd-activate-/dev/sde]/returns: === osd.10 === \u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sde]/Exec[ceph-osd-activate-/dev/sde]/returns: Starting Ceph osd.10 on overcloud-cephstorage-0...already running\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sde]/Exec[ceph-osd-activate-/dev/sde]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdd]/Exec[ceph-osd-activate-/dev/sdd]/returns: + test -b /dev/sdd\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdd]/Exec[ceph-osd-activate-/dev/sdd]/returns: + test -b /dev/sdd\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdd]/Exec[ceph-osd-activate-/dev/sdd]/returns: + test -b /dev/sdd1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdd]/Exec[ceph-osd-activate-/dev/sdd]/returns: + test -f /usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdd]/Exec[ceph-osd-activate-/dev/sdd]/returns: + test -b /dev/sdd1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdd]/Exec[ceph-osd-activate-/dev/sdd]/returns: + ceph-disk activate /dev/sdd1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdd]/Exec[ceph-osd-activate-/dev/sdd]/returns: === osd.0 === \u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdd]/Exec[ceph-osd-activate-/dev/sdd]/returns: Starting Ceph osd.0 on overcloud-cephstorage-0...already running\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdd]/Exec[ceph-osd-activate-/dev/sdd]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-activate-/dev/sdm]: Dependency Exec[ceph-osd-prepare-/dev/sdm] has failures: true\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdo]/Exec[ceph-osd-activate-/dev/sdo]/returns: + test -b /dev/sdo\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdo]/Exec[ceph-osd-activate-/dev/sdo]/returns: + test -b /dev/sdo\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdo]/Exec[ceph-osd-activate-/dev/sdo]/returns: + test -b /dev/sdo1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdo]/Exec[ceph-osd-activate-/dev/sdo]/returns: + test -f /usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: 
/Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdo]/Exec[ceph-osd-activate-/dev/sdo]/returns: + test -b /dev/sdo1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdo]/Exec[ceph-osd-activate-/dev/sdo]/returns: + ceph-disk activate /dev/sdo1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdo]/Exec[ceph-osd-activate-/dev/sdo]/returns: === osd.8 === \u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdo]/Exec[ceph-osd-activate-/dev/sdo]/returns: Starting Ceph osd.8 on overcloud-cephstorage-0...already running\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdo]/Exec[ceph-osd-activate-/dev/sdo]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdi]/Exec[ceph-osd-activate-/dev/sdi]/returns: + test -b /dev/sdi\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdi]/Exec[ceph-osd-activate-/dev/sdi]/returns: + test -b /dev/sdi\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdi]/Exec[ceph-osd-activate-/dev/sdi]/returns: + test -b /dev/sdi1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdi]/Exec[ceph-osd-activate-/dev/sdi]/returns: + test -f /usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdi]/Exec[ceph-osd-activate-/dev/sdi]/returns: + test -b /dev/sdi1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdi]/Exec[ceph-osd-activate-/dev/sdi]/returns: + ceph-disk activate /dev/sdi1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdi]/Exec[ceph-osd-activate-/dev/sdi]/returns: === osd.9 === \u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdi]/Exec[ceph-osd-activate-/dev/sdi]/returns: Starting Ceph osd.9 on overcloud-cephstorage-0...already running\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdi]/Exec[ceph-osd-activate-/dev/sdi]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdk]/Exec[ceph-osd-activate-/dev/sdk]/returns: + test -b /dev/sdk\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdk]/Exec[ceph-osd-activate-/dev/sdk]/returns: + test -b /dev/sdk\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdk]/Exec[ceph-osd-activate-/dev/sdk]/returns: + test -b /dev/sdk1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdk]/Exec[ceph-osd-activate-/dev/sdk]/returns: + test -f /usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdk]/Exec[ceph-osd-activate-/dev/sdk]/returns: + test -b /dev/sdk1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdk]/Exec[ceph-osd-activate-/dev/sdk]/returns: + ceph-disk activate /dev/sdk1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdk]/Exec[ceph-osd-activate-/dev/sdk]/returns: === osd.6 === \u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdk]/Exec[ceph-osd-activate-/dev/sdk]/returns: Starting Ceph osd.6 on overcloud-cephstorage-0...already running\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdk]/Exec[ceph-osd-activate-/dev/sdk]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdf]/Exec[ceph-osd-activate-/dev/sdf]/returns: + test -b /dev/sdf\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdf]/Exec[ceph-osd-activate-/dev/sdf]/returns: + test -b /dev/sdf\u001b[0m\n\u001b[mNotice: 
/Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdf]/Exec[ceph-osd-activate-/dev/sdf]/returns: + test -b /dev/sdf1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdf]/Exec[ceph-osd-activate-/dev/sdf]/returns: + test -f /usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdf]/Exec[ceph-osd-activate-/dev/sdf]/returns: + test -b /dev/sdf1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdf]/Exec[ceph-osd-activate-/dev/sdf]/returns: + ceph-disk activate /dev/sdf1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdf]/Exec[ceph-osd-activate-/dev/sdf]/returns: === osd.24 === \u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdf]/Exec[ceph-osd-activate-/dev/sdf]/returns: Starting Ceph osd.24 on overcloud-cephstorage-0...already running\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdf]/Exec[ceph-osd-activate-/dev/sdf]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdj]/Exec[ceph-osd-activate-/dev/sdj]/returns: + test -b /dev/sdj\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdj]/Exec[ceph-osd-activate-/dev/sdj]/returns: + test -b /dev/sdj\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdj]/Exec[ceph-osd-activate-/dev/sdj]/returns: + test -b /dev/sdj1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdj]/Exec[ceph-osd-activate-/dev/sdj]/returns: + test -f /usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdj]/Exec[ceph-osd-activate-/dev/sdj]/returns: + test -b /dev/sdj1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdj]/Exec[ceph-osd-activate-/dev/sdj]/returns: + ceph-disk activate /dev/sdj1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdj]/Exec[ceph-osd-activate-/dev/sdj]/returns: === osd.30 === \u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdj]/Exec[ceph-osd-activate-/dev/sdj]/returns: Starting Ceph osd.30 on overcloud-cephstorage-0...already running\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdj]/Exec[ceph-osd-activate-/dev/sdj]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdg]/Exec[ceph-osd-activate-/dev/sdg]/returns: + test -b /dev/sdg\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdg]/Exec[ceph-osd-activate-/dev/sdg]/returns: + test -b /dev/sdg\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdg]/Exec[ceph-osd-activate-/dev/sdg]/returns: + test -b /dev/sdg1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdg]/Exec[ceph-osd-activate-/dev/sdg]/returns: + test -f /usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdg]/Exec[ceph-osd-activate-/dev/sdg]/returns: + test -b /dev/sdg1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdg]/Exec[ceph-osd-activate-/dev/sdg]/returns: + ceph-disk activate /dev/sdg1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdg]/Exec[ceph-osd-activate-/dev/sdg]/returns: === osd.33 === \u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdg]/Exec[ceph-osd-activate-/dev/sdg]/returns: Starting Ceph osd.33 on overcloud-cephstorage-0...already running\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdg]/Exec[ceph-osd-activate-/dev/sdg]/returns: executed 
successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-prepare-/dev/sdn]/returns: + test -b /dev/sdn\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-prepare-/dev/sdn]/returns: + ceph-disk prepare /dev/sdn /dev/sdc\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-prepare-/dev/sdn]/returns: WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-prepare-/dev/sdn]/returns: Could not create partition 2 from 34 to 20480033\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-prepare-/dev/sdn]/returns: Unable to set partition 2's name to 'ceph journal'!\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-prepare-/dev/sdn]/returns: Could not change partition 2's type code to 45b0969e-9b03-4f30-b4c6-b4b80ceff106!\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-prepare-/dev/sdn]/returns: Error encountered; not saving changes.\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-prepare-/dev/sdn]/returns: ceph-disk: Error: Command '['/usr/sbin/sgdisk', '--new=2:0:+10000M', '--change-name=2:ceph journal', '--partition-guid=2:90a0c431-9082-4436-a432-0410c763952c', '--typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106', '--mbrtogpt', '--', '/dev/sdc']' returned non-zero exit status 4\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-activate-/dev/sdn]: Dependency Exec[ceph-osd-prepare-/dev/sdn] has failures: true\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdl]/Exec[ceph-osd-activate-/dev/sdl]/returns: + test -b /dev/sdl\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdl]/Exec[ceph-osd-activate-/dev/sdl]/returns: + test -b /dev/sdl\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdl]/Exec[ceph-osd-activate-/dev/sdl]/returns: + test -b /dev/sdl1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdl]/Exec[ceph-osd-activate-/dev/sdl]/returns: + test -f /usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdl]/Exec[ceph-osd-activate-/dev/sdl]/returns: + test -b /dev/sdl1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdl]/Exec[ceph-osd-activate-/dev/sdl]/returns: + ceph-disk activate /dev/sdl1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdl]/Exec[ceph-osd-activate-/dev/sdl]/returns: === osd.27 === \u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdl]/Exec[ceph-osd-activate-/dev/sdl]/returns: Starting Ceph osd.27 on overcloud-cephstorage-0...already running\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdl]/Exec[ceph-osd-activate-/dev/sdl]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdh]/Exec[ceph-osd-activate-/dev/sdh]/returns: + test -b /dev/sdh\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdh]/Exec[ceph-osd-activate-/dev/sdh]/returns: + test -b /dev/sdh\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdh]/Exec[ceph-osd-activate-/dev/sdh]/returns: + test -b /dev/sdh1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdh]/Exec[ceph-osd-activate-/dev/sdh]/returns: + test -f 
/usr/lib/udev/rules.d/95-ceph-osd.rules.disabled\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdh]/Exec[ceph-osd-activate-/dev/sdh]/returns: + test -b /dev/sdh1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdh]/Exec[ceph-osd-activate-/dev/sdh]/returns: + ceph-disk activate /dev/sdh1\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdh]/Exec[ceph-osd-activate-/dev/sdh]/returns: === osd.21 === \u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdh]/Exec[ceph-osd-activate-/dev/sdh]/returns: Starting Ceph osd.21 on overcloud-cephstorage-0...already running\u001b[0m\n\u001b[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdh]/Exec[ceph-osd-activate-/dev/sdh]/returns: executed successfully\u001b[0m\n\u001b[mNotice: Finished catalog run in 10.63 seconds\u001b[0m\n", "deploy_stderr": "\u001b[1;31mError: /bin/true # comment to satisfy puppet syntax requirements\nset -ex\nif ! test -b /dev/sdm ; then\n mkdir -p /dev/sdm\nfi\nceph-disk prepare /dev/sdm /dev/sdc\nudevadm settle\n returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mError: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-prepare-/dev/sdm]/returns: change from notrun to 0 failed: /bin/true # comment to satisfy puppet syntax requirements\nset -ex\nif ! test -b /dev/sdm ; then\n mkdir -p /dev/sdm\nfi\nceph-disk prepare /dev/sdm /dev/sdc\nudevadm settle\n returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mWarning: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdm]/Exec[ceph-osd-activate-/dev/sdm]: Skipping because of failed dependencies\u001b[0m\n\u001b[1;31mError: /bin/true # comment to satisfy puppet syntax requirements\nset -ex\nif ! test -b /dev/sdn ; then\n mkdir -p /dev/sdn\nfi\nceph-disk prepare /dev/sdn /dev/sdc\nudevadm settle\n returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mError: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-prepare-/dev/sdn]/returns: change from notrun to 0 failed: /bin/true # comment to satisfy puppet syntax requirements\nset -ex\nif ! test -b /dev/sdn ; then\n mkdir -p /dev/sdn\nfi\nceph-disk prepare /dev/sdn /dev/sdc\nudevadm settle\n returned 1 instead of one of [0]\u001b[0m\n\u001b[1;31mWarning: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/sdn]/Exec[ceph-osd-activate-/dev/sdn]: Skipping because of failed dependencies\u001b[0m\n", "deploy_status_code": 6 }, "creation_time": "2017-02-03T21:24:28", "updated_time": "2017-02-03T21:24:53", "input_values": { "update_identifier": { "cephstorage_config": { "1": "os-apply-config deployment 6b20a379-046b-462a-95bd-4a9e2614e238 completed,Root CA cert injection not enabled.,None,", "0": "os-apply-config deployment 548c2f07-7a31-4aa6-b06b-4256dcf71341 completed,Root CA cert injection not enabled.,None,", "2": "os-apply-config deployment fc341e8e-7f56-4131-ba68-7de5228bd740 completed,Root CA cert injection not enabled.,None," }, "deployment_identifier": 1486155926, "allnodes_extra": "none" } }, "action": "CREATE", "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 6", "id": "c4fa66bd-8c56-45b8-bf55-8deea9fde674" }
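The deploy_stderr above shows the actual failure: ceph-disk prepare exits non-zero because sgdisk cannot create the new 10000M journal partition on the shared journal device /dev/sdc (exit status 4), so the OSDs on /dev/sdm and /dev/sdn are never prepared and their activate execs are skipped. A minimal diagnostic sketch for the affected node (overcloud-cephstorage-0, device names taken from the log above); this is a suggested way to inspect the journal device, not a step from the official upgrade procedure:

# Show the current GPT layout of the shared journal device
sgdisk --print /dev/sdc

# Show free space in MiB; a new 10000M journal partition needs a
# contiguous free region of at least that size
parted /dev/sdc unit MiB print free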
I ran through the upgrade again, successfully this time! Looks like the patch worked. Thank you!
Hi Audra, that's really good news. So, using the command in https://bugzilla.redhat.com/show_bug.cgi?id=1413686#c25, you were able yesterday to complete a successful upgrade with VM migration during the upgrade; that means you must have used https://review.openstack.org/#/c/428093/18 . In my tests of the latest version of my patch I was able to create VMs on both upgraded and non-upgraded compute nodes between the controller upgrade and convergence stages. I'm going to get this merged upstream as soon as possible. A final check will be done by Red Hat QE before landing, especially for migration during the upgrade, which I could not test myself. Regards,
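For reference, a hedged sketch of how such a check might be exercised from the undercloud; the flavor, image, network and compute host names are placeholders, not values from this environment:

# Boot a test instance pinned to a specific compute node (admin only)
openstack server create --flavor m1.small --image cirros \
  --nic net-id=private \
  --availability-zone nova:overcloud-compute-0.localdomain test-vm

# Live-migrate it to another compute node and confirm it stays ACTIVE
nova live-migration test-vm overcloud-compute-1.localdomain
openstack server show test-vm -c status -c OS-EXT-SRV-ATTR:host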
Adding another necessary patch.
Hi, so to have this working you need to apply these patches. Assuming the templates are in /usr/share/openstack-tripleo-heat-templates, the necessary commands are:

curl https://review.openstack.org/changes/408669/revisions/current/patch?download | \
  base64 -d | \
  sudo patch -d /usr/share/openstack-tripleo-heat-templates -p1

curl https://review.openstack.org/changes/422837/revisions/current/patch?download | \
  base64 -d | \
  sudo patch -d /usr/share/openstack-tripleo-heat-templates -p1

curl https://review.openstack.org/changes/428093/revisions/current/patch?download | \
  base64 -d | \
  sudo patch -d /usr/share/openstack-tripleo-heat-templates -p1

The reviews are merged upstream and won't change anymore.
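The same mechanism can be used later to confirm that a given review is present in the running template directory: a reverse dry-run of the patch should exit 0 only if the change has already been applied. This is a suggested check, not part of the documented procedure (review 428093 is reused from the commands above):

curl https://review.openstack.org/changes/428093/revisions/current/patch?download | \
  base64 -d | \
  sudo patch -d /usr/share/openstack-tripleo-heat-templates -p1 -R --dry-run -f \
  && echo "review 428093 is already applied"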
We just verified that all three patches are applied to our install.
(In reply to Randy Perryman from comment #34)
> We just verified that all three patches are applied to our Upgrade patch list.

We are going to validate them and be sure they are in the running template directory, and see if we missed one.
(In reply to Randy Perryman from comment #35)
> (In reply to Randy Perryman from comment #34)
> > We just verified that all three patches are applied to our Upgrade patch list.
> We are going to validate them and be sure they are in the running template directory, and see if we missed one.

We had missed one. After re-running, the upgrade completed successfully!
*** Bug 1426253 has been marked as a duplicate of this bug. ***
*** Bug 1382127 has been marked as a duplicate of this bug. ***
*** Bug 1385143 has been marked as a duplicate of this bug. ***
*** Bug 1396360 has been marked as a duplicate of this bug. ***
*** Bug 1396365 has been marked as a duplicate of this bug. ***
*** Bug 1388521 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0859