Bug 1598891
Summary: | Running "openstack overcloud upgrade run --roles Controller --skip-tags validation" fails | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Darin Sorrentino <dsorrent> | ||||||||||
Component: | openstack-tripleo-heat-templates | Assignee: | Emilien Macchi <emacchi> | ||||||||||
Status: | CLOSED ERRATA | QA Contact: | Marius Cornea <mcornea> | ||||||||||
Severity: | high | Docs Contact: | |||||||||||
Priority: | high | ||||||||||||
Version: | 13.0 (Queens) | CC: | aschultz, bdobreli, ccamacho, cjanisze, cjeanner, dmacpher, emacchi, hbrock, jslagle, mbracho, mburns, mcornea, morazi, sgolovat | ||||||||||
Target Milestone: | z2 | Keywords: | Reopened, Triaged, ZStream | ||||||||||
Target Release: | 13.0 (Queens) | ||||||||||||
Hardware: | Unspecified | ||||||||||||
OS: | Unspecified | ||||||||||||
Whiteboard: | |||||||||||||
Fixed In Version: | openstack-tripleo-heat-templates-8.0.2-47.el7ost | Doc Type: | If docs needed, set a value | ||||||||||
Doc Text: | Story Points: | --- | |||||||||||
Clone Of: | |||||||||||||
: | 1601348 | Environment: | |||||||||||
Last Closed: | 2018-08-29 16:37:56 UTC | Type: | Bug | ||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||
Documentation: | --- | CRM: | |||||||||||
Verified Versions: | Category: | --- | |||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
Embargoed: | |||||||||||||
Bug Depends On: | |||||||||||||
Bug Blocks: | 1601348 | ||||||||||||
Attachments: |
|
Description
Darin Sorrentino
2018-07-06 18:38:10 UTC
Created attachment 1457062: controller-0 sosreport
Created attachment 1457063: controller-1 sosreport
Created attachment 1457064: controller-2 sosreport
The Gnocchi container images appear to be present in the undercloud registry:

(undercloud) [stack@ds-hf-ca-undercloud ~]$ curl -X GET http://172.16.0.11:8787/v2/_catalog
{"repositories":["rhosp13/openstack-aodh-api","rhosp13/openstack-aodh-evaluator","rhosp13/openstack-aodh-listener","rhosp13/openstack-aodh-notifier","rhosp13/openstack-ceilometer-central","rhosp13/openstack-ceilometer-compute","rhosp13/openstack-ceilometer-notification","rhosp13/openstack-cinder-api","rhosp13/openstack-cinder-scheduler","rhosp13/openstack-cinder-volume","rhosp13/openstack-cron","rhosp13/openstack-glance-api","rhosp13/openstack-gnocchi-api","rhosp13/openstack-gnocchi-metricd","rhosp13/openstack-gnocchi-statsd","rhosp13/openstack-haproxy","rhosp13/openstack-heat-api","rhosp13/openstack-heat-api-cfn","rhosp13/openstack-heat-engine","rhosp13/openstack-horizon","rhosp13/openstack-iscsid","rhosp13/openstack-keystone","rhosp13/openstack-mariadb","rhosp13/openstack-memcached","rhosp13/openstack-neutron-dhcp-agent","rhosp13/openstack-neutron-l3-agent","rhosp13/openstack-neutron-metadata-agent","rhosp13/openstack-neutron-openvswitch-agent","rhosp13/openstack-neutron-server","rhosp13/openstack-nova-api","rhosp13/openstack-nova-compute","rhosp13/openstack-nova-conductor","rhosp13/openstack-nova-consoleauth","rhosp13/openstack-nova-libvirt","rhosp13/openstack-nova-novncproxy","rhosp13/openstack-nova-placement-api","rhosp13/openstack-nova-scheduler","rhosp13/openstack-panko-api","rhosp13/openstack-rabbitmq","rhosp13/openstack-redis","rhosp13/openstack-swift-account","rhosp13/openstack-swift-container","rhosp13/openstack-swift-object","rhosp13/openstack-swift-proxy-server"]}
(undercloud) [stack@ds-hf-ca-undercloud ~]$

For reference, this open Gnocchi issue shows the same error, ConnectionError: ('Connection aborted.', BadStatusLine("''",)): https://github.com/gnocchixyz/gnocchi/issues/509

Re-opening this because it happened again on another run.
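
The registry check above can also be scripted rather than read by eye. The following is a minimal sketch, not part of the original report: `missing_images` and `REQUIRED` are hypothetical names, and `CATALOG_JSON` is an abbreviated copy of the `/v2/_catalog` response quoted above, embedded so the example runs without access to the undercloud.

```python
import json

# Abbreviated copy of the /v2/_catalog response quoted above (assumption:
# the subset is enough to illustrate the check; the real catalog is longer).
CATALOG_JSON = (
    '{"repositories":["rhosp13/openstack-aodh-api",'
    '"rhosp13/openstack-gnocchi-api",'
    '"rhosp13/openstack-gnocchi-metricd",'
    '"rhosp13/openstack-gnocchi-statsd",'
    '"rhosp13/openstack-swift-proxy-server"]}'
)

# Hypothetical list of images we want to confirm are in the registry.
REQUIRED = [
    "rhosp13/openstack-gnocchi-api",
    "rhosp13/openstack-gnocchi-metricd",
    "rhosp13/openstack-gnocchi-statsd",
]


def missing_images(catalog_text, required):
    """Return the required repository names absent from the catalog."""
    repos = set(json.loads(catalog_text)["repositories"])
    return sorted(name for name in required if name not in repos)


if __name__ == "__main__":
    missing = missing_images(CATALOG_JSON, REQUIRED)
    print("missing images:", missing)
```

An empty result only shows that the images are published in the registry; it says nothing about whether the containers start in the right order on the overcloud nodes.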
Darin hit this issue again today. After investigating the failed environment, the root cause is that the gnocchi_db_sync container runs before the swift_proxy container. The sync fails because the storage backend for Gnocchi is Swift, which is not yet available at that moment. We need to start gnocchi_db_sync only after the swift_proxy container becomes available.

Start time of the swift_proxy container: 19:52:02

[root@overcloud-controller-0 ~]# docker inspect swift_proxy | grep StartedAt
"StartedAt": "2018-07-10T19:52:02.828526308Z",

Time of the gnocchi db sync failure: 19:51:39

2018-07-10 19:51:34,456 [19] INFO gnocchi.service: Gnocchi version 4.2.3
2018-07-10 19:51:36,896 [19] INFO gnocchi.cli.manage: Upgrading indexer SQLAlchemyIndexer: mysql+pymysql://gnocchi:DVmtaW8AKKchYgAmVq7K4gW8K.1.16/gnocchi?read_default_group=tripleo&read_default_file=/etc/my.cnf.d/tripleo.cnf
2018-07-10 19:51:37,505 [19] INFO gnocchi.cli.manage: Upgrading storage SwiftStorage: gnocchi
2018-07-10 19:51:37,511 [19] INFO gnocchi.cli.manage: Upgrading incoming storage SwiftStorage
2018-07-10 19:51:39,998 [19] CRITICAL root: Traceback (most recent call last):
  File "/usr/bin/gnocchi-upgrade", line 10, in <module>
    sys.exit(upgrade())
  File "/usr/lib/python2.7/site-packages/gnocchi/cli/manage.py", line 73, in upgrade
    i.upgrade(conf.sacks_number)
  File "/usr/lib/python2.7/site-packages/gnocchi/incoming/__init__.py", line 71, in upgrade
    self.set_storage_settings(num_sacks)
  File "/usr/lib/python2.7/site-packages/gnocchi/incoming/swift.py", line 47, in set_storage_settings
    self.swift.put_container(self.CFG_PREFIX)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1773, in put_container
    query_string=query_string)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
    service_token=self.service_token, **kwargs)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1030, in put_container
    conn.request(method, path, '', headers)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 439, in request
    files=files, **self.requests_args)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 422, in _request
    return self.request_session.request(*arg, **kwarg)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 518, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 639, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 488, in send
    raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', BadStatusLine("''",))

Swift logs around that time:

[root@overcloud-controller-0 ~]# grep -v haproxy /var/log/containers/swift/swift.log | grep 'Jul 10 19:5[1-2]'
Jul 10 19:51:21 overcloud-controller-0 container-server: Starting 1
Jul 10 19:51:23 overcloud-controller-0 object-server: Started child 26
Jul 10 19:51:23 overcloud-controller-0 object-server: Started child 27
Jul 10 19:51:26 overcloud-controller-0 object-server: Starting 1
Jul 10 19:51:26 overcloud-controller-0 object-server: Begin object audit "forever" mode (ZBF)
Jul 10 19:51:26 overcloud-controller-0 object-server: Object audit (ZBF). Since Tue Jul 10 19:51:26 2018: Locally: 1 passed, 0 quarantined, 0 errors, files/sec: 370.95, bytes/sec: 0.00, Total time: 0.00, Auditing time: 0.00, Rate: 0.00
Jul 10 19:51:26 overcloud-controller-0 object-server: Begin object audit "forever" mode (ALL)
Jul 10 19:51:27 overcloud-controller-0 account-server: Starting 1
Jul 10 19:51:39 overcloud-controller-0 object-server: Object audit (ZBF) "forever" mode completed: 12.76s. Total quarantined: 0, Total errors: 0, Total files/sec: 50.14, Total bytes/sec: 0.00, Auditing time: 12.49, Rate: 0.98
Jul 10 19:51:41 overcloud-controller-0 account-server: Starting 1
Jul 10 19:51:46 overcloud-controller-0 container-server: Option allow_versions is deprecated. Configure the versioned_writes middleware in the proxy-server instead. This option will be ignored in a future release.
Jul 10 19:51:46 overcloud-controller-0 container-server: Option allow_versions is deprecated. Configure the versioned_writes middleware in the proxy-server instead. This option will be ignored in a future release.
Jul 10 19:51:46 overcloud-controller-0 container-server: Started child 20
Jul 10 19:51:46 overcloud-controller-0 container-server: Started child 21
Jul 10 19:51:46 overcloud-controller-0 object-server: Object audit (ALL). Since Tue Jul 10 19:51:26 2018: Locally: 1 passed, 0 quarantined, 0 errors, files/sec: 0.05, bytes/sec: 9998697.01, Total time: 20.48, Auditing time: 0.00, Rate: 0.00
Jul 10 19:51:47 overcloud-controller-0 container-server: Option allow_versions is deprecated. Configure the versioned_writes middleware in the proxy-server instead. This option will be ignored in a future release.
Jul 10 19:51:47 overcloud-controller-0 container-server: Option allow_versions is deprecated. Configure the versioned_writes middleware in the proxy-server instead. This option will be ignored in a future release.
Jul 10 19:51:47 overcloud-controller-0 container-server: Option allow_versions is deprecated. Configure the versioned_writes middleware in the proxy-server instead. This option will be ignored in a future release.
Jul 10 19:51:47 overcloud-controller-0 container-server: Option allow_versions is deprecated. Configure the versioned_writes middleware in the proxy-server instead. This option will be ignored in a future release.
Jul 10 19:51:48 overcloud-controller-0 object-server: Starting 1
Jul 10 19:51:48 overcloud-controller-0 object-server: Starting object replicator in daemon mode.
Jul 10 19:51:48 overcloud-controller-0 object-server: Starting object replication pass.
Jul 10 19:51:54 overcloud-controller-0 account-server: Starting 1
Jul 10 19:51:55 overcloud-controller-0 account-server: Beginning replication run
Jul 10 19:51:55 overcloud-controller-0 account-server: ERROR reading HTTP response from {'index': 1, u'replication_port': 6002, u'weight': 100.0, u'zone': 1, u'ip': u'172.16.4.13', u'region': 1, u'id': 2, u'replication_ip': u'172.16.4.13', u'meta': u'', u'device': u'd1', u'port': 6002}: Connection refused
Jul 10 19:51:55 overcloud-controller-0 account-server: ERROR reading HTTP response from {'index': 2, u'replication_port': 6002, u'weight': 100.0, u'zone': 1, u'ip': u'172.16.4.18', u'region': 1, u'id': 1, u'replication_ip': u'172.16.4.18', u'meta': u'', u'device': u'd1', u'port': 6002}: Connection refused
Jul 10 19:51:55 overcloud-controller-0 account-server: Replication run OVER
Jul 10 19:51:55 overcloud-controller-0 account-server: Attempted to replicate 1 dbs in 0.00957 seconds (104.49079/s)
Jul 10 19:51:55 overcloud-controller-0 account-server: Removed 0 dbs
Jul 10 19:51:55 overcloud-controller-0 account-server: 0 successes, 2 failures
Jul 10 19:51:55 overcloud-controller-0 account-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:0 remote_merge:0 rsync:0 ts_repl:0
Jul 10 19:51:56 overcloud-controller-0 object-server: Starting 1
Jul 10 19:51:56 overcloud-controller-0 container-server: Starting 1
Jul 10 19:51:57 overcloud-controller-0 object-expirer: Starting 1
Jul 10 19:51:58 overcloud-controller-0 container-server: Starting 1
Jul 10 19:52:02 overcloud-controller-0 account-server: Started child 20
Jul 10 19:52:02 overcloud-controller-0 account-server: Started child 21
Jul 10 19:52:09 overcloud-controller-0 object-server: Begin object audit "forever" mode (ZBF)
Jul 10 19:52:09 overcloud-controller-0 object-server: Object audit (ZBF). Since Tue Jul 10 19:52:09 2018: Locally: 1 passed, 0 quarantined, 0 errors, files/sec: 5.51, bytes/sec: 0.00, Total time: 0.18, Auditing time: 0.00, Rate: 0.00
Jul 10 19:52:10 overcloud-controller-0 container-server: Beginning replication run
Jul 10 19:52:12 overcloud-controller-0 container-server: Replication run OVER
Jul 10 19:52:12 overcloud-controller-0 container-server: Attempted to replicate 26 dbs in 1.84044 seconds (14.12709/s)
Jul 10 19:52:12 overcloud-controller-0 container-server: Removed 0 dbs
Jul 10 19:52:12 overcloud-controller-0 container-server: 52 successes, 0 failures
Jul 10 19:52:12 overcloud-controller-0 container-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:52 remote_merge:0 rsync:0 ts_repl:0
Jul 10 19:52:13 overcloud-controller-0 proxy-server: Adding required filter listing_formats to pipeline at position 4
Jul 10 19:52:13 overcloud-controller-0 proxy-server: Adding required filter gatekeeper to pipeline at position 1
Jul 10 19:52:13 overcloud-controller-0 proxy-server: Pipeline was modified. New pipeline is "catch_errors gatekeeper healthcheck proxy-logging cache listing_formats ratelimit bulk tempurl formpost authtoken keystone staticweb copy container_quotas account_quotas slo dlo versioned_writes proxy-logging proxy-server".
Jul 10 19:52:13 overcloud-controller-0 proxy-server: object_post_as_copy=true is deprecated; This option is now ignored
Jul 10 19:52:13 overcloud-controller-0 proxy-server: Starting Keystone auth_token middleware
Jul 10 19:52:13 overcloud-controller-0 proxy-server: AuthToken middleware is set with keystone_authtoken.service_token_roles_required set to False. This is backwards compatible but deprecated behaviour. Please set this to True.
Jul 10 19:52:13 overcloud-controller-0 proxy-server: Using /var/cache/swift as cache directory for signing certificate
Jul 10 19:52:13 overcloud-controller-0 proxy-server: signing_dir mode is 0755 instead of 0700
Jul 10 19:52:13 overcloud-controller-0 proxy-server: Started child 26
Jul 10 19:52:13 overcloud-controller-0 proxy-server: Started child 27
Jul 10 19:52:15 overcloud-controller-0 proxy-server: Adding required filter listing_formats to pipeline at position 4
Jul 10 19:52:15 overcloud-controller-0 proxy-server: Adding required filter gatekeeper to pipeline at position 1
Jul 10 19:52:15 overcloud-controller-0 proxy-server: Pipeline was modified. New pipeline is "catch_errors gatekeeper healthcheck proxy-logging cache listing_formats ratelimit bulk tempurl formpost authtoken keystone staticweb copy container_quotas account_quotas slo dlo versioned_writes proxy-logging proxy-server".
Jul 10 19:52:15 overcloud-controller-0 proxy-server: Adding required filter listing_formats to pipeline at position 4
Jul 10 19:52:15 overcloud-controller-0 proxy-server: Adding required filter gatekeeper to pipeline at position 1
Jul 10 19:52:15 overcloud-controller-0 proxy-server: Pipeline was modified. New pipeline is "catch_errors gatekeeper healthcheck proxy-logging cache listing_formats ratelimit bulk tempurl formpost authtoken keystone staticweb copy container_quotas account_quotas slo dlo versioned_writes proxy-logging proxy-server".
Jul 10 19:52:15 overcloud-controller-0 proxy-server: object_post_as_copy=true is deprecated; This option is now ignored
Jul 10 19:52:15 overcloud-controller-0 proxy-server: Starting Keystone auth_token middleware
Jul 10 19:52:15 overcloud-controller-0 proxy-server: object_post_as_copy=true is deprecated; This option is now ignored
Jul 10 19:52:15 overcloud-controller-0 proxy-server: Starting Keystone auth_token middleware
Jul 10 19:52:16 overcloud-controller-0 proxy-server: AuthToken middleware is set with keystone_authtoken.service_token_roles_required set to False. This is backwards compatible but deprecated behaviour. Please set this to True.
Jul 10 19:52:16 overcloud-controller-0 proxy-server: AuthToken middleware is set with keystone_authtoken.service_token_roles_required set to False. This is backwards compatible but deprecated behaviour. Please set this to True.
Jul 10 19:52:16 overcloud-controller-0 proxy-server: Using /var/cache/swift as cache directory for signing certificate
Jul 10 19:52:16 overcloud-controller-0 proxy-server: signing_dir mode is 0755 instead of 0700
Jul 10 19:52:16 overcloud-controller-0 proxy-server: Using /var/cache/swift as cache directory for signing certificate
Jul 10 19:52:16 overcloud-controller-0 proxy-server: signing_dir mode is 0755 instead of 0700
Jul 10 19:52:22 overcloud-controller-0 object-server: Object audit (ZBF) "forever" mode completed: 12.95s. Total quarantined: 0, Total errors: 0, Total files/sec: 49.43, Total bytes/sec: 0.00, Auditing time: 12.32, Rate: 0.95
Jul 10 19:52:25 overcloud-controller-0 account-server: Beginning replication run
Jul 10 19:52:25 overcloud-controller-0 account-server: Replication run OVER
Jul 10 19:52:25 overcloud-controller-0 account-server: Attempted to replicate 1 dbs in 0.03578 seconds (27.94853/s)
Jul 10 19:52:25 overcloud-controller-0 account-server: Removed 0 dbs
Jul 10 19:52:25 overcloud-controller-0 account-server: 2 successes, 0 failures
Jul 10 19:52:25 overcloud-controller-0 account-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:2 remote_merge:0 rsync:0 ts_repl:0
Jul 10 19:52:31 overcloud-controller-0 object-server: 971/971 (100.00%) partitions replicated in 43.38s (22.38/sec, 0s remaining)
Jul 10 19:52:31 overcloud-controller-0 object-server: 1942 successes, 0 failures
Jul 10 19:52:31 overcloud-controller-0 object-server: 640 suffixes checked - 0.00% hashed, 0.00% synced
Jul 10 19:52:31 overcloud-controller-0 object-server: Partition times: max 1.2668s, min 0.0058s, med 0.0340s
Jul 10 19:52:31 overcloud-controller-0 object-server: Object replication complete. (0.72 minutes)
Jul 10 19:52:40 overcloud-controller-0 container-server: Beginning replication run
Jul 10 19:52:41 overcloud-controller-0 container-server: Replication run OVER
Jul 10 19:52:41 overcloud-controller-0 container-server: Attempted to replicate 26 dbs in 0.87985 seconds (29.55065/s)
Jul 10 19:52:41 overcloud-controller-0 container-server: Removed 0 dbs
Jul 10 19:52:41 overcloud-controller-0 container-server: 52 successes, 0 failures
Jul 10 19:52:41 overcloud-controller-0 container-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:52 remote_merge:0 rsync:0 ts_repl:0
Jul 10 19:52:52 overcloud-controller-0 object-server: Begin object audit "forever" mode (ZBF)
Jul 10 19:52:52 overcloud-controller-0 object-server: Object audit (ZBF). Since Tue Jul 10 19:52:52 2018: Locally: 1 passed, 0 quarantined, 0 errors, files/sec: 406.35, bytes/sec: 0.00, Total time: 0.00, Auditing time: 0.00, Rate: 0.00
Jul 10 19:52:55 overcloud-controller-0 account-server: Beginning replication run
Jul 10 19:52:55 overcloud-controller-0 account-server: Replication run OVER
Jul 10 19:52:55 overcloud-controller-0 account-server: Attempted to replicate 1 dbs in 0.02268 seconds (44.09931/s)
Jul 10 19:52:55 overcloud-controller-0 account-server: Removed 0 dbs
Jul 10 19:52:55 overcloud-controller-0 account-server: 2 successes, 0 failures
Jul 10 19:52:55 overcloud-controller-0 account-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:2 remote_merge:0 rsync:0 ts_repl:0
[root@overcloud-controller-0 ~]#

Let me know when we get some draft text here. I'll regenerate the OSP13 release notes to include it.

This bug is marked for inclusion in the errata but does not currently contain draft documentation text. To ensure the timely release of this advisory, please provide draft documentation text for this bug as soon as possible. If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".

To add draft documentation text:
* Select the documentation type from the "Doc Type" drop-down field.
* A template will be provided in the "Doc Text" field based on the "Doc Type" value selected. Enter draft text in the "Doc Text" field.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2574
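
The ordering argument in the root-cause comment rests on comparing two timestamps: the `StartedAt` value from `docker inspect swift_proxy` and the CRITICAL line in the gnocchi upgrade log. A minimal sketch of that comparison follows. The helper names are hypothetical, and it assumes the gnocchi log timestamps are in UTC like the Docker value, which the report implies but does not state.

```python
from datetime import datetime


def parse_docker_timestamp(value):
    """Parse a Docker StartedAt value such as 2018-07-10T19:52:02.828526308Z.

    Docker reports nanoseconds, while datetime keeps microseconds only,
    so the fractional part is truncated to six digits.
    """
    head, _, frac = value.rstrip("Z").partition(".")
    micro = (frac + "000000")[:6]
    return datetime.strptime(head + "." + micro, "%Y-%m-%dT%H:%M:%S.%f")


def parse_gnocchi_log_timestamp(value):
    """Parse a gnocchi log timestamp such as 2018-07-10 19:51:39,998."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S,%f")


# Values copied from the comment above.
swift_proxy_started = parse_docker_timestamp("2018-07-10T19:52:02.828526308Z")
db_sync_failed = parse_gnocchi_log_timestamp("2018-07-10 19:51:39,998")

# Assumption: both timestamps are in UTC.
gap = (swift_proxy_started - db_sync_failed).total_seconds()

if __name__ == "__main__":
    if db_sync_failed < swift_proxy_started:
        print("db sync failed %.2f s before swift_proxy started" % gap)
    else:
        print("swift_proxy was already running when db sync failed")
```

With the values from this report, the failure precedes the swift_proxy start by roughly 23 seconds, which matches the conclusion that gnocchi_db_sync ran before its Swift backend was reachable.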