Bug 1702213

Summary: Cinder backup service fails to initialize - "Could not determine which Swift endpoint to use"
Product: Red Hat OpenStack Reporter: Tzach Shefi <tshefi>
Component: openstack-cinder Assignee: Alan Bishop <abishop>
Status: CLOSED ERRATA QA Contact: Tzach Shefi <tshefi>
Severity: high Docs Contact: Tana <tberry>
Priority: medium    
Version: 15.0 (Stein) CC: abishop, cschwede, mburns
Target Milestone: beta Keywords: Triaged
Target Release: 15.0 (Stein)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-cinder-14.0.1-0.20190507170400.bec06e6.el8ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-09-21 11:21:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Overcloud_deploy.sh log and Cinder backup log none

Description Tzach Shefi 2019-04-23 08:38:42 UTC
Created attachment 1557485
Overcloud_deploy.sh log and Cinder backup log

Description of problem: On a pre-installed overcloud, I added the Cinder backup and Barbican YAML files to overcloud_deploy.sh and reran it. The Cinder backup service is now enabled on controller-1, yet its state is down.

Below is the error I noticed in cinder-backup.log. It looks like a config issue with the Swift backend settings for Cinder backup. As this worked on previous releases up to 14, I suspect it's not the fault of Cinder or Cinder backup but rather a missing or incomplete THT setting. Then again, it might be a change in Cinder's backup config, in which case this bug should be moved to cinder.


2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall [-] Fixed interval looping call 'cinder.backup.manager.BackupManager._setup_backup_driver' failed: cinder.exception.BackupDriverException: Could not determine which Swift endpoint to use. This can either be set in the service catalog or with the cinder.conf config option 'backup_swift_url'.
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall Traceback (most recent call last):
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_service/loopingcall.py", line 150, in _run_loop
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall     result = func(*self.args, **self.kw)
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/cinder/backup/manager.py", line 151, in _setup_backup_driver
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall     backup_service = self.service(context=ctxt, db=self.db)
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/cinder/backup/drivers/swift.py", line 159, in __init__
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall     self.initialize()
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/cinder/backup/drivers/swift.py", line 249, in initialize
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall     "Could not determine which Swift endpoint to use. This "
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall cinder.exception.BackupDriverException: Could not determine which Swift endpoint to use. This can either be set in the service catalog or with the cinder.conf config option 'backup_swift_url'.
2019-04-23 07:20:49.511 35 ERROR oslo.service.loopingcall
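
The check behind this error can be illustrated with a simplified, self-contained sketch. This is not the cinder source: the function name `resolve_swift_url`, the local exception class, the catalog layout (a list of entries with `type` and `endpoints`), and the default `catalog_info` value are assumptions made for illustration. It only mirrors what the error message describes: use `backup_swift_url` if it is set, otherwise look for a Swift endpoint in the service catalog, and fail if neither yields a URL.

```python
class BackupDriverException(Exception):
    """Local stand-in for cinder.exception.BackupDriverException."""


def resolve_swift_url(backup_swift_url, service_catalog,
                      catalog_info="object-store:swift:publicURL"):
    """Return a Swift endpoint URL, or raise if none can be determined.

    Hypothetical helper: the argument names and the catalog layout are
    assumptions, not cinder's real interface.
    """
    # An explicitly configured URL takes precedence.
    if backup_swift_url:
        return backup_swift_url

    # Otherwise search the service catalog for a matching service type.
    service_type, _service_name, endpoint_type = catalog_info.split(":")
    for entry in service_catalog or []:
        if entry.get("type") == service_type:
            endpoints = entry.get("endpoints") or [{}]
            url = endpoints[0].get(endpoint_type)
            if url:
                return url

    # Neither source produced a URL: this is the failure seen in the log.
    raise BackupDriverException(
        "Could not determine which Swift endpoint to use. This can either "
        "be set in the service catalog or with the cinder.conf config "
        "option 'backup_swift_url'.")
```

With no configured URL and an empty (or missing) catalog, the sketch raises the same message as the traceback, which is why both the catalog and `backup_swift_url` are worth checking when the service stays down.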

Version-Release number of selected component (if applicable):
rhel8

openstack-cinder-14.0.1-0.20190411150403.dee7292.el8ost.noarch

Wasn't sure which of these is relevant:
python3-heat-agent-puppet-1.8.1-0.20190402070337.ad2a5d1.el8ost.noarch
python3-heat-agent-docker-cmd-1.8.1-0.20190402070337.ad2a5d1.el8ost.noarch
python3-heat-agent-apply-config-1.8.1-0.20190402070337.ad2a5d1.el8ost.noarch
openstack-heat-engine-12.0.0-0.20190410170351.8fa8cc3.el8ost.noarch
python3-heatclient-1.17.0-0.20190312144725.8af5deb.el8ost.noarch
python3-heat-agent-1.8.1-0.20190402070337.ad2a5d1.el8ost.noarch
python3-heat-agent-json-file-1.8.1-0.20190402070337.ad2a5d1.el8ost.noarch
openstack-heat-common-12.0.0-0.20190410170351.8fa8cc3.el8ost.noarch
openstack-heat-monolith-12.0.0-0.20190410170351.8fa8cc3.el8ost.noarch
python3-tripleoclient-heat-installer-11.4.1-0.20190412180345.5ef79e3.el8ost.noarch
python3-heat-agent-ansible-1.8.1-0.20190402070337.ad2a5d1.el8ost.noarch
puppet-heat-14.4.1-0.20190328231137.67be493.el8ost.noarch
openstack-heat-agents-1.8.1-0.20190402070337.ad2a5d1.el8ost.noarch
openstack-heat-api-12.0.0-0.20190410170351.8fa8cc3.el8ost.noarch
heat-cfntools-1.4.2-6.el8ost.noarch
python3-heat-agent-hiera-1.8.1-0.20190402070337.ad2a5d1.el8ost.noarch
openstack-tripleo-heat-templates-10.4.1-0.20190418001525.991fa08.el8ost.noarch


How reproducible:
Unsure, this is the first time I've hit this.

Steps to Reproduce:
1. Deployed overcloud without Cinder backup service.

2. Add Cinder backup (and Barbican) to overcloud_deploy.sh.
I suspect Barbican can be ignored, as it's irrelevant to this bz.

-e /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/barbican.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/barbican-backend-simple-crypto.yaml \

Rerun overcloud_deploy.sh

3. Deployment completed. Cinder's backup service is enabled, which is great, yet its state is down.
 
(overcloud) [stack@undercloud-0 ~]$ cinder service-list
+------------------+-------------------------+------+---------+-------+----------------------------+-----------------+
| Binary           | Host                    | Zone | Status  | State | Updated_at                 | Disabled Reason |
+------------------+-------------------------+------+---------+-------+----------------------------+-----------------+
| cinder-backup    | controller-1            | nova | enabled | down  | 2019-04-23T07:20:49.000000 | -               |
| cinder-scheduler | controller-0            | nova | enabled | up    | 2019-04-23T07:24:51.000000 | -               |
| cinder-scheduler | controller-1            | nova | enabled | up    | 2019-04-23T07:24:52.000000 | -               |
| cinder-scheduler | controller-2            | nova | enabled | up    | 2019-04-23T07:24:52.000000 | -               |
| cinder-volume    | hostgroup@tripleo_iscsi | nova | enabled | up    | 2019-04-23T07:24:53.000000 | -               |
+------------------+-------------------------+------+---------+-------+----------------------------+-----------------+
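
For scripted checks, the table above can be scanned for services that are enabled but down. The following is a minimal sketch that parses the plain-text table as printed; the helper names `parse_service_table` and `down_services` are hypothetical, and the sample contains only two rows copied from the output above.

```python
SAMPLE = """\
+------------------+--------------+------+---------+-------+----------------------------+-----------------+
| Binary           | Host         | Zone | Status  | State | Updated_at                 | Disabled Reason |
+------------------+--------------+------+---------+-------+----------------------------+-----------------+
| cinder-backup    | controller-1 | nova | enabled | down  | 2019-04-23T07:20:49.000000 | -               |
| cinder-scheduler | controller-0 | nova | enabled | up    | 2019-04-23T07:24:51.000000 | -               |
+------------------+--------------+------+---------+-------+----------------------------+-----------------+
"""


def parse_service_table(text):
    """Turn the ASCII table into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip borders ("+---") and anything that is not a table row.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells          # first row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def down_services(text):
    """Return (binary, host) pairs that are enabled but reported down."""
    return [(row["Binary"], row["Host"])
            for row in parse_service_table(text)
            if row["Status"] == "enabled" and row["State"] == "down"]


if __name__ == "__main__":
    print(down_services(SAMPLE))
```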



Actual results:
Backup service is down


Expected results:
Backup service should be enabled and running/up. 

Additional info:

Comment 1 Alan Bishop 2019-04-23 19:51:33 UTC
I examined Tzach's system and the problem is that none of the backup settings are configured in cinder.conf. I think I spotted an error in the new "flattened" THT for the cinder-backup service.
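
A quick way to confirm this kind of finding is to check which backup options are actually present in cinder.conf. The sketch below is an illustration only: the comment does not list which settings were missing, so the option names in `BACKUP_OPTIONS` are an assumed sample (the second is the one named in the error message), and the helper name is hypothetical. It reads the `[DEFAULT]` section from the file contents passed in as text.

```python
import configparser

# Assumed sample of backup-related options to look for; adjust as needed.
BACKUP_OPTIONS = ("backup_driver", "backup_swift_url", "swift_catalog_info")


def configured_backup_options(conf_text):
    """Return the backup options set in [DEFAULT], as a name -> value dict."""
    # strict=False tolerates repeated options; interpolation=None keeps
    # literal '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    defaults = parser.defaults()
    return {name: defaults[name]
            for name in BACKUP_OPTIONS if name in defaults}


if __name__ == "__main__":
    sample = "[DEFAULT]\ndebug = True\n"
    # An empty result means none of the listed backup options are set.
    print(configured_backup_options(sample))
```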

Comment 2 Alan Bishop 2019-04-25 17:22:46 UTC
I tracked this problem to something in cinder. There's another problem with the TripleO templates, but I'll file a separate bug for that issue.

Comment 5 Christian Schwede (cschwede) 2019-05-08 11:24:56 UTC
Build included in latest puddle.

Comment 6 Tzach Shefi 2019-05-14 07:17:48 UTC
Alan, 

Unsure why this isn't ON_QA yet,
as the fixed-in version is present on my deployment:
15-trunk  -p RHOS_TRUNK-15.0-RHEL-8-20190509.n.1
 
Anyway, FYI:
On a Ceph-backed deployment, the Cinder backup (Ceph-backed) service started up fine, as expected.
I'll retry on a Cinder backup (Swift-backed) deployment and report once I have results.

Comment 7 Alan Bishop 2019-05-14 19:50:07 UTC
No doc text required. This was a regression introduced and fixed in stein prior to the release of OSP-15.

Comment 9 Tzach Shefi 2019-05-19 07:39:57 UTC
Verified on: 
15-trunk  -p RHOS_TRUNK-15.0-RHEL-8-20190509.n.1
openstack-cinder-14.0.1-0.20190507170400.bec06e6.el8ost.noarch

Deployed an LVM-based system with the Cinder backup service.
The service is enabled and its state is now up (as opposed to before), ergo bug verified.

(overcloud) [stack@undercloud-0 ~]$ cinder service-list
+------------------+-------------------------+------+---------+-------+----------------------------+-----------------+
| Binary           | Host                    | Zone | Status  | State | Updated_at                 | Disabled Reason |
+------------------+-------------------------+------+---------+-------+----------------------------+-----------------+
| cinder-backup    | controller-0            | nova | enabled | up    | 2019-05-19T07:29:54.000000 | -               |
| cinder-scheduler | controller-0            | nova | enabled | up    | 2019-05-19T07:29:57.000000 | -               |
| cinder-scheduler | controller-1            | nova | enabled | up    | 2019-05-19T07:29:58.000000 | -               |
| cinder-scheduler | controller-2            | nova | enabled | up    | 2019-05-19T07:29:57.000000 | -               |
| cinder-volume    | hostgroup@tripleo_iscsi | nova | enabled | up    | 2019-05-19T07:29:57.000000 | -               |
+------------------+-------------------------+------+---------+-------+----------------------------+-----------------+

Comment 12 errata-xmlrpc 2019-09-21 11:21:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811