Bug 1434344

Summary: openstack-gnocchi-statsd.service can sometimes start only on 1/3 controllers when using swift as backend
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-heat-templates
Assignee: Emilien Macchi <emacchi>
Status: CLOSED ERRATA
QA Contact: Gurenko Alex <agurenko>
Severity: medium
Docs Contact:
Priority: medium
Version: 10.0 (Newton)
CC: dbecker, jdanjou, mburns, mcornea, morazi, pkilambi, rhel-osp-director-maint, tshefi, tvignaud
Target Milestone: rc
Keywords: Triaged
Target Release: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-7.0.1-0.20170927205937.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1568466
Environment:
Last Closed: 2017-12-13 21:18:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1568466

Description Marius Cornea 2017-03-21 10:18:17 UTC
Description of problem:
openstack-gnocchi-statsd.service can sometimes start only on 1/3 controllers when using swift as backend.

The deployment has 3 controllers, 1 compute node and 2 nodes running only Heat services. After deployment:

controller-0 | SUCCESS | rc=0 >>
active
cmd: systemctl is-active openstack-gnocchi-statsd
start: 2017-03-21 10:09:59.074116
end: 2017-03-21 10:09:59.085453
delta: 0:00:00.011337
stdout: active

controller-1 | SUCCESS | rc=0 >>
active
cmd: systemctl is-active openstack-gnocchi-statsd
start: 2017-03-21 10:09:59.305255
end: 2017-03-21 10:09:59.327560
delta: 0:00:00.022305
stdout: active

controller-2 | FAILED | rc=3 >>
failed
cmd: systemctl is-active openstack-gnocchi-statsd
start: 2017-03-21 10:09:59.319084
end: 2017-03-21 10:09:59.337314
delta: 0:00:00.018230
stdout: failed

Checking the logs on a node where the service has failed shows:

[root@controller-2 heat-admin]# tail /var/log/gnocchi/statsd.log 
2017-03-20 22:01:54.502 95788 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1635, in _retry
2017-03-20 22:01:54.502 95788 ERROR gnocchi     self.url, self.token = self.get_auth()
2017-03-20 22:01:54.502 95788 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1587, in get_auth
2017-03-20 22:01:54.502 95788 ERROR gnocchi     timeout=self.timeout)
2017-03-20 22:01:54.502 95788 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 662, in get_auth
2017-03-20 22:01:54.502 95788 ERROR gnocchi     auth_version=auth_version)
2017-03-20 22:01:54.502 95788 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 580, in get_auth_keystone
2017-03-20 22:01:54.502 95788 ERROR gnocchi     raise ClientException(msg)
2017-03-20 22:01:54.502 95788 ERROR gnocchi ClientException: Unauthorized. Check username, password and tenant name/id.
2017-03-20 22:01:54.502 95788 ERROR gnocchi 


Version-Release number of selected component (if applicable):
openstack-gnocchi-indexer-sqlalchemy-3.0.4-3.el7ost.noarch
openstack-gnocchi-statsd-3.0.4-3.el7ost.noarch
openstack-gnocchi-common-3.0.4-3.el7ost.noarch
openstack-gnocchi-metricd-3.0.4-3.el7ost.noarch
puppet-gnocchi-9.5.0-1.el7ost.noarch
python-gnocchi-3.0.4-3.el7ost.noarch
openstack-gnocchi-api-3.0.4-3.el7ost.noarch
openstack-gnocchi-carbonara-3.0.4-3.el7ost.noarch
python-gnocchiclient-2.6.0-1.el7ost.noarch
openstack-tripleo-heat-templates-6.0.0-0.20170307170102.3134785.0rc2.el7ost.noarch

How reproducible:
Only on this kind of deployment so far.

Steps to Reproduce:
1. Deploy OSP10 with 3 controllers, 1 compute node and 2 nodes running only Heat services
2. Check openstack-gnocchi-statsd service on all controllers
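
Step 2 could be scripted roughly as follows. This is only a sketch: it assumes the controllers are reachable over ssh as heat-admin (the user seen in the log prompt above) and that the host names passed in resolve from wherever the script runs. The function name check_statsd is made up for illustration.

```shell
#!/bin/sh
# Sketch: report the state of openstack-gnocchi-statsd on each controller.
# Assumptions (not from the bug report): ssh access as heat-admin and
# resolvable host names. check_statsd is a hypothetical helper name.

SERVICE=openstack-gnocchi-statsd

# Print "<host> <state>" for every host given as an argument.
# Returns non-zero if the service is not active on at least one host.
check_statsd() {
    rc=0
    for host in "$@"; do
        # "systemctl is-active" prints the unit state and exits non-zero
        # unless the unit is active, so ignore the exit code here and
        # decide based on the printed state instead.
        state=$(ssh "heat-admin@$host" systemctl is-active "$SERVICE" </dev/null) || true
        [ -n "$state" ] || state=unknown
        echo "$host $state"
        [ "$state" = active ] || rc=1
    done
    return $rc
}

# Example invocation:
#   check_statsd controller-0 controller-1 controller-2
```

The function only reads state, so it is safe to run repeatedly after a deployment.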

Actual results:
The service is in the failed state on 2/3 controllers.

Expected results:
Service gets started on all controllers.

Additional info:
Adding the sosreports. 

Workaround: manually start the service.
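
The workaround could look like this when run as root on an affected controller. It is a minimal sketch, not a fix for whatever causes the failure; start_if_inactive is a made-up helper name.

```shell
#!/bin/sh
# Sketch of the manual workaround: start the service where it is not active.
# Run as root on the affected controller. start_if_inactive is a
# hypothetical helper name, not part of any package.

SERVICE=openstack-gnocchi-statsd

# Start the service only if it is not already active, then print its state.
# The exit status is that of the final "systemctl is-active" call.
start_if_inactive() {
    if ! systemctl is-active --quiet "$SERVICE"; then
        systemctl start "$SERVICE"
    fi
    systemctl is-active "$SERVICE"
}

# Example invocation:
#   start_if_inactive
```

If the service still reports failed afterwards, /var/log/gnocchi/statsd.log is the place to look, as in the description above.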

Comment 14 errata-xmlrpc 2017-12-13 21:18:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462