Bug 1392752

Summary: Traceback: SSLError Too many open files - in metricd.log
Product: Red Hat OpenStack
Component: openstack-gnocchi
Version: 9.0 (Mitaka)
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Target Milestone: ---
Target Release: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Reporter: Yurii Prokulevych <yprokule>
Assignee: Mehdi ABAAKOUK <mabaakou>
QA Contact: Sasha Smolyak <ssmolyak>
CC: apevec, jdanjou, jschluet, lhh, mabaakou, pkilambi
Keywords: Triaged, ZStream
Last Closed: 2017-07-21 06:41:48 UTC
Type: Bug

Description Yurii Prokulevych 2016-11-08 07:03:25 UTC
Description of problem:
-----------------------
Traceback in gnocchi/metricd.log:

2016-11-08 06:52:46.040 18266 DEBUG gnocchi.storage [-] Expunging deleted metrics process_background_tasks /usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py:191
2016-11-08 06:52:47.888 18267 DEBUG gnocchi.storage [-] Processing new and to delete measures process_background_tasks /usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py:183
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage [-] Unexpected error during measures processing
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage Traceback (most recent call last):
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 185, in process_background_tasks
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     self.process_measures(index, block_size, sync)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/gnocchi/storage/_carbonara.py", line 322, in process_measures
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     block_size, full=sync)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/gnocchi/storage/swift.py", line 142, in _list_metric_with_measures_to_process
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     limit=limit)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1638, in get_container
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     full_listing=full_listing, headers=headers)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1565, in _retry
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     service_token=self.service_token, **kwargs)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 870, in get_container
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     conn.request(method, '%s?%s' % (cont_path, qs), '', headers)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 401, in request
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     files=files, **self.requests_args)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 384, in _request
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     return self.request_session.request(*arg, **kwarg)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 475, in request
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     resp = self.send(prep, **send_kwargs)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 585, in send
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     r = adapter.send(request, **kwargs)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 477, in send
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage     raise SSLError(e, request=request)
2016-11-08 06:52:47.891 18267 ERROR gnocchi.storage SSLError: [Errno 24] Too many open files
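
On Linux, errno 24 is EMFILE: the process has reached its per-process limit on open file descriptors, so the SSL layer cannot open another socket. The following is a minimal sketch, not taken from Gnocchi, that prints the symbolic name of the error and the limits of the current process. The helper name fd_limits is hypothetical.

```python
import errno
import os
import resource


def fd_limits():
    """Return the (soft, hard) RLIMIT_NOFILE pair for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


# Symbolic name and message for the errno raised once the soft limit is reached.
print(errno.errorcode[errno.EMFILE], "-", os.strerror(errno.EMFILE))
print("soft=%d hard=%d" % fd_limits())
```

For an already running process such as metricd, the same limits can be read from the "Max open files" row of /proc/<pid>/limits.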


Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-gnocchi-indexer-sqlalchemy-2.1.3-3.el7ost.noarch
openstack-gnocchi-statsd-2.1.3-3.el7ost.noarch
python-gnocchiclient-2.2.0-1.el7ost.noarch
python-gnocchi-2.1.3-3.el7ost.noarch
openstack-gnocchi-api-2.1.3-3.el7ost.noarch
openstack-gnocchi-common-2.1.3-3.el7ost.noarch
openstack-gnocchi-metricd-2.1.3-3.el7ost.noarch
openstack-gnocchi-carbonara-2.1.3-3.el7ost.noarch

openstack-swift-2.7.0-2.el7ost.noarch
openstack-swift-object-2.7.0-2.el7ost.noarch
openstack-swift-plugin-swift3-1.10-1.el7ost.noarch
openstack-swift-container-2.7.0-2.el7ost.noarch
openstack-swift-proxy-2.7.0-2.el7ost.noarch
python-swiftclient-3.0.0-1.el7ost.noarch
openstack-swift-account-2.7.0-2.el7ost.noarch

Steps to Reproduce:
1. Deploy RHOS-9 with SSL on the overcloud
2. Configure Gnocchi to use the Swift backend
3. Change the polling interval on the computes to 60 seconds
4. Spawn a few VMs
5. Monitor gnocchi/metricd.log

Additional info:
----------------
Virtual setup: 3 controllers + 1 compute + 1 Ceph node
Restarting openstack-gnocchi-metricd makes the traceback go away.

Comment 1 Mehdi ABAAKOUK 2016-11-08 08:44:05 UTC
I haven't found the root cause yet, but the process has far too many eventpoll fds open (>= 1000):

# lsof -p 1437 | tail -10
gnocchi-m 1437 gnocchi 1014u  a_inode                0,9         0     5796 [eventpoll]
gnocchi-m 1437 gnocchi 1015u  a_inode                0,9         0     5796 [eventpoll]
gnocchi-m 1437 gnocchi 1016u  a_inode                0,9         0     5796 [eventpoll]
gnocchi-m 1437 gnocchi 1017u  a_inode                0,9         0     5796 [eventpoll]
gnocchi-m 1437 gnocchi 1018u  a_inode                0,9         0     5796 [eventpoll]
gnocchi-m 1437 gnocchi 1019u  a_inode                0,9         0     5796 [eventpoll]
gnocchi-m 1437 gnocchi 1020u  a_inode                0,9         0     5796 [eventpoll]
gnocchi-m 1437 gnocchi 1021u  a_inode                0,9         0     5796 [eventpoll]
gnocchi-m 1437 gnocchi 1022u  a_inode                0,9         0     5796 [eventpoll]
gnocchi-m 1437 gnocchi 1023u  a_inode                0,9         0     5796 [eventpoll]
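
The same picture can be summarized without lsof by reading the symlinks under /proc/<pid>/fd. This is a rough sketch that assumes a Linux host, where epoll descriptors appear as "anon_inode:[eventpoll]". The function names are hypothetical and not part of Gnocchi.

```python
import collections
import os


def read_fd_targets(pid):
    """Return the symlink target of every entry in /proc/<pid>/fd (Linux only)."""
    fd_dir = "/proc/%d/fd" % pid
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


def classify_fd_target(target):
    """Map a link target to a coarse kind; anonymous inodes keep their full name."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return target
    return "file"


def count_fd_kinds(targets):
    """Count descriptors per kind so that a leaking kind stands out."""
    return collections.Counter(classify_fd_target(t) for t in targets)


# Usage (as root or as the gnocchi user), for the PID shown above:
#   for kind, n in count_fd_kinds(read_fd_targets(1437)).most_common():
#       print(n, kind)
```

A count of "anon_inode:[eventpoll]" that keeps growing between two runs would point at the same leak as the lsof listing.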

This is not due to a thread leak:

# grep Threads /proc/1437/status
Threads:        2

So something keeps opening a new epoll fd without closing the old one.
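
To illustrate the pattern only (the actual code path in this bug was never identified), here is a small Linux-only sketch. It contrasts a hypothetical helper that creates an epoll object per call and keeps it referenced with one that closes it through a context manager. In CPython an unreferenced epoll object is closed as soon as it is garbage collected, so the leaky variant keeps a reference in a list on purpose.

```python
import os
import select


def open_fd_count():
    """Number of descriptors open in this process (entries in /proc/self/fd)."""
    return len(os.listdir("/proc/self/fd"))


def poll_once_leaky(kept):
    """Hypothetical buggy helper: a new epoll fd per call, never closed."""
    ep = select.epoll()
    kept.append(ep)  # the lingering reference keeps the descriptor open
    ep.poll(0)


def poll_once_clean():
    """Same work, but the epoll fd is closed when the with block exits."""
    with select.epoll() as ep:
        ep.poll(0)


def fd_growth(step, rounds):
    """Run step() the given number of times and return the change in open fds."""
    before = open_fd_count()
    for _ in range(rounds):
        step()
    return open_fd_count() - before
```

With the leaky helper the descriptor count grows by one per call until the soft limit is hit, which would match a process sitting at fd 1023.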

Comment 2 Mehdi ABAAKOUK 2016-11-28 12:38:44 UTC
This doesn't occur on OSP10.

Comment 4 Julien Danjou 2017-07-21 06:41:48 UTC
As this does not happen in OSP10 and it's not clear what caused it in OSP9, marking this as closed.