Bug 1280487 - Nova compute manager fails to connect to Neutron; errors from nova-ceph due to file descriptor limits
Status: CLOSED NOTABUG
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 6.0 (Juno)
Hardware: Unspecified Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 6.0 (Juno)
Assigned To: Eoghan Glynn
QA Contact: nlevinki
Keywords: Unconfirmed, ZStream
Depends On:
Blocks:
Reported: 2015-11-11 15:53 EST by Jeremy
Modified: 2015-11-17 15:55 EST (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-17 15:55:57 EST
Type: Bug
Attachments: None

Description Jeremy 2015-11-11 15:53:47 EST
Description of problem:

We are seeing the following errors in /var/log/nova-compute.log on all of our compute nodes, and we need to determine what is causing them:

2015-11-11 07:41:50.239 20900 ERROR nova.compute.manager [-] [instance: 2ae6a0ea-dacf-4944-a915-038547533667] An error occurred while refreshing the network cache.
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667] Traceback (most recent call last):
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 5432, in _heal_instance_info_cache
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     self._get_instance_nw_info(context, instance, use_slave=True)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1292, in _get_instance_nw_info
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     instance)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 599, in get_instance_nw_info
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     port_ids)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 613, in _get_instance_nw_info
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     port_ids)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 1349, in _build_network_info_model
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     data = client.list_ports(**search_opts)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/__init__.py", line 84, in wrapper
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     ret = obj(*args, **kwargs)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 98, in with_params
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     ret = self.function(instance, *args, **kwargs)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 312, in list_ports
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     **_params)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/__init__.py", line 84, in wrapper
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     ret = obj(*args, **kwargs)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 1334, in list
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     for r in self._pagination(collection, path, **params):
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 1347, in _pagination
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     res = self.get(path, params=params)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/__init__.py", line 84, in wrapper
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     ret = obj(*args, **kwargs)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 1320, in get
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     headers=headers, params=params)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/__init__.py", line 84, in wrapper
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     ret = obj(*args, **kwargs)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 1297, in retry_request
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     headers=headers, params=params)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/__init__.py", line 84, in wrapper
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     ret = obj(*args, **kwargs)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 1240, in do_request
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     content_type=self.content_type())
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/neutronclient/client.py", line 187, in do_request
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     self.endpoint_url + url, method, **kwargs)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]   File "/usr/lib/python2.7/site-packages/neutronclient/client.py", line 133, in _cs_request
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]     raise exceptions.ConnectionFailed(reason=e)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667] ConnectionFailed: Connection to neutron failed: HTTPConnectionPool(host='10.29.1.30', port=9696): Max retries exceeded with url: /v2.0/ports.json?tenant_id=6a8d91c2ae924da8b3f2fdb2ec7bd2f9&device_id=2ae6a0ea-dacf-4944-a915-038547533667 (Caused by <class 'socket.error'>: [Errno 24] Too many open files)
2015-11-11 07:41:50.239 20900 TRACE nova.compute.manager [instance: 2ae6a0ea-dacf-4944-a915-038547533667]

Comment 4 Kashyap Chamarthy 2015-11-13 09:48:28 EST
I'm no Ceph expert, but the "Too many open files" error in 2015-11-06-14xxxx/lshkvm211a/var/log/nova/rbd-nova.log suggests you may have to tune your `ulimit` settings.

The following commands list your soft and hard limits, respectively:

    ulimit -Sa
    ulimit -Ha

If those limits are not sufficient for Ceph, you may have to raise them.
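
Keep in mind that `ulimit` reports the limits of the shell you run it in, which can differ from the limits of an already-running service. To see how close a running process actually is to its open-files limit, something like the following rough sketch may help. It is Linux-only, assumes you have permission to read /proc for the target PID, and the function names are made up for illustration:

```python
import os


def open_file_limits(limits_text):
    """Return (soft, hard) for "Max open files" from /proc/<pid>/limits text.

    Each value is an int, or None when the kernel reports "unlimited".
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: "Max open files  <soft>  <hard>  files"
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v)
                         for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def fd_usage(pid):
    """Return (open_fds, soft_limit, hard_limit) for a process (Linux only)."""
    with open("/proc/%d/limits" % pid) as f:
        soft, hard = open_file_limits(f.read())
    # Each entry in /proc/<pid>/fd corresponds to one open descriptor.
    open_fds = len(os.listdir("/proc/%d/fd" % pid))
    return open_fds, soft, hard
```

Calling `fd_usage(pid)` with the PID of the nova-compute process returns the current descriptor count next to its soft and hard limits. If the count sits at or near the soft limit, that would be consistent with the `[Errno 24] Too many open files` in the traceback.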

See also related thread on Ceph users list:

    http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-June/040630.html
