Description of problem:

OpenStack metadata service is unstable. From time to time it stops behaving properly, which translates into HTTP/500 errors seen by instances when booting:

cloud-init[845]: 2014-08-08 04:13:38,554 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [77/120s]: bad status code [500]

I've traced this to the following errors in /var/log/nova/api.log:

2014-08-08 10:13:10.184 4198 WARNING keystoneclient.middleware.auth_token [-] Unable to find authentication token in headers
2014-08-08 10:13:10.795 4199 ERROR nova.api.metadata.handler [-] Failed to get metadata for instance id: 810ff4bd-245d-44b9-96af-439618e0484a
2014-08-08 10:13:10.795 4199 ERROR nova.api.ec2 [-] FaultWrapper: 'NoneType' object has no attribute 'instance'
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2 Traceback (most recent call last):
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/nova/api/ec2/__init__.py", line 87, in __call__
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     return req.get_response(self.application)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/WebOb-1.2.3-py2.6.egg/webob/request.py", line 1296, in send
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     application, catch_exc_info=False)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/WebOb-1.2.3-py2.6.egg/webob/request.py", line 1260, in call_application
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     app_iter = application(self.environ, start_response)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/WebOb-1.2.3-py2.6.egg/webob/dec.py", line 130, in __call__
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     resp = self.call_func(req, *args, **self.kwargs)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/WebOb-1.2.3-py2.6.egg/webob/dec.py", line 195, in call_func
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     return self.func(req, *args, **kwargs)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/nova/api/ec2/__init__.py", line 99, in __call__
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     rv = req.get_response(self.application)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/WebOb-1.2.3-py2.6.egg/webob/request.py", line 1296, in send
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     application, catch_exc_info=False)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/WebOb-1.2.3-py2.6.egg/webob/request.py", line 1260, in call_application
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     app_iter = application(self.environ, start_response)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/WebOb-1.2.3-py2.6.egg/webob/dec.py", line 130, in __call__
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     resp = self.call_func(req, *args, **self.kwargs)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/WebOb-1.2.3-py2.6.egg/webob/dec.py", line 195, in call_func
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     return self.func(req, *args, **kwargs)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/nova/api/metadata/handler.py", line 107, in __call__
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     meta_data = self._handle_instance_id_request(req)
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2   File "/usr/lib/python2.6/site-packages/nova/api/metadata/handler.py", line 203, in _handle_instance_id_request
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2     if meta_data.instance['project_id'] != tenant_id:
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2 AttributeError: 'NoneType' object has no attribute 'instance'
2014-08-08 10:13:10.795 4199 TRACE nova.api.ec2

We also found the following errors in /var/log/neutron/metadata-agent.log:
2014-08-08 06:01:47.493 21793 TRACE neutron.agent.metadata.agent
2014-08-08 06:01:48.504 21793 DEBUG neutron.agent.metadata.agent [-] Request: GET /2009-04-04/meta-data/instance-id HTTP/1.0^M Accept-Encoding: gzip, deflate^M Content-Type: text/plain^M Host: 169.254.169.254^M User-Agent: Python-httplib2/0.7.2 (gzip)^M X-Forwarded-For: 192.168.203.7^M X-Neutron-Router-Id: b61f1285-d307-48be-95b6-b21ddc08d37a __call__ /usr/lib/python2.6/site-packages/neutron/agent/metadata/agent.py:116
2014-08-08 06:01:48.505 21793 DEBUG neutron.openstack.common.lockutils [-] Got semaphore "('neutron.agent.metadata.agent.MetadataProxyHandler._get_router_networks', 'b61f1285-d307-48be-95b6-b21ddc08d37a')" lock /usr/lib/python2.6/site-packages/neutron/openstack/common/lockutils.py:168
2014-08-08 06:01:48.505 21793 DEBUG neutron.openstack.common.lockutils [-] Got semaphore "('neutron.agent.metadata.agent.MetadataProxyHandler._get_ports_for_remote_address', '192.168.203.7', (u'cd7e5f8d-9ba6-4b2e-b83f-3da366a2e4c2',))" lock /usr/lib/python2.6/site-packages/neutron/openstack/common/lockutils.py:168
2014-08-08 06:01:48.565 21793 ERROR neutron.agent.metadata.agent [-] Unexpected error.
2014-08-08 06:01:48.565 21793 TRACE neutron.agent.metadata.agent Traceback (most recent call last):
2014-08-08 06:01:48.565 21793 TRACE neutron.agent.metadata.agent   File "/usr/lib/python2.6/site-packages/neutron/agent/metadata/agent.py", line 120, in __call__
2014-08-08 06:01:48.565 21793 TRACE neutron.agent.metadata.agent     return self._proxy_request(instance_id, tenant_id, req)
2014-08-08 06:01:48.565 21793 TRACE neutron.agent.metadata.agent   File "/usr/lib/python2.6/site-packages/neutron/agent/metadata/agent.py", line 227, in _proxy_request
2014-08-08 06:01:48.565 21793 TRACE neutron.agent.metadata.agent     raise Exception(_('Unexpected response code: %s') % resp.status)
2014-08-08 06:01:48.565 21793 TRACE neutron.agent.metadata.agent Exception: Unexpected response code: 400

After spending quite some time trying to figure out why meta_data came back as None, I got nowhere, so I restarted the Neutron metadata agent:

# service neutron-metadata-agent restart

This solved the issue, and metadata requests started working again. However, this workaround is not acceptable in a production environment. Could you please look into why this is happening?
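For what it's worth, the AttributeError in the nova traceback comes from dereferencing meta_data without checking it for None. The following is a purely illustrative sketch (simplified names, not the actual nova source) of the kind of guard that would turn the crash into a well-defined HTTP error instead of a 500:

```python
# Hypothetical, simplified sketch of the failing logic in
# nova/api/metadata/handler.py _handle_instance_id_request.
# Names and structure are illustrative, not the real nova code.

class HTTPError(Exception):
    def __init__(self, status):
        self.status = status


def handle_instance_id_request(meta_data, tenant_id):
    # The original code dereferenced meta_data unconditionally, so a None
    # result (e.g. instance lookup failure) crashed with AttributeError
    # and surfaced to the instance as HTTP 500.
    if meta_data is None:
        raise HTTPError(404)  # "instance not found" instead of a crash
    if meta_data.instance['project_id'] != tenant_id:
        raise HTTPError(403)  # tenant mismatch
    return meta_data
```

This does not explain *why* the lookup returns None in the first place, but it shows where the 500 originates.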
At the moment I have api.log and metadata.log; I have already asked for the remaining files and will attach them.
Created attachment 925625 [details] api
Created attachment 925626 [details] metadata
Created attachment 925716 [details] server_log
# rpm -qa | grep python-neutron
python-neutron-2013.2.2-5.el6ost.noarch
python-neutronclient-2.3.1-3.el6ost.noarch
The versions of neutron and neutronclient in use lack several significant patches for both performance and stability of the metadata agent. I would try upgrading to the latest package versions before proceeding with debugging.
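When checking whether the upgrade landed, it can help to compare the NVR strings that `rpm -q` prints against a known-good version. A small hypothetical helper (not part of any package; the minimum version used in the example is illustrative, since this bug does not name the fixed build):

```python
# Hypothetical helpers for comparing installed package versions taken from
# "rpm -q" output (name-version-release.arch strings). Illustration only.

def nvr_version(nvr):
    """Extract the version field from an NVR string,
    e.g. 'python-neutron-2013.2.2-5.el6ost.noarch' -> '2013.2.2'."""
    return nvr.rsplit('-', 2)[1]


def version_tuple(version):
    """'2013.2.2' -> (2013, 2, 2) for simple numeric comparison."""
    return tuple(int(part) for part in version.split('.'))


def is_older_than(installed_nvr, minimum_version):
    """True if the installed package predates minimum_version."""
    return version_tuple(nvr_version(installed_nvr)) < version_tuple(minimum_version)
```

For example, `is_older_than('python-neutron-2013.2.2-5.el6ost.noarch', '2013.2.4')` returns True, flagging the reported package as older than a (hypothetical) 2013.2.4 target.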
I have also encountered this problem. Is there any solution?
haichuan: this bug was opened against RHEL-OSP 4.0. More recent versions (5.0, 6.0) include newer packages which should, in theory, address the concerns Ihar raised in comment #10. If you are experiencing this bug in a more recent release of RHEL-OSP, please open a new bug against that version and attach log files from nova-api and neutron-metadata-agent.
OSP 4 is EOL; the customer upgraded to OSP 5 and the issue was resolved.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days