Bug 1308693
| Summary: | 30 seconds response time to ironic node-list on a fresh Red Hat OSP 7 Director undercloud installation | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Andreas Karis <akaris> |
| Component: | openstack-ironic | Assignee: | Lucas Alvares Gomes <lmartins> |
| Status: | CLOSED NOTABUG | QA Contact: | Toure Dunnon <tdunnon> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.0 (Kilo) | CC: | akaris, dyocum, jwaterwo, lmartins, mburns, rhel-osp-director-maint, sbaker, yeylon |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | 7.0 (Kilo) | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-04-04 16:01:14 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Andreas Karis
2016-02-15 18:42:52 UTC
additionally, ironic-conductor does not report anything during this time. ironic-api shows nothing for the first 10 seconds, followed by a token message: ironic-api.log -------------------------- <pause> 2016-02-15 11:53:11.127 7494 DEBUG keystonemiddleware.auth_token [-] Removing headers from request environment: X-Service-Catalog,X-Identity-Status,X-Service-Identity-Status,X-Roles,X-Service-Roles,X-Domain-Name,X-Service-Domain-Name,X-Project-Id,X-Service-Project-Id,X-Project-Domain-Name,X-Service-Project-Domain-Name,X-User-Id,X-Service-User-Id,X-User-Name,X-Service-User-Name,X-Project-Name,X-Service-Project-Name,X-User-Domain-Id,X-Service-User-Domain-Id,X-Domain-Id,X-Service-Domain-Id,X-User-Domain-Name,X-Service-User-Domain-Name,X-Project-Domain-Id,X-Service-Project-Domain-Id,X-Role,X-User,X-Tenant-Name,X-Tenant-Id,X-Tenant _remove_auth_headers /usr/lib/python2.7/site-packages/keystonemiddleware/auth_token/__init__.py:672 -------------------------- I've seen this when the undercloud has entries in /etc/resolv.conf for name servers that cannot be reached. Can you confirm this is causing your issue? If so, the workaround is obvious, but I wonder if ironic could be fixed to behave better when there is bogus name server config. Only when the nameserver cannot be reached or also when it cannot resolve its own reverse DNS entries? I see lots of reverse DNS lookups in the strace for its own IP address. I will check with the customer. Also, I see lots of these. However, in a lab, I can completely stop rabbitmq and ironic list commands still work fine. 2016-02-11 11:52:45.819 20412 DEBUG keystonemiddleware.auth_token [-] Authenticating user token __call__ /usr/lib/python2.7/site-packages/keystonemiddleware/auth_token/__init__.py:581 2016-02-11 11:52:45.824 20412 DEBUG oslo_messaging._drivers.impl_rabbit [-] Received recoverable error from kombu: on_error /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py:810 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit Traceback (most recent call last): 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/utils/__init__.py", line 243, in retry_over_time 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit return fun(*args, **kwargs) 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 237, in connect 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit return self.connection 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 741, in connection 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit self._connection = self._establish_connection() 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 696, in _establish_connection 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit conn = self.transport.establish_connection() 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 112, in establish_connection 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit conn = self.Connection(**opts) 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 183, in __init__ 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit return self._x_open(virtual_host) 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 625, in _x_open 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit self._send_method((10, 40), args) 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 56, in _send_method 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit self.channel_id, method_sig, args, content, 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 221, in write_method 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit write_frame(1, channel, payload) 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 182, in write_frame 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit frame_type, channel, size, payload, 0xce, 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 377, in sendall 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit tail = self.send(data, flags) 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 359, in send 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit total_sent += fd.send(data[total_sent:], flags) 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit error: [Errno 32] Broken pipe 2016-02-11 11:52:45.824 20412 TRACE oslo_messaging._drivers.impl_rabbit 2016-02-11 11:52:45.827 20412 ERROR oslo_messaging._drivers.impl_rabbit [-] AMQP server on 172.24.252.205:5672 is unreachable: [Errno 32] Broken pipe. Trying again in 2 seconds. I only saw this with unreachable name servers, specifically when my VPN connection went down. Hello, adding the node's br-ctlplane IP address into /etc/hosts fixed the issue for 1 out of 2 customers. I will verify with the second customer as well. It seems that ironic is trying to resolv PTR for its own address. If this is the solution for both customers, then we should definitely update the documentation. Let me get back to you to see if it works for the second customer as well. Thanks. Long pauses in keystone requests indicate that the keystone.tokens table has grown without bound. Have you enabled a cron job to clear this out every 24 hours? This is in the tuning section of the Director Install and Config guide. You should also clear out the unusaged heat raw templates table as well since it can grow w/o bound, too: # Flush keystone tokens 0 04 * * * /bin/keystone-manage token_flush # Remove unused raw_templates 0 04 * * * /bin/heat-manage purge_deleted -g days 1 We resolved this issue a long time ago. As far as I remember, it was due to DNS issues. Thanks, Andreas |