| Summary: | Undercloud Ironic sync_power_state failed & lost instance_uuid/info (Switching node to maintenance) | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Anil Dhingra <adhingra> |
| Component: | openstack-ironic | Assignee: | Lucas Alvares Gomes <lmartins> |
| Status: | CLOSED WONTFIX | QA Contact: | Raviv Bar-Tal <rbartal> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.0 (Kilo) | CC: | adhingra, dtantsur, lhh, mburns, rhel-osp-director-maint, srevivo |
| Target Milestone: | async | ||
| Target Release: | 10.0 (Newton) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-10-14 16:18:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Anil Dhingra
2016-09-06 06:25:51 UTC
uploaded complete nova / ironic logs to collab-shell.usersys.redhat.com

(In reply to Anil Dhingra from comment #11)
> uploaded complete nova / ironic logs to collab-shell.usersys.redhat.com

Thanks for that. I looked at the logs and it's all very strange, e.g.:

```
$ grep -Ri deployed ironic-logs-060916.tar.gz/var/log/ironic/ironic-conductor.log | wc -l
0
$ grep -R provision ironic-logs-060916.tar.gz/var/log/ironic/ironic-api.log
2016-09-06 13:17:18.023 16162 INFO ironic.cmd.api [-] conductor.check_provision_state_interval = 60
```

I can't find a single line showing the request to deploy an instance in Ironic; maybe it happened before the logs begin (the earliest entry is from 2016-05-25 03:58:32.179).

Also, the nova-compute logs from 2016-07-24 to 2016-07-29 are filled with errors, more specifically:

```
2016-07-28 03:31:41.174 19392 ERROR nova.compute.manager [-] [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803] Periodic sync_power_state task had an error while processing an instance.
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803] Traceback (most recent call last):
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6100, in _sync
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     query_driver_power_state_and_sync()
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 445, in inner
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     return f(*args, **kwargs)
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6097, in query_driver_power_state_and_sync
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     self._query_driver_power_state_and_sync(context, db_instance)
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6127, in _query_driver_power_state_and_sync
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     vm_instance = self.driver.get_info(db_instance)
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 557, in get_info
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     node = _validate_instance_and_node(self.ironicclient, instance)
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 127, in _validate_instance_and_node
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     return ironicclient.call("node.get_by_instance_uuid", instance.uuid)
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/client_wrapper.py", line 126, in call
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     return self._multi_getattr(client, method)(*args, **kwargs)
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/ironicclient/v1/node.py", line 147, in get_by_instance_uuid
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     nodes = self._list(self._path(path), 'nodes')
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/ironicclient/common/base.py", line 121, in _list
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     resp, body = self.api.json_request('GET', url)
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 353, in json_request
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     resp, body_iter = self._http_request(url, method, **kwargs)
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 162, in wrapper
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     return func(self, url, method, **kwargs)
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 298, in _http_request
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     resp = conn.getresponse()
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib64/python2.7/httplib.py", line 1051, in getresponse
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     response.begin()
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib64/python2.7/httplib.py", line 415, in begin
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     version, status, reason = self._read_status()
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]   File "/usr/lib64/python2.7/httplib.py", line 379, in _read_status
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803]     raise BadStatusLine(line)
2016-07-28 03:31:41.174 19392 TRACE nova.compute.manager [instance: 23b34ae9-81a3-4405-859f-105ce9e6c803] BadStatusLine: ''
```

```
$ grep -R "error while processing an instance" nova-compute.log-20160726 nova-compute.log-20160728 nova-compute.log-20160730 | wc -l
7000
```

So again, this environment does not seem reliable at all. I can only guess at this point: I think something external caused the failures, most likely something that happened to the database (so please check the logs as asked in comment #8). That said, I really don't think there's much we can do here; the environment is just not right. Sorry, we can't really do anything if the underlying database is not reliable.
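The traceback bottoms out in `BadStatusLine: ''`, which Python's HTTP client raises when the server side of the connection (here, ironic-api or whatever sits in front of it) closes the socket or returns a non-HTTP reply before a status line arrives. A minimal, self-contained Python 3 sketch of this failure mode follows; the helper names (`get_with_diagnosis`, the throwaway server) are illustrative only and are not part of nova or python-ironicclient:

```python
import http.client
import socket
import threading

def _misbehaving_server(sock):
    # Accept one connection and reply with a non-HTTP line, mimicking an
    # API endpoint that is reachable but not answering with valid HTTP.
    conn, _ = sock.accept()
    conn.recv(1024)
    conn.sendall(b"not-an-http-status-line\r\n")
    conn.close()
    sock.close()

def get_with_diagnosis(host, port, path):
    # Hypothetical helper: issue a GET and translate BadStatusLine into a
    # readable diagnosis instead of letting the traceback escape.
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        return "HTTP %d" % conn.getresponse().status
    except http.client.BadStatusLine as exc:
        # Covers RemoteDisconnected too (a BadStatusLine subclass in py3).
        return "BadStatusLine: server closed or garbled the reply: %r" % exc.line
    finally:
        conn.close()

def demo():
    # Bind an ephemeral port, serve one bad reply, and query it.
    sock = socket.socket()
    sock.bind(("127.0.0.1", 0))
    sock.listen(1)
    threading.Thread(target=_misbehaving_server, args=(sock,), daemon=True).start()
    return get_with_diagnosis("127.0.0.1", sock.getsockname()[1], "/v1/nodes")

print(demo())
```

In the logs above the same exception escaped uncaught inside the `sync_power_state` periodic task, so every pass logged a full trace, which is why a single grep over three days of nova-compute logs counts 7000 occurrences.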