Bug 1622072
Summary: | OpenStack didn't remove volume on instance deletion | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Pablo Iranzo Gómez <pablo.iranzo> | |
Component: | openstack-nova | Assignee: | Francois Palin <fpalin> | |
Status: | CLOSED EOL | QA Contact: | OSP DFG:Compute <osp-dfg-compute> | |
Severity: | medium | Docs Contact: | ||
Priority: | medium | |||
Version: | 10.0 (Newton) | CC: | astupnik, dasmith, eglynn, fpalin, geguileo, igarciam, jhakimra, kchamart, lyarwood, nlevinki, pablo.iranzo, sbauza, scohen, sgordon, srevivo, tvvcox, vromanso | |
Target Milestone: | --- | Keywords: | Triaged, ZStream | |
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1827413 | Environment: | |
Last Closed: | 2021-07-07 10:38:34 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1827413, 1827416, 1827419, 1827420 |
Description
Pablo Iranzo Gómez
2018-08-24 11:34:25 UTC
I see errors in the volume logs caused by a missing default volume type named HBSDVSPG200. We should check the configuration to see whether this is expected, which is possible since these are happening on the Cinder-API.

With the logs at INFO level it is hard to tell precisely what's going on, but it all points to Nova ignoring an error on the call to terminate connection (so the volume is still attached) and then trying to delete the volume, which cannot be deleted since it's still attached.

The error that Nova is ignoring is Cinder timing out at the API service on what I assume is the terminate connection call, but we cannot know why since there are no log entries on the Volume service during the minute that the API is waiting before timing out.

We would need DEBUG log levels on the Cinder services to tell what's going on in the terminate connection call.

(In reply to Gorka Eguileor from comment #4)
> I see errors in the volume logs caused by a missing default volume type
> named HBSDVSPG200. We should check the configuration to see whether this is
> expected, which is possible since these are happening on the Cinder-API.
>
> With the logs at INFO level it is hard to tell precisely what's going on,
> but it all points to Nova ignoring an error on the call to terminate
> connection (so the volume is still attached) and then trying to delete the
> volume, which cannot be deleted since it's still attached.

I can see the os-terminate_connection failures due to RPC timeouts to c-vol in the c-api logs, but I can't match them up to anything on the n-cpu side. Most of these appear to be successful anyway AFAICT.

Looking at the n-cpu code in Newton, I can see how failures in os-terminate_connection could result in this behaviour, as we wouldn't call Cinder to actually detach the volume from the server, but there's zero evidence of this happening in the logs.

> The error that Nova is ignoring is Cinder timing out at the API service on
> what I assume is the terminate connection call, but we cannot know why since
> there are no log entries on the Volume service during the minute that the
> API is waiting before timing out.
>
> We would need DEBUG log levels on the Cinder services to tell what's going
> on in the terminate connection call.

Pablo, can we get DEBUG logs from Nova and Cinder, along with an example instance UUID so I can trace this? The example UUID in c#0 isn't present anywhere in the sosreports.

Hello. May I ask you to update this bug and let me know if support could provide something for you? BR, Alex.
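For reference, the failure sequence hypothesized in the comments above (a terminate-connection error is swallowed, the detach is skipped, and the later delete fails because the volume is still attached) can be modelled roughly as follows. This is only an illustrative sketch and not Nova or Cinder code: `FakeVolumeAPI`, `RpcTimeout`, `VolumeInUse`, and `cleanup_volume` are hypothetical names invented for this example, and the real code paths in Newton may differ.

```python
class RpcTimeout(Exception):
    """Hypothetical stand-in for the API timing out while waiting on the volume service."""


class VolumeInUse(Exception):
    """Hypothetical error raised when deleting a volume that is still attached."""


class FakeVolumeAPI:
    """Toy volume service used only for illustration; not the Cinder API."""

    def __init__(self, terminate_times_out):
        self.terminate_times_out = terminate_times_out
        self.attached = True
        self.deleted = False

    def terminate_connection(self):
        if self.terminate_times_out:
            raise RpcTimeout("no reply from the volume service")

    def detach(self):
        self.attached = False

    def delete(self):
        if self.attached:
            raise VolumeInUse("volume is still attached")
        self.deleted = True


def cleanup_volume(api):
    """Model of the suspected flow on instance deletion."""
    try:
        api.terminate_connection()
        api.detach()  # skipped entirely when terminate_connection raises
    except RpcTimeout:
        pass  # the error is ignored, so the volume stays attached
    try:
        api.delete()
    except VolumeInUse:
        return "leaked"  # volume survives the instance deletion
    return "deleted"
```

In this model, `cleanup_volume(FakeVolumeAPI(terminate_times_out=True))` leaves the volume behind, which matches the reported symptom; whether this is what actually happened cannot be confirmed without the DEBUG logs requested above.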