Bug 1056037
| Summary: | GlusterFS Snapshot delete of attached volume fails if it runs > 10 minutes | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Yogev Rabl <yrabl> | ||||
| Component: | openstack-cinder | Assignee: | Eric Harney <eharney> | ||||
| Status: | CLOSED UPSTREAM | QA Contact: | Dafna Ron <dron> | ||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | urgent | ||||||
| Version: | 4.0 | CC: | bkopilov, eharney, scohen, yeylon | ||||
| Target Milestone: | --- | Keywords: | TestBlocker, ZStream | ||||
| Target Release: | 6.0 (Juno) | ||||||
| Hardware: | All | ||||||
| OS: | All | ||||||
| Whiteboard: | storage | ||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: |
Cause: Cinder has a fixed timeout for GlusterFS driver snapshot create and delete operations.
Consequence: If a snapshot create/delete operation takes longer than 10 minutes to complete, Cinder will fail it even if it is still working correctly.
Fix: Have Nova send Cinder updates during the process so it knows that the job is still active.
Result: Snapshot operations can take as long as required without timing out as long as activity is still reported.
|
Story Points: | --- | ||||
| Clone Of: | |||||||
| : | 1066167 1078975 (view as bug list) | Environment: | |||||
| Last Closed: | 2014-10-09 13:27:29 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | 1066167 | ||||||
| Bug Blocks: | 1033652, 1040711, 1045196 | ||||||
Description
Yogev Rabl
2014-01-21 12:53:27 UTC
This bug blocks the following bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1033652
https://bugzilla.redhat.com/show_bug.cgi?id=1040711

The basic problem here is that Cinder has a fixed time out when waiting for snapshot_delete operations on the Nova side to complete. If they take too long (even when things are functioning correctly) Cinder will prematurely fail the operation.

To fix this, we need to have Nova send back updates of job percent complete while the block job is in-progress. Cinder can then reset its timeout window based on these updates. (This should be doable without changing how the APIs work between Cinder and Nova today.)

For testing in the meantime: The longest operations are when deleting the only snapshot that exists, because in that case the whole base disk image has to be copied into the snapshot file. Deletions of snapshots when other snapshots exist should be much quicker, which will let you avoid this bug while testing other pieces of this feature.

*** Bug 1101504 has been marked as a duplicate of this bug. ***

This likely indicates using a version of libvirt which had known bugs in it in this area. Closing pending further info on reproduction.
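For illustration only, the timeout-reset idea proposed in the description (extend the waiting window whenever a new percent-complete update arrives) could be sketched roughly as follows. This is not the actual Cinder or Nova code: `wait_for_block_job`, `poll_progress`, and `SnapshotTimeout` are hypothetical names, and the 600-second default simply mirrors the 10-minute limit mentioned in this bug.

```python
import time
from typing import Callable, Optional


class SnapshotTimeout(Exception):
    """Hypothetical error: no progress update arrived within the window."""


def wait_for_block_job(
    poll_progress: Callable[[], Optional[int]],
    inactivity_timeout: float = 600.0,  # assumption: mirrors the 10-minute limit
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Wait until poll_progress() reports 100, failing only on inactivity.

    poll_progress is a hypothetical callable returning the latest
    percent-complete value (0-100) or None if no update has been seen.
    """
    last_percent: Optional[int] = None
    deadline = clock() + inactivity_timeout
    while True:
        percent = poll_progress()
        if percent is not None and percent != last_percent:
            # A new update means the job is still alive: reset the window.
            last_percent = percent
            deadline = clock() + inactivity_timeout
        if percent is not None and percent >= 100:
            return
        if clock() >= deadline:
            raise SnapshotTimeout(
                f"no progress for {inactivity_timeout} seconds "
                f"(last reported: {last_percent}%)"
            )
        sleep(poll_interval)
```

With this shape, a long but healthy job never trips the timeout as long as its reported percentage keeps changing, while a job that stalls still fails after one full window of silence. The `clock` and `sleep` parameters exist only so the behavior can be exercised without real waiting.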