Bug 1052969

Summary: GlusterFS: deleting a 30GB snapshot takes more than an hour, times out
Product: Red Hat OpenStack Reporter: Yogev Rabl <yrabl>
Component: openstack-cinder Assignee: Eric Harney <eharney>
Status: CLOSED DUPLICATE QA Contact: Dafna Ron <dron>
Severity: high Docs Contact:
Priority: high    
Version: 4.0 CC: abaron, eharney, yeylon, yrabl
Target Milestone: ---   
Target Release: 5.0 (RHEL 7)   
Hardware: All   
OS: Linux   
Whiteboard: storage
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-01-15 14:33:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1040711    
Bug Blocks:    
Attachments:
Description Flags
volume.log none
compute.log none

Description Yogev Rabl 2014-01-14 14:14:03 UTC
Created attachment 849959 [details]
volume.log

Description of problem:
Cinder failed to delete a snapshot that was created by an instance snapshot.
Cinder is using GlusterFS as its backend. 

1. Cinder is able to delete volumes.
2. Cinder is able to delete snapshots which haven't been created by instance snapshot. 

Version-Release number of selected component (if applicable):
python-cinderclient-1.0.7-2.el6ost.noarch
python-cinder-2013.2.1-4.el6ost.noarch
openstack-cinder-2013.2.1-4.el6ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create a volume from an image.
2. Launch an instance with this volume.
3. Take a snapshot of the instance.
4. Delete the volume snapshot.

Actual results:
The snapshot stays in the 'deleting' status. After a second delete attempt, it moves to the 'error_deleting' status.
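
The status transition described here can be recorded with a small polling helper. This is a minimal sketch, not Cinder code: `get_status` is a hypothetical callable supplied by the caller (for example, a wrapper around whatever client is used to query the snapshot) and is assumed to return the current status string, or None once the snapshot no longer exists. The `"timeout"` marker is also an assumption made for illustration.

```python
import time


def watch_snapshot_delete(get_status, timeout=60.0, interval=1.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll a snapshot's status and return the distinct statuses seen, in order.

    get_status is a hypothetical callable: it returns the status string,
    or None when the snapshot is gone (assumed to mean a successful delete).
    """
    seen = []
    deadline = clock() + timeout
    while clock() < deadline:
        status = get_status()
        # Record only changes, so a long 'deleting' phase appears once.
        if not seen or seen[-1] != status:
            seen.append(status)
        # Stop on success (None) or on the failure state from this report.
        if status is None or status == "error_deleting":
            return seen
        sleep(interval)
    # Illustrative marker: the caller gave up before a final state was seen.
    seen.append("timeout")
    return seen
```

For the failure in this report, the returned list would be expected to contain 'deleting' followed by either 'error_deleting' or the timeout marker, depending on when the second delete is issued.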

Expected results:
The snapshot should be deleted. 

Additional info:

The logs are attached.

Comment 2 Dafna Ron 2014-01-14 14:35:34 UTC
Is there a trace for the failure in the log? If so, can you paste it?

Comment 4 Eric Harney 2014-01-14 15:19:38 UTC
GlusterFS doesn't do zeroing.  The I/O done depends on which snapshot is being deleted. In some cases (deleting the active snapshot), the base volume image must be copied into the snapshot qcow2 file, so the time it takes depends on the size of the volume.

Can you attach a Nova compute log from when this occurred?

It is likely that we need Nova to send updates to Cinder while a long operation is occurring so it knows not to time out the operation.
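
The idea of progress updates keeping a long operation from timing out can be sketched as an idle timeout that resets on every update. This is only an illustration under assumptions: `ProgressDeadline` and its methods are hypothetical names, not part of Nova or Cinder, and the real mechanism between the two services may look quite different.

```python
import time


class ProgressDeadline:
    """Idle timeout that is pushed back each time progress is reported.

    Hypothetical helper for illustration only; not a Nova or Cinder API.
    """

    def __init__(self, idle_timeout, clock=time.monotonic):
        self._idle_timeout = idle_timeout
        self._clock = clock
        self._last_update = clock()

    def report_progress(self):
        # Called whenever the worker signals that it is still making progress.
        self._last_update = self._clock()

    def expired(self):
        # True only if no progress was reported for longer than idle_timeout.
        return self._clock() - self._last_update > self._idle_timeout
```

With this shape, the total duration of the operation is unbounded as long as updates keep arriving, which suits an operation whose length scales with the volume size; only a stalled operation trips the timeout.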

Comment 5 Yogev Rabl 2014-01-14 15:25:35 UTC
Created attachment 850007 [details]
compute.log

Comment 6 Eric Harney 2014-01-14 15:49:53 UTC
Actually, after reviewing the compute log, this looks like an instance of bug 1040711.

Comment 7 Ayal Baron 2014-01-15 14:33:18 UTC
(In reply to Eric Harney from comment #6)
> Actually, after reviewing the compute log, this looks like an instance of
> bug 1040711.

Closing as dup.

*** This bug has been marked as a duplicate of bug 1040711 ***