Created attachment 791273 [details]
the cinder volume log

Description of problem:
Cinder failed to delete an LVM volume snapshot. The snapshot status after the action is: error_deleting.

This bug is a regression introduced by the fix for bug 994335.

The command that fails is (from /var/log/cinder/volume.log):

2013-08-28 10:17:42 ERROR [cinder.openstack.common.rpc.amqp] Exception during message handling
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/amqp.py", line 430, in _process_data
    rval = self.proxy.dispatch(ctxt, version, method, **args)
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/dispatcher.py", line 133, in dispatch
    return getattr(proxyobj, method)(ctxt, **kwargs)
  File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 509, in delete_snapshot
    {'status': 'error_deleting'})
  File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
    self.gen.next()
  File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 498, in delete_snapshot
    self.driver.delete_snapshot(snapshot_ref)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/lvm.py", line 239, in delete_snapshot
    self._delete_volume(snapshot)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/lvm.py", line 134, in _delete_volume
    self.clear_volume(volume)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/lvm.py", line 205, in clear_volume
    clearing=True)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/lvm.py", line 117, in _copy_volume
    *extra_flags, run_as_root=True)
  File "/usr/lib/python2.6/site-packages/cinder/utils.py", line 190, in execute
    cmd=' '.join(cmd))
ProcessExecutionError: Unexpected error while running command.
Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf dd if=/dev/zero of=/dev/mapper/cinder--volumes-_snapshot--366cfc3a--5f4c--4b71--8db0--7fd1a0ead77e count=1024 bs=1M conv=fdatasync
Exit code: 1
Stdout: ''
Stderr: "/bin/dd: fdatasync failed for `/dev/mapper/cinder--volumes-_snapshot--366cfc3a--5f4c--4b71--8db0--7fd1a0ead77e': Input/output error\n1024+0 records in\n1024+0 records out\n1073741824 bytes (1.1 GB) copied, 11.7988 s, 91.0 MB/s\n"

Version-Release number of selected component (if applicable):

How reproducible:
Prepare RHOS: configure Cinder with LVM storage.

Steps to Reproduce:
1. Create a volume snapshot: nova volume-snapshot-create <volume id>
2. Verify the snapshot status: nova volume-snapshot-list
3. Delete the volume snapshot: nova volume-snapshot-delete <snapshot-id>
4. Verify the snapshot status again.
5. Check the cinder-volume log.

Actual results:
The volume snapshot status is "error_deleting".

Expected results:
Cinder deletes the snapshot.

Additional info:
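For context on where this command comes from: the traceback shows delete_snapshot() -> _delete_volume() -> clear_volume() -> _copy_volume(), which zero-fills the device through rootwrap before removing it. A minimal sketch of what that wipe step boils down to, assuming a hypothetical helper name (this is an approximation of the behavior visible in the log, not the actual Grizzly lvm.py source):

    # Simplified sketch of the wipe reached via clear_volume()/_copy_volume().
    # Approximation of the command shown in the log; wipe_device() is a
    # hypothetical helper, not a real Cinder function.
    from cinder import utils

    def wipe_device(dev_path, size_in_gb):
        # One 1M block per MiB of the origin volume's size; for a 1 GB volume
        # this is count=1024, which matches the failing command above.
        count = size_in_gb * 1024
        utils.execute('dd', 'if=/dev/zero', 'of=%s' % dev_path,
                      'count=%d' % count, 'bs=1M', 'conv=fdatasync',
                      run_as_root=True)

    # A non-thin LVM snapshot's COW area is smaller than the origin volume,
    # so writing the full origin size overflows it and dd fails with an
    # I/O error, which is what the Stderr above shows.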
The Cinder version is: openstack-cinder-2013.1.3-2.el6ost.noarch.rpm
Additional info:

[root@cougar08 yum.repos.d(keystone_admin)]# ls of=/dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b
ls: cannot access of=/dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b: No such file or directory
[root@cougar08 yum.repos.d(keystone_admin)]# ls /dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b
/dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b
[root@cougar08 yum.repos.d(keystone_admin)]# dd if=/dev/zero of=/dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b count=1024 bs=1M conv=fdatasync
dd: fdatasync failed for `/dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b': Input/output error
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.55012 s, 693 MB/s
Grizzly/2013-08-27.1
Here is the situation with this bug:

We had bug 975052 a while ago covering this same failure in a slightly different context. The problem is that Cinder assumes that if you have a snapshot of an x GB volume, you can wipe it by writing x GB of zeros to that snapshot. Unfortunately, this is not the case with LVM non-thin-provisioned snapshots: they hold less than x GB of data, so the snapshot fills up and things break.

Bug 975902 worked around this by enabling LVM snapshot autoextend ("snapshot_autoextend_threshold" in /etc/lvm/lvm.conf), which grows the snapshot rather than hitting this error. But this is not a proper solution. The real fix is to have LVM wipe the data for us rather than doing it with dd, as suggested by agk in bug 975052, but that is a serious LVM-level change and we have not gone all the way down that path yet. (Bug 984705 is tracking this.)

So, I'm not sure this is a regression from a working version. Packstack may not set this option in RHOS 3; if so, we should probably get the workaround in place there. There are other options available, such as wiping slightly less than the volume size, but then you run into problems like not being sure that you actually cleared the data as expected, which introduces security concerns.
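For anyone applying the autoextend workaround mentioned above, the setting lives in the activation section of /etc/lvm/lvm.conf. A sketch of the relevant snippet; the 70/20 values are illustrative only, not something this bug prescribes, and automatic extension generally also needs dmeventd monitoring of the snapshot to be active:

    # /etc/lvm/lvm.conf (sketch; values are examples)
    activation {
        # When a snapshot's COW area passes 70% usage, grow it by 20%.
        # A threshold of 100 (the default) disables automatic extension.
        snapshot_autoextend_threshold = 70
        snapshot_autoextend_percent = 20
    }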
Thanks Eric, I've now slightly updated the description of bug 975052 to make it more 'relevant'.
There is a new fix for this in upstream Havana that we could backport to RHOS 3.