Created attachment 791273 [details]
the cinder volume log

Description of problem:
Cinder failed to delete an LVM volume snapshot. The snapshot status after the action is: error_deleting.

This bug is a regression introduced by the fix for bug 994335.

The command that fails is (from /var/log/cinder/volume.log):

2013-08-28 10:17:42 ERROR [cinder.openstack.common.rpc.amqp] Exception during message handling
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/amqp.py", line 430, in _process_data
    rval = self.proxy.dispatch(ctxt, version, method, **args)
  File "/usr/lib/python2.6/site-packages/cinder/openstack/common/rpc/dispatcher.py", line 133, in dispatch
    return getattr(proxyobj, method)(ctxt, **kwargs)
  File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 509, in delete_snapshot
    {'status': 'error_deleting'})
  File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
    self.gen.next()
  File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 498, in delete_snapshot
    self.driver.delete_snapshot(snapshot_ref)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/lvm.py", line 239, in delete_snapshot
    self._delete_volume(snapshot)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/lvm.py", line 134, in _delete_volume
    self.clear_volume(volume)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/lvm.py", line 205, in clear_volume
    clearing=True)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/lvm.py", line 117, in _copy_volume
    *extra_flags, run_as_root=True)
  File "/usr/lib/python2.6/site-packages/cinder/utils.py", line 190, in execute
    cmd=' '.join(cmd))
ProcessExecutionError: Unexpected error while running command.
Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf dd if=/dev/zero of=/dev/mapper/cinder--volumes-_snapshot--366cfc3a--5f4c--4b71--8db0--7fd1a0ead77e count=1024 bs=1M conv=fdatasync
Exit code: 1
Stdout: ''
Stderr: "/bin/dd: fdatasync failed for `/dev/mapper/cinder--volumes-_snapshot--366cfc3a--5f4c--4b71--8db0--7fd1a0ead77e': Input/output error\n1024+0 records in\n1024+0 records out\n1073741824 bytes (1.1 GB) copied, 11.7988 s, 91.0 MB/s\n"

Version-Release number of selected component (if applicable):

How reproducible:
Prepare RHOS: configure Cinder with LVM storage.

Steps to Reproduce:
1. Create a volume snapshot: nova volume-snapshot-create <volume id>
2. Verify the snapshot status: nova volume-snapshot-list
3. Delete the volume snapshot: nova volume-snapshot-delete <snapshot-id>
4. Verify the snapshot status again.
5. Check the cinder-volume log.

Actual results:
The volume snapshot status is "error_deleting".

Expected results:
Cinder deletes the snapshot.

Additional info:
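For context on where this command comes from: the traceback shows delete_snapshot() -> _delete_volume() -> clear_volume() -> _copy_volume(), which zero-fills the device through rootwrap before removing it. A minimal sketch of what that wipe step boils down to, assuming a hypothetical helper name (this is an approximation of the behavior visible in the log, not the actual Grizzly lvm.py source):

    # Simplified sketch of the wipe reached via clear_volume()/_copy_volume().
    # Approximation of the command shown in the log; wipe_device() is a
    # hypothetical helper, not a real Cinder function.
    from cinder import utils

    def wipe_device(dev_path, size_in_gb):
        # One 1M block per MiB of the origin volume's size; for a 1 GB volume
        # this is count=1024, which matches the failing command above.
        count = size_in_gb * 1024
        utils.execute('dd', 'if=/dev/zero', 'of=%s' % dev_path,
                      'count=%d' % count, 'bs=1M', 'conv=fdatasync',
                      run_as_root=True)

    # A non-thin LVM snapshot's COW area is smaller than the origin volume,
    # so writing the full origin size overflows it and dd fails with an
    # I/O error, which is what the Stderr above shows.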
The Cinder version is: openstack-cinder-2013.1.3-2.el6ost.noarch.rpm
Additional info:

[root@cougar08 yum.repos.d(keystone_admin)]# ls of=/dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b
ls: cannot access of=/dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b: No such file or directory
[root@cougar08 yum.repos.d(keystone_admin)]# ls /dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b
/dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b
[root@cougar08 yum.repos.d(keystone_admin)]# dd if=/dev/zero of=/dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b count=1024 bs=1M conv=fdatasync
dd: fdatasync failed for `/dev/mapper/cinder--volumes-_snapshot--6bd7a29a--5bf7--40fc--8d0e--58c6c5a0d50b': Input/output error
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 1.55012 s, 693 MB/s
Grizzly/2013-08-27.1
Here is the situation with this bug:

We had bug 975052 a while ago covering this same failure in a slightly different context. The problem is that Cinder assumes that if you have a snapshot of an x GB volume, you can wipe it by writing x GB of zeros to that snapshot. Unfortunately, this is not the case with LVM non-thin-provisioned snapshots: they hold less than x GB of data, so the snapshot fills up and things break.

Bug 975902 worked around this by enabling LVM snapshot autoextend ("snapshot_autoextend_threshold" in /etc/lvm/lvm.conf), which grows the snapshot rather than hitting this error. But this is not a proper solution. The real fix is to have LVM wipe the data for us rather than doing it with dd, as suggested by agk in bug 975052, but that is a serious LVM-level change and we have not gone all the way down that path yet. (Bug 984705 is tracking this.)

So, I'm not sure this is a regression from a working version. Packstack may not set this option in RHOS 3; if so, we should probably get the workaround in place there. There are other options available, such as wiping slightly less than the volume size, but then you run into problems like not being sure that you actually cleared the data as expected, which introduces security concerns.
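For anyone applying the autoextend workaround mentioned above, the setting lives in the activation section of /etc/lvm/lvm.conf. A sketch of the relevant snippet; the 70/20 values are illustrative only, not something this bug prescribes, and automatic extension generally also needs dmeventd monitoring of the snapshot to be active:

    # /etc/lvm/lvm.conf (sketch; values are examples)
    activation {
        # When a snapshot's COW area passes 70% usage, grow it by 20%.
        # A threshold of 100 (the default) disables automatic extension.
        snapshot_autoextend_threshold = 70
        snapshot_autoextend_percent = 20
    }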
Thanks Eric, I've now slightly updated the description of bug 975052 to make it more 'relevant'.
There is a new fix for this in upstream Havana that we could backport to RHOS 3.