Bug 1091866
| Summary: | volume disappears after vol-wipe with logical type pool | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | yinwei <wyin> | ||||
| Component: | libvirt | Assignee: | John Ferlan <jferlan> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 7.0 | CC: | dyuan, jferlan, mzhan, pzhang, rbalakri, xuzhang, yanyang | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | libvirt-1.2.7-1.el7 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2015-03-05 07:34:46 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | 1166592 | ||||||
| Bug Blocks: | |||||||
| Attachments: | |
Although there was a leap of faith that 'pvcreate' had already been run on some existing '/dev/sdX' device and that a 'virsh pool-define lv_pool.xml' preceded the 'virsh pool-build lv_pool', it didn't take "too much time" to figure things out... I already had a /dev/sdb of 11G; doing an 'fdisk /dev/sdb', choosing to create a 'n'ew 'p'rimary partition with a default sector of 2048 and a size of 7500M (7.3G) with the Linux LVM type '8e', followed by a 'pvcreate /dev/sdb1', did the trick...

Once I reproduced the main issue from the bug, I tried the same sequence of wipe, refresh, list using every kind of supported --algorithm option. None caused the phenomenon seen, but none worked either, as scrub "soft" failed, for example:

    $ scrub -f -p bsi /dev/lv_pool/lv_test -v
    scrub: using BSI patterns
    scrub: warning: /dev/lv_pool/lv_test is zero length
    $ echo $?
    0
    $

For some reason scrub considers /dev/lv_pool/lv_test a regular file. Going back to the zero/default algorithm for more debugging.

The first thing noted is that the sizes chosen in lv_test.xml:

    <capacity>274784</capacity>
    <allocation>174784</allocation>

were less than the 4096K physical extent size in my configuration; thus when

    /usr/sbin/lvcreate --name lv_test -L 171K --virtualsize 269K lv_pool

was run, it would "round up" to the minimum size (from debug output):

    Rounding up size to full physical extent 4.00 MiB
    Rounding up size to full physical extent 4.00 MiB
    Logical volume "lv_test" created

This is also seen in the output:

    # virsh vol-info lv_test --pool lv_pool
    Name:           lv_test
    Type:           block
    Capacity:       4.00 MiB
    Allocation:     4.00 MiB
    #

When wiping this, the LV would actually go "INACTIVE" (also visible in the lvdisplay output in the problem description), with the 'lvs' output showing:

    # lvs
      LV      VG      Attr       LSize   Pool Origin            Data%  Move Log Cpy%Sync Convert
      home    fedora  -wi-ao---- 136.72g
      root    fedora  -wi-ao----  50.00g
      swap    fedora  -wi-ao----   7.64g
      lv_test lv_pool swi-I-s---   4.00m      [lv_test_vorigin] 100.00
    #

Being inactive causes the pool-refresh to not find the LV. The reason for the INACTIVE state is probably that a thin logical volume doesn't work like a regular LV, since it has metadata associated with it.
When wiping, the following messages were blasted to /var/log/messages:

    Jul 16 14:05:53 localhost kernel: Buffer I/O error on device dm-4, logical block 1020
    Jul 16 14:05:53 localhost kernel: lost page write due to I/O error on dm-4
    Jul 16 14:05:53 localhost kernel: Buffer I/O error on device dm-4, logical block 1021
    Jul 16 14:05:53 localhost kernel: lost page write due to I/O error on dm-4
    Jul 16 14:05:53 localhost kernel: Buffer I/O error on device dm-4, logical block 1022
    Jul 16 14:05:53 localhost kernel: lost page write due to I/O error on dm-4
    Jul 16 14:05:53 localhost kernel: Buffer I/O error on device dm-4, logical block 1023
    Jul 16 14:05:53 localhost kernel: lost page write due to I/O error on dm-4
    Jul 16 14:05:53 localhost kernel: Buffer I/O error on device dm-4, logical block 0
    Jul 16 14:05:53 localhost kernel: lost page write due to I/O error on dm-4
    Jul 16 14:05:53 localhost kernel: Buffer I/O error on device dm-4, logical block 1
    Jul 16 14:05:53 localhost kernel: lost page write due to I/O error on dm-4
    Jul 16 14:05:53 localhost kernel: Buffer I/O error on device dm-4, logical block 2
    Jul 16 14:05:53 localhost kernel: lost page write due to I/O error on dm-4
    Jul 16 14:05:53 localhost kernel: Buffer I/O error on device dm-4, logical block 3
    Jul 16 14:05:53 localhost kernel: lost page write due to I/O error on dm-4
    Jul 16 14:05:53 localhost kernel: Buffer I/O error on device dm-4, logical block 4
    Jul 16 14:05:53 localhost kernel: lost page write due to I/O error on dm-4

NOTE:

    # ls -al /dev/lv_pool/lv_test
    lrwxrwxrwx. 1 root root 7 Jul 16 16:37 /dev/lv_pool/lv_test -> ../dm-4
    #

NOTE: Even if the same sizes were used (allocation == capacity), that didn't change the result, although the command is slightly different (/usr/sbin/lvcreate --name lv_test -L 269K lv_pool), the difference being the --virtualsize switch.

A recent libvir-list posting describes that using --virtualsize creates this sparse snapshot "feature" for an LV: http://www.redhat.com/archives/libvir-list/2014-July/msg00694.html

The --virtualsize (or -V) feature was added in commit id '1ffc78b5'. This is after the scrub/algorithm commit id 'adb99a05b', which augmented the original wipe code commit id '73adc0e5'. So, I "assume" the write of a zero buffer has wiped out some important data, invalidating the thin LV (as is more or less stated in my online reading).

Further debugging reveals that if lv_test.xml is modified to have the allocation be the extent size (e.g., 4194304), with the capacity being the same, then the vol-wipe works. It continues to work if capacity is less than allocation. However, curiously, if capacity was 8388608 and allocation 4194304, then we have the same condition. All of this can be reproduced without libvirt in the picture, using lvcreate and a sample program to mimic what libvirt does...
Assuming there exists an lv_pool (or whatever pool), use the following:

    # /usr/sbin/lvcreate --name lv_test -L 4096K --virtualsize 8192K lv_pool

Running the attached code in half mode (i.e., writing only half of the complete LV):

    # ./scrub /dev/lv_pool/lv_test half
    Open '/dev/lv_pool/lv_test' for scrub
    About to write extent_len=2097152
    Finished writing 2097152 bytes

results in:

    # lvs
      LV      VG      Attr       LSize   Pool Origin            Data%  Move Log Cpy%Sync Convert
      home    fedora  -wi-ao---- 136.72g
      root    fedora  -wi-ao----  50.00g
      swap    fedora  -wi-ao----   7.64g
      lv_test lv_pool swi-a-s---   4.00m      [lv_test_vorigin]  50.39
    # lvscan
      ACTIVE            '/dev/fedora/swap' [7.64 GiB] inherit
      ACTIVE            '/dev/fedora/home' [136.72 GiB] inherit
      ACTIVE            '/dev/fedora/root' [50.00 GiB] inherit
      ACTIVE   Snapshot '/dev/lv_pool/lv_test' [4.00 MiB] inherit

Running it again, but using full mode (i.e., a write from 0 through the full size):

    # /home/jferlan/exmplCode/scrub /dev/lv_pool/lv_test all
    Open '/dev/lv_pool/lv_test' for scrub
    About to write extent_len=4194304
    Finished writing 4194304 bytes
    # lvs
      LV      VG      Attr       LSize   Pool Origin            Data%  Move Log Cpy%Sync Convert
      home    fedora  -wi-ao---- 136.72g
      root    fedora  -wi-ao----  50.00g
      swap    fedora  -wi-ao----   7.64g
      lv_test lv_pool swi-I-s---   4.00m      [lv_test_vorigin] 100.00
    # lvscan
      ACTIVE            '/dev/fedora/swap' [7.64 GiB] inherit
      ACTIVE            '/dev/fedora/home' [136.72 GiB] inherit
      ACTIVE            '/dev/fedora/root' [50.00 GiB] inherit
      inactive Snapshot '/dev/lv_pool/lv_test' [4.00 MiB] inherit
    #

While not necessarily a libvirt problem, it is a problem, because libvirt perhaps isn't using thin LVs properly... Investigation continues.

Created attachment 918532 [details]
Sample code (compile with `cc -o scrub scrub.c`)
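The attachment body is not reproduced in this record. The following is a minimal sketch of what such a scrub.c might look like, reconstructed purely from the transcript above; the 4 MiB size constant, the buffer size, and the overall structure are assumptions, not the actual attachment 918532.

```c
/*
 * Hypothetical reconstruction of the attached scrub.c -- NOT the actual
 * attachment 918532. Writes zeros over the first half ("half") or the
 * whole ("all") of an assumed 4 MiB logical volume, mimicking libvirt's
 * zero-algorithm wipe.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define LV_SIZE (4 * 1024 * 1024)  /* assumed 4 MiB LV, per the lvcreate above */

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <device> half|all\n", argv[0]);
        return 1;
    }

    size_t extent_len = (strcmp(argv[2], "half") == 0) ? LV_SIZE / 2 : LV_SIZE;

    int fd = open(argv[1], O_WRONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    printf("Open '%s' for scrub\n", argv[1]);
    printf("About to write extent_len=%zu\n", extent_len);

    char buf[4096];
    memset(buf, 0, sizeof(buf));  /* zero buffer, as the wipe code uses */

    size_t written = 0;
    while (written < extent_len) {
        size_t chunk = extent_len - written;
        if (chunk > sizeof(buf))
            chunk = sizeof(buf);
        ssize_t n = write(fd, buf, chunk);
        if (n < 0) {
            perror("write");
            close(fd);
            return 1;
        }
        written += (size_t)n;
    }
    fsync(fd);  /* flush so any damage to the LV is immediately visible */
    close(fd);
    printf("Finished writing %zu bytes\n", written);
    return 0;
}
```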
Sent patches upstream to address the issue: http://www.redhat.com/archives/libvir-list/2014-July/msg00921.html

The result will be:

    $ virsh vol-wipe /dev/lv_pool/lv_test
    error: Failed to wipe vol /dev/lv_pool/lv_test
    error: this function is not supported by the connection driver: logical volue '/dev/lv_pool/lv_test' is sparse, volume wipe not supported
    $

commit 8a9f7cbecd91932eff669830fd26e775a240afd6
Author: John Ferlan <jferlan>
Date: Thu Jul 17 12:41:43 2014 -0400
storage: Disallow vol_wipe for sparse logical volumes
https://bugzilla.redhat.com/show_bug.cgi?id=1091866
Add a new boolean 'sparse'. This will be used by the logical backend
storage driver to determine whether the target volume is sparse or not
(also known as a snapshot or thin logical volume). Although setting sparse
to true at creation could be seen as duplicative of setting it during
virStorageBackendLogicalMakeVol(), it is done in case there are ever other code
paths between Create and FindLVs that need to know about the volume being sparse.
Use 'sparse' in a new virStorageBackendLogicalVolWipe() to decide whether
to attempt to wipe the logical volume or not. For now, I have found no
means to wipe the volume without writing to it. Writing to the sparse
volume causes it to be filled. A sparse logical volume is not completely
writeable, as there exists metadata which, if overwritten, will cause the
sparse LV to go INACTIVE, which means pool-refresh will not find it.
Access to whatever lvm uses to manage data blocks is not provided by
any API I could find.
git describe: v1.2.6-189-g8a9f7cb
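In outline, the change records a 'sparse' flag on the volume and has the logical backend refuse the wipe when it is set. A simplified sketch of that check, based on the commit description above (not the verbatim upstream source; the exact signature and message text may differ):

```c
/* Simplified sketch of the check described in commit 8a9f7cbe; not the
 * verbatim upstream source. */
static int
virStorageBackendLogicalVolWipe(virConnectPtr conn,
                                virStoragePoolObjPtr pool,
                                virStorageVolDefPtr vol,
                                unsigned int algorithm,
                                unsigned int flags)
{
    /* A sparse (snapshot/thin) LV cannot be safely overwritten: filling it
     * clobbers the LVM metadata and the LV goes INACTIVE, after which
     * pool-refresh no longer finds it. */
    if (vol->target.sparse) {
        virReportError(VIR_ERR_NO_SUPPORT,
                       _("logical volume '%s' is sparse, volume wipe not supported"),
                       vol->target.path);
        return -1;
    }

    /* Non-sparse LVs are plain block devices; fall through to the generic
     * local wipe implementation. */
    return virStorageBackendVolWipeLocal(conn, pool, vol, algorithm, flags);
}
```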
Verify version:

    libvirt-1.2.8-10.el7.x86_64
    qemu-kvm-rhev-2.1.2-15.el7.x86_64
    kernel-3.10.0-210.el7.x86_64

Verify steps:

1. According to comment 3, create a logical volume with allocation == capacity, then try to wipe it; the volume is wiped successfully and does not disappear after the wipe.

1> Prepare a logical pool:

    # virsh pool-info logical-pool
    Name:           logical-pool
    UUID:           04e3685b-137b-4f13-8eec-31949a9ee9f5
    State:          running
    Persistent:     yes
    Autostart:      no
    Capacity:       50.00 GiB
    Allocation:     0.00 B
    Available:      50.00 GiB

    # virsh vol-list logical-pool
    Name                 Path
    ------------------------------------------------------------------------------

2> Create a volume in the pool, setting the value of allocation equal to the value of capacity:

    # cat logical-volume.xml
    <volume>
      <name>lv_test</name>
      <source>
        <device path='/dev/sda5'>
        </device>
      </source>
      <capacity>102400</capacity>
      <allocation>102400</allocation>
      <target>
        <path>/dev/logical-pool/lv_test</path>
        <permissions>
          <mode>0660</mode>
          <owner>0</owner>
          <group>6</group>
          <label>system_u:object_r:fixed_disk_device_t:s0</label>
        </permissions>
      </target>
    </volume>

    # virsh vol-create --pool logical-pool logical-volume.xml
    Vol lv_test created from logical-volume.xml

3> Check the volume info:

    # virsh vol-list logical-pool
    Name                 Path
    ------------------------------------------------------------------------------
    lv_test              /dev/logical-pool/lv_test

    # virsh vol-info lv_test1 --pool logical-pool
    Name:           lv_test1
    Type:           block
    Capacity:       4.00 MiB
    Allocation:     4.00 MiB

    # lvs
    LV       VG           Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
    lv_test1 logical-pool -wi-a----- 4.00m

4> Try to wipe the volume; the wipe succeeds:

    # virsh vol-wipe --pool logical-pool lv_test1    <====== wipe successful
    Vol lv_test1 wiped

5> Check the volume info after the wipe; the volume still exists:

    # virsh pool-refresh logical-pool
    Pool logical-pool refreshed
    # virsh vol-list logical-pool
    Name                 Path
    ------------------------------------------------------------------------------
    lv_test              /dev/logical-pool/lv_test

Log:

    debug : virCommandRunAsync:2398 : About to run /usr/sbin/lvcreate --name lv_test1 -L 100K logical-pool    <========= note: there is no --virtualsize option
    debug : virStorageBackendVolWipeLocal:1859 : Wiping volume with path '/dev/logical-pool/lv_test1' and algorithm 0
    debug : virStorageBackendWipeExtentLocal:1834 : Wrote 4194304 bytes to volume with path '/dev/logical-pool/lv_test1'
    ......

2. Then try to create a logical volume where the value of allocation differs from capacity (allocation less than capacity), wipe it, and check the volume info:

    # cat logical-volume.xml
    <volume>
      <name>lv_test5</name>
      <key>raakCv-MQhr-WKIT-R66x-Epn2-e8hG-1Z5gY0</key>
      <source>
        <device path='/dev/sda5'>
        </device>
      </source>
      <capacity>2147483648</capacity>
      <allocation>1073741824</allocation>
      <target>
        <path>/dev/logical-pool/lv_test5</path>
        <permissions>
          <mode>0660</mode>
          <owner>0</owner>
          <group>6</group>
          <label>system_u:object_r:fixed_disk_device_t:s0</label>
        </permissions>
      </target>
    </volume>

In the log (see the sketch following this comment for how this command is chosen):

    debug : virCommandRunAsync:2398 : About to run /usr/sbin/lvcreate --name lv_test -L 1048576K --virtualsize 2097152K logical-pool    <====== it has the --virtualsize option

But there now exists a bug, https://bugzilla.redhat.com/show_bug.cgi?id=1166592: a logical volume in this situation cannot be created successfully and is not listed in the logical pool, so I cannot use the virsh vol-XXX commands to do any operations. I wonder if there is another way I can verify this bug, or whether it needs to wait until bug 1166592 is solved.
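For context on the two lvcreate invocations seen in the logs above: the logical backend adds --virtualsize only when the requested capacity differs from the allocation, and it is that option which makes the resulting LV sparse. A simplified sketch of that decision (not the verbatim upstream virStorageBackendLogicalCreateVol(); 'cmd' and 'vol' follow libvirt's internal naming):

```c
/* Simplified sketch, not verbatim upstream code: choose between a plain
 * and a sparse (snapshot) LV when building the lvcreate command line. */
virCommandAddArg(cmd, "-L");
virCommandAddArgFormat(cmd, "%lluK", vol->target.allocation / 1024);
if (vol->target.capacity != vol->target.allocation) {
    /* allocation < capacity: request a sparse LV, e.g.
     *   lvcreate --name lv_test -L 1048576K --virtualsize 2097152K logical-pool */
    virCommandAddArg(cmd, "--virtualsize");
    virCommandAddArgFormat(cmd, "%lluK", vol->target.capacity / 1024);
}
/* allocation == capacity yields a plain LV, e.g.
 *   lvcreate --name lv_test1 -L 100K logical-pool */
```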
Thanks. Sorry it took a bit to get to this needinfo... It's not clear to me how this is related to bug 1166592, as you didn't include any output beyond "but now exist a bug"... My quick read of the other bug is that the "<key>" is not present in the XML shown there, but the XML you show here has a "<key>", so I need more information in order to investigate further. I use the following on my upstream code and have no issues:

    # cat lv_test_larger_capacity.xml
    <volume>
      <name>lv_test</name>
      <key>r4xkCv-MQhr-WKIT-R66x-Epn2-e8hG-1Z5gY0</key>
      <source>
        <device path='/dev/sdb1'>
        </device>
      </source>
      <capacity>8388608</capacity>
      <allocation>4194304</allocation>
      <target>
        <path>/dev/lv_pool/lv_test</path>
        <permissions>
          <mode>0660</mode>
          <owner>0</owner>
          <group>6</group>
          <label>system_u:object_r:fixed_disk_device_t:s0</label>
        </permissions>
      </target>
    </volume>

    # virsh vol-create LVM_Test lv_test_larger_capacity.xml
    Vol lv_test created from lv_test_larger_capacity.xml

    # virsh vol-info --pool LVM_Test lv_test
    Name:           lv_test
    Type:           block
    Capacity:       8.00 MiB
    Allocation:     4.00 MiB

    # virsh pool-info LVM_Test
    Name:           LVM_Test
    UUID:           c93d091b-2314-4909-8ab1-4559909adcce
    State:          running
    Persistent:     yes
    Autostart:      yes
    Capacity:       9.31 GiB
    Allocation:     4.00 MiB
    Available:      9.30 GiB

    # virsh pool-dumpxml LVM_Test
    <pool type='logical'>
      <name>LVM_Test</name>
      <uuid>c93d091b-2314-4909-8ab1-4559909adcce</uuid>
      <capacity unit='bytes'>9995026432</capacity>
      <allocation unit='bytes'>4194304</allocation>
      <available unit='bytes'>9990832128</available>
      <source>
        <device path='/dev/sda3'/>
        <name>LVM_Test</name>
        <format type='unknown'/>
      </source>
      <target>
        <path>/dev/LVM_Test</path>
        <permissions>
          <mode>0755</mode>
          <owner>-1</owner>
          <group>-1</group>
        </permissions>
      </target>
    </pool>

More investigation into bug 1166592, and I think I now know what the issue is. I've made this depend on that bug being fixed, since you won't be able to create a sparse volume in 7.1 without a change in libvirt.

Verify version:

    libvirt-1.2.8-11.el7.x86_64
    qemu-kvm-rhev-2.1.2-17.el7.x86_64
    kernel-3.10.0-220.el7.x86_64

    # lvs --version
      LVM version:     2.02.114(2)-RHEL7 (2014-12-01)
      Library version: 1.02.92-RHEL7 (2014-12-01)
      Driver version:  4.29.0

Verify steps:

1. Prepare a logical pool and create volumes in the pool:

    # virsh vol-list logical-pool --details
    Name  Path                    Type   Capacity  Allocation
    -------------------------------------------------------------
    vol4  /dev/logical-pool/vol4  block  1.00 GiB  1.00 GiB
    vol5  /dev/logical-pool/vol5  block  2.00 GiB  1.00 GiB

Check via lvs:

    # lvs
    LV   VG           Attr       LSize Pool Origin         Data%  Meta% Move Log Cpy%Sync Convert
    vol4 logical-pool -wi-a----- 1.00g
    vol5 logical-pool swi-a-s--- 1.00g      [vol5_vorigin] 0.00

2. Try to wipe a sparse volume:

    # virsh vol-wipe vol5 --pool logical-pool
    error: Failed to wipe vol vol5
    error: this function is not supported by the connection driver: logical volue '/dev/logical-pool/vol5' is sparse, volume wipe not supported

3. Check the volumes in the logical pool after a refresh; the volume does not disappear:

    # virsh pool-refresh logical-pool
    Pool logical-pool refreshed
    # virsh vol-list logical-pool --details
    Name  Path                    Type   Capacity  Allocation
    -------------------------------------------------------------
    vol4  /dev/logical-pool/vol4  block  1.00 GiB  1.00 GiB
    vol5  /dev/logical-pool/vol5  block  2.00 GiB  1.00 GiB

4. Try to wipe a non-sparse volume; the wipe succeeds:
    # virsh vol-wipe vol4 --pool logical-pool
    Vol vol4 wiped
    # cmp /dev/logical-pool/vol4 /dev/zero
    cmp: EOF on /dev/logical-pool/vol4

5. Check the volumes in the logical pool:

    # virsh pool-refresh logical-pool
    Pool logical-pool refreshed
    # virsh vol-list logical-pool --details
    Name  Path                    Type   Capacity  Allocation
    -------------------------------------------------------------
    vol4  /dev/logical-pool/vol4  block  1.00 GiB  1.00 GiB
    vol5  /dev/logical-pool/vol5  block  2.00 GiB  1.00 GiB

    # lvs
    LV   VG           Attr       LSize Pool Origin         Data%  Meta% Move Log Cpy%Sync Convert
    vol4 logical-pool owi-a-s--- 1.00g
    vol5 logical-pool swi-a-s--- 1.00g      [vol5_vorigin] 0.00

There is a clear error message and the volume does not disappear. Moving to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html
Description of problem:
The volume disappears after vol-wipe with a logical type pool.

Version-Release number of selected component (if applicable):
libvirt-1.1.1-29.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Define, build, and start a logical type pool, lv_pool:

    # more lv_pool.xml
    <pool type="logical">
      <name>lv_pool</name>
      <source>
        <device path="/dev/sdb2"/>
        <device path="/dev/sdb3"/>
      </source>
      <target>
        <path>/dev/lv_pool</path>
      </target>
    </pool>

    # virsh pool-build lv_pool
    Pool lv_pool built

    # virsh pool-start lv_pool
    Pool lv_pool started

2. Create a volume in the logical pool:

    # more lv_test.xml
    <volume>
      <name>lv_test</name>
      <key>r4xkCv-MQhr-WKIT-R66x-Epn2-e8hG-1Z5gY0</key>
      <source>
        <device path='/dev/sdb3'>
        </device>
      </source>
      <capacity>274784</capacity>
      <allocation>174784</allocation>
      <target>
        <path>/dev/lv_pool/lv_test</path>
        <permissions>
          <mode>0660</mode>
          <owner>0</owner>
          <group>6</group>
          <label>system_u:object_r:fixed_disk_device_t:s0</label>
        </permissions>
      </target>
    </volume>

    # virsh vol-create lv_pool lv_test.xml
    Vol lv_test created from lv_test.xml

    # virsh vol-list lv_pool
    Name                 Path
    -----------------------------------------
    lv_test              /dev/lv_pool/lv_test

3. Wipe the volume and refresh the logical pool; the newly created volume disappears:

    # virsh vol-wipe /dev/lv_pool/lv_test
    Vol /dev/lv_pool/lv_test wiped

    # virsh vol-list lv_pool
    Name                 Path
    -----------------------------------------
    lv_test              /dev/lv_pool/lv_test

    # virsh pool-refresh lv_pool
    Pool lv_pool refreshed

    # virsh vol-list lv_pool
    Name                 Path
    -----------------------------------------

4. Check vgdisplay, lvdisplay, and pvdisplay:

    # vgdisplay
      --- Volume group ---
      VG Name               lv_pool
      System ID
      Format                lvm2
      Metadata Areas        2
      Metadata Sequence No  4
      VG Access             read/write
      VG Status             resizable
      MAX LV                0
      Cur LV                1
      Open LV               0
      Max PV                0
      Cur PV                2
      Act PV                2
      VG Size               400.00 MiB
      PE Size               4.00 MiB
      Total PE              100
      Alloc PE / Size       1 / 4.00 MiB
      Free  PE / Size       99 / 396.00 MiB
      VG UUID               PiHMyh-uHDF-jt9i-ng2c-k6wq-2sD0-nidu8B

    # lvdisplay
      --- Logical volume ---
      LV Path                /dev/lv_pool/lv_test
      LV Name                lv_test
      VG Name                lv_pool
      LV UUID                C5c4bJ-ynmU-jbEa-0kJK-cx5S-PIab-I6wAXp
      LV Write Access        read/write
      LV Creation host, time localhost.localdomain, 2014-04-28 14:36:08 +0800
      LV snapshot status     INACTIVE destination for lv_test_vorigin
      LV Status              available
      # open                 0
      LV Size                4.00 MiB
      Current LE             1
      COW-table size         4.00 MiB
      COW-table LE           1
      Snapshot chunk size    4.00 KiB
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     256
      Block device           253:3

    # pvdisplay
      --- Physical volume ---
      PV Name               /dev/sdb2
      VG Name               lv_pool
      PV Size               201.29 MiB / not usable 1.29 MiB
      Allocatable           yes
      PE Size               4.00 MiB
      Total PE              50
      Free PE               49
      Allocated PE          1
      PV UUID               p18S2m-hwi8-Gpwd-7leJ-d3M1-OoHH-vv12FV

      --- Physical volume ---
      PV Name               /dev/sdb3
      VG Name               lv_pool
      PV Size               201.29 MiB / not usable 1.29 MiB
      Allocatable           yes
      PE Size               4.00 MiB
      Total PE              50
      Free PE               50
      Allocated PE          0
      PV UUID               7jNpqK-f80v-yGXU-Rwkf-LT9B-EU5R-wtPUIe

Actual results:
In step 3, after vol-wipe, the volume disappears.

Expected results:
In step 3, after vol-wipe, the volume still exists.

Additional info: