It seems that both userspace and kernel code have a problem when creating a VG with 1k extents and a small snapshot (chunk size too small). (There was a similar problem with the mirror log.)

This probably needs fixes in the kernel, to reject wrong parameters (access beyond the end of the device), and also in userspace (do not allow creating such a small snapshot).

# vgcreate -c n -s 1k vg_test /dev/sdb1
  Non-clustered volume group "vg_test" successfully created

# lvcreate -l10 -n lv1 vg_test
  Logical volume "lv1" created

# lvcreate -s -l1 -n lv1s vg_test/lv1
  Error locking on node bar-05: device-mapper: reload ioctl failed: Cannot allocate memory
  Failed to suspend origin lv1

dmesg:
device-mapper: table: 253:4: snapshot: Unable to allocate hash table space
device-mapper: ioctl: error adding target to table
...

# lvremove vg_test/lv1s
Do you really want to remove active logical volume lv1s? [y/n]: y
  Logical volume "lv1s" successfully removed

# lvcreate -s -l4 -n lv1s vg_test/lv1
  Error locking on node bar-05: device-mapper: reload ioctl failed: Input/output error
  Failed to suspend origin lv1

dmesg:
device-mapper: table: 253:4: snapshot: Unable to allocate hash table space
device-mapper: ioctl: error adding target to table
attempt to access beyond end of device
dm-6: rw=17, want=16, limit=8
device-mapper: snapshots: zero_area(0) failed
device-mapper: table: 253:4: snapshot: Failed to read snapshot metadata
device-mapper: ioctl: error adding target to table

# rpm -q lvm2
lvm2-2.02.46-2.el5

# uname -a
Linux bar-05 2.6.18-128.1.10.el5 #1 SMP Wed Apr 29 13:53:08 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
Created attachment 375639
A patch for RHEL 5.5
Note that the configuration described here is invalid: "lvcreate -s -l4 -n lv1s vg_test/lv1". A snapshot must have at least two chunks.

But there are similar errors when the origin is too small. The patch fixes this case. It also fixes a bogus "Unable to allocate hash table space" message.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
in kernel-2.6.18-179.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5

Please update the appropriate value in the Verified field (cf_verified) to indicate this fix has been successfully verified. Include a comment with verification details.
When creating a new snapshot with 1k extents and fewer than 2 chunks (i.e. lvcreate with -l4), the code doesn't check whether the snapshot can be created and creates it anyway (lvdisplay is aware of it and the volume exists in /dev), even though it can't be created (because a snapshot must have at least 2 chunks).
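The missing check can be sketched in a few lines. This is only an illustration of the two-chunk rule stated in this thread, not lvm2 or kernel code: the function name is hypothetical, sizes are in 512-byte sectors, and the chunk size of 8 sectors (4 KiB) is an assumption inferred from the want=16/limit=8 values in the dmesg output rather than something stated in the report.

```python
MIN_CHUNKS = 2  # per this thread: a snapshot must have at least two chunks


def cow_is_large_enough(cow_sectors, chunk_sectors):
    """Return True if the COW device holds at least MIN_CHUNKS whole chunks.

    Hypothetical helper; both arguments are in 512-byte sectors.
    """
    if chunk_sectors <= 0:
        raise ValueError("chunk size must be positive")
    return cow_sectors // chunk_sectors >= MIN_CHUNKS


EXTENT_SECTORS = 2  # 1k extents, as in "vgcreate -s 1k"
CHUNK_SECTORS = 8   # ASSUMPTION: 4 KiB chunks, inferred from want=16/limit=8

for extents in (1, 4, 16):
    ok = cow_is_large_enough(extents * EXTENT_SECTORS, CHUNK_SECTORS)
    print(f"-l{extents}: {'ok' if ok else 'too small'}")
```

Under these assumptions both -l1 and -l4 are rejected, which matches the two failing lvcreate calls in the reproducer.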
Please separate the two problems here: kernel and lvm2.

This is a kernel BZ, so if there is a problem in lvm2 userspace, create a new bug (maybe the userspace calculation is wrong; please add a full reproducer and output to that bug).

Basically, the kernel problem should be reproducible without using lvm2 at all.
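One way to take lvm2 out of the picture would be to load a snapshot target directly with dmsetup. The sketch below only builds the table line and does not touch any device. It assumes the table format given in the kernel's device-mapper snapshot documentation (start, length, "snapshot", origin, COW device, P or N, chunk size in sectors). The device paths and the chunk size of 8 sectors are placeholders, not values taken from this report.

```python
def snapshot_table(origin_sectors, origin_dev, cow_dev, chunk_sectors,
                   persistent=True):
    """Build a device-mapper table line for a snapshot target.

    Hypothetical helper; lengths and chunk size are in 512-byte sectors.
    """
    flag = "P" if persistent else "N"
    return (f"0 {origin_sectors} snapshot "
            f"{origin_dev} {cow_dev} {flag} {chunk_sectors}")


# A 10 KiB origin (20 sectors) with placeholder device paths.
print(snapshot_table(20, "/dev/mapper/origin", "/dev/mapper/cow", 8))
```

The resulting line could be passed to "dmsetup create" on standard input, as root and on disposable devices only, with a COW device smaller than two chunks to exercise the kernel path.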
Reproduced basically with the commands from the first post:

[root@dell-pesc430-01 ~]# uname -r
2.6.18-192.el5
[root@dell-pesc430-01 ~]# dd if=/dev/zero bs=4096 count=200000 of=px42.img
200000+0 records in
200000+0 records out
819200000 bytes (819 MB) copied, 3.10531 seconds, 264 MB/s
[root@dell-pesc430-01 ~]# losetup /dev/loop0 px42.img
[root@dell-pesc430-01 ~]# pvcreate /dev/loop0
  Physical volume "/dev/loop0" successfully created
[root@dell-pesc430-01 ~]# vgcreate -c n -s 1k testvg /dev/loop0
  Volume group "testvg" successfully created
[root@dell-pesc430-01 ~]# lvcreate -l10 -n lv1 testvg
  Logical volume "lv1" created
[root@dell-pesc430-01 ~]# lvcreate -s -l4 -n lv1s testvg/lv1
device-mapper: snapshots: zero_disk_area(0) failed
device-mapper: table: 253:3: snapshot: Failed to read snapshot metadata
device-mapper: ioctl: error adding target to table
  device-mapper: reload ioctl failed: Input/output error
  Failed to suspend origin lv1
[root@dell-pesc430-01 ~]# ls /dev/testvg/
lv1  lv1s
[root@dell-pesc430-01 ~]# lvdisplay
.....
  --- Logical volume ---
  LV Name                /dev/testvg/lv1
  VG Name                testvg
  LV UUID                SDEpav-04K4-Z2ua-asq5-hL3D-oe3p-eJkuam
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                10.00 KB
  Current LE             10
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2

  --- Logical volume ---
  LV Name                /dev/testvg/lv1s
  VG Name                testvg
  LV UUID                MeRQrL-D4ag-GA57-M94e-Clye-wI0l-iuCYMC
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                4.00 KB
  Current LE             4
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:3

The problem is that the device really exists (even the kernel is aware of it) even though it shouldn't. The question is whether the bug is in kernel-space code or user-space code.
dmesg output:

attempt to access beyond end of device
dm-5: rw=17, want=16, limit=8
device-mapper: snapshots: zero_disk_area(0) failed
device-mapper: table: 253:3: snapshot: Failed to read snapshot metadata
device-mapper: ioctl: error adding target to table
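To read the "want=16, limit=8" line: the block layer reports both values in 512-byte sectors, so the device is 4 KiB long (matching the 4.00 KB LV size) and the failed I/O would have ended 4 KiB past its end. A small sketch of that arithmetic follows; the helper names are hypothetical, and the regular expression only covers the message format shown in this report.

```python
import re

SECTOR_BYTES = 512  # unit of want= and limit= in the block-layer message


def parse_beyond_end(line):
    """Return (device, want, limit) from a 'want=..., limit=...' line, or None."""
    m = re.search(r"(\S+): rw=(\d+), want=(\d+), limit=(\d+)", line)
    if m is None:
        return None
    return m.group(1), int(m.group(3)), int(m.group(4))


def overshoot_bytes(want, limit):
    """Bytes by which the requested I/O would run past the end of the device."""
    return max(0, want - limit) * SECTOR_BYTES


dev, want, limit = parse_beyond_end("dm-5: rw=17, want=16, limit=8")
print(dev, "size:", limit * SECTOR_BYTES, "bytes,",
      "overshoot:", overshoot_bytes(want, limit), "bytes")
```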
This seems to be both a kernel and a userspace problem...

What's the lvm2 version? (rpm -q lvm2 ; lvm version)
I don't have access to the machine right now, but it was a RHEL 5.5 snapshot from 15th March 2010. I have also managed to reproduce it in Fedora 12 (lvm2 version 2.02.53-2), so I guess it is not a version-specific problem.

BTW: the architecture on both machines was x86_64.
Comment 10: This is not a bug. You are trying to create a snapshot that is too small, so it fails and you get an error. All that my patch did was remove the bogus error message about a memory allocation failure.

The error messages about access beyond the end of the device are correct, because there really are accesses beyond the end (you may get similar ones if the snapshot overflows).

As for the leftover device: if snapshot creation fails, userspace lvm leaves a device of the same name that is not a snapshot but a linear volume. You can delete it afterwards. It would make sense to delete it automatically in userspace lvm. But anyway, it doesn't create corrupted metadata, so it is not worth backporting now.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html