Bug 173209
Summary: | filesystem corruption while creating 2nd snapshot with I/O to origin | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Corey Marthaler <cmarthal> |
Component: | lvm2 | Assignee: | Alasdair Kergon <agk> |
Status: | CLOSED NEXTRELEASE | Severity: | medium |
Priority: | medium | Version: | 4.0 |
Hardware: | All | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2006-01-19 17:57:25 UTC |
Description
Corey Marthaler
2005-11-14 23:39:15 UTC
This is easily reproduced without genesis; all it takes is a looping file copy.

Found a userspace bug that might cause this. Currently it reloads device tables every time, even when the existing table is the same. The FIXME in the code needs resolving, ideally by comparing the live table with the new one required and suppressing the reload operation if they are identical. Also, there's a missing ordering relation: the new snapshot LV needs to be resumed before the original LV is resumed.

dm cvs code now suppresses reloading tables that haven't changed.
Ordering changed so snapshot origins now get resumed last (i.e. *after* new snapshots of them begin).

Need to retest with lvm2 2.02.01 and dm 1.02.01, which I'll release shortly.

I still see corruption with the new rpms.

[root@link-08 bin]# rpm -q device-mapper
device-mapper-1.02.02-1.0.RHEL4
[root@link-08 bin]# rpm -q lvm2
lvm2-2.02.01-1.1.RHEL4

Buffer I/O error on device dm-5, logical block 0
lost page write due to I/O error on dm-5
Buffer I/O error on device dm-7, logical block 0
lost page write due to I/O error on dm-7
Buffer I/O error on device dm-5, logical block 0
lost page write due to I/O error on dm-5
Buffer I/O error on device dm-7, logical block 0
lost page write due to I/O error on dm-7
EXT3-fs error (device dm-5): read_inode_bitmap: Cannot read inode bitmap - block_group = 0, inode_bitmap = 642
Aborting journal on device dm-5.
Buffer I/O error on device dm-5, logical block 1161
lost page write due to I/O error on dm-5
Buffer I/O error on device dm-5, logical block 1161
Buffer I/O error on device dm-5, logical block 0
lost page write due to I/O error on dm-5
EXT3-fs error (device dm-5) in ext3_new_inode: IO failure
Buffer I/O error on device dm-5, logical block 0
lost page write due to I/O error on dm-5
EXT3-fs error (device dm-5) in ext3_create: IO failure
Buffer I/O error on device dm-5, logical block 0
lost page write due to I/O error on dm-5
EXT3-fs error (device dm-7): read_inode_bitmap: Cannot read inode bitmap - block_group = 0, inode_bitmap = 642
Aborting journal on device dm-7.
Buffer I/O error on device dm-7, logical block 1161
lost page write due to I/O error on dm-7
EXT3-fs error (device dm-7) in ext3_new_inode: IO failure
EXT3-fs error (device dm-7) in ext3_create: IO failure
lost page write due to I/O error on dm-5
printk: 17 messages suppressed.
Buffer I/O error on device dm-5, logical block 0
lost page write due to I/O error on dm-5
device-mapper: Could not create kcopyd client
device-mapper: error adding target to table

[root@link-08 bin]# touch /mnt/snap2/foo
touch: cannot touch `/mnt/snap2/foo': Read-only file system

EXT3-fs error (device dm-7): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
Dec 20 09:39:35 link-08 kernel: ext3_abort called.
Dec 20 09:39:35 link-08 kernel: EXT3-fs error (device dm-7): ext3_journal_start_sb: Detected aborted journal
Dec 20 09:39:35 link-08 kernel: Remounting filesystem read-only
ext3_abort called.

[root@link-08 bin]# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/hda5 on /boot type ext3 (rw)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/mapper/snapper-snap1 on /mnt/snap1 type ext3 (rw)
/dev/mapper/snapper-snap2 on /mnt/snap2 type ext3 (rw)
/dev/mapper/snapper-origin on /mnt/origin type ext3 (rw)
/dev/mapper/snapper-snap3 on /mnt/snap3 type ext3 (rw)

And exactly which kernel?

[root@link-08 bin]# uname -ar
Linux link-08 2.6.9-25.ELsmp #1 SMP Mon Dec 12 17:29:54 EST 2005 x86_64 x86_64 x86_64 GNU/Linux

Once Jason has incorporated the two snapshot patches I sent him yesterday in connection with bug 172839 and bug 177620, please repeat the test next week with his new kernel. If that fails, we can try out some further kernel patches.

This appears to be fixed with the latest kern/rpms.

[root@link-08 bin]# uname -ar
Linux link-08 2.6.9-28.ELsmp #1 SMP Fri Jan 13 17:08:22 EST 2006 x86_64 x86_64 x86_64 GNU/Linux
[root@link-08 bin]# rpm -q lvm2
lvm2-2.02.01-1.3.RHEL4
[root@link-08 bin]# rpm -q device-mapper
device-mapper-1.02.02-3.0.RHEL4

So one or other of the kernel patches fixed it after the lvm2 packages were also fixed. Closing.
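
For readers following the userspace side of this report, here is a rough Python sketch of the two changes described earlier: skipping a table reload when the live table already matches the requested one, and resuming snapshot origins only after their snapshots. It is an illustration under stated assumptions, not the lvm2 or libdevmapper code. The `Target` and `Device` classes and the functions `reload_if_changed` and `resume_order` are hypothetical names invented for this example, and the table parameters in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """One row of a device table: start sector, length, target type, params."""
    start: int
    length: int
    kind: str
    params: str


@dataclass
class Device:
    """Hypothetical stand-in for a mapped device (not a real lvm2 type)."""
    name: str
    live_table: tuple = ()
    is_origin: bool = False
    reloads: int = 0

    def reload(self, new_table):
        # In this model a reload just replaces the table and counts the call.
        self.live_table = tuple(new_table)
        self.reloads += 1


def reload_if_changed(dev, new_table):
    """Reload only when the requested table differs from the live one.

    Returns True if a reload was performed, False if it was suppressed.
    """
    if dev.live_table == tuple(new_table):
        return False
    dev.reload(new_table)
    return True


def resume_order(devices):
    """Return the devices with snapshot origins placed last.

    sorted() is stable, so the relative order within each group is kept.
    """
    return sorted(devices, key=lambda d: d.is_origin)
```

With a made-up one-row table such as `[Target(0, 2048, "snapshot-origin", "253:3")]`, the first call to `reload_if_changed` performs the reload and a second call with an equal table is suppressed. `resume_order` places any device marked `is_origin=True` after the snapshots, which mirrors the ordering rule stated in the report.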