Bug 821384
Summary: Errors reported during reboot of system with thinp LV
Product: Red Hat Enterprise Linux 6
Component: lvm2
Version: 6.3
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: high
Reporter: Nenad Peric <nperic>
Assignee: Zdenek Kabelac <zkabelac>
QA Contact: Cluster QE <mspqa-list>
Docs Contact:
CC: agk, cmarthal, coughlan, ddumas, dwysocha, heinzm, jbrassow, msnitzer, nperic, prajnoha, prockai, syeghiay, thornber, xiaoli, zkabelac
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-12-10 19:38:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 773507, 840699
Attachments:
Description
Nenad Peric
2012-05-14 10:11:15 UTC
Same steps as above, but with a filesystem added:

vgcreate vgforthin /dev/sda1 /dev/sdb1 /dev/sdc1
lvcreate -T -L 1G vgforthin/thin_pool
lvcreate -V1G vgforthin/thin_pool -T -n virtual1
lvcreate -V1G vgforthin/thin_pool -T -n virtual2
mke2fs /dev/vgforthin/virtual1
mke2fs -t ext4 /dev/vgforthin/virtual2
mount /dev/vgforthin/virtual1 /mnt/virtual1/
mount /dev/vgforthin/virtual2 /mnt/virtual2/
touch /mnt/virtual1/file
touch /mnt/virtual2/file
reboot

Results:

Stopping monitoring for VG VolGroup:
/dev/mapper/vgforthin-thin_pool: read failed after 0 of 4096 at 1073676288: Input/output error
/dev/mapper/vgforthin-thin_pool: read failed after 0 of 4096 at 1073733632: Input/output error
/dev/mapper/vgforthin-thin_pool: read failed after 0 of 4096 at 0: Input/output error
/dev/mapper/vgforthin-thin_pool: read failed after 0 of 4096 at 4096: Input/output error
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Failed to get state of mapped device
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Failed to get state of mapped device
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Huge memory allocation (size 67108864) rejected - metadata corruption?
Couldn't create ioctl argument.
Failed to get driver version
Unmounting file systems:
Buffer I/O error on device dm-7, logical block 1
lost page write due to I/O error on dm-7
Buffer I/O error on device dm-7, logical block 73
lost page write due to I/O error on dm-7
Buffer I/O error on device dm-7, logical block 81
lost page write due to I/O error on dm-7
Buffer I/O error on device dm-7, logical block 97
lost page write due to I/O error on dm-7
Buffer I/O error on device dm-7, logical block 0
lost page write due to I/O error on dm-7
Aborting journal on device dm-7-8.
Buffer I/O error on device dm-7, logical block 131072
lost page write due to I/O error on dm-7
JBD2: I/O error detected when updating journal superblock for dm-7-8.
EXT4-fs error (device dm-7): ext4_put_super: Couldn't clean up the journal
EXT4-fs (dm-7): Remounting filesystem read-only
device-mapper: thin: dm_thin_find_block() failed, error = -19
device-mapper: thin: dm_thin_find_block() failed, error = -19
Buffer I/O error on device dm-6, logical block 0
lost page write due to I/O error on dm-6
------------[ cut here ]------------
WARNING: at fs/buffer.c:1161 mark_buffer_dirty+0x82/0xa0() (Tainted: G --------------- T)
Hardware name: KVM
Modules linked in: ext2 dm_thin_pool(T) dm_persistent_data dm_bufio libcrc32c nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc sg sd_mod crc_t10dif be2iscsi iscsi_boot_sysfs uio cxgb4 libcxgbi cxgb3 mdio ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi microcode virtio_balloon virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ib_addr]
Pid: 2652, comm: umount Tainted: G --------------- T 2.6.32-269.el6.x86_64 #1
Call Trace:
[<ffffffff8106b6b7>] ? warn_slowpath_common+0x87/0xc0
[<ffffffff8106b70a>] ? warn_slowpath_null+0x1a/0x20
[<ffffffff811ae282>] ? mark_buffer_dirty+0x82/0xa0
[<ffffffffa04a26d3>] ? ext2_sync_fs+0x43/0xb0 [ext2]
[<ffffffff811dde8e>] ? sync_quota_sb+0x5e/0x130
[<ffffffff811aa92a>] ? __sync_filesystem+0x7a/0x90
[<ffffffff811aab3b>] ? sync_filesystem+0x4b/0x70
[<ffffffff8117d957>] ? generic_shutdown_super+0x27/0xe0
[<ffffffff8117da41>] ? kill_block_super+0x31/0x50
[<ffffffff8117eaf0>] ? deactivate_super+0x70/0x90
[<ffffffff8119aadf>] ? mntput_no_expire+0xbf/0x110
[<ffffffff8119b57b>] ? sys_umount+0x7b/0x3a0
[<ffffffff81082d51>] ? sigprocmask+0x71/0x110
[<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
---[ end trace da3e93115e42f319 ]---
device-mapper: thin: dm_thin_find_block() failed, error = -19
Buffer I/O error on device dm-6, logical block 67
lost page write due to I/O error on dm-6

Expected: normal reboot

Can you please attach the -vvvv output from 'vgchange -vvvv -ay' activation after reboot? Did thin_check pass successfully in this case?

Two separate fixes may be needed.
1) The "Huge memory allocation" messages should not be appearing - the failure condition, whatever it is, should be caught (by thin_check or lvm2) earlier and handled cleanly.
2) If this is caused by an inconsistent metadata state on disk, the (kernel?) code needs fixing to avoid that state occurring.

Created attachment 584576: vgchange -vvvv -ay
I have attached the log of vgchange -ay with -vvvv
As far as I can tell, the issue may be in the order in which things are stopped, since the disks being used are iSCSI targets.
I tested the same steps on "normal" hardware and no errors occurred during reboot.
So, during the reboot maybe the lvm monitoring should be stopped after iscsi has stopped?
Or maybe there is some issue with iscsi itself, in that it does not sync properly, and what we see in lvm later is an effect rather than the cause?
As a test, I tried changing the sequence of shutdown events, and when lvm monitoring is stopped before iscsi stops, there are no errors.
Not sure if this is how it should be done though :)
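The reordering experiment described above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the init script names lvm2-monitor and iscsi are assumed to match the services on the test machine, the function names and the DRY_RUN switch are hypothetical, and the umount and vgchange -an steps are extra assumptions that are not part of the reported test. With DRY_RUN=1 (the default) the helper only prints the commands in the intended order and changes nothing.

```shell
#!/bin/sh
# Sketch only: stop LVM monitoring while the iSCSI-backed PVs are still reachable,
# and stop iSCSI last. DRY_RUN=1 (default) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# ordered_shutdown VG [MOUNTPOINT...]
ordered_shutdown() {
    vg="$1"
    shift
    # 1. Stop LVM monitoring first (the ordering that produced no errors in the report).
    run service lvm2-monitor stop
    # 2. Assumption, not from the report: unmount filesystems on the thin volumes.
    for mnt in "$@"; do
        run umount "$mnt"
    done
    # 3. Assumption, not from the report: deactivate the volume group.
    run vgchange -an "$vg"
    # 4. Only now stop iSCSI, so nothing above reads from vanished devices.
    run service iscsi stop
}
```

For the volumes created in the description, `ordered_shutdown vgforthin /mnt/virtual1 /mnt/virtual2` prints the five commands in order; setting DRY_RUN=0 as root would actually run them, which should only be tried on a test machine.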
UPDATE: I just noticed that the second machine I was testing this on did not have device-mapper-persistent-data installed, so I have installed it now and I will add a new attachment with the result. Sorry for not noticing it sooner. This package imho should be a part of lvm install since it is an integral part of thinp it seems.

The second attachment contains a reboot log with iscsi errors and a log of vgchange/vgremove.

Created attachment 584585: reboot log + vgchange + vgremove
(In reply to comment #6)
> Sorry for not noticing it sooner. This package imho should be a part of lvm
> install since it is an integral part of thinp it seems.

We did not do that because we do not want a Tech Preview package installed on everyone's machine. But if it's not installed and lvm can't find the checking binary required to use thinp, it should be issuing an error.

Yes, that is how I noticed the error after I had already uploaded the first log. Spotting the error when running with -vvvv is a bit complicated, which is why I missed it. But yes, it does issue an error saying that it cannot find thin_check.
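Since the missing-binary error is easy to overlook in -vvvv output, a quick check from the shell before activating any thin pool may help. This is a minimal sketch, assuming the checker is named thin_check and is provided by device-mapper-persistent-data, as stated in this report; the function name require_tool is hypothetical and is not part of lvm2.

```shell
#!/bin/sh
# require_tool NAME [PACKAGE]
# Returns 0 if NAME is found on PATH; otherwise prints an error (and an optional
# package hint) to stderr and returns 1.
require_tool() {
    tool="$1"
    pkg="${2:-}"
    if command -v "$tool" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: $tool not found in PATH" >&2
    if [ -n "$pkg" ]; then
        echo "hint: it is expected to be provided by the $pkg package" >&2
    fi
    return 1
}
```

A test script could then start with `require_tool thin_check device-mapper-persistent-data || exit 1` so that a missing package is reported up front rather than buried in verbose activation logs.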