Red Hat Bugzilla – Bug 678357
online disk resizing may cause data corruption
Last modified: 2011-05-19 08:00:56 EDT
Description of problem:
https://lkml.org/lkml/2011/2/17/15

From: NeilBrown <neilb@suse.de>
Date: Thu, 17 Feb 2011 16:37:30 +1100
Subject: [PATCH] Fix over-zealous flush_disk when changing device size.

There are two cases when we call flush_disk.
In one, the device has disappeared (check_disk_change) so any data we hold becomes irrelevant.
In the other, the device has changed size (check_disk_size_change) so data we hold may be irrelevant.

In both cases it makes sense to discard any 'clean' buffers, so they will be read back from the device if needed.

In the former case it makes sense to discard 'dirty' buffers as there will never be anywhere safe to write the data. In the second case it *does*not* make sense to discard dirty buffers as that will lead to file system corruption when you simply enlarge the containing devices.

flush_disk calls __invalidate_device.
__invalidate_device calls both invalidate_inodes and invalidate_bdev.

invalidate_inodes *does* discard I_DIRTY inodes and this does lead to fs corruption.

invalidate_bdev *does*not* discard dirty pages, but I don't really care about that at present.

So this patch adds a flag to __invalidate_device (calling it __invalidate_device2) to indicate whether dirty buffers should be killed, and this is passed to invalidate_inodes which can choose to skip dirty inodes.

flush_disk then passes true from check_disk_change and false from check_disk_size_change.

dm avoids tripping over this problem by calling i_size_write directly rather than using check_disk_size_change.

md does use check_disk_size_change and so is affected.

This regression was introduced by commit 608aeef17a which causes check_disk_size_change to call flush_disk.

Version-Release number of selected component (if applicable):
All rhel6 kernels

How reproducible:

Steps to Reproduce:

Actual results:

Expected results:

I will attempt to put together an automated test for this.
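The behaviour change described in the patch can be pictured with a small user-space model. This is only a sketch under stated assumptions: the function names are borrowed from the patch description, but the Python signatures, the Inode class, and the list-based "cache" are hypothetical and are not the kernel's API. Busy inodes and page-cache handling are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inode:
    """Toy stand-in for a cached inode (hypothetical, not the kernel struct)."""
    ino: int
    dirty: bool


def invalidate_inodes(cache, kill_dirty):
    """Return the inodes that remain cached after invalidation.

    Clean inodes are always dropped so they are re-read from the device.
    Dirty inodes are dropped only when kill_dirty is true.
    """
    return [inode for inode in cache if inode.dirty and not kill_dirty]


def flush_disk(cache, kill_dirty):
    # Models flush_disk passing the new flag down to invalidate_inodes.
    return invalidate_inodes(cache, kill_dirty)


def check_disk_change(cache):
    # Device disappeared: dirty data has nowhere safe to go, so discard it.
    return flush_disk(cache, kill_dirty=True)


def check_disk_size_change(cache):
    # Device only changed size: dirty data must survive to be written back.
    return flush_disk(cache, kill_dirty=False)


if __name__ == "__main__":
    cache = [Inode(ino=1, dirty=False), Inode(ino=2, dirty=True)]
    print("after disk change:     ", check_disk_change(cache))
    print("after disk size change:", check_disk_size_change(cache))
```

In this model the pre-patch behaviour corresponds to check_disk_size_change also passing kill_dirty=True, which is the case where a dirty inode is thrown away after a resize.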
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Patch(es) available on kernel-2.6.32-125.el6
Created attachment 491231: reproducer

This is what I've been working with, though I have yet to get it to produce corruption.
I could not reproduce the corruption on the unpatched kernel (-124). I built a RAID5 array from loop devices and ran the reproducer 500 times in a loop (about 6 hours).

On an x86_64 host with the -128 kernel, I can hit a kernel BUG quite reliably. On an i386 host there is no such crash. On the -130 kernel, both the x86_64 and i386 hosts ran cleanly, so it seems the kernel BUG was introduced between -125 and -128 and fixed in a newer kernel.

Confirmed that the patch is applied in the -130 kernel, so I set this to SanityOnly. Full fs testing will be performed on snapshot 4.

(Kernel BUG seen on -128 kernel, x86_64 host)

BUG: unable to handle kernel paging request at 0000000000010021
IP: [<ffffffff8117ac3d>] __free_pipe_info+0x3d/0x70
PGD 474f31067 PUD 475a7c067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/block/md127/uevent
CPU 1
Modules linked in: ext3 jbd raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 dm_mirror dm_region_hash dm_log sg bnx2 cdc_ether usbnet mii serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase scsi_transport_sas pata_acpi ata_generic ata_piix dm_mod [last unloaded: microcode]
device-mapper: ioctl: remove_all left 2 open device(s)

Modules linked in: ext3 jbd raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 dm_mirror dm_region_hash dm_log sg bnx2 cdc_ether usbnet mii serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase scsi_transport_sas pata_acpi ata_generic ata_piix dm_mod [last unloaded: microcode]
Pid: 2230, comm: udevd Not tainted 2.6.32-128.el6.x86_64 #1 System x3550 M3 -[7944I21]-
RIP: 0010:[<ffffffff8117ac3d>]  [<ffffffff8117ac3d>] __free_pipe_info+0x3d/0x70
RSP: 0018:ffff880475b21e28  EFLAGS: 00010202
RAX: 0000000000010001 RBX: 000000000000000b RCX: 0000000000000003
RDX: 00000000000001b8 RSI: ffff880275771a10 RDI: ffff880275771800
RBP: ffff880475b21e48 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000005
R13: ffff880275771800 R14: 0000000000000001 R15: 0000000000000000
FS:  00007fdc415b87a0(0000) GS:ffff880028220000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000010021 CR3: 0000000475a70000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process udevd (pid: 2230, threadinfo ffff880475b20000, task ffff880474408040)
Stack:
 ffff880474408040 ffff8802708f6108 ffff8802708f6108 ffff8802708f61c0
<0> ffff880475b21e68 ffffffff8117ac8d ffff8802738f46c0 ffff880275771800
<0> ffff880475b21ea8 ffffffff8117ad58 ffff8802738f46c0 ffff8802738f46c0
Call Trace:
 [<ffffffff8117ac8d>] free_pipe_info+0x1d/0x30
 [<ffffffff8117ad58>] pipe_release+0xb8/0xc0
 [<ffffffff8117adb5>] pipe_read_release+0x15/0x20
 [<ffffffff81172da5>] __fput+0xf5/0x210
 [<ffffffff81172ee5>] fput+0x25/0x30
 [<ffffffff8116e45d>] filp_close+0x5d/0x90
 [<ffffffff8116e535>] sys_close+0xa5/0x100
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
Code: 41 bc 10 00 00 00 31 db 49 89 fd 0f 1f 00 48 63 c3 48 8d 14 80 48 c1 e2 03 49 8b 44 15 68 48 85 c0 74 0b 49 8d 74 15 58 4c 89 ef <ff> 50 20 83 c3 01 41 83 ec 01 75 d7 49 8b 7d 20 48 85 ff 74 07
RIP  [<ffffffff8117ac3d>] __free_pipe_info+0x3d/0x70
 RSP <ffff880475b21e28>
CR2: 0000000000010021
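One way to read the oops above, offered as a sketch rather than a confirmed diagnosis: the bytes at the faulting RIP (the <ff> 50 20 marked in the Code: line) decode as an indirect call through RAX plus a one-byte displacement, and RAX plus that displacement equals the CR2 fault address. The snippet below only checks that arithmetic. The register values and bytes are copied from the oops; the helper name and its limited decoding scope are hypothetical, and the stride of 40 is inferred from the multiply-by-5 and shift-by-3 instructions earlier in the Code: line.

```python
# Register values copied from the oops.
RAX = 0x0000000000010001
RBX = 0x000000000000000B
RDX = 0x00000000000001B8
CR2 = 0x0000000000010021

# The three bytes starting at the faulting RIP (marked <ff> in "Code:").
FAULTING_BYTES = bytes([0xFF, 0x50, 0x20])


def decode_indirect_call_disp8(insn):
    """Decode only the x86 'FF /2' form with a register base and disp8.

    Returns (base_register_number, displacement). Hypothetical helper that
    handles nothing beyond this single encoding.
    """
    opcode, modrm, disp8 = insn[0], insn[1], insn[2]
    mod = modrm >> 6
    reg = (modrm >> 3) & 0b111
    rm = modrm & 0b111
    # FF with reg field 2 is a near indirect call; mod 01 means [base + disp8];
    # rm 100 would require a SIB byte, which this sketch does not handle.
    if opcode != 0xFF or reg != 2 or mod != 0b01 or rm == 0b100:
        raise ValueError("not a call through [register + disp8]")
    if disp8 >= 0x80:  # disp8 is sign-extended
        disp8 -= 0x100
    return rm, disp8


base_reg, disp = decode_indirect_call_disp8(FAULTING_BYTES)
fault_address = RAX + disp  # register number 0 is RAX when no REX prefix is present

if __name__ == "__main__":
    print("base register number:", base_reg)
    print("displacement:", hex(disp))
    print("RAX + displacement:", hex(fault_address), "CR2:", hex(CR2))
    print("RDX == RBX * 40:", RDX == RBX * 40)
```

If this reading is right, the crash is a call through a function-pointer table whose address (RAX) is not a valid kernel pointer, reached while walking entry 11 (RBX) of an array with a 40-byte stride. That suggests memory corruption in the pipe being closed by udevd, not a fault in the block-layer code this bug is about.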
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html