Bug 1171317
Summary: | xfs may crash after unmount if a log write is delayed | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Harold Miller <hamiller> | ||||||
Component: | kernel | Assignee: | Brian Foster <bfoster> | ||||||
kernel sub component: | XFS | QA Contact: | Zorro Lang <zlang> | ||||||
Status: | CLOSED ERRATA | Docs Contact: | |||||||
Severity: | medium | ||||||||
Priority: | medium | CC: | bfoster, eguan, esandeen, plambri, vlad.bespalov | ||||||
Version: | 6.6 | ||||||||
Target Milestone: | rc | ||||||||
Target Release: | --- | ||||||||
Hardware: | All | ||||||||
OS: | Linux | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | kernel-2.6.32-537.el6 | Doc Type: | Bug Fix | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2015-07-22 08:34:06 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Created attachment 965236 [details]
Proposd patch
Interesting, the reporter first reported this upstream in 2013: http://marc.info/?t=137123029400001&r=1&w=2 Note that the attached patch is not a proposed patch to fix the problem, but a patch which provokes the error. OK, I unchecked the 'Private' box, and will update customer. SFDC Case closed, customer will monitor BZ directly. Issue NOT fixed yet... I think the difference between this crashing and not is the fs shutdown. This means we skip writing the unmount record and the associated log write wait logic. Using the attached delay hack patch, I can reproduce by issuing a godown and umount once a delayed log write is triggered. Without the shutdown, the umount sequence blocks on the delayed I/O. I think this is similar to bug 1112214 on rhel7 and fundamentally addressed by the same fix to make the workqueue per-mount such that it is destroyed at umount. The problem is that rhel6 does not have the newer workqueue implementation that has lended itself to the lower-cost creation of many per-mount wq's in XFS. I see that we have created at least one single-threaded per-mount workqueue in rhel6. I don't think that's really an option for xfslogd, however, as it is responsible for all metadata I/O completion along with log I/O completion. We do have some code in xfs_log_unmount_write() that appears to attempt to handle the unmount of a shutdown fs scenario. I'll have to dig more into that to see why it doesn't prevent this issue... ... and that code doesn't help us because all of the iclogs are in error state. Basically, what we have is a branch of code in the unmount path that runs explicitly when we've had a log I/O shutdown and attempts to process and wait on the current iclog if it isn't in an error state. As part of log I/O shutdown, we set _all_ of the iclogs to an error state (xlog_state_ioerror()). So it appears this code doesn't really do anything at all. I think what we might need is the following commit or some variant thereof: 9c23eccc xfs: unmount does not wait for shutdown during unmount Confirmed that the commit in comment #12 addresses the problem via the delayed I/O reproducer. I have a series undergoing additional testing. This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release. Patch(es) available on kernel-2.6.32-537.el6 Reproduced on kernel 504 with attached patch and script(test on scsi_debug disk): BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 IP: [<ffffffff8152ce5e>] _spin_lock+0xe/0x30 PGD 0 Oops: 0002 [#1] SMP last sysfs file: /sys/devices/pseudo_0/adapter0/host6/target6:0:0/6:0:0:0/delete CPU 41 Modules linked in: scsi_debug xfs(U) exportfs autofs4 cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ipmi_devintf ipmi_si ipmi_msghandler microcode iTCO_wdt iTCO_vendor_support cdc_ether usbnet mii serio_raw i2c_i801 i2c_core lpc_ich mfd_core shpchp sg i7core_edac edac_core ioatdma dca bnx2 ext4 jbd2 mbcache sd_mod crc_t10dif pata_acpi ata_generic ata_piix megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug] Pid: 47574, comm: xfslogd/41 Not tainted 2.6.32-504.el6.x86_64 #1 IBM System x3850 X5 -[7143OQE]-/Node 1, Processor Card RIP: 0010:[<ffffffff8152ce5e>] [<ffffffff8152ce5e>] _spin_lock+0xe/0x30 RSP: 0018:ffff881032491db0 EFLAGS: 00010286 RAX: 0000000000010000 RBX: ffff881077de03c0 RCX: ffffe8f7ffd24948 RDX: ffff88107865ebb8 RSI: 0000000000000002 RDI: 00000000000000a8 RBP: ffff881032491db0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: 00000000000000a8 R13: 0000000000000000 R14: 0000000000000002 R15: ffffe8f7ffd24948 FS: 0000000000000000(0000) GS:ffff88048e520000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00000000000000a8 CR3: 0000000001a85000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process xfslogd/41 (pid: 47574, threadinfo ffff881032490000, task ffff881078edb500) Stack: ffff881032491de0 ffffffffa051af79 ffff881077de03c0 0000000000000000 <d> ffff88107865eb00 ffff881032491fd8 ffff881032491e10 ffffffffa051b739 <d> ffff881032491e30 ffff88107865eb00 ffffe8f7ffd24940 ffffffffa0536780 Call Trace: [<ffffffffa051af79>] xlog_state_done_syncing+0x39/0xb0 [xfs] [<ffffffffa051b739>] xlog_iodone+0x59/0xb0 [xfs] [<ffffffffa0536780>] ? xfs_buf_iodone_work+0x0/0x50 [xfs] [<ffffffffa05367a6>] xfs_buf_iodone_work+0x26/0x50 [xfs] [<ffffffff81097fe0>] worker_thread+0x170/0x2a0 [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 [<ffffffff81097e70>] ? worker_thread+0x0/0x2a0 [<ffffffff8109e66e>] kthread+0x9e/0xc0 [<ffffffff8100c20a>] child_rip+0xa/0x20 [<ffffffff8109e5d0>] ? kthread+0x0/0xc0 [<ffffffff8100c200>] ? child_rip+0x0/0x20 Code: e5 0f 1f 44 00 00 fa 66 0f 1f 44 00 00 f0 81 2f 00 00 00 01 74 05 e8 22 bb d6 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 <f0> 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 0f b7 17 eb f5 RIP [<ffffffff8152ce5e>] _spin_lock+0xe/0x30 RSP <ffff881032491db0> CR2: 00000000000000a8 test passed, after run this reproducer more than 100 times on kernel 540 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1272.html |
Created attachment 965235 [details] Script to cause crash Description of problem:If a specific XFS log write is delayed, e.g. in a failing drive, and while it is not complete XFS goes into shutdown mode and is unmounted, then a crash accessing freed memory is possible. Version-Release number of selected component (if applicable):/linux-2.6/xfs_buf.c How reproducible: See attached repro-script Steps to Reproduce: 1.Execute script 2. 3. Actual results:Crash with the following call trace: <1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 <1>IP: [<ffffffff8152ce7e>] _spin_lock+0xe/0x30 <4>PGD aeb4c067 PUD aeb1a067 PMD 0 <4>Oops: 0002 [#1] SMP <4>last sysfs file: /sys/devices/pci0000:00/0000:00:15.0/0000:03:00.0/host2/port-2:2/end_device-2:2/target2:0:2/2:0:2:0/delete <4>CPU 2 <4>Modules linked in: xfs(U) exportfs 8021q garp stp llc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntra ck ip6table_filter ip6_tables ipv6 vsock(U) fiop(P)(U) ppdev parport_pc parport microcode vmware_balloon e1000 sg vmci(U) i2c_piix4 i2c_core shpchp ext4 jbd2 mbcache sd_mod crc_t10dif sr_mod cdrom mptsas mptscsih mptbase scsi_transport_sas pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf] <4> <4>Pid: 4844, comm: xfslogd/2 Tainted: P W --------------- 2.6.32-504.1.3.el6.x86_64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform <4>RIP: 0010:[<ffffffff8152ce7e>] [<ffffffff8152ce7e>] _spin_lock+0xe/0x30 <4>RSP: 0018:ffff8800a603fdb0 EFLAGS: 00010286 <4>RAX: 0000000000010000 RBX: ffff8800b91e4780 RCX: ffffe8ffffd086c8 <4>RDX: ffff8800bc87abb8 RSI: 0000000000000002 RDI: 00000000000000a8 <4>RBP: ffff8800a603fdb0 R08: 0000000000000000 R09: 0000000000000000 <4>R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000a8 <4>R13: 0000000000000000 R14: 0000000000000002 R15: ffffe8ffffd086c8 <4>FS: 00007f3aec62b700(0000) GS:ffff880002300000(0000) knlGS:0000000000000000 <4>CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b <4>CR2: 00000000000000a8 CR3: 00000000aeb56000 CR4: 00000000000407e0 <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>Process xfslogd/2 (pid: 4844, threadinfo ffff8800a603e000, task ffff880037af2aa0) <4>Stack: <4> ffff8800a603fde0 ffffffffa052bf79 ffff8800b91e4780 0000000000000000 <4><d> ffff8800bc87ab00 ffff8800a603ffd8 ffff8800a603fe10 ffffffffa052c739 <4><d> ffff8800a603fe30 ffff8800bc87ab00 ffffe8ffffd086c0 ffffffffa0547780 <4>Call Trace: <4> [<ffffffffa052bf79>] xlog_state_done_syncing+0x39/0xb0 [xfs] <4> [<ffffffffa052c739>] xlog_iodone+0x59/0xb0 [xfs] <4> [<ffffffffa0547780>] ? xfs_buf_iodone_work+0x0/0x50 [xfs] <4> [<ffffffffa05477a6>] xfs_buf_iodone_work+0x26/0x50 [xfs] <4> [<ffffffff81097fe0>] worker_thread+0x170/0x2a0 <4> [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 <4> [<ffffffff81097e70>] ? worker_thread+0x0/0x2a0 <4> [<ffffffff8109e66e>] kthread+0x9e/0xc0 Expected results: unmount complete without crash Additional info: proposed patch attached