Bug 484409
Field | Value
---|---
Summary | XFS related deadlock on MP system
Product | [Fedora] Fedora
Component | kernel
Version | 10
Hardware | x86_64
OS | Linux
Status | CLOSED INSUFFICIENT_DATA
Severity | high
Priority | low
Reporter | Jussi Eloranta <eloranta>
Assignee | Eric Sandeen <esandeen>
QA Contact | Fedora Extras Quality Assurance <extras-qa>
CC | kernel-maint, quintela
Doc Type | Bug Fix
Last Closed | 2009-06-30 21:12:57 UTC
Description
Jussi Eloranta
2009-02-06 16:53:37 UTC
Created attachment 331146
dmesg output
I suppose /var/log/messages doesn't have any more info?

Ok, 5 processes are blocked:

    15 INFO: task 0logwatch:2767 blocked for more than 120 seconds.
    15 INFO: task auditd:27611 blocked for more than 120 seconds.
    15 INFO: task molprop_2006_1_:1847 blocked for more than 120 seconds.
    15 INFO: task ntpd:27636 blocked for more than 120 seconds.
    15 INFO: task pdflush:691 blocked for more than 120 seconds.

4 of them are stuck behind pdflush ("?" functions removed):

    INFO: task pdflush:691 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    pdflush D ffff8808228771c0 0 691 2
     ffff880a90db1cc0 0000000000000046 ffffe20021440c00 ffffe20021440c38
     ffffffff816e1500 ffffffff816e1500 ffff880c21d32e20 ffff8808229f9710
     ffff880c21d33198 0000000021440768 ffffe200214407a0 ffff880c21d33198
    Call Trace:
     [<ffffffffa004dcae>] xfs_buf_wait_unpin+0x7e/0xa5 [xfs]
     [<ffffffffa004eade>] xfs_buf_iorequest+0x28/0x6c [xfs]
     [<ffffffffa0052a93>] xfs_bdstrat_cb+0x19/0x3b [xfs]
     [<ffffffffa004b1f7>] xfs_bwrite+0x5f/0xae [xfs]
     [<ffffffffa004682e>] xfs_syncsub+0x123/0x22f [xfs]
     [<ffffffffa004697c>] xfs_sync+0x42/0x47 [xfs]
     [<ffffffffa0053e88>] xfs_fs_write_super+0x23/0x2b [xfs]
     [<ffffffff810c1902>] sync_supers+0x71/0xc4
     [<ffffffff81095db9>] wb_kupdate+0x35/0x119
     [<ffffffff8109683f>] pdflush+0x16e/0x231
     [<ffffffff81054e9b>] kthread+0x49/0x76
     [<ffffffff810116e9>] child_rip+0xa/0x11

At first glance, this is xfs waiting for an I/O completion. Either xfs got the counting wrong or the storage lost an I/O, perhaps. I've not seen this sort of hang before, or at least not recently...

It'd be nice to know if there are any storage errors. Perhaps a serial console or remote syslog would be good in case this happens again, to gather more info?

I haven't seen any I/O-related errors on this system. But maybe something is about to break down - who knows. As this is not easy to reproduce, I will just have to wait until it happens again and try to get more info then. Also, I will try to set up remote syslog.

Thanks - I think there have been lost I/O completion issues with md in the past, though very rare... I'd probably chalk this up to that, but I know that's not a very satisfying answer...

Have you seen this since? If not, I'll chalk it up to bogons in md and close...

I have not seen it any more - maybe it has been fixed...

Ok, I'm going to close it based on the age and the lack of info we have about the problem, but if you see it again, please feel free to re-open.

Thanks,
-Eric
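
For anyone triaging a similar hang, the per-task tally of hung-task warnings quoted earlier in this report can be reproduced from a saved dmesg log. The following is a minimal Python sketch, not a tool used in this bug: the function name `count_hung_tasks` and the sample text are illustrative assumptions, and the pattern relies only on the message format shown above.

```python
import re
from collections import Counter

# Matches the kernel hung-task warning as it appears in this report, e.g.
# "INFO: task pdflush:691 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (.+):(\d+) blocked for more than (\d+) seconds"
)


def count_hung_tasks(log_text):
    """Return a Counter mapping (task name, pid) to the number of warnings."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Hypothetical sample input; in practice, read the saved dmesg output instead.
sample_log = "\n".join([
    "INFO: task pdflush:691 blocked for more than 120 seconds.",
    "INFO: task ntpd:27636 blocked for more than 120 seconds.",
    "INFO: task pdflush:691 blocked for more than 120 seconds.",
    "some unrelated kernel message",
])

for (name, pid), hits in count_hung_tasks(sample_log).most_common():
    print(f"{hits} INFO: task {name}:{pid}")
```

Keying on both task name and PID keeps two instances of the same program separate, which matters when deciding which task the others are queued behind.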