| Summary: | fsstress via nfs leads to kernel panic | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Eryu Guan <eguan> | ||||
| Component: | kernel | Assignee: | Jeff Layton <jlayton> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Filesystem QE <fs-qe> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 6.1 | CC: | bfields, jlayton, kzhang, rwheeler, steved, yanwang | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | All | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2011-07-08 19:55:47 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
|
||||||
|
Description
Eryu Guan
2011-03-02 02:55:30 UTC
I'll test more with and without NFS, and update the results here.
On another i386 host, fsstress was blocked for more than 120 seconds, but the kernel did not panic:
INFO: task fsstress:5984 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D f011dba8 0 5984 5975 0x00000000
f1d18030 00000086 c0540345 f011dba8 002a1068 5c5c8b82 00000000 00000004
c1f081a0 f06ce900 00000a72 c0ade1a0 c0ade1a0 f1d182d8 c0ade1a0 c0ad9bd4
c0ade1a0 6c81473b 00000a72 f1d182d8 ffffffff 6c812a96 c1f48700 f72c405c
Call Trace:
[<c0540345>] ? mntput_no_expire+0x15/0xd0
[<c043f41d>] ? enqueue_entity+0x37d/0x400
[<c043f931>] ? enqueue_task_fair+0x31/0x70
[<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
[<c08217fd>] ? mutex_lock+0x1d/0x40
[<c054d7b0>] ? sync_filesystems+0x10/0x100
[<c054d8de>] ? sys_sync+0xe/0x40
[<c0409adf>] ? sysenter_do_call+0x12/0x28
INFO: task fsstress:5985 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D db38bf4c 0 5985 5975 0x00000000
f1e98030 00000082 00000002 db38bf4c c1f03bd4 00000000 00000000 00000003
c1ec81a0 f06ce740 00000a75 c0ade1a0 c0ade1a0 f1e982d8 c0ade1a0 c0ad9bd4
c0ade1a0 0a3c2424 00000a75 f1e982d8 c1f03bd4 0a3c0b63 00aadd6a f1e98030
Call Trace:
[<c043f41d>] ? enqueue_entity+0x37d/0x400
[<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
[<c08217fd>] ? mutex_lock+0x1d/0x40
[<c054d7b0>] ? sync_filesystems+0x10/0x100
[<c054d8de>] ? sys_sync+0xe/0x40
[<c0409adf>] ? sysenter_do_call+0x12/0x28
INFO: task fsstress:6001 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D c1ec3d00 0 6001 5975 0x00000000
f07ffab0 00000086 00000400 c1ec3d00 00000002 efd3ff48 efd3ff44 00000006
c1f881a0 f2bdd200 00000a70 c0ade1a0 c0ade1a0 f07ffd58 c0ade1a0 c0ad9bd4
c0ade1a0 c302a8b2 00000a70 f07ffd58 c1ec3bd4 00000400 00aa9988 f07ffab0
Call Trace:
[<c043f41d>] ? enqueue_entity+0x37d/0x400
[<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
[<c08217fd>] ? mutex_lock+0x1d/0x40
[<c054d7b0>] ? sync_filesystems+0x10/0x100
[<c054d8de>] ? sys_sync+0xe/0x40
[<c0409adf>] ? sysenter_do_call+0x12/0x28
INFO: task fsstress:6007 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D e5799f4c 0 6007 5975 0x00000000
d7ee7ab0 00000086 00000002 e5799f4c c1e83bd4 00000000 00000000 c09fa020
c09fa020 f0669ac0 00000a6e c0ade1a0 c0ade1a0 d7ee7d58 c0ade1a0 c0ad9bd4
c0ade1a0 bd6faf97 00000a6e d7ee7d58 c1e83bd4 c040b1c0 00aa73a6 d7ee7ab0
Call Trace:
[<c040b1c0>] ? do_IRQ+0x50/0xc0
[<c040a030>] ? common_interrupt+0x30/0x38
[<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
[<c08217fd>] ? mutex_lock+0x1d/0x40
[<c054d7b0>] ? sync_filesystems+0x10/0x100
[<c054d8de>] ? sys_sync+0xe/0x40
[<c0409adf>] ? sysenter_do_call+0x12/0x28
INFO: task fsstress:6009 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D d7139f4c 0 6009 5975 0x00000000
d7ee7030 00000082 00000002 d7139f4c c1f43bd4 00000000 00000000 00000002
c1e881a0 f0669e40 00000a6f c0ade1a0 c0ade1a0 d7ee72d8 c0ade1a0 c0ad9bd4
c0ade1a0 0df3affc 00000a6f d7ee72d8 c1f43bd4 0df398ac 00aa78d2 d7ee7030
Call Trace:
[<c043f41d>] ? enqueue_entity+0x37d/0x400
[<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
[<c08217fd>] ? mutex_lock+0x1d/0x40
[<c054d7b0>] ? sync_filesystems+0x10/0x100
[<c054d8de>] ? sys_sync+0xe/0x40
[<c0409adf>] ? sysenter_do_call+0x12/0x28
INFO: task fsstress:6015 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D c1f03e00 0 6015 5975 0x00000000
ec643030 00000086 000007ff c1f03e00 00000002 d7057f48 d7057f44 00000002
c1e881a0 f05b43c0 00000a71 c0ade1a0 c0ade1a0 ec6432d8 c0ade1a0 c0ad9bd4
c0ade1a0 4e09ccc2 00000a71 ec6432d8 c1f03bd4 000007ff 00aa9e30 ec643030
Call Trace:
[<c043f41d>] ? enqueue_entity+0x37d/0x400
[<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
[<c08217fd>] ? mutex_lock+0x1d/0x40
[<c054d7b0>] ? sync_filesystems+0x10/0x100
[<c054d8de>] ? sys_sync+0xe/0x40
[<c0409adf>] ? sysenter_do_call+0x12/0x28
INFO: task fsstress:6016 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D 00000004 0 6016 5975 0x00000000
deb3dab0 00000082 f70c2000 00000004 00000001 c0463325 d36273e0 000048e5
00000000 f05b4c80 00000a6c c0ade1a0 c0ade1a0 deb3dd58 c0ade1a0 c0ad9bd4
c0ade1a0 d362840c 00000a6c deb3dd58 efbd2000 c09f99e4 c04b6115 c0459b15
Call Trace:
[<c0463325>] ? run_timer_softirq+0x35/0x2c0
[<c04b6115>] ? rcu_process_callbacks+0x35/0x40
[<c0459b15>] ? __do_softirq+0xb5/0x1b0
[<c0459d75>] ? irq_exit+0x35/0x70
[<c0427523>] ? smp_apic_timer_interrupt+0x53/0x90
[<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
[<c08217fd>] ? mutex_lock+0x1d/0x40
[<c054d7b0>] ? sync_filesystems+0x10/0x100
[<c054d8de>] ? sys_sync+0xe/0x40
[<c0409adf>] ? sysenter_do_call+0x12/0x28
INFO: task fsstress:6039 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D d855e888 0 6039 5975 0x00000000
efb85030 00000086 f87edca0 d855e888 0029ffba f118ce48 00000000 0000355b
00000000 f1e2c200 00000a6e c0ade1a0 c0ade1a0 efb852d8 c0ade1a0 c0ad9bd4
c0ade1a0 7d460832 00000a6e efb852d8 ef4da000 7d45e1b0 c1f88700 f72c405c
Call Trace:
[<c043f41d>] ? enqueue_entity+0x37d/0x400
[<c043f931>] ? enqueue_task_fair+0x31/0x70
[<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
[<c08217fd>] ? mutex_lock+0x1d/0x40
[<c054d7b0>] ? sync_filesystems+0x10/0x100
[<c054d8de>] ? sys_sync+0xe/0x40
[<c0409adf>] ? sysenter_do_call+0x12/0x28
INFO: task fsstress:6048 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D c1e83e00 0 6048 5975 0x00000000
d1b32030 00000082 00000400 c1e83e00 00000002 d5ef5f48 d5ef5f44 00000000
c1e081a0 f072e040 00000a6f c0ade1a0 c0ade1a0 d1b322d8 c0ade1a0 c0ad9bd4
c0ade1a0 819cc28d 00000a6f d1b322d8 c1e83bd4 00000400 00aa8035 d1b32030
Call Trace:
[<c043f41d>] ? enqueue_entity+0x37d/0x400
[<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
[<c08217fd>] ? mutex_lock+0x1d/0x40
[<c054d7b0>] ? sync_filesystems+0x10/0x100
[<c054d8de>] ? sys_sync+0xe/0x40
[<c0409adf>] ? sysenter_do_call+0x12/0x28
INFO: task fsstress:6055 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D ed888c70 0 6055 5975 0x00000000
ed889ab0 00000086 ef575c58 ed888c70 002a04bb 4cfe0106 00000000 000080c5
00000000 f0192900 00000a6f c0ade1a0 c0ade1a0 ed889d58 c0ade1a0 c0ad9bd4
c0ade1a0 79882cc4 00000a6f ed889d58 ed914000 7988195e c1e48700 f72c405c
Call Trace:
[<c043f41d>] ? enqueue_entity+0x37d/0x400
[<c043f931>] ? enqueue_task_fair+0x31/0x70
[<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
[<c08217fd>] ? mutex_lock+0x1d/0x40
[<c054d7b0>] ? sync_filesystems+0x10/0x100
[<c054d8de>] ? sys_sync+0xe/0x40
[<c0409adf>] ? sysenter_do_call+0x12/0x28
This is not a regression; it also happens on the 6.0 GA kernel.
I also found this on an s390x host, with a different call trace. I'm not sure whether they share the same root cause.
[<00000000004be12e>] mutex_lock+0x5a/0x60
[<00000000002800be>] sync_filesystems+0x3a/0x184
[<000000000028028e>] sys_sync+0x32/0x64
[<0000000000118464>] sysc_tracego+0xe/0x14
[<0000020000137f1a>] 0x20000137f1a
INFO: task fsstress:2198 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D 00000000004be032 0 2198 2186 0x00000200
000000001d107bb0 00000000010e4e00 000000001d107bb0 000000001d107bd8
000000000380f518 00000000008a5e00 00000000010e4e00 000000000380f518
000000000380f518 0000000000000000 00000000024ee990 000000000080ee98
00000000008a5e00 00000000024eee28 000000000380f4e0 00000000010e4e00
00000000004c6c78 00000000004bcbae 000000001d107c10 000000001d107dc8
Call Trace:
([<00000000004bcbae>] schedule+0x5aa/0xf84)
[<00000000004be032>] __mutex_lock_slowpath+0xa6/0x148
[<00000000004be12e>] mutex_lock+0x5a/0x60
[<00000000002800be>] sync_filesystems+0x3a/0x184
[<000000000028028e>] sys_sync+0x32/0x64
[<0000000000118464>] sysc_tracego+0xe/0x14
[<0000020000137f1a>] 0x20000137f1a
INFO: task fsstress:2199 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fsstress D 00000000004be032 0 2199 2186 0x00000200
000000001f1b3bb0 00000000010e4e00 000000001f1b3bb0 000000001f1b3bd8
000000001d533418 00000000008a5e00 00000000010e4e00 000000001d533418
000000001d533418 0000000000000000 00000000050ba540 000000000080ee98
00000000008a5e00 00000000050ba9d8 000000001d5333e0 00000000010e4e00
00000000004c6c78 00000000004bcbae 000000001f1b3c10 000000001f1b3dc8
Call Trace:
([<00000000004bcbae>] schedule+0x5aa/0xf84)
[<00000000004be032>] __mutex_lock_slowpath+0xa6/0x148
[<00000000004be12e>] mutex_lock+0x5a/0x60
[<00000000002800be>] sync_filesystems+0x3a/0x184
[<000000000028028e>] sys_sync+0x32/0x64
[<0000000000118464>] sysc_tracego+0xe/0x14
[<0000020000137f1a>] 0x20000137f1a
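For anyone triaging a saved console log like the ones above, here is a minimal Python sketch that tallies the hung-task reports and the symbols appearing in their backtraces. It assumes the frames are printed in the usual `[<address>] symbol+offset/size` form; the function name `summarize_hung_tasks` and the sample text are illustrative only and not part of any kernel tooling.

```python
import re
from collections import Counter

# Hung-task watchdog line, e.g.
# "INFO: task fsstress:5984 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

# One backtrace frame, e.g. "[<c054d7b0>] ? sync_filesystems+0x10/0x100".
# The "?" marker is optional, so frames without it are matched as well.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_hung_tasks(log_text):
    """Return ([(comm, pid), ...], Counter of symbols seen in backtrace frames)."""
    tasks = [
        (m.group("comm"), int(m.group("pid")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]
    symbols = Counter(m.group("sym") for m in FRAME_RE.finditer(log_text))
    return tasks, symbols


# Hypothetical, shortened sample modelled on the i386 reports in this bug.
sample = """\
INFO: task fsstress:5984 blocked for more than 120 seconds.
Call Trace:
 [<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
 [<c08217fd>] ? mutex_lock+0x1d/0x40
 [<c054d7b0>] ? sync_filesystems+0x10/0x100
 [<c054d8de>] ? sys_sync+0xe/0x40
INFO: task fsstress:5985 blocked for more than 120 seconds.
Call Trace:
 [<c08218f8>] ? __mutex_lock_slowpath+0xd8/0x140
 [<c054d7b0>] ? sync_filesystems+0x10/0x100
"""

tasks, symbols = summarize_hung_tasks(sample)
print(tasks)
print(symbols.most_common(3))
```

If nearly every report contains `sync_filesystems` and `__mutex_lock_slowpath`, as in the traces in this bug, that points at tasks queuing on the same mutex inside sys_sync rather than at unrelated stalls.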
Below is the output of "echo w > /proc/sysrq-trigger"; it does not seem very informative ...
[<00000000002800be>] sync_filesystems+0x3a/0x184
[<000000000028028e>] sys_sync+0x32/0x64
[<0000000000118464>] sysc_tracego+0xe/0x14
[<0000020000137f1a>] 0x20000137f1a
fsstress D 00000000004be032 0 3018 2186 0x00000200
0000000017677bb0 00000000010e4e00 0000000017677bb0 0000000017677bd8
0000000017c1c478 00000000008a5e00 00000000010e4e00 0000000017c1c478
0000000017c1c478 0000000000000001 0000000017677e00 000000000080ee98
00000000008a5e00 000000001782ad28 0000000017c1c440 00000000010e4e00
00000000004c6c78 00000000004bcbae 0000000017677c10 0000000017677dc8
Call Trace:
([<00000000004bcbae>] schedule+0x5aa/0xf84)
[<00000000004be032>] __mutex_lock_slowpath+0xa6/0x148
[<00000000004be12e>] mutex_lock+0x5a/0x60
[<00000000002800be>] sync_filesystems+0x3a/0x184
[<000000000028028e>] sys_sync+0x32/0x64
[<0000000000118464>] sysc_tracego+0xe/0x14
[<0000020000137f1a>] 0x20000137f1a
fsstress D 00000000004be032 0 3019 2186 0x00000200
000000001799fbb0 00000000010e4e00 000000001799fbb0 000000001799fbd8
00000000024ba578 00000000008a5e00 00000000010e4e00 00000000024ba578
00000000024ba578 0000000000000000 000000001782a040 000000000080ee98
00000000008a5e00 000000001782a4d8 00000000024ba540 00000000010e4e00
00000000004c6c78 00000000004bcbae 000000001799fc10 000000001799fdc8
Call Trace:
([<00000000004bcbae>] schedule+0x5aa/0xf84)
[<00000000004be032>] __mutex_lock_slowpath+0xa6/0x148
[<00000000004be12e>] mutex_lock+0x5a/0x60
.rt_runtime : 950.000000
runnable tasks:
task PID tree-key switches prio exec-runtime
sum-exec sum-sleep
--------------------------------------------------------------------------------
--------------------------
R bash 2017 96765.804301 87 120 96765.804301 1
80.667980 245534.321129 /
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Created attachment 492022
Call trace and sysrq-w output
Got a similar call trace on an x86_64 host when testing XFS without NFS. It seems to be a file-system-independent issue.
The host did not hang; fsstress finished eventually.
Changed platform to All.
The testing mentioned in comment 7 was performed on the 2.6.32-131.0.1.el6 kernel. The panic described in comment 0 has not been seen a second time.
(In reply to comment #0)
> Description of problem:
> fsstress via nfs leads to kernel panic
>
> Version-Release number of selected component (if applicable):
> [root@intel-d3c69-01 ~]# uname -a
> Linux intel-d3c69-01.rhts.eng.bos.redhat.com 2.6.32-118.el6.i686 #1 SMP Tue Feb 22 11:12:47 EST 2011 i686 i686 i386 GNU/Linux
>
> How reproducible:
> Not sure, I haven't tested it multiple times
>
> Steps to Reproduce:
> 1. Install fsstress from LTP
> 2. Setup nfs server, that could be:
> mkdir /home/testdir; mkdir /mnt/nfs
> echo "/home/testdir *(rw,no_root_squash)" >> /etc/exports
> service nfs start
> mount -t nfs localhost:/home/testdir /mnt/nfs
> 3. Run fsstress
> fsstress -d /mnt/nfs -n 1000 -p 1000
> 4. Wait for kernel panic, I guess it may take several hours
Does this panic/hang happen when a non-loopback (a remote server) mount point is used?
(In reply to comment #9)
> Does this panic/hang happen when a non-loopback (a remote server)
> mount point is used?
I re-tested on the 2.6.32-131.0.15.el6 kernel and the issue seems to be gone: I saw neither a panic nor a hang. I tested on both a locally mounted NFS export and a remotely mounted one.
Ok, thanks. Closing bug per comment #10. Please reopen if this reappears. |