Hide Forgot
Created attachment 512868 [details] console log Description of problem: xfstests 013 INFO: task fsstress:412 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fsstress D ffffffff801538c1 0 412 411 413 (NOTLB) ffff81005c49bed8 0000000000000086 0000000002441fff ffffffff800481d4 ffff81005c49be48 0000000000000007 ffff81007c5aa0c0 ffff810002983100 000008f2f501c8b9 00000000000db424 ffff81007c5aa2a8 0000000700000000 Call Trace: [<ffffffff800481d4>] pagevec_lookup_tag+0x1a/0x21 [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff80022357>] __up_read+0x19/0x7f [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14 [<ffffffff800f59b0>] __sync_inodes+0x85/0xaa [<ffffffff800e67e5>] sync_filesystems+0x13/0x102 [<ffffffff800e34d3>] do_sync+0x28/0x5a [<ffffffff800e3513>] sys_sync+0xe/0x12 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 INFO: task fsstress:413 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fsstress D ffffffff801538c1 0 413 411 414 412 (NOTLB) ffff810047567ed8 0000000000000086 0000000002441fff ffffffff800481d4 ffff810047567e48 0000000000000007 ffff81007f47f080 ffff810037c60080 000008f2baf95176 00000000000bf1eb ffff81007f47f268 000000068873b1e6 Call Trace: [<ffffffff800481d4>] pagevec_lookup_tag+0x1a/0x21 [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff80022357>] __up_read+0x19/0x7f [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14 [<ffffffff800f59b0>] __sync_inodes+0x85/0xaa [<ffffffff800e67e5>] sync_filesystems+0x13/0x102 [<ffffffff800e34d3>] do_sync+0x28/0x5a [<ffffffff800e3513>] sys_sync+0xe/0x12 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 INFO: task fsstress:414 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fsstress D ffffffff801538c1 0 414 411 415 413 (NOTLB) ffff81007dd7ded8 0000000000000082 0000000002441fff ffffffff800481d4 ffff81007dd7de48 0000000000000007 ffff8100256a8080 ffff81007f44c100 000008f28526742e 000000000038ffa0 ffff8100256a8268 0000000300000000 Call Trace: [<ffffffff800481d4>] pagevec_lookup_tag+0x1a/0x21 [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff80064604>] __down_read+0x12/0x92 [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14 [<ffffffff800e67e5>] sync_filesystems+0x13/0x102 [<ffffffff800e34d3>] do_sync+0x28/0x5a [<ffffffff800e3513>] sys_sync+0xe/0x12 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 INFO: task fsstress:416 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fsstress D ffffffff801538c1 0 416 411 417 415 (NOTLB) ffff8100501bfed8 0000000000000086 00000000000065f1 ffff81000c2e6800 ffff8100501bfe48 0000000000000007 ffff81003c71e820 ffff810037c51100 000008f2e25c2fd0 000000000000507f ffff81003c71ea08 000000050151d708 Call Trace: [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff80064604>] __down_read+0x12/0x92 [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14 [<ffffffff800e67e5>] sync_filesystems+0x13/0x102 [<ffffffff800e34d3>] do_sync+0x28/0x5a [<ffffffff800e3513>] sys_sync+0xe/0x12 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 INFO: task fsstress:417 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fsstress D ffffffff801538c1 0 417 411 418 416 (NOTLB) ffff810044397ed8 0000000000000082 0000000010801040 ffff810075d25c00 ffff810044397ef8 0000000000000007 ffff81005a297820 ffff810037c01820 000008f2f4fb1e57 0000000000000d20 ffff81005a297a08 0000000110d790c0 Call Trace: [<ffffffff80047172>] try_to_wake_up+0x472/0x484 [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff80022357>] __up_read+0x19/0x7f [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14 [<ffffffff800f59b0>] __sync_inodes+0x85/0xaa [<ffffffff800e67e5>] sync_filesystems+0x13/0x102 [<ffffffff800e34d3>] do_sync+0x28/0x5a [<ffffffff800e3513>] sys_sync+0xe/0x12 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 INFO: task fsstress:418 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fsstress D ffffffff801538c1 0 418 411 419 417 (NOTLB) ffff81005a67fed8 0000000000000082 00000000000065f1 ffff81000c2e6800 ffff81005a67fe48 0000000000000007 ffff810020119100 ffff81007875f820 000008f2a1ec8603 0000000000001894 ffff8100201192e8 000000045a67fee8 Call Trace: [<ffffffff8002af8f>] iput+0x4b/0x84 [<ffffffff80021095>] sync_sb_inodes+0x146/0x26f [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff80064604>] __down_read+0x12/0x92 [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14 [<ffffffff800e67e5>] sync_filesystems+0x13/0x102 [<ffffffff800e34d3>] do_sync+0x28/0x5a [<ffffffff800e3513>] sys_sync+0xe/0x12 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 INFO: task fsstress:419 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fsstress D ffffffff801538c1 0 419 411 420 418 (NOTLB) ffff8100666c5ed8 0000000000000082 0000000002441fff ffffffff800481d4 ffff8100666c5e48 0000000000000007 ffff810071f447e0 ffff810037c60080 000008f2aff3d5ec 0000000000006411 ffff810071f449c8 000000067eced820 Call Trace: [<ffffffff800481d4>] pagevec_lookup_tag+0x1a/0x21 [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff80064604>] __down_read+0x12/0x92 [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14 [<ffffffff800e67e5>] sync_filesystems+0x13/0x102 [<ffffffff800e34d3>] do_sync+0x28/0x5a [<ffffffff800e3513>] sys_sync+0xe/0x12 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 INFO: task fsstress:420 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fsstress D ffffffff801538c1 0 420 411 421 419 (NOTLB) ffff81000ca83ed8 0000000000000082 00000000000065f1 ffff81000c2e6800 ffff81000ca83e48 0000000000000007 ffff8100144960c0 ffff810037c60080 000008f2f4ffca70 000000000000c4ce ffff8100144962a8 000000068873b1f9 Call Trace: [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff80022357>] __up_read+0x19/0x7f [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14 [<ffffffff800f59b0>] __sync_inodes+0x85/0xaa [<ffffffff800e67e5>] sync_filesystems+0x13/0x102 [<ffffffff800e34d3>] do_sync+0x28/0x5a [<ffffffff800e3513>] sys_sync+0xe/0x12 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 INFO: task fsstress:421 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fsstress D ffffffff801538c1 0 421 411 422 420 (NOTLB) ffff810066ec3ed8 0000000000000086 00000000000065f1 ffff81000c2e6800 ffff810066ec3e48 0000000000000007 ffff810047b78040 ffff810002983100 000008f2a1d6cc9d 00000000001a5949 ffff810047b78228 0000000700000000 Call Trace: [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff80064604>] __down_read+0x12/0x92 [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14 [<ffffffff800e67e5>] sync_filesystems+0x13/0x102 [<ffffffff800e34d3>] do_sync+0x28/0x5a [<ffffffff800e3513>] sys_sync+0xe/0x12 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 INFO: task fsstress:422 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fsstress D ffffffff801538c1 0 422 411 423 421 (NOTLB) ffff810065fbded8 0000000000000086 0000000002441fff ffffffff800481d4 ffff810065fbde48 0000000000000007 ffff81004fea57e0 ffff810037c60080 000008f2f579d9f5 000000000000498e ffff81004fea59c8 0000000672fc4408 Call Trace: [<ffffffff800481d4>] pagevec_lookup_tag+0x1a/0x21 [<ffffffff80063c4f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff80022357>] __up_read+0x19/0x7f [<ffffffff80063c99>] .text.lock.mutex+0xf/0x14 [<ffffffff800f59b0>] __sync_inodes+0x85/0xaa [<ffffffff800e67e5>] sync_filesystems+0x13/0x102 [<ffffffff800e34d3>] do_sync+0x28/0x5a [<ffffffff800e3513>] sys_sync+0xe/0x12 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 Version-Release number of selected component (if applicable): kernel-2.6.18-238.19.1.el5 How reproducible: Only once for now I've tried to run 013 in loop, but still cannot reproduce it. Steps to Reproduce: 1. xfstests 013 2. 3. Actual results: task hang Expected results: pass test Additional info: Full log is attached.
Which file system are you using here? thanks!
(In reply to comment #1) > Which file system are you using here? > > thanks! Sorry, I shouldn't hide it in summary. It's xfs.
I'm not so sure this is an XFS problem. kjournald is in the middle of waiting for IO during a transaction commit like this: kjournald D ffff81007f398040 0 717 51 742 678 (L-TLB) ffff81007d6bbcf0 0000000000000046 ffff81007e27d928 ffff81007b527928 ffffffff800155a2 000000000000000a ffff81007e0e0860 ffff81007f398040 00001c3a5732230e 0000000000005594 ffff81007e0e0a48 000000068807b407 Call Trace: [<ffffffff800155a2>] sync_buffer+0x0/0x3f [<ffffffff8006ec4e>] do_gettimeofday+0x40/0x90 [<ffffffff8005aa30>] getnstimeofday+0x10/0x28 [<ffffffff800155a2>] sync_buffer+0x0/0x3f [<ffffffff800637ca>] io_schedule+0x3f/0x67 [<ffffffff800155dd>] sync_buffer+0x3b/0x3f [<ffffffff800639f6>] __wait_on_bit+0x40/0x6e [<ffffffff800155a2>] sync_buffer+0x0/0x3f [<ffffffff80063a90>] out_of_line_wait_on_bit+0x6c/0x78 [<ffffffff800a2929>] wake_bit_function+0x0/0x23 [<ffffffff880312e8>] :jbd:__journal_file_buffer+0x13e/0x243 [<ffffffff880339a5>] :jbd:journal_commit_transaction+0x543/0x1066 [<ffffffff8003dd98>] lock_timer_base+0x1b/0x3c [<ffffffff8004b432>] try_to_del_timer_sync+0x7f/0x88 [<ffffffff880375d3>] :jbd:kjournald+0xc1/0x213 [<ffffffff800a28fb>] autoremove_wake_function+0x0/0x2e [<ffffffff88037512>] :jbd:kjournald+0x0/0x213 [<ffffffff800a26e3>] keventd_create_kthread+0x0/0xc4 [<ffffffff80032b26>] kthread+0xfe/0x132 [<ffffffff8005dfb1>] child_rip+0xa/0x11 [<ffffffff800a26e3>] keventd_create_kthread+0x0/0xc4 [<ffffffff80032a28>] kthread+0x0/0x132 [<ffffffff8005dfa7>] child_rip+0x0/0x11 and syslogd is blocked on a fsync waiting for a journal commit: syslogd D ffffffff801538c1 0 2785 1 2788 2777 (NOTLB) ffff81007396bd88 0000000000000082 0000000600000296 0000000000000001 000000037396bd18 000000000000000a ffff81007e2b4100 ffff81007f44c100 00001c3a5731c376 000000000000cd33 ffff81007e2b42e8 0000000300000096 Call Trace: [<ffffffff88036d8a>] :jbd:log_wait_commit+0xa3/0xf5 [<ffffffff800a28fb>] autoremove_wake_function+0x0/0x2e [<ffffffff8803178a>] :jbd:journal_stop+0x1cf/0x1ff [<ffffffff8002fe73>] __writeback_single_inode+0x1d9/0x318 [<ffffffff800e2be6>] do_readv_writev+0x26e/0x291 [<ffffffff800f5a22>] sync_inode+0x24/0x33 [<ffffffff8804c370>] :ext3:ext3_sync_file+0xcc/0xf8 [<ffffffff80050508>] do_fsync+0x52/0xa4 [<ffffffff800e346b>] __do_fsync+0x23/0x36 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 then there's pcscd, acpid, sshd and other system daemons stuck waiting for ext3 journal commits. Then there's a sys_sync call from fsstress that is stuck waiting for a ext3 journal commit to complete, and then a bunch more sys_sync calls stuck behind that first one. It's that bunch of sys_sync() calls that triggered the hung task warning. So this looks like a problem with your root filesystem (ext3), not XFS. It is possible a manifestation of the well known fsync-the-world problem if you are using loopback devices hosted on the root ext3 filesystem to provide the TEST_DEV/SCRATCH_DEV devices for xfstests....
Thanks for the explaining. Yes, this is a testing on loop device hosted on ext3. (Perhaps qe should improve z-stream testing to try to avoid using loop device when running xfstests, as what we've already done with y-stream testing) Is there a bug number for the sync-the-world problem?
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).