| Summary: | Long EXT4 lockups and errors | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | ilya m. <imusayev> |
| Component: | kernel | Assignee: | Red Hat Kernel Manager <kernel-mgr> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.2 | CC: | dchinner, esandeen, jbacik, lczerner, rwheeler |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2013-02-15 13:09:05 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Forced FSCK had no errors. Package filesystem contains just basic system directories layout. It has nothing to do with ext4... reassigning to kernel. Ilya, please open a support ticket with Red Hat support - they can help you gather information and debug the issue. Best regards, Ric FYI: This issue does not occur on XFS fs. The IO Scheduler is "noop". Since RHEL 6.3 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux. |
Description of problem: While in process of bench-marking various file systems, we came across an issue where EXT4 partition would lock up under specific load. In current tests it was under iozone testing. A repetitive message i'd see in messages file is attached: Jan 9 23:38:51 minigiant5 kernel: INFO: task jbd2/md0-8:16169 blocked for more than 120 seconds. Jan 9 23:38:51 minigiant5 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jan 9 23:38:51 minigiant5 kernel: jbd2/md0-8 D 0000000000000005 0 16169 2 0x00000000 Jan 9 23:38:51 minigiant5 kernel: ffff88010153bc20 0000000000000046 0000000000000000 ffff88010153bbe4 Jan 9 23:38:51 minigiant5 kernel: 0000000000000000 ffff88012dc24c00 ffff880028215f80 0000000000000400 Jan 9 23:38:51 minigiant5 kernel: ffff880108f230f8 ffff88010153bfd8 000000000000f4e8 ffff880108f230f8 Jan 9 23:38:51 minigiant5 kernel: Call Trace: Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff814ed2a3>] io_schedule+0x73/0xc0 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811a9280>] sync_buffer+0x40/0x50 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff814edc5f>] __wait_on_bit+0x5f/0x90 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff814edd08>] out_of_line_wait_on_bit+0x78/0x90 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81090a50>] ? wake_bit_function+0x0/0x50 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811a9236>] __wait_on_buffer+0x26/0x30 Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa022a0e6>] jbd2_journal_commit_transaction+0xa76/0x14b0 [jbd2] Jan 9 23:38:51 minigiant5 kernel: [<ffffffff810096f0>] ? __switch_to+0xd0/0x320 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8107c99b>] ? try_to_del_timer_sync+0x7b/0xe0 Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa022f928>] kjournald2+0xb8/0x220 [jbd2] Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81090a10>] ? autoremove_wake_function+0x0/0x40 Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa022f870>] ? kjournald2+0x0/0x220 [jbd2] Jan 9 23:38:51 minigiant5 kernel: [<ffffffff810906a6>] kthread+0x96/0xa0 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81090610>] ? kthread+0x0/0xa0 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20 Jan 9 23:38:51 minigiant5 kernel: INFO: task iozone:17280 blocked for more than 120 seconds. Jan 9 23:38:51 minigiant5 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jan 9 23:38:51 minigiant5 kernel: iozone D 0000000000000000 0 17280 16178 0x00000000 Jan 9 23:38:51 minigiant5 kernel: ffff880127e779a8 0000000000000086 ffff880127e77998 00000000ffffffff Jan 9 23:38:51 minigiant5 kernel: ffffffff81ea1100 0000000000000286 ffff880127e77958 ffffffff8107c99b Jan 9 23:38:51 minigiant5 kernel: ffff8800b9071b38 ffff880127e77fd8 000000000000f4e8 ffff8800b9071b38 Jan 9 23:38:51 minigiant5 kernel: Call Trace: Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8107c99b>] ? try_to_del_timer_sync+0x7b/0xe0 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81090cfe>] ? prepare_to_wait+0x4e/0x80 Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa022909d>] do_get_write_access+0x29d/0x520 [jbd2] Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81090a50>] ? wake_bit_function+0x0/0x50 Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa0229471>] jbd2_journal_get_write_access+0x31/0x50 [jbd2] Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa027db78>] __ext4_journal_get_write_access+0x38/0x80 [ext4] Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa0259253>] ext4_reserve_inode_write+0x73/0xa0 [ext4] Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa02592cc>] ext4_mark_inode_dirty+0x4c/0x1d0 [ext4] Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa02595c0>] ext4_dirty_inode+0x40/0x60 [ext4] Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8119fc6b>] __mark_inode_dirty+0x3b/0x160 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811901e2>] file_update_time+0xf2/0x170 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81112b50>] __generic_file_aio_write+0x220/0x480 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8105e770>] ? default_wake_function+0x0/0x20 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81112e1f>] generic_file_aio_write+0x6f/0xe0 Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa0252de1>] ext4_file_write+0x61/0x1e0 [ext4] Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811b4ccd>] ? fsnotify_add_notify_event+0x12d/0x280 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811760fa>] do_sync_write+0xfa/0x140 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811b4613>] ? fsnotify+0x113/0x160 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81090a10>] ? autoremove_wake_function+0x0/0x40 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81218e2b>] ? selinux_file_permission+0xfb/0x150 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8120c1d6>] ? security_file_permission+0x16/0x20 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811763f8>] vfs_write+0xb8/0x1a0 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81176e01>] sys_write+0x51/0x90 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81176753>] ? sys_lseek+0x53/0x80 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b Jan 9 23:38:51 minigiant5 kernel: INFO: task iozone:17282 blocked for more than 120 seconds. Jan 9 23:38:51 minigiant5 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jan 9 23:38:51 minigiant5 kernel: iozone D 0000000000000000 0 17282 16178 0x00000000 Jan 9 23:38:51 minigiant5 kernel: ffff880108fe59a8 0000000000000082 0000000000000000 00000000ffffffff Jan 9 23:38:51 minigiant5 kernel: ffffffff81ea1100 0000000000000286 ffff880108fe5958 ffffffff8107c99b Jan 9 23:38:51 minigiant5 kernel: ffff880128e45078 ffff880108fe5fd8 000000000000f4e8 ffff880128e45078 Jan 9 23:38:51 minigiant5 kernel: Call Trace: Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8107c99b>] ? try_to_del_timer_sync+0x7b/0xe0 Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa022909d>] do_get_write_access+0x29d/0x520 [jbd2] Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81090a50>] ? wake_bit_function+0x0/0x50 Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa0229471>] jbd2_journal_get_write_access+0x31/0x50 [jbd2] Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa027db78>] __ext4_journal_get_write_access+0x38/0x80 [ext4] Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa0259253>] ext4_reserve_inode_write+0x73/0xa0 [ext4] Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa02592cc>] ext4_mark_inode_dirty+0x4c/0x1d0 [ext4] Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa02595c0>] ext4_dirty_inode+0x40/0x60 [ext4] Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8119fc6b>] __mark_inode_dirty+0x3b/0x160 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811901e2>] file_update_time+0xf2/0x170 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81112b50>] __generic_file_aio_write+0x220/0x480 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8105e770>] ? default_wake_function+0x0/0x20 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81112e1f>] generic_file_aio_write+0x6f/0xe0 Jan 9 23:38:51 minigiant5 kernel: [<ffffffffa0252de1>] ext4_file_write+0x61/0x1e0 [ext4] Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811b4ccd>] ? fsnotify_add_notify_event+0x12d/0x280 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811760fa>] do_sync_write+0xfa/0x140 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811b4613>] ? fsnotify+0x113/0x160 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81090a10>] ? autoremove_wake_function+0x0/0x40 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81218e2b>] ? selinux_file_permission+0xfb/0x150 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8120c1d6>] ? security_file_permission+0x16/0x20 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff811763f8>] vfs_write+0xb8/0x1a0 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff81176e01>] sys_write+0x51/0x90 Jan 9 23:38:51 minigiant5 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b Jan 10 00:06:51 minigiant5 kernel: INFO: task jbd2/md0-8:16169 blocked for more than 120 seconds. Jan 10 00:06:51 minigiant5 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jan 10 00:06:51 minigiant5 kernel: jbd2/md0-8 D 0000000000000005 0 16169 2 0x00000000 Jan 10 00:06:51 minigiant5 kernel: ffff88010153bc20 0000000000000046 ffff88010153bbe8 ffff88010153bbe4 Jan 10 00:06:51 minigiant5 kernel: 0000000000000001 ffff88012dc24c00 ffff8800282f5f80 0000000000000400 Jan 10 00:06:51 minigiant5 kernel: ffff880108f230f8 ffff88010153bfd8 000000000000f4e8 ffff880108f230f8 Jan 10 00:06:51 minigiant5 kernel: Call Trace: Jan 10 00:06:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff814ed2a3>] io_schedule+0x73/0xc0 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff811a9280>] sync_buffer+0x40/0x50 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff814edc5f>] __wait_on_bit+0x5f/0x90 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff814edd08>] out_of_line_wait_on_bit+0x78/0x90 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff81090a50>] ? wake_bit_function+0x0/0x50 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff811a9236>] __wait_on_buffer+0x26/0x30 Jan 10 00:06:51 minigiant5 kernel: [<ffffffffa022a7a1>] jbd2_journal_commit_transaction+0x1131/0x14b0 [jbd2] Jan 10 00:06:51 minigiant5 kernel: [<ffffffff810096f0>] ? __switch_to+0xd0/0x320 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff8107c99b>] ? try_to_del_timer_sync+0x7b/0xe0 Jan 10 00:06:51 minigiant5 kernel: [<ffffffffa022f928>] kjournald2+0xb8/0x220 [jbd2] Jan 10 00:06:51 minigiant5 kernel: [<ffffffff81090a10>] ? autoremove_wake_function+0x0/0x40 Jan 10 00:06:51 minigiant5 kernel: [<ffffffffa022f870>] ? kjournald2+0x0/0x220 [jbd2] Jan 10 00:06:51 minigiant5 kernel: [<ffffffff810906a6>] kthread+0x96/0xa0 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff81090610>] ? kthread+0x0/0xa0 Jan 10 00:06:51 minigiant5 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20 Jan 10 00:16:51 minigiant5 kernel: INFO: task jbd2/md0-8:16169 blocked for more than 120 seconds. Jan 10 00:16:51 minigiant5 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jan 10 00:16:51 minigiant5 kernel: jbd2/md0-8 D 0000000000000005 0 16169 2 0x00000000 Jan 10 00:16:51 minigiant5 kernel: ffff88010153bc20 0000000000000046 ffff88010153bbe8 ffff88010153bbe4 Jan 10 00:16:51 minigiant5 kernel: 0000000000000001 ffff88012dc24c00 ffff880028235f80 0000000000000400 Jan 10 00:16:51 minigiant5 kernel: ffff880108f230f8 ffff88010153bfd8 000000000000f4e8 ffff880108f230f8 Jan 10 00:16:51 minigiant5 kernel: Call Trace: Jan 10 00:16:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff814ed2a3>] io_schedule+0x73/0xc0 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff811a9280>] sync_buffer+0x40/0x50 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff814edc5f>] __wait_on_bit+0x5f/0x90 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff811a9240>] ? sync_buffer+0x0/0x50 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff814edd08>] out_of_line_wait_on_bit+0x78/0x90 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff81090a50>] ? wake_bit_function+0x0/0x50 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff811a9236>] __wait_on_buffer+0x26/0x30 Jan 10 00:16:51 minigiant5 kernel: [<ffffffffa022a7a1>] jbd2_journal_commit_transaction+0x1131/0x14b0 [jbd2] Jan 10 00:16:51 minigiant5 kernel: [<ffffffff810096f0>] ? __switch_to+0xd0/0x320 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff8107c99b>] ? try_to_del_timer_sync+0x7b/0xe0 Jan 10 00:16:51 minigiant5 kernel: [<ffffffffa022f928>] kjournald2+0xb8/0x220 [jbd2] Jan 10 00:16:51 minigiant5 kernel: [<ffffffff81090a10>] ? autoremove_wake_function+0x0/0x40 Jan 10 00:16:51 minigiant5 kernel: [<ffffffffa022f870>] ? kjournald2+0x0/0x220 [jbd2] Jan 10 00:16:51 minigiant5 kernel: [<ffffffff810906a6>] kthread+0x96/0xa0 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff81090610>] ? kthread+0x0/0xa0 Jan 10 00:16:51 minigiant5 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20 Version-Release number of selected component (if applicable): [root@minigiant5 ~]# uname -a Linux minigiant5 2.6.32-220.2.1.el6.x86_64 #1 SMP Tue Dec 13 16:21:34 EST 2011 x86_64 x86_64 x86_64 GNU/Linux [root@minigiant5 ~]# dmsetup version Library version: 1.02.66-RHEL6 (2011-10-12) Driver version: 4.22.6 2 x 1 TB SATA drives with software RAID 1. scsi-SATA_SAMSUNG_HD103SJS246J9KB510183 scsi-SATA_SAMSUNG_HD103SJS246J9KB510187 System Information Manufacturer: Dell Inc. Product Name: OptiPlex 990 RAM: 4GB CPU: Quad Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz with HT How reproducible: Always Steps to Reproduce: mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1 mkfs.ext4 /dev/md0 mount /dev/md0 /storage/ -o noatime iozone -R -l 5 -u 5 -r 4k -s 5000m -F /storage/f1 /storage/f2 /storage/f3 /storage/f4 /storage/f5 Actual results: Process takes much longer to complete due to EXT4 FS lockups Expected results: The partition along with the process should not lock due to FS issues Additional info: It takes a to run while but eventually locks up the partition.