Description of problem:
I/O operations become 20-100x slower when a large ext4 filesystem is more than 97% full. dmesg shows call traces involving ext4 calls:

Call Trace:
 [<f906850d>] ext4_mark_iloc_dirty+0x42d/0x4e1 [ext4]
 [<f9068539>] ext4_mark_iloc_dirty+0x459/0x4e1 [ext4]
 [<f9068d1f>] ext4_mark_inode_dirty+0x167/0x19b [ext4]
 [<f900aa8b>] start_this_handle+0x224/0x33b [jbd2]
 [<c043731b>] autoremove_wake_function+0x0/0x2d
 [<f900ac2f>] jbd2_journal_start+0x8d/0xbc [jbd2]
 [<f906dcfd>] ext4_da_write_begin+0x18e/0x28b [ext4]
 [<c045952b>] generic_file_buffered_write+0x101/0x58b
 [<f900a4b3>] jbd2_journal_stop+0x177/0x181 [jbd2]
 [<c0459e5b>] __generic_file_aio_write_nolock+0x4a6/0x52a
 [<c04c8bf3>] avc_has_perm_noaudit+0x5e/0x336
 [<c040597a>] common_interrupt+0x1a/0x20
 [<c04c9927>] avc_has_perm+0x3c/0x46
 [<c0459f38>] generic_file_aio_write+0x59/0xac
 [<f9065d4c>] ext4_file_write+0xf3/0x1ef [ext4]
 [<c047628a>] do_sync_write+0xb6/0xf1
 [<c043731b>] autoremove_wake_function+0x0/0x2d
 [<c04761d4>] do_sync_write+0x0/0xf1
 [<c0476b13>] vfs_write+0xa1/0x143
 [<c047713d>] sys_write+0x3c/0x63
 [<c0404f4b>] syscall_call+0x7/0xb

Version-Release number of selected component (if applicable):
e4fsprogs-1.41.12-2.el5
kernel-PAE-2.6.18-274.12.1.el5

How reproducible:
I was able to reproduce the scenario on two different servers with two different underlying physical storage setups:
* md3000i with a single virtual drive and multipathd
* Dell H700 with 6x2TB drives and RAID0 software RAID

Steps to Reproduce:
1. I created an 11TB partition and formatted it as an ext4 filesystem with default parameters: 'mkfs.ext4 -L /opt /dev/md0'
2. I ran a benchmark that copied 6 million small (100KB) files and 6 million 100MB files from a local drive to the storage, using 6 cp processes in parallel to simulate our workload.

Actual results:
File writes became 20-100x slower once the ext4 partition was 97% full. Load average (LA) rose to 800 and the server crashed.

Expected results:
No slowdown.

Additional info:
[root@localhost ~]# dumpe4fs /dev/md0
dumpe4fs 1.41.12 (17-May-2010)
Filesystem volume name:   test
Last mounted on:          /opt/opt
Filesystem UUID:          6c9b3fe9-9198-440d-976c-cc269228590a
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              724697088
Block count:              2898784512
Reserved block count:     144939225
Free blocks:              2853256642
Free inodes:              724697077
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      332
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Tue Dec 27 15:38:47 2011
Last mount time:          Tue Dec 27 15:56:34 2011
Last write time:          Tue Dec 27 15:56:34 2011
Mount count:              1
Maximum mount count:      23
Last checked:             Tue Dec 27 15:38:47 2011
Check interval:           15552000 (6 months)
Next check after:         Sun Jun 24 16:38:47 2012
Lifetime writes:          173 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      fe943c16-2765-42cf-a23c-99cd30c65772
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke
Journal size:             128M
Journal length:           32768
Journal sequence:         0x000012b5
Journal start:            27176
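For anyone comparing the reported 97% fill level against the superblock numbers, the used and reserved fractions can be derived from the "Block count", "Free blocks" and "Reserved block count" fields in the dumpe4fs output above. The following is only a minimal sketch: the function names (parse_fields, used_percent, reserved_percent) and the SAMPLE constant are hypothetical helpers introduced for illustration, and the sample text is limited to four fields copied from the dump.

```python
# Sketch: derive fill level from "Key: value" superblock lines.
# All names here are illustrative, not part of e4fsprogs.

SAMPLE = """\
Block count: 2898784512
Reserved block count: 144939225
Free blocks: 2853256642
Block size: 4096
"""


def parse_fields(text):
    """Split each 'Key: value' line on the first colon into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def used_percent(fields):
    """Percentage of blocks in use (total minus free, over total)."""
    total = int(fields["Block count"])
    free = int(fields["Free blocks"])
    return 100.0 * (total - free) / total


def reserved_percent(fields):
    """Percentage of blocks reserved for the superuser."""
    total = int(fields["Block count"])
    reserved = int(fields["Reserved block count"])
    return 100.0 * reserved / total


if __name__ == "__main__":
    f = parse_fields(SAMPLE)
    print("used: %.2f%%" % used_percent(f))
    print("reserved: %.2f%%" % reserved_percent(f))
```

With the values from the dump this gives roughly 1.6% of blocks in use and about 5% reserved, which suggests the dump was captured while the filesystem was still nearly empty rather than at the 97% point where the slowdown was observed.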
Please open this through Red Hat support so we can have them help gather information. If you don't have a Red Hat support contract, we should take this discussion to the upstream lists. Thanks!
We don't have a Red Hat support contract. Can you please specify which upstream list I should forward this issue to? Thank you, Regards, Andrey
linux-ext4.org please. Although ideally it should be tested on an upstream kernel prior to reporting there. Thanks, -Eric
Closing since this was moved to the upstream lists.