Bug 770767 - ext4: file writes become 20-100x slower when partition is more than 97% full
Summary: ext4: file writes become 20-100x slower when partition is more than 97% full
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.7
Hardware: i686
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-12-29 02:37 UTC by Andrey
Modified: 2013-02-15 13:16 UTC
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-15 13:16:07 UTC
Target Upstream Version:


Attachments: none

Description Andrey 2011-12-29 02:37:38 UTC
Description of problem:
I/O operations become 20-100x slower when a large ext4 filesystem is more than 97% full. The dmesg output contains call traces involving ext4 calls:
Call Trace:
 [<f906850d>] ext4_mark_iloc_dirty+0x42d/0x4e1 [ext4]
 [<f9068539>] ext4_mark_iloc_dirty+0x459/0x4e1 [ext4]
 [<f9068d1f>] ext4_mark_inode_dirty+0x167/0x19b [ext4]
 [<f900aa8b>] start_this_handle+0x224/0x33b [jbd2]
 [<c043731b>] autoremove_wake_function+0x0/0x2d
 [<f900ac2f>] jbd2_journal_start+0x8d/0xbc [jbd2]
 [<f906dcfd>] ext4_da_write_begin+0x18e/0x28b [ext4]
 [<c045952b>] generic_file_buffered_write+0x101/0x58b
 [<f900a4b3>] jbd2_journal_stop+0x177/0x181 [jbd2]
 [<c0459e5b>] __generic_file_aio_write_nolock+0x4a6/0x52a
 [<c04c8bf3>] avc_has_perm_noaudit+0x5e/0x336
 [<c040597a>] common_interrupt+0x1a/0x20
 [<c04c9927>] avc_has_perm+0x3c/0x46
 [<c0459f38>] generic_file_aio_write+0x59/0xac
 [<f9065d4c>] ext4_file_write+0xf3/0x1ef [ext4]
 [<c047628a>] do_sync_write+0xb6/0xf1
 [<c043731b>] autoremove_wake_function+0x0/0x2d
 [<c04761d4>] do_sync_write+0x0/0xf1
 [<c0476b13>] vfs_write+0xa1/0x143
 [<c047713d>] sys_write+0x3c/0x63
 [<c0404f4b>] syscall_call+0x7/0xb


Version-Release number of selected component (if applicable):
e4fsprogs-1.41.12-2.el5
kernel-PAE-2.6.18-274.12.1.el5

How reproducible:
I was able to reproduce the scenario on two different servers with two different underlying physical storage setups:
* md3000i with a single virtual drive and multipathd
* Dell H700 with 6x2TB drives and software RAID 0

Steps to Reproduce:
1. I created an 11TB partition, formatted as an ext4 filesystem with default parameters: 'mkfs.ext4 -L /opt /dev/md0'
2. Ran a benchmark that copied 6 million small (100KB) files and 6 million 100MB files from a local drive to the storage, using 6 cp processes in parallel to simulate our workload; a minimal sketch follows.
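
A minimal sketch of that benchmark, assuming illustrative paths and scaled-down counts (the original script was not attached, so /src, the file names, and the loop bounds below are hypothetical):

# Hypothetical reproduction sketch; /src and the loop bounds are
# illustrative, scaled down from the reported 6 million files of each size.
mkfs.ext4 -L /opt /dev/md0
mount /dev/md0 /opt/opt

# Build a local source tree of sample files at the two reported sizes.
mkdir -p /src/small /src/large
for i in $(seq 1 1000); do
    dd if=/dev/zero of=/src/small/f$i bs=100K count=1 2>/dev/null
    dd if=/dev/zero of=/src/large/f$i bs=1M count=100 2>/dev/null
done

# 6 parallel cp processes, matching the reported workload.
for n in $(seq 1 6); do
    cp -r /src "/opt/opt/copy$n" &
done
wait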

  
Actual results:
File writes become 20-100x slower once the ext4 partition is 97% full.
The load average climbs as high as 800 and the server crashes.
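
One way to watch the slowdown develop while the benchmark runs (these monitoring commands are my suggestion, not from the original report):

# Poll fill level, load average, and writeback backlog once a minute.
while sleep 60; do
    df -h /opt/opt                           # watch usage cross ~97%
    uptime                                   # load average; reported to reach ~800
    grep -E 'Dirty|Writeback' /proc/meminfo  # pending dirty/writeback pages
done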

Expected results:
No significant slowdown as the filesystem fills up.

Additional info:


[root@localhost ~]# dumpe4fs /dev/md0 
dumpe4fs 1.41.12 (17-May-2010)
Filesystem volume name:   test
Last mounted on:          /opt/opt
Filesystem UUID:          6c9b3fe9-9198-440d-976c-cc269228590a
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              724697088
Block count:              2898784512
Reserved block count:     144939225
Free blocks:              2853256642
Free inodes:              724697077
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      332
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Tue Dec 27 15:38:47 2011
Last mount time:          Tue Dec 27 15:56:34 2011
Last write time:          Tue Dec 27 15:56:34 2011
Mount count:              1
Maximum mount count:      23
Last checked:             Tue Dec 27 15:38:47 2011
Check interval:           15552000 (6 months)
Next check after:         Sun Jun 24 16:38:47 2012
Lifetime writes:          173 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      fe943c16-2765-42cf-a23c-99cd30c65772
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke
Journal size:             128M
Journal length:           32768
Journal sequence:         0x000012b5
Journal start:            27176
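
Back-of-the-envelope arithmetic from the dump above (my own sanity check using bc, not part of the report). At 97% full, the remaining ~330 GiB of free space is scattered across tens of thousands of block groups, which is one plausible reason each write becomes much more expensive:

echo '2898784512 * 4096' | bc                # 11873421361152 bytes, ~11.9 TB, matching the "11TB partition"
echo 'scale=3; 144939225 / 2898784512' | bc  # .050 -> the default 5% root reserve
echo '2898784512 * 3 / 100' | bc             # ~87M blocks (~330 GiB) still free at 97% full
echo '2898784512 / 32768' | bc               # ~88463 block groups for the allocator to search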

Comment 1 Ric Wheeler 2012-01-03 15:57:48 UTC
Please open this through Red Hat support so we can have them help grab information. If you don't have a Red Hat support contract, we should take this discussion out to the upstream lists.

Thanks!

Comment 2 Andrey 2012-01-03 17:13:03 UTC
We don't have a Red Hat support contract. Could you please specify which upstream list I should forward this issue to?

Thank you, 
Regards, Andrey

Comment 3 Eric Sandeen 2012-01-03 18:27:07 UTC
linux-ext4@vger.kernel.org, please.  Although ideally it should be tested on an upstream kernel prior to reporting there.

Thanks,
-Eric

Comment 4 Jes Sorensen 2013-02-15 13:16:07 UTC
Closing, since this was moved to the upstream lists.

