Description of problem:
Regular disk hangs and server-side hangs under heavy random I/O on small files.

Version-Release number of selected component (if applicable):
kernel 2.6.18-238.9.1.el5 (release 5.6), ext4 filesystem, inode size 512

dumpe2fs 1.41.12 (17-May-2010)
Filesystem volume name:   /data1
Last mounted on:          /data1/data1
Filesystem UUID:          6e4af245-fea6-459d-86fe-a33a49bce8c5
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              610304000
Block count:              2441215990
Reserved block count:     122060799
Free blocks:              2364722641
Free inodes:              610303989
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      441
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   1024
Flex block group size:    16
Filesystem created:       Wed Aug 3 12:48:19 2011
Last mount time:          Wed Aug 3 13:37:22 2011
Last write time:          Wed Aug 3 13:37:22 2011
Mount count:              1
Maximum mount count:      20
Last checked:             Wed Aug 3 12:48:19 2011
Check interval:           15552000 (6 months)
Next check after:         Mon Jan 30 11:48:19 2012
Lifetime writes:          291 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               512
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      c6cb58d1-fab3-4750-b73e-64a6a2fd8dc2
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke
Journal size:             128M
Journal length:           32768
Journal sequence:         0x0004f8f2
Journal start:            31790

sysctl.conf :-
vm.swappiness = 0
vm.vfs_cache_pressure = 100000
vm.dirty_background_ratio = 1
vm.dirty_ratio = 5

mount options :-
/dev/sdb1 /data1 ext4 rw,noatime,nodiratime,barrier=1,data=writeback 0 0

Hardware :-
Dell PERC H700

How reproducible:
Easily, every time, under high load for 5-6 hours.

Steps to Reproduce:
Run a homegrown application that drives the disks hard enough for the kernel to start logging 'hung_task_timeouts' with a backtrace.

Actual results:
Application hung after 6 hours; the job made no further progress.

Expected results:
Application runs to completion on NetApp backend device.

Additional info:
Logs for 'dmidecode', 'dmesg-backtrace', 'lscpi-info', 'slabinfo', etc. are available at http://shell.gluster.com/~harsha/redhat-bugzilla/

Fixed it by tuning kernel parameters.

# latency
echo "deadline" > /sys/block/sdb/queue/scheduler
echo "deadline" > /sys/block/sdc/queue/scheduler

# 2x queue_depth - 128
echo "256" > /sys/block/sdb/queue/nr_requests
echo "256" > /sys/block/sdc/queue/nr_requests

# 64k stripe size
echo "16" > /proc/sys/vm/page-cluster

# Saturate internal RAID cache
blockdev --setra 4096 /dev/sdb
blockdev --setra 4096 /dev/sdc

# Virtual Memory optimizations
sysctl vm.swappiness=0
sysctl vm.vfs_cache_pressure=100000
sysctl vm.dirty_ratio=5
sysctl vm.dirty_background_ratio=1
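
For anyone reusing these values, the units behind two of the settings are easy to misread, so here is a rough sketch of the arithmetic. It relies on two documented behaviours: `blockdev --setra` counts 512-byte sectors, and `vm.page-cluster` is a log2 value (N means 2^N pages per swap read). The helper names `ra_bytes` and `page_cluster_bytes` are made up for illustration, and the 4096-byte page size is an assumption (typical for x86_64), not something stated in this report.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only; they change no settings.

# Readahead passed to `blockdev --setra N` is counted in 512-byte sectors.
ra_bytes() {
    echo $(( $1 * 512 ))
}

# vm.page-cluster is logarithmic: a value of N means 2^N pages per swap read.
# Second argument is the page size in bytes; 4096 is assumed if omitted.
page_cluster_bytes() {
    pages=$(( 1 << $1 ))
    echo $(( pages * ${2:-4096} ))
}

ra_bytes 4096            # prints 2097152   (2 MiB of readahead per device)
page_cluster_bytes 16    # prints 268435456 (2^16 pages of 4 KiB)
page_cluster_bytes 4     # prints 65536     (16 pages of 4 KiB = 64 KiB)
```

Under that reading, the "64k" in the comment above matches 16 pages of 4 KiB, which would be a page-cluster value of 4 rather than 16. This is only an observation about the units; the values listed above are the ones that were applied when the hang stopped.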