Created attachment 322238 [details]
full crash log

Description of problem:
After a few hours of running, I see this crash in the log:

Nov 2 22:20:53 taz kernel: __jbd2_log_wait_for_space: no transactions
Nov 2 22:20:53 taz kernel: Aborting journal on device dm-0:8.
Nov 2 22:20:57 taz kernel: ext4_abort called.
Nov 2 22:20:57 taz kernel: EXT4-fs error (device dm-0): ext4_journal_start_sb: Detected aborted journal
Nov 2 22:20:57 taz kernel: Remounting filesystem read-only
Nov 2 22:20:57 taz kernel: ext4_da_writepages: jbd2_start: 1024 pages, ino 7113326; err -30
Nov 2 22:20:57 taz kernel: Pid: 224, comm: pdflush Not tainted 2.6.27.4-19.fc9.x86_64 #1
Nov 2 22:20:57 taz kernel:
Nov 2 22:20:57 taz kernel: Call Trace:
Nov 2 22:20:57 taz kernel: [<ffffffffa0041bae>] ext4_da_writepages+0x189/0x322 [ext4]
Nov 2 22:20:57 taz kernel: [<ffffffff8114b6b1>] ? __next_cpu+0x19/0x26
Nov 2 22:20:57 taz kernel: [<ffffffffa0042d26>] ? ext4_da_get_block_write+0x0/0x11c [ext4]
Nov 2 22:20:57 taz kernel: [<ffffffff81094355>] do_writepages+0x28/0x38
Nov 2 22:20:57 taz kernel: [<ffffffff810dbc9c>] __writeback_single_inode+0x185/0x2f9
Nov 2 22:20:57 taz kernel: [<ffffffff810334b1>] ? __dequeue_entity+0x61/0x6a
Nov 2 22:20:57 taz kernel: [<ffffffff810dc1f5>] generic_sync_sb_inodes+0x229/0x309
Nov 2 22:20:57 taz kernel: [<ffffffff810dc55e>] writeback_inodes+0xa4/0xfd
Nov 2 22:20:57 taz kernel: [<ffffffff810944ab>] wb_kupdate+0xa3/0x119
Nov 2 22:20:57 taz kernel: [<ffffffff81094ebf>] pdflush+0x16e/0x231
Nov 2 22:20:57 taz kernel: [<ffffffff81094408>] ? wb_kupdate+0x0/0x119
Nov 2 22:20:57 taz kernel: [<ffffffff81094d51>] ? pdflush+0x0/0x231
Nov 2 22:20:57 taz kernel: [<ffffffff81094d51>] ? pdflush+0x0/0x231
Nov 2 22:20:57 taz kernel: [<ffffffff8105338b>] kthread+0x49/0x76
Nov 2 22:20:57 taz kernel: [<ffffffff810116e9>] child_rip+0xa/0x11
Nov 2 22:20:57 taz kernel: [<ffffffff81010a07>] ? restore_args+0x0/0x30
Nov 2 22:20:57 taz kernel: [<ffffffff81053342>] ? kthread+0x0/0x76
Nov 2 22:20:57 taz kernel: [<ffffffff810116df>] ? child_rip+0x0/0x11
Nov 2 22:20:57 taz kernel:
Nov 2 22:21:27 taz kernel: ext4_da_writepages: jbd2_start: 1024 pages, ino 7113325; err -30
Nov 2 22:21:27 taz kernel: Pid: 224, comm: pdflush Not tainted 2.6.27.4-19.fc9.x86_64 #1
Nov 2 22:21:27 taz kernel:
Nov 2 22:21:27 taz kernel: Call Trace:
Nov 2 22:21:27 taz kernel: [<ffffffffa0041bae>] ext4_da_writepages+0x189/0x322 [ext4]
Nov 2 22:21:27 taz kernel: [<ffffffff8114b6b1>] ? __next_cpu+0x19/0x26
Nov 2 22:21:27 taz kernel: [<ffffffffa0042d26>] ? ext4_da_get_block_write+0x0/0x11c [ext4]
Nov 2 22:21:27 taz kernel: [<ffffffff81094355>] do_writepages+0x28/0x38
Nov 2 22:21:27 taz kernel: [<ffffffff810dbc9c>] __writeback_single_inode+0x185/0x2f9
Nov 2 22:21:27 taz kernel: [<ffffffff810334b1>] ? __dequeue_entity+0x61/0x6a
Nov 2 22:21:27 taz kernel: [<ffffffff810dc1f5>] generic_sync_sb_inodes+0x229/0x309
Nov 2 22:21:27 taz kernel: [<ffffffff810dc55e>] writeback_inodes+0xa4/0xfd
Nov 2 22:21:27 taz kernel: [<ffffffff810944ab>] wb_kupdate+0xa3/0x119
Nov 2 22:21:27 taz kernel: [<ffffffff81094ebf>] pdflush+0x16e/0x231
Nov 2 22:21:27 taz kernel: [<ffffffff81094408>] ? wb_kupdate+0x0/0x119
Nov 2 22:21:27 taz kernel: [<ffffffff81094d51>] ? pdflush+0x0/0x231
Nov 2 22:21:27 taz kernel: [<ffffffff81094d51>] ? pdflush+0x0/0x231
Nov 2 22:21:27 taz kernel: [<ffffffff8105338b>] kthread+0x49/0x76
Nov 2 22:21:27 taz kernel: [<ffffffff810116e9>] child_rip+0xa/0x11
Nov 2 22:21:27 taz kernel: [<ffffffff81010a07>] ? restore_args+0x0/0x30
Nov 2 22:21:27 taz kernel: [<ffffffff81053342>] ? kthread+0x0/0x76
Nov 2 22:21:27 taz kernel: [<ffffffff810116df>] ? child_rip+0x0/0x11

This repeats over and over until I reboot the computer and run e2fsck from single-user mode.

Version-Release number of selected component (if applicable):
F9, up to date
kernel-2.6.27.4-19.fc9.x86_64

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
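For readers unfamiliar with the "err -30" in the ext4_da_writepages messages above: the kernel reports errors as negative errno values, and on Linux errno 30 is EROFS (read-only file system), which matches the "Remounting filesystem read-only" line. The sketch below is only an illustration of how to decode such a line; the function name and the regular expression are made up for this example, and the errno numbering is assumed to be that of Linux.

```python
import errno
import os
import re

# Pattern for the ext4_da_writepages message quoted in the report.
# (Illustrative only; written against the two messages shown above.)
MSG_RE = re.compile(
    r"ext4_da_writepages: jbd2_start: (\d+) pages, ino (\d+); err (-?\d+)"
)


def decode_writepages_error(line):
    """Return (pages, inode, errno_name, description) for one log line, or None."""
    match = MSG_RE.search(line)
    if match is None:
        return None
    pages, inode, err = (int(group) for group in match.groups())
    code = -err  # the kernel logs errors as negative errno values
    return pages, inode, errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


line = ("Nov 2 22:20:57 taz kernel: ext4_da_writepages: jbd2_start: "
        "1024 pages, ino 7113326; err -30")
print(decode_writepages_error(line))
```

Once the journal has been aborted and the filesystem remounted read-only, every later writeback attempt fails the same way, which is consistent with the trace repeating for different inodes.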
What does e2fsck find, if anything? I wouldn't expect this to be a disk corruption issue, but if e2fsck found problems please attach that info. Thanks, -Eric
Also, can you let me know what the filesystem geometry is (dumpe2fs -h), as well as which mount options you're using? Thanks, -Eric
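One way to pull the mount options for a single mount point out of /proc/mounts is sketched below. It assumes the usual whitespace-separated /proc/mounts layout (device, mount point, filesystem type, options, two numeric fields); the function name is hypothetical, and the sample entries are copied from the /proc/mounts listing posted elsewhere in this bug.

```python
def mount_entries(mounts_text, mount_point):
    """Return (device, fstype, options) for every entry mounted at mount_point."""
    entries = []
    for raw in mounts_text.splitlines():
        fields = raw.split()
        # Skip blank or malformed lines and entries for other mount points.
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        entries.append((fields[0], fields[2], fields[3].split(",")))
    return entries


# Sample entries taken from this report; on a live system the text could
# instead be read with open("/proc/mounts").read().
sample = """\
/dev/mapper/taz-home /home ext4dev rw,relatime,barrier=1,data=ordered 0 0
/dev/sda1 /boot ext3 rw,relatime,errors=continue,user_xattr,acl,data=ordered 0 0
"""

for device, fstype, options in mount_entries(sample, "/home"):
    print(device, fstype, options)
```

A list is returned rather than a single entry because one mount point can appear more than once, as "/" does in the listing in this report (rootfs and /dev/root).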
Looks like this bug is also being tracked at http://bugzilla.kernel.org/show_bug.cgi?id=11937 and there is a proposed fix. I'm just waiting for feedback on the patch.
Re: comment #1, #2

No, e2fsck does not find anything.

[mihai@taz ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/taz-root   28G  4.5G   22G  17% /
/dev/mapper/taz-home  109G   95G  9.2G  92% /home
/dev/sda1              99M   27M   68M  29% /boot
tmpfs                 994M   48K  994M   1% /dev/shm

[root@taz ~]# dumpe2fs -h /dev/mapper/taz-home
dumpe2fs 1.41.3 (12-Oct-2008)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          d06f7797-270a-45c7-8f5b-7c85f7db0698
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent sparse_super large_file
Filesystem flags:         signed_directory_hash test_filesystem
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              29491200
Block count:              29483008
Reserved block count:     1474150
Free blocks:              3656225
Free inodes:              29241170
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         32768
Inode blocks per group:   1024
Filesystem created:       Mon Nov 28 23:32:34 2005
Last mount time:          Mon Nov 3 15:29:34 2008
Last write time:          Mon Nov 3 15:29:34 2008
Mount count:              1
Maximum mount count:      -1
Last checked:             Mon Nov 3 15:27:31 2008
Check interval:           0 (<none>)
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
Default directory hash:   tea
Directory Hash Seed:      2a02e5ed-d63c-4f57-b50e-5a135bdf95dd
Journal backup:           inode blocks
Journal size:             32M

[root@taz ~]# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext4dev rw,relatime,barrier=1,data=ordered 0 0
/dev /dev tmpfs rw,relatime,mode=755 0 0
/proc /proc proc rw,relatime 0 0
/sys /sys sysfs rw,relatime 0 0
none /selinux selinuxfs rw,relatime 0 0
/proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620 0 0
/dev/mapper/taz-home /home ext4dev rw,relatime,barrier=1,data=ordered 0 0
/dev/sda1 /boot ext3 rw,relatime,errors=continue,user_xattr,acl,data=ordered 0 0
tmpfs /dev/shm tmpfs rw,relatime 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
/proc /var/named/chroot/proc proc rw,relatime 0 0
gvfs-fuse-daemon /home/mihai/.gvfs fuse.gvfs-fuse-daemon rw,nosuid,nodev,relatime,user_id=500,group_id=500 0 0
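To make the geometry above easier to read at a glance, here is a small sketch that parses "Key: value" lines from dumpe2fs -h output and derives the total size and the reserved and free percentages. The parser is a simplification written for this example (it just splits each line at the first colon), the function name is hypothetical, and the sample holds only a few of the fields reported above.

```python
def parse_dumpe2fs_header(text):
    """Parse 'Key: value' lines from `dumpe2fs -h` output into a dict."""
    fields = {}
    for raw in text.splitlines():
        # Split at the first colon only, so values such as timestamps survive.
        key, sep, value = raw.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# A subset of the header fields reported for /dev/mapper/taz-home.
sample = """\
Block count:              29483008
Reserved block count:     1474150
Free blocks:              3656225
Block size:               4096
Journal size:             32M
"""

info = parse_dumpe2fs_header(sample)
block_count = int(info["Block count"])
block_size = int(info["Block size"])

total_gib = block_count * block_size / 2**30
reserved_pct = 100 * int(info["Reserved block count"]) / block_count
free_pct = 100 * int(info["Free blocks"]) / block_count

print(f"{total_gib:.2f} GiB total, {reserved_pct:.1f}% reserved, {free_pct:.1f}% free")
```

The total computed this way is the raw block device capacity seen by the filesystem, so it is somewhat larger than the size df reports, which excludes filesystem overhead.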
In reply to comment #3: Do you need my feedback, or are you referring to feedback from bugzilla.kernel.org? Also, the patch to fix my problem seems to be against kernel 2.6.28-rc2, isn't it?
In reply to comment #5: yes, that patch is against 2.6.28-rc2, and against the ext3 filesystem. There was a similar patch that caused the same regression for ext3, and which is in the ext4 tree, and a similar patch which is in the ext4 patch queue. I'll attach it here for your convenience, but Eric knows where to find it. :-)
Created attachment 322354 [details] Proposed patch for the problem reported here.
(In reply to comment #5)
> In reply to comment #3:
>
> Do you need my feedback or are you referring to feedback from
> bugzilla.kernel.org?
> Also that patch to fix my problem seems to be against kernel 2.6.28-rc2, isn't?

2.6.27.4 fc9 kernels include the ext4 updates from 2.6.28-rc2
OK, I am now running the test kernel 2.6.27.4-26.mh.bz469582.fc9.x86_64, that is, kernel 2.6.27.4-26 plus the patch from comment #7. I'll report back with the results from this test kernel.
1. After running the test kernel 2.6.27.4-26.mh.bz469582.fc9.x86_64 for 24 hours, I no longer see the crash from comment #1.
2. After that testing period I also increased the journal size from 32M to 128M for the filesystem /dev/taz/home, following the advice of Mr. Theodore Tso in http://lkml.org/lkml/2008/11/1/61, and I am running the same test kernel to see whether I can get more performance from the filesystem.
Fix from upstream went in 2.6.27.5-30
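A rough way to check whether a given Fedora 9 kernel build is at or past the 2.6.27.5-30 build mentioned above is sketched here. This is a simplified numeric comparison of version and release, not RPM's actual version-comparison algorithm, and the function names are made up for this example. It also cannot recognise one-off test builds such as 2.6.27.4-26.mh.bz469582.fc9, which carry the patch despite the lower version number.

```python
import re

# First Fedora 9 build said above to contain the upstream fix: 2.6.27.5-30.
FIXED_IN = ((2, 6, 27, 5), 30)


def parse_kernel(nvr):
    """Split e.g. '2.6.27.5-32.fc9' into ((2, 6, 27, 5), 32)."""
    match = re.match(r"(?:kernel-)?(\d+(?:\.\d+)*)-(\d+)", nvr)
    if match is None:
        raise ValueError(f"unrecognized kernel version: {nvr!r}")
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, int(match.group(2))


def has_upstream_fix(nvr):
    """True if the build is numerically at or after FIXED_IN."""
    return parse_kernel(nvr) >= FIXED_IN


for build in ("2.6.27.4-19.fc9.x86_64", "kernel-2.6.27.5-32.fc9"):
    print(build, has_upstream_fix(build))
```

On a running system the string to test could be taken from platform.release() or uname -r.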
kernel-2.6.27.5-32.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.5-32.fc9
kernel-2.6.27.5-32.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update kernel'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-9583
kernel-2.6.27.5-37.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.5-37.fc9
kernel-2.6.27.5-41.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/kernel-2.6.27.5-41.fc9
kernel-2.6.27.5-41.fc9 has been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report.