Bug 469582 - Kernel 2.6.27.4-19.fc9.x86_64 crash ext4 filesystem
Status: CLOSED NEXTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 9
Hardware: All Linux
Priority: medium
Severity: high
Assigned To: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-11-02 16:40 EST by Mihai Harpau
Modified: 2008-11-19 09:54 EST
Fixed In Version: kernel-2.6.27.5-37.fc9
Doc Type: Bug Fix
Last Closed: 2008-11-14 11:57:49 EST
Attachments
full crash log (707.92 KB, text/plain)
2008-11-02 16:40 EST, Mihai Harpau
Proposed patch for the problem reported here. (2.62 KB, patch)
2008-11-03 14:08 EST, Theodore Tso

Description Mihai Harpau 2008-11-02 16:40:24 EST
Created attachment 322238
full crash log

Description of problem:

After a few hours of uptime, I see this crash in the log:
Nov  2 22:20:53 taz kernel: __jbd2_log_wait_for_space: no transactions
Nov  2 22:20:53 taz kernel: Aborting journal on device dm-0:8.
Nov  2 22:20:57 taz kernel: ext4_abort called.
Nov  2 22:20:57 taz kernel: EXT4-fs error (device dm-0): ext4_journal_start_sb: Detected aborted journal
Nov  2 22:20:57 taz kernel: Remounting filesystem read-only
Nov  2 22:20:57 taz kernel: ext4_da_writepages: jbd2_start: 1024 pages, ino 7113326; err -30
Nov  2 22:20:57 taz kernel: Pid: 224, comm: pdflush Not tainted 2.6.27.4-19.fc9.x86_64 #1
Nov  2 22:20:57 taz kernel:
Nov  2 22:20:57 taz kernel: Call Trace:
Nov  2 22:20:57 taz kernel: [<ffffffffa0041bae>] ext4_da_writepages+0x189/0x322 [ext4]
Nov  2 22:20:57 taz kernel: [<ffffffff8114b6b1>] ? __next_cpu+0x19/0x26
Nov  2 22:20:57 taz kernel: [<ffffffffa0042d26>] ? ext4_da_get_block_write+0x0/0x11c [ext4]
Nov  2 22:20:57 taz kernel: [<ffffffff81094355>] do_writepages+0x28/0x38
Nov  2 22:20:57 taz kernel: [<ffffffff810dbc9c>] __writeback_single_inode+0x185/0x2f9
Nov  2 22:20:57 taz kernel: [<ffffffff810334b1>] ? __dequeue_entity+0x61/0x6a
Nov  2 22:20:57 taz kernel: [<ffffffff810dc1f5>] generic_sync_sb_inodes+0x229/0x309
Nov  2 22:20:57 taz kernel: [<ffffffff810dc55e>] writeback_inodes+0xa4/0xfd
Nov  2 22:20:57 taz kernel: [<ffffffff810944ab>] wb_kupdate+0xa3/0x119
Nov  2 22:20:57 taz kernel: [<ffffffff81094ebf>] pdflush+0x16e/0x231
Nov  2 22:20:57 taz kernel: [<ffffffff81094408>] ? wb_kupdate+0x0/0x119
Nov  2 22:20:57 taz kernel: [<ffffffff81094d51>] ? pdflush+0x0/0x231
Nov  2 22:20:57 taz kernel: [<ffffffff81094d51>] ? pdflush+0x0/0x231
Nov  2 22:20:57 taz kernel: [<ffffffff8105338b>] kthread+0x49/0x76
Nov  2 22:20:57 taz kernel: [<ffffffff810116e9>] child_rip+0xa/0x11
Nov  2 22:20:57 taz kernel: [<ffffffff81010a07>] ? restore_args+0x0/0x30
Nov  2 22:20:57 taz kernel: [<ffffffff81053342>] ? kthread+0x0/0x76
Nov  2 22:20:57 taz kernel: [<ffffffff810116df>] ? child_rip+0x0/0x11
Nov  2 22:20:57 taz kernel:
Nov  2 22:21:27 taz kernel: ext4_da_writepages: jbd2_start: 1024 pages, ino 7113325; err -30
Nov  2 22:21:27 taz kernel: Pid: 224, comm: pdflush Not tainted 2.6.27.4-19.fc9.x86_64 #1
Nov  2 22:21:27 taz kernel:
Nov  2 22:21:27 taz kernel: Call Trace:
Nov  2 22:21:27 taz kernel: [<ffffffffa0041bae>] ext4_da_writepages+0x189/0x322 [ext4]
Nov  2 22:21:27 taz kernel: [<ffffffff8114b6b1>] ? __next_cpu+0x19/0x26
Nov  2 22:21:27 taz kernel: [<ffffffffa0042d26>] ? ext4_da_get_block_write+0x0/0x11c [ext4]
Nov  2 22:21:27 taz kernel: [<ffffffff81094355>] do_writepages+0x28/0x38
Nov  2 22:21:27 taz kernel: [<ffffffff810dbc9c>] __writeback_single_inode+0x185/0x2f9
Nov  2 22:21:27 taz kernel: [<ffffffff810334b1>] ? __dequeue_entity+0x61/0x6a
Nov  2 22:21:27 taz kernel: [<ffffffff810dc1f5>] generic_sync_sb_inodes+0x229/0x309
Nov  2 22:21:27 taz kernel: [<ffffffff810dc55e>] writeback_inodes+0xa4/0xfd
Nov  2 22:21:27 taz kernel: [<ffffffff810944ab>] wb_kupdate+0xa3/0x119
Nov  2 22:21:27 taz kernel: [<ffffffff81094ebf>] pdflush+0x16e/0x231
Nov  2 22:21:27 taz kernel: [<ffffffff81094408>] ? wb_kupdate+0x0/0x119
Nov  2 22:21:27 taz kernel: [<ffffffff81094d51>] ? pdflush+0x0/0x231
Nov  2 22:21:27 taz kernel: [<ffffffff81094d51>] ? pdflush+0x0/0x231
Nov  2 22:21:27 taz kernel: [<ffffffff8105338b>] kthread+0x49/0x76
Nov  2 22:21:27 taz kernel: [<ffffffff810116e9>] child_rip+0xa/0x11
Nov  2 22:21:27 taz kernel: [<ffffffff81010a07>] ? restore_args+0x0/0x30
Nov  2 22:21:27 taz kernel: [<ffffffff81053342>] ? kthread+0x0/0x76
Nov  2 22:21:27 taz kernel: [<ffffffff810116df>] ? child_rip+0x0/0x11

This repeats over and over until I reboot the computer and run e2fsck from single-user runlevel.
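
For readers triaging similar logs: the "err -30" in the ext4_da_writepages lines is a negated errno value, and 30 is EROFS (read-only filesystem) on Linux, which matches the "Remounting filesystem read-only" message just before it. The following is only a minimal sketch of pulling those failures out of a syslog excerpt; the function name parse_failures and the regular expression are illustrative assumptions, not part of any kernel or Fedora tooling.

```python
import errno
import re

# Pattern for the writeback failure lines quoted in the log above.
# (Illustrative only; written against the exact wording in this report.)
FAILURE_RE = re.compile(
    r"ext4_da_writepages: jbd2_start: (?P<pages>\d+) pages, "
    r"ino (?P<ino>\d+); err (?P<err>-?\d+)"
)


def parse_failures(lines):
    """Return (inode, pages, errno_name) for each ext4_da_writepages failure."""
    results = []
    for line in lines:
        match = FAILURE_RE.search(line)
        if match is None:
            continue
        code = abs(int(match.group("err")))  # the kernel logs -errno
        name = errno.errorcode.get(code, "UNKNOWN")
        results.append((int(match.group("ino")), int(match.group("pages")), name))
    return results


sample = [
    "Nov  2 22:20:57 taz kernel: ext4_da_writepages: jbd2_start: 1024 pages, ino 7113326; err -30",
    "Nov  2 22:20:57 taz kernel: Call Trace:",
    "Nov  2 22:21:27 taz kernel: ext4_da_writepages: jbd2_start: 1024 pages, ino 7113325; err -30",
]
failures = parse_failures(sample)
for ino, pages, name in failures:
    print(ino, pages, name)
```

Each repeated trace in the log is therefore the same symptom (writeback failing against a journal that has already been aborted), not a new independent failure.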

Version-Release number of selected component (if applicable):

F9 up-to-date
kernel-2.6.27.4-19.fc9.x86_64

Comment 1 Eric Sandeen 2008-11-03 10:48:16 EST
What does e2fsck find, if anything?  I wouldn't expect this to be a disk corruption issue, but if e2fsck found problems please attach that info.

Thanks,
-Eric
Comment 2 Eric Sandeen 2008-11-03 10:54:36 EST
Also; can you let me know what the filesystem geometry is (dumpe2fs -h), as well as which mount options you're using?

Thanks,
-Eric
Comment 3 Theodore Tso 2008-11-03 11:09:05 EST
Looks like this bug is also being tracked at http://bugzilla.kernel.org/show_bug.cgi?id=11937 and there is a proposed fix. I'm just waiting for feedback on the patch.
Comment 4 Mihai Harpau 2008-11-03 13:13:48 EST
Re: comment #1, #2
No, e2fsck does not find anything.

[mihai@taz ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/taz-root   28G  4.5G   22G  17% /
/dev/mapper/taz-home  109G   95G  9.2G  92% /home
/dev/sda1              99M   27M   68M  29% /boot
tmpfs                 994M   48K  994M   1% /dev/shm


[root@taz ~]# dumpe2fs -h /dev/mapper/taz-home
dumpe2fs 1.41.3 (12-Oct-2008)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          d06f7797-270a-45c7-8f5b-7c85f7db0698
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent sparse_super large_file
Filesystem flags:         signed_directory_hash test_filesystem 
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              29491200
Block count:              29483008
Reserved block count:     1474150
Free blocks:              3656225
Free inodes:              29241170
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         32768
Inode blocks per group:   1024
Filesystem created:       Mon Nov 28 23:32:34 2005
Last mount time:          Mon Nov  3 15:29:34 2008
Last write time:          Mon Nov  3 15:29:34 2008
Mount count:              1
Maximum mount count:      -1
Last checked:             Mon Nov  3 15:27:31 2008
Check interval:           0 (<none>)
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:	          128
Journal inode:            8
Default directory hash:   tea
Directory Hash Seed:      2a02e5ed-d63c-4f57-b50e-5a135bdf95dd
Journal backup:           inode blocks
Journal size:             32M
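
As a rough aid for reading the geometry above, the "Key: value" lines of dumpe2fs -h can be turned into a dictionary and the block counts converted to sizes. This is a minimal sketch; parse_dumpe2fs_header is a hypothetical helper name, and the sample text is a subset of the output quoted in this comment.

```python
def parse_dumpe2fs_header(text):
    """Split 'Key:   value' lines from `dumpe2fs -h` into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps in values stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


sample = """\
Block count:              29483008
Reserved block count:     1474150
Free blocks:              3656225
Block size:               4096
Filesystem created:       Mon Nov 28 23:32:34 2005
Journal size:             32M
"""

fields = parse_dumpe2fs_header(sample)
block_size = int(fields["Block size"])
total_bytes = int(fields["Block count"]) * block_size
free_bytes = int(fields["Free blocks"]) * block_size
reserved_bytes = int(fields["Reserved block count"]) * block_size

GIB = 1024 ** 3
print(f"total    {total_bytes / GIB:.2f} GiB")
print(f"free     {free_bytes / GIB:.2f} GiB")
print(f"reserved {reserved_bytes / GIB:.2f} GiB")
print("journal ", fields["Journal size"])
```

The total comes out slightly larger than the 109G that df reports for /home, which is expected because df excludes filesystem overhead.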


[root@taz ~]# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext4dev rw,relatime,barrier=1,data=ordered 0 0
/dev /dev tmpfs rw,relatime,mode=755 0 0
/proc /proc proc rw,relatime 0 0
/sys /sys sysfs rw,relatime 0 0
none /selinux selinuxfs rw,relatime 0 0
/proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620 0 0
/dev/mapper/taz-home /home ext4dev rw,relatime,barrier=1,data=ordered 0 0
/dev/sda1 /boot ext3 rw,relatime,errors=continue,user_xattr,acl,data=ordered 0 0
tmpfs /dev/shm tmpfs rw,relatime 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
/proc /var/named/chroot/proc proc rw,relatime 0 0
gvfs-fuse-daemon /home/mihai/.gvfs fuse.gvfs-fuse-daemon rw,nosuid,nodev,relatime,user_id=500,group_id=500 0 0
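
To pick out the affected filesystems from a listing like the one above, the /proc/mounts format (device, mount point, type, options, then two numeric fields) can be filtered by type. A minimal sketch follows; ext4_mounts is a hypothetical helper, and the default type list assumes that both "ext4" and the development name "ext4dev" seen in this report are of interest.

```python
def ext4_mounts(mounts_text, fstypes=("ext4", "ext4dev")):
    """Return (device, mountpoint, [options]) for mounts of the given types."""
    found = []
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = parts[:4]
        if fstype in fstypes:
            found.append((device, mountpoint, options.split(",")))
    return found


sample = """\
rootfs / rootfs rw 0 0
/dev/root / ext4dev rw,relatime,barrier=1,data=ordered 0 0
/dev/mapper/taz-home /home ext4dev rw,relatime,barrier=1,data=ordered 0 0
/dev/sda1 /boot ext3 rw,relatime,errors=continue,user_xattr,acl,data=ordered 0 0
tmpfs /dev/shm tmpfs rw,relatime 0 0
"""

mounts = ext4_mounts(sample)
for device, mountpoint, options in mounts:
    print(device, mountpoint, ",".join(options))
```

On a live system the same function could be fed the contents of /proc/mounts instead of the sample string.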
Comment 5 Mihai Harpau 2008-11-03 13:24:31 EST
In reply to comment #3:

Do you need my feedback or are you referring to feedback from bugzilla.kernel.org?
Also, the patch to fix my problem seems to be against kernel 2.6.28-rc2, isn't it?
Comment 6 Theodore Tso 2008-11-03 14:07:41 EST
In reply to comment #5: yes, that patch is against 2.6.28-rc2, and against the ext3 filesystem. There was a similar patch that caused the same regression for ext3, and which is in the ext4 tree, and a similar patch which is in the ext4 patch queue.

I'll attach it here for your convenience, but Eric knows where to find it.  :-)
Comment 7 Theodore Tso 2008-11-03 14:08:36 EST
Created attachment 322354
Proposed patch for the problem reported here.
Comment 8 Chuck Ebbert 2008-11-05 21:28:27 EST
(In reply to comment #5)
> In reply to comment #3:
> 
> Do you need my feedback or are you referring to feedback from
> bugzilla.kernel.org?
> Also that patch to fix my problem seems to be against kernel 2.6.28-rc2, isn't?

The 2.6.27.4 fc9 kernels include the ext4 updates from 2.6.28-rc2.
Comment 9 Mihai Harpau 2008-11-06 14:11:00 EST
OK, I am now running the test kernel 2.6.27.4-26.mh.bz469582.fc9.x86_64, that is, kernel 2.6.27.4-26 plus the patch from comment #7. I'll report back with any results from this test kernel.
Comment 10 Mihai Harpau 2008-11-08 08:01:23 EST
1. After running the test kernel 2.6.27.4-26.mh.bz469582.fc9.x86_64 for 24 hours, I no longer see the crash from the original report.
2. After that testing period I also increased the journal size from 32M to 128M for the filesystem /dev/taz/home, as advised by Theodore Tso in http://lkml.org/lkml/2008/11/1/61, and I am running the same test kernel to see whether the filesystem performs better.
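
The comment does not say how the journal was resized. One common approach, shown here purely as an assumption, is to remove the journal with tune2fs -O ^has_journal and recreate it with tune2fs -J size=128 while the filesystem is cleanly unmounted. The sketch below only builds the command lines and does not execute anything; journal_resize_commands and journal_blocks are hypothetical helper names.

```python
def journal_resize_commands(device, size_mb):
    """Build (but do not run) tune2fs commands to recreate the journal.

    Assumption: the filesystem on `device` is cleanly unmounted first.
    """
    return [
        ["tune2fs", "-O", "^has_journal", device],    # drop the existing journal
        ["tune2fs", "-J", f"size={size_mb}", device],  # create a new one, size in MB
    ]


def journal_blocks(size_mb, block_size=4096):
    """Number of filesystem blocks a journal of size_mb megabytes occupies."""
    return size_mb * 1024 * 1024 // block_size


commands = journal_resize_commands("/dev/taz/home", 128)
for argv in commands:
    print(" ".join(argv))
print(journal_blocks(32), "->", journal_blocks(128), "blocks")
```

With the 4096-byte block size reported in comment 4, the change from 32M to 128M quadruples the number of journal blocks.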
Comment 11 Chuck Ebbert 2008-11-08 17:13:55 EST
The upstream fix went into 2.6.27.5-30.
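
A quick way to check whether a given Fedora 9 kernel build is at or past 2.6.27.5-30 is to compare the numeric version and the leading release number. This is a simplified sketch, not RPM's real version-comparison algorithm; kernel_key and has_fix are hypothetical names, and a locally patched build such as the test kernel from comment #9 would be reported as lacking the fix even though it carries the patch.

```python
import re


def kernel_key(version_release):
    """Turn '2.6.27.5-30' or 'kernel-2.6.27.5-37.fc9' into a comparable tuple."""
    text = version_release
    if text.startswith("kernel-"):
        text = text[len("kernel-"):]
    version, _, release = text.partition("-")
    version_nums = tuple(int(n) for n in version.split("."))
    # Keep only the leading number of the release; ignore '.fc9.x86_64' etc.
    match = re.match(r"\d+", release)
    release_num = int(match.group()) if match else 0
    return version_nums, release_num


# First build stated in this report to contain the upstream fix.
FIRST_FIXED = kernel_key("2.6.27.5-30")


def has_fix(version_release):
    """True if the build is at or after the first fixed build."""
    return kernel_key(version_release) >= FIRST_FIXED


for build in ("2.6.27.4-19.fc9.x86_64", "kernel-2.6.27.5-32.fc9", "kernel-2.6.27.5-41.fc9"):
    print(build, has_fix(build))
```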
Comment 12 Fedora Update System 2008-11-10 08:15:32 EST
kernel-2.6.27.5-32.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/kernel-2.6.27.5-32.fc9
Comment 13 Fedora Update System 2008-11-11 21:57:33 EST
kernel-2.6.27.5-32.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update kernel'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-9583
Comment 14 Fedora Update System 2008-11-13 02:42:41 EST
kernel-2.6.27.5-37.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/kernel-2.6.27.5-37.fc9
Comment 15 Fedora Update System 2008-11-14 06:53:57 EST
kernel-2.6.27.5-41.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/kernel-2.6.27.5-41.fc9
Comment 16 Fedora Update System 2008-11-19 09:54:22 EST
kernel-2.6.27.5-41.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.
