Description of problem:

------------[ cut here ]------------
kernel BUG at fs/ext4/mballoc.c:1648!
invalid opcode: 0000 [1] SMP
CPU 0
Modules linked in: nfsd auth_rpcgss exportfs nls_utf8 nfs lockd nfs_acl usb_storage bridge bnep rfcomm l2cap bluetooth autofs4 fuse sunrpc ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables cpufreq_ondemand acpi_cpufreq freq_table ext4dev jbd2 crc16 ext2 dm_mirror dm_multipath dm_mod ipv6 sr_mod cdrom ata_generic ppdev snd_hda_intel floppy dcdbas parport_pc parport snd_seq_dummy snd_seq_oss i2c_i801 pcspkr i2c_core firewire_ohci snd_seq_midi_event ata_piix sg iTCO_wdt firewire_core snd_seq pata_acpi iTCO_vendor_support snd_seq_device crc_itu_t snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_hwdep snd tg3 joydev button i82975x_edac soundcore edac_core ahci libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 4357, comm: fio Not tainted 2.6.25.6-55.fc9.x86_64 #1
RIP: 0010:[<ffffffff8832ae8d>] [<ffffffff8832ae8d>] :ext4dev:ext4_mb_new_blocks+0x1043/0x2175
RSP: 0018:ffff81006cca3a98 EFLAGS: 00010246
RAX: 0000000000008000 RBX: 0000000000008000 RCX: 0000000000008000
RDX: 0000000000008000 RSI: 0000000000008000 RDI: 000000000000000c
RBP: ffff81006cca3c58 R08: 000000000000000d R09: ffff81007b044fff
R10: 000000000000000d R11: 0000000000000001 R12: 0000000000000000
R13: ffff81001e9f31f8 R14: 0000000000000fce R15: ffff81001e9f3238
FS: 00007f5a30f966f0(0000) GS:ffffffff813f2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000032c30dad60 CR3: 000000006ac42000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process fio (pid: 4357, threadinfo ffff81006cca2000, task ffff810016c7a000)
Stack: ffff810027481f00 ffff810044c02420 0000000000000000 ffff81006cca3bf8
 0000000300000000 0000000000000000 0000000000000000 ffff8100587dc000
 0000000000000292 ffff81006cca3d64 ffff81006cca3cf8 ffff81007f6e1300
Call Trace:
 [<ffffffff88302e22>] ? :jbd2:find_revoke_record+0x5a/0x89
 [<ffffffff883032cf>] ? :jbd2:jbd2_journal_cancel_revoke+0x11c/0x163
 [<ffffffff8832699b>] :ext4dev:ext4_ext_get_blocks+0x83a/0xa2f
 [<ffffffff8128e800>] ? __down_read+0x1a/0x98
 [<ffffffff88317f41>] :ext4dev:ext4_get_blocks_wrap+0xd3/0x110
 [<ffffffff88323de3>] :ext4dev:ext4_fallocate+0x194/0x350
 [<ffffffff810b7e55>] ? notify_change+0x2fb/0x30e
 [<ffffffff8106c5bf>] ? audit_syscall_entry+0x126/0x15a
 [<ffffffff8106c290>] ? audit_syscall_exit+0x331/0x353
 [<ffffffff810a30ef>] sys_fallocate+0xfb/0x11f
 [<ffffffff8100c052>] tracesys+0xd5/0xda

Code: 88 48 c7 c6 70 1e 33 88 31 c0 e8 53 4d ff ff e9 da 00 00 00 49 8b 45 08 48 8b 80 58 02 00 00 48 8b 48 10 48 63 c2 48 39 c8 72 04 <0f> 0b eb fe 48 63 45 a4 48 39 c8 72 04 0f 0b eb fe 41 80 bd 82
RIP [<ffffffff8832ae8d>] :ext4dev:ext4_mb_new_blocks+0x1043/0x2175
 RSP <ffff81006cca3a98>
---[ end trace e1aedad6ea231792 ]---

Version-Release number of selected component (if applicable):

2.6.25.6-55.fc9.x86_64

How reproducible:

Not sure.

Steps to Reproduce:

Use the following fio work file:

[global]
ioengine=libaio
iodepth=64
bs=4k
; job files should be pre-allocated, and each file should be created
; in turn so as not to interleave disk blocks.
direct=1
size=1024m
overwrite=1
create_serialize=1
unlink=0
;thread

[aio-test1]
rw=write

[aio-test2]
rw=read

[aio-test3]
rw=randwrite

[aio-test4]
rw=randread

Actual results:

Backtrace reported above, and file system does not like to do I/O after this.
OK, this is reproducible. Time for another reboot.
1640 static void ext4_mb_measure_extent(struct ext4_allocation_context *ac,
1641                                    struct ext4_free_extent *ex,
1642                                    struct ext4_buddy *e4b)
1643 {
1644         struct ext4_free_extent *bex = &ac->ac_b_ex;
1645         struct ext4_free_extent *gex = &ac->ac_g_ex;
1646
1647         BUG_ON(ex->fe_len <= 0);
1648         BUG_ON(ex->fe_len >= EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1649         BUG_ON(ex->fe_start >= EXT4_BLOCKS_PER_GROUP(ac->ac_sb));
1650         BUG_ON(ac->ac_status != AC_STATUS_CONTINUE);

Here's what fio is doing:

open("aio-test1.1.0", O_RDWR|O_CREAT|O_DIRECT, 0600) = 8
fstat(8, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
close(8) = 0
write(1, "aio-test1: Laying out IO file(s)"..., 55aio-test1: Laying out IO file(s) (1 file(s) / 1024MiB)
) = 55
open("aio-test1.1.0", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 8
ftruncate(8, 1073741824) = 0
syscall_285(0x8, 0, 0, 0x40000000, 0x40000000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
Message from syslogd@segfault at Jul 1 13:42:01 ...
 kernel: ------------[ cut here ]------------
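A possible reading of why line 1648 fires, hedged because it is inferred only from the register dump and the Code: bytes in the oops: the bytes just before the trapping <0f> 0b look like a sign-extend of %edx into %rax, a compare against %rcx, and a jump-if-below, and both RDX and RCX hold 0x8000 (32768). If that reading is right, fe_len was exactly equal to the blocks-per-group value, which the ">=" in the check rejects. The sketch below only illustrates that comparison; trips_line_1648() and demo() are made-up names, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative stand-in for the condition inside the BUG_ON at
 * mballoc.c:1648. Hypothetical helper, not part of the kernel.
 */
static bool trips_line_1648(long fe_len, long blocks_per_group)
{
    return fe_len >= blocks_per_group;
}

/* Values taken from the oops: RDX (assumed fe_len) and RCX (assumed limit). */
static void demo(void)
{
    assert(trips_line_1648(0x8000, 0x8000));   /* exactly one full group: trips */
    assert(!trips_line_1648(0x7fff, 0x8000));  /* one block fewer: passes */
}
```

Under this assumption an extent covering one whole block group is enough to hit the BUG, with no out-of-range length involved.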
(sorry for so many updates, but this is crashing my desktop machine, so I can only get so far before I need to reboot!)

And, of course, syscall 285 is fallocate (but we all knew that, given the stack trace):

#define __NR_fallocate 285
__SYSCALL(__NR_fallocate, sys_fallocate)
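For anyone who wants to try this without fio, here is a minimal sketch of the same layout sequence seen in the strace: open with O_WRONLY|O_CREAT|O_TRUNC, ftruncate to the target size, then fallocate with mode 0, offset 0, and the same length. It assumes a glibc that provides the fallocate() wrapper (otherwise the raw syscall number above would be needed), and layout_file() is a made-up helper name, not something from fio. It has not been confirmed to trigger the BUG on its own.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Mirror the calls from the strace: create/truncate the file,
 * ftruncate() it to `size`, then fallocate(fd, 0, 0, size).
 * Returns 0 on success, or -1 with errno set by the failing call.
 * layout_file() is a hypothetical helper, not part of fio.
 */
static int layout_file(const char *path, off_t size)
{
    assert(size > 0); /* fallocate() rejects a non-positive length */

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    if (ftruncate(fd, size) != 0 || fallocate(fd, 0, 0, size) != 0) {
        int saved = errno;  /* keep the original error across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

To match the report, call layout_file("aio-test1.1.0", 0x40000000) from a directory on the ext4dev mount; on an affected kernel, expect the machine to need a reboot afterwards.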
FYI, I booted 2.6.26-rc8 and could not reproduce the problem with that kernel.
Pretty sure this is fixed now.