From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; Q312461)

Description of problem:
Running '/usr/etc/rdump 0ua -f /dev/dull /' will frequently cause a kernel oops. Possibly due to a buffer flush trashing buffers used by ext3 when the file system is mounted.

Version-Release number of selected component (if applicable):

How reproducible:
Sometimes

Steps to Reproduce:
1. /usr/etc/rdump 0ua -f /dev/dull /

Actual Results: Either segfaults or crashes the host.

Expected Results: Should complete all steps of the dump successfully.

Additional info:

The following oops often occurs:

May 9 10:31:22 brazos kernel: invalidate: busy buffer
May 9 10:31:22 brazos kernel: invalidate: busy buffer
May 9 10:31:22 brazos last message repeated 35 times
May 9 10:31:22 brazos kernel: ------------[ cut here ]------------
May 9 10:31:22 brazos kernel: kernel BUG at page_alloc.c:226!
May 9 10:31:22 brazos kernel: invalid operand: 0000
May 9 10:31:22 brazos kernel: CPU: 0
May 9 10:31:22 brazos kernel: EIP: 0010:[rmqueue+138/688] Not tainted
May 9 10:31:22 brazos kernel: EIP: 0010:[<c012fbaa>] Not tainted
May 9 10:31:22 brazos kernel: EFLAGS: 00010086
May 9 10:31:22 brazos kernel: EIP is at rmqueue [kernel] 0x8a
May 9 10:31:22 brazos kernel: eax: 00000020 ebx: c1633971 ecx: 00000001 edx: 001b93d7
May 9 10:31:22 brazos kernel: esi: c1633971 edi: c02c3414 ebp: 00000002 esp: d9fbde68
May 9 10:31:22 brazos kernel: ds: 0018 es: 0018 ss: 0018
May 9 10:31:22 brazos kernel: Process rdump (pid: 6191, stackpage=d9fbd000)
May 9 10:31:22 brazos kernel: Stack: c02987bb 000000e2 00001000 00001000 00000282 00000000 c02c33bc c02c33bc
May 9 10:31:22 brazos kernel: c02c3578 00000000 000001ff c012ff55 c012678d 00000001 c02c3574 000001d0
May 9 10:31:22 brazos kernel: dfb09a94 00000017 000100f8 c17d0e4c c0126801 dc826264 dc826264 00000017
May 9 10:31:22 brazos kernel: Call Trace: [__alloc_pages+117/768] __alloc_pages [kernel] 0x75
May 9 10:31:22 brazos kernel: Call Trace: [<c012ff55>] __alloc_pages [kernel] 0x75
May 9 10:31:22 brazos kernel: [add_to_page_cache_unique+109/128] add_to_page_cache_unique [kernel] 0x6d
May 9 10:31:22 brazos kernel: [<c012678d>] add_to_page_cache_unique [kernel] 0x6d
May 9 10:31:22 brazos kernel: [page_cache_read+97/192] page_cache_read [kernel] 0x61
May 9 10:31:22 brazos kernel: [<c0126801>] page_cache_read [kernel] 0x61
May 9 10:31:22 brazos kernel: [generic_file_readahead+279/352] generic_file_readahead [kernel] 0x117
May 9 10:31:22 brazos kernel: [<c0126f37>] generic_file_readahead [kernel] 0x117
May 9 10:31:22 brazos kernel: [do_generic_file_read+563/1280] do_generic_file_read [kernel] 0x233
May 9 10:31:22 brazos kernel: [<c01271b3>] do_generic_file_read [kernel] 0x233
May 9 10:31:22 brazos kernel: [generic_file_read+126/304] generic_file_read [kernel] 0x7e
May 9 10:31:22 brazos kernel: [<c01277ce>] generic_file_read [kernel] 0x7e
May 9 10:31:22 brazos kernel: [file_read_actor+0/224] file_read_actor [kernel] 0x0
May 9 10:31:22 brazos kernel: [<c0127670>] file_read_actor [kernel] 0x0
May 9 10:31:22 brazos kernel: [sys_read+150/208] sys_read [kernel] 0x96
May 9 10:31:22 brazos kernel: [<c01367d6>] sys_read [kernel] 0x96
May 9 10:31:22 brazos kernel: [tracesys+31/35] tracesys [kernel] 0x1f
May 9 10:31:22 brazos kernel: [<c0106f5f>] tracesys [kernel] 0x1f
May 9 10:31:23 brazos kernel:
May 9 10:31:23 brazos kernel:
May 9 10:31:23 brazos kernel: Code: 0f 0b 59 58 8b 53 04 8b 03 89 50 04 89 02 8b 44 24 10 8b 90
May 9 10:31:27 brazos kernel: ------------[ cut here ]------------
May 9 10:31:27 brazos kernel: kernel BUG at page_alloc.c:226!

I'm using the 2.4.17-0.18smp kernel.
Fixed in Red Hat Linux 7.3.
Actually, the known fix in 7.3 is for a different dump-triggered bug: this trace here looks more like a VM problem than the VFS one we fixed in 7.3. However, there were so many other VM-related changes between 2.4.17-0.18 and the final 2.4.18-3 kernel in 7.3 that we really need to know if the final kernel shows the same problem.
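
When retesting, it may help to compare the resolved frames of any new oops against the one in this report. The following is only a rough sketch, not something taken from the original report: it pulls the "[symbol+offset/size]" entries out of klogd-style lines so two traces can be compared side by side. The helper name resolved_frames and the OOPS_LINES sample (a few lines copied from the oops above) are illustrative assumptions.

```python
import re

# A few lines copied from the oops in this report (klogd's resolved form).
OOPS_LINES = [
    "May 9 10:31:22 brazos kernel: EIP: 0010:[rmqueue+138/688] Not tainted",
    "May 9 10:31:22 brazos kernel: Call Trace: [__alloc_pages+117/768] __alloc_pages [kernel] 0x75",
    "May 9 10:31:22 brazos kernel: [add_to_page_cache_unique+109/128] add_to_page_cache_unique [kernel] 0x6d",
    "May 9 10:31:22 brazos kernel: [page_cache_read+97/192] page_cache_read [kernel] 0x61",
    "May 9 10:31:22 brazos kernel: [sys_read+150/208] sys_read [kernel] 0x96",
]

# Matches "[symbol+offset/size]"; raw address entries such as
# "[<c012ff55>]" deliberately do not match.
FRAME_RE = re.compile(r"\[([A-Za-z_]\w*)\+(\d+)/(\d+)\]")


def resolved_frames(lines):
    """Return (symbol, offset, size) tuples in the order they appear in the log."""
    frames = []
    for line in lines:
        for match in FRAME_RE.finditer(line):
            frames.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return frames


if __name__ == "__main__":
    # Print each frame with its offset in hex, as the "[kernel] 0x.." column shows it.
    for name, offset, size in resolved_frames(OOPS_LINES):
        print(f"{name}+{offset:#x} (function size {size})")
```

In the sample, the faulting frame is rmqueue, reached from sys_read through the page cache read path and __alloc_pages, which is consistent with the trace pointing at the page allocator.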