Description of problem:
An F13 kernel-xen guest running on an F8 Xen hypervisor constantly crashes.

Version-Release number of selected component (if applicable):
2.6.33.5-124.fc13.x86_64

How reproducible:
Start the Xen virtual machine; it eventually crashes.

Steps to Reproduce:
1. Start a VM.
2. Wait.
3.

Actual results:
The first sign is a non-responding shell and the CPU at 100% (in xm top).

Expected results:

Additional info:
I have 4 VM machines and all react the same. If I boot back into F12 there are no issues.

Example from the logs:

Jun 13 13:24:00 lax1 kernel: CPU 0
Jun 13 13:24:00 lax1 kernel: Pid: 269, comm: kjournald Not tainted 2.6.33.5-112.fc13.x86_64 #1 /
Jun 13 13:24:00 lax1 kernel: RIP: e030:[<ffffffff8100122a>]  [<ffffffff8100122a>] hypercall_page+0x22a/0x1006
Jun 13 13:24:00 lax1 kernel: RSP: e02b:ffff88000300f990  EFLAGS: 00000246
Jun 13 13:24:00 lax1 kernel: RAX: 0000000000030001 RBX: ffff88002a05c228 RCX: ffffffff8100122a
Jun 13 13:24:00 lax1 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Jun 13 13:24:00 lax1 kernel: RBP: ffff88000300f9a8 R08: ffff880003dd7480 R09: ffff8800031dd208
Jun 13 13:24:00 lax1 kernel: R10: 0000000000000001 R11: 0000000000000246 R12: ffff88002f296000
Jun 13 13:24:00 lax1 kernel: R13: 0000000000011200 R14: ffffffff810c29be R15: 0000000000011200
Jun 13 13:24:00 lax1 kernel: FS:  00007ff03a679740(0000) GS:ffff880003dc5000(0000) knlGS:0000000000000000
Jun 13 13:24:00 lax1 kernel: CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 13 13:24:00 lax1 kernel: CR2: 0000000000ca3bb8 CR3: 000000000a154000 CR4: 0000000000000660
Jun 13 13:24:00 lax1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 13 13:24:00 lax1 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Jun 13 13:24:00 lax1 kernel: Process kjournald (pid: 269, threadinfo ffff88000300e000, task ffff88000333dd40)
Jun 13 13:24:00 lax1 kernel: Stack:
Jun 13 13:24:00 lax1 kernel: ffff880002fe18b0 0000000000000000 ffffffff81005f05 ffff88000300fa50
Jun 13 13:24:00 lax1 kernel: <0> ffffffff810065e2 ffff880002fe18b0 0000000000000001 ffff8800031dd208
Jun 13 13:24:00 lax1 kernel: <0> ffff880003dd7480 0000000000010200 0000000000011200 0000000000000000
Jun 13 13:24:00 lax1 kernel: Call Trace:
Jun 13 13:24:00 lax1 kernel: [<ffffffff81005f05>] ? xen_force_evtchn_callback+0xd/0xf
Jun 13 13:24:00 lax1 kernel: [<ffffffff810065e2>] check_events+0x12/0x20
Jun 13 13:24:00 lax1 kernel: [<ffffffff810065cf>] ? xen_restore_fl_direct_end+0x0/0x1
Jun 13 13:24:00 lax1 kernel: [<ffffffff810f5873>] ? kmem_cache_alloc+0xa2/0x10f
Jun 13 13:24:00 lax1 kernel: [<ffffffff810065cf>] ? xen_restore_fl_direct_end+0x0/0x1
Jun 13 13:24:00 lax1 kernel: [<ffffffff810c29be>] mempool_alloc_slab+0x10/0x12
Jun 13 13:24:00 lax1 kernel: [<ffffffff810c2aa2>] mempool_alloc+0x6c/0x11e
Jun 13 13:24:00 lax1 kernel: [<ffffffff810c2aa2>] ? mempool_alloc+0x6c/0x11e
Jun 13 13:24:00 lax1 kernel: [<ffffffff8134e4b1>] alloc_tio+0x21/0x39
Jun 13 13:24:00 lax1 kernel: [<ffffffff8134fa4a>] __split_and_process_bio+0x23c/0x529
Jun 13 13:24:00 lax1 kernel: [<ffffffff81005f05>] ? xen_force_evtchn_callback+0xd/0xf
Jun 13 13:24:00 lax1 kernel: [<ffffffff813500c6>] dm_request+0x1c8/0x1db
Jun 13 13:24:00 lax1 kernel: [<ffffffff811ea91b>] generic_make_request+0x2c8/0x321
Jun 13 13:24:00 lax1 kernel: [<ffffffff810c2aa2>] ? mempool_alloc+0x6c/0x11e
Jun 13 13:24:00 lax1 kernel: [<ffffffff811eaa41>] submit_bio+0xcd/0xea
Jun 13 13:24:00 lax1 kernel: [<ffffffff81120cc5>] submit_bh+0xef/0x111
Jun 13 13:24:00 lax1 kernel: [<ffffffff8119c8a9>] journal_commit_transaction+0x9c8/0xfd7
Jun 13 13:24:00 lax1 kernel: [<ffffffff810586f0>] ? try_to_del_timer_sync+0x6e/0x7c
Jun 13 13:24:00 lax1 kernel: [<ffffffff810065cf>] ? xen_restore_fl_direct_end+0x0/0x1
Jun 13 13:24:00 lax1 kernel: [<ffffffff8119f987>] kjournald+0xe3/0x220
Jun 13 13:24:00 lax1 kernel: [<ffffffff8106480b>] ? autoremove_wake_function+0x0/0x34
Jun 13 13:24:00 lax1 kernel: [<ffffffff8142ae38>] ? _raw_spin_unlock_irqrestore+0x14/0x16
Jun 13 13:24:00 lax1 kernel: [<ffffffff8119f8a4>] ? kjournald+0x0/0x220
Jun 13 13:24:00 lax1 kernel: [<ffffffff810643bb>] kthread+0x7a/0x82
Jun 13 13:24:00 lax1 kernel: [<ffffffff8100a924>] kernel_thread_helper+0x4/0x10
Jun 13 13:24:00 lax1 kernel: [<ffffffff81009d21>] ? int_ret_from_sys_call+0x7/0x1b
Jun 13 13:24:00 lax1 kernel: [<ffffffff8142b29d>] ? retint_restore_args+0x5/0x6
Jun 13 13:24:00 lax1 kernel: [<ffffffff8100a920>] ? kernel_thread_helper+0x0/0x10
Jun 13 13:24:00 lax1 kernel: Code: cc 51 41 53 b8 10 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 11 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc

(The same call trace is printed once more after the Code: line; the repeat is omitted here.)
Since the f12 kernel you are using works, and the f13 doesn't, can you please try to bisect it down using the kernels available here?

http://kojipkgs.fedoraproject.org/packages/kernel/

Thanks,
Andrew
(In reply to comment #1)
> Since the f12 kernel you are using works, and the f13 doesn't, can you please
> try to bisect it down using the kernels available here?
>
> http://kojipkgs.fedoraproject.org/packages/kernel/
>
> Thanks,
> Andrew

Quick attempts before more thorough testing:

last F12:  kernel-2.6.32.14-134.fc12.x86_64.rpm -> STABLE
first F13: kernel-2.6.33.1-17.fc13.x86_64.rpm   -> crashes during boot

I will try more kernels in the 2.6.33 branch (latest, earliest, then the middle of the date range).
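The manual "latest, earliest, middle of the date range" approach above is a binary search over the koji build list. A minimal sketch of that midpoint-picking step, assuming a hand-maintained, date-sorted list of candidate builds (the build names below are illustrative placeholders, not an actual koji listing):

```shell
#!/bin/sh
# Binary-search sketch for the manual kernel bisection: keep the
# candidate builds between the last-good and first-bad kernels in
# date order, test the midpoint, then halve the remaining range.

GOOD="kernel-2.6.32.14-134.fc12.x86_64"   # last known-good build
BAD="kernel-2.6.33.1-17.fc13.x86_64"      # first known-bad build

# Candidate builds between GOOD and BAD, oldest first.
# NOTE: placeholder names for illustration only.
set -- build-A build-B build-C build-D build-E

pick_midpoint() {
    # Echo the middle element of the argument list.
    n=$#
    mid=$(( (n + 1) / 2 ))
    shift $(( mid - 1 ))
    echo "$1"
}

next=$(pick_midpoint "$@")
echo "next kernel to test: $next"
# Install it (e.g. rpm -ivh <build>.rpm, downloaded from
# http://kojipkgs.fedoraproject.org/packages/kernel/), boot the
# guest, record STABLE or CRASH, then drop the half of the list
# that the result rules out and repeat.
```

With five candidates this needs at most three test boots instead of five, and the saving grows with the size of the range.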
Fixed in kernel-2.6.33.5-133.fc13.x86_64, and still fixed in kernel-2.6.33.6-147.2.4.fc13.x86_64.

I *think* xen-libs-3.4.3-2.fc13.x86_64 was the catalyst. Previous FC13 versions would always crash.