Bug 214635
Summary: | dom0: panic w/X on i965 (appears to be agpgart?) | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Bill Nottingham <notting> |
Component: | kernel-xen | Assignee: | Rik van Riel <riel> |
Status: | CLOSED DUPLICATE | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 5.0 | CC: | ajax, rvokal, xen-maint |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2007-01-10 20:26:17 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 217715 |
Description
Bill Nottingham
2006-11-08 18:43:20 UTC
Fixing subject, as xen reboots on panic by default. As a wild guess, agp_release isn't paranoid enough to nop away double-frees or frees of invalid regions.

A kernel built with most CONFIG_DEBUG_* enabled yields:

Xorg: Corrupted page table at address a7f300
PGD 6706e067 PUD 695bf067 PMD 5d39c067 PTE 74992fff
Bad pagetable: 000f [1] SMP
last sysfs file: /class/drm/card0/dev

*** Bug 214287 has been marked as a duplicate of this bug. ***

As an interesting data point (that I have no idea what it means): If you boot into runlevel 5, this happens with the first X instance (if you're using GL.) If you boot into runlevel 3 and run 'telinit 5', it doesn't crash until the second or third time X is started.

And here's one with dri disabled:

swap_free: Bad swap offset entry 1000000000000
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/rmap.c:587
invalid opcode: 0000 [1] SMP
last sysfs file: /class/xen/blktap0/dev
CPU 0
Modules linked in: nfs lockd fscache nfs_acl xt_physdev x_tables netconsole bridge netloop netbk blktap blkbk autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 video sbs i2c_ec button battery asus_acpi ac parport_pc lp parport intel_rng snd_hda_intel snd_hda_codec sr_mod cdrom snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss sg snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801 e1000 serial_core i2c_core shpchp pcspkr serio_raw dm_snapshot dm_zero dm_mirror dm_mod ahci libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 3806, comm: python Not tainted 2.6.18-1.2746.el5xen #1
RIP: e030:[<ffffffff8020adfa>] [<ffffffff8020adfa>] page_remove_rmap+0x13/0x2c
RSP: e02b:ffff880039dc9c40 EFLAGS: 00010286
RAX: 00000000ffffffff RBX: ffff880001a165b8 RCX: 030000001f1d1d1d
RDX: 0000000000000000 RSI: 0000000057ad6120 RDI: ffff880001a165b8
RBP: 000000001f1d1d00 R08: 0000000000074992 R09: 0000000000006400
R10: 0000003eec201000 R11: 0000000000000000 R12: 0000003eec201000
R13: ffff880041f94008 R14: ffff880038c93b00 R15: 0000003eec208000
FS: 00002aaaaaabdf40(0000) GS:ffffffff8058e000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process python (pid: 3806, threadinfo ffff880039dc8000, task ffff8800411f9100)
Stack: ffffffff80207972 0000000000000000 ffff880039dc9d18 ffffffffffffffff
 0000000000000000 ffff880037d6d5d0 ffff880039dc9d20 000000000003931e
 0000000000000000 0000000138c93b00
Call Trace:
 [<ffffffff80207972>] unmap_vmas+0x793/0xae7
 [<ffffffff80239cea>] exit_mmap+0x7d/0xf8
 [<ffffffff8023c02a>] mmput+0x30/0x83
 [<ffffffff80214fd5>] do_exit+0x288/0x89a
 [<ffffffff80247aa3>] cpuset_exit+0x0/0x6b
 [<ffffffff8022aaab>] get_signal_to_deliver+0x439/0x46c
 [<ffffffff8025a182>] do_notify_resume+0x9c/0x7b4
 [<ffffffff80280afb>] task_rq_lock+0x3f/0x71
 [<ffffffff802458c4>] try_to_wake_up+0x365/0x376
 [<ffffffff8028d912>] signal_wake_up+0x1e/0x2d
 [<ffffffff8028e40b>] specific_send_sig_info+0xa4/0xaf
 [<ffffffff8028e682>] force_sig_info+0xa9/0xb3
 [<ffffffff80269537>] do_stack_segment+0x84/0x8b
 [<ffffffff8025cace>] retint_signal+0x5d/0xb7
Code: 0f 0b 68 21 b3 46 80 c2 4b 02 8b 77 18 83 f6 01 83 e6 01 e9
RIP [<ffffffff8020adfa>] page_remove_rmap+0x13/0x2c RSP <ffff880039dc9c40>
<1>Fixing recursive fault but reboot is needed!

Proposing as RC blocker. Xen dom0 + X + i965 = panic-o-rama.

Aha, we have an interesting data point. This is x86-64 specific - the i386 xen kernel is fine.

*** Bug 214650 has been marked as a duplicate of this bug. ***

This really needs to not be assigned to me, I don't understand the agpgart code at all.

Should be fixed by adding the include/asm-x86_64/mach-xen/asm/agp.h file, as in bug 217715.

*** This bug has been marked as a duplicate of 217715 ***
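
For anyone reading the first oops in this report, the "Bad pagetable: 000f" error code and the "PTE 74992fff" value can be picked apart by hand. The sketch below is only a reading aid, not part of the original report: the helper names `decode_pf_error` and `split_pte` are hypothetical, the bit meanings are the architectural x86 page-fault error-code bits, and the PTE split assumes a 4 KiB page whose low 12 bits are flags.

```python
# Reading aid for the first oops above; helper names are hypothetical.

# Architectural x86 page-fault error-code bits (defined by the CPU).
PF_BITS = [
    (0x01, "P"),     # fault on a present page (protection violation)
    (0x02, "W"),     # the access was a write
    (0x04, "U"),     # the access came from user mode
    (0x08, "RSVD"),  # a reserved bit was set in a paging-structure entry
    (0x10, "I"),     # the access was an instruction fetch
]


def decode_pf_error(code):
    """Return the names of the error-code bits that are set."""
    return [name for mask, name in PF_BITS if code & mask]


def split_pte(pte):
    """Split a PTE into (frame number, low flag bits).

    Assumption: 4 KiB page, so the low 12 bits are flags. High bits
    such as NX are not handled; the value in the report has none set.
    """
    return pte >> 12, pte & 0xFFF


if __name__ == "__main__":
    print(decode_pf_error(0x000F))          # error code from "Bad pagetable: 000f"
    frame, flags = split_pte(0x74992FFF)    # value from "PTE 74992fff"
    print(hex(frame), hex(flags))
```

Read this way, error code 000f has the reserved-bit flag set, which fits the "Corrupted page table" message, and every low flag bit of the PTE is set. The frame number works out to 0x74992, the same value that shows up in R08 in the second oops; whether that is significant is not established anywhere in this report.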