Description of problem:
If one un-blacklists firewire-ohci, then subsequently boots a xen kernel, the xen kernel will panic in udev startup, presumably when the firewire-ohci kernel module gets loaded.

INIT: version 2.86 booting
Welcome to Red Hat Enterprise Linux Server
Press 'I' to enter interactive startup.
Setting clock (utc): Tue Dec 11 14:25:15 EST 2007 [ OK ]
Starting udev: Unable to handle kernel paging request at ffff87fff0008000 RIP:
 [<ffffffff80277ed4>] xen_destroy_contiguous_region+0xc8/0x4f1
PGD 0
Oops: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:1d.1/usb2/2-0:1.0/usbdev2.1_ep81/dev
CPU 0
Modules linked in: 8250 soundcore firewire_ohci snd_page_alloc serial_core firewire_core cdrom floppy parport_pc serio_raw pcspkr tg3 parport dm_snapshot dm_zero dm_mirror dm_mod ahci libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 654, comm: modprobe Not tainted 2.6.18-58.el5xen #1
RIP: e030:[<ffffffff80277ed4>] [<ffffffff80277ed4>] xen_destroy_contiguous_region+0xc8/0x4f1
RSP: e02b:ffffffff80618e00 EFLAGS: 00010006
RAX: 0000000780000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff880000008000 RSI: 0000000000000000 RDI: ffffffff80618e48
RBP: 0000000000000000 R08: ffffffff805a4480 R09: 0000000000000000
R10: ffffffff80618e00 R11: 0000000000000048 R12: 0000000000000001
R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
FS: 00002aaaaaac7240(0000) GS:ffffffff8059b000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process modprobe (pid: 654, threadinfo ffff8800378be000, task ffff880039c1b080)
Stack: 0000000000000000 0000000000000001 0000000000000000 0000000000007ff0
 0000000000000000 0000000000000001 0000000000000000 0000000000007ff0
 0000000000000000 0000000000000000
Call Trace:
 <IRQ> [<ffffffff8026ead9>] dma_free_coherent+0x69/0x77
 [<ffffffff881e7a8c>] :firewire_ohci:bus_reset_tasklet+0x131/0x1b9
 [<ffffffff80288b09>] tasklet_action+0x6a/0xbc
 [<ffffffff80211f50>] __do_softirq+0x62/0xdd
 [<ffffffff8025dd9c>] call_softirq+0x1c/0x280
 [<ffffffff8026aaa2>] do_softirq+0x31/0x98
 [<ffffffff8026a91d>] do_IRQ+0xec/0xf5
 [<ffffffff803969d6>] evtchn_do_upcall+0x86/0xe0
 [<ffffffff8025d8ce>] do_hypervisor_callback+0x1e/0x2c
 <EOI> [<ffffffff8039010d>] klist_devices_get+0x0/0x9
 [<ffffffff8039010d>] klist_devices_get+0x0/0x9
 [<ffffffff80390111>] klist_devices_get+0x4/0x9
 [<ffffffff8044d9aa>] klist_add_tail+0x1c/0x43
 [<ffffffff80390c78>] device_bind_driver+0x36/0x71
 [<ffffffff80390d18>] driver_probe_device+0x65/0xaa
 [<ffffffff80390e34>] __driver_attach+0x65/0xb6
 [<ffffffff80390dcf>] __driver_attach+0x0/0xb6
 [<ffffffff80390746>] bus_for_each_dev+0x43/0x6e
 [<ffffffff8039038c>] bus_add_driver+0x7e/0x130
 [<ffffffff8033a561>] __pci_register_driver+0x57/0x7d
 [<ffffffff8029b6e3>] sys_init_module+0x16a6/0x1857
 [<ffffffff8025d102>] system_call+0x86/0x8b
 [<ffffffff8025d07c>] system_call+0x0/0x8b

Code: 0f a3 02 19 c0 85 c0 0f 84 11 04 00 00 83 fd 09 0f 87 08 04
RIP [<ffffffff80277ed4>] xen_destroy_contiguous_region+0xc8/0x4f1
 RSP <ffffffff80618e00>
CR2: ffff87fff0008000
 <0>Kernel panic - not syncing: Fatal exception
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

Version-Release number of selected component (if applicable):
kernel-xen-2.6.18-58.el5

How reproducible:
Boot kernel-xen, modprobe firewire-ohci.
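For anyone triaging similar oopses, here is a minimal sketch that pulls the module-qualified frames out of a call trace like the one above, which makes it easier to see that the only non-core frame is firewire_ohci's bus_reset_tasklet calling into dma_free_coherent. The names FRAME, parse_frames and module_frames are made up for illustration, and the regex assumes the ":module:symbol+0xoff/0xsize" frame format shown in this particular log; other kernel versions print frames differently.

```python
import re

# Frames copied from the <IRQ> part of the call trace in this report.
TRACE = """\
[<ffffffff8026ead9>] dma_free_coherent+0x69/0x77
[<ffffffff881e7a8c>] :firewire_ohci:bus_reset_tasklet+0x131/0x1b9
[<ffffffff80288b09>] tasklet_action+0x6a/0xbc
[<ffffffff80211f50>] __do_softirq+0x62/0xdd
"""

# [<address>], an optional ":module:" prefix, the symbol, then +0xoffset/0xsize.
# Assumption: this matches the frame layout seen in this log only.
FRAME = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(?::(\w+):)?(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(text):
    """Return (module, symbol, offset) per frame; module is None for core kernel symbols."""
    return [
        (m.group(2), m.group(3), int(m.group(4), 16))
        for m in FRAME.finditer(text)
    ]


def module_frames(text):
    """Return only the (module, symbol) pairs that belong to a loadable module."""
    return [(mod, sym) for mod, sym, _ in parse_frames(text) if mod]


if __name__ == "__main__":
    for mod, sym, off in parse_frames(TRACE):
        print(f"{mod or 'kernel':<14} {sym}+{off:#x}")
```

Feeding the full trace from the console capture through module_frames should work the same way, since every frame in it follows the same layout.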
This works fine for me on fenlason-lab1 (i386) with "03:0c.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)". Can you provide more details on the hardware you are seeing this on?
The system I saw this on has two firewire controllers in it:

05:0a.0 FireWire (IEEE 1394): Agere Systems FW323 (rev 61)
29:00.0 FireWire (IEEE 1394): Texas Instruments XIO2200(A) IEEE-1394a-2000 Controller (PHY/Link) (rev 01)

The Agere is PCI, the TI is PCIe. The box is busy w/other tasks at the moment, but soonish, I can probably try narrowing it down to one card or the other (or the combination of both).
Does 2.6.18-58.el5xen include a backport of upstream commit 0bd243c4d93583cd8e1786c0bd6982f6f9f94ab6 "Fix pci resume to not pass in a __be32 config rom."? This patch is already half a year old and looked alright when it was posted. But it exposed some other suspend/resume related crashes with some SBP-2 devices in earlier firewire subsystem incarnations. Those crashes were somehow gone when I retested while preparing the upstream 2.6.23->2.6.24-rc1 merge, so I included that patch then. OTOH, this bug here could be yet another weakness of Xen WRT DMA mappings, to be fixed in Xen rather than fixed or worked around in the drivers.
> OTOH, this bug here could be yet another weakness of Xen WRT DMA mappings,
> to be fixed in Xen rather than fixed or worked around in the drivers.

References: bug 235542, bug 240471, bug 307461
(In reply to comment #3)
> Does 2.6.18-58.el5xen include a backport of upstream commit
> 0bd243c4d93583cd8e1786c0bd6982f6f9f94ab6 "Fix pci resume to not pass in a __be32
> config rom."?

It does include that commit. Going to hold off on doing a whole lot more debugging on this one though, as 5.2 is slated to have a firewire stack rebased on the latest upstream.
Finally got back to poking at this. Not a problem with the latest rhel5.2 kernel-xen.