From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.4 (like Gecko)

Description of problem:
I have one XEN instance called "gateway" running; it owns the ipw3945 device to route WLAN traffic to the internal eth:

---snip ---
[root@t60p-jschrode ~]# xm list
Name          ID Mem(MiB) VCPUs State   Time(s)
Domain-0       0     1756     2 r-----     61.3
gateway        3      256     1 -b----     18.2

[root@t60p-jschrode ~]# cat /data/xen/gateway.xencfg
# xen config file
# this is a paravirtualized guest
name = "gateway"
memory = "256"
uuid = "8223c51d-3059-07fd-7236-6c324403cbcb"

# installation time:
#kernel="/inst/images/xen/vmlinuz"
#ramdisk="/inst/images/xen/initrd.img"
# for installation while being connected to a network with DHCP, remove the ip and netmask parts in the following line:
#extra="ks=http://192.168.60.1/ks/gateway/gateway-ks.cfg ip=192.168.60.254 netmask=255.255.255.0"
#extra="ks=http://gateway.demo.redhat.com/cobbler/kickstarts/rhel5s-xen/ks.cfg"
#extra="linux rescue"
#on_reboot = 'destroy'

# installed system:
bootloader="/usr/bin/pygrub"
on_reboot = 'restart'

# PCI hideback for LAN and WLAN
#pci = [ '02:00.0' , '03:00.0' ]
# WLAN only
pci = [ '03:00.0' ]

# installation time:
#disk = [ 'phy:/dev/VolGroup00/gateway.boot,xvda,w', 'phy:/dev/VolGroup00/gateway.root,xvdb,w', 'phy:/dev/VolGroup00/gateway.swap,xvdd,w', ]
# installed system:
disk = [ 'phy:/dev/VolGroup00/gateway.boot,xvda,w', 'phy:/dev/VolGroup00/gateway.root,xvdb1,w', 'phy:/dev/VolGroup00/gateway.swap,xvdb2,w', ]
vif = [ 'mac=00:16:3e:a8:3c:fe, bridge=xenbr0', ]
on_crash = 'destroy'
vnc=1
vncunused=0
--- snap ---

The system is registered against webqa. If I try to run an update or create any other heavy disk/net IO, the system loses the disk:

[root@t60p-jschrode ~]# yum update
[...]
irq21: nobody cared
[...]
Disabling IRQ #21

Now the system disk is not accessible anymore and the journal cannot be committed. I attach a screenshot with the complete message.
Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Start a XEN guest with a dedicated PCI (network) device on a T60p
2. Run "yum update" or produce heavy net/disk IO

Actual Results:
The IRQ is lost, the SATA disk is not accessible anymore, and working with the system is no longer possible.

Expected Results:
./.

Additional info:
Created attachment 145906 [details] screenshot of the irq lost message
Can you please post full dom0 boot logs, both "xm dmesg" and kernel "dmesg"?
Created attachment 146276 [details]
dmesg

This is the dmesg; xm dmesg will follow. Both were captured _before_ the system IO freezes, because I can't access the system anymore after the oops.
Created attachment 146277 [details] xm dmesg
Created attachment 146278 [details] interrupts being in use from /proc/interrupts
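For readers who want to check the same thing on their own dom0: /proc/interrupts lists, per IRQ number, the handlers currently registered on that line. The following is only a minimal sketch; the helper name irq_line is made up for illustration, and it simply filters the row for one IRQ (21 here, the number from the error message in this report).

```shell
#!/bin/sh
# irq_line is a hypothetical helper, not part of any Xen or kernel tooling.
# It prints the /proc/interrupts row for a single IRQ number so you can
# see which drivers are registered on that line.
irq_line() {
    irq=$1                        # IRQ number, e.g. 21
    file=${2:-/proc/interrupts}   # override the path for testing
    # The first field of each row is the IRQ number followed by a colon.
    awk -v irq="$irq" '$1 == irq ":" { print }' "$file"
}

# Only query the live table where it exists (Linux).
if [ -r /proc/interrupts ]; then
    irq_line 21
fi
```

Running this before the freeze, in dom0, shows whether other devices are registered on the same IRQ; no output means no handler is currently registered on that number.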
There was a nasty bug in 1.3002 which might just be implicated here; can you please try to reproduce with 2.6.18-4.el5? Thanks.
Sorry for putting the bug into the wrong state! I have now updated dom0 to the latest webqa kernel (2.6.18-5.el5xen) but still experienced the problem. FYI, in between I also switched the SATA mode in the T60p's BIOS from AHCI to Compatible (using module ata_piix instead of ahci), but the problem still existed, so I switched back to AHCI.
I just tried with the latest kernel from webqa, 2.6.18-8.el5xen, and still see the same behaviour: after downloading several megabytes via ipw3945 in the virtual guest, the host loses IRQ #21 and is left unusable. Is there anything I can do to help?
It would be helpful to get the _full_ log of the error when it hits. Can you please try to set up serial console or netconsole? For a laptop, serial console likely implies a docking station these days, but netconsole should still work. (You'll need to modprobe netconsole manually; "modinfo netconsole" should show you the required parameters, and you can redirect the output to syslogd on another host. If you set it up once xend is running, you'll need to run it on peth0, not eth0, too.)
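As a rough sketch of the netconsole setup described above: the module takes a single "netconsole=" parameter of the form [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]. The helper name netconsole_param and the address 192.0.2.10 are placeholders invented for this example; port 514 assumes a syslogd on the other host that has been configured to accept remote messages. Check "modinfo netconsole" on your kernel before relying on the exact format.

```shell
#!/bin/sh
# netconsole_param is a hypothetical helper that builds the module
# parameter string. Source port, source IP and target MAC are left empty
# so the kernel picks its defaults.
netconsole_param() {
    dev=$1               # sending interface: peth0 once xend is running
    tgt_ip=$2            # log host; 192.0.2.10 below is a placeholder
    tgt_port=${3:-514}   # assumed: syslogd's UDP port on the log host
    printf 'netconsole=@/%s,%s@%s/\n' "$dev" "$tgt_port" "$tgt_ip"
}

# Print the command instead of running it, so the values can be reviewed.
echo "modprobe netconsole $(netconsole_param peth0 192.0.2.10)"
```

The printed command has to be run as root in dom0. If netconsole is loaded before xend starts, the interface would be eth0 instead of peth0.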
change QA contact
This has been in NEEDINFO for over 2 years. I'm going to close it out for now; if you are still having the problem, and want to pursue the problem, please feel free to re-open. Chris Lalancette