Description of problem:
netdump kernel modules ("netdump", "netconsole") won't load on HP XW9300

Version-Release number of selected component (if applicable):
netdump-0.7.7-3

How reproducible:
Always on HP XW9300 (RHEL4U2 ES AMD64 SMP).
Always works on other x86_64 systems (e.g. HP XW8200, Fujitsu-Siemens Celsius V830, IBM Intellistation A Pro 6217, ...).

Steps to Reproduce:
1. service netdump start
2. provide netdump password at the prompt

Actual results:
[root@hp-amd-1 ~]# service netdump start
netdump.13.3's password:
initializing netdump FATAL: Error inserting netdump (/lib/modules/2.6.9-22.ELsmp/kernel/drivers/net/netdump.ko): Invalid argument [FAILED]
initializing netconsole FATAL: Error inserting netconsole (/lib/modules/2.6.9-22.ELsmp/kernel/drivers/net/netconsole.ko): Invalid argument [FAILED]
[root@hp-amd-1 ~]#

Expected results:
[root@hp-intel-1 ~]# service netdump start
netdump.13.3's password:
initializing netdump [ OK ]
initializing netconsole [ OK ]
[root@hp-intel-1 ~]#

Additional info:
sysreport in attachment
Created attachment 125135 [details] sysreport (no rpm data)
Strange -- looking at your /var/log/messages file:

    netdump: eth0 doesn't support polling, aborting.
    netconsole: eth0 doesn't support polling, aborting.
    netlog: eth0 doesn't support polling, aborting.

netdump and netconsole (and netlog via netconsole) all utilize the in-kernel netpoll facility, and the netpoll facility is failing because the eth0 NIC driver does not support polling mode, i.e., never set up its poll_controller() interface:

    int netpoll_setup(struct netpoll *np)
    {
            struct net_device *ndev = NULL;
            struct in_device *in_dev;

            if (np->dev_name)
                    ndev = dev_get_by_name(np->dev_name);
            if (!ndev) {
                    printk(KERN_ERR "%s: %s doesn't exist, aborting.\n",
                           np->name, np->dev_name);
                    return -1;
            }
            if (!ndev->poll_controller) {
                    printk(KERN_ERR "%s: %s doesn't support polling, aborting.\n",
                           np->name, np->dev_name);
                    goto release;
            }
            ...

It appears your eth0 driver is forcedeth.c, right? And if configured properly, it does have a poll_controller interface -- in forcedeth.c:nv_probe() there's this:

    #ifdef CONFIG_NET_POLL_CONTROLLER
            dev->poll_controller = nv_poll_controller;
    #endif

and CONFIG_NET_POLL_CONTROLLER is turned on by default in RHEL4 kernels.

Are you rebuilding kernels, or drivers, or doing anything out of the ordinary?
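To make the failure path above concrete, here is a minimal userspace sketch of the same check. It is not kernel code: the `fake_*` types and functions are hypothetical stand-ins invented for illustration, and the -1 return for the "doesn't support polling" branch is an assumption, since the excerpt only shows `goto release`. It shows why a driver that never assigns `poll_controller` makes setup fail before anything else happens.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, stripped-down stand-in for struct net_device. */
struct fake_net_device {
    const char *name;
    /* NULL when the driver never registered a polling hook. */
    void (*poll_controller)(struct fake_net_device *dev);
};

/* Hypothetical stand-in for struct netpoll ("netdump", "netconsole", ...). */
struct fake_netpoll {
    const char *name;
    struct fake_net_device *dev; /* NULL models "device doesn't exist" */
};

/* A do-nothing polling hook, standing in for a driver's real one. */
void fake_poll_controller(struct fake_net_device *dev)
{
    (void)dev;
}

/* Mirrors the two early checks quoted above; returns 0 on success.
 * The -1 on the no-polling branch is an assumption of this sketch. */
int fake_netpoll_setup(const struct fake_netpoll *np)
{
    if (np->dev == NULL) {
        fprintf(stderr, "%s: device doesn't exist, aborting.\n", np->name);
        return -1;
    }
    if (np->dev->poll_controller == NULL) {
        fprintf(stderr, "%s: %s doesn't support polling, aborting.\n",
                np->name, np->dev->name);
        return -1;
    }
    return 0;
}
```

Calling `fake_netpoll_setup()` on a device whose `poll_controller` is NULL prints the same "doesn't support polling" message seen in /var/log/messages, while a device initialized with `fake_poll_controller` passes the check.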
Created attachment 125145 [details] sysreport (nvnet driver)
There is one thing: I boot into PXE first, then choose a hard disk boot from the PXE menu. (The system is installed frequently via PXE.) The behavior doesn't change if I boot directly from the hard disk. (Again using the forcedeth driver.) Following your hint about the ethernet driver, I tested the same thing on a Sun Ultra 40 Workstation. (Similar configuration, forcedeth driver.) This system shows the same error. Back on the XW9300, I configured NVIDIA's nvnet driver. This shows the same failure on the command line, but I have attached another sysreport from this configuration.
Tom, Jeff, I have no idea what's going on here -- for whatever reason the dev->poll_controller is not being initialized? Is there something special about the forcedeth driver that doesn't support netpoll? Dave
The forcedeth driver in 2.6.9-22 does not support netpoll. Netpoll support was added in kernel 2.6.9-22.16, and should be available in the next update. Having said that, I don't have any forcedeth hardware to use for testing.
> #ifdef CONFIG_NET_POLL_CONTROLLER
>         dev->poll_controller = nv_poll_controller;
> #endif
>
> and CONFIG_NET_POLL_CONTROLLER is turned on by default in RHEL4 kernels.
>
> Are you rebuilding kernels, or drivers, or doing anything out of the ordinary?

Sorry about that -- I was looking at a 2.6.9-24 source tree...
In my testing, the latest kernel from here resolves this bug: http://people.redhat.com/linville/kernels/rhel4
I can confirm that the kernel from kernel-smp-2.6.9-34.EL.jwltest.119.x86_64.rpm resolves the problem on both systems. (Sun Ultra 40 and HP XW9300 using forcedeth version 0.52 as driver.)