From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.0.10) Gecko/20070313 Fedora/1.5.0.10-5.fc6 Firefox/1.5.0.10

Description of problem:
When booting kernel on X86-64 SMP system, we get a kernel oops:
Unable to handle kernel NULL pointer dereference at 000000000000
[<ffffffff8815b1e8>] :netxen_nic:netxen_pinit_from_rom+0x35e/0x491

Version-Release number of selected component (if applicable):
kernel-2.6.18-1.2798.fc6

How reproducible:
Always

Steps to Reproduce:
1. Have a NetXen NXB-10GCX4 10GbE NIC installed in system
2. Try to boot
3.

Actual Results:
Kernel oops due to null pointer dereference.

Expected Results:
Kernel boots and 10GbE driver is loaded correctly.

Additional info:
Starting udev: Unable to handle kernel NULL pointer dereference at 000000000000
[<ffffffff8815b1e8>] :netxen_nic:netxen_pinit_from_rom+0x35e/0x491
PGD 43b40d067 PUD 43b40e067 PMD 0
Oops: 0002 [1] SMP
---snip ---
Call Trace:
[<ffffffff881598f1>] :netxen_nic:netxen_nic_probe+0x68b/0x9ba
[<ffffffff802fbecb>] sysfs_make_dirent+0x1b/0x8e
[<ffffffff80349e2d>] pci_device_probe+0xcd/0x134
[<ffffffff803a9d60>] really_probe+0x87/0x106
[<ffffffff803a9fb1>] __driver_attach+0x90/0xcd
[<ffffffff803a9f21>] __driver_attach+0x0/0xcd
[<ffffffff803a9f21>] __driver_attach+0x0/0xcd
[<ffffffff803a923a>] bus_for_each_dev+0x43/0x6e
[<ffffffff803a9582>] bus_add_driver+0x6b/0x18d
[<ffffffff8034a02c>] __pci_register_driver+0x85/0xba
[<ffffffff8029f98d>] sys_init_module+0x1797/0x1904
[<ffffffff8025a11e>] system_call+0x7e/0x83
Created attachment 154755
Boot log capture

Captured via serial console connection since errors do not end up in /var/log/messages
> Version-Release number of selected component (if applicable):
> kernel-2.6.18-1.2798.fc6

please try a kernel that is not from the Paleolithic era
My bad - as you can see from the log file attachment, the kernel version with the problem is 2.6.20-1.2948.fc6, which does not boot.
drivers/net/netxen/netxen_nic_init.c:

        if (ADDR_IN_WINDOW1(off)) {
                writel(buf[i].data, NETXEN_CRB_NORMALIZE(adapter, off));
        } else {
                netxen_nic_pci_change_crbwindow(adapter, 0);
Line 566 ==>    writel(buf[i].data, pci_base_offset(adapter, off));
                netxen_nic_pci_change_crbwindow(adapter, 1);
        }

pci_base_offset() returned NULL and the result was not checked, causing the NULL dereference. (offset is in rbx)
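For illustration only, here is a minimal userspace sketch of the kind of check that was missing at line 566: look up the address first, and refuse to write if the lookup returns NULL. This is not the driver's code and not the fix that went into 2.6.21. The names mock_adapter, mock_base_offset and mock_checked_write, the window constants, and the choice of -EINVAL are all made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the adapter: one small mapped register window. */
#define MOCK_WINDOW_START 0x1000u
#define MOCK_WINDOW_WORDS 16u

struct mock_adapter {
    uint32_t regs[MOCK_WINDOW_WORDS];
};

/*
 * Hypothetical analogue of an offset-to-address lookup: returns a pointer
 * into the mapped window, or NULL when the offset is unaligned or falls
 * outside the window.
 */
static uint32_t *mock_base_offset(struct mock_adapter *adapter, uint32_t off)
{
    if (off < MOCK_WINDOW_START ||
        off >= MOCK_WINDOW_START + MOCK_WINDOW_WORDS * sizeof(uint32_t))
        return NULL;
    if (off % sizeof(uint32_t) != 0)
        return NULL;
    return &adapter->regs[(off - MOCK_WINDOW_START) / sizeof(uint32_t)];
}

/*
 * Write data at off only if the lookup succeeded. Returns 0 on success and
 * -EINVAL (an arbitrary choice for this sketch) when there is no mapping,
 * instead of dereferencing a NULL pointer.
 */
static int mock_checked_write(struct mock_adapter *adapter, uint32_t off,
                              uint32_t data)
{
    uint32_t *addr = mock_base_offset(adapter, off);

    if (addr == NULL)
        return -EINVAL;
    *addr = data;
    return 0;
}
```

With a check like this, a caller walking a table of (offset, data) pairs could log and skip a bad offset, or abort the probe with an error, rather than oopsing during boot.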
Just before the 2.6.20 development window closed, some sparse changes were checked into the tree. After that the driver in the tree did not work, and the fixes were too large to make it into the 2.6.20 tree. They were checked into 2.6.21 once its development window opened. Can you try the code in the 2.6.21 tree and see if you still get this problem? AFAIK, I have never seen a crash while loading the 2.6.21 NetXen driver on x86 or x86_64 machines.
I grabbed the driver from the 2.6.21 kernel tree and was able to build and install successfully. Machines with the NetXen card now boot and I am able to ping between them. Thanks for all your help.
Chuck Hartley, There is a 2.6.21-1.3125.fc7 kernel available. Can you test that to make sure it has the fix?
Closing BZ as FIXED in CURRENTRELEASE