Bug 240190 - kernel oops when loading netxen_nic 10Gb Ethernet driver
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 6
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Konrad Rzeszutek
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-05-15 17:59 UTC by Chuck Hartley
Modified: 2007-11-30 22:12 UTC

Fixed In Version: FC7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-09-19 20:39:54 UTC
Type: ---
Embargoed:


Attachments
Boot log capture (3.73 KB, text/plain)
2007-05-15 18:03 UTC, Chuck Hartley

Description Chuck Hartley 2007-05-15 17:59:20 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.0.10) Gecko/20070313 Fedora/1.5.0.10-5.fc6 Firefox/1.5.0.10

Description of problem:
When booting the kernel on an x86-64 SMP system, we get a kernel oops:

Unable to handle kernel NULL pointer dereference at 000000000000 
 [<ffffffff8815b1e8>] :netxen_nic:netxen_pinit_from_rom+0x35e/0x491



Version-Release number of selected component (if applicable):
kernel-2.6.18-1.2798.fc6

How reproducible:
Always


Steps to Reproduce:
1. Have a NetXen NXB-10GCX4 10GbE NIC installed in the system
2. Try to boot

Actual Results:
Kernel oops due to null pointer dereference. 

Expected Results:
Kernel boots and 10GbE driver is loaded correctly.

Additional info:
Starting udev: Unable to handle kernel NULL pointer dereference at 000000000000 
 [<ffffffff8815b1e8>] :netxen_nic:netxen_pinit_from_rom+0x35e/0x491
PGD 43b40d067 PUD 43b40e067 PMD 0 
Oops: 0002 [1] SMP 

---snip ---

Call Trace:                                                                     
 [<ffffffff881598f1>] :netxen_nic:netxen_nic_probe+0x68b/0x9ba                  
 [<ffffffff802fbecb>] sysfs_make_dirent+0x1b/0x8e                               
 [<ffffffff80349e2d>] pci_device_probe+0xcd/0x134                               
 [<ffffffff803a9d60>] really_probe+0x87/0x106                                   
 [<ffffffff803a9fb1>] __driver_attach+0x90/0xcd                                 
 [<ffffffff803a9f21>] __driver_attach+0x0/0xcd                                  
 [<ffffffff803a9f21>] __driver_attach+0x0/0xcd                                  
 [<ffffffff803a923a>] bus_for_each_dev+0x43/0x6e                                
 [<ffffffff803a9582>] bus_add_driver+0x6b/0x18d                                 
 [<ffffffff8034a02c>] __pci_register_driver+0x85/0xba                           
 [<ffffffff8029f98d>] sys_init_module+0x1797/0x1904                             
 [<ffffffff8025a11e>] system_call+0x7e/0x83
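
For readers decoding the log above: on x86, the value printed after `Oops:` is the page-fault error code in hexadecimal. The sketch below decodes its three lowest bits. The struct and function names are hypothetical and exist only for illustration; they are not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical helper type: the three low bits of an x86 page-fault error code. */
struct pf_error {
    bool protection_fault; /* bit 0: 0 = page not present, 1 = protection violation */
    bool write;            /* bit 1: 0 = read access, 1 = write access */
    bool user_mode;        /* bit 2: 0 = fault in kernel mode, 1 = fault in user mode */
};

/* Decode the low three bits of the error code printed on the "Oops:" line. */
static struct pf_error decode_pf_error(unsigned long code)
{
    struct pf_error e;
    e.protection_fault = (code & 0x1UL) != 0;
    e.write            = (code & 0x2UL) != 0;
    e.user_mode        = (code & 0x4UL) != 0;
    return e;
}
```

Applied to `0002`, this reads as a kernel-mode write to a page that is not present, which is consistent with a write through a NULL pointer.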

Comment 1 Chuck Hartley 2007-05-15 18:03:03 UTC
Created attachment 154755
Boot log capture

Captured via serial console connection since errors do not end up in
/var/log/messages

Comment 2 Chuck Ebbert 2007-05-15 19:18:01 UTC
> Version-Release number of selected component (if applicable):
> kernel-2.6.18-1.2798.fc6

please try a kernel that is not from the Paleolithic era

Comment 3 Chuck Hartley 2007-05-16 15:11:33 UTC
My bad - as you can see from the log file attachment, the kernel version having the
problem is 2.6.20-1.2948.fc6, which does not boot.


Comment 4 Chuck Ebbert 2007-05-16 20:40:22 UTC
drivers/net/netxen/netxen_nic_init.c:

                        if (ADDR_IN_WINDOW1(off)) {
                                writel(buf[i].data,
                                       NETXEN_CRB_NORMALIZE(adapter, off));
                        } else {
                                netxen_nic_pci_change_crbwindow(adapter, 0);
Line 566 ==>                    writel(buf[i].data,
                                       pci_base_offset(adapter, off));

                                netxen_nic_pci_change_crbwindow(adapter, 1);
                        }

pci_base_offset() returned NULL and the result was not checked, causing a NULL
pointer dereference.

(offset is in rbx)
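
The general pattern being described (check an address lookup for NULL before writing through it) can be sketched in plain user-space C. This is only an illustration under stated assumptions: `fake_adapter`, `fake_base_offset()` and `checked_write32()` are hypothetical stand-ins, not the driver's real types or functions, and this is not the fix that went into the 2.6.21 driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for adapter state: a single mapped region. */
struct fake_adapter {
    uint8_t *base; /* start of the mapped region, or NULL if unmapped */
    size_t   len;  /* size of the mapped region in bytes */
};

/*
 * Hypothetical stand-in for an offset-to-address lookup.
 * Returns NULL when a 32-bit access at 'off' would fall outside the region.
 */
static uint8_t *fake_base_offset(struct fake_adapter *a, size_t off)
{
    if (a->base == NULL || off >= a->len || a->len - off < sizeof(uint32_t))
        return NULL;
    return a->base + off;
}

/*
 * Write a 32-bit value at 'off'. Returns 0 on success, or -1 if the
 * lookup failed, instead of dereferencing a NULL pointer.
 */
static int checked_write32(struct fake_adapter *a, size_t off, uint32_t val)
{
    uint8_t *p = fake_base_offset(a, off);

    if (p == NULL)
        return -1;
    memcpy(p, &val, sizeof(val));
    return 0;
}
```

A caller would treat a `-1` return as a failure and stop, rather than continuing with the write.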

Comment 6 Mithlesh Thukral 2007-05-17 14:13:42 UTC
Just before the 2.6.20 development window closed, some sparse changes were
checked into the tree. After this the tree did not work, and the fixes were
large, so they never made it into the 2.6.20 tree.
The fixes were checked in during the 2.6.21 development window.
Can you try the code in the 2.6.21 tree and see whether you still get
this problem?
AFAIK, I have never seen a crash while loading the 2.6.21 NetXen driver on x86
and x86_64 machines.

Comment 7 Chuck Hartley 2007-05-21 18:34:35 UTC
I grabbed the driver from the 2.6.21 kernel tree and was able to build and 
install successfully.  Machines with the NetXen card now boot and I am able to 
ping between them.  Thanks for all your help.

Comment 10 Konrad Rzeszutek 2007-07-02 18:37:56 UTC
Chuck Hartley,

There is a 2.6.21-1.3125.fc7 kernel available. Can you test that to make sure it
has the fix?

Comment 11 Konrad Rzeszutek 2007-09-19 20:39:54 UTC
Closing BZ as FIXED in CURRENTRELEASE

