Bug 240190 - kernel oops when loading netxen_nic 10Gb Ethernet driver
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 6
Hardware: x86_64 Linux
Priority: medium
Severity: high
Assigned To: Konrad Rzeszutek
QA Contact: Brian Brock
Reported: 2007-05-15 13:59 EDT by Chuck Hartley
Modified: 2007-11-30 17:12 EST
Fixed In Version: FC7
Doc Type: Bug Fix
Last Closed: 2007-09-19 16:39:54 EDT


Attachments
Boot log capture (3.73 KB, text/plain)
2007-05-15 14:03 EDT, Chuck Hartley

Description Chuck Hartley 2007-05-15 13:59:20 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.0.10) Gecko/20070313 Fedora/1.5.0.10-5.fc6 Firefox/1.5.0.10

Description of problem:
When booting the kernel on an x86_64 SMP system, we get a kernel oops:

Unable to handle kernel NULL pointer dereference at 000000000000 
 [<ffffffff8815b1e8>] :netxen_nic:netxen_pinit_from_rom+0x35e/0x491



Version-Release number of selected component (if applicable):
kernel-2.6.18-1.2798.fc6

How reproducible:
Always


Steps to Reproduce:
1. Install a NetXen NXB-10GCX4 10GbE NIC in the system
2. Try to boot

Actual Results:
Kernel oops due to null pointer dereference. 

Expected Results:
Kernel boots and 10GbE driver is loaded correctly.

Additional info:
Starting udev: Unable to handle kernel NULL pointer dereference at 000000000000 
 [<ffffffff8815b1e8>] :netxen_nic:netxen_pinit_from_rom+0x35e/0x491
PGD 43b40d067 PUD 43b40e067 PMD 0 
Oops: 0002 [1] SMP 

---snip ---

Call Trace:
 [<ffffffff881598f1>] :netxen_nic:netxen_nic_probe+0x68b/0x9ba
 [<ffffffff802fbecb>] sysfs_make_dirent+0x1b/0x8e
 [<ffffffff80349e2d>] pci_device_probe+0xcd/0x134
 [<ffffffff803a9d60>] really_probe+0x87/0x106
 [<ffffffff803a9fb1>] __driver_attach+0x90/0xcd
 [<ffffffff803a9f21>] __driver_attach+0x0/0xcd
 [<ffffffff803a9f21>] __driver_attach+0x0/0xcd
 [<ffffffff803a923a>] bus_for_each_dev+0x43/0x6e
 [<ffffffff803a9582>] bus_add_driver+0x6b/0x18d
 [<ffffffff8034a02c>] __pci_register_driver+0x85/0xba
 [<ffffffff8029f98d>] sys_init_module+0x1797/0x1904
 [<ffffffff8025a11e>] system_call+0x7e/0x83
Comment 1 Chuck Hartley 2007-05-15 14:03:03 EDT
Created attachment 154755
Boot log capture

Captured via serial console connection since errors do not end up in
/var/log/messages
Comment 2 Chuck Ebbert 2007-05-15 15:18:01 EDT
> Version-Release number of selected component (if applicable):
> kernel-2.6.18-1.2798.fc6

Please try a kernel that is not from the Paleolithic era.
Comment 3 Chuck Hartley 2007-05-16 11:11:33 EDT
My bad - as you can see from the log file attachment, the kernel version having
the problem is 2.6.20-1.2948.fc6, which does not boot.
Comment 4 Chuck Ebbert 2007-05-16 16:40:22 EDT
drivers/net/netxen/netxen_nic_init.c:

                        if (ADDR_IN_WINDOW1(off)) {
                                writel(buf[i].data,
                                       NETXEN_CRB_NORMALIZE(adapter, off));
                        } else {
                                netxen_nic_pci_change_crbwindow(adapter, 0);
Line 566 ==>                    writel(buf[i].data,
                                       pci_base_offset(adapter, off));

                                netxen_nic_pci_change_crbwindow(adapter, 1);
                        }

pci_base_offset() returned NULL and the result was not checked, causing a NULL
pointer dereference in the writel() call.

(offset is in rbx)
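
To illustrate the kind of check that is missing, here is a minimal user-space sketch. It is not the driver code and not the fix that went into 2.6.21; fake_adapter, fake_base_offset() and checked_write() are hypothetical stand-ins for the adapter, pci_base_offset() and the writel() call site. It only shows the pattern of testing the translated address before writing through it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the adapter: one small mapped register window. */
struct fake_adapter {
    uint32_t regs[16];      /* pretend MMIO window (64 bytes) */
    size_t   window_start;  /* first offset covered by the window */
};

/*
 * Translate a register offset into a pointer inside the window.
 * Returns NULL when the offset is outside the window or misaligned,
 * mimicking an address translation helper that can fail.
 */
static uint32_t *fake_base_offset(struct fake_adapter *a, size_t off)
{
    size_t rel;

    if (off < a->window_start)
        return NULL;
    rel = off - a->window_start;
    if (rel >= sizeof(a->regs) || rel % sizeof(uint32_t) != 0)
        return NULL;
    return &a->regs[rel / sizeof(uint32_t)];
}

/*
 * Write val at off only if the translation succeeded.
 * Returns 0 on success, -1 if the offset could not be translated.
 */
static int checked_write(struct fake_adapter *a, size_t off, uint32_t val)
{
    uint32_t *addr = fake_base_offset(a, off);

    if (addr == NULL) {
        fprintf(stderr, "offset 0x%zx is not mapped, skipping write\n", off);
        return -1;
    }
    *addr = val;
    return 0;
}
```

With window_start set to 0x100, checked_write(&a, 0x104, v) stores v in regs[1], while an offset such as 0x10 is rejected with -1 instead of writing through a NULL pointer.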
Comment 6 Mithlesh Thukral 2007-05-17 10:13:42 EDT
Just before the 2.6.20 development window closed, some sparse changes were
checked into the tree. After this the tree did not work, and the fixes were so
large that they never made it into the 2.6.20 tree.
The fixes were checked into 2.6.21 while its development window was open.
Can you try the code in the 2.6.21 tree and see if you still get this problem?
AFAIK, I have never seen a crash while loading the 2.6.21 NetXen driver on x86
and x86_64 machines.
Comment 7 Chuck Hartley 2007-05-21 14:34:35 EDT
I grabbed the driver from the 2.6.21 kernel tree and was able to build and 
install successfully.  Machines with the NetXen card now boot and I am able to 
ping between them.  Thanks for all your help.
Comment 10 Konrad Rzeszutek 2007-07-02 14:37:56 EDT
Chuck Hartley,

There is a 2.6.21-1.3125.fc7 kernel available. Can you test that to make sure it
has the fix?
Comment 11 Konrad Rzeszutek 2007-09-19 16:39:54 EDT
Closing BZ as FIXED in CURRENTRELEASE
