Bug 139465

Summary: em64t/ia32e kernel panic: 'interrupt handler - not syncing' during heavy network I/O
Product: Red Hat Enterprise Linux 3
Component: kernel
Version: 3.0
Hardware: ia32e
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Roderick Constance <rconstance>
Assignee: Larry Woodman <lwoodman>
CC: jparadis, petrides, riel
Doc Type: Bug Fix
Last Closed: 2005-05-18 13:28:35 UTC

Description Roderick Constance 2004-11-16 03:55:51 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040510

Description of problem:
I have a dual-processor EM64T/ia32e-based system.  Only one processor
is installed and hyperthreading is off.  The system also has an
Adaptec ANA620xx/ANA69011A four-port NIC that uses the starfire
driver.  I'm using a flood-ping test to generate heavy I/O between the
looped-back ports of the starfire card.  During this test the system
locks up hard with a kernel panic in less than 5 minutes.

Kernel panic: map_single : could not allocate software IO TLB (106 bytes)
In interrupt handler - not syncing

A reboot of the system is necessary.  So far I've been able to
reproduce this error in only 3 of 8 tries.  Perhaps this is a timing
problem?  It happens in either SMP or UP mode using the x86_64 kernel
(2.4.21-20.EL).  It seems the starfire card can do 64-bit DMA.



Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux 3 Workstation (Update 3)

How reproducible:
Sometimes

Steps to Reproduce:
1. Install the 4-port NIC
2. Initialise each interface
3. Generate heavy network load
    

Actual Results:  Kernel panic: map_single : could not allocate software IO TLB (106 bytes)
In interrupt handler - not syncing

Expected Results:  Packets should continue to be transmitted/received,
the kernel should not panic, and the machine should not lock up.

Additional info:

I can provide any dmesg or log files upon request.

Comment 1 Larry Woodman 2004-11-17 02:48:47 UTC
No need for further info; you ran out of swiotlb entries.  The default
is 1024 pages or 4MB.  If you add "swiotlb=2048" on the boot command
line in /boot/grub/grub.conf you will double the size of the IO TLB
to 8MB.  I have tested that architecture up to 64MB or
"swiotlb=16384".

Let us know how that works out.


Thanks, Larry Woodman
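
The sizes quoted in Comment 1 follow from simple arithmetic, sketched
below.  The 4 KB unit size is an assumption taken from the figures in
that comment (1024 pages = 4MB); the function name is illustrative and
not part of any kernel interface.

```python
# Assumption (from Comment 1): each swiotlb unit is one 4 KB page.
PAGE_SIZE_KB = 4


def swiotlb_size_mb(entries: int) -> int:
    """Return the IO TLB size in MB for a given swiotlb= value."""
    return entries * PAGE_SIZE_KB // 1024


# Values mentioned in Comment 1: the default, the suggested setting,
# and the largest size reported as tested.
for entries in (1024, 2048, 16384):
    print(f"swiotlb={entries}: {swiotlb_size_mb(entries)} MB")
```

Under that assumption the three values come out to 4, 8 and 64 MB,
matching the numbers given in the comment.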

Comment 2 Ernie Petrides 2004-12-23 23:15:43 UTC
A fix for this problem has just been committed to the RHEL3 U5
patch pool this evening (in kernel version 2.4.21-27.5.EL).


Comment 3 Roderick Constance 2005-01-17 20:16:16 UTC
Passing the "swiotlb=2048" boot parameter fixed the problem for me.
Thanks.
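
One way to confirm that the parameter actually reached the running
kernel is to inspect the boot command line, which Linux exposes in
/proc/cmdline.  The following is only a minimal sketch; the helper
names are hypothetical and the parsing assumes the plain
"swiotlb=<number>" form used in this report.

```python
import re
from typing import Optional


def swiotlb_from_cmdline(cmdline: str) -> Optional[int]:
    """Extract the numeric swiotlb= value from a kernel command line.

    Returns None when the parameter is absent.
    """
    match = re.search(r"(?:^|\s)swiotlb=(\d+)", cmdline)
    return int(match.group(1)) if match else None


def current_swiotlb(path: str = "/proc/cmdline") -> Optional[int]:
    """Read the running kernel's command line and parse swiotlb=."""
    with open(path) as f:
        return swiotlb_from_cmdline(f.read())
```

On a Linux system, calling current_swiotlb() after a reboot returns
2048 if the setting from Comment 1 took effect, or None if the
parameter is not on the command line.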

Comment 4 Tim Powers 2005-05-18 13:28:35 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html