Bug 75149 - Adaptec AHA 1200A HPT370A deadlock
Summary: Adaptec AHA 1200A HPT370A deadlock
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-10-04 19:57 UTC by Mika Ilmaranta
Modified: 2007-04-18 16:47 UTC
0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-06-27 22:02:52 UTC
Embargoed:



Description Mika Ilmaranta 2002-10-04 19:57:57 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1a) Gecko/20020702

Description of problem:
It seems that the HPT370A on the Adaptec AHA 1200A causes both the stock rh7.3
kernel and the upgraded kernel-2.4.18-10 to deadlock under heavy disk I/O.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Do nothing. Deadlock in 1 to 16 days.
2. Run rpm --rebuild kernel-2.4.18-10.src.rpm in a loop. Deadlock in 1 to 2 days.
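
The rebuild loop from step 2 could be scripted roughly as follows. This is only a sketch, not the script the reporter actually used: the function name stress_loop, the pass limit argument, and the log path in the usage comment are all assumptions made for illustration. Each pass is logged before it starts, so that after a hang the last log line shows roughly when the machine stopped.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): run a command
# repeatedly, logging each pass, until it fails or a pass limit is reached.
# Usage: stress_loop MAX_PASSES COMMAND [ARGS...]   (MAX_PASSES=0: no limit)
stress_loop() {
    max_passes=$1
    shift
    pass=0
    while [ "$max_passes" -eq 0 ] || [ "$pass" -lt "$max_passes" ]; do
        pass=$((pass + 1))
        echo "pass $pass started: $(date)"
        # Ask the kernel to flush buffered writes, so the log line is more
        # likely to be on disk if the machine hangs during this pass.
        sync
        "$@" || { echo "pass $pass failed"; return 1; }
    done
    echo "completed $pass passes"
}

# Example invocation (source RPM location and log path are assumptions):
# stress_loop 0 rpm --rebuild kernel-2.4.18-10.src.rpm >> /var/tmp/stress.log 2>&1
```

With a limit of 0 the loop runs until the command fails or the machine locks up, which matches the "in a loop" wording of step 2.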
	

Actual Results:  Deadlock. No OOPS or anything enlightening in any logs;
a varying number of disk LEDs stay constantly on.

Expected Results:  Normal operation.

Additional info:

Every piece of HW has been changed, except three of the hard disks
(PSU, NICs, SDRAMs, UDMA cables, motherboard & CPU, and most recently the
AHA 1200A, changed from revision A to revision B).

Original HW configuration:
intel celeron 800MHz
abit sa6r hpt370 on-board
adaptec 1200a rev a (had its own irq)
4 x 20GB ibm disks
at-2500tx nic (rtl8139c)

HW configuration now:
intel celeron 1.2GHz
asus tusl2-c
adaptec 1200a rev b (shares irq with promise card)
promise ultra133 tx2
3 x 20GB ibm disks
1 x 60GB ibm disk (replaced one 20GB disk because smartctl reported 163
reallocated sectors)
2 x at2500-tx (rtl8139c)

I did order a new promise card to replace the adaptec altogether, but since its
delivery date might be as much as 3 weeks from now, I compiled a 2.4.20-pre9
kernel for the beast and I'm now running rpm --rebuild kernel-2.4.18-10 in a
loop again.

Comment 1 Mika Ilmaranta 2002-10-05 08:52:36 UTC
Step 2 under Steps to Reproduce should have read:
 1 hour - 2 days. Not 1 - 2 days.

So far it seems that 2.4.20-pre9 solves this problem, which I have been hunting
for very nearly four months.

After installing 2.4.20-pre9 the beast has survived seven rpm --rebuild
kernel-2.4.18-10 runs and five consecutive full backups.

(
Deadlock soon after a backup was the original problem ... which I had
successfully forgotten while trying to find some HW fault instead of upgrading
the kernel myself.
Well, now I have spare HW to build more servers :)
)

Still running tests, but with kernel-2.4.18-10 the system would have deadlocked
several times already.

Well, sometimes you just climb up the tree the wrong way.


Comment 2 Mika Ilmaranta 2002-10-05 15:09:06 UTC
The machine has now completed several full backups and kernel rebuilds with
2.4.20-pre9. Thus I think I have found the cause at last, unless the following
week(s) tell me otherwise.

So far I think this long-lasting problem is finally solved.



