Bug 82277 - (I2O)system hang. No video no hard drive activity. SCSI Timing issue?
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: athlon
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2003-01-20 19:52 UTC by Marco Coelho
Modified: 2005-10-31 22:00 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-09-30 15:40:25 UTC
Embargoed:



Description Marco Coelho 2003-01-20 19:52:03 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)

Description of problem:
Dual athlon smp motherboard 2100 processors
AMD K7D Master dual CPU motherboard

2.4.18-19.7.xsmp kernel

DPT PM2654U2-R PCI RAID Controller with 3 controllers (daughtercard).

SCSI RAID 5 array in an Enlight 7100 RAID enclosure on one controller

Seagate DAT Drive on second controller.

The system operates normally unless the DAT drive is used.  About 40% of the 
times the DAT drive is used (especially from a cron job), the system hangs: no 
video, no drive activity.  A hard power cycle is the only way to bring it back.  
This looks like it might be a timing window of some kind.

This exact same RAID controller worked fine in a dual PIII 600 MHz box before 
the upgrade to dual Athlon.  Red Hat Linux 6.2 was the OS at that time.

Did the i2o drivers in the kernel change?

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1.  Set cron job to back system up daily (bru 17.2) (bru -c -vvvv /)
2.  Go in to work early to reboot system.

Actual Results:  Darkness on server.  No IP connectivity, no display, no drive 
activity.

Expected Results:  Backup should complete, cron should finish other tasks.

Additional info:

Deleting the CRON job that makes the drive back up stopped the hang issue.

Comment 1 Marco Coelho 2003-01-20 19:57:50 UTC
Please note the motherboard is MSI not AMD.

Comment 2 Marco Coelho 2003-01-27 20:15:50 UTC
Going back through my files, the original use of this card on a Red Hat 6.2 
server used the I2O drivers from the DPT vendor.  

They were not part of the OS.  I'm going to move the tape drive to the same 
controller as the RAID 5 array today and test some more.  Copies of the 
original driver are available on request.

Comment 3 Marco Coelho 2003-01-28 15:24:04 UTC
Moving the DAT drive to the same controller as the RAID array has allowed 
backups to be done manually.  I will put it back in CRON tonight.  It's looking 
like a kernel I2O timing issue between the first and second controllers on the 
PM2654U2-R SCSI card.

Comment 4 Alan Cox 2003-01-28 16:58:22 UTC
dpt_i2o (the driver we ship in current kernels) is the same vendor driver as
people added to old releases. The core i2o code will handle the dpt ok but 
the dpt speaks a slightly odd dialect of i2o and the vendor driver is better for
the card.

Comment 5 Marco Coelho 2003-01-28 21:35:43 UTC
Odd dialect of i2o or not, I would still have to consider something that kills 
a system with no debug output or other indications a rather serious software 
bug.  Since it does die so hard, I cannot tell you whether it is the i2o 
driver or the kernel interaction with it.

We ran the same configuration with the older kernel (Red Hat 6.2) and the vendor 
i2o driver with no issues for a couple of years.

Since this is a production box, I cannot afford to make many changes to try to 
debug where it is hanging exactly.

Comment 6 Alan Cox 2003-06-08 19:55:24 UTC
It is unclear whether the i2o or the dpt_i2o driver is in use here; the 
installer should have favoured dpt_i2o, however.
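
One way to settle which driver is actually loaded on the affected box is to look at /proc/modules. The following is only a minimal sketch, not something taken from this report: the helper names (loaded_i2o_modules, describe, report) are hypothetical, and it assumes the drivers are built as modules and that every relevant module has "i2o" in its name, as dpt_i2o does. A driver compiled into the kernel would not show up this way.

```python
def loaded_i2o_modules(proc_modules_text):
    """Return the names of loaded modules whose name contains 'i2o'.

    Each line of /proc/modules starts with the module name, so only the
    first whitespace-separated field is inspected.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and "i2o" in fields[0]:
            names.append(fields[0])
    return names


def describe(names):
    """Summarise which kind of I2O driver the module list points to."""
    others = [n for n in names if n != "dpt_i2o"]
    if "dpt_i2o" in names and others:
        return "dpt_i2o and generic i2o modules are both loaded: " + ", ".join(others)
    if "dpt_i2o" in names:
        return "vendor driver dpt_i2o is loaded"
    if others:
        return "generic i2o modules are loaded: " + ", ".join(others)
    return "no i2o modules are loaded"


def report(path="/proc/modules"):
    """Read the module list from the running system and describe it."""
    with open(path) as f:
        return describe(loaded_i2o_modules(f.read()))
```

Calling print(report()) on the server would show which of the cases applies; the output of lsmod carries the same information.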


Comment 7 Bugzilla owner 2004-09-30 15:40:25 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/


