From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)

Description of problem:
Dual Athlon SMP motherboard, 2100 processors
AMD K7D Master dual CPU motherboard
2.4.18-19.7.xsmp kernel
DPT PM2654U2-R PCI RAID controller with 3 controllers (daughtercard)
SCSI RAID 5
Enlight 7100 RAID enclosure on one controller
Seagate DAT drive on second controller

The system operates normally unless the DAT drive is used. 40% of the time the DAT drive is used (especially during a cron job), the system hangs: no video, no drive activity. A hard power cycle is the only way to bring it back. This looks like it might be a timing window of some kind. This exact same RAID controller worked fine on a dual PIII 600 MHz box prior to upgrading it to dual Athlon. Red Hat 6.2 was the OS at that time. Did the i2o drivers in the kernel change?

Version-Release number of selected component (if applicable):

How reproducible:
Sometimes

Steps to Reproduce:
1. Set cron job to back system up daily (bru 17.2) (bru -c -vvvv /)
2. Go in to work early to reboot system.
3.

Actual Results:
Darkness on server. No IP functionality, no display, no drive activity.

Expected Results:
Backup should complete, cron should finish other tasks.

Additional info:
Deleting the cron job that runs the tape backup stopped the hangs.
Please note the motherboard is MSI not AMD.
Going back through my files, the original use of this card on a Red Hat 6.2 server used the I2O drivers from the vendor, DPT. They were not part of the OS. I'm going to move the tape drive to the same controller as the RAID 5 array today and test some more. Copies of the original driver are available on request.
Moving the DAT drive to the same controller as the RAID array has allowed backups to be done manually. I will put it back in cron tonight. It's looking like a kernel I2O timing issue between the first and second controllers on the PM2654U2-R SCSI card.
dpt_i2o (the driver we ship in current kernels) is the same vendor driver as people added to old releases. The core i2o code will handle the dpt ok but the dpt speaks a slightly odd dialect of i2o and the vendor driver is better for the card.
Odd dialect of i2o or not, I would still have to consider something that kills a system with no debug output or other indications a rather serious software bug. Since it dies so hard, I cannot tell you whether it is the i2o driver or the kernel's interaction with it. We ran the same configuration with the older kernel (RH 6.2) and the vendor i2o driver with no issues for a couple of years. Since this is a production box, I cannot afford to make many changes to try to debug exactly where it is hanging.
Unclear whether the box is running i2o or dpt_i2o; the installer should have favoured dpt_i2o, however.
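One low-risk way to settle which driver is in use, without changing anything on a production box, is to look at the loaded module list. The following is only a rough sketch, not something from the reporter's system: it assumes the drivers are built as modules (built-in drivers do not appear in /proc/modules) and simply matches any module whose name contains "i2o". The function names are made up for illustration.

```python
def i2o_modules(proc_modules_text):
    """Return names of loaded modules whose name contains 'i2o'.

    Only the first whitespace-separated field of each line (the module
    name) is inspected, so the remaining columns do not matter.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and "i2o" in fields[0].lower():
            names.append(fields[0])
    return names


def describe(names):
    """Summarise which I2O driver family appears to be loaded."""
    if "dpt_i2o" in names:
        return "vendor driver dpt_i2o is loaded"
    if names:
        return "generic i2o modules loaded: " + ", ".join(names)
    # Assumption: absence here may just mean the driver is built in.
    return "no i2o modules loaded (drivers may be built in)"


if __name__ == "__main__":
    try:
        with open("/proc/modules") as fh:
            text = fh.read()
    except OSError:
        text = ""  # not a Linux box, or /proc is not mounted
    print(describe(i2o_modules(text)))
```

This only reads /proc/modules, so it should be safe to run on the live server. It does not tell you which driver has actually claimed the card, only which modules are present.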
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/