Description of problem:
All Netfinity 4500R, ServeRAID 4L, firmware 4.70.17 with redhat 7.1 + updates
will hang after about a day. No users, no network, no serial, no cronjobs
other than the standard ones shipped with a RedHat *minimal install*, like
updatedb, logrotate etc. Only a display attached.

How reproducible:
Always

Steps to Reproduce:
1. Take a new Netfinity 4500R + ServeRAID 4L card + three 18.4 GB disks
2. Install (insert CD and reboot, rest is automatic)
   a) IBM UpdateExpress CD as per IBM website
      http://www.pc.ibm.com/qtechinfo/MIGR-4VVNTP.html
   b) IBM ServeRAID 4.70 CD as per IBM website
      http://www.pc.ibm.com/qtechinfo/MIGR-4X7R6P.html, iso is at
      ftp://ftp.pc.ibm.com/pub/pccbbs/pc_servers/25p1574.iso
   c) Redhat 7.1 Linux CD + updates (+ kickstart)

Actual Results:
After about a day the machine will crash.

Expected Results:
The machine should have stayed up!

Additional info:

Symptoms:
1) Several `(ips0) Resetting controller' entries in kernel log
2) `device events' counters continuously increasing for the SCSI disk devices
3) On several occasions, after a cold boot, the machine rebooted immediately
   after displaying the SCSI messages *). So far the second boot has always
   succeeded.
4) After one to two days, left unattended, the machine hangs.

*) right after this:

scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.4/5.2.0
        <Adaptec AIC-7899 Ultra 160/m SCSI host adapter>
scsi1 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.4/5.2.0
        <Adaptec AIC-7899 Ultra 160/m SCSI host adapter>
*reboot*
(Normally it would show ... Loading ips module)

Here's an overview of different netfinities:

server        device events          firmware  longest    redhat       kernel
              (disk1, disk2, disk3)  version   uptime     version      version
-----------------------------------------------------------------------------
number1        0,  0,  0             4.00.06   225 days   6.2+updates  2.2.16-3smp
number2        0,  0,  0             4.50.05   >100 days  6.2+updates  2.2.16-3smp
number3        0,  1,  0             4.50.05   >100 days  6.2+updates  2.2.16-3smp
replacement   10, 16, 31             4.70.17   1 day      7.1+updates  2.4.3-12
senttothelab  14, 19, 45             4.70.17   2 days     7.1          2.4.2-2

All machines are 2CPU Netfinity 4500R, ServeRAID 4L, 512MB or 1GB main memory,
except for number1 which only has ServeRAID 3L.

Below I'll attach a history of what happened prior to this.

Previous report (way too much detail, piled up as we went along) is at
http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=43461

A quote from there:
`There are lots of folks now using Red Hat 7.1 and are not seeing this'

Could anybody running this same configuration *) for longer than a week
please, please send me a note!

*) I.e. netfinity 4500R (aka xSeries 340 eServer), ServeRAID 4L,
   firmware 4.70.17, redhat 7.1

TIA,
Harold.
Created attachment 25772
Summary of the history of this case.
The issue has been resolved since the end of 2001. Around that time IBM
released bugfix RAID firmware 4.80.26.

The machine has since been running without a problem under a load of 2-3 for
well over a year now. Initially kernel 2.4.9 would cause it to swap
aggressively, which sometimes made it crawl, but it never went down. That swap
problem went away with an upgrade to 2.4.18.

Sorry for not updating! Only when Alan changed the subject on
Tue, 10 Jun 2003 did I notice that it was still open.