This was a fresh install of 7.1 on a former 7.0 system. I had no problems with the install itself -- the problem showed up immediately after the installation.

This box has a Tyan S1834 (Tiger 133) motherboard with one 833 MHz P3 and 384 MB of memory.

Cards:
  Matrox G450 (32 MB)
  Adaptec 2740 U/W
  Adaptec 2740 U
  3Com 3c594 (I think 594 -- not sure -- from memory) 10/100 Mbit network card
  Digi PC-Xem 16-port serial board
  Soundblaster AWE64

On the 2 Adaptec cards I have:

2740 U/W:
  id 00: QUANTUM Model: QM318000TD-SW   18 GB HD (fast/wide scsi2)
  id 03: PLEXTOR Model: CD-ROM PX-32TS  CD-ROM (narrow)
  id 04: HP Model: HP35480A             tape drive (narrow)
  id 05: YAMAHA Model: CRW8824S         CD-RW (narrow)
  id 08: IBM Model: DDRS-39130W         9 GB HD (fast/wide scsi2)
  id 09: IBM OEM Model: DFHSS4W         4.3 GB HD (fast/wide scsi2)

2740 U:
  id 00: UMAX Model: Astra 1200S        scanner
  id 05: HP Model: C1553A               tape drive (narrow)

On the udma66 controller I have one drive, an IBM 46 GB udma100 HD running as udma66.

This is the same hardware on which I was running 7.0 with no problems.

I am experiencing frequent "system hangs" which remind me of the delay you see with a scsi bus reset. I am running no disk-mirroring or raid s/w, BTW. However, the delays are shorter -- about 6 seconds instead of the 15-20 which I would get from a reset. Also, after checking the log files, I am getting NO error messages indicating any scsi problems at all.

I *had* thought these delays occurred whenever I copied a file larger than 4 MB. However, today I started looking for them and seeing them with file copies as small as 120 KB -- not every time, just intermittently. The same "hangs" occur if I disconnect the ide drive and disable its controller on the MB (it's built in, so I can't just pull the board).

So I tried an experiment -- I have another system running 7.0 with a slightly smaller scsi drive. I pulled that drive and put it in this system. The system came up and ran 7.0 without the pauses.
So I reattached the other drives, tape drives, and ide drive (re-enabling the controller), and the system still behaved. Conversely, when I put the drive holding my 7.1 system (along with the u/w controller) in that other machine and booted 7.1 there, I got the same "pauses". BTW, this drive worked fine under 7.0.

So I finally put these two systems "back in order" (right drives and controllers in each machine), backed up EVERYTHING onto tape, and re-installed 7.0 on my 7.1 box (fresh install) -- the pauses disappeared altogether. Reinstall 7.1 and they're back.

I suspect that the often-mentioned problem(s) with the aic7xxx driver may be the root cause, although it could be something else in the kernel.
Then could you please try the other SCSI driver, aic7xxx_mod?
I tried the aic7xxx_mod driver using the following procedure:

A) edit /etc/modules.conf
B) change the line reading
     alias scsi_hostadapter aic7xxx
   to read
     alias scsi_hostadapter aic7xxx_mod
C) do a depmod -a
D) reboot

The machine reboots, recognizes the drives, goes to run level 3, then I get the messages:

  Entering non-interactive startup
  updating /etc/fstab

And then the system hangs for 10-20 seconds, does a stack trace (which goes off the screen too fast to read), and then I get a repeating loop:

+ a repeated series of error messages on scsi0:-1:-1:-1 (too many to count -- several screenfuls at least):
    scsi 0:-1:-1:-1 Referenced SCB 255 not valid during SELTO SCSISEQ=0x5a SEQ ADDR=0x18 SSTAT0=0x10 SSTAT1=0x8a
+ then another error message which goes off the screen too fast to read,

which continues until I hit the reboot switch.

So either I used the wrong procedure to try the driver OR it has a problem with my system as well. Is there a procedure to specify the module at boot up instead of putting it into /etc/modules.conf?
Did you remember to do the mkinitrd and lilo steps when changing the SCSI module? Just doing a depmod -a doesn't activate the change, you have to make a new initrd and you have to run the lilo command so lilo can map the new initrd image. Then, when you reboot, you should have the new driver.
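The complete sequence (alias change, depmod, mkinitrd, lilo) might look roughly like the sketch below. It is only an illustration: the helper name `switch_scsi_alias`, the `eth0` alias line, and the initrd file name are assumptions, not taken from this report. The alias edit is demonstrated on a scratch file, and the commands that would touch the real system are left commented out.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the scsi_hostadapter alias in a
# modules.conf-style file, keeping a .bak copy of the original.
switch_scsi_alias() {
    conf_file=$1
    new_driver=$2
    cp "$conf_file" "$conf_file.bak"
    sed "s/^alias[[:space:]][[:space:]]*scsi_hostadapter[[:space:]].*/alias scsi_hostadapter $new_driver/" \
        "$conf_file.bak" > "$conf_file"
}

# Demonstration on a scratch file, not on /etc/modules.conf itself.
scratch=$(mktemp)
printf 'alias eth0 3c59x\nalias scsi_hostadapter aic7xxx\n' > "$scratch"
switch_scsi_alias "$scratch" aic7xxx_mod
cat "$scratch"

# On the real system (as root) the remaining steps would be roughly:
#   depmod -a
#   mkinitrd /boot/initrd-$(uname -r)-mod.img $(uname -r)   # image name is an assumption
#   (make sure /etc/lilo.conf has an initrd= line naming exactly that file)
#   lilo                                                     # re-map the new initrd
```

Writing the new initrd under a separate name, instead of overwriting the existing one, leaves the old boot entry usable if the new driver fails to come up.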
FWIW, I've got a Netfinity 5000 that has AIC 7895 Ultra SCSI. I tried the following, which did NOT work:
- boot 7.1 cdrom
- boot XFS boot.img
- boot XFS bootnet.img
- boot XFS cd (created from ISO image)

What *DID* work: I installed RH71 on another computer, copied the RPMS and base directories from CDs 1 & 2 to /usr/ftp/pub/rh71, changed ftp user to mount /usr/ftp instead of /var/ftp, added my uid/gid to /etc/ftpaccess, restarted xinetd with /sbin/service xinetd restart (not sure if I needed to), made sure anonymous ftp was working, booted with the 7.1 CDROM, and did an FTP install from the other RH71 box. Hopefully that will help others!

BTW, I, too, am a little disenchanted with RH. They have done an excellent job over the years, but the 7.0 release really upset me. Couldn't compile the kernel out of the box?!?! I bought RH70 and was disappointed that I did. When 7.1 was released, I felt it'd be better to download the ISOs. Boy, am I *GLAD* that I did. I didn't want to WASTE more money... Hopefully, RH7.2 (or whatever the next one will be...) will install more smoothly...

Also, I want to thank the person who suggested using ALT-F2, ALT-F3, ALT-F4 during the install. I had no idea that there was more detailed info available during the install. Now I have *SOMETHING* more entertaining to watch. :-)
I have installed an old PC system as a server running RH7.1.

PC system:
  DEC, Digital Celebris 560
  96 MByte RAM
  Onboard VGA, Keyboard, Mouse
  Adaptec AHA-2940 Ultra/Ultra W
  3COM Etherlink XL, 3C900 Combo
  Harddisk IBM DCAS-34330W and Seagate ST39173LW (connected on wide scsi)
  CD-ROM Plextor PX-6XCS (connected on wide scsi via adapter)
  LILO in /boot partition
  e2fs filesystem
  running DHCP

The system was newly installed with RH7.1, for use as an experimental nfs/ftp/web/kickstart server. For normal/console operation (without network connections) the system works properly.

With a second PC I tried a network-based kickstart installation. During this installation the server hung completely and permanently (until a manual hard reset!). In most cases (>90%) the server hangs while the selected packages are being loaded, sometimes after 3 minutes (or a few MBytes) and sometimes after 30 minutes (or more than 1 GByte); this was seen while testing both nfs- and ftp-based installation. At that point the server cannot be reached with the ping command from a third PC, but the second PC can.

After searching bugzilla I found several problems regarding the aic7xxx scsi driver. I switched to the aic7xxx_mod driver described there (/etc/modules.conf, aic7xxx_mod; depmod -a; mkinitrd; second section in /etc/lilo.conf, lilo) and the server is running well. Switching back to the original driver always reproduces the problem.
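The "second section in /etc/lilo.conf" step could be sketched as follows. This is an illustration under stated assumptions: the helper name `add_lilo_section`, the kernel version, labels, initrd names, and root device are all made up for the example and are not taken from this report. The sketch works on a scratch file only.

```shell
#!/bin/sh
# Hypothetical helper: append a second boot entry that uses a different
# initrd, leaving the existing entry in place as a fallback.
add_lilo_section() {
    conf_file=$1; kernel=$2; label=$3; initrd=$4; root_dev=$5
    cat >> "$conf_file" <<EOF
image=$kernel
    label=$label
    initrd=$initrd
    read-only
    root=$root_dev
EOF
}

# Scratch lilo.conf with one existing entry (all values are illustrative).
scratch_conf=$(mktemp)
cat > "$scratch_conf" <<'EOF'
boot=/dev/sda
prompt
timeout=50
default=linux
image=/boot/vmlinuz-2.4.2-2
    label=linux
    initrd=/boot/initrd-2.4.2-2.img
    read-only
    root=/dev/sda1
EOF

add_lilo_section "$scratch_conf" /boot/vmlinuz-2.4.2-2 linux-mod \
    /boot/initrd-2.4.2-2-mod.img /dev/sda1
cat "$scratch_conf"

# On the real system, /etc/lilo.conf would be edited the same way and
# then "lilo" run as root so the new entry is mapped.
```

Keeping the original section means the old driver can still be selected at the LILO prompt, which is also what makes it easy to switch back and reproduce the problem.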
dledford@redhat wrote:

Did you remember to do the mkinitrd and lilo steps when changing the SCSI module? Just doing a depmod -a doesn't activate the change, you have to make a new initrd and you have to run the lilo command so lilo can map the new initrd image.

Yes, I did the new mkinitrd. Unfortunately I misspelled the output file in /boot and didn't catch it before the reboots. After I saw your note, I checked and fixed my blunder -- I have been running the aic7xxx_mod module with improved performance ever since. Thanks.
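A mistake like a misnamed initrd can be caught after the reboot by checking which driver actually got loaded. The sketch below is one possible check, not something from this thread: the helper name `module_loaded` is hypothetical, and the sample listing (including the sizes) is invented so the example does not depend on the live system. It assumes the module name is the first field of each line in /proc/modules.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named module appears as the first
# field of some line in a /proc/modules-style listing.
module_loaded() {
    module=$1
    listing=${2:-/proc/modules}
    awk -v m="$module" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

# Demonstration against a made-up listing rather than the live /proc/modules.
sample=$(mktemp)
printf 'aic7xxx_mod 130000 3\nsd_mod 11000 6\n' > "$sample"

if module_loaded aic7xxx_mod "$sample"; then
    echo "aic7xxx_mod is loaded"
else
    echo "aic7xxx_mod is NOT loaded; the old initrd may still be in use"
fi
```

On a real system the second argument would simply be omitted, so the check reads /proc/modules directly.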