Bug 41392

Summary: System lockup during file copy, multiple servers, same model, RedHat Installer >= 6.2
Product: [Retired] Red Hat Linux Reporter: Doug Reed <doug_reed>
Component: installer Assignee: Brent Fox <bfox>
Status: CLOSED DUPLICATE QA Contact: Brock Organ <borgan>
Severity: high Docs Contact:
Priority: medium    
Version: 7.1   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2001-05-21 20:25:37 UTC Type: ---

Description Doug Reed 2001-05-19 19:57:04 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)

Description of problem:
I have several Compaq DeskPro systems (most DP4000 5233MMX) with Via 
Technologies VT82C586 IDE controllers and Compaq Netelligent 10/100 
Network cards.  RedHat 6.1 installs fine, autodetects everything and works 
fine.  As of RedHat 6.2 all systems lockup (dead mouse and keyboard, power 
off required) during the filecopy stage of install.  All correctly auto-
detect all hardware and begin install.  Occasional hangs during format, 
but mostly during file copy.  I have only tested RedHat 7.1 on one machine 
because the rest need to stay functional.  It is a DP4000 5233MMX with 
128meg, 16meg Voodoo3 Video, 1GIG IDE HD (I think), a 40GIG IDE HD, and a 
Compaq CDROM.  ...This is my development machine, the others have less 
memory, smaller disks, and cheaper video cards.  This machine almost 
always locks up after copying glibc-common, once it hung during format.  I 
have played with BIOS settings, disabling Bus Mastering etc, different 
install options (text, GUI) and settings, upgrade, clean install...to no 
avail.  I am convinced it has something to do with the install script 
because it started with 6.2, and still does the same thing the same way, 
yet much has changed between 6.2 and 7.1.  All machines are Compaq 
Deskpros of similar vintage, but the hard disks, amount of memory and 
video cards differ.
Doing a CTRL-Alt F3 and CTRL-Alt F4 show nothing.  ...F3 shows the bootup 
stuff, the last message in F4 is sometimes about fsck on /HOME since the 
default install doesn't format it, and the crash causes it to not be 
unmounted cleanly...  Not real useful, but gives you an idea that the 
install never sees this coming because this message is generated when the 
file partitions are mounted and the other partitions are then formatted, 
the installation is copied over, and the files start to copy.  Then 
after "glibc-common" ...death!  This same machine is running RedHat 6.1 as 
we speak (I re-installed it).  It has had bug fixes applied, Ximian Gnome, 
all kinds of "modern" software.  It has never crashed, and is seldom 
rebooted.  I have never run the 2.4 kernel, but that cannot be the problem 
with RedHat 6.2.  The other machines are also run 24x7 with no problems.
I would REALLY like to get this fixed because I am introducing LINUX into 
our corporate environment and things like this have the potential to kill 
it.

How reproducible:
Always

Steps to Reproduce:
1. boot from CD
2. Answer all of the questions...  answers and configuration do not matter
3. System locks up 100% of the time on several machines.
	

Actual Results:  The system freezes on RedHat 6.2 and 7.1

Expected Results:  The installation should finish.

Additional info:

Compaq DeskPro systems (most DP4000 5233MMX) with Via Technologies 
VT82C586 IDE controllers and Compaq Netelligent 10/100 Network cards.

Test system is a DP4000 5233MMX with 128meg, 16meg Voodoo3 Video, 1GIG IDE 
HD (I think), a 40GIG IDE HD, and a Compaq CDROM.

Comment 1 Brent Fox 2001-05-20 17:16:59 UTC
I'm trying to understand exactly what happens when the install fails.  Do you
see a Python traceback when it crashes, or do you see something like "Signal 11"
or "Signal 7"?
Also, are these cd's that you burned yourself, or were they from the box set?
If you burned them yourself, did you check the md5sums of the ISOs to make sure
that they matched up with the ones on the ftp site?
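
For reference, the md5sum check asked about above can be sketched as follows. This is only a minimal illustration, assuming Python is available on the machine that holds the ISO image; the function names are hypothetical, and the path and expected checksum must be supplied by the user (none are taken from this report).

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large ISO is never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_matches(path, expected_hex):
    """Compare the file's digest with a published checksum (case-insensitive)."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

A mismatch would point to a bad download or burn rather than an installer problem; a match only rules that cause out.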

Comment 2 Doug Reed 2001-05-21 19:41:46 UTC
--- Doug Reed's Response ---
o   What Happens?
I do not SEE ANYTHING.  The system freezes.  No Python errors or anything at 
all on any of the systems.  The mouse no longer moves, and <CTRL-ALT-(anything)> 
does nothing...on the plus side, the "Power" button still works!  Since I 
cannot press any keys after the freeze...to determine that I did not get any 
other errors, it was necessary to do several installs to gather information for 
you guys.  In each case, I got the install going, and when it started copying 
files, I entered <CTRL-ALT-F3>...  and left it there until the disks stopped 
spinning, I verified that it was locked up.  I then repeated this with 
<CTRL-ALT-F4>.  I apologize that I don't remember what the last message on the 
screen is, but it is always the same and of no interest.  <CTRL-ALT-F3> has all 
of the boot stuff on it about what it found at boot, and everything is 
correct...this stuff is already there when the GUI first starts.  <CTRL-ALT-F4> 
basically has status about what it is doing, but it really doesn't say 
much.  The default install doesn't check "format" for the "/home" partition.  
When I failed to check it manually, the last message was a complaint about 
mounting an unchecked file system, and suggests that an fsck was 
recommended...  This occurs when "/home" is mounted, which is done a good 5 
minutes before the machine locks up, because it then installs the system, and 
then starts copying files, and freezes after completing glibc-common.  A year 
ago or so, when I first installed 6.2, I did this same thing on one of the 
machines in the Support Center, and again, never saw anything on any of the 
screens that gave me any clue what was wrong.  I then tried the other machines 
and got the same result.  Last time, I never changed any BIOS settings or 
anything, I simply gave up after looking for messages in the other screens...I 
knew if I reported the error to you, you would want more information and I had 
nothing to give you, so I figured that it would surface somewhere else and 
would get fixed eventually.  It now appears that whatever change occurred, did 
so between 6.1 and 6.2 and is still there.  p.s. I also tried the text install 
with the same procedures with no difference in the results.

OH  FLASH!  There is one thing!  When I ran the text install, I got some 
message like:

Unrecognized E??: supported.

overlaying one of the graphics screens.  Subsequently, I found the same message 
on one of the F3, F4 screens when I ran the GUI install.  I don't recall what 
the "E??" was.  It was three letters and I originally remembered it as 
being "ESS", and suspected my sound card was having problems but on a 
subsequent install, I remember that it was not this, besides my sound is 
correctly detected.  It works flawlessly on 6.1 and is auto-detected.  It was 
not an acronym that I had heard of.  I searched for the string on Google, and 
did not find anything, I then searched for the "E??" string, and only found 
unrelated references.  The only computer references at all mentioned advanced 
graphics if I recall.  I talked to the RedHat support people, and they couldn't 
find anything related to this either and nobody had ever heard of the acronym 
either.  Maybe one of them will remember what the string was.  I am not at home 
now, I may be able to find it in my search history on my Windoze system at 
home.  I was just doing searches here for three letter E things, and I didn't 
find any combination that brought back any memories.  Anyway, it occurs REALLY 
EARLY in the install when the install is probing stuff.  It occurs like twice 
or so when the install is trying to guess about video, mouse and network 
settings.  The support staff thought it might be important, and suspected this 
as the problem...I am not convinced.  If the message occurred anywhere near the 
lockup, I would agree, but it occurs too early, and the installer finally makes 
all of the right decisions and formats all of the drives, and installs the 
system before dying during the copy.  I can even successfully boot the 7.1 
system after powering it off and on!  I can't use it because it panics when it 
cannot find "init".


o Burned or Bought?
I purchased the Deluxe boxed set for RedHat 6.1, which works on all machines.  
I mean TOTALLY WORKS FLAWLESSLY!  It auto-detects everything in the machine 
including the video, network card, and even the sound card.

My company purchased the Enterprise Boxed set for RedHat 6.2, but I had not yet 
purchased this set in my department.  My co-worker (with the 6.2 version) 
burned the install CD from his Boxed set, and gave it to me so that I could 
determine if it was enough of an improvement to merit an upgrade.  Our company 
moved our support center, so I moved the systems in question to the new 
location, and performed clean installs.  Every system hung in the same way, so 
I assumed that I had a bad CD, and simply installed 6.1.
I then borrowed the original CD from my co-worker to evaluate on my development 
system at home, and experienced the same thing on my system at home using the 
Original CD.  I went through the install several times (both Text and GUI)
looking for clues in the F3 and F4 screens to no avail so I gave up, and simply 
installed 6.1.  I assumed this was some install bug that would get sorted out 
in time, and didn't worry about it.  6.2 was not substantially different than 
6.1 with maintenance, so it was not worth my time to chase it.

I never played with RedHat 7.0 in any way.

I just purchased RedHat 7.1 Deluxe so that I could use it to upgrade my 4 or so 
machines in the Support Center, but they are now being used daily as client 
workstations for an X application (Micromuse Netcool).  Therefore I don't wish 
to break them unnecessarily.  For this reason, I did the upgrade on my 
development system at home and experienced the exact same system freeze during 
an "upgrade" that 6.2 had experienced so long ago.  I then tried to do a clean 
install, formatting everything.  I got the same freeze in the same place.  I 
then started playing with text install, skipping X configuration, BIOS settings 
like Bus Mastering and translation, and different install selections, all to no 
avail.  I simply gave up, and re-installed 6.1, which still works fine.  I have 
since re-installed all recommended maintenance, Ximian Gnome, and Evolution...  
Everything works flawlessly.   ...except Evolution, which seems to be forever 
broken. :-)

I have not tried 7.1 on any of the other machines because I am gun shy.  I have 
every reason to expect the same behaviour.  

I will be happy to take whatever debug steps you would like...although I am 
getting rather tired of re-installing 6.1 :-(  ...also I can re-post with the 
three letter acronym if I get home tonight and find it in my search history.

Comment 3 Brent Fox 2001-05-21 20:25:32 UTC
I think that this bug is a duplicate of bug #37280.  
Another Compaq DeskPro.



*** This bug has been marked as a duplicate of 37280 ***