Bug 66270 - Installer hangs while loading packages
Summary: Installer hangs while loading packages
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: athlon
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-06-06 23:01 UTC by Jim Prior
Modified: 2008-08-01 16:22 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-09-30 15:39:39 UTC
Embargoed:


Attachments
lspci and friends (14.07 KB, text/plain)
2002-08-09 01:58 UTC, R P Herrold

Description Jim Prior 2002-06-06 23:01:21 UTC
Description of Problem:
Installer hangs at various places while loading packages.

Version-Release number of selected component (if applicable):


How Reproducible:
Every time.  


Steps to Reproduce:
1. Wipe disk (dd if=/dev/zero of=/dev/hda bs=512 count=65536)
2. Boot RH 7.3 CD, enter "text" at boot: prompt.
3. Minimal server installation, auto partition,
   mostly default selections, (no X!).



Actual Results:
The installation stalls while installing packages.
It stalls at a different place each time.
The hard drive activity LED stays on steady.
The CD-ROM drive activity LED stays off.
I get 100MB to 300MB through the 921MB total before it stalls.
It never complains with error messages.
I never have gotten to the second disk.

If I try a Workstation install,
it usually gets stuck during formatting of
the big default / partition.

Sometimes for Server install,
it gets stuck formatting / or /usr partition.


Expected Results:
Install all packages each time.


Additional Information:
Installer doesn't completely hang,
I can switch between virtual consoles after installer hangs.
I can execute ps in second console.

It seems that the faster I make my setup selections,
the further it gets installing packages before it hangs.
Sometimes, especially if I take a long time choosing
installation options, it will hang while formatting a partition.

Media verifies OK, both with mediacheck option and
md5sum of disk matches RH site.

I've had no trouble running RH 7.2 on this system.
I can still restore my old RH 7.2 with dd from a backup drive,
and then it runs the old RH 7.2 system just fine.

Hardware is cheap sis or sis clone chipset Socket A motherboard
(sis630 video, sis900 ethernet)
with 1050MHz Athlon w/128MB RAM,
27GB Maxtor IDE drive and IDE CD-ROM drive.

I had other issues installing RH 7.3 on this system.
See bug #64530.

Comment 1 R P Herrold 2002-06-06 23:20:06 UTC
Jim, 

Is this like the motherboard hardware I tested for Ron?  The issue sounds like
media (although blaming CD drives is great sport as well), but may be elsewhere. 
Is there any helpful hint in the alternative consoles?

-- Russ

Comment 2 Jim Prior 2002-06-06 23:35:32 UTC
Russ, 

Yes, this is on the hardware that you tested for Ron, 
the hardware you still have.  

I have verified the media several times with: 

   dd if=/dev/cdrom of=foo.iso bs=1024 count=652832
   md5sum foo.iso

yields an md5sum cb91...4c59.  Albeit, the dd/md5sum 
verification was done on different hardware.  
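
The dd/md5sum check above can be wrapped in a small shell function so it is easy to repeat on each burned copy. This is only a sketch: image_md5 and verify_image are hypothetical helper names, and the block size and count must match the size of the ISO image being checked.

```shell
# Read exactly COUNT blocks of BS bytes from SRC and print the MD5 of that span.
image_md5() {
    src=$1
    bs=$2
    count=$3
    dd if="$src" bs="$bs" count="$count" 2>/dev/null | md5sum | cut -d' ' -f1
}

# Succeed only if the checksum of the span equals the EXPECTED checksum.
# Arguments: SRC BS COUNT EXPECTED
verify_image() {
    expected=$4
    actual=$(image_md5 "$1" "$2" "$3")
    [ "$actual" = "$expected" ]
}
```

For the disc in this comment the call would be verify_image /dev/cdrom 1024 652832 followed by the full checksum published on the download site. Reading a fixed block count matters because a drive can return extra data past the end of the image, which would change the checksum.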

I have done the mediacheck option on the installation hardware.  
It passes.  

I did burn another copy of installation media.  Same problem.  

If CD-R media or CD-ROM drive is the culprit, 
why would CD-ROM drive LED not be stuck on?  
BTW, the CD-ROM does not make a lot of noise like it is 
trying hard to re-read bad spots.  

If CD-R media or CD-ROM drive is the culprit, 
why would the hard drive LED be stuck on?  

Just for fun, I put the old 700MHz Duron back in.  
No change in behavior, (except slower).  

I don't see any obvious clues in other consoles, 
not that I know what to look for.  

Jim


Comment 3 Jim Prior 2002-06-07 13:37:44 UTC
I restored the stable RH 7.2, 
then attempted to do an upgrade to RH 7.3.  
It gets stuck also.  

Steps to reproduce: 

   Restore old stable RH 7.2 and boot it to confirm its health.  
   Boot RH 7.3 Disk 1
   boot: text
   English during installation
   us keyboard
   3 button PS/2 mouse
   Upgrade Existing System
   Upgrading RH on /dev/hda9 partition
   Customize packages to upgrade?  No.  
   Detected GRUB on /dev/hda   Update boot loader configuration

Hangs while displaying "Finding packages to upgrade"
I can still switch between virtual consoles.  
When I execute ps in second console, it hangs after column heading line.

Comment 4 Michael Fulbright 2002-06-07 18:31:27 UTC
Assigning to an engineer.

Comment 5 Jeremy Katz 2002-06-07 18:45:51 UTC
Are there any errors about reading the CD on tty4?

Comment 6 Jim Prior 2002-06-10 20:47:01 UTC
There are no reports of errors about reading the CD on tty4.  

The last lines on tty4 are: 
...
<6>kjournald starting.  Commit interval 5 seconds
<6>EXT3 FS 2.4-0.9.17, 10 Jan 2002 on ide0(3,2), internal journal
<6>EXT-fs: mounted filesystem with ordered data mode.  
<6>kjournald starting.  Commit interval 5 seconds
<6>EXT3 FS 2.4-0.9.17, 10 Jan 2002 on ide0(3,7), internal journal
<6>EXT-fs: mounted filesystem with ordered data mode.  
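
When checking tty4 for read errors, it can help to filter the kernel log for the kinds of lines the IDE driver prints on trouble. A rough sketch follows; scan_ide_errors is a hypothetical helper, and the pattern list is an assumption based on commonly seen 2.4-era IDE messages, not an exhaustive catalogue.

```shell
# Print lines that look like IDE error reports for any hdX device.
# Reads the files given as arguments, or standard input if there are none.
# Exit status 1 means no matching line was found.
scan_ide_errors() {
    grep -E 'hd[a-z]: .*(status error|dma_intr|lost interrupt|timeout waiting for DMA|irq timeout)' "$@"
}
```

Something like dmesg | scan_ide_errors would work where dmesg is available. An exit status of 1 means no matching lines, which is consistent with the clean tty4 output shown above but does not rule out a driver that hangs without logging anything.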

* * * * * * * * * * * * * * * * * * * * * * * * * * * * 

I have experimented much.  I have swapped CDs, motherboards, CPUs, 
memory, CD-ROM drives, UDMA cable to hard drive, and IDE cable to 
CD-ROM drive.  

Both of my CDs pass the "boot: linux mediacheck" test.  
Both of my CDs pass my own dd/md5sum test.  

I have tried to upgrade the old reliable restored RH7.2 system, 
with similar failure.  tty4 is slightly different then, 
but still no CD-ROM drive or media error reports.  
Tty4 looks like below for upgrades: 
...
<6>kjournald starting.  Commit interval 5 seconds
<6>EXT3 FS 2.4-0.9.17, 10 Jan 2002 on ide0(3,5), internal journal
<6>EXT-fs: mounted filesystem with ordered data mode.  
<6>kjournald starting.  Commit interval 5 seconds
<6>EXT3 FS 2.4-0.9.17, 10 Jan 2002 on ide0(3,7), internal journal
<6>EXT-fs: mounted filesystem with ordered data  mode.  
<6>Adding Swap: 2096440k swap-space (priority -1)

The two different motherboards are of similar cheap SiS chipset ilk.  

I have not, but will, try swapping power supply and hard drive.

Comment 7 Marcos Pinto 2002-06-11 01:28:25 UTC
I have the *exact* same problem, only I use bootnet.img; so much for the CDROM 
theory.  I have no idea what's going on since there are no error messages on 
any of the consoles.

Comment 8 Marcos Pinto 2002-06-11 04:31:19 UTC
More info... My computer isn't an Athlon, it's a PII with 256MB RAM.  Here's more:
IRQ 10	S3 ViRGE-DX/GX PCI (375/385)
IRQ 10	Intel 82371AB/EB PCI to USB Universal Host Controller
IRQ 10	IRQ Holder for PCI Steering
IRQ 11	Realtek RTL8029(AS) PCI Ethernet NIC
IRQ 11	IRQ Holder for PCI Steering
IRQ 14	Primary IDE controller (dual fifo)
IRQ 14	Intel 82371AB/EB PCI Bus Master IDE Controller
IRQ 15	Secondary IDE controller (dual fifo)
IRQ 15	Intel 82371AB/EB PCI Bus Master IDE Controller

Is there any specific info that you want to know?

Comment 9 Jim Prior 2002-06-12 01:54:33 UTC
Empirical results:

   The installation always gets stuck
   when installing to a 27GB Maxtor hard drive.
   When the installation stalls, the hard drive LED is always stuck on.
   The CD-ROM drive never acts like it is trying hard to read something 
   when the installation is stuck.  
   This is repeatable.

   The installation is always successful when installing to
   three other hard drives that I've tried:
      80GB Maxtor, 40GB Maxtor and 6.3GB Samsung.
   This is repeatable.

The 27GB Maxtor hard drive is Model 92720U8.
I have no trouble restoring and running RH 7.2 on it.
I only have trouble installing RH 7.3 on it.

Swapping the following things had no effect on the result.
   motherboards
   memory
   power supply
   CD-ROM drive
   CD-R disks
   ordinary IDE cable to CD-ROM drive
   UDMA IDE cable to hard drive

Swapping the hard drive does matter.

The CD-R installation media is not the problem.
They pass all the tests I have tried,
and work fine for installing to the non-27GB hard drives.

* * * * * * * * * * * * * * * * * * * * * * * * * * *

The failure to install to the 27GB drive is puzzling,
since that drive works just fine with RH 7.2.

Here are my thoughts.

Maybe there is some bad sector on the drive.
Perhaps the installation gets stuck on a part of the
drive that the RH 7.2 system does not exercise.

   So I will do an exhaustive hard drive test.

Perhaps there is some design vulnerability in the drive
that RH 7.2 does not provoke, but RH 7.3 does.
Could UDMA be such a vulnerability?
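
One way to test the UDMA theory would be to repeat the install with DMA turned off. The sketch below assumes the installer kernel honors the ide=nodma boot parameter and that hdparm is available on the running system; dma_off is a hypothetical wrapper name, and /dev/hda is taken from the reproduction steps.

```shell
# At the installer boot prompt, append the parameter to the usual entry:
#   boot: text ide=nodma

# On an installed system, show and then clear the using_dma flag for one drive.
dma_off() {
    dev=$1
    hdparm -d "$dev"     # report the current using_dma setting
    hdparm -d0 "$dev"    # disable DMA for this drive
}
```

Running dma_off /dev/hda as root before heavy disk activity would show whether the hang follows the DMA setting. If the install completes with DMA off, that points at the drive/driver DMA path and away from the installer itself.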

Comment 10 Jim Prior 2002-06-12 02:04:56 UTC
Dear pintom.mil,

What make and model hard drive are you installing to?

When your installation gets stuck,
is the hard drive LED stuck on?
Is the CD-ROM drive trying to read?

What happens when you install to a different hard drive?


Comment 11 Jim Prior 2002-06-12 12:43:07 UTC
No problems are found with the 27GB Maxtor drive when I execute: 

   badblocks -s -v -w /dev/hdd1
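
Note that -w makes badblocks run a write-mode test that overwrites the partition, so it is only suitable for a drive that is about to be wiped anyway (badblocks also has a non-destructive read-write mode, -n). A small guard against running it on a mounted partition might look like the sketch below; badblocks_write_test and the MOUNTS override are hypothetical names.

```shell
# Refuse the destructive write test if DEV appears in the mount table.
# MOUNTS defaults to /proc/mounts; it can be overridden for testing.
badblocks_write_test() {
    dev=$1
    if grep -q "^$dev " "${MOUNTS:-/proc/mounts}"; then
        echo "$dev is mounted; not running badblocks -w" >&2
        return 1
    fi
    badblocks -s -v -w "$dev"   # overwrites every block on DEV
}
```

The guard only looks at mounted filesystems, so it would not catch a partition in use as swap.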

Comment 12 Jim Prior 2002-06-15 12:25:07 UTC
I found a roundabout way to get RH 7.3 on the 27GB Maxtor drive.  

Using dd, copy a successful installation from a smaller drive to 
the difficult drive.  Both drives have the same number of 
heads and sectors per track.

Comment 13 Jeremy Katz 2002-07-02 05:07:34 UTC
I can't reproduce this at all here.  Arjan, any idea on anything that might be
causing this?

Comment 14 Ed Voncken 2002-08-03 19:59:08 UTC
Hi all,

I'd like to add another data point. I have the same exact problem (100%
reproducible) on the following configuration:

- Gigabyte GA 6VX7+ mainboard, bios revision F3
- Intel Celeron 433MHz, FSB 66MHz
- 2x 256MB DIMM
- harddisk Maxtor 90854D4, 8GB, A40CWJ1C, 1998.12.08

- "linux mediacheck" passed on this configuration
- disk partitioned using Disk Druid
- reproducible using unmodified Server install
- reproducible using Custom install
- installation hangs somewhere during package installation,
  for example ht://dig and Perl were two occurrences
- VC's keep responding; python installer stops responding

There must be something different about the 7.3 installer; I have severe
installation problems for 7.3 on at least 2 of my machines, while 7.2 installs
without a hitch.

Greetings,
  Ed.

Comment 15 R P Herrold 2002-08-04 03:47:13 UTC
Jim was over this afternoon, and we spent several hours with Limbo 2 -- the
drive and motherboard in question which were acting up are able to be wiped and
re-tested -- I sent Limbo 2 CDs with him a couple of days ago ...

... but I also worked through an FTP install method with him



Comment 16 Ed Voncken 2002-08-04 20:05:58 UTC
Hi folks,

It definitely is an interaction between the harddrive and the RedHat installer.
I reproduced the problem using RHL 7.2 and 7.3 with the Maxtor 90854D4, 8GB,
A40CWJ1C, 1998.12.08.

This morning I replaced the disk with a Maxtor 32049H3 20GB (N3R22HWC,
2000.09.14). The rest of the configuration remained identical.

This time, the RedHat 7.3 install went flawlessly.

Greetings,
  Ed.

Comment 17 Jim Prior 2002-08-07 14:09:42 UTC
I have replicated the issue with RH Limbo 7.3.93

It gets stuck during formatting of /home or /usr partitions.  

Hardware is cheap sis or sis clone chipset Socket A motherboard
(sis630 video, sis900 ethernet)
with 1050MHz Athlon w/128MB RAM,
27GB Maxtor IDE drive and IDE CD-ROM drive.

Installation CD-R passes dd/md5sum test and Limbo's Media Check.

Comment 18 Jim Prior 2002-08-08 00:50:03 UTC
I reset the BIOS settings.  

It still gets stuck during formatting of /home or /usr partitions 
for Limbo 7.3.93.  

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * 

From a working RH 7.3 installation on a different drive, 
the output of lspci -v -v -v for the rest of the machine 
is at http://www.colug.net/~jep/lspci-v-v-v.

Comment 19 R P Herrold 2002-08-09 01:56:56 UTC
Jim brought the box over and I beat on it -- It looks as though the on board
hard drive driver is taking control, and silently dying, possibly when on a DMA
excursion ...

The mkfs gets 23 of about 200 inode tables in on / (having done /boot) -- when I
manually mkfs and bypass having anaconda do it, I get about 10 packages into the
install, and lock up at e2fsutils ...

I will attach some traces in a moment of various items (lspci -v -v -v and drive
parameters in /proc) snapshotted to floppy in rescue CD mode



Comment 20 R P Herrold 2002-08-09 01:58:01 UTC
Created attachment 69664
lspci and friends

Comment 21 R P Herrold 2002-08-11 05:10:36 UTC
After a thread on valhalla list I mentioned Maxtor issues to Jim.  He went
Googling and found:

http://www.google.com/search?hl=en&ie=ISO-8859-1&q=maxtor+92720U8+linux

Maybe we are rediscovering the wheel.

How do we tell the kernel to be VERY conservative with this drive? -- is there a
check on drive model number which can automate this process?
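
One possible shape for such a check, done from user space rather than in the kernel: read the model string that 2.4 kernels expose under /proc/ide and compare it with a list of drives to treat conservatively. This is a sketch only. The function name, the PROC_IDE override, and the exact model strings are assumptions; the strings would need to match what each drive actually reports.

```shell
# Hypothetical list of model strings to treat conservatively, one per line.
# 92720U8 is the drive from this report; 90854D4 is the one from comment 14.
CONSERVATIVE_MODELS='Maxtor 92720U8
Maxtor 90854D4'

# Succeed if the model reported for DRIVE (for example hda) is on the list.
# Returns 2 if the model file cannot be read.
needs_conservative_settings() {
    drive=$1
    model_file=${PROC_IDE:-/proc/ide}/$drive/model
    [ -r "$model_file" ] || return 2
    model=$(cat "$model_file")
    # -x matches whole lines only, -F treats the model as a fixed string
    printf '%s\n' "$CONSERVATIVE_MODELS" | grep -qxF "$model"
}
```

A boot script could then run needs_conservative_settings hda && hdparm -d0 /dev/hda before any heavy disk activity. Matching on the exact model string keeps the check narrow, at the cost of missing firmware variants that report a slightly different name.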

-----------------------------------------

URGENT README :: Rogier (Re: DMA Disabled.)

Andre Hedrick (andre)
Fri, 27 Aug 1999 20:19:19 -0700 (PDT) 

    In reply to: Alberto Mardegan: "[IDE] hdc: status error: status = 0x58" 


You got it right Rogier.
There is an uncorrectable timing chatter between the DMA mode 2 WD31600
and the UDMA Maxtor 92720U8. This is really problematic on all the PIIXx
chipsets that I have observed. It has nothing to do with Intel's
hardware, it is an issue between the two drive vendors.

WD + Maxtor (same channel) == cat and dog in bag.
(One animal is going to damage the other.)

Disable all DMAing for data safety and split the devices to separate
channels. I do not care what is where. I fear that you will find
FS-Corruption from the hardware level. I spent four weeks and daily
reinstalls to finally pick up the phone and call Maxtor.
(Note that WD does not care about Linux, not a viable OS).

There is no pretty way to implement this kind of detection, or I would
have.

Andre Hedrick
The Linux IDE guy

On Fri, 27 Aug 1999, Rogier Wolff wrote:

Comment 22 R P Herrold 2002-08-11 05:11:39 UTC
Note that this may have just become a kernel issue, and may no longer be an
Anaconda issue...

Comment 23 Bugzilla owner 2004-09-30 15:39:39 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/


