Bug 30822

Summary: [aic7xxx????] Install dies after 24 hours - 7.1Beta RC2
Product: [Retired] Red Hat Linux
Component: kernel
Version: 7.3
Hardware: i386
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Reporter: R P Herrold <herrold>
Assignee: Arjan van de Ven <arjanv>
QA Contact: Brock Organ <borgan>
CC: dledford, ewt
Doc Type: Bug Fix
Last Closed: 2001-09-28 03:53:31 UTC

Attachments:
Description: Anaconda traceback    Flags: none

Description R P Herrold 2001-03-06 16:49:58 UTC
see Bugzilla ticket 30250 for context --

Install just dies after 24 hr with an anaconda traceback -- I'll file
separately to get another Bugzilla number, and note here (frown) -- it was
installing emacs libraries -- roughly 83 packages to go of 4xx total ...

I'll file this and return to attach traceback

Comment 1 R P Herrold 2001-03-06 16:51:25 UTC
Created attachment 11883 [details]
Anaconda traceback

Comment 2 R P Herrold 2001-03-06 16:53:00 UTC
Crossreference is:

  http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=30250

Comment 3 Michael Fulbright 2001-03-06 20:12:56 UTC
Try a minimal install (do custom, then unselect everything).

Does that work?

Sounds like it died trying to umount your CD.

Comment 4 R P Herrold 2001-03-07 00:25:53 UTC
wow -- I'll try it tomorrow at the office (where the host is)

Comment 5 R P Herrold 2001-03-07 00:34:36 UTC
The system turns out to be wholly unbootable after the stop in question --

The RFE of my prior request:
   http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=24233
to commit the kernel immediately after it is transferred would have avoided
this, and allowed recovery

-------------------------------

Comment 6 R P Herrold 2001-03-07 20:55:55 UTC
Performed minimal install ... had the CD retry issue three times, but the
previously outlined 4 step method - wait 5 minutes, retry, retry again
immediately - succeeds ...

something like 100 packages and REALLY bare bones -- I like!
 
[root@dhcp244 sysconfig]# rpm -qa | grep -v openssh | grep -v wget  | wc
    117     117    1887
[root@dhcp244 sysconfig]#                                                       

-------------------------

AFTER rebooting and rebooting a second time to confirm it would, I tried to
mount the burned CD -- while the command completed without error, no media
mounted.
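
One way to confirm whether a mount really took effect, rather than relying on the exit status of the mount command, is to look for the mount point in /proc/mounts. A minimal sketch, assuming a POSIX shell; the function name is_mounted and the /mnt/cdrom path are illustrative and do not come from this report:

```shell
#!/bin/sh
# Return success if the given mount point appears in a mounts table.
# The second argument is an assumption added for illustration so the
# function can be pointed at a sample file; it defaults to /proc/mounts.
is_mounted() {
    mountpoint="$1"
    table="${2:-/proc/mounts}"
    # Field 2 of each /proc/mounts line is the mount point.
    awk -v mp="$mountpoint" '$2 == mp { found = 1 } END { exit !found }' "$table"
}

# Hypothetical usage (path is an assumption):
#   mount /mnt/cdrom; is_mounted /mnt/cdrom || echo "no media mounted"
```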

I switched to an Official RH 7.0 CD 1, and it mounted without issues.

-------------------------

This still looks like AIC-7xxx driver retry logic issues with burned media

Comment 7 Michael Fulbright 2001-03-09 16:08:51 UTC
Including kernel people on this - I think it's worth a good look although it
sounds like there may be some media issues, since having 7.0 work could just
mean the RC2 CD is bad.

Comment 8 R P Herrold 2001-03-10 02:12:29 UTC
FYI ... I have installed with those same CDs on 4 other hosts since then with
no problem (non-scsi hardware ...)



Comment 9 Michael K. Johnson 2001-03-13 04:18:59 UTC
Could you post a complete hardware profile?

We have had some similar reports with IDE CDROM devices that do
not like IDE DMA; we have recently added an ide=nodma boot option
for these machines.  If you could test the latest tree and tell us
if the ide=nodma option fixes this bug, that would be great.  If so,
then if you can give us the exact contents of /proc/ide/hdX/model
(where X is the appropriate letter for your CDROM drive, of course)
we can try "blacklisting" that drive and see if that fixes the
problem.
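
Gathering the model strings requested above could be sketched as follows. This assumes a POSIX shell and the /proc/ide layout named in this comment; the function name list_ide_models and its optional directory argument are illustrative only:

```shell
#!/bin/sh
# Print the model string of every IDE device under a /proc/ide-style tree.
# The directory argument is an assumption added for illustration so the
# function can be run against a sample tree; it defaults to /proc/ide.
list_ide_models() {
    base="${1:-/proc/ide}"
    for f in "$base"/hd*/model; do
        # Skip the unexpanded glob when no IDE devices are present.
        [ -r "$f" ] || continue
        dev=$(basename "$(dirname "$f")")
        printf '%s: %s\n' "$dev" "$(cat "$f")"
    done
}

# Hypothetical usage: list_ide_models
```

On a machine with no IDE devices, such as an all-SCSI host, the function prints nothing.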

Comment 10 R P Herrold 2001-03-13 04:23:11 UTC
ummm ... 
7.1 RC1 - HP NetServer 5/100 failure - onboard dual Adaptec AIC-7xxx module type
controller -- all SCSI HD and CDRom Drive -- Pent class -- 192M ram -- Cirrus 1M
onboard VGA --- SCSI-II drives -- SCSI DAT
-------------------------

This ticket was the child of another issue ... see the top of the history.

Comment 11 Arjan van de Ven 2001-03-13 11:06:37 UTC
Doug, can this be a driver issue ?

Comment 12 R P Herrold 2001-03-14 03:48:07 UTC
Installed QA0309 upgrade from 7.1RC2 -- locally burned media of same batch of
IBM marque blanks -- NOT A SINGLE read error in the whole upgrade with this
NetServer 5/100 LH --


VFS: Mounted root (ext2 filesystem).
SCSI subsystem driver Revision: 1.00
(scsi0) <Adaptec AIC-7855 SCSI host adapter> found at PCI 0/5/0
(scsi0) Narrow Channel, SCSI ID=7, 3/255 SCBs
(scsi0) Downloading sequencer code... 415 instructions downloaded
(scsi1) <Adaptec AIC-7855 SCSI host adapter> found at PCI 0/6/0
(scsi1) Narrow Channel, SCSI ID=7, 3/255 SCBs
(scsi1) Downloading sequencer code... 415 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.3/5.2.0
       <Adaptec AIC-7855 SCSI host adapter>
scsi1 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.3/5.2.0
       <Adaptec AIC-7855 SCSI host adapter>
  Vendor: SEAGATE   Model: ST39140WC         Rev: 1498
  Type:   Direct-Access                      ANSI SCSI revision: 02

=====================

and yet at my office with a later release NetServer 5/100 LC, it would not even
load the AIC-7xxx drivers and did not spot the SCSI HD and CD-Rom drive ...

Comment 13 Doug Ledford 2001-03-15 04:13:49 UTC
As far as reading CD media is concerned, that is definitely a CD drive issue.
It's not an aic7xxx issue nor a cd-rom driver or other kernel issue, as neither
the upper kernel code nor the aic7xxx driver ever knows whether burned or pressed
media is in use.  What it really points to is that the optics in the CD drive
are likely getting pretty marginal and either need cleaning or are getting ready
to go out of adjustment, or else the CD-ROM is old enough that it simply has a
hard time reading burned media as opposed to pressed media.

That is a totally separate issue from the machine at the office not seeing a
CD-ROM and a hard drive though.  For that one, I would need to know what was
seen on that machine by lspci and also I would need the aic7xxx bootup messages.
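
Collecting the requested information might look like the sketch below. It assumes a POSIX shell with grep -E; the function name filter_scsi_messages and the patterns are illustrative, modelled on the boot messages quoted earlier in this report rather than on any official list of driver output:

```shell
#!/bin/sh
# Keep only lines that look like aic7xxx / SCSI host adapter boot messages.
# Reads kernel log text on stdin and writes matching lines to stdout.
filter_scsi_messages() {
    grep -i -E 'aic-?7|^\(scsi[0-9]+\)|^scsi[0-9]+ :'
}

# Hypothetical usage:
#   /sbin/lspci -v
#   dmesg | filter_scsi_messages
```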


Comment 14 R P Herrold 2001-03-15 04:21:39 UTC
as to the 2001-03-14 23:13:49 comments -- okay --- but literally they were
blanks n, n+1, n+2, ... n+7 -- and no other changes in hardware ... with zero
errors on QA0309 ...

------------------------------

lspci and detect info on N/S 5/100 LC tomorrow with QA0309

Comment 15 R P Herrold 2001-03-15 21:02:23 UTC
Installing another burn of QA0309 on the N/S 5/100 LC --- it was able to boot
from the CD to the extent of offering install images -- but again, did not offer
the CD option ... only the FTP, NFS and HTTP ones ..

Started another time, having DD'd a boot.img from /images/ and it offered the CD
media ...  Seriously weird -- it is like the autoboot image is more fragile than
the boot.img version.
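
For reference, writing a boot floppy from the image with dd typically looks like the sketch below. The mount point /mnt/cdrom and the floppy device /dev/fd0 are assumptions; the report only says that boot.img came from /images/ on the CD.

```shell
#!/bin/sh
# Copy a boot image onto a target device or file with dd.
# Both paths are supplied by the caller; nothing here is specific to the
# hardware in this report.
write_boot_image() {
    src="$1"
    dst="$2"
    # 1440k matches the capacity of a 3.5" high-density floppy.
    dd if="$src" of="$dst" bs=1440k 2>/dev/null
}

# Hypothetical usage (paths are assumptions):
#   write_boot_image /mnt/cdrom/images/boot.img /dev/fd0
```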

-----------

Doing an install so that I may get the lspci ... there were no install time
aic7xxx messages -- I assume you want post-install messages ...

---------------------

oops ... the old hard drive was done with mkfs, and was transferring the install
image, and it tunnelled in with a head crash and then spindle stop --- it is
dead dead dead ... 

I'll find another drive ....


Comment 16 R P Herrold 2001-03-27 05:15:40 UTC
Same NetServer 5/100 LC at my office with the Adaptec controller onboard, and
QA0322 images -- 2001-03-15 16:02:23 error report continues --

It _boots_ from SCSI CD at id 4 -- It boots from floppy, but in neither case
does it pick up that the SCSI CD (in one case, the one that it BOOTED from) is
present - I did not have time to manually insert the aic7xxx driver -- will do
so tomorrow ...

Comment 17 R P Herrold 2001-03-28 13:54:05 UTC
LSPCI and onboard controller data on the NS 5/100 LC:
---------------------------------------------------------------------

[herrold@dhcp229 herrold]$ /sbin/lspci -v
00:00.0 Host bridge: Intel Corporation 82434LX [Mercury/Neptune] (rev 11)
        Flags: bus master, slow devsel, latency 32

00:03.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 01)
        Flags: bus master, medium devsel, latency 66, IRQ 11
        Memory at fecff000 (32-bit, prefetchable) [size=4K]
        I/O ports at fce0 [size=32]
        Memory at fed00000 (32-bit, non-prefetchable) [size=1M]
        Expansion ROM at <unassigned> [disabled] [size=1M]

00:04.0 Non-VGA unclassified device: Intel Corporation 82375EB (rev 03)
        Flags: bus master, medium devsel, latency 248

[herrold@dhcp229 herrold]$ cat /proc/scsi/
aic7xxx  scsi
[herrold@dhcp229 herrold]$ cat /proc/scsi/aic7xxx/0
Adaptec AIC7xxx driver version: 5.2.4/5.2.0
Compile Options:
  TCQ Enabled By Default : Enabled
  AIC7XXX_PROC_STATS     : Enabled

Adapter Configuration:
           SCSI Adapter: Adaptec AIC-7770 SCSI host adapter
                           Narrow Controller Channel A at EISA slot 11
    Programmed I/O Base: bc00
    BIOS Memory Address: 0x000cc000
 Adapter SEEPROM Config: SEEPROM not found, using defaults.
      Adaptec SCSI BIOS: Disabled
                    IRQ: 15
                   SCBs: Active 0, Max Active 32,
                         Allocated 62, HW 4, Page 255
             Interrupts: 22617 (Level Sensitive)
      BIOS Control Word: 0x0000
   Adapter Control Word: 0x6767
   Extended Translation: Disabled
Disconnect Enable Flags: 0xffff
 Tag Queue Enable Flags: 0x0004
Ordered Queue Tag Flags: 0x0004
Default Tag Queue Depth: 32
    Tagged Queue By Device array for aic7xxx host instance 0:
      {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
    Actual queue depth per device for aic7xxx host instance 0:
 
Statistics:
 
(scsi0:0:2:0)
  Device using Narrow/Sync transfers at 10.0 MByte/sec, offset 15
  Transinfo settings: current(25/15/0/0), goal(25/15/0/0), user(25/15/0/0)
  Total transfers 22488 (16620 reads and 5868 writes)
             < 2K      2K+     4K+     8K+    16K+    32K+    64K+   128K+
   Reads:    2068      81   13140     429     292     610       0       0
  Writes:    2490     572    1647     795     149     215       0       0
 
 
(scsi0:0:4:0)
  Device using Narrow/Sync transfers at 10.0 MByte/sec, offset 15
  Transinfo settings: current(25/15/0/0), goal(25/15/0/0), user(25/15/0/0)
  Total transfers 51 (51 reads and 0 writes)
             < 2K      2K+     4K+     8K+    16K+    32K+    64K+   128K+
   Reads:       0      42       3       2       2       2       0       0
  Writes:       0       0       0       0       0       0       0       0
 
 
[herrold@dhcp229 herrold]$

Comment 18 R P Herrold 2001-03-28 16:07:58 UTC
Update:  QA0327 -- attempted CD boot -- HP NS 5/100 LC  boots, but does not
recognize that it has a SCSI CD present - and so offers only FTP/NFS etc options
-----------------
Trying floppy ...

Comment 19 R P Herrold 2001-03-28 16:32:23 UTC
QA0327 - update -- booted from boot.img floppy _with the SCSI drive EMPTY_ --
boot order has been set Flop/CD/HD, but it was seemingly NOT booting from the
floppy otherwise !!!  and THEN inserted CD-1 at the first TUI prompt (to which
it defaulted -- old Cirrus chipset, covered earlier in this beta cycle in
Bugzilla ...)

... upgrade proceeding ...   fascinating ...

Comment 20 Matt Wilson 2001-03-28 16:46:22 UTC
*** Bug 30250 has been marked as a duplicate of this bug. ***

Comment 21 R P Herrold 2001-03-28 19:26:42 UTC
Completed upgrade QA0327 on the HP NS 5/100 LC without incident -- trying an
install now ...

Comment 22 R P Herrold 2001-03-28 21:02:05 UTC
Completed cold partition and install QA0327 on the HP NS 5/100 LC without
incident -- This indicates that the Pentium based onboard controller issues
with AIC7xxx are manageable with a floppy based boot.  See next paragraph.
----------------------------------------

Should I separately Bugzilla the non-identification on the 5/100 of the SCSI
controller, the specs of which I posted earlier today, in CD boot mode?  

It is not PCI, but EISA - there are tons of these in the HP server area -- and
HP built a Gazillion of these battleships ... They are great -- and with the
addition of or presence of a section in the README DOCO telling mixed EISA/PCI
chassis to just use the Floppy in the fashion I outlined, there is no disabling
problem with the QA0327 issue, in the Pentium 5/100 series.


===================================================

I am next going to test on the HP NetServer 4/66 which was also having the long
hang issues, and was the original genesis of this ticket.

Comment 23 Arjan van de Ven 2001-03-28 21:08:28 UTC
Please do a separate bug for the EISA/PCI issue.

Comment 24 R P Herrold 2001-04-08 17:09:18 UTC
... Need to annotate in EISA here --- RPH todo

Comment 25 R P Herrold 2001-09-28 03:53:25 UTC
RH 7.2 gold -- did an install with no issues from CD -- OK to close as far as I
am concerned ...