Bug 51175 - RH 7.1 installer produces boot diskette that is unbootable for aic7xxx systems
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-08-08 02:46 UTC by Red Hat Bugzilla
Modified: 2008-03-13 19:18 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-06-06 16:38:46 UTC
Embargoed:


Attachments
anaconda dump (15.24 KB, text/plain)
2001-09-19 00:46 UTC, Andreas Ott

Description Red Hat Bugzilla 2001-08-08 02:46:00 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.78 [en] (X11; U; OSF1 V5.1 alpha)

Description of problem:
System:

C440GX+ Production Release 7
BIOS Build 104

Dual (PIII) Xeon 550

AIC7896 v2.20S1B1

SCSI devices:

Attached devices: 
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: SEAGATE  Model: ST39236LC        Rev: 0004
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: SEAGATE  Model: ST39236LC        Rev: 0004
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: ESG-SHV  Model: SCA HSBP M6      Rev: 0.63
  Type:   Processor                        ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: TOSHIBA  Model: CD-ROM XM-6401TA Rev: 1001
  Type:   CD-ROM                           ANSI SCSI revision: 02


Description:
=========

I encountered the hang while loading the aic7xxx driver during the initial
install.  I read the "7.1 gotchas", downloaded the alternate boot diskette
image for the installer, booted from that using "linux apic", and was able to
proceed through the install.

The RH 7.1 installer automatically elected to install the 2.4.2-2smp kernel,
though it did not show up as "checked" in the graphical installer.  It also
installed the 2.4.2-2 kernel, which I expected.  The installer made the
2.4.2-2smp kernel the default.

Booting from that default (2.4.2-2smp) kernel works fine.

If I try to boot from the `linux-up' entry (the 2.4.2-2 kernel) *or* the boot
diskette that was made at the end of the RH 7.1 install, however, I get SCSI
timeouts for everything that's probed, and the boot fails.  The timeouts look
like:

scsi: aborting command due to timeout: pid 0, scsi0, channel 0, id N, lun 0
Inquiry 00 00 00 ff 00

If there is a device at whatever id is being probed (in this case scsi0 ids
0, 1, 6 and scsi1 id 0) then there's a single timeout for that id and then the
device inquiry info shows.  If there is no device at the id being probed, the
timeout shows up twice for that ID.
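
The one-timeout versus two-timeout pattern described above can be checked
mechanically against a captured boot log. The following is only a rough
sketch in Python: the regular expression is based on the single message
format quoted in this report (other driver versions may print it
differently), and the sample log text and the function names count_timeouts
and classify are made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the message quoted in this report; treat it as an
# assumption, since other kernel/driver versions may format it differently.
TIMEOUT_RE = re.compile(
    r"scsi: aborting command due to timeout: pid \d+, "
    r"scsi(\d+), channel (\d+), id (\d+), lun (\d+)"
)


def count_timeouts(log_text):
    """Count timeout messages per (host, channel, id, lun) tuple."""
    counts = Counter()
    for match in TIMEOUT_RE.finditer(log_text):
        counts[tuple(int(group) for group in match.groups())] += 1
    return counts


def classify(counts):
    """Apply the pattern observed in this report: exactly one timeout means a
    device answered afterwards, more than one means nothing is at that ID."""
    return {
        key: ("device present" if n == 1 else "no device")
        for key, n in counts.items()
    }


# Hypothetical log excerpt, written in the format quoted above.
sample = """\
scsi: aborting command due to timeout: pid 0, scsi0, channel 0, id 0, lun 0
Inquiry 00 00 00 ff 00
scsi: aborting command due to timeout: pid 0, scsi0, channel 0, id 2, lun 0
Inquiry 00 00 00 ff 00
scsi: aborting command due to timeout: pid 0, scsi0, channel 0, id 2, lun 0
Inquiry 00 00 00 ff 00
"""

counts = count_timeouts(sample)
result = classify(counts)
for key in sorted(result):
    print(key, counts[key], result[key])
```

IDs that never time out do not appear in the result at all, which is what a
boot from the working smp kernel would be expected to look like.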

After IDs 0-6 & 8-15 are probed for scsi0 and scsi1, additional SCSI messages
appear:

Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0
scsi: abort pid 0, scsi 0, channel 0, id 0, lun 0 Test Unit Ready 00 00 00 00 00
(scsi0:0:0:0) SCSISIGI 0xb4, SEQADDR 0xbe, SSTAT0 0x0, SSTAT1 0x2
(scsi0:0:0:0) SGCACHEPTR 0x2, SSTAT2 0x0, STCNT 0x0

and then repeating messages about resetting, followed eventually by "trying
harder".  I can post those as well, if they're important.

If I build a boot diskette using the 2.4.2-2smp kernel, the boot diskette
works, so it appears that the reason that both the `linux-up' entry and the
boot diskette fail is that they're both using the stock `2.4.2-2'
uniprocessor kernel.
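
To confirm which kernel image each boot entry actually loads, the labels in
lilo.conf can be mapped to their image= lines. This is a minimal sketch in
Python, not taken from the affected system: the sample configuration, its
paths, and the function name parse_lilo_entries are assumptions for
illustration, and only the plain key=value form of image=, other= and label=
is handled.

```python
def parse_lilo_entries(conf_text):
    """Map each boot label to the kernel image path of its image= section."""
    entries = {}
    image = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "image":
            image = value          # start of a Linux kernel section
        elif key == "other":
            image = None           # non-Linux section, no kernel image
        elif key == "label" and image is not None:
            entries[value] = image
    return entries


# Hypothetical lilo.conf resembling the setup described in this report.
SAMPLE = """\
default=linux
image=/boot/vmlinuz-2.4.2-2smp
    label=linux
    initrd=/boot/initrd-2.4.2-2smp.img
image=/boot/vmlinuz-2.4.2-2
    label=linux-up
    initrd=/boot/initrd-2.4.2-2.img
"""

entries = parse_lilo_entries(SAMPLE)
for label, image in entries.items():
    print(label, "->", image)
```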

The main reason I'm reporting all this is that I think the Red Hat 7.1
"gotchas" page that talks about how to work around the installer problem
should be updated to indicate that people should be especially sure to test
their boot diskette after the RH 7.1 install, to make sure it works!
Otherwise, they might be in for a nasty surprise if they need to boot from
their installer-created boot diskette, and it doesn't work.

How reproducible:
Always

Steps to Reproduce:
Install RH 7.1 on a dual processor box with the AIC 7896 SCSI controller,
using the workaround described in the RH 7.1 gotchas page.  The install and
subsequent reboots *from the default smp kernel* work fine.

Reboots from the `linux-up' kernel or the kernel on the boot diskette (same
kernel) fail, with SCSI aborts and timeouts.

Actual Results:  Couldn't boot off the installer-generated boot diskette or
the linux-up boot choice.

Expected Results:  Should be able to boot off of any of the kernel images.

Additional info:

Comment 1 Red Hat Bugzilla 2001-08-08 08:29:06 UTC
Thanks for the report.

First of all, the real cause is a bug in the bios, for which your hardware
vendor should provide an update. (Several bios writers for the 440GX promised
us an updated bios when we proved the bug to them.)

Second, I would recommend updating to the 2.4.3-12 kernel, which has "noapic"
by default, and using that to create a boot floppy.

And indeed, it would be good if this were documented on the gotchas page, and
I'll figure out a way to get this info there.

Comment 2 Red Hat Bugzilla 2001-08-08 21:34:07 UTC
arjanv-

Thanks for your comments and information.

I had the i686 kernel updates from updates.redhat.com, so I loaded the
2.4.3-12, 2.4.3-12smp, and 2.4.3-12enterprise kernels this afternoon on that
system, and added them as boot options to lilo (I made initrd images for each
of them).

The 2.4.3-12 uniprocessor kernel exhibited the same behavior as the 2.4.2-2 up
kernel -- SCSI timeouts.

I then went to Intel's web site, wandered around until I found

	http://developer.intel.com/support/motherboards/server/c440gx/index.htm

and then selected the `Software & Drivers' page and downloaded the `BIOS8'
update for that C440GX+ motherboard.  I updated the BIOS, using the procedure
documented with the updater, and now have a motherboard that reports that it's
"Production Release 8" and "BIOS Build 106".

Even with this BIOS, booting the 2.4.2 or 2.4.3 uniprocessor kernels still
results in the SCSI timeouts.  Using the smp or enterprise kernels from either
2.4.2 or 2.4.3 works as expected -- no timeouts.

Anything else I should check/do/try?

Comment 3 Red Hat Bugzilla 2001-08-09 08:15:19 UTC
We have a list of bioses in the kernel that need the "apic" parameter (as all
these problems are caused by a bios bug), and apparently your bios is not yet on
that list. I need a few bits of information in order to add your bios to the
list; could you please download

http://people.redhat.com/arjanv/dmidecode.c

and as root:
gcc dmidecode.c -o dmidecode
./dmidecode | mail -s "needs apic" hardwarebugs-list

(Note that the latter sends an email with your bios information; I recommend
running the dmidecode program without the | mail part first to check the
information you will send.)
I only need the first 20 lines or so; the rest is not relevant.
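
For anyone who would rather review the output before mailing anything, here
is a small sketch in Python that trims the dump to its first 20 lines. It
assumes the dmidecode binary built above sits in the current directory and is
run as root; the helper names first_lines and preview_dmidecode are
hypothetical and not part of dmidecode itself.

```python
import subprocess


def first_lines(text, n=20):
    """Return the first n lines of text, joined back into one string."""
    return "\n".join(text.splitlines()[:n])


def preview_dmidecode(binary="./dmidecode", n=20):
    """Run the locally compiled dmidecode binary (assumed path, needs root)
    and return only the first n lines of its output for manual review."""
    result = subprocess.run(
        [binary], capture_output=True, text=True, check=True
    )
    return first_lines(result.stdout, n)


# Usage, as root, from the directory containing the compiled binary:
#     print(preview_dmidecode())
```

Nothing is mailed by this sketch; it only prints what would be sent.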

Thanks.

Comment 4 Red Hat Bugzilla 2001-09-11 10:50:51 UTC
To add to this - I have just failed to install RH 7.1 onto an Intel L440GX+ system.

I too have read the gotchas page and the new boot image has allowed me to
install 7.1, with the SMP kernel, however the system is unbootable, as above.
Lots of "aborting command due to timeout" messages. I have also tried the
Enterprise kernel, with the same result.

Bios revision, Production Release 14.3 
AIC 7896 v2.57S2B3. 
scsi id 1: Fujitsu MAJ3182MC
scsi id 2: Fujitsu MAJ3182MC

Obviously the system is nicely unusable now so we will be reverting back to
7.0, but I thought you'd like the info, because it's obviously a long way from
being fixed.

Comment 5 Red Hat Bugzilla 2001-09-11 12:04:43 UTC
This is not something we can FIX. It's an intel bios bug, and Intel will release
a fixed bios soon, if they haven't already.

Comment 6 Red Hat Bugzilla 2001-09-11 12:59:55 UTC
oh right, sorry....got the impression people were actually trying to do
something about it....

No, Intel's latest bios update (April 2001) claims to have all sorts of updates
to the adaptec scsi interface, but it doesn't solve the problem...indeed, Intel
steer well clear of claiming Linux will work on the motherboard and haven't even
tested it!

Comment 7 Red Hat Bugzilla 2001-09-11 13:03:12 UTC
They told us different. And we do try to work around this brokenness, with the
"apic" option etc etc, but there is a limit to what can be done.

Comment 8 Red Hat Bugzilla 2001-09-19 00:45:25 UTC
I am also still struggling to get 7.1 installed on an Intel 440 running 6.2,
for which I have just downloaded a new BIOS from intel.com. It shows now:
L440GX+ Production Release 14.3
BIOS Build 133
Adaptec AIC-7896 SCSI BIOS v257S2B3
 
Apparently no newer BIOS is available.
I've downloaded the http://people.redhat.com/dledford/440gx/bootnet.img disk
and booted with 'linux apic'. This gets me past the aic7xxx loading, but the
installer later terminates with an anaconda error/dump when trying to perform
an 'upgrade'. It lets me select NFS, configure the network etc. and then
produces an anaconda dump (included as an attachment to this comment). There
is not yet a shell on Alt-F2, Alt-F3 displays '* no IDE floppy devices found'
and Alt-F4 displays '<6>cdrom: open failed.' three times.

Comment 9 Red Hat Bugzilla 2001-09-19 00:46:29 UTC
Created attachment 32118
anaconda dump

Comment 10 Red Hat Bugzilla 2001-09-20 20:45:36 UTC
I've also produced a 'dmidecode' dump; if anyone is interested, I'll upload it
as an attachment.

Comment 11 Red Hat Bugzilla 2003-06-06 16:38:46 UTC
Red Hat now uses some alternative fixes we finally managed to get out of Intel.


