Bug 52575

Summary: kernel panic on startup but install works
Product: [Retired] Red Hat Linux
Reporter: Steve Romero <romstev>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED CURRENTRELEASE
QA Contact: Brock Organ <borgan>
Severity: high
Priority: medium
Version: 9
CC: cnkeller
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2003-06-05 17:17:01 UTC
Attachments:
This is data that I was able to copy down after one of the events, all other events spew similar data (Flags: none)

Description Steve Romero 2001-08-25 18:52:45 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux 2.4.3-SGI_XFS_1.0.1_PR3smp i686;
en-US; rv:0.9.1) Gecko/20010622

Description of problem:
After an install the system will puke on every attempt to start up with a
kernel panic, "In interrupt handler - not syncing".  System is a Sony
PCG-FX215 notebook.  Previous installs of RH 7.1 with kernel versions 2.4.2
and 2.4.3 ran fine, as did SGI XFS releases 1.0 and 1.0.1.

Version-Release number of selected component (if applicable): 2.4.7


How reproducible:
Always

Steps to Reproduce:
1. Install the OS (the selected features are not important)
2. Reboot, observe...

Actual Results:  4 installs (reformatted each time), and the results were the
same each time.

Expected Results:  system should have booted the OS

Additional info:

Comment 1 Steve Romero 2001-08-25 18:53:55 UTC
Created attachment 29554
This is data that I was able to copy down after one of the events, all other events spew similar data

Comment 2 Arjan van de Ven 2001-08-25 18:56:00 UTC
Is this the 2.4.6-3.1 or 2.4.7-2 kernel ?


Comment 3 Steve Romero 2001-08-25 19:16:19 UTC
This is the 2.4.7-2 kernel, Roswell Beta 2

Comment 4 Arjan van de Ven 2001-08-26 13:25:28 UTC
Does it still do this if you boot with "apm=off" on the commandline of the
kernel ?


Comment 5 Arjan van de Ven 2001-08-26 13:34:31 UTC
Is this the vaio with the Athlon/Duron CPU ?

Comment 6 Steve Romero 2001-08-26 22:03:48 UTC
Sorry, but I'm unable to try the "apm=off" thing; I had to get my laptop ready for
work Monday...

Also, yes, this laptop has the Duron CPU; /proc/cpuinfo follows:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 6
model		: 3
model name	: AMD Duron(tm) Processor
stepping	: 1
cpu MHz		: 800.052
cache size	: 64 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips	: 1595.80
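
As a side note, output like the above can be checked mechanically for the CPU family that the later "noathlon" discussion turns on. The following is a minimal sketch only: the function names and the trimmed SAMPLE text are illustrative assumptions, and the check relies on AMD family 6 covering the Athlon/Duron generation.

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into a list of dicts, one per processor."""
    cpus, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line separates processor entries.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon (e.g. wrapped flags) are skipped
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def is_amd_k7(cpu):
    """True for AMD family 6 parts (the Athlon/Duron generation)."""
    return cpu.get("vendor_id") == "AuthenticAMD" and cpu.get("cpu family") == "6"


# Trimmed copy of the fields reported in this comment (illustration only).
SAMPLE = (
    "processor\t: 0\n"
    "vendor_id\t: AuthenticAMD\n"
    "cpu family\t: 6\n"
    "model\t\t: 3\n"
    "model name\t: AMD Duron(tm) Processor\n"
)

if __name__ == "__main__":
    for cpu in parse_cpuinfo(SAMPLE):
        print(cpu["model name"], "-> AMD family 6:", is_amd_k7(cpu))
```

On a live system the same functions could be fed the contents of /proc/cpuinfo instead of SAMPLE.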

Comment 7 Christopher Keller 2001-08-27 15:48:27 UTC
Same problem.

Athlon 1.33GHz/266

KT7A-RAID, two hard disks, one on /dev/hde (seawolf) and one on /dev/hdg (roswell),
CDRW on /dev/hda. Note, mobo BIOS is latest -- 3R. Disks are not in a RAID
configuration, merely using the controller.

This was the original release; I wasn't aware whether Beta 2 included new ISOs or
everything was via up2date.

Note that this is a new system configuration and upgrading kernels on the 7.1
disk is giving me the same problem, i.e. kernel panic on startup. The default 7.1
kernel works fine. Only upgrades cause the kernel panic. I believe this is a
kernel RAM disk issue, though I've never had to run mkinitrd before.


Comment 8 Arjan van de Ven 2001-08-27 15:51:51 UTC
cnkeller: did you upgrade to the i686 or to the athlon kernel ?

Comment 9 Christopher Keller 2001-08-28 18:02:38 UTC
arjanv:

Roswell: since it doesn't boot, I haven't upgraded anything obviously, nor have
I burned any copies of the newer roswell releases. This appears to be a kernel
or filesystem based problem.

On 7.1: I've tried 2.4.3 (failed) and 2.4.7-2 (failed). 2.4.2 (default) works fine.

It fails with LILO or GRUB and fails with or without a kernel ramdisk.

KT7A-RAID -- latest BIOS -- 3R
HPT370: two disks (/dev/hde & /dev/hdg non-RAID) with /dev/hda being a CDROM

The error message is along the following lines: 232k freed, unable to open file
system on /dev/hde, kernel panic, try passing init= to the kernel. This is with
a kernel ramdisk created. Note that on the working 2.4.2 kernel, no ram disk is
used.

If you need the exact error message, I can copy it down tonight.

Comment 10 Christopher Keller 2001-08-28 18:13:36 UTC
Arjan, sorry, I also forgot to add something. On the roswell list, someone is seeing
the exact behavior I am with an Asus A7V. I guess the Highpoint controller and
Promise controllers are sharing the same code?


Comment 11 Arjan van de Ven 2001-08-28 18:47:09 UTC
> the exact behavior I am with an Asus A7V. I guess the Highpoint controller and
> Promise controllers are sharing the same code?

They most likely share the same bug: the BIOS marks the disk as RAID and it is then
handled as a RAID partition, not a normal one.

However this is unrelated to the vaio oops I think.

Comment 12 Arjan van de Ven 2001-08-28 18:49:55 UTC
Promise Fasttrak(tm) Softwareraid driver for linux version 0.02
No raid array found
Highpoint HPT370 Softwareraid driver for linux version 0.01
No raid array found


should be what the kernel says. If it DOES find arrays, you (or the bios) have
made raid partitions and then ignored them. Can you please check this ?
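
The check requested here can be sketched against a saved copy of the boot messages. This is only an illustration under stated assumptions: the function name is made up, the banner text is matched on the substring "Softwareraid driver" as quoted in this comment, and any banner not immediately followed by "No raid array found" is merely flagged for a closer look, since the thread does not show what a successful detection prints.

```python
def raid_banners_needing_check(log_text):
    """Map each software-RAID driver banner to True if it is NOT
    immediately followed by 'No raid array found'."""
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    results = {}
    for i, line in enumerate(lines):
        if "Softwareraid driver" in line:
            following = lines[i + 1] if i + 1 < len(lines) else ""
            results[line] = following != "No raid array found"
    return results


# The expected output quoted in this comment.
EXPECTED_LOG = """\
Promise Fasttrak(tm) Softwareraid driver for linux version 0.02
No raid array found
Highpoint HPT370 Softwareraid driver for linux version 0.01
No raid array found
"""

if __name__ == "__main__":
    for banner, needs_check in raid_banners_needing_check(EXPECTED_LOG).items():
        print(("CHECK  " if needs_check else "ok     ") + banner)
```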

Comment 13 Arjan van de Ven 2001-08-28 20:23:03 UTC
cnkeller: can you try the 2.4.3-17 kernel at
http://people.redhat.com/arjanv/testkernels ?
it's the same as 2.4.3-12 with one change; I'd like to see if it makes a difference
(and with a reiserfs bug fixed)

Comment 14 Christopher Keller 2001-08-28 20:36:14 UTC
I will check the kernel message regarding "No raid array found" as soon as I get
home tonight. I do not ever recall making a RAID, so if it does find the array,
it's news to me. :-) 

I will also download the 2.4.3-12 kernel tonight and get back to you guys ASAP.
It's starting to become apparent that this is a different bug than the VAIO,
should we open a new bug ID?


Comment 15 David Crim 2001-08-28 21:57:39 UTC
It sounds like cnkeller's problem is the same as 62651.

Comment 16 Christopher Keller 2001-08-28 22:37:39 UTC
That's 52651 in case you were really confused (as I was).

Comment 17 Christopher Keller 2001-08-29 01:13:06 UTC
Arjan,

Using the 2.4.3-17 kernel results in the following:

.
.
.
md.c: sizeof(mdp_super_t) = 4096
autodetecting RAID array
autorun...
...autorun DONE
NET4: Linux TCP/IP 1.0 for NET4.0
.
.
.
EXT2-fs: unable to read superblock
isofs_read_super: bread failed, dev=21:0a, iso_blknum=16, block=32
kernel panic: VFS: Unable to mount root fs on 21:0a

Comment 18 Christopher Keller 2001-08-29 02:19:38 UTC
Okay, compiled the 2.4.7-2 kernel from rawhide. I had very high hopes when I saw
the software RAID with HPT370 option. No dice, same result of being unable to
read the superblock of EXT-2. The error messages were similar to those on bug
52651. I tried with and without a kernel ramdisk. 

Also, it did print out the message "No raid array found" after the Highpoint
HPT370 message. 


Comment 19 Christopher Keller 2001-09-10 16:25:31 UTC
Arjan,

Sorry for the long delay. I've been testing a few of your kernels from your
homepage. It seems that you've fixed the problem. What was the problem, and can I
assume the fix will make it into the final 7.2? I believe the latest one I'm using
is -19. At any rate, this bug can now be closed from my point of view, though it
turns out that my situation was a little different from the original report.



Comment 20 Doug Smith 2001-11-06 21:22:07 UTC
This sounds similar to a problem I am having.  I bought a Sony Vaio FXA32 last
night and 7.2 panics when the kernel loads.  Something about a problem reading
virtual memory.  I loaded 7.1 and it works.  I can give more info but I would
have to load 7.2 and reload 7.1 etc. etc.  Let me know if you need me to.  This
is an AMD Duron.

Found this information which might be it.  Have to find time to try this.
http://www.uwsg.iu.edu/hypermail/linux/kernel/0110.0/0048.html

I also found another person with this problem but no solution.
http://marc.theaimsgroup.com/?l=linux-kernel&m=100482241923313&w=2

Any ideas of where to go from here would be great or maybe someone already has
more info on this...

Comment 21 Christopher Keller 2001-11-06 22:06:44 UTC
We're actually having the same problem with enigma on a PCG-FX215 (AMD Duron I
imagine). I'm guessing the way to solve it is by using the i686 kernel?

Comment 22 Christopher Keller 2001-11-06 22:14:52 UTC
Update: by passing "noathlon" on the kernel line, the machine booted fine.
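
For anyone scripting this workaround, the idea of adding a parameter such as "noathlon" (or the "apm=off" suggested earlier in this thread) to the kernel line can be sketched as below. This is a hedged illustration only: the helper name and the example command line with root=/dev/hda1 are hypothetical, and the code just manipulates a string; it does not edit any LILO or GRUB configuration.

```python
def add_kernel_param(cmdline, param):
    """Return cmdline with param appended, unless it is already present."""
    parts = cmdline.split()
    if param not in parts:
        parts.append(param)
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical existing kernel arguments, for illustration only.
    existing = "ro root=/dev/hda1"
    print(add_kernel_param(existing, "noathlon"))
```

Checking for the parameter before appending keeps the helper idempotent, so running it twice on the same line does not duplicate the option.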

Comment 23 Doug Smith 2001-11-07 18:45:52 UTC
Passing "noathlon" on the kernel line fixed my machine too, WooHoo!  Thanks
cnkeller.  I would think this will be a problem with all Sony Vaios (maybe
specific to Athlon chip).  I wonder if FX215 is really an Athlon or Intel
though, I was told FXA=AMD and FX=Intel.  

I'm not sure how cnkeller came across this fix, but after many hours I never
found it until this thread and his post.  A search for "noathlon" on RH and
Google only turns up kernel release notes.  I wish I could make this information
more available as I'm sure it affects all Sony Vaio owners installing 7.2.

I'm a bit confused whether this problem relates to the original problem
submitted.  I think maybe something should be added to the RH 7.2 documentation
about this fix.

Comment 24 Arjan van de Ven 2001-11-07 18:48:40 UTC
The noathlon _is_ in the documentation, in fact, it's even in the release notes
on every CD, and also in the release notes that you can view during the
installation ;(

Comment 25 andre azaroff 2002-06-14 13:08:35 UTC
unfortunately the installer kernel will not run on my new hardware. here is a
copy of the email i sent to the valhalla-list

i will try passing the noathlon to the boot kernel and see if this corrects the
problem.


when i try to run the installer the kernel panics and of course shuts down.

my new system
tyan s2460 motherboard
dual athlon mp 1800+
lsi logic lsiu160 ultra 160 scsi card (64 bit pci)
4x 9 gig quantum disks
24x ide cdrom (master on primary ide controller)
dlink systems gigabit ethernet dge 500t (32 bit pci)
1 gig ram (registered ecc pc 2100 sdram)


in the bios all on board peripherals are turned off (both serial, all four usb,
parallel port, floppy, and secondary ide) except primary ide

the following is left on display when kernel panics (unfortunately i cannot
scroll back so this is all i could get):
pde=00000000
oops: 0000
cpu 0
eip: 0010:[<c011458e>] not tainted
eflags: 00010286
eip is at 2.4.18-3boot
eax:33eca000  ebx:c0265f18  ecx:f7ecbfa0  edx:c025dc08
esi:33eca000  edi:00000000  ebp:c026f28  esp:c0265f14
ds:0018  es:0018  ss:0018
process swapper pid:0   stack page=c0265000

stack
00000000    00000001    33eca000    c011e83c    00000000    c0265f38    c011469f    33eca000
00000000    00000046    c011e6eb    33eca000    00000000    c0290e70    00000000    00000046
c011b813    0011b749    00000000    00000001    c0290e80    fffffffe    c011b58    c0290e80

call trace [<c011e83c>]
[<c011469f>]
[<c011e6eb>]
[<c011b813>]
[<c011b749>]
[<c011b58f>]
[<c01099b8>]
[<c0105000>]
[<c010bea8>]
[<c0106b74>]
[<c0105000>]
[<c0106b97>]
[<c0106bdc>]

code 8b 56 20 8d 04 d2 8d 04 c2 8d 04 42 c1 ed 04 8d 88 00 05 29

interrupt handler - not synching

i read in the errata and found that the 2.4.18-3 kernel has a race condition
that will cause a panic with ext3 filesystem. since i have nothing installed on
this machine yet i don't know if this is the problem.
i have downloaded the latest iso disk images but the checksum matches the last
image i downloaded so i don't believe the 2.4.18-4 kernel is included and
without an install i can't apply the patches.
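
The bracketed addresses in an oops like the one above only become readable once they are matched against the symbol table of the exact kernel that crashed, which is roughly what a tool such as ksymoops does with System.map. The sketch below shows the lookup idea only; the function names are assumptions, and the two-entry symbol table uses placeholder names and addresses, not real 2.4.18-3boot symbols.

```python
import bisect
import re


def trace_addresses(oops_text):
    """Extract every [<xxxxxxxx>] address from hand-copied oops text."""
    return [int(m, 16) for m in re.findall(r"\[<([0-9a-fA-F]{8})>\]", oops_text)]


def resolve(addr, symbols):
    """Resolve addr against a sorted list of (start_address, name) pairs.

    Returns 'name+0xoffset', or None if addr lies below the first symbol.
    """
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return f"{name}+0x{addr - start:x}"


if __name__ == "__main__":
    text = "eip: 0010:[<c011458e>] not tainted\ncall trace [<c011e83c>]\n[<c011469f>]"
    # Placeholder table: real entries would come from the matching System.map.
    table = [(0xC0114500, "hypothetical_a"), (0xC011E800, "hypothetical_b")]
    for addr in trace_addresses(text):
        print(f"{addr:08x}", resolve(addr, table))
```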



Comment 26 Arjan van de Ven 2002-06-14 13:13:08 UTC
andreazaroff: that looks like a different issue :(

Comment 27 andre azaroff 2002-06-24 13:15:48 UTC
yes it is a different issue.

the noathlon parameter did nothing when passed to the installer kernel (thanks
for the email :( but i did read the docs prior to posting), as i suspected it
would not, since the docs all refer to a post install situation as is everyone
else commenting on this bug. i probably should have posted in another place or
not at all.

for whatever it is worth
i resolved the problem. it seems to have been hardware related. upon careful
observation i noticed the kernel panic always happened at the same place in the
install, right after it probed the scsi adapter for devices and found
them. there was a bunch of text that scrolled off of the screen and went by so
fast i could not see it clearly. i think it was related to the panic, though i
can not really say for sure.

one of two things could have resolved the problem. since i did them at the same
time and the installer ran afterwards, i really don't know which it was.


on the tyan motherboard there is a bios setting called "smp version". it can be
set to either 1.4 or 1.1. i changed the setting to 1.1 as the description stated
that some operating systems need the 1.1 setting to operate properly

i used the scsi setup utility to initialize all of my drives.
my guess would be the smp version, then again it may just have been coincidence.

since it installed, my system has been up for 5 days without a hiccup running an
athlon optimized kernel (no noathlon parameter).

Comment 28 Alan Cox 2003-06-05 17:17:01 UTC
Closing. Assuming this was the VIA chipset flaw, which is worked around by all current errata.