Bug 140367
Summary: | FC3 kernel panics with SATA and kernel on install disc
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | 3
Hardware: | x86_64
OS: | Linux
Status: | CLOSED ERRATA
Severity: | medium
Priority: | medium
Reporter: | Matthew E. Lauterbach <lauterm>
Assignee: | Jeff Garzik <jgarzik>
CC: | benny+bugzilla, davej, gajownik, janes.rob, mattdm, michael.wiktowy, peterm, rob, wtogami
Doc Type: | Bug Fix
Last Closed: | 2005-07-20 19:02:15 UTC
Description
Matthew E. Lauterbach
2004-11-22 16:58:27 UTC
Created attachment 107192 [details]
lspci output
Also happens on Nforce2-based systems.

This sounds very much like http://bugzilla.kernel.org/show_bug.cgi?id=3352

I use the workaround patch found at that link, except that the workaround is for nf3 boards. I have the same issue, but with an nf2 board, so it appears to be a problem in the core of the SATA drivers.

Never patched a source rpm before. Is there a link on how to do this properly? I installed the src.rpm and copied the patch file into /usr/src/redhat/SOURCES. Then I added that patch file to the spec file as Patch1154, right after the other SATA patches. Then I ran rpmbuild -bb kernel-2.6.spec --target=i686. It all built fine, but when I looked at the spec file again to double-check my work, my changes were no longer there. So apparently I don't know enough about what I am trying to do.

Why does the installer kernel work just fine while the installed kernel from FC3 final does not? Are they different? By the way, this is happening with both the i686 install and the x86_64 install.

Same problem here with an ASUS AV7333 motherboard and an additional (PCI) SATA controller (Mercury Sata 150 Raid Controller, Silicon Image chipset)...

Using the stock FC3 kernel, it panics with the following message:
----------------------------------------------------------
...
loading jbd.ko module
loading ext3.ko module
creating root device
mounting root filesystem
mount: error 6 mounting ext3
mount: error 2 mounting none
switching to new root
switchroot mount failed: 22
umount /initrd/dev failed: 2
kernel panic - not syncing: attempted to kill init
----------------------------------------------------------
Using the kernel-2.6.9-1.681_FC3 kernel, as suggested by bug 139674, flashes a similar message (too fast to read) and then loops on the following message:
----------------------------------------------------------
atkbd.c: spurious ack on isa0060/serio0.
Some program, like XFree86 might be trying access hardware directly
----------------------------------------------------------
Booting from the DVD in linux rescue mode and running chroot /mnt/sysimage gives me my full filesystem (both IDE and SATA).

Since I couldn't get the patch file to work properly, I directly edited sata_nv.c and re-tarred and bzip2ed the kernel source. I was then able to successfully rebuild the kernel rpm (i686 only so far) to include the potential fix suggested by Benny in comment #3. I've tested the resulting rpm on a working FC3 install. It boots and runs fine there. I will test on the problem machine when I get off work in 2.5 hours.

It seems like there's a potential fix for sata_nv.c. Any ideas for sata_sil, which is what my system uses?

This fixed the issue for me. I edited sata_nv.c as per http://bugzilla.kernel.org/show_bug.cgi?id=3352 (Thank you, Benny). Apparently the Seagate drives don't like that reset command. However, other drives may need it. I am hoping that someone with a little more kernel experience than me will have an idea of how to make a general kernel patch that works with the Seagate drives without breaking anything else.

I'm going to look at sata_sil to see if it is doing the same sort of thing, as per Rob's comment above. Remco, is your drive a Seagate, and are you also using sata_sil?

My drive is a Maxtor. I'm not really sure if I am (was) using sata_sil, since I've just finished reinstalling my system with the boot and root partitions on an old-fashioned IDE disk and the rest on SATA. I do have some logs saved, however. Where can I check what it was using?

The output from dmesg will show if you're still using the sata_sil driver, but I don't think this is the core of the problem. The sata_sil.c in 2.6.9 is the same as in 2.6.8.1, so something else has changed which broke that driver and, apparently, several others as well.
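For anyone wanting to run the dmesg check suggested above against a live system or a saved log, here is a minimal sketch. The function name `sata_drivers_in_log` and the file name `saved-dmesg.txt` are made up for illustration, and the pattern assumes the driver announces itself under a `sata_*` name (or `ata_piix`) somewhere in the log, which may not hold for every controller.

```shell
# Hypothetical helper (not from this bug report): reads kernel log text on
# stdin and prints each libata low-level driver name it mentions, once.
# Assumption: drivers show up under a sata_* name, or as ata_piix.
sata_drivers_in_log() {
    grep -oE 'sata_[a-z0-9_]+|ata_piix' | sort -u
}

# Typical use on a running system:
#   dmesg | sata_drivers_in_log
# or against a saved log (file name is only an example):
#   sata_drivers_in_log < saved-dmesg.txt
```

If nothing is printed, the log contains no such driver name, which would suggest the driver was never loaded in that boot.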
Well, after booting the rescue disk and chrooting into the old install, doing an rpm --nodeps on all the kernel packages, booting the rescue CD again and choosing install, writing a new grub configuration, and letting the installer install the kernel and update the MBR, everything works. So this was, for me, apparently a problem caused by not letting the installer update the MBR with the new grub boot loader code.

As a side note, I've noticed that performance is *way* down, by about 75%, with my Seagate drive. A code review shows that the author is applying MOD15WRITE to my particular drive, though it worked fine with the older driver, which didn't seem to include it in the blacklist.

When someone from Red Hat gets a chance to look at this: sata_nv does not seem to need "ATA_FLAG_SATA_RESET |" until libata is ready for hotplug. Can this be patched in the Fedora rpms until it all gets sorted out at the kernel level? See http://bugzilla.kernel.org/show_bug.cgi?id=3352 for specifics.

This is a "me too" post. I upgraded from FC2 to FC3 on an Epox EP-D3VA dual 1GHz PIII motherboard with 6 x Maxtor Maxline II 250GB SATA disks connected to 2 x Promise SATA150 TX4 cards. Root is installed to /dev/md0 (RAID1 built from /dev/sda1 and /dev/sdd1), with swap on /dev/md1 (RAID1 built from /dev/sdb1 and /dev/sde1) and /dev/md2 (RAID1 built from /dev/sdc1 and /dev/sdf1). /dev/md5 is a large RAID5 array built from /dev/sd[abcdef]2. The upgrade appeared to go smoothly (kernel-2.6.9-1.667smp was installed), but when I rebooted I got a kernel panic with the following message:

Loading ext3.ko module
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
Creating root device
Mounting root filesystem
EXT3-fs: unable to read superblock
mount: error 22 mounting ext3
mount: error 2 mounting none
Switching to new root
switchroot: mount failed: 22
umount /initrd/dev failed: 2
Kernel panic - not syncing: Attempted to kill init!

I booted into rescue mode and chrooted into the upgraded system. All seems fine, i.e. all the filesystems are OK. I then upgraded all the packages using "yum upgrade" within the chroot environment. This installed kernel-2.6.9-1.681_FC3smp. I rebooted and got the same error as above. Is there any sign of a fix for this on the horizon?

Thanks,
R.

I'm getting the no volume groups problem, but am attempting with
vmlinuz-2.6.10-1.770_FC3. works ok with vmlinuz-2.6.9-1.667, which is before
the fixed one, .681!

Dell Optiplex GX280.

kernel doesn't panic until it finds it has no volume group

if i reboot with 2.6.9 667, things are ok.

the problem also occurs with vmlinuz-2.6.10-1.760_FC3.

so, install FC3 was fine. problem happens after a new kernel from up2date.

Any updates on this?

rj

The reason it worked with 2.6.9-1.667 is that, in order to get Fedora Core 3 to install in the first place, I had to turn "compatibility" mode on in the BIOS. It would appear that kernel versions after the patch supported the "normal" mode for this thing but, strangely enough, no longer supported the "compatibility" mode. The problem cleared up when I restored "normal" mode in the BIOS screen for the hard drive. I guess "compatibility" mode was getting in the way of the revamped kernel.

I had forgotten I had to turn on compatibility mode to install FC3. I thought it had something to do with the LVM. I reloaded FC3 and ditched the LVM for ext3, and the darn thing still wouldn't go. I've had lots of problems with LVM, but none to speak of with ext2/ext3, so at that point I figured it had to be something else, and then remembered the compatibility setting for this funky IDE/SCSI hybrid drive.
So, for future reference: install FC3 with compatibility mode on, and flip it off once you get the kernel up to date.

(In reply to comment #15)
> I'm getting the no volume groups problem, but am attempting with
> vmlinuz-2.6.10-1.770_FC3. works ok with vmlinuz-2.6.9-1.667, which is before
> the fixed one, .681!
>
> Dell Optiplex GX280.
>
> kernel doesn't panic until it finds it has no volume group
>
> if i reboot with 2.6.9 667, things are ok.
>
> the problem also occurs with vmlinuz-2.6.10-1.760_FC3.
>
> so, install FC3 was fine. problem happens after a new kernel from up2date.
>
> Any updates on this?
>
> rj
>

To add another data point: I was trying to migrate my FC3 install from a PATA drive to a SATA drive. I low-level copied everything over using:

dd if=/dev/hda of=/dev/sda bs=10M

Afterwards everything seemed to mount OK. I pulled out the PATA disk but got these same errors and kernel panics. No amount of grub futzing made it mount the root device. My new SATA hd is a Seagate 250GB 7200rpm. I do not have things set up as a RAID.

Created attachment 113587 [details]
output of lspci
This problem occurs when running the latest kernel, 2.6.11-1.14_FC3.
See the attachment for the lspci output.
I finally had a chance to play with this again. I did a clean install of x86 FC3. I got the kernel panic on first boot. I booted into rescue and ran yum. It updated my kernel to 2.6.11-1.14_FC3. Then I rebooted, and it is working fine. Interestingly, my newer Seagate 300GB SATA drive did not exhibit the same problem. It is model # ST3300831AS. The older 120GB drive that did exhibit the problem was model # ST3120026AS.

My problem in Comment #17 was solved by booting into rescue mode and uninstalling/reinstalling the kernel mentioned in Comment #19. This was the one that was installed originally, but when it was installed, the onboard Sil 3112 SATA controller on my A7N8X mobo was not enabled (the hardware jumper was in the off position). Likely this caused the sata_sil kernel module to not get bundled into the initrd, and a reinstall of the kernel forced a mkinitrd run, which created an initrd that included the correct modules.

An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which may contain a fix for your problem. Please update to this new kernel, and report whether or not it fixes your problem. If you have updated to Fedora Core 4 since this bug was opened, and the problem still occurs with the latest updates for that release, please change the version field of this bug to 'fc4'. Thank you.

Actually, as I stated in Comment #19, kernel 2.6.11-1.14_FC3 seemed to fix it for me. I have moved to FC4, and the problem has not re-occurred. Thanks.
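The missing-module explanation above (sata_sil not bundled into the initrd) can be checked before rebooting by listing which modules an initrd actually contains. The sketch below is only an illustration: the function name `initrd_modules` is made up, and the usage line assumes the initrd is a gzip-compressed cpio archive, so a different image format would need a different listing command.

```shell
# Hypothetical helper (not from this bug report): reads a file listing on
# stdin (one path per line) and prints the names of the .ko modules in it,
# with the directory and the .ko suffix stripped.
initrd_modules() {
    grep -E '\.ko$' | sed -e 's#.*/##' -e 's#\.ko$##' | sort
}

# Possible use, ASSUMING a gzip-compressed cpio initrd image:
#   zcat /boot/initrd-<version>.img | cpio -t | initrd_modules
```

If the driver for the controller holding the root filesystem is absent from that list, reinstalling the kernel (which reruns mkinitrd, as described above) with the controller enabled is the fix that worked in this thread.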