Description of problem:
Updated kernel fails with the following message at boot:

init[1] segfault at 00000006 eip 00000006 esp bfbc70ec error 4

This message is repeated alternately with

... printk:2900000 messages suppressed ..

where the actual number of messages varies slightly around this value.

Version-Release number of selected component (if applicable):
2.6.23.1-49.fc8
2.6.23.8-63.fc8

Additional info:
2.6.23.1-42.fc8 works fine.
Hi, I'm not sure if this is where I should be adding this. I am having the same issue on a Dell Vostro 200 with an ICH9 SATA controller. The following is a copy of a post I made describing the issue.

POST I: I have Vista (don't ask) installed on another partition, and I used its native tool to shrink my 300G drive into 2x150 partitions. To test the M$ tool, I used the gparted boot CD to format the new partition /dev/sda4 as ext2. All seemed fine. The install was hit and miss; with acpi=off and the SATA set to IDE mode in the BIOS it booted and loaded FC8. Grub installed fine and I ran update to get the new kernel etc. Most things seem to work fine, however using kftpgrabber and gftp I have a reproducible error: both apps crash trying to FTP a ~13MB dir structure to my local drive. I get errors from dmesg saying that exceptions occurred in the eip and esp registers.

POST II: I had to switch the SATA options in the BIOS to RAID mode as opposed to IDE. This allowed me to use the AHCI driver rather than the pix... (I think this is the name of it). I have tried passing irqpoll with acpi=off to the kernel at boot time. This makes no difference; in fact it may be making things worse for me. Many of my apps are segfaulting: eclipse, gdm, cisco vpn client, httpd, cupsd, procmail, gftp, etc. This happens with or without kernel params. In general the dmesg looks like:

app[nnnn]: segfault at xxxxxxx eip xxxxxx esp xxxxxx error 6

I have seen something similar at http://www.fedoraforum.org/forum/showthread.php?t=174130

I can't get the output of lspci etc. at the moment; the BCM4318 is not supported so I can't get at the box. I can get this info if you need it.

Thanks, Alan
Hi, I spent a lot of time on this and it looks like it may have been more my application of Linux than the kernel. I could not update the BIOS; the fw update kept failing on me, so yet another reinstall was undertaken. This time I left the BIOS/SATA in IDE mode and passed acpi=off irqpoll to the kernel. It booted, loaded the pix_... driver in no time and all went smoothly. I am on the Fedora Werewolf distro, that is the 2.6.23.42 kernel as far as I know. I haven't dared upgrade. Anyhow, no segfaults and the box is powering along. (The irqpoll param is not present when I boot now; maybe it's hardcoded into the kernel at install.) Alan
ref: irqpoll - I do have to pass this to the kernel - although I'm sure it booted without it once? Puzzled
I've tried acpi=off and irqpoll, but neither fixes the problem on this laptop.
Just updated to kernel 2.6.23.9-85.fc8 and still no luck. Only versions 2.6.23.1-42.fc8 and earlier are bootable. Any suggestions for providing better diagnostic information gratefully received. Unfortunately, all of my boot.log files are empty.
Try removing "quiet" and "rhgb" from the kernel options to see exactly where it fails in bootup. (And add "debug".)
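For anyone unsure what that edit amounts to, here is a rough sketch that applies it to a kernel line given as a string. It only transforms text and does not touch grub.conf; the function name strip_quiet and the sample line (root device and kernel version) are made up for illustration. On a real system you would make the same change by hand at the GRUB prompt or in the GRUB config.

```shell
#!/bin/sh
# Drop "quiet" and "rhgb" from a kernel command line and add "debug"
# (unless it is already there). Illustration only: works on a string.
strip_quiet() {
    printf '%s\n' "$1" | awk '{
        out = ""; has_debug = 0
        for (i = 1; i <= NF; i++) {
            if ($i == "quiet" || $i == "rhgb") continue
            if ($i == "debug") has_debug = 1
            out = (out == "" ? $i : out " " $i)
        }
        if (!has_debug) out = out " debug"
        print out
    }'
}

# Hypothetical sample line; substitute your own kernel line.
strip_quiet "kernel /vmlinuz-2.6.23.9-85.fc8 ro root=/dev/VolGroup00/LogVol00 rhgb quiet"
```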
OK. I've tried this and the first init[1] segfault message occurs immediately after the "Write protecting the kernel read-only data: 844k" message. This is followed by a couple more routine messages, one from 'input' where the mouse is detected and a couple from atkbd.c where it asks me (twice) to use 'setkeycodes' to make 'e06e' known. It then just endlessly repeats the "init[1] segfault ..." and the "printk:2900000 messages suppressed" messages.
I'm still getting the same problem with kernel-2.6.23.14-107.fc8 with little idea how to debug the problem further. I see others have the same problem ... http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg210980.html Is there any chance that the regression mentioned in the message below could be affecting Fedora kernels beyond -42.fc8 ? http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg202676.html
Apologies for 'bumping' this bug report but the latest rawhide kernel offers more info on this ongoing segfault problem:

init[1]: segfault at 6 ip 00000006 sp bf8cf2d8 error 4 in ld-2.7.90.so[110000+1f000]

The kernel version is 2.6.25-0.40.rc1.git2.fc9.i686
Oh no, looks like maybe bug 336161 has returned. Can you boot in rescue mode and rebuild the initrd?

# chroot /mnt/sysimage
# mkinitrd /boot/initrd-<kernelversion>.img <kernelversion>

See https://fedoraproject.org/wiki/KernelCommonProblems for more detailed directions.
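To avoid mistyping the version in both places, a small dry-run helper can build the command from a single version string. This is only a sketch: mkinitrd_cmd is a made-up name, and it prints the command rather than running it. The -f (overwrite an existing image) and -v (verbose) flags are the ones used elsewhere in this thread.

```shell
#!/bin/sh
# Print (do not run) the mkinitrd command for a given kernel version.
mkinitrd_cmd() {
    ver=$1
    printf '/sbin/mkinitrd -f -v /boot/initrd-%s.img %s\n' "$ver" "$ver"
}

# Example: show the command for the currently running kernel.
mkinitrd_cmd "$(uname -r)"
```

Inside the rescue chroot, uname -r reports the rescue kernel, so pass the installed kernel's version explicitly there.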
Created attachment 295919
Output from mkinitrd -v

As I can still boot into kernel 2.6.23.1-42.fc8 I presumed booting into rescue mode wasn't necessary for this. I've tried mkinitrd on both the latest kernel (2.6.25-0.65.rc2.git7.fc9) and working kernel (2.6.23.1-42.fc8) versions. Interestingly I get the same segfault error from BOTH rebuilt initrd images. Attached is a log from the rebuild process.
Can you attach a working and a broken initrd?
Created attachment 296178
Working initrd (as downloaded)
Created attachment 296179
Segfaulting initrd (as rebuilt using mkinitrd)
Are you using the Fedora 8 mkinitrd or the rawhide one? And you appear to be getting the rawhide glibc shared libraries in your Fedora 8 initrd...
I upgraded the laptop to rawhide once the F8 kernel updates started segfaulting (in the hopes that it would be fixed in rawhide sooner). So, yes, I have been using the rawhide mkinitrd which I expect has been picking up the rawhide glibc shared libraries. I've got a few up-to-date F8 systems on which I've tried creating new initrd images, but I then get into trouble with them not finding the path to the root filesystem (which is starting to get beyond my Linux knowledge).

For further info, my disk layout is:

/dev/sda2 is /boot with label "/boot"
/dev/sda3 is VolGroup00 with LogVol00 as / and LogVol01 as swap, no labels
I had this exact problem. Here is how I fixed it:

1. boot into a rescue cd
2. chroot into root filesystem (chroot /mnt/sysimage)
3. remove old /lib/ld-linux.so.n (was not associated with any rpm)
4. rebuild initrd: /sbin/mkinitrd -f -v /boot/initrd-2.6.23.1-42.fc8.img 2.6.23.1-42.fc8
5. reboot

So basically mkinitrd was picking up an old lib that was causing nash to segfault.
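A quick way to spot files like the one in step 3 is to ask rpm which package owns each loader file and keep only the ones it reports as unowned. The following is a sketch under a few assumptions: unowned_paths is a made-up helper name, the /lib/ld-* glob is just one plausible place to look, and the check is skipped on systems without rpm.

```shell
#!/bin/sh
# Read `rpm -qf` output on stdin and print only the paths that rpm
# reports as not owned by any package.
unowned_paths() {
    sed -n 's/^file \(.*\) is not owned by any package$/\1/p'
}

# On an RPM-based system, check the dynamic loader files in /lib.
if command -v rpm >/dev/null 2>&1; then
    rpm -qf /lib/ld-* 2>&1 | unowned_paths
fi
```

Anything this prints deserves a closer look before deleting it; removing the loader that the running system actually uses will leave it unbootable.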
I don't think this is my problem. I have

/lib/ld-linux.so.2 linked to /lib/ld-2.7.90.so
and
/lib/ld-lsb.so.3 linked to /lib/ld-linux.so.2

If I delete /lib/ld-linux.so.2 then the system won't even boot my working kernel!

However, I've just tried the latest kernel (2.6.25-0.113.rc5.git2.fc9) and finally it doesn't segfault :) But instead I get a kernel panic at exactly the same point :(

/bin/nash: /lib/libc.so.6: version 'GLIBC_2.8' not found (required by /lib/libglib-2.0.so.0)

As I presume this kernel works on other systems, does this explain the earlier segfaults? If so, what is the fix? (I've tried mkinitrd on this latest initrd and I get exactly the same error).
/lib/libc.so.6: version 'GLIBC_2.8' not found

I wasted hours trying to find the reason for this problem. Most of the current rawhide libs (glib, libselinux, libpam etc.) require GLIBC_2.8 (glibc-2.7.90-9.i686 identifies as 2.8). My system was "rawhided" a few years ago, so some things are really old.

The problem was in /lib: /lib/libc.so.6 was linked to /lib/libc.so.0 (some old file not owned by any of the installed packages). There is also libc-2.7.90.so (from the actual glibc package). Just re-link:

cd /lib && ln -sf libc-2.7.90.so libc.so.6

Now all OK!
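Before re-linking, it may be worth confirming where the link currently ends up. A minimal sketch, assuming GNU readlink is available; lib_target is a made-up helper name:

```shell
#!/bin/sh
# Print the file a library symlink finally resolves to,
# or "missing" if the path does not exist (e.g. a dangling link).
lib_target() {
    if [ -e "$1" ]; then
        readlink -f "$1"
    else
        echo "missing"
    fi
}

lib_target /lib/libc.so.6
```

If the result is not the libc shipped by the installed glibc package, the link is a likely suspect.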
I'm afraid this link already exists. I've done

ldconfig

(to make sure the cache is up-to-date) and then

ldconfig -p | grep libc.so

and I get

libc.so.6 (libc6, OS ABI: Linux 2.6.9) => /lib/libc.so.6

Should I expect the Linux version to be greater than 2.6.9? Maybe it is still a missing link, but I've got no other libc.so link in /lib.
$ ls -l /lib/libc.so.6
lrwxrwxrwx 1 root root 9 Mar 23 23:14 /lib/libc.so.6 -> libc-2.7.90.so    # ---> this is ok

$ ldconfig -p|grep libc.so.6
libc.so.6 (libc6, OS ABI: Linux 2.6.9) => /lib/libc.so.6

The problem was that /lib/libc.so.6 was previously linked to libc.so.0 (a too-old version of libc not owned by any of the current packages).
FIX FOUND!!! Segfaults seemed to have been due to picking up an old libc.so.6 as Chuck guessed in comment #10. The "version 'GLIBC_2.8' not found" error message seemed to confirm this. I finally discovered that mkinitrd was picking up the 2.7 libraries from a directory called /lib/i686/nosegneg I moved this directory out of the way and rebuilt the initrd and hey presto, no boot problems. I guess either the installer needs to check there aren't obsolete libraries in this directory or mkinitrd shouldn't look in this directory when building the initrd.
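For anyone hitting the same thing, a rough way to look for stray copies is to list every libc under /lib, subdirectories included. This is only a sketch: list_libc_copies is a made-up name and the libc-*.so pattern is an assumption about how the files are named. It only lists files and moves nothing.

```shell
#!/bin/sh
# List every libc-<version>.so under a library root, including
# subdirectories such as i686/nosegneg. More than one distinct version
# suggests stale libraries that mkinitrd might pick up.
list_libc_copies() {
    # -H follows the starting path if it is itself a symlink.
    find -H "${1:-/lib}" -name 'libc-*.so' 2>/dev/null | sort
}

list_libc_copies /lib
```

If an older version turns up in a subdirectory, moving that directory aside and rebuilding the initrd, as described above, is the step to try.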
Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Is anyone still seeing this with F10/rawhide?
Problem not seen here since I applied the fix in comment #22.