|Summary:||mkinitrd appears to over-rely on some Fedora kernels features|
|Product:||[Fedora] Fedora||Reporter:||Michal Jaegermann <michal>|
|Component:||mkinitrd||Assignee:||Jeremy Katz <katzj>|
|Status:||CLOSED RAWHIDE||QA Contact:|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2004-09-28 04:00:21 UTC||Type:||---|
|Bug Depends On:|
|Bug Blocks:||123268, 133652|
Description Michal Jaegermann 2004-08-11 19:01:59 UTC
Description of problem:

On an x86 box, apart from the "regular" Fedora test kernels, I also have some custom kernels for other development work. With initrd images made with mkinitrd-4.0.3-1 I can boot both, say, 2.6.7-1.515 and my custom kernel if the command line is like this (for example):

    ro root=LABEL=/12 selinux=0 nousb

The picture changes considerably if I append " 1" or " 3" to the string above. Then 2.6.7-1.515 still boots and goes into the desired runlevel, but with my custom kernel I see:

    .....
    Switching to new root
    exec of init failed!!! 14
    Kernel panic: Attempted to kill init!

This does not change if, with a custom kernel, I use a command line like "ro root=LABEL=/12 3" or "ro root=LABEL=/12 1". Only leaving the runlevel specification out allows me to boot.

The situation was the same with the previous version of mkinitrd (I do not know about earlier ones, after the change in the type of images produced). OTOH I may be just lucky with Fedora kernels, as some reports on fedora-test-list seem to suggest that with somewhat different hardware other people have similar troubles without any custom kernels in play and nothing added to command lines. Details in other cases are not that clear to me.

Version-Release number of selected component (if applicable): mkinitrd-4.0.3-1

How reproducible: Always with my custom kernel.
Comment 1 Jeremy Katz 2004-08-11 19:57:45 UTC
Could you try again with mkinitrd 4.0.4 (will be at http://people.redhat.com/~katzj/mkinitrd/ as soon as it's done building)?
Comment 2 Michal Jaegermann 2004-08-11 23:32:03 UTC
I did; and not with very happy results. :-) Regardless of which kernel and which command line options I am using, if the initrd was made with that version of mkinitrd then I invariably see this:

    ....
    Mounting root filesystem.
    mount: error 6 mouting ext3
    Switching to new root
    switchroot: mount failed: 22
    Kernel panic: Attempted to kill init!

References to Catch-22? Oops!
Comment 3 Kaj J. Niemi 2004-08-12 13:20:42 UTC
The identical error actually happened to me and I'm using Fedora kernels. grub.conf looks as follows:

    default=0
    timeout=10
    title Red Hat Linux (2.6.7-1.517smp)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.7-1.517smp ro root=LABEL=/ acpi=on elevator=deadline
        initrd /boot/initrd-2.6.7-1.517smp.img
Comment 4 Kaj J. Niemi 2004-08-12 13:35:48 UTC
mkinitrd got upgraded to 4.0.4 before I installed 2.6.7-517smp.
Comment 5 Jeremy Katz 2004-08-12 17:54:44 UTC
I blame lack of sleep or some such, http://people.redhat.com/~katzj/mkinitrd/ has mkinitrd-4.0.5 now which really should be better
Comment 6 Michal Jaegermann 2004-08-12 18:45:06 UTC
Indeed, mkinitrd-4.0.5 produces images which allow me to boot 2.6.7-517 and my custom kernel too, as opposed to 4.0.4 where everything was blowing up. OTOH it is not that different from mkinitrd-4.0.3, in that adding a runlevel specification for my custom kernel ends up with "exec of init failed!!! 14", although doing that with 2.6.7-517 or 2.6.7-517smp does not have any ill effects. Nasty nash. :-)
Comment 7 Jeremy Katz 2004-08-12 22:51:40 UTC
Hrmm... do you have an init=? What does your grub.conf contain? Basically that's saying that the exec of init failed with -EFAULT which seems strange to say the least. Also, you can grab http://people.redhat.com/~katzj/nash-test, cp nash-test /sbin/nash and then remake your initrd to get a little bit more information on what it's exec'ing as init.
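For reference, the 14 that nash prints is the raw errno value, and on Linux that number is EFAULT ("Bad address"). The following is a minimal sketch of decoding such a value by hand; `describe_exec_errno` is a made-up name for illustration and is not part of nash, and it covers only a few values that execve() can return.

```c
#include <errno.h>

/* Hypothetical helper, not taken from nash: map a raw errno value, such
 * as the 14 in "exec of init failed!!! 14", to its symbolic name.
 * Only a handful of execve() failures are covered here. */
const char *describe_exec_errno(int err)
{
    switch (err) {
    case ENOENT:  return "ENOENT";   /* the path does not exist */
    case EACCES:  return "EACCES";   /* not executable or no permission */
    case ENOEXEC: return "ENOEXEC";  /* unrecognized binary format */
    case EFAULT:  return "EFAULT";   /* bad pointer in path, argv or envp */
    default:      return "unknown";
    }
}
```

Calling `describe_exec_errno(14)` on a Linux system yields "EFAULT", which points at the arguments handed to exec rather than at a missing /sbin/init.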
Comment 8 Michal Jaegermann 2004-08-12 23:49:56 UTC
> Hrmm... do you have an init=?

In a command line? No. Options are listed in my original report. They are:

    ro root=LABEL=/12 selinux=0 nousb

Do you suggest that I have an init= if I am using that, but that it magically disappears if I add "3" to the end? Maybe this is the case, but that is weird. I will see what I can get with 'nash-test' later.
Comment 9 Michal Jaegermann 2004-08-13 03:39:01 UTC
I replaced my nash with nash-test and put "id:3:initdefault" in /etc/inittab. After remaking the initrd, this is what is on it beyond /proc and /dev:

    drwxr-xr-x 2 root root 0 Aug 12 20:40 sysroot
    drwxr-xr-x 2 root root 0 Aug 12 20:40 sys
    drwxr-xr-x 2 root root 0 Aug 12 20:40 loopfs
    lrwxrwxrwx 1 root root 3 Aug 12 20:40 sbin -> bin
    drwxr-xr-x 2 root root 0 Aug 12 20:40 lib
    -rw-r--r-- 1 root root 86700 Aug 12 20:40 lib/jbd.ko
    -rw-r--r-- 1 root root 128732 Aug 12 20:40 lib/ext3.ko
    drwxr-xr-x 2 root root 0 Aug 12 20:40 bin
    lrwxrwxrwx 1 root root 10 Aug 12 20:40 bin/modprobe -> /sbin/nash
    -rwxr-xr-x 1 root root 66158 Aug 12 20:40 bin/nash
    -rwxr-xr-x 1 root root 152408 Aug 12 20:40 bin/insmod
    -rwxr-xr-x 1 root root 451 Aug 12 20:40 init
    drwxr-xr-x 2 root root 0 Aug 12 20:40 etc

so it can hardly be simpler. Just remaking the initrd does not bring much information, so I edited the 'init' script and replaced 'setquiet' with 'showlabels'. With that, if I boot with "ro root=LABEL=/12 selinux=0 nousb" in the kernel options, I see this:

    Red Hat nash version 4.0.5 starting
    /dev/hda1 / e1ab2ed6-5e19-11d6-908d-b85d44f2b93d
    /dev/hda5 /usr e71d092-5e19-11d6-80c1-c792d8497d9c
    /dev/hda7 /home e258cc44-5e19-11d6-9b84-aa6a385944ae
    /dev/hda8 spare12 32ade583-34ea-4e6d-8de6-20d2918b962a
    /dev/hdb1 /boot1 eb4ba56e-2874-48fa-839c-16fbe8c4abae
    /dev/hdb5 /1 f9ad2458-8dc2-476-8687-dd4a5c1812b5
    /dev/hdb6 /usr1 0ac3cb0-1325-4981-b8aa-f6fa381ab8c
    /dev/hdb7 /var1 5b3a6d74-db61-4e7-a595-91d5de92b38
    /dev/hdb8 /home1 4f475734-474b-4195-a6fa-87f12e681af0
    /dev/hdb9 /12 48b37e2c-7284-4643-aa9e-1645b2926e0
    /dev/hdb10 /usr12 aabb63c0-2d93-44bb-a2e-fd7466e265f
    Mounted /proc filesystem
    Mounting sysfs
    Loading jbd.ko module
    Loading ext3.ko module
    Creating block devices
    Creating root device
    Mounting root filesystem
    kjournald starting. Commit interval 5 seconds
    EXT3-fs: mounted filesystem with ordered data mode.
    Switching to new root
    INIT: version 2.85 booting

and the whole startup proceeds normally.

If I use "ro root=LABEL=/12 selinux=0 nousb 3" then everything looks the same up to the "Switching ..." line. Then I get:

    ....
    Switching to new root
    exec of init (/sbin/init) failed!!!: 14
    Kernel panic: Attempted to kill init!

Not that much I can see from nash, I am afraid.
Comment 10 Michal Jaegermann 2004-08-13 03:48:12 UTC
BTW - using an explicit "ro root=/dev/hdb9 selinux=0 nousb 3" instead of "ro root=LABEL=/12 selinux=0 nousb 3" does not help. I just tried and ended up with the same "... failed!!!: 14".
Comment 11 Jeremy Katz 2004-08-13 17:18:18 UTC
Okay added even more debugging printfs to nash's exec of init. New nash at http://people.redhat.com/~katzj/nash-test-2. If you could grab that and remake your initrd and let me know what output you get, that would be helpful.
Comment 12 Michal Jaegermann 2004-08-13 19:19:24 UTC
It now prints, apart from what it was printing before,

    .....
    Switching to new root
    initargs: ro
    initargs: root=LABEL=/12
    initargs: selinux=0
    initargs: nousb
    INIT: version 2.85 booting

in a "good" case and

    .....
    Switching to new root
    initargs: ro
    initargs: root=LABEL=/12
    initargs: selinux=0
    initargs: nousb
    initargs: 3
    exec of init (/sbin/init) failed!!!: 14
    Kernel panic: Attempted to kill init!

when this "3" is added. Otherwise not much changed. It would likely be good to check whether it is lying about mounting the real root and/or about the presence of /sbin/init, but I am not sure how to do that from nash. Dump /proc/mounts?
Comment 13 Steve Grubb 2004-08-14 17:33:57 UTC
Created attachment 102729 [details]
Patch that fixes many bugs

I gave mkinitrd a code review and found all kinds of bugs. There were uninitialized variables getting used, memory leaks, negative array indexing, important code for device numbers effectively commented out, and execv was being called with stack variables. Please apply this patch. It does change some of the errors reported, hopefully for the better.
Comment 14 Steve Grubb 2004-08-14 18:00:35 UTC
Created attachment 102734 [details]
Patch that fixes many bugs

I sent the second to last patch last time...sorry.
Comment 15 W. Michael Petullo 2004-08-14 18:11:22 UTC
The patch contained in comment #14 also seems to fix bug #129836 for me.
Comment 16 Michal Jaegermann 2004-08-14 21:34:23 UTC
Created attachment 102738 [details]
missing declarations in nash.c patch

Another small patch to clean up missing declarations in nash.c, on top of the previous one. Substituting a 'nash' recompiled with these patches on my initrd indeed clears the issue for me. After those patches the remaining warnings are about an implicit dropping of the 'const' qualifier from some pointers.
Comment 17 Jeremy Katz 2004-08-16 17:16:37 UTC
Applied most of the patch here. Some of it isn't quite right and the use of the numeric errnos instead of strerror is intentional (avoids bringing in the strings). All in mkinitrd-4.0.6 -- thanks for the patches.
Comment 18 Michal Jaegermann 2004-08-16 23:08:52 UTC
So far mkinitrd-4.0.6 is in the "worksforme" category. I could not reproduce with it the troubles I had with other versions. Out of curiosity I looked at how much the strings bring in from dietlibc, and it does not look like a lot. Many modules one may need will be much bigger than that. Although I agree that the fewer bytes in mkinitrd the better.
Comment 19 Steve Grubb 2004-08-19 18:07:18 UTC
Created attachment 102889 [details]
Final Patch

I reviewed the latest changes. Thanks for applying the bulk of them. The original bug is fixed as far as I can tell. In the interest of having clean code, I have one last set of patches that can be applied against mkinitrd-4.0.6. They were in the original patch I sent.

In grubby.c:
@ 1536, there really is a potential memory leak. The call to free fixes it.

In nash.c:
@ 59, I moved the allocation of memory in order to avoid free'ing the memory if the open failed.
@ 216, rc needs to be initialized. Otherwise it can return whatever random value the stack has. If this chunk is applied, the other place where rc is set to 1 can be deleted.
@ 624, a const was added so that gcc knows not to create the array of strings on the stack and to move it to the .rodata segment.
@ 679, initargs was previously malloc'ed. Its first element may need to be set to NULL so exec doesn't dereference a bogus pointer.
@ 767, devNum really is an int. If it were unsigned, the test for (devNum < 0) would never be true.

I'm continuing to apply this smaller patch against my tree. I don't think there is anything here that is super critical. But it would be nice if we were in sync so I can drop the patch.
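The remark about initargs can be illustrated with a small sketch. execv() walks its argv array until it reaches a NULL pointer, so an array obtained from malloc() that is never explicitly terminated can hand the kernel a garbage pointer, which is one plausible route to EFAULT. The code below is not the code from nash.c; `build_init_args` and its parameters are hypothetical names chosen for the example.

```c
#include <stdlib.h>

/* Hypothetical helper, not taken from nash.c: build an argv array for
 * init from `count` extra arguments. The result is always terminated
 * with NULL. Returns NULL on allocation failure. The caller frees the
 * array itself, not the strings it points to. */
char **build_init_args(const char *init_path, char *const extra[], size_t count)
{
    /* One slot for the init path, `count` extras, one terminator. */
    char **args = malloc((count + 2) * sizeof(*args));
    size_t i;

    if (args == NULL)
        return NULL;

    args[0] = (char *)init_path;
    for (i = 0; i < count; i++)
        args[i + 1] = extra[i];

    /* malloc() does not clear memory, so the terminator must be set
     * explicitly before the array is passed to execv(). */
    args[count + 1] = NULL;
    return args;
}
```

With one extra argument "3", the array holds "/sbin/init", "3" and a NULL terminator, in that order, whatever happened to be in the allocated memory beforehand.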
Comment 20 Jeremy Katz 2004-08-24 19:14:57 UTC
Applying with the following caveat:

* nash.c:216 - rc should be initialized to 0, not 1. Otherwise, we'd just always end up returning 1 (which isn't the intent, 1 is an error)

Thanks again for the patches
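The caveat follows the usual C convention of starting a return code at "success" and setting it only when an error is actually seen. A minimal sketch of that pattern, assuming nothing about the real function at nash.c:216; `any_negative` is a hypothetical stand-in:

```c
#include <stddef.h>

/* Hypothetical example of the pattern, not the function at nash.c:216.
 * Returns 0 on success and 1 if any value in the array is negative. */
int any_negative(const int *values, size_t n)
{
    int rc = 0;  /* start from success; left uninitialized, rc would be
                  * whatever happened to be on the stack */
    size_t i;

    for (i = 0; i < n; i++) {
        if (values[i] < 0)
            rc = 1;  /* the error is recorded only when one occurs */
    }
    return rc;
}
```

Had rc started at 1, the function would report an error on every call, including the ones where nothing went wrong.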
Comment 21 Jef Spaleta 2004-09-28 04:00:21 UTC
Closing this out as resolved rawhide, since discussion in the report has died off and the initial problem is confirmed in the comments by the original reporter as being resolved. -jef