Bug 192157
Summary: | kernel 2.6.16-1.2204_FC6 refuses to boot - at least on x86_64 | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Michal Jaegermann <michal> |
Component: | mkinitrd | Assignee: | Peter Jones <pjones> |
Status: | CLOSED DUPLICATE | QA Contact: | David Lawrence <dkl> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | rawhide | CC: | andreas.ossenbrueggen, davej, hoover, wtogami |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2006-08-16 21:17:39 UTC | Type: | --- |
Description
Michal Jaegermann
2006-05-17 22:52:16 UTC
The problem persists with 2.6.16-1.2206_FC6. A kernel which I can boot is 2.6.16-1.2074_FC6 (I could not touch my test box for a rather long time).

The same problem with 2.6.16-1.2207_FC6. Things happen really quickly but it appears that 'trap divide error' shows up in initrd, although I am not really sure. OTOH regenerating a fresh initrd for one of my still booting kernels results in a bootable system. I will skip further comments until there is a new kernel with which I can boot.

I took apart the initrd and inserted a bunch of 'echo <something>' and sleep statements to see where things go haywire. There is a fragment there which goes:

    ....
    insmod /lib/dm-snapshot.ko
    mkblkdevs
    rmparts sdb
    dm create pdc_cjfeejidea 0 488397056 linear 8:16 0
    ....

The moment 'dm create ...' is called I am in a "trap divide error" loop.

OTOH when I commented out the following block in 'init' from the initrd ('rmparts sdb' and 'rmparts sdc' can be left uncommented):

    rmparts sdb
    dm create pdc_cjfeejidea 0 488397056 linear 8:16 0
    dm partadd pdc_cjfeejidea
    rmparts sdc
    dm create pdc_cjhbfdhhaa 0 488397056 linear 8:32 0
    dm partadd pdc_cjhbfdhhaa

which is followed by (left intact):

    resume /dev/sda5
    echo Creating root device.
    mkrootdev -t ext3 -o defaults,ro sda11
    echo Mounting root filesystem.
    mount /sysroot
    echo Setting up other filesystems.
    setuproot
    echo Switching to new root and running init.
    switchroot

then I can boot 2.6.16-1.2211_FC6 and I am running it right now. The point is that I can now create, without any manual intervention, an initrd for 2.6.16-1.2074_FC6 and it works. So something happened in the meantime which changed the mutual expectations between kernel and dm.

'rmparts' and 'dm' are clearly internal 'nash' commands, although they are not documented as such, and if I feed the commented-out fragment to 'nash' under gdb, with mkinitrd-debuginfo installed, then I see the following:

    Program received signal SIGFPE, Arithmetic exception.
    0x000000000042a1b7 in _device_probe_geometry ()
    (gdb) where
    #0  0x000000000042a1b7 in _device_probe_geometry ()
    #1  0x000000000042a3ab in init_generic ()
    #2  0x000000000042a82f in linux_new ()
    #3  0x000000000041745c in ped_device_get ()
    #4  0x0000000000401733 in nashDmCreatePartitions (path=0x66e940 "/dev/mapper/pdc_cjfeejidea") at dm.c:287
    #5  0x000000000040827c in dmCommand (cmd=0x66ea39 "rmparts sdc\ndm create pdc_cjhbfdhhaa 0 488397056 linear 8:32 0\ndm partadd pdc_cjhbfdhhaa\n", end=0x66ea38 "") at nash.c:1799
    #6  0x0000000000408ee8 in runStartup (fd=6, name=0x7fff60e20bce "bomb.sh") at nash.c:2188
    #7  0x0000000000409271 in main (argc=1, argv=0x7fff60e1f7f0) at nash.c:2290
    #8  0x000000000047ca80 in __libc_start_main ()
    #9  0x00000000004001b9 in _start ()
    #10 0x00007fff60e1f7d8 in ?? ()
    #11 0x0000000000000000 in ?? ()
    (gdb)

'mkinitrd-debuginfo' unfortunately does not provide more than that. So ultimately, is this the fault of nash not following kernel changes, or of the kernel pulling the rug out from under nash?

After the most recent updates (kernel-2.6.16-1.2215_FC6 and mkinitrd-5.0.41-1) I was able to boot again without modifications to the initrd. Two things happened. One is that the incriminated fragment of the 'init' script, i.e.

    rmparts sdb
    dm create pdc_cjfeejidea 0 488397056 linear 8:16 0
    dm partadd pdc_cjfeejidea
    rmparts sdc
    dm create pdc_cjhbfdhhaa 0 488397056 linear 8:32 0
    dm partadd pdc_cjhbfdhhaa

does not show up anymore. That means that the corresponding fragment of my new script now looks as follows:

    .....
    insmod /lib/dm-snapshot.ko
    mkblkdevs
    resume /dev/sda5
    echo Creating root device.
    .....

No idea if this is good or bad. I still cannot get lvm2 to do anything with the /dev/sdb and /dev/sdc devices no matter what (cf. bug #176623).

The second thing is that feeding nash those "lost lines" no longer causes SIGFPE. It just returns, and I failed to observe any other effects.

Here is a good initrd init script and a bad one....
When I add the missing lines, the kernel finds the raid0 and boots ok.

Good....

    #!/bin/nash

    mount -t proc /proc /proc
    setquiet
    echo Mounting proc filesystem
    echo Mounting sysfs filesystem
    mount -t sysfs /sys /sys
    echo Creating /dev
    mount -o mode=0755 -t tmpfs /dev /dev
    mkdir /dev/pts
    mount -t devpts -o gid=5,mode=620 /dev/pts /dev/pts
    mkdir /dev/shm
    mkdir /dev/mapper
    echo Creating initial device nodes
    mknod /dev/null c 1 3
    mknod /dev/zero c 1 5
    mknod /dev/systty c 4 0
    mknod /dev/tty c 5 0
    mknod /dev/console c 5 1
    mknod /dev/ptmx c 5 2
    mknod /dev/rtc c 10 135
    mknod /dev/tty0 c 4 0
    mknod /dev/tty1 c 4 1
    mknod /dev/tty2 c 4 2
    mknod /dev/tty3 c 4 3
    mknod /dev/tty4 c 4 4
    mknod /dev/tty5 c 4 5
    mknod /dev/tty6 c 4 6
    mknod /dev/tty7 c 4 7
    mknod /dev/tty8 c 4 8
    mknod /dev/tty9 c 4 9
    mknod /dev/tty10 c 4 10
    mknod /dev/tty11 c 4 11
    mknod /dev/tty12 c 4 12
    mknod /dev/ttyS0 c 4 64
    mknod /dev/ttyS1 c 4 65
    mknod /dev/ttyS2 c 4 66
    mknod /dev/ttyS3 c 4 67
    echo Setting up hotplug.
    hotplug
    echo Creating block device nodes.
    mkblkdevs
    echo "Loading scsi_mod.ko module"
    insmod /lib/scsi_mod.ko
    echo "Loading sd_mod.ko module"
    insmod /lib/sd_mod.ko
    echo "Loading libata.ko module"
    insmod /lib/libata.ko
    echo "Loading sata_via.ko module"
    insmod /lib/sata_via.ko
    echo "Loading jbd.ko module"
    insmod /lib/jbd.ko
    echo "Loading ext3.ko module"
    insmod /lib/ext3.ko
    echo "Loading dm-mod.ko module"
    insmod /lib/dm-mod.ko
    echo "Loading dm-mirror.ko module"
    insmod /lib/dm-mirror.ko
    echo "Loading dm-zero.ko module"
    insmod /lib/dm-zero.ko
    echo "Loading dm-snapshot.ko module"
    insmod /lib/dm-snapshot.ko
    echo Making device-mapper control node
    mkdmnod
    mkblkdevs
    rmparts sdb
    rmparts sda
    dm create via_ecfdfiehfa 0 312499998 striped 2 128 8:0 0 8:16 0
    dm partadd via_ecfdfiehfa
    echo Scanning logical volumes
    lvm vgscan --ignorelockingfailure
    echo Activating logical volumes
    lvm vgchange -ay --ignorelockingfailure VolGroup00
    resume /dev/VolGroup00/LogVol01
    echo Creating root device.
    mkrootdev -t ext3 -o defaults,ro /dev/VolGroup00/LogVol00
    echo Mounting root filesystem.
    mount /sysroot
    echo Setting up other filesystems.
    setuproot
    echo Switching to new root and running init.
    switchroot

Bad......... PAE kernels.

    #!/bin/nash

    mount -t proc /proc /proc
    setquiet
    echo Mounting proc filesystem
    echo Mounting sysfs filesystem
    mount -t sysfs /sys /sys
    echo Creating /dev
    mount -o mode=0755 -t tmpfs /dev /dev
    mkdir /dev/pts
    mount -t devpts -o gid=5,mode=620 /dev/pts /dev/pts
    mkdir /dev/shm
    mkdir /dev/mapper
    echo Creating initial device nodes
    mknod /dev/null c 1 3
    mknod /dev/zero c 1 5
    mknod /dev/systty c 4 0
    mknod /dev/tty c 5 0
    mknod /dev/console c 5 1
    mknod /dev/ptmx c 5 2
    mknod /dev/rtc c 10 135
    mknod /dev/tty0 c 4 0
    mknod /dev/tty1 c 4 1
    mknod /dev/tty2 c 4 2
    mknod /dev/tty3 c 4 3
    mknod /dev/tty4 c 4 4
    mknod /dev/tty5 c 4 5
    mknod /dev/tty6 c 4 6
    mknod /dev/tty7 c 4 7
    mknod /dev/tty8 c 4 8
    mknod /dev/tty9 c 4 9
    mknod /dev/tty10 c 4 10
    mknod /dev/tty11 c 4 11
    mknod /dev/tty12 c 4 12
    mknod /dev/ttyS0 c 4 64
    mknod /dev/ttyS1 c 4 65
    mknod /dev/ttyS2 c 4 66
    mknod /dev/ttyS3 c 4 67
    echo Setting up hotplug.
    hotplug
    echo Creating block device nodes.
    mkblkdevs
    echo "Loading scsi_mod.ko module"
    insmod /lib/scsi_mod.ko
    echo "Loading sd_mod.ko module"
    insmod /lib/sd_mod.ko
    echo "Loading libata.ko module"
    insmod /lib/libata.ko
    echo "Loading sata_via.ko module"
    insmod /lib/sata_via.ko
    echo "Loading jbd.ko module"
    insmod /lib/jbd.ko
    echo "Loading ext3.ko module"
    insmod /lib/ext3.ko
    echo "Loading dm-mod.ko module"
    insmod /lib/dm-mod.ko
    echo "Loading dm-mirror.ko module"
    insmod /lib/dm-mirror.ko
    echo "Loading dm-zero.ko module"
    insmod /lib/dm-zero.ko
    echo "Loading dm-snapshot.ko module"
    insmod /lib/dm-snapshot.ko
    echo Making device-mapper control node
    mkdmnod
    echo Attaching to iSCSI storage
    mkblkdevs
    echo Scanning logical volumes
    lvm vgscan --ignorelockingfailure
    echo Activating logical volumes
    lvm vgchange -ay --ignorelockingfailure VolGroup00
    resume /dev/VolGroup00/LogVol01
    echo Creating root device.
    mkrootdev -t ext3 -o defaults,ro /dev/VolGroup00/LogVol00
    echo Mounting root filesystem.
    mount /sysroot
    echo Setting up other filesystems.
    setuproot
    echo Switching to new root and running init.
    switchroot

Here is the output of mkinitrd -v -f /tmp/foo.img $(uname -r)

    Creating initramfs
    Looking for deps of module sata_via: libata scsi_mod
    Looking for deps of module libata: scsi_mod
    Looking for deps of module scsi_mod
    Looking for deps of module sd_mod: scsi_mod
    Looking for deps of module ide-disk
    Looking for deps of module ext3: jbd
    Looking for deps of module jbd
    Looking for driver for device mapper/via_ecfdfiehfap2
    Looking for deps of module dm-mod
    Looking for deps of module dm-mirror: dm-mod
    Looking for deps of module dm-zero: dm-mod
    Looking for deps of module dm-snapshot: dm-mod
    Using modules:
      /lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/scsi/scsi_mod.ko
      /lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/scsi/sd_mod.ko
      /lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/scsi/libata.ko
      /lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/scsi/sata_via.ko
      /lib/modules/2.6.17-1.2307_FC6PAE/kernel/fs/jbd/jbd.ko
      /lib/modules/2.6.17-1.2307_FC6PAE/kernel/fs/ext3/ext3.ko
      /lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/md/dm-mod.ko
      /lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/md/dm-mirror.ko
      /lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/md/dm-zero.ko
      /lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/md/dm-snapshot.ko
    /sbin/nash -> /tmp/initrd.qW9233/bin/nash
    /sbin/insmod.static -> /tmp/initrd.qW9233/bin/insmod
    copy from `/lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/scsi/scsi_mod.ko' [elf32-i386] to `/tmp/initrd.qW9233/lib/scsi_mod.ko' [elf32-i386]
    copy from `/lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/scsi/sd_mod.ko' [elf32-i386] to `/tmp/initrd.qW9233/lib/sd_mod.ko' [elf32-i386]
    copy from `/lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/scsi/libata.ko' [elf32-i386] to `/tmp/initrd.qW9233/lib/libata.ko' [elf32-i386]
    copy from `/lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/scsi/sata_via.ko' [elf32-i386] to `/tmp/initrd.qW9233/lib/sata_via.ko' [elf32-i386]
    copy from `/lib/modules/2.6.17-1.2307_FC6PAE/kernel/fs/jbd/jbd.ko' [elf32-i386] to `/tmp/initrd.qW9233/lib/jbd.ko' [elf32-i386]
    copy from `/lib/modules/2.6.17-1.2307_FC6PAE/kernel/fs/ext3/ext3.ko' [elf32-i386] to `/tmp/initrd.qW9233/lib/ext3.ko' [elf32-i386]
    copy from `/lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/md/dm-mod.ko' [elf32-i386] to `/tmp/initrd.qW9233/lib/dm-mod.ko' [elf32-i386]
    copy from `/lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/md/dm-mirror.ko' [elf32-i386] to `/tmp/initrd.qW9233/lib/dm-mirror.ko' [elf32-i386]
    copy from `/lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/md/dm-zero.ko' [elf32-i386] to `/tmp/initrd.qW9233/lib/dm-zero.ko' [elf32-i386]
    copy from `/lib/modules/2.6.17-1.2307_FC6PAE/kernel/drivers/md/dm-snapshot.ko' [elf32-i386] to `/tmp/initrd.qW9233/lib/dm-snapshot.ko' [elf32-i386]
    /sbin/lvm.static -> /tmp/initrd.qW9233/bin/lvm
    /etc/lvm -> /tmp/initrd.qW9233/etc/lvm
    `/etc/lvm/lvm.conf' -> `/tmp/initrd.qW9233/etc/lvm/lvm.conf'
    Adding module scsi_mod
    Adding module sd_mod
    Adding module libata
    Adding module sata_via
    Adding module jbd
    Adding module ext3
    Adding module dm-mod
    Adding module dm-mirror
    Adding module dm-zero
    Adding module dm-snapshot

Just to add an additional datapoint to this - I just installed a new AMD64 box in 64 bit mode using the FC5 respin from Fedora Unity. The machine is using nvidia SATA raid and the RAID-1 set showed up using device mapper. With the original installation everything worked fine. Then I yum updated to all of the latest packages. On a boot with the new kernel (2.6.17-1.2139_FC5) it got the same trap divide errors as listed in this bug. Booting the updated box with the older kernel (2.6.16-1.2122_FC5) works fine. (I had seen this same behavior with a 32 bit install, but just reinstalled it all now in 64 bit to check)

> On a boot with the new kernel (2.6.17-1.2139_FC5) it got the
> same trap divide errors as listed in this bug
AFAICT this is not a kernel problem but an initrd one. More precisely,
a problem in nash, which the initrd uses. Did you update the 'mkinitrd'
package before installing 2.6.17-1.2139_FC5?
FWIW I have here an x86_64 machine running 2.6.17-1.2139_FC5 right now;
but it is not using RAID.
Last time I checked, nash in 'mkinitrd' from the current rawhide did
not suffer from this problem. Quite possibly that mkinitrd will work
on an FC5 installation but I do not know that for sure (maybe recompilation
is needed?). You may try, but make sure that you can back out.
To re-state the obvious: it is not enough to replace 'mkinitrd'; you also
have to regenerate the initrd which gives you trouble.
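
For readers unfamiliar with the 'dm create' lines quoted in the description: the arguments after the map name appear to follow the usual device-mapper table layout (start sector, length in sectors, target type, target arguments). The Python sketch below is not part of nash or mkinitrd. The names DmCreate, parse_dm_create and backing_devices are made up for illustration, and the handling of 'linear' and 'striped' arguments assumes the standard kernel table formats.

```python
from dataclasses import dataclass
from typing import List, Tuple

SECTOR_BYTES = 512  # device-mapper tables count in 512-byte sectors


@dataclass
class DmCreate:
    """One 'dm create' line from an initrd init script (hypothetical model)."""
    name: str        # map name, e.g. "pdc_cjfeejidea"
    start: int       # first sector of the mapped range
    length: int      # number of sectors mapped
    target: str      # target type, e.g. "linear" or "striped"
    args: List[str]  # remaining target-specific arguments


def parse_dm_create(line: str) -> DmCreate:
    """Split a 'dm create <name> <start> <length> <target> <args...>' line."""
    words = line.split()
    if len(words) < 6 or words[:2] != ["dm", "create"]:
        raise ValueError(f"not a 'dm create' line: {line!r}")
    return DmCreate(words[2], int(words[3]), int(words[4]), words[5], words[6:])


def backing_devices(cmd: DmCreate) -> List[Tuple[str, int]]:
    """Return (major:minor, offset) pairs for the devices behind the map.

    Assumes 'linear' takes '<dev> <offset>' and 'striped' takes
    '<stripes> <chunk_sectors>' followed by '<dev> <offset>' pairs.
    """
    if cmd.target == "linear":
        pairs = cmd.args
    elif cmd.target == "striped":
        pairs = cmd.args[2:]  # skip stripe count and chunk size
    else:
        raise ValueError(f"unhandled target type: {cmd.target}")
    if len(pairs) % 2:
        raise ValueError("expected '<dev> <offset>' pairs")
    return [(pairs[i], int(pairs[i + 1])) for i in range(0, len(pairs), 2)]


if __name__ == "__main__":
    cmd = parse_dm_create("dm create pdc_cjfeejidea 0 488397056 linear 8:16 0")
    print(cmd.name, cmd.length * SECTOR_BYTES, backing_devices(cmd))
```

Read this way, the first line from the report describes a 488397056-sector map (250059292672 bytes at 512 bytes per sector) backed by device 8:16 at offset 0, which is consistent with the 'rmparts sdb' that precedes it.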
My point is that in my case I am only installing the released updates from the FC5 updates using yum and this kernel has this problem. I don't know how the release system built the kernel. I have not built anything myself.

*** This bug has been marked as a duplicate of 199224 ***
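
The good and bad init scripts quoted in this report seem to differ only in their device-mapper section. A minimal Python sketch that makes the comparison explicit follows. The two lists are copied from the scripts above, and only_in is a hypothetical helper, not an mkinitrd tool.

```python
# Device-mapper section of each init script, copied from the report.
GOOD = [
    "echo Making device-mapper control node",
    "mkdmnod",
    "mkblkdevs",
    "rmparts sdb",
    "rmparts sda",
    "dm create via_ecfdfiehfa 0 312499998 striped 2 128 8:0 0 8:16 0",
    "dm partadd via_ecfdfiehfa",
    "echo Scanning logical volumes",
]
BAD = [
    "echo Making device-mapper control node",
    "mkdmnod",
    "echo Attaching to iSCSI storage",
    "mkblkdevs",
    "echo Scanning logical volumes",
]


def only_in(reference, other):
    """Lines of `reference` that do not occur anywhere in `other`, in order."""
    seen = set(other)
    return [line for line in reference if line not in seen]


if __name__ == "__main__":
    for line in only_in(GOOD, BAD):
        print("missing from bad script:", line)
    for line in only_in(BAD, GOOD):
        print("extra in bad script:", line)
```

The lines reported as missing are the 'rmparts', 'dm create' and 'dm partadd' commands that set up the striped map, which matches the commenter's observation that adding them back lets the kernel find the raid0 set.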