After a fresh install of Fedora 9 on two disks in software RAID, booting fails with the following:

root (hd0,0)
 Filesystem type is ext2fs, partition type 0xfd
kernel /boot/vmlinuz-2.6.25-14.fc9.x86_64 ro root=UUID=576c8a30-5637-4b72-93a1-d5fbcb800169 rhdb quiet
 [Linux-bzImage, setup=0x2e00, size=0x1f63b8]
initrd /boot/initrd-2.6.25-14.fc9.x86_64.img
 [Linux-initrd @ 0x37cdf000, 0x31080c bytes]
Decompressing Linux... done.
Booting the kernel.
Red Hat nash version 6.0.52 starting
device-mapper: table: 253:0: mirror: Device lookup failure
device-mapper: reload ioctl failed: No such device or address
mdadm: /dev/md0 has been started with 1 drive (out of 2).
mdadm: /dev/md1 has been started with 1 drive (out of 2).
device-mapper: table ioctl failed: No such device or address
device-mapper: deps ioctl failed: No such device or address
init[1]: segfault at 10 ip 7f1d8a3b5004 sp 7fff927c2ef8 error 4 in libdevmapper.so.1.02[7f1d8a3a7000+15000]
nash received SIGSEGV! Backtrace (16):
/bin/nash[0x40d093]
/lib64/libc.so.6[0x7f1d87cce2a0]
/lib64/libdevmapper.so.1.02(dm_task_get_deps+0x4)[0x7f1d8a3b5004]
/usr/lib64/libnash.so.6.0.52[0x626c1b]
/usr/lib64/libnash.so.6.0.52(nashDmDevGetName+0x3d)[0x627ab0]
/usr/lib64/libnash.so.6.0.52[0x6242e3]
/usr/lib64/libnash.so.6.0.52[0x6243fc]
/usr/lib64/libnash.so.6.0.52(nashBdevIterNext+0x120)[0x624871]
/usr/lib64/libnash.so.6.0.52[0x624abb]
/usr/lib64/libnash.so.6.0.52(nashFindFsByName+0x60)[0x624b86]
/usr/lib64/libnash.so.6.0.52(nashAGetPathBySpec+0x86)[0x624c73]
/bin/nash[0x408b24]
/bin/nash[0x40cf49]
/bin/nash[0x40d576]
/lib64/libc.so.6(__libc_start_main+0xfa)[0x7f1d87cba32a]
/bin/nash[0x404509]
I see almost the same problem on Intel RAID (motherboard DP35DP, 3 striped SATA drives). The install went fine, but the first boot fails with:

table: 253:0 striped segfault at 10 Couldn't parse string destination error 4 in libdevmapper.so.1.02
nash received SIGSEGV
Yes, I forgot to mention that I have an internal PCI SATA RAID card (a 4-port RocketRAID 1640 with the HighPoint HPT374 chip). I installed with the card present because I had two disks with data attached to it, and it was working fine under FC3. Now, even if I remove the card and boot, I still get the segfault. I guess re-installing FC9 without the card would make it work, but I need the card to access the drives (I don't have enough SATA ports on the motherboard).
This bug is also visible on a plain T61 notebook. The problem might be incorrect RAM disk creation; for now, check bug #443332.
Since this is an important server for me, I used the workaround of re-installing FC9 without the PCI SATA RAID card and adding the card afterwards. I will not be able to reproduce the problem, as I don't want to rebuild this system again.
I'm having problems that seem very similar: https://bugzilla.redhat.com/show_bug.cgi?id=447724 In fact much of the backtrace I get is very similar to the data originally posted for this bug report, though there are some differences to be seen in the console text and in the particulars of the system configurations.
*** Bug 447724 has been marked as a duplicate of this bug. ***
I think initrd/nash debugging comes first.
It looks like one bug is in nash: it doesn't properly check the result after dm_task_run (i.e. a dm_task_get_info call is missing), so a device without a table crashes nash. This should be easy to fix. However, another part of the problem is that this device should not have an empty table in the first place.
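The missing check described above can be illustrated with a small self-contained model. This is only a sketch: `fake_task`, `fake_task_run`, `fake_task_dep_count` and `count_deps_checked` are hypothetical stand-ins invented for illustration, not the libdevmapper or nash API, and the assumption that a failed run leaves no reply buffer behind is mine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device-mapper task; NOT the libdevmapper API. */
struct fake_reply {
    int dep_count;
};

struct fake_task {
    int device_present;        /* 0 simulates "No such device or address" */
    struct fake_reply *reply;  /* only valid after a successful run */
    struct fake_reply storage; /* backing memory for a successful reply */
};

/* Returns 1 on success and 0 on failure; on failure no reply is available. */
int fake_task_run(struct fake_task *t)
{
    if (!t->device_present) {
        t->reply = NULL;
        return 0;
    }
    t->storage.dep_count = 2;
    t->reply = &t->storage;
    return 1;
}

/* Unchecked accessor: dereferences reply, so it must not be called
 * after a failed run. */
int fake_task_dep_count(const struct fake_task *t)
{
    return t->reply->dep_count;
}

/* Guarded caller: reports -1 instead of touching the reply when the
 * task is missing or the run failed. */
int count_deps_checked(struct fake_task *t)
{
    if (t == NULL || !fake_task_run(t))
        return -1;
    return fake_task_dep_count(t);
}
```

In this model, a caller that skips the run-result check would dereference a NULL reply for a device that vanished or has no table, while the guarded caller simply reports an error.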
In the case where I encountered the bug, I had selected an encrypted LVM with the anaconda check-box. Having booted from the clear-text /boot and proceeding to mount the encrypted root, shouldn't it ask for a password at about that point, before it can really find or access the LVM? Or is it crashing just before that code?

It is also noteworthy that I had just installed the system using the F9 media supplemented by the default internet-based repository. At install time it was clearly able to probe, configure, create and access the LVM and the other system storage devices, so something that apparently works fine at install time is not working at first boot.

In case it is relevant, during the install on the same system I did get a dialog box about some unreadable storage devices that were *not* used for the Fedora 9 install and not selected to be mounted in the installed system: https://bugzilla.redhat.com/show_bug.cgi?id=447729

WARNING The partition table on device mapper/mpath0 was unreadable. To create new partitions it must be initialized, causing the loss of ALL DATA on this drive. This operation will override any previous installation choices about which drives to ignore. Would you like to initialize this drive, erasing ALL DATA?

I answered NO, so out of 5 physical drives only 'sde' contained "/", "/boot" etc. The other drives were variously blank, part of an old RAID set unrelated to FC9, or had ext3 on them. AHCI mode was selected for all drives (four SATA, one PATA), split between the ICH9R SATA ports and the motherboard's JMicron PATA/eSATA controller.

In any case, the system is still in the broken state and I haven't found a palatable workaround (I don't want to rebuild the box and unplug the drives yet), so if there are any helpful diagnostics I could run, I'm glad to try. However, I may put the system into service within a day or two, which might limit debugging afterwards.
nash segfault reproducer:

1) In one terminal, run:
   while :; do dmsetup create xxx --notable ; dmsetup remove xxx ; done
   (so device xxx appears and disappears continuously)
2) In a second terminal, run mkinitrd:

# mkinitrd --force-lvm-probe -v /boot/initrd-2.6.26-rc3.img 2.6.26-rc3
Creating initramfs
device-mapper: table ioctl failed: No such device or address
device-mapper: deps ioctl failed: No such device or address
nash received SIGSEGV! Backtrace (16):
/sbin/nash[0x40d093]
/lib64/libc.so.6[0x7f6ada8412a0]
/lib64/libdevmapper.so.1.02(dm_task_get_deps+0x4)[0x7f6adcf23004]
/usr/lib64/libnash.so.6.0.52[0x7f6add33ec1b]
/usr/lib64/libnash.so.6.0.52(nashDmDevGetName+0x3d)[0x7f6add33fab0]
/usr/lib64/libnash.so.6.0.52[0x7f6add33c2e3]
/usr/lib64/libnash.so.6.0.52[0x7f6add33c3fc]
/usr/lib64/libnash.so.6.0.52(nashBdevIterNext+0x120)[0x7f6add33c871]
/usr/lib64/libnash.so.6.0.52[0x7f6add33cabb]
/usr/lib64/libnash.so.6.0.52(nashFindFsByName+0x60)[0x7f6add33cb86]
/usr/lib64/libnash.so.6.0.52(nashAGetPathBySpec+0x86)[0x7f6add33cc73]
/sbin/nash[0x4088a1]
/sbin/nash[0x40cf49]
/sbin/nash[0x40d576]
/lib64/libc.so.6(__libc_start_main+0xfa)[0x7f6ada82d32a]
/sbin/nash[0x404509]

# rpm -q nash
nash-6.0.52-2.fc9.x86_64

(The nash used here already has bug #443332 patched with this patch:

@@ -984,7 +984,7 @@ nashDmDevGetType(nashContext *nc, dev_t
     while ((obj = dm_iter_next(iter, 1)) && (obj->devno != devno))
         ;
-    if (obj) {
+    if (obj && obj->type) {
         strncpy(buf, obj->type, 31);
         dm_iter_destroy(iter);
         return buf;
)
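The guard added by the quoted patch can be tried in isolation with a minimal sketch like the one below. The struct `dm_obj_sketch` and the function `copy_type_checked` are hypothetical names made up for this example; only the `obj && obj->type` test and the `strncpy` call mirror the patch. The explicit terminator is an addition of mine, since `strncpy` does not null-terminate the destination when the source is truncated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical object; only the `type` field mirrors the quoted patch. */
struct dm_obj_sketch {
    const char *type;
};

/* Copies obj->type into buf (capacity len) and returns buf, or returns
 * NULL when there is no object or no type string to copy. */
char *copy_type_checked(const struct dm_obj_sketch *obj, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    /* Same guard as the patch: never hand a NULL source to strncpy. */
    if (obj == NULL || obj->type == NULL)
        return NULL;

    strncpy(buf, obj->type, len - 1);
    buf[len - 1] = '\0'; /* strncpy leaves buf unterminated on truncation */
    return buf;
}
```

A caller would pass a fixed-size buffer, for example `char buf[32]`, and treat a NULL return as "type unknown" rather than dereferencing the result.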
Please yum install mkinitrd-debuginfo so that we get to see which line it fails on. It may be a manifestation of a buffer overrun bug for which I posted a patch (possibly incorrectly) under bug 443332. For me, the suggestion from Zdenek Kabelac in bug 443332 doesn't work: obj->type is never zero (for me) when the memory isn't corrupted in the first place. YMMV...
Created attachment 311365
IMAGE: traceback when entering wrong password for encrypted device

I'm seeing a similar traceback (see image) when I enter a wrong password for the encrypted device. This is on a Xen PV guest with a stock F9 install. I will apply the latest updates and try again.
The traceback in comment #12 happens after entering a wrong password for the encrypted device 3 times (Xen PV guest on i386). RPM versions:

mkinitrd-6.0.52-2.fc9.i386
initscripts-8.76.2-1.i386
device-mapper-libs-1.02.24-11.fc9.i386

Please advise whether I'm seeing the same bug or a different issue. Thanks!
Any news, possible workaround, or fix on the horizon for this? Does it still happen with F10-alpha?

IIRC, someone was blaming dmraid/device-mapper problems on glibc2 versioning at one point, in either FC8 or F9, in a different bug report or maybe a forum post. I think they said that reverting glibc2 to a previous version fixed their problem even though the new one was "officially" backward compatible. Unfortunately I don't recall what their symptoms were, but if this problem remains totally elusive to identify and solve, maybe it is worth a check.

If a fix were added to a package in 'updates', would there be any easy way to perform a proper Fedora 9 install (on a system that would otherwise crash) that includes the fix, short of doing an FTP/HTTP install? That is, could one install from the F9 DVD with networking enabled and have the system install the non-updated software from the DVD but pull any available updates from the updates repository over the internet?
(In reply to comment #14)
> Does it still happen with F10-alpha?

I still get a traceback with F10-alpha, and even after the 1st entry of a wrong password, not the 3rd as in comment #13.
Created attachment 314294
Screenshot of the segfault / Xen guest
Hi Alexander, Comment 15 is an unrelated issue in the F10 Alpha. See the release notes for the alpha for more information.
I'm seeing a similar crash dump to the one in https://bugzilla.redhat.com/show_bug.cgi?id=446669#c10 while testing Fedora Rawhide. I booted without quiet or rhgb and noticed that the kernel panics, but when using rhgb, nash seems to be spawned anyway, without having a clue that the kernel has taken a trip to never land.
After getting the password right while using rhgb, the distro booted anyway, even though it reported a kernel panic.
This message is a reminder that Fedora 9 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 9. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '9'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 9's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 9 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
It is fixed in FC11. I upgraded from FC9 straight to FC11, and it works fine now. Of course, with every upgrade I have to clean up duplicate RPMs and unsatisfied dependencies, but this is a minor nuisance.
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.