Bug 196556 - Fedora hangs on boot when kernel-2.6.17-1.2139_FC5smp is used.
Status: CLOSED DUPLICATE of bug 189708
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Alasdair Kergon
QA Contact: Brian Brock

Reported: 2006-06-24 13:47 EDT by Ondrej Dolak
Modified: 2007-11-30 17:11 EST
Doc Type: Bug Fix
Last Closed: 2006-08-31 00:13:06 EDT

Attachments:
lspci (1.90 KB, text/plain), 2006-06-24 13:47 EDT, Ondrej Dolak

Description Ondrej Dolak 2006-06-24 13:47:13 EDT
Description of problem:
Fedora hangs on boot when kernel-2.6.17-1.2139_FC5smp is used.

Version-Release number of selected component (if applicable):
kernel 2.6.17-1.2139_FC5smp

How reproducible:
always

Steps to Reproduce:
1. choose kernel 2.6.17 from grub
2. boot

  
Actual results:
- hangs

Expected results:
- boot to prompt

Additional info:
Works with previous kernels, e.g. 2.6.16-1.2133_FC5smp
Comment 1 Ondrej Dolak 2006-06-24 13:47:13 EDT
Created attachment 131483
lspci
Comment 2 Ondrej Dolak 2006-06-24 13:53:06 EDT
Sorry, I forgot one important thing, it hangs after device-mapper is initialized.
Comment 3 Dmitry Burstein 2006-06-24 18:11:53 EDT
Same for me: I have Intel Corporation 82801ER (ICH5R) SATA Controller (rev 02)
Comment 4 Gawain Lynch 2006-06-24 19:25:48 EDT
I also have an Intel Corporation 82801EB (ICH5) SATA Controller (rev 02), and it is
hanging at `Making device-mapper control node'
Comment 5 Vic Ricker 2006-06-25 19:16:58 EDT
I'm having the same problem.  It hangs after "device-mapper: 4.6.0 - ioctl
(2006-02-07) initialized: dm-devel@redhat.com"

I'm running the i686 uniprocessor kernel on AMD64/MSI K8N Neo2.

My board has 2 nVidia SATA controllers but I don't use them so they were
disabled in the BIOS.  I didn't know what they were so I re-enabled them so I
could do lspci:
00:09.0 IDE interface: nVidia Corporation CK8S Serial ATA Controller (v2.5) (rev a2)
00:0a.0 IDE interface: nVidia Corporation CK8S Serial ATA Controller (v2.5) (rev a2)
Comment 6 Keith G. Robertson-Turner 2006-06-25 19:51:42 EDT
Also hanging here at the following point:

#] device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel[AT]redhat.com
#] Loading dm-mirror.ko module
#] Loading dm-zero.ko module
#] Loading dm-snapshot.ko module

System then hangs indefinitely, but *does* respond to <CTRL><ALT><DEL> and
reboots cleanly.

Mobo is Giga-byte GA8ANXP-D with Intel ICH6R, two Raptors (unused by Linux)
containing a WinXP install on a striped fake/bios-raid. Linux is installed on
/dev/hda on the standard IDE bus (not SATA). Works fine on all previous kernel
releases.

Am rebuilding now to disable the experimental device-mapper support, since all
other attempts to disable it have failed, including:

1) ... Commenting out the whole of the "#device mapper & related initialization"
section of /etc/rc.d/rc.sysinit

2) ... Turning off the mdmpd service

3) ... Even deleting the three modules, dm-mirror.ko, dm-zero.ko, and dm-snapshot.ko

Yet still I get the above message, followed by a hang, which is very odd
considering that the referenced modules have been deleted???
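
One possible explanation, offered here as an assumption rather than something confirmed in this report: the early "Loading dm-*.ko module" lines are printed while nash runs the init script inside the initrd image, and the initrd carries its own copies of the modules, so deleting files under /lib/modules has no effect until the initrd is regenerated. A minimal sketch for checking what an initrd contains, assuming the image is a gzip-compressed cpio archive; the helper name `filter_dm_modules` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read a file listing on stdin and print only the
# device-mapper kernel modules (dm-*.ko) found in it.
filter_dm_modules() {
    grep -E '(^|/)dm-[a-z0-9_-]+\.ko$'
}

# Example use (not run here; assumes a gzip-compressed cpio initrd):
#   zcat /boot/initrd-2.6.17-1.2139_FC5smp.img | cpio -it | filter_dm_modules
```

If the listing still shows the dm modules, the boot messages come from the initrd copies and not from the files that were deleted.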
Comment 7 Keith G. Robertson-Turner 2006-06-25 20:13:58 EDT
Just a note while I rebuild, I am seeing a large amount of the following warnings:

".config:<nnnn>:warning: trying to reassign symbol <OPT>"

Where nnnn is a series of numbers and OPT is pretty much every config option,
i.e. PCI, ISA, HOTPLUG, AGP, I2C, etc.

Some also read "trying to reassign nonexistent symbol", e.g. XEN_PHYSDEV_ACCESS.

An earlier rebuild succeeded however (with the same warnings), and this is
probably not related to this bug.

Is this due to the experimental module versioning feature?
Comment 8 Dmitry 2006-06-26 13:12:40 EDT
My box hangs at 

>> device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel[AT]redhat.com

Though I don't think I have an SMP kernel (unless all of them are SMP). The box
is ASUS Terminator C3, based on VIA C3 processor. There's one 160GB Seagate SATA
drive and 512MB RAM in it. 
Comment 9 Keith G. Robertson-Turner 2006-06-27 09:56:42 EDT
The rebuild didn't help:

#] Loading ext3.ko module
#] /proc/misc: No entry for device-mapper found.
#] Is device-mapper driver missing from kernel?
#] nash received SIGSEGV! Backtrace:
<...>
#] kernel panic - not syncing: Attempted to kill init!

So it looks like we're absolutely dependent on device-mapper now, even on
systems with no LVM or RAID.

Anyway, this is obviously +UPSTREAM.
Comment 10 Leo Canale 2006-06-27 11:52:45 EDT
My box also hangs at
device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel[AT]redhat.com

I tried both kernel-2.6.17-1.2139_FC5smp and kernel-2.6.17-1.2139_FC5. I have two
80 GB SATA drives in RAID 1 on an Intel RAID chipset, and 1 GB of RAM in this machine.
kernel-2.6.16-1.2133_FC5smp works.

My other box has no RAID; with kernel-2.6.17-1.2139_FC5 it boots OK but
spits out a lot of info.

Comment 11 Andreas O. 2006-06-30 17:33:21 EDT
I have the same problem on a Promise FastTrak 20276 onboard SATA-RAID controller with
two HDs attached as RAID 0. The boot process of the 2.6.17 (non-smp) kernel
hangs at the same position: device-mapper initialised.

A few lines above, "sdb: unknown partition table" is written. It seems that the
stripe set is not recognized and both discs are handled separately. Kernel
2.6.16 still works.
Comment 12 Alasdair Kergon 2006-06-30 20:33:14 EDT
Is this a duplicate of bug 186842 or something different?
Comment 13 Leo Canale 2006-07-01 08:59:31 EDT
> Is this a duplicate of bug 186842 or something different?

No way: it boots with the previous kernel.

[engwnbie@smokey ~]$ su -
Password:
[root@smokey ~]# dmraid -r
/dev/sda: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
/dev/sdb: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
[root@smokey ~]# dmraid -s
*** Group superset isw_ececagfhaj
--> Active Subset
name   : isw_ececagfhaj_RAID_Volume1
size   : 160086016
stride : 128
type   : mirror
status : ok
subsets: 0
devs   : 2
spares : 0
[root@smokey ~]# dmraid -rD
/dev/sda: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
/dev/sdb: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
[root@smokey ~]#
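
Since the thread goes on to compare several RAID layouts, a small sketch for condensing `dmraid -s` output into one line per set may help when collecting reports. It reads only the `name` and `type` fields shown above; the function name `summarize_dmraid_sets` is hypothetical, and the parsing assumes the `key : value` layout seen in this report.

```shell
#!/bin/sh
# Hypothetical helper: condense `dmraid -s` output (on stdin) into
# "<set name> <type>" lines, one per RAID set.
summarize_dmraid_sets() {
    awk -F ':' '
        {
            # Strip whitespace around the key and the value.
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
        }
        key == "name" { name = val }
        key == "type" { print name, val }
    '
}

# Example use (needs root and the dmraid package):
#   dmraid -s | summarize_dmraid_sets
```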
Comment 14 Dmitry Burstein 2006-07-01 15:04:01 EDT
This is not a duplicate of bug 186842: everything is working for me with
kernel-smp-2.6.16-1.2133_FC5, but hangs on kernel-smp-2.6.17-1.2139_FC5.
I've tried to install the latest of dmraid (1.0.0.rc11) and device-mapper
(1.02.07) from the development branch, but with no positive results.
Just for your information, the output of "dmraid -rD" is:

/dev/sda: isw, "isw_ecidiahfeh", GROUP, ok, 160086526 sectors, data@ 0
/dev/sdb: isw, "isw_ecidiahfeh", GROUP, ok, 160086526 sectors, data@ 0
Comment 15 fimefija 2006-07-03 07:13:41 EDT
Under kernel-2.6.16, my boot RAID 0 volume:
[root@www ~]# dmraid -s
*** Active Set
name   : pdc_fiagfhab
size   : 625163264
stride : 128
type   : stripe
status : ok
subsets: 0
devs   : 2
spares : 0

No FC5 kernel-2.6.17 build, including 2139, will boot on my system, which boots from
a RAID 0 array on a Promise PDC20376 (FastTrak):

#Loading jbd.ko module
#Loading ext3.ko module
#Loading dm-mod.ko module
#device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel@redhat.com
#Loading dm-mirror.ko module
#Loading dm-zero.ko module
#Loading dm-snapshot.ko module
#Making device-mapper control node

Kernel 2.6.17-1.2139_FC5 then hangs!
Comment 16 Ondrej Dolak 2006-07-04 17:42:35 EDT
Still doesn't work under kernel-2.6.17-1.2145_FC5-smp-i686.

My raid for additional info:

sudo dmraid -rD
/dev/sda: pdc, "pdc_gdfdahcie", mirror, ok, 156250000 sectors, data@ 0
/dev/sdb: pdc, "pdc_gdfdahcie", mirror, ok, 156250000 sectors, data@ 0

sudo dmraid -s
*** Active Set
name   : pdc_gdfdahcie
size   : 156250000
stride : 128
type   : mirror
status : ok
subsets: 0
devs   : 2
spares : 0
Comment 17 Keith G. Robertson-Turner 2006-07-05 00:04:49 EDT
Re: comment 16

Confirmed.
Comment 18 Dmitry 2006-07-09 00:26:47 EDT
Same here, 2145 hangs on boot as well. My machine does not have RAID, so it
seems that the issue is not RAID-related.
Comment 19 Keith G. Robertson-Turner 2006-07-09 05:10:37 EDT
Re: comment 18

Can you boot to runlevel 3 only, and please describe the last few lines of
output before the hang? I.e. is it the same as in comment 6 ?

If not, and you really don't have SATA RAID (in use or otherwise) then you
should open a separate bug report.
Comment 20 Leo Canale 2006-07-09 10:37:43 EDT
Re: comment 19
I installed 2145 as well. My system is a RAID 1 configuration; it boots with
kernel-smp-2.6.16-1.2133_FC5 (see comment 13), but with 2145 and 2139 it won't
boot at all, so I cannot boot to runlevel 3. With 2139 my system only gets to
this line:
#] device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel[AT]redhat.com
With 2145 all it gets to is
Uncompressing Linux.. Ok, booting the kernel
Red Hat nash version 5.0.32 starting

Then it just sits there.
Comment 21 Keith G. Robertson-Turner 2006-07-09 14:13:31 EDT
Leo, I was addressing comment 18 from Dmitry, which seems like another issue
since he does not have RAID, although yes, the 2.6.17 kernels seem to have
multiple issues, which will all need to be addressed in bug reports.
Comment 22 Dmitry 2006-07-09 18:03:54 EDT
Keith, my situation is the same as Leo's. It boots to the same lines (in both
cases) and then just sits there. So I'm afraid I can't boot into RL3 either. If
there's anything else you'd like me to try, I could do that today (tomorrow the
server is going back into its closet, running an older kernel until this bug is
fixed).

One other bit of info. I don't have a RAID array, but I do use LVM (since it's a
default install option).

Folks who reported raid issues where it hangs at different lines of output, seem
to be reporting a different issue.
Comment 23 Keith G. Robertson-Turner 2006-07-09 20:58:06 EDT
Mine is the opposite situation; I don't have any LVM filesystems, but I do have
an (unused by Linux) SATA BIOS RAID (ICH6R) used by Windows. It's not set to
automount, although I believe "dmraid -ay" is called by init, and the new(ish)
HAL/udev stuff seems to be doing the same during stage1 (which might account for
why I can't disable it), but does not account for why a kernel rebuild (omitting
device-mapper) also fails (absolute dependency on DM?).

Anyway, I am *not* the bug assignee, nor the package maintainer, I'm just an
affected user much like yourself. This does AFAICT appear to be purely an
upstream issue at kernel dev, and short of patch workarounds, there really isn't
much we can do. There are some fairly major changes in 2.6.17 compared to
2.6.16, and as I've said elsewhere, it looks like there's going to be quite a
few teething problems with this release.

My advice is (not that we have much choice right now) stick with a working
kernel and resist updates until the resolution has been found. It's a poor
choice for those hoping for kernel updates to resolve earlier issues, but that's
the way it is right now.
Comment 25 Leo Canale 2006-07-10 20:39:27 EDT
Keith, re comment 21: I realized what you meant after I posted and read it again. I
was going to post to clarify, but never made it. Also, like you, I wish to
help; I'm not whining. You are right, there is a lot of noise about this kernel
release from most distros. See here:
https://www.redhat.com/archives/fedora-test-list/2006-July/msg00048.html
I don't think they know what caused it yet.
Comment 27 Chris Wiita 2006-07-14 18:38:03 EDT
I am having exactly the same error as Leo.  Booting off a promise tx150 in
mirrored mode.  Not using an SMP kernel.
Comment 28 Leo Canale 2006-07-15 11:42:28 EDT
Just installed updates.
Still doesn't work under kernel-2.6.17-1.2157_FC5-smp-i686.
Comment 29 Keith G. Robertson-Turner 2006-07-17 19:51:45 EDT
Wahey!

kernel-smp-2.6.17-1.2157_FC5.i686 WorksForMe®.

No errors or warnings.

Well done Dave and Juan!
Comment 30 fimefija 2006-07-18 06:33:35 EDT
Still can't boot kernel-2.6.17-1.2157_FC5

Under kernel-2.6.16, my boot RAID 0 volume:
[root@www ~]# dmraid -s
*** Active Set
name   : pdc_fiagfhab
size   : 625163264
stride : 128
type   : stripe
status : ok
subsets: 0
devs   : 2
spares : 0

No FC5 kernel-2.6.17 build, including 2157, will boot on my system, which boots from
a RAID 0 array on a Promise PDC20376 (FastTrak):

#Loading jbd.ko module
#Loading ext3.ko module
#Loading dm-mod.ko module
#device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel@redhat.com
#Loading dm-mirror.ko module
#Loading dm-zero.ko module
#Loading dm-snapshot.ko module
#Making device-mapper control node

kernel-2.6.17-1.2157_FC5 still hangs after device-mapper initialises and is obviously
still failing to find the RAID 0 volume /dev/mapper/pdc_fiagfhab
Comment 31 Dmitry 2006-07-18 23:08:26 EDT
2157 still hangs on

>> Uncompressing Linux.. Ok, booting the kernel
>> Red Hat nash version 5.0.32 starting

in my case. Back to 2133.
Comment 32 Dmitry 2006-08-01 01:27:50 EDT
I believe this is related to 196626. Fix suggested there may fix this bug as well.
Comment 33 Gawain Lynch 2006-08-01 03:26:37 EDT
Well, well, well... Three holes in the ground.

This indeed fixed it for me, rebuilt parted using the rawhide version and then
rebuilt mkinitrd against that and all is goodness!
Comment 34 Ondrej Dolak 2006-08-02 15:28:39 EDT
(In reply to comment #32)
> I believe this is related to 196626. Fix suggested there may fix this bug as well.

Confirmed.
Rebuilding parted and mkinitrd and reinstalling the kernel fixes this :)
Comment 35 fimefija 2006-08-13 07:48:36 EDT
kernel-2.6.17-1.2174_FC5 still fails to boot my RAID 0 array on a Promise
PDC20376 (FastTrak). It hangs after device-mapper:

#Loading jbd.ko module
#Loading ext3.ko module
#Loading dm-mod.ko module
#device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel@redhat.com
#Loading dm-mirror.ko module
#Loading dm-zero.ko module
#Loading dm-snapshot.ko module
#Making device-mapper control node

Under kernel-2.6.16, my boot RAID 0 volume:
[root@www ~]# dmraid -s
*** Active Set
name   : pdc_fiagfhab
size   : 625163264
stride : 128
type   : stripe
status : ok
subsets: 0
devs   : 2
spares : 0

Still having to stay with 2.6.16-1.2133_FC5
Comment 36 Jason Smyth 2006-08-19 05:41:59 EDT
Re: Comment 35
Did you try the fix posted in Bug 196626, fimefija? It looks like your system is
stopping at exactly the same point mine was, and this fix seems to have worked
for me:

#yum -y --enablerepo development update mkinitrd
#mv /boot/initrd-2.6.17-1.2174_FC5.img /boot/initrd-2.6.17-1.2174_FC5.img.old
#mkinitrd /boot/initrd-2.6.17-1.2174_FC5.img 2.6.17-1.2174_FC5

Your RAID array information looks similar to mine, as well:

*** Active Set
name   : pdc_fejcbccf
size   : 1250284544
stride : 128
type   : stripe
status : ok
subsets: 0
devs   : 2
spares : 0
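
The backup and rebuild steps above can be wrapped so the kernel version is given only once. This is a sketch, not a tested procedure: `rebuild_initrd` and the `DRY_RUN` variable are hypothetical names, the image path follows the `/boot/initrd-<version>.img` pattern used in this report, and it assumes mkinitrd has already been updated as in the first command.

```shell
#!/bin/sh
# Hypothetical helper: back up and regenerate the initrd for one kernel.
# Usage: rebuild_initrd <kernel-version>, e.g. 2.6.17-1.2174_FC5
# With DRY_RUN=1 the commands are printed instead of executed.
rebuild_initrd() {
    version=$1
    if [ -z "$version" ]; then
        echo "usage: rebuild_initrd <kernel-version>" >&2
        return 1
    fi
    image="/boot/initrd-${version}.img"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "mv $image $image.old"
        echo "mkinitrd $image $version"
        return 0
    fi
    # Keep the old image so the previous initrd can be restored.
    mv "$image" "$image.old" && mkinitrd "$image" "$version"
}

# Example (as root): rebuild_initrd 2.6.17-1.2174_FC5smp
```

Running it first with `DRY_RUN=1` shows the exact commands before anything under /boot is touched.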
Comment 37 fimefija 2006-08-19 10:54:20 EDT
Thanks! Having rebuilt mkinitrd, my system finally boots kernel 2.6.17!

#yum -y --enablerepo development update mkinitrd
#mv /boot/initrd-2.6.17-1.2174_FC5.img /boot/initrd-2.6.17-1.2174_FC5.img.old
#mkinitrd /boot/initrd-2.6.17-1.2174_FC5.img 2.6.17-1.2174_FC5

*** Active Set
name   : pdc_fiagfhab
size   : 625163264
stride : 128
type   : stripe
status : ok
subsets: 0
devs   : 2
spares : 0


Now running 2.6.17-1.2174!
Comment 38 Leo Canale 2006-08-20 11:25:22 EDT
And for those of us who boot the kernel 2.6.17 smp versions:

[root@smokey ~]# yum -y --enablerepo development update mkinitrd
[root@smokey ~]# mv /boot/initrd-2.6.17-1.2174_FC5smp.img /boot/initrd-2.6.17-1.2174_FC5smp.img.old
[root@smokey ~]# mkinitrd /boot/initrd-2.6.17-1.2174_FC5smp.img 2.6.17-1.2174_FC5smp

[engwnbie@smokey ~]$ su -
Password:
[root@smokey ~]# dmraid -rD
/dev/sda: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
/dev/sdb: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
[root@smokey ~]# dmraid -s
*** Group superset isw_ececagfhaj
--> Active Subset
name   : isw_ececagfhaj_RAID_Volume1
size   : 160086016
stride : 128
type   : mirror
status : ok
subsets: 0
devs   : 2
spares : 0
[root@smokey ~]#
[root@smokey ~]# uname -r
2.6.17-1.2174_FC5smp
[root@smokey ~]#
Comment 39 Marc Jadoul 2006-08-26 08:46:03 EDT
I have a similar problem on an HP nx9420.

kernel-smp-2.6.17-1.2174_FC5 does not boot while kernel-smp-2.6.17-1.2157_FC5
was OK!!

I have no RAID but I do have LVM.
I tried the proposed fix but can't find the right mkinitrd: I get either version 5.0.40
or a version that needs a glibc upgrade!
Comment 40 Jason Smyth 2006-08-26 14:24:47 EDT
Re: Comment 39

Yes, Marc, you will need to upgrade glibc as well as, if I recall correctly, one
other library in order to upgrade mkinitrd. If you update using yum with the
commands listed, it should automatically install all the dependencies for you.
Comment 41 Ronald Cole 2006-08-27 17:58:58 EDT
I've added bug 204260 and prodded bug 189708.  Perhaps we'll get an errata for
parted and mkinitrd sometime soon.
Comment 42 Peter Jones 2006-08-31 00:13:06 EDT

*** This bug has been marked as a duplicate of 189708 ***
