Bug 483493 - Crash after 'Starting UDEV' on an ECS A740GM-M motherboard
Summary: Crash after 'Starting UDEV' on an ECS A740GM-M motherboard
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 10
Hardware: i686
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-02-01 19:59 UTC by Leslie Brooks
Modified: 2009-12-18 07:46 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2009-12-18 07:46:21 UTC
Type: ---
Embargoed:



Description Leslie Brooks 2009-02-01 19:59:05 UTC
Description of problem: I have installed FC10 on a system with an ASUS K8NE-Deluxe motherboard and GeForce 6200 video; it works perfectly and was updated with the latest patches today.  When I move that hard drive to the system containing my new ECS A740GM-M motherboard (onboard ATI video, 2GB RAM, AMD 1640 CPU) it crashes during the boot process; the last line is 'Starting UDEV'.  A few seconds after that line appears the screen goes blank.

I tried adding 'modprobedebug' to the kernel parameters (see Bug 465684); that showed me the 'Starting UDEV' line but nothing beyond that.  Booting to runlevel 3 also crashes after 'Starting UDEV'.


Version-Release number of selected component (if applicable):


How reproducible: Always


Steps to Reproduce:
1.
2.
3.
  
Actual results: Crashes


Expected results: Boots


Additional info:

Comment 1 Harald Hoyer 2009-02-02 09:57:22 UTC
add "udevdebug" to the kernel command line please

Comment 2 Leslie Brooks 2009-02-02 19:23:03 UTC
Adding 'udevdebug' produces lots of additional lines that scroll past too quickly to read - then it crashes and wipes the screen.  There is absolutely no pause between the last line of output and the screen wipe.

I tried photographing the screen as the debug info scrolls past, but the result is only barely readable.  The last line visible (which is almost certainly not the last line output) says "??or=189, minor=512, mode=0644, uid=0, gid=0".

I can send you the screenshot if you want it, but I imagine I will have to compress it first; it is 1.5MB.

What other methods are there for capturing the udev debug output?

Comment 3 Harald Hoyer 2009-02-03 09:41:07 UTC
"modprobedebug" really should show which modules get loaded.

"udevinfo" produces less output. Pressing "scroll lock/pause/roll" might help; alternatively, route the console output over a serial port.
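
For the serial-port route, a minimal sketch follows. It assumes the first serial port (ttyS0), 115200 baud, and a null-modem cable to a second machine; none of these values come from this report. The kernel sends its messages to every console= listed, and the last one becomes /dev/console, which is where the init scripts write.

```shell
# Assumed values: ttyS0 and 115200n8 are illustrative; adjust to the real port and speed.
SERIAL_CONSOLE_ARGS="console=tty0 console=ttyS0,115200n8"

# Append these to the kernel line at the boot loader prompt,
# together with whichever debug switch is being tested (udevdebug here).
echo "kernel arguments to append: udevdebug $SERIAL_CONSOLE_ARGS"
```

On the receiving machine, a terminal program such as "screen /dev/ttyS0 115200" (again assuming its first serial port) can capture the output with scrollback, so nothing is lost when the crashing machine wipes its screen.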

Comment 4 Leslie Brooks 2009-02-04 01:53:43 UTC
Catching output over many attempts; here is what I saw:

UDEVDEBUG
udevd[641]: udev_done: seq 641, pid [778] exit with 0, 0 seconds old
udevd[641]: udev_done: seq 641 forked, pid [779], 'add' 'acpi', 0 seconds old
udevd_event[800]: pass_env_to_socket: passed -1 bytes to socket '@/org/freedesktop/hal/udev_event',
udevd_event[800]: pass_env_to_socket: passed -1 bytes to socket '@/org/kernel/udev/monitor',
udevd-event[800]: udev_event_run: seq 642 finished with 0

That is very close to the crash; I can't press Scroll-Lock twice fast enough to pause anything after that line.  I think the last line I saw said 'seq 635'; can they be out of order?

MODPROBEDEBUG
 - produces only two lines of 'insmod'; hitting Scroll-Lock after those two lines doesn't prevent the crash, so I can't see what they say.

UDEVINFO
 - I saw seq 717, pid [794]
              seq 611, pid [816] exit with 0, 44 seconds old
   udevd-event[814]: udev_event_run: seq 636 finished with 0
  udevd-event[787]: run_program: '/sbin/modprobe input: b0019v0000p0002e0000-e0,1,k74,ramlsfw'

Is this of any help?  For what should I be looking?  I think I have some RS-232 cables around still, but will have to go dig them out to route console output out a serial port.

Is there any advantage to using 'Pause' or 'Roll' rather than Scroll Lock?

Comment 5 Harald Hoyer 2009-02-04 08:58:52 UTC
So the crash seems to be produced by an insmod. The modprobedebug output of the insmods would be the most interesting to catch.

It might be this one
/sbin/modprobe input:b0019v0000p0002e0000-e0,1,k74,ramlsfw

Do you have a kernel with which you can start your system?

Comment 6 Leslie Brooks 2009-02-04 20:51:32 UTC
I believe it boots properly with both FC9 and Ubuntu 8.?.  At the moment I don't have access to it to test, but I know I have booted at least one of the Live CDs on it.  Please explain what you would like me to do.

Comment 7 Harald Hoyer 2009-02-05 07:26:53 UTC
well, boot with the normal ISO in rescue mode, chroot to your F10 system, then 
# mv /lib/udev/rules.d/80-drivers.rules /lib/udev/rules.d/80-drivers.rules.bak

After that, reboot into F10 and, if you get as far as a shell, run:

# find /sys -name modalias|xargs cat| while read modules ; do \
                        if [ -n "$modules" ]; then \
                                echo "Loading module for $modules in 5 seconds"; \
                                sleep 5; \
                                /sbin/modprobe -a -v -q $modules; \
                                udevadm settle; \
                        fi; \
                done 

and try to catch the last line before the system crashes...

This "sleep" is a very good idea for "modprobedebug", I guess.. I will add that to start_udev.
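
If the screen is wiped too quickly even with the sleep, a variant of the loop above can record each alias on disk before loading it. This is only a sketch: LOG, MODPROBE, SYSFS and load_with_trace are illustrative names that are not part of start_udev, it assumes the root filesystem is mounted read-write, and it leaves out the "udevadm settle" step for brevity.

```shell
#!/bin/sh
# Sketch only: LOG, MODPROBE and SYSFS are hypothetical knobs for this example.
LOG=${LOG:-/root/modalias-trace.log}
MODPROBE=${MODPROBE:-/sbin/modprobe}
SYSFS=${SYSFS:-/sys}

load_with_trace() {
    # Reads one module alias per line from stdin.
    while read -r alias; do
        # Skip empty lines, as the original loop does.
        [ -n "$alias" ] || continue
        # Record the alias and flush it to disk BEFORE loading,
        # so the last entry in the log is the alias being loaded at the crash.
        printf '%s\n' "$alias" >> "$LOG"
        sync
        "$MODPROBE" -a -v -q "$alias"
    done
}

# Usage (as root, from the single-user shell):
#   find "$SYSFS" -name modalias | xargs cat | load_with_trace
# After the crash, boot another way and inspect:
#   tail -n 1 "$LOG"
```

The sync before every modprobe is slow, but it is what makes the last line of the log trustworthy after a hard crash.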

Comment 8 Leslie Brooks 2009-02-06 01:31:31 UTC
I renamed 80-drivers and rebooted, and it got almost to the end of the boot process and crashed again, this time immediately after loading smartd.  (I enable smartd on all of my systems.)

I moved the HD to another motherboard, booted, and disabled smartd.  Moved it back to the ECS mobo, booted again and it crashed again, this time just after loading anacron.

What would you like me to do next?

Comment 9 Leslie Brooks 2009-02-06 01:33:47 UTC
I should add that the ECS mobo boots fine under FC9 Live; that is what I used to rename 80-drivers.

Comment 10 Harald Hoyer 2009-02-06 10:18:47 UTC
If you can boot this far, then you can boot into single mode.
Add "s" to the kernel command line, then you should be able to boot in a shell and execute the things from comment #7

Comment 11 Leslie Brooks 2009-02-08 01:56:06 UTC
The last line was pci:v00001002d0000796Esv00001019sd00002615bc03sc00i00.

Should we assume that the other crash, at the end of the boot process with 80-drivers.rules missing, is a distinct bug and start tracking it separately?  Even with the file missing I can boot from that HD on my other mobo.
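
The alias on that last line packs the PCI identifiers of the device into a single string. A small sketch for splitting it into its fields is below; decode_pci_modalias is a hypothetical helper name, and the pattern assumes the usual pci: alias layout of vendor, device, subsystem vendor, subsystem device, class, subclass and programming interface.

```shell
#!/bin/sh
# Hypothetical helper: split a pci: modalias string into its ID fields.
decode_pci_modalias() {
    # Assumed layout: pci:v<8 hex>d<8 hex>sv<8 hex>sd<8 hex>bc<2 hex>sc<2 hex>i<2 hex>
    # Prints nothing if the string does not match that layout.
    printf '%s\n' "$1" | sed -n \
        's/^pci:v\([0-9A-F]\{8\}\)d\([0-9A-F]\{8\}\)sv\([0-9A-F]\{8\}\)sd\([0-9A-F]\{8\}\)bc\([0-9A-F]\{2\}\)sc\([0-9A-F]\{2\}\)i\([0-9A-F]\{2\}\)$/vendor=\1 device=\2 subvendor=\3 subdevice=\4 class=\5 subclass=\6 prog-if=\7/p'
}

decode_pci_modalias 'pci:v00001002d0000796Esv00001019sd00002615bc03sc00i00'
```

Vendor 1002 is ATI and base class 03 is a display controller, which is consistent with the onboard ATI video described in the original report.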

Comment 12 Harald Hoyer 2009-02-09 11:05:34 UTC
# modprobe -av pci:v00001002d0000796Esv00001019sd00002615bc03sc00i00
insmod /lib/modules/2.6.27.12-170.2.5.fc10.x86_64/kernel/drivers/i2c/algos/i2c-algo-bit.ko 
insmod /lib/modules/2.6.27.12-170.2.5.fc10.x86_64/kernel/drivers/gpu/drm/drm.ko 
insmod /lib/modules/2.6.27.12-170.2.5.fc10.x86_64/kernel/drivers/gpu/drm/radeon/radeon.ko 

Ok... you might blacklist the radeon driver so that it is not autoloaded, and re-enable 80-drivers.rules by moving it back to its old place. It might also fix the crash at the end of the boot process.

# echo blacklist radeon > /etc/modprobe.d/blacklist-radeon

Comment 13 Leslie Brooks 2009-02-10 01:31:34 UTC
Blacklisting the Radeon driver did fix the udev crash, but not the crash right at the end of booting, just before the login prompt would appear.

Do we need to troubleshoot the Radeon driver further to narrow down the cause of that crash?

Do I need to open another bug for the second crash?  How do we go about troubleshooting that one?

I am downloading the Fedora 11 alpha (live-i686) now and will let you know if that boots.

Comment 14 Harald Hoyer 2009-02-10 08:56:23 UTC
Sure, the radeon driver needs work!

Now that the radeon driver is blacklisted, repeat the procedure from comment #10: boot with "s" and execute the modprobe loop several times.

You could also disable smartd to see if it is causing the crash.

Comment 15 Leslie Brooks 2009-02-17 02:12:45 UTC
OK, sorry it has taken me so long to respond, but I ran into a different bug in the process of trying to troubleshoot the second crash.

Rather than moving my primary desktop system's HD every time I need to run a test, I decided to install FC10 on a spare HD and put it into the A740GM-M system.  However, when I did that I got:

Could not detect stabilization, waiting 10 seconds.
Unable to access resume device (UUID...)
.
.
mount: error mounting /dev/root on /sysroot as ext3: No such file or directory.

But, that same HD boots fine on my primary desktop.  So, I completely reinstalled FC10 on the same HD (on my primary desktop); it boots fine.  Blacklist the Radeon driver, transfer it to the A740GM, and it fails again with the same errors.

The spare HD is an IDE drive whereas the previous drive is a SATA drive, and I installed a PCI/IDE controller to run it, so it isn't a perfect comparison.  When I get time I will switch back to the SATA drive and finish debugging the crash.

Fedora 11 (i386-live Alpha) boots and runs just fine on the A740GM.

Comment 16 Leslie Brooks 2009-02-21 22:26:48 UTC
OK, using the original SATA HD again, I disabled smartd and did the steps from comment #7, and all of the modules loaded without an error and without a crash.  However, if I let the system boot normally the white bar gets all the way to the right (no blue bar showing) and then it crashes.

What additional tests would you like me to run?

Comment 17 Bug Zapper 2009-11-18 10:56:56 UTC
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 18 Bug Zapper 2009-12-18 07:46:21 UTC
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

