Red Hat Bugzilla – Bug 483493
Crash after 'Starting UDEV' on an ECS A740GM-M motherboard
Last modified: 2009-12-18 02:46:21 EST
Description of problem: I have installed FC10 on a system with an ASUS K8NE-Deluxe motherboard and GeForce 6200 video; it works perfectly and was updated with the latest patches today. When I move that hard drive to the system containing my new ECS A740GM-M motherboard (onboard ATI video, 2GB RAM, AMD 1640 CPU), it crashes during the boot process; the last line shown is 'Starting UDEV'. A few seconds after that line appears, the screen goes blank.
I tried adding 'modprobedebug' to the kernel parameters (see Bug 465684); that showed me the 'Starting UDEV' line but nothing beyond that. Booting to runlevel 3 also crashes after 'Starting UDEV'.
Version-Release number of selected component (if applicable):
How reproducible: Always
Steps to Reproduce:
Actual results: Crashes
Expected results: Boots
Please add "udevdebug" to the kernel command line.
Adding 'udevdebug' produces lots of additional lines that scroll past too quickly to read - then it crashes and wipes the screen. There is absolutely no pause between the last line of output and the screen wipe.
I tried photographing the screen as the debug info scrolls past, but the result is only barely readable. The last line visible (which is almost certainly not the last line output) says "??or=189, minor=512, mode=0644, uid=0, gid=0".
I can send you the screenshot if you want it, but I imagine I will have to compress it first; it is 1.5MB.
What other methods are there for capturing the udev debug output?
"modprobedebug" really should show which modules get loaded.
"udevinfo" produces less output. Pressing Scroll Lock (or Pause/Roll) might help, or you could route the console output over a serial port.
Catching output over many attempts; here is what I saw:
udevd: udev_done: seq 641, pid  exit with 0, 0 seconds old
udevd: udev_done: seq 641 forked, pid , 'add' 'acpi', 0 seconds old
udevd_event: pass_env_to_socket: passed -1 bytes to socket '@/org/freedesktop/hal/udev_event',
udevd_event: pass_env_to_socket: passed -1 bytes to socket '@/org/kernel/udev/monitor',
udevd-event: udev_event_run: seq 642 finished with 0
That is very close to the crash; I can't press Scroll-Lock twice fast enough to pause anything after that line. I think the last line I saw said 'seq 635'; can they be out of order?
- produces only two lines of 'insmod'; hitting Scroll-Lock after those two lines doesn't prevent the crash, so I can't see what they say.
- I saw seq 717, pid 
seq 611, pid  exit with 0, 44 seconds old
udevd-event: udev_event_run: seq 636 finished with 0
udevd-event: run_program: '/sbin/modprobe input: b0019v0000p0002e0000-e0,1,k74,ramlsfw'
Is this of any help? For what should I be looking? I think I have some RS-232 cables around still, but will have to go dig them out to route console output out a serial port.
Is there any advantage to using 'Pause' or 'Roll' rather than Scroll Lock?
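For the serial-port route, the kernel command line would need a serial console added to it. The sketch below is illustrative only: the port name ttyS0, the 115200 baud setting, and the helper name `with_serial_console` are assumptions, not something taken from this report. The serial console is listed last because the last `console=` entry is the one /dev/console maps to, which is where init script output such as the udev debug lines ends up.

```shell
# Assumptions: the board exposes its first serial port as ttyS0 and both
# ends of the cable run at 115200 baud, 8 data bits, no parity.
SERIAL_ARGS='console=tty0 console=ttyS0,115200n8'

# Print the kernel line to use, given the existing one as $1.
# (with_serial_console is a hypothetical helper name.)
with_serial_console() {
    printf '%s %s\n' "$1" "$SERIAL_ARGS"
}

# On the capturing machine (null-modem cable attached), something like
#   screen -L /dev/ttyS0 115200
# would log everything it receives to screenlog.0.
```

The printed line is what would be typed into the boot loader's kernel entry on the crashing machine; the capture has to be running on the other machine before the boot starts.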
So the crash seems to be produced by an insmod. The modprobedebug output of the insmods would be the most interesting thing to catch.
It might be this one
Do you have a kernel with which you can start your system?
I believe it boots properly with both FC9 and Ubuntu 8.?. At the moment I don't have access to it to test, but I know I have booted at least one of the Live CDs on it. Please explain what you would like me to do.
Well, boot the normal ISO in rescue mode, chroot into your F10 system, then:
# mv /lib/udev/rules.d/80-drivers.rules /lib/udev/rules.d/80-drivers.rules.bak
After that, reboot into F10 and, if you get that far, run:
# find /sys -name modalias|xargs cat| while read modules ; do \
if [ -n "$modules" ]; then \
echo "Loading module for $modules in 5 seconds"; \
sleep 5; \
/sbin/modprobe -a -v -q "$modules"; \
udevadm settle; \
fi; \
done
and try to catch the last line before the system crashes...
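Because the screen is wiped at the moment of the crash, one possible variation is to write each modalias to disk and flush it before the load is attempted, so the culprit can be read from the log after moving the drive to a working machine. This is only a sketch: the function name `trace_load`, the log path, and the `LOG`, `LOADER`, `SETTLE` and `DELAY` variables are assumptions made for illustration and are not part of start_udev. It also assumes the root filesystem is mounted read-write.

```shell
# Hypothetical settings; override them before calling trace_load.
LOG=${LOG:-/root/modalias-trace.log}        # where the trace is kept
LOADER=${LOADER:-/sbin/modprobe -a -v -q}   # command used to load an alias
SETTLE=${SETTLE:-udevadm settle}            # wait for udev after each load
DELAY=${DELAY:-5}                           # seconds to pause before loading

# Read modalias strings from stdin, one per line. Each one is appended to
# the log and synced to disk BEFORE the load is attempted, so the last
# line in the log names the alias that was being loaded at crash time.
trace_load() {
    while read -r modalias; do
        [ -n "$modalias" ] || continue      # skip empty lines
        echo "about to load: $modalias" >> "$LOG"
        sync
        echo "Loading module for $modalias in $DELAY seconds"
        sleep "$DELAY"
        # LOADER and SETTLE are left unquoted on purpose so that their
        # arguments are split into separate words.
        $LOADER "$modalias"
        $SETTLE
    done
}

# Intended use, from a single-user shell:
#   find /sys -name modalias | xargs cat | trace_load
```

After a crash, the final "about to load" line in the log file is the first candidate to test by hand.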
This "sleep" is a very good idea for "modprobedebug", I guess. I will add that to start_udev.
I renamed 80-drivers and rebooted, and it got almost to the end of the boot process and crashed again, this time immediately after loading smartd. (I enable smartd on all of my systems.)
I moved the HD to another motherboard, booted, and disabled smartd. Moved it back to the ECS mobo, booted again and it crashed again, this time just after loading anacron.
What would you like me to do next?
I should add that the ECS mobo boots fine under FC9 Live; that is what I used to rename 80-drivers.
If you can boot this far, then you can boot into single-user mode.
Add "s" to the kernel command line; you should then be able to boot to a shell and run the commands from comment #7.
The last line was pci:v00001002d0000796Esv00001019sd00002615bc03sc00i00.
Should we assume that the other crash, at the end of the boot process with 80-drivers.rules missing, is a distinct bug and start tracking it separately? Even with the file missing I can boot from that HD on my other mobo.
# modprobe -av pci:v00001002d0000796Esv00001019sd00002615bc03sc00i00
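The modalias string itself already says a lot about the device. A PCI modalias packs the vendor, device, subsystem vendor, subsystem device, base class, subclass and programming interface into fixed-width hex fields. The helper below, `decode_pci_modalias`, is a hypothetical name and only a sketch for pulling those fields apart; it assumes the upper 16 bits of each ID field are zero, as they are for this string.

```shell
# Split a PCI modalias into its fields. Prints nothing if the string does
# not match the expected pci:v...d...sv...sd...bc..sc..i.. layout.
decode_pci_modalias() {
    h4='\([0-9A-Fa-f]\{4\}\)'   # four hex digits, captured
    h2='\([0-9A-Fa-f]\{2\}\)'   # two hex digits, captured
    printf '%s\n' "$1" | sed -n \
        "s/^pci:v0000${h4}d0000${h4}sv0000${h4}sd0000${h4}bc${h2}sc${h2}i${h2}\$/vendor=\1 device=\2 subvendor=\3 subdevice=\4 class=\5 subclass=\6 progif=\7/p"
}

# decode_pci_modalias pci:v00001002d0000796Esv00001019sd00002615bc03sc00i00
# prints:
# vendor=1002 device=796E subvendor=1019 subdevice=2615 class=03 subclass=00 progif=00
```

Vendor 1002 is the PCI vendor ID assigned to ATI, and base class 03 denotes a display controller, so the alias points at the onboard graphics device.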
OK. You might blacklist the radeon driver so that it is not autoloaded, and re-enable 80-drivers.rules by moving it back to its old place. It might also fix the crash at the end of the boot process.
# echo blacklist radeon > /etc/modprobe.d/blacklist-radeon
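To confirm that the blacklist entry is in place before rebooting, a check along these lines may help. `is_blacklisted` is a hypothetical helper name; it only looks for a matching line in the configuration directory and does not ask modprobe how it actually interprets the files.

```shell
# Succeed if some file in the given directory contains a line of the form
# "blacklist <module>".
#   $1 = module name
#   $2 = configuration directory (defaults to /etc/modprobe.d)
is_blacklisted() {
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+$1([[:space:]]|\$)" \
        "${2:-/etc/modprobe.d}"/*
}

# Example:
#   is_blacklisted radeon && echo "radeon is blacklisted"
```

Once the system is up, `lsmod | grep radeon` printing nothing is a further sign that the module was not loaded.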
Blacklisting the Radeon driver did fix the udev crash, but not the crash right at the end of booting, just before the login prompt would appear.
Do we need to troubleshoot the Radeon driver further to narrow down the cause of that crash?
Do I need to open another bug for the second crash? How do we go about troubleshooting that one?
I am downloading the Fedora 11 alpha (live-i686) now and will let you know if that boots.
Sure, the radeon driver needs work!
Now that the radeon driver is blacklisted, repeat the procedure from comment #10: boot with "s" and run the modprobe loop several times.
You could also disable smartd to see if it is causing the crash.
OK, sorry it has taken me so long to respond, but I ran into a different bug in the process of trying to troubleshoot the second crash.
Rather than moving my primary desktop system's HD every time I need to run a test, I decided to install FC10 on a spare HD and put it into the A740GM-M system. However, when I did that I got:
Could not detect stabilization, waiting 10 seconds.
Unable to access resume device (UUID...)
mount: error mounting /dev/root on /sysroot as ext3: No such file or directory.
But, that same HD boots fine on my primary desktop. So, I completely reinstalled FC10 on the same HD (on my primary desktop); it boots fine. Blacklist the Radeon driver, transfer it to the A740GM, and it fails again with the same errors.
The spare HD is an IDE drive, whereas the previous drive is a SATA drive, and I installed a PCI/IDE controller to run it, so it isn't a perfect comparison. When I get time I will switch back to the SATA drive and finish debugging the crash.
Fedora 11 (i386-live Alpha) boots and runs just fine on the A740GM.
OK, using the original SATA HD again, I disabled smartd and did the steps from comment #7, and all of the modules loaded without an error and without a crash. However, if I let the system boot normally the white bar gets all the way to the right (no blue bar showing) and then it crashes.
What additional tests would you like me to run?
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '10'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 10's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 10 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
The process we are following is described here:
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.