Bug 439633 (NoDevNode)
| Field | Value |
| --- | --- |
| Summary | F9-Beta anaconda crash with PVCreateError |
| Product | Fedora |
| Component | anaconda |
| Version | rawhide |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED RAWHIDE |
| Severity | low |
| Priority | low |
| Reporter | Konstantin Ryabitsev <icon> |
| Assignee | Anaconda Maintenance Team <anaconda-maint-list> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | barkah71, cebbert, clydekunkel7734, dchapman, dominic.dunlop, godfearingminter, grgustaf, harald, notting, petersen, req1348, robi.petranovic |
| Keywords | Reopened |
| Whiteboard | NeedsRetesting |
| Doc Type | Bug Fix |
| Last Closed | 2008-05-08 08:53:08 UTC |
| Bug Blocks | 235706 |
Description
Konstantin Ryabitsev, 2008-03-29 20:53:07 UTC
Created attachment 299593 [details]
Dump of anaconda crash.
Hmm, this looks like another of the cases where udev isn't creating device nodes for us :-/

*** Bug 438981 has been marked as a duplicate of this bug. ***

*** Bug 439034 has been marked as a duplicate of this bug. ***

*** Bug 443361 has been marked as a duplicate of this bug. ***

Bug 443361 is under qemu/kvm, BTW.

Should this not be on the blocker list? Adding it for now anyway.

This appears to be fixed finally with today's rawhide. Successfully installing rawhide on kvm with both F8 and F9 virt-manager.

Sorry, I was over-optimistic - the backtrace still happens on F8 but not F9.

Did you really mean that it still happens on F8 but not F9? We don't really care about F8 anaconda because we're not going to be releasing an update for it. However, if it's still happening on F9 (which I'm sure it is - we don't know the root cause here), then that is something we do care about.

I meant "on" as in host, not guest: yes, it still happens for me on an F8 host but not on an F9 host when installing a rawhide guest.

Jens -- when the traceback occurs, can you look to see if the devices exist, as well as if udevd is still running?

*** Bug 443729 has been marked as a duplicate of this bug. ***

Created attachment 303522 [details]
Text version of the anaconda traceback in comment #1.

*** Bug 443828 has been marked as a duplicate of this bug. ***

*** Bug 443979 has been marked as a duplicate of this bug. ***

(In reply to comment #12)
> Jens -- when the traceback occurs, can you look to see if the devices exist
> as well as if udevd is still running?

When I hit this on ia64 with our ia64-beta tree (which is really close to the current x86 rawhide), I checked and the /dev/ entries were not there. After I rebooted back to the previously installed image (on a different drive), the /dev/ entries did exist. So yes, it sounds like udev is not creating new /dev entries under anaconda.

I am trying this again with the latest rawhide on x86_64 to see if the behavior is the same there. I will leave this in NEEDINFO until I verify that, or until Jens has a chance to confirm what he is seeing.

After a few additional tries on both x86_64 and ia64, I have not been able to reproduce this again. This includes trying the exact same bits and options where I hit this earlier today. So, kind of a tricky one to debug. I am willing to try additional stuff if anybody has ideas, but at this point I am stumped.

I just hit this problem without LVM; it fails while trying to format the swap partition.

1. Booted in rescue mode and ran these commands to wipe the partition tables:

       dd if=/dev/zero of=/dev/sda bs=32k count=1024
       dd if=/dev/zero of=/dev/sdb bs=32k count=1024

2. Rebooted to the installer and configured partitions with a custom layout (no LVM).
3. The log says:

       Running... ['mkswap', '-v1', '/dev/sdb5']
       /dev/sdb5: No such file or directory

4. /proc/partitions contains an entry for sdb5, but there is no /dev/sdb5 device node.
5. udevd is NOT running at this point. Is this consistent?

Okay, that matches what clumens saw the one time he was able to reproduce this. For some reason, udevd is dying. The question now is really "why" and "how do we reliably make this happen" :-/

I tried again and it failed the same way. I watched vt5: the udevsettle command takes a long time to run, then mkswap runs and fails. Just like before, udevd is not running, sdb5 is in /proc/partitions, and there is no /dev/sdb5.
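A minimal way to capture the state being asked about here, from a shell on tty2 during the install (device names follow the sdb5 example above; which tools are available in the installer image may vary):

    # Is udevd still running? (the [u]devd pattern keeps grep from matching itself)
    ps | grep '[u]devd'

    # The kernel's view of the partition table:
    grep sdb5 /proc/partitions

    # ...and did udev actually create the node?
    ls -l /dev/sdb5

If /proc/partitions lists the partition but the node is missing and udevd is gone, that matches the failure mode in comment #22.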
(removing NEEDINFO)

(In reply to comment #12)
> Jens -- when the traceback occurs, can you look to see if the devices exist
> as well as if udevd is still running?

Sorry, I shouldn't have rebooted the host, I guess - I can't reproduce it anymore.

*** Bug 444233 has been marked as a duplicate of this bug. ***

(In reply to comment #22)
> I tried again and it failed the same way. I watched vt5 and the udevsettle
> command takes a long time to run before mkswap then runs and fails. Just like
> before, udevd is not running, sdb5 is in /proc/partitions and there is no
> /dev/sdb5.

udevsettle will wait until its timeout if udevd isn't running, IIRC. If you can still reproduce this, want to try http://katzj.fedorapeople.org/updates-restartudev.cgz as an awful hack that restarts udev when we save partitions?

I just spent the better part of an hour trying different partitioning schemes, etc., to get this to happen, without any luck at all.

(In reply to comment #25)
> udevsettle will wait until its timeout if udevd isn't running iirc. If you can
> still reproduce this, want to try
> http://katzj.fedorapeople.org/updates-restartudev.cgz as an awful hack that
> restarts udev when we save partitions?

How do I try it?

> I just spent the better part of an hour trying different partitioning schemes,
> etc to get this to happen without any luck at all

Did you zero out the partition table before starting the install?

If udevd is not running, then udevsettle can wait until the end of the world :)

Please can anybody get a udevd coredump, so that we can analyze where and why it crashes???

(In reply to comment #26)
> How do I try it?

http://fedoraproject.org/wiki/Anaconda/Updates

> Did you zero out the partition table before starting install?

Yep.

(In reply to comment #27)
> if udevd is not running then udevsettle can wait until the end of the world :)

Well, we run it with --timeout :)

> Please can anybody get a udevd coredump, so that we can analyze where and why
> it crashes???

If I could reproduce it and find out _when_ it crashes, it'd start to be easier to get a coredump. Unfortunately, we start it from the loader's init, so it's still not going to be entirely easy to get a core :/

It looks like udevd is already gone by the time anaconda starts. With a small change, the ugly hack of restarting it works pretty well, so we do have a fallback now.

Hacky workaround (restarting udev) at http://katzj.fedorapeople.org/updates-restartudev.cgz
Information on updates.img at http://fedoraproject.org/wiki/Anaconda/Updates

If anyone who can actually hit this can verify this helps, I'll commit it. It works for me when I manually kill udevd, at least :-)

Committed the hack described in comment #30.

I'll retest next time I get some new install media.

*** Bug 445493 has been marked as a duplicate of this bug. ***

Works with the latest RC compose.

*** Bug 446590 has been marked as a duplicate of this bug. ***

*** Bug 446782 has been marked as a duplicate of this bug. ***
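The contents of the updates-restartudev.cgz workaround aren't shown in this thread, so the following is only a rough sketch of the fallback it describes (restart udevd if it has died, then re-settle before formatting anything); the exact flags are assumptions based on udev of that era:

    # Hypothetical reconstruction of the comment #30 fallback, not the
    # actual committed patch (which lives in anaconda's Python code).
    if ! ps | grep -q '[u]devd'; then
        udevd --daemon            # assumed: relaunch the daemon in the background
        udevsettle --timeout=30   # assumed value; the thread confirms --timeout is used
    fi

Restarting udevd only papers over whatever kills it in the first place, which is why the thread calls it an awful hack, but it gives the installer a working /dev again before mkswap or pvcreate runs.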