Description of problem:
The last koji-built kernel that boots successfully on my LUKS-encrypted system is 2.6.31.1-48.fc12.

Version-Release number of selected component (if applicable):
Fails with either of:
kernel-2.6.31.1-52.fc12.x86_64
kernel-2.6.31.1-56.fc12.x86_64
(-53 and -54 not tested)

Other packages:
device-mapper-1.02.38-2.fc12.x86_64
lvm2-2.02.53-2.fc12.x86_64
cryptsetup-luks-1.1.0-0.1.fc12.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot with kernel-2.6.31.1-52.fc12 or later

Actual results:
Fails.

Expected results:
Succeeds.

Additional info:
There are two SATA hard drives connected to this system.
/dev/sda1 is /boot and not encrypted
/dev/sda2 is swap
Other /dev/sda* partitions are encrypted LUKS devices containing LVM2 physical volumes.
/dev/sdb1 is an encrypted ext2 partition.
All encrypted devices use the same password.

When booting I am prompted for the encryption password, then these messages appear:

==>
device-mapper: remove ioctl failed: Device or resource busy
Key slot 0 unlocked.
error: unexpectedly disconnected from boot status daemon
WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
No root device found
Boot has failed, sleeping forever.
<==

but stair-stepped as though the stty settings are wrong.

Filing this against the kernel but CC'ing lvm2 maintainer.
I think this is not a kernel problem but probably some race or misconfiguration in dracut. BTW it would be very useful if dracut had a debug switch and ran cryptsetup with --debug (this is a new option, in F12 and rawhide). The same for lvm (there it is the -vvvv switch). (Should I file an RFE for that?) Then we could see whether the problem is with the crypto mapping or with later commands, missing modules, etc.
The message "device-mapper: remove ioctl failed: Device or resource busy" can be caused by another process scanning the temporary cryptsetup device, but cryptsetup will retry. So it is a problem to solve, but probably not the root cause of the boot failure (the --debug log should show that the retried operation succeeds).
Changing component to dracut. dracut-002-8.git845dd502.fc12.noarch
try dracut-002-9 and yes, adding a --debug and -vvvv switch would be useful. will do that
you might also want to revert the lvm and device-mapper packages to current rawhide
They *are* current rawhide. I updated it earlier today using PackageKit.

$ koji latest-pkg dist-f12 lvm2
Build                                     Tag                   Built by
----------------------------------------  --------------------  ----------------
lvm2-2.02.53-2.fc12                       dist-f12              agk

$ rpm -qi device-mapper | grep 'Source RPM'
Group       : System Environment/Base       Source RPM: lvm2-2.02.53-2.fc12.src.rpm
(In reply to comment #4)
> try dracut-002-9

From dist-f13?
Also, do I just need to 'rpm -Fvh dracut*' and reboot with the troublesome kernel, or do I need to run some command to make a new initramfs image?
you need to run dracut like you did with mkinitrd
$ koji latest-pkg dist-f12 dracut
Build                                     Tag                   Built by
----------------------------------------  --------------------  ----------------
dracut-002-8.git845dd502.fc12             dist-f12              wtogami

damn! why did dracut-002-9 not enter F-12 ???
I didn't do anything with mkinitrd, I just installed F-12 from rawhide several months ago and have been upgrading with gnome-packagekit ever since. Please tell me the command line to use.
ah, I know what's going on... preparing a patch
# dracut /boot/initramfs-<kernel version>.img <kernel version>

or, if the image already exists:

# dracut -f /boot/initramfs-<kernel version>.img <kernel version>
Hmm... seems like the plymouth client disconnects from the daemon.. very strange

error: unexpectedly disconnected from boot status daemon
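For anyone else regenerating an image by hand, here is a minimal sketch of the two cases above as a dry-run helper. It only prints the dracut command so it can be checked before running it as root. The function name `dracut_cmd` and the `BOOT_DIR` variable are made up for illustration and are not part of dracut; only the `dracut [-f] <image> <kernel version>` form comes from this comment.

```shell
#!/bin/sh
# Sketch only: print the dracut command for a kernel version.
# dracut_cmd and BOOT_DIR are hypothetical names, not dracut features.
dracut_cmd() {
    kver="$1"
    boot_dir="${BOOT_DIR:-/boot}"
    img="$boot_dir/initramfs-$kver.img"
    if [ -e "$img" ]; then
        # The image already exists, so -f is needed to overwrite it.
        printf 'dracut -f %s %s\n' "$img" "$kver"
    else
        printf 'dracut %s %s\n' "$img" "$kver"
    fi
}

# Example: show the command for the running kernel.
# dracut_cmd "$(uname -r)"
```

Printing instead of executing is a deliberate choice here: the image name has to match what the boot loader entry points at, so it is worth reading the command first.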
cryptsetup with --debug would not work, because this is executed by /bin/plymouth ask-for-password
twaugh, can you add "rdinfo quiet rdshell" to the kernel command line and hit <alt+enter> as soon as you see the graphical screen? Or add "rdinitdebug quiet rdshell" and take a photo of the last interesting pages.
oh, and you might want to remove "rhgb"
I upgraded to dracut-002-9.git99fd62e3.fc13, removed the -56 kernel, and reinstalled it.

With 'rdinfo rdshell' I don't see any extra information on the screen, and it ends with:

sh: can't access tty; job control turned off
#

(but I can't type at the prompt)

With 'rdinitdebug rdshell' the output flows past too fast for me to capture, even when videoing it.

But: removing 'rhgb' from the kernel boot command line avoids the problem entirely.
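A rough sketch of the 'rhgb' workaround as a string operation, in case it helps when editing a boot entry. `strip_param` is a hypothetical helper name, and the command line in the example is a placeholder, not the one from this system. It only rewrites a string; applying the result to the boot loader configuration is left to the reader.

```shell
#!/bin/sh
# Sketch only: drop one word (for example rhgb) from a kernel command
# line given as a single string. strip_param is a hypothetical name.
strip_param() (
    set -f                      # no glob expansion while word-splitting
    drop="$1"; shift
    out=""
    for word in $*; do
        if [ "$word" != "$drop" ]; then
            out="${out:+$out }$word"
        fi
    done
    printf '%s\n' "$out"
)

# Example with a placeholder command line:
# strip_param rhgb "ro root=/dev/mapper/example-root rhgb quiet"
# prints: ro root=/dev/mapper/example-root quiet
```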
so plymouth seems b0rken
what version of plymouth, what's the output of plymouth-set-default-theme ?
Yes, there is a problem with plymouth.x86_64 0.8.0-0.2009.29.09.1.fc12. I downgraded to a previous version plymouth-0.8.0-0.2009.28.09.fc12.x86_64 (yes, I keep a lot of "older" packages in a local mirror) and it works. I am having a big problem on another bare-metal system with X and consoles dying. I was trying to locate the problem (just which packages) so I was updating a few packages at a time and then rebooting. When I updated plymouth, the bootup died. IIRC, there was some message from glibc about some kind of loop (or something like that). I have not had time right now to go back and get better documentation. There is nothing in /var/log/messages about this.
$ rpm -q plymouth
plymouth-0.8.0-0.2009.29.09.1.fc12.x86_64
$ plymouth-set-default-theme
charge
Arrgh ... it appears that I no longer have the problem! As I said ... I have two bare-metal systems running F12-alpha-rawhide. The first is a dual processor AMD 4400+ (falcon) and the second is a quad processor AMD 940 (hawk). I had applied a bunch of updates to falcon and suddenly had a big problem with X (the only way in was thru ssh). I tried downgrading some updates but could not find the right one. So, since I had not updated hawk yet, I started updating a few packages at a time and then rebooting. During this process on hawk, I hit some kind of problem with plymouth and, after coming up in init level 3, downgraded plymouth. After that, I continued updating until only plymouth was left. I just updated plymouth and the "problem" no longer occurs! BTW, I reinstalled F12alpha using the TC F12beta DVD on falcon and then updated a few packages at a time with a reboot. Turns out the problem was xorg-x11-server-{common,Xorg} which had already been BZ'ed. Sorry I cannot help on this one. Maybe plymouth was interacting with some other package which was not updated yet. If it is a real problem, it will return.
For completeness: Yes, I get:

[gc@hawk ~]$ rpm -q plymouth
plymouth-0.8.0-0.2009.29.09.1.fc12.x86_64
[gc@hawk ~]$ plymouth-set-default-theme
charge
[gc@hawk ~]$

Also for completeness, I do not have a problem with plymouth on falcon either.
Reviewed in today's beta blocker bug review meeting. We cannot reproduce this on QA's test beds or on Jesse Keating's personal system (all of which are configured similarly). Tim definitely confirms it with today's Rawhide, however. Tim has KMS disabled. He will test with it enabled. He confirms that disabling plymouth (removing rhgb from kernel parameters) works around the issue. Tim will also try to get more diagnostics on the problem if he can. We agreed this will not be promoted to beta blocker as we cannot reproduce it on other systems, and there's a usable workaround. -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers
I *thought* I'd disabled KMS but on closer inspection it seems that KMS just doesn't work for me with the default parameters.

I just tried booting with 'vga=792' added to the command line, making the full command line the following:

ro root=/dev/mapper/vg_worm01-LogVol00 rhgb quiet SYSFONT=latarcyrheb-sun16 LANG=en_GB.UTF-8 KEYTABLE=gb rd_plytheme=charge vga=792

This boots successfully, and even with a graphical boot sequence. So there's another work-around: add 'vga=792'.

When I don't add 'vga=792', I don't get a graphical boot sequence and haven't done for months. So is it that plymouth is behaving badly in that situation?

Here's the information about my graphics card:

01:00.0 VGA compatible controller: ATI Technologies Inc RV370 5B64 [FireGL V3100 (PCIE)] (rev 80)
01:00.0 0300: 1002:5b64 (rev 80) (prog-if 00 [VGA controller])
    Subsystem: 1002:0102
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 16
    Region 0: Memory at f0000000 (32-bit, prefetchable) [size=128M]
    Region 1: I/O ports at dc00 [size=256]
    Region 2: Memory at fe9e0000 (32-bit, non-prefetchable) [size=64K]
    Expansion ROM at fea00000 [disabled] [size=128K]
    Capabilities: [50] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [58] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <128ns, L1 <2us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Latency L0 <128ns, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000  Data: 0000
    Capabilities: [100] Advanced Error Reporting
        UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
    Kernel modules: radeon, radeonfb
can you play with the 'radeon.modeset=0' (forces KMS off) and 'radeon.modeset=1' (should force KMS on) kernel parameters? (You'll get a message that the parameters are invalid - it's wrong, ignore it.)
it's interesting that the same error message crops up as in the firstboot bug we're also looking at: error: unexpectedly disconnected from boot status daemon
that's https://bugzilla.redhat.com/show_bug.cgi?id=526842 , btw.
(In reply to comment #27)
> can you play with the 'radeon.modeset=0' (forces KMS off) and
> 'radeon.modeset=1' (should force KMS on) kernel parameters?

Results:

radeon.modeset=0: Same problem as normal, no difference

radeon.modeset=1: This seems to avoid the problem. I say "seems" because although it does boot, the gdm screen never appears. It looks like X is stopping and restarting continuously -- if I time it right I can press Ctrl-Alt-F1 and Ctrl-Alt-Delete to reboot.
Tim -- Your results with radeon.modeset=1 appear to me to be a lot like the problem in: https://bugzilla.redhat.com/show_bug.cgi?id=526380 I had the problem described in 526380 with an ATI video card and it was fixed with the latest rawhide update. Do you have the latest xorg-x11-server-* updates applied?
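For repeating the two test boots, a small sketch that sets `radeon.modeset` on a command line string, replacing an existing value or appending one if it is missing. `set_param` is a hypothetical helper and the sample command lines are placeholders. It only transforms the string and does not touch the boot loader.

```shell
#!/bin/sh
# Sketch only: set key=value on a kernel command line given as a string.
# Replaces an existing key=... word, otherwise appends key=value.
# set_param is a hypothetical name.
set_param() (
    set -f                      # no glob expansion while word-splitting
    key="$1"; value="$2"; shift 2
    out=""
    found=0
    for word in $*; do
        case "$word" in
            "$key"=*) word="$key=$value"; found=1 ;;
        esac
        out="${out:+$out }$word"
    done
    if [ "$found" -eq 0 ]; then
        out="${out:+$out }$key=$value"
    fi
    printf '%s\n' "$out"
)

# Example with a placeholder command line:
# set_param radeon.modeset 0 "ro rhgb quiet"
# prints: ro rhgb quiet radeon.modeset=0
```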
tim: that's interesting - sounds like perhaps this bug is triggered with modesetting disabled but Plymouth enabled - bad interactions there, perhaps - and you're running into an unrelated X bug with modesetting enabled. Jesse, could you see if you can reproduce the bug if you boot with modesetting disabled on your test system? jlaska, ditto with the test systems we couldn't reproduce on?
error: unexpectedly disconnected from boot status daemon means plymouth crashed. If modesetting isn't working then we'll fall back to the text plugin, so maybe the bug is there? I'll try to do an install with an encrypted disk to try to reproduce this.
This should be fixed in plymouth-0.8.0-0.2009.29.09.3.fc12
Fix confirmed. Thanks!
Has been tagged: https://fedorahosted.org/rel-eng/ticket/2341 closing.