Bug 737118 - firstboot-text prevents system from booting
Summary: firstboot-text prevents system from booting
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: firstboot
Version: 16
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Martin Gracik
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Duplicates: 730926
Depends On:
Blocks: F16Beta, F16BetaBlocker
Reported: 2011-09-09 16:36 UTC by Tim Flink
Modified: 2013-07-04 12:58 UTC (History)

Fixed In Version: firstboot-16.4-1.fc16
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-09-27 16:37:40 UTC
Type: ---
Embargoed:


Attachments
serial console output with systemd.log_level=debug (34.29 KB, text/plain)
2011-09-09 16:36 UTC, Tim Flink

Description Tim Flink 2011-09-09 16:36:32 UTC
Created attachment 522366
serial console output with systemd.log_level=debug

Description of problem:
If firstboot is set to run when booting into "runlevel 3", the boot process hangs without much in the way of output.

While firstboot isn't installed for a minimal install, this would affect users that do a graphical install but end up booting into text-mode before completing firstboot.

Version-Release number of selected component (if applicable):


How reproducible:
I am able to reproduce this every time I'm booting into multi-user mode and firstboot is set to run.

Steps to Reproduce:
1. Do a fresh install that pulls in firstboot (default graphical works)
2. On the first reboot after install, add "3" to the end of the kernel command line
  
Actual results:
Boot process hangs

Expected results:
Enter text-mode firstboot or, at least, complete the boot process.

Additional info:
I've attached output from the serial console, captured with the following changes to the default kernel command line:
 - removed rhgb, quiet
 - added systemd.log_level=debug systemd.log_target=kmsg \ 
         console=ttyS0,19200n8 3
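
The command-line changes above can be sketched as a small shell helper. This is only an illustration: the function name debug_cmdline and the sample input in the usage line are hypothetical, and the report does not say how the line was actually edited.

```shell
# Hypothetical helper: strip "rhgb" and "quiet" from a kernel command line
# and append the debug parameters listed in this report.
debug_cmdline() {
    out=""
    for word in $1; do
        case "$word" in
            rhgb|quiet) ;;                  # drop the graphical-boot flags
            *) out="${out:+$out }$word" ;;  # keep everything else, in order
        esac
    done
    printf '%s %s\n' "$out" \
        "systemd.log_level=debug systemd.log_target=kmsg console=ttyS0,19200n8 3"
}
```

Usage, with a made-up starting line: debug_cmdline "ro root=/dev/sda1 rhgb quiet"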

Comment 1 Tim Flink 2011-09-09 16:40:41 UTC
This is a little bit of a stretch, but proposing as Fedora 16 Beta blocker under the following alpha release criterion [1]:

When booting a system installed without a graphical environment, or when using a correct configuration setting to cause an installed system to boot in non-graphical mode, the system should boot to a state where it is possible to log in through at least one of the default virtual consoles

Yes, that criterion does specify that the system doesn't have a graphical environment installed but I don't think that booting into multi-user mode before firstboot is run should prevent the system from booting.

[1] http://fedoraproject.org/wiki/Fedora_16_Alpha_Release_Criteria

Comment 2 Adam Williamson 2011-09-09 18:12:28 UTC
Discussed at the 2011-09-09 blocker review meeting, accepted as a blocker under the above criterion. Note that it's not a stretch: you can install without a graphical environment but with firstboot. Tim needs to re-test with TC2 and confirm, though.

Comment 3 Adam Williamson 2011-09-13 00:21:54 UTC
Tim: any news on the re-test? I'll check this too.

Lennart, Chris, Martin: any thoughts on a fix for this? Only two days till Beta RC1 time.

Comment 4 Adam Williamson 2011-09-13 00:39:08 UTC
seems to be valid in TC2. I did a 'minimal' install with the sole change of adding the 'firstboot' package to the loadout, and it's sitting here at the boot progress bar, not doin' anything.

Comment 5 Adam Williamson 2011-09-13 00:41:20 UTC
seems like if you drop 'rhgb quiet' from the boot parameters, you can see the firstboot screen, but as soon as you hit a key, you get the 'progress bar' screen again, and it just gets stuck there. definitely irritating, and definitely looks like systemd.

Comment 6 Adam Williamson 2011-09-13 00:42:33 UTC
ccing Johann in case he has ideas.

Comment 7 Adam Williamson 2011-09-13 00:51:03 UTC
Aha.

16.3-1, which I found in https://bugzilla.redhat.com/show_bug.cgi?id=734306 , appears to fix this. So we just need to pull that in for RC1.

Comment 8 Martin Gracik 2011-09-13 06:21:03 UTC
I tested this yesterday and got different results.

The /usr/bin/setup screen appeared, and I could move the cursor around the items, but pressing Enter on anything didn't work. Not even on Quit, so I got stuck too.

If you're right and 16.3-1 fixes this, then it was plymouth's fault. I had the same problem with the graphical firstboot before. In 16.3-1 I quit plymouth manually before running firstboot-text.

Comment 9 Adam Williamson 2011-09-13 16:24:16 UTC
I tested by booting to rescue mode, disabling the firstboot-text service, booting back to normal mode, doing 'yum --enablerepo=updates-testing update firstboot', enabling the firstboot-text service again, and rebooting. On the reboot it performed perfectly.
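
The sequence described above might look roughly like this as shell commands. The sketch assumes the service was toggled with systemctl (the comment does not say) and splits the work into the two boots described. The function names and the DRY_RUN switch are hypothetical; with the default DRY_RUN=1 the commands are only printed, not run.

```shell
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 actually runs it as root.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Boot 1 (rescue mode): keep firstboot-text from starting on the next boot.
# Assumption: the service is toggled with systemctl.
rescue_phase() {
    run systemctl disable firstboot-text.service
}

# Boot 2 (normal mode): pull firstboot from updates-testing, re-enable, reboot.
normal_phase() {
    run yum --enablerepo=updates-testing update firstboot
    run systemctl enable firstboot-text.service
    run reboot
}
```

Calling rescue_phase and then normal_phase in dry-run mode lists the four commands in the order given in the comment.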

Comment 10 Tim Flink 2011-09-13 17:08:55 UTC
I tried a new graphical install (beta TC2 media, skipped evolution to work around dependency issues) and made sure that firstboot-16.3-1 was installed.

I'm still seeing the same issue, though, when booting without rhgb and quiet and adding systemd.log_level=debug systemd.log_target=kmsg plymouth:debug 3

When I disable firstboot-text.service, I am able to boot without any problems - same as before.

Comment 11 Adam Williamson 2011-09-14 19:15:12 UTC
Huh. I will try some more tests, I guess. mgracik, can you look at it again too, and see if 16.3-1 fixes it for you?

Comment 12 Adam Williamson 2011-09-14 22:06:14 UTC
so, I've been testing with these two kickstarts:

http://www.happyassassin.net/extras/firstboot-old.ks
http://www.happyassassin.net/extras/firstboot-test.ks

both do a fully automated installation from TC2 DVD, with the 'minimal' package set plus firstboot. -old just uses the TC2 DVD repo, and gets firstboot-16.1-2.fc16 . -test adds a special repo I set up which contains nothing but firstboot-16.3-1.fc16 , and makes sure to pull firstboot from that repo.

I've verified the only diff between the package sets of the two installs is:

--- /tmp/pkgs-with-repo	2011-09-14 14:44:08.351425318 -0700
+++ /tmp/pkgs-without-repo	2011-09-14 15:02:29.377951434 -0700
@@ -98,7 +98,7 @@
 finger-0.17-43.fc15.x86_64
 fipscheck-1.3.0-2.fc15.x86_64
 fipscheck-lib-1.3.0-2.fc15.x86_64
-firstboot-16.3-1.fc16.x86_64
+firstboot-16.1-2.fc16.x86_64
 fontconfig-2.8.0-4.fc16.x86_64
 fpaste-0.3.7-1.fc16.noarch
 fprintd-0.2.0-3.fc15.x86_64
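
A comparison like this can be reproduced from sorted package lists. The sketch below assumes each list was saved on its own install with something like rpm -qa | sort (the comment does not say how the lists were made); the function name pkg_diff is hypothetical.

```shell
# Assumed capture step, run once on each installed system:
#   rpm -qa | sort > /tmp/pkgs-with-repo
# Print only the package lines that differ between two saved lists
# ("<" = only in the first list, ">" = only in the second).
pkg_diff() {
    diff "$1" "$2" | grep '^[<>]'
}
```

Usage: pkg_diff /tmp/pkgs-with-repo /tmp/pkgs-without-repo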

for me, installing with -old.ks reproduces the bug and I cannot work around it in any way other than to boot to 'single' mode and disable the service. Dropping 'rhgb' from the kernel parameters causes what Martin saw - you get to see the firstboot tool, but plymouth spews startup messages over it, and as soon as you press a key, you switch to the progress bar screen, where it gets stuck.

Installing with -test.ks, at my last test, made it work right out of the box.

Next stop: similar test but with a 'full' package set, as Tim's been testing.

Comment 13 Adam Williamson 2011-09-16 07:42:05 UTC
welp. definitely ain't fixed in rc1 :/ don't know what's going on here, but I can't actually get an install of rc1 with firstboot-old.ks to boot *at all* - not by taking out rhgb quiet, not even into 'single' mode. so I can't boot the install at all. which is...bad.

Comment 14 Adam Williamson 2011-09-16 07:44:35 UTC
okay, i had to take out rhgb and quiet and add single and enforcing=0, and it booted, i could disable firstboot-text.service, reboot, and got a normal boot. definitely busted.

Comment 15 Martin Gracik 2011-09-16 08:06:56 UTC
So I tried this myself: I installed Fedora 16 Beta TC2, and it didn't work. I added ExecStartPre=-/bin/plymouth quit to firstboot-text.service (this is what the new version has), restarted, and firstboot worked.
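
The change described above adds one directive to the unit file. In systemd, a leading "-" on the command path tells systemd to ignore a failure of that command, so the service still starts when plymouth is not running. A minimal sketch of the edit as a stdin-to-stdout filter follows. The function name add_plymouth_quit is hypothetical, and the unit contents in the usage line are an abbreviated assumption, not the shipped file.

```shell
# Insert "ExecStartPre=-/bin/plymouth quit" ahead of the first ExecStart= line,
# unless the directive has already been seen earlier in the input.
add_plymouth_quit() {
    awk '
        $0 == "ExecStartPre=-/bin/plymouth quit" { seen = 1 }
        /^ExecStart=/ && !seen {
            print "ExecStartPre=-/bin/plymouth quit"
            seen = 1
        }
        { print }
    '
}
```

Usage: printf '[Service]\nExecStart=/usr/bin/setup\n' | add_plymouth_quit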

I still got 2-3 lines of some strange output over the firstboot screen, but pressing Ctrl+L cleared the screen and firstboot worked as it should.

I don't know what service writes over the firstboot screen, so I don't know how to fix that thing.

Comment 16 Adam Williamson 2011-09-16 08:31:13 UTC
mgrac: best just try it with RC1, the behaviour changed (got worse) for me. this could be down to the faster kernel changing timings, it's certainly a racy thing.

rc1:

http://dl.fedoraproject.org/pub/alt/stage/16-Beta.RC1/

Comment 17 Adam Williamson 2011-09-16 08:43:11 UTC
Adding CCs for halfline and lennart, as there are systemd and plymouth interactions to be concerned with here.

Comment 18 Martin Gracik 2011-09-16 10:12:46 UTC
Yes. The same service file that works in tc2, doesn't work in rc1 at all. If I remove the ExecStartPre=-/bin/plymouth quit, it gets stuck. If I put it in, the system boots, I get a login screen, but there was no firstboot screen at all, even though I can see the service is starting. And systemctl status says everything went fine.

Comment 19 Martin Gracik 2011-09-16 10:19:09 UTC
After the boot, systemctl status firstboot-text.service reports that ExecStart=/usr/bin/setup ended with code=killed, signal=PIPE. That does not sound good.
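
For reference, SIGPIPE is delivered when a process writes to a pipe or socket whose reading end has been closed. A small sketch of pulling the signal name out of saved status text is below. The function name killed_signal is hypothetical, and the sample line in the usage note only imitates the wording quoted above; it is not copied from real systemctl output.

```shell
# Read status text on stdin and print the signal name from any
# "code=killed, signal=NAME" fragment; prints nothing if there is none.
killed_signal() {
    sed -n 's/.*code=killed, signal=\([A-Z0-9]*\).*/\1/p'
}
```

Usage: printf 'ExecStart=/usr/bin/setup (code=killed, signal=PIPE)\n' | killed_signal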

Comment 20 Adam Williamson 2011-09-16 17:33:23 UTC
worth noting that tflink had trouble even in tc2, but it does seem to be worse in rc1. I think there may be a timing element to this bug, and rc1 has a non-debug kernel, which changes boot timings quite substantially.

Comment 21 Adam Williamson 2011-09-16 17:52:21 UTC
setting back to assigned, as this is clearly not fixed.

Comment 22 Tim Flink 2011-09-17 02:32:59 UTC
Discussed in the 2011-09-16 blocker review meeting. This seems to have gotten worse from TC2 to RC1 and seems to involve systemd/plymouth interactions.

If a fix isn't available for RC2, disable firstboot-text.service for beta and move this to a final blocker.

Comment 23 Fedora Update System 2011-09-19 20:23:43 UTC
firstboot-16.4-1.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/firstboot-16.4-1.fc16

Comment 24 Tim Flink 2011-09-20 01:15:27 UTC
I tested this with fresh F16 minimal + firstboot installs using both firstboot-16.4-1.fc16 and firstboot-16.3-1.

Rebooting after install, the machine with firstboot-16.4-1 boots to a login screen and the one with 16.3-1 doesn't.

Comment 25 Fedora Update System 2011-09-20 19:04:10 UTC
Package firstboot-16.4-1.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing firstboot-16.4-1.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/firstboot-16.4-1.fc16
then log in and leave karma (feedback).

Comment 26 Martin Gracik 2011-09-21 10:03:12 UTC
*** Bug 730926 has been marked as a duplicate of this bug. ***

Comment 27 Adam Williamson 2011-09-24 06:44:03 UTC
I tested the update with a pre-RC2 image I built here and it worked - first-time boot to runlevel 3 post install did not show any kind of firstboot at all. Will confirm with the official RC2 images.

Comment 28 Fedora Update System 2011-09-26 01:41:56 UTC
firstboot-16.4-1.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 29 Hongqing Yang 2011-09-27 09:36:10 UTC
Reproduced with F16 Beta RC3 using a minimal installation with the partitioning options 'use all space' and 'use free space'.
The following message was displayed after the 'localhost login:' prompt:
ALSA sound/pci/hda/hda_intel.c:748 azx_get_response timeout, switching to polling mode: last cmd=0x000f0000

Comment 30 Martin Gracik 2011-09-27 10:26:57 UTC
Hi,

Why do you think this has anything to do with firstboot?

Could you provide some more information about what you're experiencing?

Comment 31 Hongqing Yang 2011-09-27 10:37:50 UTC
After installing with the minimal package set and the 'use all space' or 'use free space' partitioning option, the following message is displayed on the terminal at the first boot:

localhost login: ALSA sound/pci/hda/hda_intel.c:748 azx_get_response timeout, switching to polling mode: last cmd=0x000f0000

After I hit the Enter key, it tries to log in using the ALSA message as the user name:

ALSA sound/pci/hda/hda_intel.c:748 azx_get_response timeout, switching to polling mode: last cmd=0x000f0000

After that login fails, the message disappears and root can log in successfully.
This does not happen with the default partitioning.

Comment 32 Martin Gracik 2011-09-27 10:46:21 UTC
This is a bugzilla for the firstboot application. You are experiencing some errors during the "first" booting process of your installed system that have nothing to do with the actual firstboot application.

You need to create a new bugzilla and assign it to some other component. Maybe ALSA? But I'm not really sure.

Comment 33 Tim Flink 2011-09-27 16:37:40 UTC
I just re-verified this for Fedora 16 beta RC3 using an x86_64 netinstall iso and autopart full disk.

Hongqing, I don't think that the issues you're seeing are from firstboot. Maybe from one of the dependencies that it pulls in?

Either way, could you re-file that issue as another bug? I'm re-closing this one.

Comment 34 Adam Williamson 2011-09-27 19:05:41 UTC
hongqing: if you have a 'login:' prompt then it's working. you can login with it. the ALSA message is just a warning which is being sent to console when it probably ought to be logged.

