Bug 1996901 - gnome-initial-setup hangs when clicking Next on the Software page
Summary: gnome-initial-setup hangs when clicking Next on the Software page
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: gnome-initial-setup
Version: 35
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kalev Lember
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker openqa
Depends On:
Blocks: BetaBlocker, F35BetaBlocker
Reported: 2021-08-23 23:14 UTC by Adam Williamson
Modified: 2021-08-24 22:13 UTC (History)

Fixed In Version: gnome-initial-setup-41~beta-1.fc35.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-24 22:13:16 UTC
Type: Bug


Attachments
/var/log tarball from a failed test run (599.16 KB, application/gzip)
2021-08-23 23:14 UTC, Adam Williamson

Description Adam Williamson 2021-08-23 23:14:17 UTC
Created attachment 1816995
/var/log tarball from a failed test run

In Fedora-35-20210823.n.0, gnome-initial-setup-41~beta-1.fc35 landed, bringing back the "Software" page that stopped working several releases ago (see https://gitlab.gnome.org/GNOME/gnome-initial-setup/-/merge_requests/121 upstream). However, if we do a clean install and boot the installed system, when we reach that page and click the Next button, g-i-s just seems to hang with the button stuck in its 'activated' state. After waiting several minutes, it's still in that state. No tty is accessible, so the system is unusable at this point.

Note that to get even that far in a local VM I have to boot with 'enforcing=0' (it seems SELinux is denying some things, which prevents g-i-s from reaching that point otherwise). openQA seems to reach the hang without needing to go to permissive mode; I'm not sure why (it may be because in openQA we create a root account in the installed system after installing but before rebooting).

I can't immediately see what the cause of this is. I'll attach a tarball of /var/log from an affected openQA test.

Proposing as a Beta blocker per Basic criterion "A system installed with a release-blocking desktop must boot to a log in screen where it is possible to log in to a working desktop using a user account created during installation or a 'first boot' utility" - we don't get to a working desktop.

It's easy to reproduce - get https://kojipkgs.fedoraproject.org/compose/branched/Fedora-35-20210823.n.0/compose/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-35-20210823.n.0.iso , run an install, then just boot the installed system and follow through g-i-s till it hangs. If it doesn't even boot to the first screen of g-i-s, boot with `enforcing=0` (I will file separate bugs on the SELinux denials).

Comment 1 Michael Catanzaro 2021-08-23 23:36:08 UTC
Hm, gnome-initial-setup assumes that fedora-third-party finishes instantaneously. I can try reworking it to use a spinner until the command finishes, which would avoid a UI hang, but of course that won't fix the openQA test.

Would it be possible to run 'sudo fedora-third-party' in a tty to see why it's getting stuck?
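
To illustrate the failure mode described here: a caller that waits synchronously on a helper command hangs for as long as the helper does. Below is a minimal Python sketch of bounding that wait. It is not gnome-initial-setup code (which is written in C), and `run_bounded` and its timeout values are hypothetical names chosen purely for illustration.

```python
import subprocess
import sys


def run_bounded(argv, timeout_s):
    """Run argv, but give up after timeout_s seconds instead of waiting forever.

    Returns ("finished", returncode) or ("timed_out", None).
    """
    try:
        # stdin=DEVNULL so a helper that prompts for input reads EOF
        # instead of blocking on a terminal nobody can see.
        result = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising TimeoutExpired.
        return ("timed_out", None)
    return ("finished", result.returncode)
```

For example, `run_bounded([sys.executable, "-c", "import time; time.sleep(30)"], 0.5)` stands in for a stuck helper and comes back as timed out rather than blocking the caller. A timeout only bounds the hang; it does not explain why the helper is stuck.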

Comment 2 Adam Williamson 2021-08-23 23:40:21 UTC
I couldn't get into a tty when I first tested. I'll try it again tomorrow...

Comment 3 Lukas Ruzicka 2021-08-24 15:48:42 UTC
I can confirm this is also happening for me. I cannot log in on the VM's tty, because the liveuser account no longer seems to work and no new user has been created, so there is no way to log in.

Comment 4 Adam Williamson 2021-08-24 16:05:50 UTC
You can set a root password and/or create a user from the installed system root after install but before rebooting (as we do in openQA), I will try that today. There's also the systemd debug shell, but for some reason that didn't work when I tried it yesterday; I'll try that again today as well.

Comment 5 Michael Catanzaro 2021-08-24 16:14:04 UTC
(In reply to Adam Williamson from comment #4)
> You can set a root password 

That's probably best. I'm going to try soon, probably later today since this is pretty urgent, and see what I find.

> and/or create a user from the installed system
> root after install but before rebooting (as we do in openQA),

Huh, doesn't that prevent gnome-initial-setup from running at all? It should.

> I will try
> that today. There's also the systemd debug shell, but for some reason that
> didn't work when I tried it yesterday; I'll try that again today as well.

I can't use that because it's going to be qwerty only and it's just too hard to guess which key corresponds to which letter.

Comment 6 Adam Williamson 2021-08-24 16:41:05 UTC
> Huh, doesn't that prevent gnome-initial-setup from running at all? It should.

Oh, yeah, I guess creating a user would. Wasn't thinking. :D

> I can't use that because it's going to be qwerty only and it's just too hard to guess which key corresponds to which letter.

I'm not actually sure it necessarily is. We need to be able to set a correct keymap much earlier to decrypt encrypted partitions, so theoretically the rescue shell could use the correct layout. I don't think I've ever tested to see if it *does*, though.

Comment 7 Adam Williamson 2021-08-24 18:05:13 UTC
OK, so now I have the rescue shell working. ps aux shows `pkexec --user root /usr/bin/fedora-third-party disable` running, and `/usr/lib/polkit-1/polkit-agent-helper-1 root`. Perhaps there's an issue with policykit expecting user interaction or something?

Comment 8 Adam Williamson 2021-08-24 18:08:26 UTC
Attaching strace to the fedora-third-party process shows it sitting at:

restart_syscall(<... resuming interrupted read ...>

and stracing the polkit-agent-helper-1 process shows:

read(0, 

so it definitely looks like they're just sort of sitting around waiting for...something...that isn't happening. Journal doesn't show anything interesting from polkit.
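
The reasoning behind reading these traces can be sketched in a few lines of Python: strace prints a syscall's arguments when it starts and the `) = result` part only once it returns, so a line that stops at `read(0, ` is a read on file descriptor 0 (standard input) that is still in progress. The helper name `pending_read_fd` is hypothetical, and the regular expressions are a simplification that only covers lines of the shape shown above.

```python
import re

# A completed syscall line ends with ") = <result>"; a pending one does not.
_COMPLETED = re.compile(r"\)\s*=\s*\S+")
# Matches the start of a read() call and captures its fd argument.
_READ_FD = re.compile(r"^read\((\d+),")


def pending_read_fd(strace_line):
    """Return the fd number if the line is a read() still in progress, else None."""
    line = strace_line.strip()
    if _COMPLETED.search(line):
        return None  # the syscall returned, so nothing is blocked here
    match = _READ_FD.match(line)
    if match is None:
        return None  # not a read() line at all
    return int(match.group(1))
```

Applied to the polkit-agent-helper-1 trace, `pending_read_fd("read(0, ")` yields 0: the process is waiting for input on stdin.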

Comment 9 Adam Williamson 2021-08-24 18:10:33 UTC
Running either 'fedora-third-party disable' or 'pkexec --user root /usr/bin/fedora-third-party disable' from the root console directly returns immediately.

Comment 10 Adam Williamson 2021-08-24 18:29:51 UTC
Oh, the problem definitely *is* that we're waiting for authentication. I killed the fedora-third-party process from the debug shell. That caused tty1 (where g-i-s was running) to go to the "Oh no! Something went wrong" screen. I then alt-f4'ed the "Oh no" screen, and found an authentication prompt hiding "behind" it:

==== AUTHENTICATING FOR org.fedoraproject.thirdparty.run ====
Authentication is required to configure software repositories
Authenticating as: root
Password: 

so that definitely appears to be the issue.

Comment 11 Adam Williamson 2021-08-24 18:39:01 UTC
Looks like adding `org.fedoraproject.thirdparty` to the things allowed in the g-i-s policykit policy fixes this; I tested by editing it live. kalev has run a scratch build with the same change, and I'll run that through openQA to confirm the fix there.
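
The effect of such an allow-list change can be modelled in Python. This is only a sketch under stated assumptions: real polkit rules are written in JavaScript, the contents of the g-i-s rules file are not shown in this report, and the `decide` function, the `gnome-initial-setup` user name, and the `org.freedesktop.hostname1.` entry are illustrative placeholders. Only the `org.fedoraproject.thirdparty` prefix and the `org.fedoraproject.thirdparty.run` action come from the comments above.

```python
# Hypothetical model of a prefix-based polkit allow-list (not the real rules file).
ALLOWED_BEFORE = ("org.freedesktop.hostname1.",)  # illustrative entry only
ALLOWED_AFTER = ALLOWED_BEFORE + ("org.fedoraproject.thirdparty.",)


def decide(user, action_id, allowed_prefixes, setup_user="gnome-initial-setup"):
    """Return "yes" to authorize without a prompt, or None to fall through.

    Falling through means polkit's default policy applies, which for the
    action in this bug meant asking for root's password.
    """
    if user != setup_user:
        return None  # the rule only applies to the setup session's user
    if any(action_id.startswith(prefix) for prefix in allowed_prefixes):
        return "yes"
    return None
```

With `ALLOWED_BEFORE`, `decide("gnome-initial-setup", "org.fedoraproject.thirdparty.run", ALLOWED_BEFORE)` returns `None`, so the request falls through to a password prompt; with `ALLOWED_AFTER` it returns `"yes"` and no prompt is needed.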

Comment 12 Kalev Lember 2021-08-24 20:16:46 UTC
Should hopefully be fixed in gnome-initial-setup-41~beta-1.fc35.1

Comment 13 Fedora Update System 2021-08-24 20:17:23 UTC
FEDORA-2021-41d8b36cd2 has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2021-41d8b36cd2

Comment 14 Adam Williamson 2021-08-24 21:55:20 UTC
+3 in https://pagure.io/fedora-qa/blocker-review/issue/400 , marking accepted.

Comment 15 Fedora Update System 2021-08-24 22:13:16 UTC
FEDORA-2021-41d8b36cd2 has been pushed to the Fedora 35 stable repository.
If the problem still persists, please make a note of it in this bug report.

