Bug 2005625
| Field | Value |
|---|---|
| Summary | gnome-keyring does not work correctly in gnome-initial-setup session (causes delay on user creation, incorrect login keyring password, failure to connect to wifi network...) |
| Product | [Fedora] Fedora |
| Component | gnome-keyring |
| Version | 35 |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | unspecified |
| Hardware | x86_64 |
| OS | Linux |
| Type | Bug |
| Reporter | Michael Catanzaro <mcatanza> |
| Assignee | Matthias Clasen <mclasen> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | alciregi, awilliam, bberg, caillon+fedoraproject, debarshir, dueno, geraldo.simiao.kutz, gmarr, gnome-sig, jstpierr, kparal, mclasen, pwhalen, robatino, rstrode, sandmann, sgrubb, stefw, tiagomatos, walters |
| Whiteboard | AcceptedBlocker AcceptedFreezeException |
| Fixed In Version | gnome-keyring-40.0-3.fc35 |
| Last Closed | 2021-09-24 00:31:13 UTC |
| Bug Blocks | 1891953, 1891954, 1891955 |

Description
Michael Catanzaro
2021-09-18 19:03:27 UTC
Proposed as a Blocker for 35-beta by Fedora user catanzaro using the blocker tracking app because: "A working mechanism to create a user account must be clearly presented during installation and/or first boot of the installed system." If this is happening for everyone, I'd say it's an obvious blocker since it prevents successful completion of the initial setup process. If it happens more rarely, then maybe not.

(In reply to Michael Catanzaro from comment #0)
> Description of problem: gnome-initial-setup hangs after passing
Er: it hangs after entering a password on the password page. mutter's unresponsive application dialog appears and does not disappear within a reasonable amount of time.

All right, that's on me. I'll redo the test and see how many seconds I have to wait for the dialog to disappear and the process to continue. :D FYI I set up a VM using virt-manager, UEFI firmware (no secure boot), 6 GB RAM and 3 to 5 CPUs, and virtio as video.

Created attachment 1824289
video of GIS successful
The dialog stayed up for only a few seconds; I clicked the "Wait" button (twice) and it finished.

In my case, yes, there is a delay after I enter a password and before the Start using Fedora Linux window; the delay lasts some seconds, but mutter's unresponsive application dialog doesn't appear. (virt-manager, BIOS, 2 CPUs, 4 GB RAM, video QXL)

I tried again and this time it took about 30 seconds, with no unresponsive application dialog. Not the end of the world, but still pretty bad.

Likely the same underlying cause as bug #2004565.

I tested with Fedora-Workstation-Live-x86_64-35-20210919.n.0.iso. I can confirm clicking Next at the password step takes 20-25 seconds before it continues, and if you try to click on the form in the meantime, you can get the "Initial Setup is not responding. [Wait] [Kill]" dialog.

Discussed during the 2021-09-20 blocker review meeting: [0]
The decision to classify this bug as both a "RejectedBlocker (Beta)", an "AcceptedBlocker (Final)" and an "AcceptedFreezeException (Beta)" was made as a conditional violation of the criterion: "A working mechanism to create a user account must be clearly presented during installation and/or first boot of the installed system." We don't think it's bad enough to block Beta, but it is bad enough to warrant a freeze exception.
[0] https://meetbot.fedoraproject.org/fedora-blocker-review/2021-09-20/f35-blocker-review.2021-09-20-16.00.txt

So I reproduced this and grabbed the logs. There are a ton of AVCs to do with watches for a zillion font paths and a few others that seem to be irrelevant. Filtering those out, I get this:

Sep 20 15:35:48 fedora usermod[1563]: change user 'test' password
Sep 20 15:35:48 fedora /usr/libexec/gdm-wayland-session[996]: dbus-daemon[996]: [session uid=983 pid=996] Activating service name='org.freedesktop.secrets' requested by ':1.64' (uid=983 pid=1381 comm="/usr/libexec/gnome-initial-setup " label="system_u:system_r:xdm_t:s0-s0:c0.c1023")
Sep 20 15:35:48 fedora gnome-keyring-daemon[1496]: The Secret Service was already initialized
Sep 20 15:35:48 fedora /usr/libexec/gdm-wayland-session[1571]: SSH_AUTH_SOCK=/run/user/983/keyring/ssh
Sep 20 15:36:13 fedora gnome-initial-s[1381]: Failed to get secret service: Error calling StartServiceByName for org.freedesktop.secrets: Timeout was reached
Sep 20 15:36:15 fedora gdm-password][1578]: gkr-pam: unable to locate daemon control file
Sep 20 15:36:15 fedora audit[1578]: USER_AUTH pid=1578 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:xdm_t:s0-s0:c0.c1023 msg='op=PAM:authentication grantors=pam_usertype,pam_localuser,pam_unix,pam_gnome_keyring acct="test" exe="/usr/libexec/gdm-session-worker" hostname=fedora addr=? terminal=/dev/tty1 res=success'
Sep 20 15:36:15 fedora gdm-password][1578]: gkr-pam: stashed password to try later in open session
Sep 20 15:36:15 fedora audit[1578]: USER_ACCT pid=1578 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:xdm_t:s0-s0:c0.c1023 msg='op=PAM:accounting grantors=pam_unix,pam_localuser acct="test" exe="/usr/libexec/gdm-session-worker" hostname=fedora addr=? terminal=/dev/tty1 res=success'
Sep 20 15:36:15 fedora audit[1578]: CRED_ACQ pid=1578 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:xdm_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_localuser,pam_unix,pam_gnome_keyring acct="test" exe="/usr/libexec/gdm-session-worker" hostname=fedora addr=? terminal=/dev/tty1 res=success'
Sep 20 15:36:15 fedora audit[1578]: USER_ROLE_CHANGE pid=1578 uid=0 auid=1000 ses=2 subj=system_u:system_r:xdm_t:s0-s0:c0.c1023 msg='pam: default-context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 selected-context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 exe="/usr/libexec/gdm-session-worker" hostname=fedora addr=? terminal=/dev/tty2 res=success'

I also tested from the same snapshot but with enforcing=0; behaviour did not change, so SELinux is not the issue here.

(In reply to Adam Williamson from comment #10)
> So I reproduced this and grabbed the logs. There are a ton of AVCs to do
> with watches for a zillion font paths and a few others that seem to be
> irrelevant.
Can you report new bugs for these, please? It seems like the selinux policy for xdm_t has somehow become generally much too strict.

I suspect https://bugzilla.redhat.com/show_bug.cgi?id=2006314 is yet another consequence of the same problem.

There's an earlier error that might be significant:

Sep 20 15:34:40 fedora gnome-keyring-daemon[1496]: couldn't register in session: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name is not activatable

Created attachment 1825028
entire journal content
Here's the full journal, just in case I missed anything else useful.
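
For readers following the timing in the filtered log excerpt above, here is a minimal Python sketch that measures the largest gap between consecutive log lines. The four lines are copied from that excerpt. The helper names (`parse_ts`, `largest_gap`) are hypothetical and not part of any tool used in this bug. The year is supplied separately because the journal excerpt does not print it.

```python
from datetime import datetime

# Month abbreviations as printed in the journal excerpt (English, locale-independent).
MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Lines copied from the filtered log in this bug report.
LINES = [
    "Sep 20 15:35:48 fedora usermod[1563]: change user 'test' password",
    "Sep 20 15:35:48 fedora gnome-keyring-daemon[1496]: The Secret Service was already initialized",
    "Sep 20 15:36:13 fedora gnome-initial-s[1381]: Failed to get secret service: "
    "Error calling StartServiceByName for org.freedesktop.secrets: Timeout was reached",
    "Sep 20 15:36:15 fedora gdm-password][1578]: gkr-pam: unable to locate daemon control file",
]


def parse_ts(line, year=2021):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; the year is passed in by the caller."""
    mon, day, clock = line.split()[:3]
    hour, minute, second = map(int, clock.split(":"))
    return datetime(year, MONTHS[mon], int(day), hour, minute, second)


def largest_gap(lines):
    """Return (seconds, line_before, line_after) for the longest pause between lines."""
    stamps = [parse_ts(line) for line in lines]
    gaps = [(later - earlier, i)
            for i, (earlier, later) in enumerate(zip(stamps, stamps[1:]))]
    delta, i = max(gaps)
    return delta.total_seconds(), lines[i], lines[i + 1]


if __name__ == "__main__":
    seconds, before, after = largest_gap(LINES)
    print(f"{seconds:.0f} s of silence before: {after}")
```

On these lines the longest pause is the 25 seconds that end at the "Timeout was reached" message, which is consistent with the 20-25 second delay reported earlier in this bug.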
So there's a long discussion about this in #fedora-desktop which boils down to 'we know what's going on and now we're debating what to do about it'. But it sounds like we can come up with a quick hack for Beta at least.

[benzea] this is glib refusing to use DBUS_SESSION_BUS_ADDRESS since quite recently when AT_SECURE is active

sorry, there was further progress. this seems to be https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2212 , basically - upstream reverted a workaround that we were still relying on. the easiest short-term fix should be just to revert the revert, and restore the workaround. I am currently testing a scratch build with that change.

(In reply to Adam Williamson from comment #17)
> sorry, there was further progress. this seems to be
> https://gitlab.gnome.org/GNOME/glib/-/merge_requests/2212 , basically -
> upstream reverted a workaround that we were still relying on. the easiest
> short-term fix should be just to revert the revert, and restore the
> workaround. I am currently testing a scratch build with that change.
Please also test this scratch build, if you don't mind: https://koji.fedoraproject.org/koji/taskinfo?taskID=76081174 (Without any changes in glib.)
This removes the capability to allocate extra secure memory. If you have a huge keyring, it should fall back to using normal malloc(), so users shouldn't notice any difference. Applications never lock the memory used to store the passwords after they're retrieved from D-Bus anyway, nor does D-Bus itself, so I'm not convinced the locking ever actually accomplished much good regardless.

sure, I've triggered openQA on that scratch build, will grab and test the ISO when it's done.

I am currently testing an install of the ISO built with my scratch build of glib.

my glib scratch build does indeed avoid the issue, but will also test mcatanzaro's gnome-keyring scratch build before we do any real ones.

*** Bug 2004565 has been marked as a duplicate of this bug. ***

mcatanzaro's build doesn't seem to do the trick. the delay after creating a user is still there, the login keyring shows as locked. as we're short on time, I'll do a real build of glib and we'll go with that for the RC.

FEDORA-2021-fa340d4bf0 has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2021-fa340d4bf0

Not the best place to follow up. I am a bit confused by mcatanzaro's scratch build not working, not sure what the change was, but I still see:

%attr(0755,root,root) %caps(cap_ipc_lock=ep) %{_bindir}/gnome-keyring-daemon

in the spec file. So maybe just the wrong .spec file got built?

The gnome-keyring-daemon is started in a weird way, IIRC, i.e. by the PAM module. I wonder if we could increase its ulimit somehow without affecting the rest of the session.

Some knobs we can modify are:
1. user@.service LimitMEMLOCK= value; which should affect most of the session (but not e.g. the shell through SSH)
2. DefaultLimitMEMLOCK= to decrease it again for most applications
3. LimitMEMLOCK= for single units

But the keyring is started from a PAM module and does seteuid/setegid to drop privileges.

Or, maybe a stupid idea. But could we possibly use socket-activation and write the password into a systemd managed socket to start the keyring? Then we have a systemd unit in the user session where we can play around with the options and the above knobs could be enough to do everything.

(In reply to Benjamin Berg from comment #24)
> Not the best place to follow up. I am a bit confused by mcatanzaro's scratch
> build not working, not sure what the change was, but I still see:
>
> %attr(0755,root,root) %caps(cap_ipc_lock=ep)
> %{_bindir}/gnome-keyring-daemon
>
> in the spec file. So maybe just the wrong .spec file got built?
Nah, I simply didn't notice that. I just added --without-libcap-ng to the configure line, so it no longer attempts to use the capability, but of course that does no good if the binary is still installed with the capability! I will do a new scratch build.
BTW Adam, if this new scratch build works, then I think we should prefer a gnome-keyring update rather than a glib update. (That said, of course it was good that you prepared the glib fix when you saw that my gnome-keyring fix didn't work.)
> The gnome-keyring-daemon is started in a weird way, IIRC. i.e. by the PAM
> module. I wonder if we could increase its ulimit somehow without affecting
> the rest of the session.
>
> Some knobs we can modify are:
> 1. user@.service LimitMEMLOCK= value; which should affect most of the
> session (but not e.g. the shell through SSH)
> 2. DefaultLimitMEMLOCK= to decrease it again for most applications
> 3. LimitMEMLOCK= for single units
>
> But the keyring is started from a PAM module and does seteuid/setegid to
> drop privileges.
That's going to make it *real* hard to debug why gnome-keyring behaves differently when started via the command line.
> Or, maybe a stupid idea. But could we possibly use socket-activation and
> write the password into a systemd managed socket to start the keyring? Then
> we have a systemd unit in the user session where we can play around with the
> options and the above knobs could be enough to do everything.
Might still be confusing, but that seems a lot simpler. Anyway, it's beyond the scope of this issue IMO.
Adam, here is a second scratch build to test: https://koji.fedoraproject.org/koji/taskinfo?taskID=76116044
If that works, I'll prepare a real update.

Thanks, Michael. We already built RC1 with the glib update in it. If the new scratch build works we can take the workaround back out of glib as a post-Beta update, though. I'll test it.

Hah, so I tested that and it doesn't work in the built image...then I checked and found the built image doesn't have that build in it. You've built it as gnome-keyring-40.0-1golbat2.fc35 , but the current stable is 40.0-2.fc35, which is higher versioned. So that's what gets pulled in when we build the image, not the scratch build. I'll re-run the test and pull in the scratch build manually after install but before running g-i-s, I guess.

OK, with the scratch build forced in, it looks like it does work. No delay, login keyring is unlocked.

(In reply to Adam Williamson from comment #28)
> Hah, so I tested that and it doesn't work in the built image...then I
> checked and found the built image doesn't have that build in it. You've
> built it as gnome-keyring-40.0-1golbat2.fc35 , but the current stable is
> 40.0-2.fc35, which is higher versioned. So that's what gets pulled in when
> we build the image, not the scratch build.
Sigh, sorry about that. The whole point of the weird custom NVR was to keep it higher than the current NVR but lower than the next real build. Oh well.
Anyway, since we have the glib workaround this is no longer urgent. Daiki, is it OK to build gnome-keyring using --without-libcap-ng and remove the capabilities from the binary? I can handle it if you agree.

*** Bug 2006314 has been marked as a duplicate of this bug. ***

yeah, 2golbat2 would've worked :D you may just have forgotten to 'git pull' before doing the scratch build, I guess.
I left the glib workaround in for Beta RC2 as it seemed safer. I think it's fine to go with that for Beta then we can get it flipped over with post-Beta updates.

(In reply to Michael Catanzaro from comment #30)
> (In reply to Adam Williamson from comment #28)
> > Hah, so I tested that and it doesn't work in the built image...then I
> > checked and found the built image doesn't have that build in it. You've
> > built it as gnome-keyring-40.0-1golbat2.fc35 , but the current stable is
> > 40.0-2.fc35, which is higher versioned. So that's what gets pulled in when
> > we build the image, not the scratch build.
>
> Sigh, sorry about that. The whole point of the weird custom NVR was to keep
> it higher than the current NVR but lower than the next real build. Oh well.
>
> Anyway, since we have the glib workaround this is no longer urgent. Daiki,
> is it OK to build gnome-keyring using --without-libcap-ng and remove the
> capabilities from the binary? I can handle it if you agree.
I guess I have to agree, sorry for not responding in upstream. Steve, do you have any strong objections? We could make it better after migrating to a new keyring file format...

(In reply to Adam Williamson from comment #32)
> yeah, 2golbat2 would've worked :D you may just have forgotten to 'git pull'
> before doing the scratch build, I guess.
Nah, I just messed up and assumed the current version was -1. A look at the diff would have avoided that. Oh well.
> I left the glib workaround in for Beta RC2 as it seemed safer. I think it's
> fine to go with that for Beta then we can get it flipped over with post-Beta
> updates.
Agreed. For simplicity, I'm going to revert your glib change in rawhide now, but I'll leave it alone in F35 since there is no particularly urgent reason to drop your workaround.

(In reply to Daiki Ueno from comment #33)
> I guess I have to agree, sorry for not responding in upstream. Steve, do you
> have any strong objections? We could make it better after migrating to a new
> keyring file format...
I've gone ahead with the change in Fedora. Let's keep discussing in https://gitlab.gnome.org/GNOME/gnome-keyring/-/issues/77.

FEDORA-2021-fa340d4bf0 has been pushed to the Fedora 35 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-fa340d4bf0` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-fa340d4bf0 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

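
The release-ordering mix-up discussed above (40.0-1golbat2.fc35 sorting below the stable 40.0-2.fc35, while 2golbat2 "would've worked") can be illustrated with a small Python sketch. This is a simplified re-implementation of RPM's segment-wise label comparison written for illustration only. It is not the code rpm or dnf actually runs, it ignores epochs and the special `~` and `^` characters, and the function name `rpm_label_compare` is hypothetical.

```python
import re

# A label is split into runs of digits or runs of ASCII letters;
# every other character (".", "_", ...) only acts as a separator.
_SEGMENT = re.compile(r"[0-9]+|[A-Za-z]+")


def rpm_label_compare(a: str, b: str) -> int:
    """Return -1, 0 or 1 if label a is older than, equal to, or newer than b.

    Simplified sketch: no epoch, no '~' or '^' handling.
    """
    segs_a = _SEGMENT.findall(a)
    segs_b = _SEGMENT.findall(b)
    for x, y in zip(segs_a, segs_b):
        x_num, y_num = x.isdigit(), y.isdigit()
        if x_num and y_num:
            # Numeric segments compare as integers (leading zeros ignored).
            if int(x) != int(y):
                return -1 if int(x) < int(y) else 1
        elif x_num != y_num:
            # A numeric segment is considered newer than an alphabetic one.
            return 1 if x_num else -1
        elif x != y:
            # Alphabetic segments compare as plain strings.
            return -1 if x < y else 1
    # All shared segments are equal: the label with segments left over is newer.
    if len(segs_a) == len(segs_b):
        return 0
    return -1 if len(segs_a) < len(segs_b) else 1


if __name__ == "__main__":
    # Releases from this bug; the version part ("40.0") is identical in all of them.
    print(rpm_label_compare("1golbat2.fc35", "2.fc35"))  # -1: scratch build loses to stable
    print(rpm_label_compare("2golbat2.fc35", "2.fc35"))  # 1: would have beaten stable
    print(rpm_label_compare("2golbat2.fc35", "3.fc35"))  # -1: still below the next real build
```

Because the first segment of the release is compared numerically, `1golbat2` loses to `2` before the custom suffix is ever looked at, which is why the image build pulled in the stable package instead of the scratch build.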
FEDORA-2021-824a0b6f04 has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2021-824a0b6f04

I suppose it's OK. I really have no idea how gnome-keyring has changed since the last time I looked at it.

Created attachment 1825541
third party repos screenshot
I tested F35 RC2 with
glib2-2.70.0-2.fc35.x86_64
gnome-initial-setup-41~rc-3.fc35.x86_64
I can't test wifi configuration, but the other problems seem resolved. Location services and problem reporting toggles are honored. The login keyring is now functional.
I do have a question about third party repositories. Enabling them in gnome-initial-setup toggles "Enable New Repositories" under "Fedora Third Party Repositories", but doesn't actually enable any of those 4 listed (see the attached screenshot). So when searching for Chrome or Steam, there are no results. Is that expected/desired? Should I file a separate bug about it? (The flathub filtered repo is also nowhere to be found, expected?)

tested Fedora-Workstation-Live-x86_64-35_Beta-1.2.iso and it seems to me this bug is really fixed now. Made a video of the successful first and second login: https://youtu.be/7S-bMIyJbwQ

(In reply to Kamil Páral from comment #38)
> I do have a question about third party repositories. Enabling them in
> gnome-initial-setup toggles "Enable New Repositories" under "Fedora Third
> Party Repositories", but doesn't actually enable any of those 4 listed (see
> the attached screenshot).
That's all expected.
> So when searching for Chrome or Steam, there are
> no results. Is that expected/desired? Should I file a separate bug about it?
Not expected. The repo metadata should be enabled. Please report a bug against gnome-software.
> (The flathub filtered repo is also nowhere to be found, expected?)
Not sure. I would start with a bug against fedora-third-party.

(In reply to Michael Catanzaro from comment #41)
> Not sure. I would start with a bug against fedora-third-party.
Actually I think that's just bug #2001837, which is not fixed yet.

FEDORA-2021-fa340d4bf0 has been pushed to the Fedora 35 stable repository. If problem still persists, please make note of it in this bug report.

FEDORA-2021-824a0b6f04 has been pushed to the Fedora 35 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-824a0b6f04` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-824a0b6f04 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

FEDORA-2021-824a0b6f04 has been pushed to the Fedora 35 stable repository. If problem still persists, please make note of it in this bug report.

(In reply to Michael Catanzaro from comment #34)
> Agreed. For simplicity, I'm going to revert your glib change in rawhide now,
> but I'll leave it alone in F35 since there is no particularly urgent reason
> to drop your workaround.
Note: I'm going to revert the glib workaround in F35 now, since it should no longer be needed. If anything goes wrong we can always put it back, but I don't expect problems.