Bug 1691909
| Field | Value |
|---|---|
| Summary | GDM fallback from Wayland to X11 no longer works because it takes too long to start gnome-shell (affects 'basic graphics mode' / nomodeset, maybe other cases) |
| Product | [Fedora] Fedora |
| Component | gdm |
| Status | CLOSED ERRATA |
| Severity | urgent |
| Priority | unspecified |
| Version | 30 |
| Hardware | All |
| OS | Linux |
| Reporter | Adam Williamson <awilliam> |
| Assignee | Ray Strode [halfline] <rstrode> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | caillon+fedoraproject, fzatlouk, gmarr, gnome-sig, john.j5live, julen, kparal, mclasen, normand, rhughes, robatino, rstrode, samuel-rhbugs, satellitgo, zbyszek |
| Keywords | CommonBugs |
| Whiteboard | https://fedoraproject.org/wiki/Common_F30_bugs#basic-graphics-fails AcceptedBlocker |
| Fixed In Version | gdm-3.32.0-3.fc30 |
| Last Closed | 2019-04-17 16:04:40 UTC |
| Type | Bug |
| Bug Blocks | 1574714, 1574715 |
Description
Adam Williamson
2019-03-22 19:34:22 UTC
Created attachment 1547109 [details]
journal messages from an affected F30 boot, with precise timestamps
Due to https://fedoraproject.org/wiki/Changes/Login_Screen_Over_Wayland, a BIOS boot install of Workstation stops at GDM. Workaround: install Cinnamon, then Workstation as a second DE with dnf groupinstall ......, reboot, and log in to GNOME on Xorg.

Discussed during the 2019-03-25 blocker review meeting: [1]
The decision to delay the classification of this as a blocker bug was made as we felt we couldn't vote on this without a clearer understanding of what a possible fix might look like. We will delay for now for more information and vote again.
[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2019-03-25/f30-blocker-review.2019-03-25-16.01.txt

Discussed during the 2019-03-25 blocker review meeting: [1]
The decision to delay the classification of this as a blocker bug was made as this issue affects 'basic graphics' on BIOS, so in part we are delaying the decision until after we decide whether the 'basic graphics' criterion will continue to apply to Final. We also want to look further into what other cases still rely on the X11 fallback.
[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2019-03-25/f30-blocker-review.2019-03-25-16.01.txt

Just tested booting 1.7 with basic video mode on UEFI with a Ryzen 5 1600X + NVIDIA 1060. The box is not frozen (I can switch ttys and use console mode without any issues), but the first tty ends with an eternal blinking cursor after the "gdm starts" systemd notice. It works properly with nouveau if basic video mode is not set.

jlanda: you're probably running into some subsequent bug, like I did testing on bare metal. Check the journal messages, there'll probably be some kind of error in there.

Discussed during the 2019-04-01 blocker review meeting: [1]
The decision to classify this bug as an "AcceptedBlocker" was made as it is a violation of the 'basic' graphics requirement for Workstation on BIOS. It likely also affects other cases and is a showstopping bug whenever it happens.
[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2019-04-01/f30-blocker-review.2019-04-01-16.00.txt

I'm hoping to look into this tomorrow afternoon or Monday. If someone wants to investigate in the interim, it would be useful to know whether the gigantic boot slowdown also happens with an F29 kernel.

An easy workaround for this bug would be a udev rule that disables Wayland if nomodeset is on the kernel command line, so we aren't relying on fallback. Of course, that's not going to help in other situations where fallback is required. We could also resurrect a patch I have upstream to do in-session registration via an autostart file, and use that registration as a marker for success instead of assuming N seconds without failure is success. But if we can root out why it's taking 8 seconds to fall back when in F29 it took less than a second, then that's best. I mean, if F30 is shipping with slow boot, that's a problem in its own right. I'll try doing a fresh install and reproduce soonish.

You're one version out on the window - the bug affects both F29 and F30. F28 is the last release where this worked. So, you would find results on F30 but with an F28 kernel interesting?

Well, F29 is failing because of the cangraphical patch, right? Not slow boot? If boot is slow in F29 too, then yeah, an F28 kernel I guess.

F29 has both bugs, just like F30 does.

So I'm a little at a loss for this bug. I can't reproduce the slowdown on a fresh install. I ran some filtering over the log. First I trimmed everything before

Mar 22 11:42:10 localhost.localdomain gdm[992]: GdmDisplay: Managing display:

and everything after:

Mar 22 11:42:17 localhost.localdomain gnome-shell[1236]: Failed to create backend: No GPUs found with udev

Then I filtered out all the audit messages that weren't strictly related to boot progression, and finally I sorted the log by the succeeding lines that are the longest apart from each other.
$ cat journal-log | while read line; do STAMP=$(echo $line |awk '{ print $3 }' | awk -F: '{ print $3 }'); [ -z "$LAST_STAMP" ] && LAST_STAMP=$STAMP; diff=$(echo "$STAMP - $LAST_STAMP" |bc); echo -e "\v{$diff seconds\v$LAST_LINE →\v$line}"; LAST_STAMP=$STAMP; LAST_LINE="$line"; done |sort -rn | fpaste
https://paste.fedoraproject.org/paste/hNL41NHQxiSVivjWSthxIg

The biggest recurring offender is:

{.317214 seconds
Mar 22 11:42:13.632235 localhost.localdomain dbus-broker-launch[891]: Activation request for 'org.freedesktop.resolve1' failed: Unit dbus-org.freedesktop.resolve1.service not found.
Mar 22 11:42:13.949449 localhost.localdomain dbus-broker-launch[891]: Activation request for 'org.freedesktop.resolve1' failed: Unit dbus-org.freedesktop.resolve1.service not found.
}

So I think there's some problem with the installation. I don't know why resolve wouldn't be found though, it's shipped in systemd proper. It might be an SELinux problem or a dbus-broker problem.

Anyway, I've built a change to gdm to skip the fallback logic entirely if nomodeset is on the kernel command line. That should sidestep the bug (at least in the nomodeset case). It would be good to get to the bottom of the resolved problem, but I don't see it here, so maybe it's not widespread? Moving to MODIFIED given the gdm workaround, but we may want to clone for the slowdown problem.

gdm-3.32.0-3.fc30 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-de11d64b9e

gdm-3.32.0-3.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-de11d64b9e

gdm-3.32.0-3.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.
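
For anyone repeating the timestamp-gap analysis from the log filtering earlier in this thread, here is a rough sketch of the same idea in a more readable form. It is not the command that was actually run. It assumes `journalctl -o short-precise` style lines (time of day in the third field) and a log that does not cross midnight; the function name `journal_gaps` is made up for illustration. Unlike the one-liner, it converts the whole HH:MM:SS.ffffff field to seconds, so gaps across a minute boundary are not misread.

```shell
#!/bin/sh
# Sketch only: print the time gap between each pair of consecutive
# journal lines, largest gap first.
# Assumptions: field 3 of every line is HH:MM:SS.ffffff, and the log
# stays within one day. journal_gaps is a hypothetical helper name.
journal_gaps() {
    LC_ALL=C awk '
    {
        # Convert the time of day to seconds since midnight.
        split($3, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (NR > 1) {
            # gap <TAB> previous line <TAB> -> <TAB> current line
            printf "%.6f\t%s\t->\t%s\n", now - prev, prev_line, $0
        }
        prev = now
        prev_line = $0
    }' "$1" | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -rn
}
```

Possible usage, assuming the trimmed log was saved as `journal-log`: `journal_gaps journal-log | head`.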
> Mar 22 11:42:13.632235 localhost.localdomain dbus-broker-launch[891]: Activation request for 'org.freedesktop.resolve1' failed: Unit dbus-org.freedesktop.resolve1.service not found.
There was a change with respect to systemd-resolved.service recently. Before, by mistake, systemd-resolved.service was partially
enabled (in the "systemctl enable" sense), because the symlink in multi-user.target.wants/ was not present, but
the dbus activation symlink was present. This means resolved wouldn't be started automatically by systemd, but
it could be auto-activated by dbus. I assumed it doesn't make sense to have systemd-resolved enabled like that,
and cleaned this up to provide neither symlink, based on the assumption that the main "entry point" for resolved
is through the nss-resolve module, which is not present in /etc/nsswitch.conf in the default configuration.
If you think systemd-resolved.service should be enabled by default, or maybe just dbus-activatable, please
open a new bug. But on the surface of things, I'd expect all consumers to use nss and not to depend on
systemd-resolved specifically.
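
To see which of the states described above a given install is in, something like the following sketch may help. It assumes both symlinks live under /etc/systemd/system, which is where "systemctl enable" normally creates them; a distribution could ship them elsewhere. The function name `resolved_enablement` and its optional root-directory argument are hypothetical, added only so the check can be pointed at a mounted image or a test tree.

```shell
#!/bin/sh
# Sketch only: classify how systemd-resolved is enabled by looking for
# the two symlinks discussed in this comment.
# Assumption: the symlinks live under /etc/systemd/system.
# resolved_enablement and its root argument are hypothetical.
resolved_enablement() {
    root=${1:-}
    wants="$root/etc/systemd/system/multi-user.target.wants/systemd-resolved.service"
    bus_alias="$root/etc/systemd/system/dbus-org.freedesktop.resolve1.service"

    if [ -L "$wants" ] && [ -L "$bus_alias" ]; then
        echo "fully enabled"
    elif [ -L "$bus_alias" ]; then
        # The old, partially enabled state: not started at boot,
        # but dbus can activate it on demand.
        echo "dbus-activatable only"
    elif [ -L "$wants" ]; then
        echo "started at boot only"
    else
        # Neither symlink: dbus activation requests fail with
        # "Unit dbus-org.freedesktop.resolve1.service not found".
        echo "disabled"
    fi
}
```

Running `resolved_enablement` with no argument checks the live system.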