Created attachment 2022681 [details]
photo of initial setup

Tested Fedora 40 Beta RC 1.9 KDE on Raspberry Pi 4:
- graphical output of initial setup is broken, see attached photo
- when rebooted with nomodeset, initial setup works as expected, I can go through it and finish the setup, but graphical desktop won't load, all I see is black screen and mouse cursor
- when rebooted without nomodeset, desktop loads and works as expected

mesa-dri-drivers-24.0.0-2.fc40.aarch64
kernel-6.8.0-0.rc6.49.fc40.aarch64
bcm283x-firmware-20240229-2.dc94391.fc40.aarch64
initial-setup-0.3.100-3.fc40.aarch64
initial-setup-gui-wayland-plasma-40.0-1.fc40.noarch
initial-setup-gui-0.3.100-3.fc40.aarch64
kwin-wayland-6.0.2-1.fc40.aarch64
kwin-6.0.2-1.fc40.aarch64
Created attachment 2022695 [details] journal from the first boot (rhgb quiet)
Created attachment 2022696 [details] journal from the second boot (nomodeset)
Created attachment 2022697 [details] journal from the third boot (rhgb quiet)
I think it's safe to say this is the same regression as seen with GNOME, so let's deal with it in one bug to avoid confusion and keep things in a single location. *** This bug has been marked as a duplicate of bug 2269412 ***
This scratch build of mesa https://koji.fedoraproject.org/koji/taskinfo?taskID=115518788 fixes bug 2269412 but doesn't fix this one, so it is not the same regression. Reopening.
The bug is still present in rawhide, tested Fedora-KDE-Rawhide-20240618.n.0.aarch64.
Present on Fedora-KDE-41-20240821.n.0 , now blocking candidate.
So if I get this straight, we're talking about initial-setup looking broken *with* modesetting (ie with the normal rpi gfx driver) but fine *without* (ie using simpledrm), but then Plasma is a black screen *without* modesetting but works fine *with* modesetting? This is kind of baffling since we use KWin for both phases.
Reproduced with Fedora-KDE-41-20240824.n.0.aarch64.raw.xz.

Possibly related kernel messages:
| [ 69.410224] vc4-drm gpu: swiotlb buffer is full (sz: 540672 bytes), total 32768 (slots), used 1381 (slots)
| [ 71.989444] vc4-drm gpu: swiotlb buffer is full (sz: 1560576 bytes), total 32768 (slots), used 7 (slots)
| [ 72.111001] vc4-drm gpu: swiotlb buffer is full (sz: 12582912 bytes), total 32768 (slots), used 1 (slots)
| [ 72.471197] vc4-drm gpu: swiotlb buffer is full (sz: 520192 bytes), total 32768 (slots), used 1 (slots)

After reading through #2269412 I see that the CMA pool is just 64MB
| Reserved memory: created CMA memory pool at 0x0000000035e00000, size 64 MiB
instead of 256MB as intended by /boot/efi/config.txt which specifies `dtoverlay=cma,cma-256`. Overriding that by appending "cma=256M" to the kernel command line makes no difference though.

The usage of swiotlb is a little strange though. https://github.com/raspberrypi/linux/issues/3416#issuecomment-576375208 suggests v3d should be using direct DMA and not bounce buffers. Support for that could easily be missing in the upstream kernel.
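To make the request sizes in the swiotlb messages easier to compare, they can be pulled out with a short script. This is only a sketch: the regular expression and the helper name `parse_swiotlb_failures` are assumptions fitted to the four lines quoted in this comment, not a general dmesg parser.

```python
import re

# The four kernel messages quoted in this comment.
LOG = """\
[ 69.410224] vc4-drm gpu: swiotlb buffer is full (sz: 540672 bytes), total 32768 (slots), used 1381 (slots)
[ 71.989444] vc4-drm gpu: swiotlb buffer is full (sz: 1560576 bytes), total 32768 (slots), used 7 (slots)
[ 72.111001] vc4-drm gpu: swiotlb buffer is full (sz: 12582912 bytes), total 32768 (slots), used 1 (slots)
[ 72.471197] vc4-drm gpu: swiotlb buffer is full (sz: 520192 bytes), total 32768 (slots), used 1 (slots)
"""

# Assumed message shape, derived only from the lines above.
PATTERN = re.compile(
    r"swiotlb buffer is full \(sz: (\d+) bytes\), "
    r"total (\d+) \(slots\), used (\d+) \(slots\)"
)


def parse_swiotlb_failures(log):
    """Return a list of (requested_bytes, total_slots, used_slots) tuples."""
    return [tuple(int(g) for g in m.groups()) for m in PATTERN.finditer(log)]


failures = parse_swiotlb_failures(LOG)
for size, total, used in failures:
    print(f"{size / 1024:.0f} KiB requested, {used}/{total} slots in use")
```

All four requests are reported as failing while only a small fraction of the slots is in use, which is part of why the swiotlb usage looks odd here.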
Discussed during the 2024-08-26 blocker review meeting: [1]

The decision to classify this bug as an AcceptedBlocker (Beta) was made: "A system installed with a release-blocking desktop must boot to a log in screen where it is possible to log in to a working desktop using a user account created during installation or a 'first boot' utility", for the now release-blocking configuration of KDE on the Raspberry Pi

[1] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2024-08-26/f41-blocker-review.2024-08-26-16.00.log.html
This seems to be a modifier mismatch between kwin_wayland and Xwayland. Starting Wayland applications like konsole on the Wayland display with
> QT_QPA_PLATFORM=wayland-egl DISPLAY=:0 XDG_RUNTIME_DIR=/tmp/tmp.OJ8Rlhgwig-initial-setup-runtime-dir konsole
works without corruption, as does starting glxgears. Classic X11 applications like xeyes, however, show similar corruption.

The corruption looks like it could be caused by interpreting a buffer with a linear image as if it had DRM_FORMAT_MOD_BROADCOM_UIF, or vice versa.

Changing GDK_BACKEND to "wayland" for the initial setup unfortunately does not work, as noted in /usr/libexec/initial-setup/initial-setup-graphical. That would have made the issue less severe.
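As an illustration of why a layout mismatch produces this kind of scrambled picture, the sketch below stores a small image tile by tile and then reads it back as if it were linear. The 2x2 tile layout and the helper name `linear_to_tiled` are made up for the example; this is not the real UIF layout, only the general effect.

```python
def linear_to_tiled(pixels, width, height, tile=2):
    """Reorder a row-major image so that each tile is stored contiguously.

    Toy layout for illustration only, not the Broadcom UIF layout.
    """
    out = []
    for ty in range(0, height, tile):          # tile rows
        for tx in range(0, width, tile):       # tile columns
            for y in range(ty, ty + tile):     # rows inside the tile
                for x in range(tx, tx + tile):
                    out.append(pixels[y * width + x])
    return out


width = height = 4
image = list(range(width * height))            # pixel value == linear index

tiled = linear_to_tiled(image, width, height)

# A consumer that wrongly assumes a linear layout reads rows straight out
# of the tiled storage and gets the pixels in the wrong places.
misread_rows = [tiled[y * width:(y + 1) * width] for y in range(height)]
for row in misread_rows:
    print(row)
```

The pixel data is intact, but neighbouring pixels end up in the wrong rows, which matches the general look of the corruption more than it would match missing or zeroed data.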
(In reply to Janne Grunau from comment #11)
> Changing GDK_BACKEND to "wayland" for the initial setup unfortunately does
> not work as noted in /usr/libexec/initial-setup/initial-setup-graphical.
> That would have made the issue less severe.

This is something that might be possible once Wayland-native Anaconda is done for F42.
As a workaround for initial-setup, kwin_wayland's compositing backend could be overridden to QPainter with "KWIN_COMPOSE=Q kwin_wayland ..." in /usr/libexec/initial-setup/run-gui-backend. See https://invent.kde.org/plasma/kwin/-/wikis/Environment-Variables#kwin_compose

This still leaves kwin_wayland + Xwayland broken but at least unblocks initial-setup. I still do not understand where or how the mixup between DRM_FORMAT_MOD_LINEAR and (probably) DRM_FORMAT_MOD_BROADCOM_UIF is coming from.
A slightly better workaround would be to export 'XWAYLAND_NO_GLAMOR=1' before starting kwin_wayland / Xwayland.

I have identified the root cause of the issue though. kwin_wayland reports the KMS DRM device as the main and only device in its dmabuf feedback. This is unexpected since the GPU device would be a much better choice for interoperability between compositor and applications. Due to this choice of device, Xwayland cannot support explicit modifiers. I think Xwayland handles this situation incorrectly because it decides to allocate buffers with implicit modifiers instead of a linear layout, as the linux_dmabuf_feedback main_device event documentation specifies.

I'll be working on an Xwayland patch/PR to use linear surfaces. I started a discussion with the kwin developers and will file a bug in their bugzilla tomorrow.

I believe this issue affects all devices which use mesa kmsro drivers. Despite this, it is not a mesa issue but a combination of issues in kwin and Xwayland.
Created attachment 2045039 [details]
WIP Xwayland patch resolving the issue

Work-in-progress Xwayland patch to fall back to linear surfaces if no modifiers are available due to mismatching DRM devices in xwl_dmabuf_get_modifiers_for_device(). It ignores the "corner" case of not using dmabuf protocol version 4 or later.

A better fix for explicit modifier support would be to source the list of supported modifiers in Xwayland independently and intersect that list with the dmabuf feedback list, ignoring DRM devices.
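The fallback described for the WIP patch can be modelled roughly as follows. This is a Python sketch of the decision only, not the attached C patch: the tranche representation, the device names and both helper functions are hypothetical, and only the value of DRM_FORMAT_MOD_LINEAR is taken from drm_fourcc.h.

```python
# From drm_fourcc.h.
DRM_FORMAT_MOD_LINEAR = 0


def modifiers_for_device(tranches, device):
    """Collect modifiers from feedback tranches that target `device`.

    `tranches` is a hypothetical list of (target_device, modifiers) pairs.
    Returns None when no tranche targets the device at all, which models
    the device mismatch discussed in this bug.
    """
    found = None
    for target, modifiers in tranches:
        if target == device:
            found = (found or []) + list(modifiers)
    return found


def allocation_modifiers(tranches, device):
    """Fall back to a linear layout when the device never matches."""
    modifiers = modifiers_for_device(tranches, device)
    if modifiers is None:
        return [DRM_FORMAT_MOD_LINEAR]
    return modifiers


# Hypothetical device names: the compositor only advertises the KMS
# device, while the X server renders on the separate GPU device.
KMS_DEVICE = "kms-device"
GPU_DEVICE = "gpu-device"
feedback = [(KMS_DEVICE, [DRM_FORMAT_MOD_LINEAR])]

print(allocation_modifiers(feedback, GPU_DEVICE))
```

The sketch deliberately shares the limitation of the fallback itself: it assumes a linear layout is always acceptable to the allocating driver.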
Not all drivers support linear buffers, e.g. the nvidia driver currently doesn't in general. So Xwayland can't really just assume it'll work if DRM_FORMAT_MOD_LINEAR isn't advertised.
But how is GPU interop supposed to work in this case? Different GPUs from different vendors are certainly not using compatible implicit modifiers.

The patch is supposed to cover the case that `xwl_dmabuf_get_modifiers_for_device()` returns FALSE because `drmDevicesEqual(dev_formats->drm_dev, device)` is never true. kwin sends only dmabuf feedback tranches with the KMS DRM device, which Xwayland doesn't use. I still think that kwin's choice is unexpected, but arguably Xwayland should determine the modifiers of its DRM device and intersect them with the dmabuf feedback ones.

Before I submit an Xwayland PR I will make the test for the device mismatch for `dmabuf_protocol_version >= 4` explicit. As a second step I'll try to make the explicit modifier path usable on this setup (kwin + Xwayland on Fedora RPi4).

It is also a little regrettable that it's possible to end up with different implicit modifiers on kmsro setups. I don't think it's avoidable though.
(In reply to Janne Grunau from comment #17)
> but how is GPU interop supposed to work in this case? Different GPUs from
> different vendors are certainly not using compatible implicit modifiers.

Indeed, interop can only work if both devices support at least one common modifier. With upstream drivers, the linear modifier should always work as a worst case fallback. Alas, nvidia.

> kwin sends only dmabuf feedback tranches with the KMS DRM device which Xwayland doesn't
> use. I still think that kwin's choice is unexpected but arguably Xwayland
> should determine the modifiers of its DRM device and intersect them with the
> dmabuf feedback ones.

You previously wrote Xwayland "can not support explicit modifiers" due to the KMS DRM device being used in the dma-buf feedback. Doesn't that mean the intersection would be empty?
It cannot support explicit modifiers because kwin does not report modifiers for the device. To intersect, Xwayland would have to use other means to query the supported modifiers, for example eglQueryDmaBufFormatsEXT / eglQueryDmaBufModifiersEXT.

Xwayland merge request: https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1666

The root cause is that kwin announces distinct DRM devices in wl_drm and dmabuf_feedback. The issue would resolve itself if either kwin or Xwayland stopped using wl_drm.
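The intersection idea can be sketched like this. It is a Python model with a hypothetical data layout: in a real implementation the local list would come from something like eglQueryDmaBufModifiersEXT, whereas here it is hard-coded, and the device name is made up. The modifier constants follow drm_fourcc.h.

```python
# Constants as defined in drm_fourcc.h.
DRM_FORMAT_MOD_LINEAR = 0
DRM_FORMAT_MOD_VENDOR_BROADCOM = 0x07
DRM_FORMAT_MOD_BROADCOM_UIF = (DRM_FORMAT_MOD_VENDOR_BROADCOM << 56) | 6


def intersect_modifiers(local_modifiers, feedback_tranches):
    """Keep the locally supported modifiers the compositor also advertises.

    The target device of each tranche is deliberately ignored, so a
    device mismatch no longer leaves the client without any modifier list.
    """
    advertised = set()
    for _target_device, modifiers in feedback_tranches:
        advertised.update(modifiers)
    # Preserve the local (driver preference) order.
    return [m for m in local_modifiers if m in advertised]


# Assumed example values: what the render device supports locally, and a
# single feedback tranche that targets a different (KMS) device.
local = [DRM_FORMAT_MOD_BROADCOM_UIF, DRM_FORMAT_MOD_LINEAR]
feedback = [("kms-device", [DRM_FORMAT_MOD_LINEAR])]

common = intersect_modifiers(local, feedback)
print([hex(m) for m in common])
```

Because the local list is queried independently of the feedback device, the intersection is only empty when the two sides genuinely share no modifier.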
Created attachment 2045082 [details] patch from https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1666
(In reply to Janne Grunau from comment #19)
> The root cause is that kwin announces distinct DRM device in wl_drm and
> dmabuf_feedback. The issue would resolve itself if either kwin or xwayland
> would stop using wl_drm.

FWIW, AFAICT Xwayland uses wl_drm only because the main device advertised by the dma-buf protocol doesn't match the device used by Xwayland for glamor.
(In reply to Janne Grunau from comment #20)
> Created attachment 2045082 [details]
> patch from https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1666

This addresses the issue with initial-setup.
We recently merged a fix for v3d upstream that removes the internal heuristic for implicit modifiers, so that LINEAR is always used. https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30946 The fix was marked for backporting to the mesa stable versions, so it should soon reach the Fedora Mesa package.

This addressed a similar issue with TigerVNC: v3d was importing the buffer as if it were tiled (UIF), but it had been allocated as linear because it was requested for SCANOUT. The behaviour is easy to validate: launch the application (Xwayland) and the compositor with V3D_DEBUG=surface, then compare the layout the driver uses for import (KWin) with the one used for allocation (Xwayland).

I think we were the only driver allowing the tiled format (UIF) when importing buffers with implicit modifiers. We did that to avoid consuming CMA memory in situations where the BOs are not expected to be used for SCANOUT by the display driver.

Currently, mesa render buffers allocated with INVALID modifiers are forced to have the SCANOUT flag to cover the worst-case scenario, even when it is not needed because they are not used full-screen for direct scanout. This means the v3d driver could run out of CMA memory and show lower performance, since the compositor would probably need to sample from the buffer, which implies an extra conversion from LINEAR to tiled.
FEDORA-2024-21402656dd (mesa-24.2.1-5.fc41) has been submitted as an update to Fedora 41. https://bodhi.fedoraproject.org/updates/FEDORA-2024-21402656dd
FEDORA-2024-21402656dd has been pushed to the Fedora 41 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-21402656dd` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-21402656dd See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
(In reply to Fedora Update System from comment #24)
> FEDORA-2024-21402656dd (mesa-24.2.1-5.fc41) has been submitted as an update
> to Fedora 41.
> https://bodhi.fedoraproject.org/updates/FEDORA-2024-21402656dd

This by itself doesn't solve the issue.
I assume we still need the Xwayland fix too.
FEDORA-2024-bb7298426c has been pushed to the Fedora 41 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-bb7298426c` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-bb7298426c See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1666 seems to have a couple of objections upstream which makes me hesitate about whether we should backport that. Janne, is there any progress there? Is your proposal that "an alternative fix for kwin could be to just discard wl_drm if its DRM device in not a target_device in any of the dmabuf feedback tranches" something we can look into?
There are two new kwin merge requests which fix the root cause of this issue without the suboptimal fallback to linear buffers in mesa and in my Xwayland change. The two kwin changes backport cleanly to kwin-6.1.4. Tested on RPi4: both the `wayland-info` output and tests of X11 applications under kwin/Xwayland confirm that the issue is fixed.

PR with backports of these changes for Fedora's kwin package created.
FEDORA-2024-b475969e23 (kwin-6.1.4-2.fc40) has been submitted as an update to Fedora 40. https://bodhi.fedoraproject.org/updates/FEDORA-2024-b475969e23
FEDORA-2024-a7aaaed6c9 (kwin-6.1.4-2.fc41) has been submitted as an update to Fedora 41. https://bodhi.fedoraproject.org/updates/FEDORA-2024-a7aaaed6c9
FEDORA-2024-a7aaaed6c9 has been pushed to the Fedora 41 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-a7aaaed6c9` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-a7aaaed6c9 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2024-b475969e23 has been pushed to the Fedora 40 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-b475969e23` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-b475969e23 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
(In reply to Janne Grunau from comment #30)
> There are two new kwin merge request which fix the root cause of this issue
> without the suboptimal fallback to linear buffers in the mesa and in my
> xwayland change. The two kwin changes backport cleanly to kwin-6.1.4. Tested
> on RPi4 and both the `wayland-info` output and tests of X11 applications
> under kwin/Xwayland confirm the the issue as fixed.
>
> PR with backports of these change for Fedora's kwin package created.

Thanks a lot, the issue is fixed with kwin-6.1.4-2.fc41!
FEDORA-2024-a7aaaed6c9 (kwin-6.1.4-2.fc41) has been pushed to the Fedora 41 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2024-b475969e23 (kwin-6.1.4-2.fc40) has been pushed to the Fedora 40 stable repository. If problem still persists, please make note of it in this bug report.