Bug 2270430 - Raspberry Pi 4: KDE initial setup is broken without nomodeset, KDE desktop won't load with nomodeset
Summary: Raspberry Pi 4: KDE initial setup is broken without nomodeset, KDE desktop wo...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kwin
Version: rawhide
Hardware: aarch64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Neal Gompa
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: https://discussion.fedoraproject.org...
Depends On:
Blocks: F41BetaBlocker
 
Reported: 2024-03-20 11:23 UTC by Lukas Brabec
Modified: 2024-09-13 01:52 UTC
CC List: 25 users

Fixed In Version: kwin-6.1.4-2.fc41
Clone Of:
Environment:
Last Closed: 2024-09-11 18:21:45 UTC
Type: Bug
Embargoed:


Attachments
photo of initial setup (1.08 MB, image/jpeg)
2024-03-20 11:23 UTC, Lukas Brabec
journal from the first boot (rhgb quiet) (186.95 KB, text/plain)
2024-03-20 11:52 UTC, Lukas Brabec
journal from the second boot (nomodeset) (398.03 KB, text/plain)
2024-03-20 11:53 UTC, Lukas Brabec
journal from the third boot (rhgb quiet) (354.65 KB, text/plain)
2024-03-20 11:54 UTC, Lukas Brabec
WIP Xwayland patch resolving the issue (1.60 KB, patch)
2024-08-30 00:09 UTC, Janne Grunau
patch from https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1666 (4.40 KB, patch)
2024-08-30 22:22 UTC, Janne Grunau


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Fedora Package Sources | kwin pull-request 16 | 0 | None | None | None | 2024-09-10 11:45:53 UTC
KDE GitLab | plasma kwin merge_requests 6369 | 0 | None | opened | backends/drm: don't do multi gpu copies if we'd copy from and to the same render device | 2024-09-10 11:45:53 UTC
KDE GitLab | plasma kwin merge_requests 6370 | 0 | None | opened | platformsupport/scenes/opengl: use the render node for dmabuf feedback | 2024-09-10 11:45:53 UTC
freedesktop.org Gitlab | xorg xserver merge_requests 1666 | 0 | None | opened | xwayland: Use linear buffers when dmabuf_feedback requires it | 2024-08-31 00:29:03 UTC

Description Lukas Brabec 2024-03-20 11:23:29 UTC
Created attachment 2022681 [details]
photo of initial setup

Tested Fedora 40 Beta RC 1.9 KDE on Raspberry Pi 4:
- graphical output of initial setup is broken, see attached photo
- when rebooted with nomodeset, initial setup works as expected, I can go through it and finish the setup, but graphical desktop won't load, all I see is black screen and mouse cursor
- when rebooted without nomodeset, desktop loads and works as expected


mesa-dri-drivers-24.0.0-2.fc40.aarch64
kernel-6.8.0-0.rc6.49.fc40.aarch64
bcm283x-firmware-20240229-2.dc94391.fc40.aarch64
initial-setup-0.3.100-3.fc40.aarch64
initial-setup-gui-wayland-plasma-40.0-1.fc40.noarch
initial-setup-gui-0.3.100-3.fc40.aarch64
kwin-wayland-6.0.2-1.fc40.aarch64
kwin-6.0.2-1.fc40.aarch64

Comment 1 Lukas Brabec 2024-03-20 11:52:44 UTC
Created attachment 2022695 [details]
journal from the first boot (rhgb quiet)

Comment 2 Lukas Brabec 2024-03-20 11:53:19 UTC
Created attachment 2022696 [details]
journal from the second boot (nomodeset)

Comment 3 Lukas Brabec 2024-03-20 11:54:15 UTC
Created attachment 2022697 [details]
journal from the third boot (rhgb quiet)

Comment 4 Peter Robinson 2024-03-20 12:01:42 UTC
I think it's safe to say this is the same regression as seen with GNOME, so let's deal with it on one bug to avoid confusion and keep things in a single location.

*** This bug has been marked as a duplicate of bug 2269412 ***

Comment 5 Lukas Brabec 2024-03-28 12:14:51 UTC
This scratch build of mesa https://koji.fedoraproject.org/koji/taskinfo?taskID=115518788 fixes bug 2269412 but doesn't fix this one, so it is not the same regression. Reopening.

Comment 6 Lukas Brabec 2024-06-20 08:27:19 UTC
The bug is still present in rawhide, tested Fedora-KDE-Rawhide-20240618.n.0.aarch64.

Comment 7 František Zatloukal 2024-08-21 15:20:10 UTC
Present on Fedora-KDE-41-20240821.n.0, now a blocker candidate.

Comment 8 Neal Gompa 2024-08-23 15:03:28 UTC
So if I get this straight, we're talking about initial-setup looking broken *with* modesetting (ie with the normal rpi gfx driver) but fine *without* (ie using simpledrm), but then Plasma is a black screen *without* modesetting but works fine *with* modesetting?

This is kind of baffling since we use KWin for both phases.

Comment 9 Janne Grunau 2024-08-24 13:52:05 UTC
Reproduced with Fedora-KDE-41-20240824.n.0.aarch64.raw.xz.

Possibly related kernel messages:
| [   69.410224] vc4-drm gpu: swiotlb buffer is full (sz: 540672 bytes), total 32768 (slots), used 1381 (slots)
| [   71.989444] vc4-drm gpu: swiotlb buffer is full (sz: 1560576 bytes), total 32768 (slots), used 7 (slots)
| [   72.111001] vc4-drm gpu: swiotlb buffer is full (sz: 12582912 bytes), total 32768 (slots), used 1 (slots)
| [   72.471197] vc4-drm gpu: swiotlb buffer is full (sz: 520192 bytes), total 32768 (slots), used 1 (slots)

After reading through #2269412 I see that the CMA pool is just 64MB
| Reserved memory: created CMA memory pool at 0x0000000035e00000, size 64 MiB
instead of 256MB as intended by /boot/efi/config.txt which specifies `dtoverlay=cma,cma-256`. Overriding that via appending "cma=256M" to the kernel command line makes no difference though.

The usage of swiotlb is a little strange though. https://github.com/raspberrypi/linux/issues/3416#issuecomment-576375208 suggests v3d should be using direct DMA and not bounce buffers. This could easily be missing support in the upstream kernel.
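
For triage, the kernel messages quoted above can be summarised with a short script. This is only a minimal sketch: the regular expressions are derived from the sample lines in this comment, and the function name `summarize` is a hypothetical helper, not part of any tool mentioned here.

```python
import re

# Pattern derived only from the vc4-drm swiotlb warnings quoted in this comment;
# other kernel versions may word the message differently.
SWIOTLB_RE = re.compile(
    r"swiotlb buffer is full \(sz: (?P<size>\d+) bytes\), "
    r"total (?P<total>\d+) \(slots\), used (?P<used>\d+) \(slots\)"
)
# Pattern for the "created CMA memory pool ... size N MiB" line.
CMA_RE = re.compile(
    r"created CMA memory pool at 0x[0-9a-fA-F]+, size (?P<mib>\d+) MiB"
)


def summarize(log_text):
    """Return (CMA pool size in MiB or None, list of failed swiotlb sizes in bytes)."""
    cma_mib = None
    failed_sizes = []
    for line in log_text.splitlines():
        match = CMA_RE.search(line)
        if match:
            cma_mib = int(match.group("mib"))
            continue
        match = SWIOTLB_RE.search(line)
        if match:
            failed_sizes.append(int(match.group("size")))
    return cma_mib, failed_sizes


SAMPLE = """\
Reserved memory: created CMA memory pool at 0x0000000035e00000, size 64 MiB
[   69.410224] vc4-drm gpu: swiotlb buffer is full (sz: 540672 bytes), total 32768 (slots), used 1381 (slots)
[   72.111001] vc4-drm gpu: swiotlb buffer is full (sz: 12582912 bytes), total 32768 (slots), used 1 (slots)
"""

cma_mib, failed = summarize(SAMPLE)
largest_mib = max(failed) / 2**20
print(f"CMA pool: {cma_mib} MiB, largest failed bounce request: {largest_mib:.1f} MiB")
```

On the sample above, the largest failed request is 12 MiB while only a single slot is in use, which is consistent with one large mapping being refused rather than the bounce buffer slowly filling up.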

Comment 10 František Zatloukal 2024-08-26 18:05:54 UTC
Discussed during the 2024-08-26 blocker review meeting: [1]

The decision to classify this bug as an AcceptedBlocker (Beta) was made:

"A system installed with a release-blocking desktop must boot to a log in screen where it is possible to log in to a working desktop using a user account created during installation or a 'first boot' utility", for the now release-blocking configuration of KDE on the Raspberry Pi

[1] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2024-08-26/f41-blocker-review.2024-08-26-16.00.log.html

Comment 11 Janne Grunau 2024-08-28 21:20:54 UTC
This seems to be a modifier mismatch between kwin_wayland and Xwayland. Starting Wayland applications like konsole on the Wayland display with

> QT_QPA_PLATFORM=wayland-egl DISPLAY=:0 XDG_RUNTIME_DIR=/tmp/tmp.OJ8Rlhgwig-initial-setup-runtime-dir konsole

works without corruption, as does starting glxgears. However, classic X11 applications like xeyes show similar corruption.

The corruption looks like it could be caused by interpreting a buffer with a linear image as if it had DRM_FORMAT_MOD_BROADCOM_UIF, or vice versa.
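
To illustrate why such a mismatch scrambles the picture, here is a toy sketch. It assumes a made-up 2x2 tile layout, which is NOT the real Broadcom UIF layout; the function names `tile_store` and `show` are hypothetical. The image is written tile by tile and then read back row by row as if it were linear.

```python
def tile_store(pixels, width, height, tile=2):
    """Store a row-major image tile by tile (toy 2x2 tiling, NOT Broadcom UIF)."""
    out = []
    for tile_y in range(0, height, tile):
        for tile_x in range(0, width, tile):
            for y in range(tile_y, tile_y + tile):
                for x in range(tile_x, tile_x + tile):
                    out.append(pixels[y * width + x])
    return out


def show(buf, width):
    """Interpret a flat buffer as linear rows of `width` pixels."""
    return ["".join(buf[i:i + width]) for i in range(0, len(buf), width)]


WIDTH, HEIGHT = 4, 4
# Each row of the source image is a single repeated letter.
image = list("AAAA" "BBBB" "CCCC" "DDDD")

tiled = tile_store(image, WIDTH, HEIGHT)
misread = show(tiled, WIDTH)  # consumer wrongly assumes a linear layout

for intended, seen in zip(show(image, WIDTH), misread):
    print(intended, "->", seen)
```

The pixel data is intact, but rows end up interleaved, which is the kind of regular, patterned corruption one would expect when producer and consumer disagree about the layout.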

Changing GDK_BACKEND to "wayland" for the initial setup unfortunately does not work as noted in /usr/libexec/initial-setup/initial-setup-graphical. That would have made the issue less severe.

Comment 12 Neal Gompa 2024-08-29 00:12:33 UTC
(In reply to Janne Grunau from comment #11)
>
> Changing GDK_BACKEND to "wayland" for the initial setup unfortunately does
> not work as noted in /usr/libexec/initial-setup/initial-setup-graphical.
> That would have made the issue less severe.

This is something that might be possible once Wayland-native Anaconda is done for F42.

Comment 13 Janne Grunau 2024-08-29 18:02:13 UTC
As a workaround for initial-setup, kwin_wayland's compositing backend could be overridden to QPainter with "KWIN_COMPOSE=Q kwin_wayland ..." in /usr/libexec/initial-setup/run-gui-backend. See https://invent.kde.org/plasma/kwin/-/wikis/Environment-Variables#kwin_compose

This still leaves kwin_wayland + Xwayland broken but at least unblocks initial-setup.

I still do not understand where / how the mixup between DRM_FORMAT_MOD_LINEAR and probably DRM_FORMAT_MOD_BROADCOM_UIF is coming from.

Comment 14 Janne Grunau 2024-08-29 22:57:15 UTC
A slightly better workaround would be to export 'XWAYLAND_NO_GLAMOR=1' before starting kwin_wayland / Xwayland

I have identified the root cause of the issue though. kwin_wayland reports the KMS DRM device as the main and only device in its dmabuf feedback. This is unexpected since the GPU device would be a much better choice for interoperability between compositor and applications. Due to this choice of device, Xwayland cannot support explicit modifiers.

I think Xwayland handles this situation incorrectly because it decides to allocate buffers with implicit modifiers instead of linear layout as the linux_dmabuf_feedback main_device event documentation specifies.
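
The decision described above can be modelled roughly as follows. This is a simplified sketch and not Xwayland's actual code: device names are plain string labels, modifiers are plain names, and `Tranche`, `modifiers_for_device` and `choose_layout` are hypothetical helpers. It only encodes the rule that a client without usable explicit modifiers which allocates on a device other than the main device should force a linear layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tranche:
    target_device: str    # hypothetical label such as "kms" or "gpu"
    modifiers: frozenset  # modifier names advertised for one format


def modifiers_for_device(tranches, device):
    """Collect modifiers from tranches whose target device matches `device`."""
    found = set()
    for tranche in tranches:
        if tranche.target_device == device:
            found |= tranche.modifiers
    return found


def choose_layout(main_device, tranches, client_device):
    """Return (mode, modifiers) the client should allocate with."""
    mods = modifiers_for_device(tranches, client_device)
    if mods:
        return "explicit", sorted(mods)
    if client_device != main_device:
        # No explicit modifiers and a different allocation device:
        # fall back to a linear layout instead of implicit modifiers.
        return "linear", ["LINEAR"]
    return "implicit", []


# Illustrative feedback: only the KMS device is advertised, the client renders on the GPU.
feedback = [Tranche("kms", frozenset({"LINEAR"}))]
print(choose_layout("kms", feedback, "gpu"))
```

In this model the mismatching-device case yields a linear allocation, whereas the behaviour reported here corresponds to taking the "implicit" branch even though the devices differ.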

I'll be working on an Xwayland patch/PR to use linear surfaces. I started a discussion with kwin developers and will file a bug in their bugzilla tomorrow.

I believe this issue affects all devices which use mesa kmsro drivers. Despite this, it is not a mesa issue but a combination of issues in kwin and Xwayland.

Comment 15 Janne Grunau 2024-08-30 00:09:26 UTC
Created attachment 2045039 [details]
WIP Xwayland patch resolving the issue

Work-in-progress Xwayland patch to fall back to linear surfaces if no modifiers are available due to mismatching DRM devices in xwl_dmabuf_get_modifiers_for_device(). It ignores the "corner" case of not using dmabuf protocol version 4 or later.

A better fix for explicit modifier support would be to source the list of supported modifiers in Xwayland independently and intersect that list with the dmabuf feedback list ignoring DRM devices.

Comment 16 Michel Dänzer 2024-08-30 09:33:28 UTC
Not all drivers support linear buffers, e.g. the nvidia driver currently doesn't in general. So Xwayland can't really just assume it'll work if DRM_FORMAT_MOD_LINEAR isn't advertised.

Comment 17 Janne Grunau 2024-08-30 11:49:42 UTC
But how is GPU interop supposed to work in this case? Different GPUs from different vendors are certainly not using compatible implicit modifiers.

The patch is supposed to cover the case that `xwl_dmabuf_get_modifiers_for_device()` returns FALSE because `drmDevicesEqual(dev_formats->drm_dev, device)` is never true. kwin sends only dmabuf feedback tranches with the KMS DRM device which Xwayland doesn't use. I still think that kwin's choice is unexpected but arguably Xwayland should determine the modifiers of its DRM device and intersect them with the dmabuf feedback ones.

Before I submit an Xwayland PR I will make testing for the device mismatch for `dmabuf_protocol_version >= 4` explicit. As a second step I'll try to make the explicit modifier path usable on this setup (kwin + Xwayland on Fedora RPi4).

It is also a little regrettable that it's possible to end up with different implicit modifiers on kmsro setups. I don't think it's avoidable though.

Comment 18 Michel Dänzer 2024-08-30 15:15:04 UTC
(In reply to Janne Grunau from comment #17)
> but how is GPU interop supposed to work in this case? Different GPUs from
> different vendors are certainly not using compatible implicit modifiers.

Indeed, interop can only work if both devices support at least one common modifier.

With upstream drivers, the linear modifier should always work as a worst case fallback. Alas, nvidia.

> kwin sends only dmabuf feedback tranches with the KMS DRM device which Xwayland doesn't
> use. I still think that kwin's choice is unexpected but arguably Xwayland
> should determine the modifiers of its DRM device and intersect them with the
> dmabuf feedback ones.

You previously wrote Xwayland "can not support explicit modifiers" due to the KMS DRM device being used in the dma-buf feedback. Doesn't that mean the intersection would be empty?

Comment 19 Janne Grunau 2024-08-30 22:21:25 UTC
It cannot support explicit modifiers because kwin does not report modifiers for the device. To intersect, Xwayland would have to use other means to query the supported modifiers, for example eglQueryDmaBufFormatsEXT / eglQueryDmaBufModifiersEXT.
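
The intersection idea can be sketched as below. This is a minimal model under stated assumptions: modifiers are represented by plain names, the client list stands in for whatever the EGL queries would return (no EGL call is made here), per-format handling is omitted, and `common_modifiers` is a hypothetical helper.

```python
def common_modifiers(client_supported, feedback_tranches):
    """Intersect the client's modifiers with everything the compositor
    advertised, ignoring which DRM device each tranche targets.
    The client's preference order is preserved."""
    advertised = set()
    for _device, modifiers in feedback_tranches:
        advertised.update(modifiers)
    return [m for m in client_supported if m in advertised]


# Illustrative values only: the client prefers a tiled layout, the compositor
# advertises LINEAR on a device the client does not render on.
client = ["BROADCOM_UIF", "LINEAR"]
feedback = [("kms", ["LINEAR"])]

usable = common_modifiers(client, feedback)
print(usable or "no common modifier, interop not possible")
```

An empty result corresponds to the case raised earlier in the thread, where a driver does not support any modifier the compositor advertises.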

Xwayland merge request: https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1666

The root cause is that kwin announces distinct DRM devices in wl_drm and dmabuf_feedback. The issue would resolve itself if either kwin or Xwayland stopped using wl_drm.

Comment 21 Michel Dänzer 2024-09-02 12:58:34 UTC
(In reply to Janne Grunau from comment #19)
> The root cause is that kwin announces distinct DRM device in wl_drm and
> dmabuf_feedback. The issue would resolve itself if either kwin or xwayland
> would stop using wl_drm.

FWIW, AFAICT Xwayland uses wl_drm only because the main device advertised by the dma-buf protocol doesn't match the device used by Xwayland for glamor.

Comment 22 František Zatloukal 2024-09-03 11:50:05 UTC
(In reply to Janne Grunau from comment #20)
> Created attachment 2045082 [details]
> patch from https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1666

This addresses the issue with initial-setup.

Comment 23 Chema Casanova 2024-09-04 17:00:24 UTC
We recently merged a fix for v3d upstream that removes the internal heuristic for implicit modifiers, so they now always use LINEAR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30946
The fix was marked for backporting to mesa stable versions, so it should soon reach the Fedora Mesa package.

This addressed a similar issue on TigerVNC: v3d was importing the buffer as if it were tiled (UIF), but it had been allocated linear because it was requested as SCANOUT.

This behaviour is easy to check: launch the application (Xwayland) and the compositor with V3D_DEBUG=surface, then compare the layout the driver uses for import (KWin) with the one used for allocation (Xwayland).

I think we were the only driver allowing the tiled format (UIF) when importing buffers with implicit modifiers. We did that to avoid consuming CMA memory in situations where the BOs are not expected to be used for SCANOUT by the display driver. Currently, mesa render buffers allocated with INVALID modifiers are forced to have the SCANOUT flag for the worst-case scenario, even when it is not needed because they are not used full-screen for direct scanout. This could mean that the v3d driver runs out of CMA memory and shows lower performance, as the compositor would probably need to sample the buffer, which implies an extra conversion from LINEAR to TILE.

Comment 24 Fedora Update System 2024-09-04 17:49:41 UTC
FEDORA-2024-21402656dd (mesa-24.2.1-5.fc41) has been submitted as an update to Fedora 41.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-21402656dd

Comment 25 Fedora Update System 2024-09-05 03:02:18 UTC
FEDORA-2024-21402656dd has been pushed to the Fedora 41 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-21402656dd`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-21402656dd

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 26 František Zatloukal 2024-09-05 08:59:11 UTC
(In reply to Fedora Update System from comment #24)
> FEDORA-2024-21402656dd (mesa-24.2.1-5.fc41) has been submitted as an update
> to Fedora 41.
> https://bodhi.fedoraproject.org/updates/FEDORA-2024-21402656dd

This by itself doesn't solve the issue.

Comment 27 Neal Gompa 2024-09-05 20:39:27 UTC
I assume we still need the Xwayland fix too.

Comment 28 Fedora Update System 2024-09-07 01:36:05 UTC
FEDORA-2024-bb7298426c has been pushed to the Fedora 41 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-bb7298426c`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-bb7298426c

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 29 Adam Williamson 2024-09-09 19:42:11 UTC
https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/1666 seems to have a couple of objections upstream, which makes me hesitate about whether we should backport that. Janne, is there any progress there? Is your proposal that "an alternative fix for kwin could be to just discard wl_drm if its DRM device is not a target_device in any of the dmabuf feedback tranches" something we can look into?

Comment 30 Janne Grunau 2024-09-10 11:45:54 UTC
There are two new kwin merge requests which fix the root cause of this issue without the suboptimal fallback to linear buffers used in mesa and in my Xwayland change. The two kwin changes backport cleanly to kwin-6.1.4. Tested on RPi4: both the `wayland-info` output and tests of X11 applications under kwin/Xwayland confirm the issue as fixed.

PR with backports of these changes for Fedora's kwin package created.

Comment 31 Fedora Update System 2024-09-10 14:12:00 UTC
FEDORA-2024-b475969e23 (kwin-6.1.4-2.fc40) has been submitted as an update to Fedora 40.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-b475969e23

Comment 32 Fedora Update System 2024-09-10 14:12:01 UTC
FEDORA-2024-a7aaaed6c9 (kwin-6.1.4-2.fc41) has been submitted as an update to Fedora 41.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-a7aaaed6c9

Comment 33 Fedora Update System 2024-09-11 02:28:43 UTC
FEDORA-2024-a7aaaed6c9 has been pushed to the Fedora 41 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-a7aaaed6c9`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-a7aaaed6c9

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 34 Fedora Update System 2024-09-11 02:57:30 UTC
FEDORA-2024-b475969e23 has been pushed to the Fedora 40 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-b475969e23`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-b475969e23

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 35 František Zatloukal 2024-09-11 08:40:34 UTC
(In reply to Janne Grunau from comment #30)
> There are two new kwin merge requests which fix the root cause of this issue
> without the suboptimal fallback to linear buffers used in mesa and in my
> Xwayland change. The two kwin changes backport cleanly to kwin-6.1.4. Tested
> on RPi4: both the `wayland-info` output and tests of X11 applications
> under kwin/Xwayland confirm the issue as fixed.
> 
> PR with backports of these changes for Fedora's kwin package created.

Thanks a lot, the issue is fixed with kwin-6.1.4-2.fc41!

Comment 36 Fedora Update System 2024-09-11 18:21:45 UTC
FEDORA-2024-a7aaaed6c9 (kwin-6.1.4-2.fc41) has been pushed to the Fedora 41 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 37 Fedora Update System 2024-09-13 01:52:56 UTC
FEDORA-2024-b475969e23 (kwin-6.1.4-2.fc40) has been pushed to the Fedora 40 stable repository.
If problem still persists, please make note of it in this bug report.

