Bug 2001010

Summary: Failed to post KMS update: drmModeAtomicCommit: Invalid argument
Product: Red Hat Enterprise Linux 9 Reporter: Tomas Popela <tpopela>
Component: mutter Assignee: Jonas Ådahl <jadahl>
Status: CLOSED WONTFIX QA Contact: Desktop QE <desktop-qa-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 9.0 CC: extras-qa, fmuellner, gnome-sig, Hi-Angel, jadahl, kherbst, otaylor, philip.wyett, pwhalen, robatino, tpelka, walters
Target Milestone: rc   
Target Release: ---   
Hardware: aarch64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1936991 Environment:
Last Closed: 2023-03-03 07:27:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1936991    
Bug Blocks:    

Description Tomas Popela 2021-09-03 14:28:23 UTC
+++ This bug was initially created as a clone of Bug #1936991 +++

Description of problem:

Attempting to boot F34 Workstation on a Jetson Nano, the system boots to the F34 wallpaper briefly before displaying the "Oops" screen with initial-setup on top (BZ#1924908) but no visible pointer. Once the mouse is moved:

Mar 09 10:53:11 nano gnome-shell[1158]: Failed to post KMS update: drmModeAtomicCommit: Invalid argument
Mar 09 10:53:11 nano gnome-shell[1158]: Page flip discarded: drmModeAtomicCommit: Invalid argument

Mar 09 10:55:08 nano gnome-session-f[1616]: Negative content width -7 (allocation 1, extents 4x4) while allocating gadget (node headerbar, owner GtkHeaderBar)
Mar 09 10:55:08 nano gnome-session-f[1616]: gtk_widget_size_allocate(): attempt to allocate widget with width -34 and height 18
Mar 09 10:55:08 nano gnome-session-f[1616]: Negative content width -23 (allocation 1, extents 12x12) while allocating gadget (node label, owner GtkLabel)
Mar 09 10:55:08 nano gnome-session-f[1616]: gtk_widget_size_allocate(): attempt to allocate widget with width -39 and height 0
Mar 09 10:55:08 nano /usr/libexec/gdm-wayland-session[1616]: *** BUG ***
Mar 09 10:55:08 nano /usr/libexec/gdm-wayland-session[1616]: In pixman_region32_init_rect: Invalid rectangle passed
Mar 09 10:55:08 nano /usr/libexec/gdm-wayland-session[1616]: Set a breakpoint on '_pixman_log_error' to debug

Followed by a black screen with flashing cursor. 
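
For anyone triaging similar logs, here is a minimal sketch of counting the KMS failures in saved journal text. It assumes the journal has been exported as plain text, one entry per line, in the same format as the excerpt above. The function name `count_kms_failures` is hypothetical; the matched strings are copied from the gnome-shell messages quoted in this report.

```python
import re

# Message prefixes copied from the gnome-shell log lines quoted above.
KMS_FAILURE = re.compile(
    r"gnome-shell\[\d+\]: (Failed to post KMS update|Page flip discarded): "
)


def count_kms_failures(lines):
    """Return a dict mapping each failure kind to how often it occurs.

    `lines` is any iterable of journal lines (hypothetical helper, not
    part of mutter or systemd).
    """
    counts = {}
    for line in lines:
        match = KMS_FAILURE.search(line)
        if match:
            kind = match.group(1)
            counts[kind] = counts.get(kind, 0) + 1
    return counts
```

A non-empty result after moving the mouse would suggest that the atomic commit path is the one being rejected, which is what the log above shows.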


Version-Release number of selected component (if applicable):

gnome-shell-40.0~beta-1.fc34.aarch64
kernel-5.11.3-300.fc34.aarch64
mesa-21.0.0~rc5-3.fc34

How reproducible:
Every time.

--- Additional comment from Paul Whalen on 2021-03-09 16:04:15 UTC ---



--- Additional comment from Adam Williamson on 2021-03-09 17:21:04 UTC ---

Well, if we took the crash as a blocker, it seems like logically speaking this ought to be a blocker too. Sounds like GNOME is still not at all usable on a target platform, right?

--- Additional comment from Jonas Ådahl on 2021-03-09 17:25:44 UTC ---

Could you try two things:

Run with 

MUTTER_DEBUG=kms

in the environment, and attach the whole log from gnome-shell.

Then add

MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0

to the environment as well, and try again.
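
A minimal sketch of setting these variables follows. It assumes they are set through a `KEY=value` environment file such as /etc/environment. That location is an assumption, since the comment only says "in the environment". The helper name `set_env_var` is hypothetical.

```python
def set_env_var(env_text, name, value):
    """Return env_text with `name=value` set exactly once.

    An existing assignment of `name` is replaced in place; later
    duplicates are dropped. All other lines are kept unchanged.
    """
    new_line = f"{name}={value}"
    out = []
    replaced = False
    for line in env_text.splitlines():
        if line.split("=", 1)[0].strip() == name:
            if not replaced:
                out.append(new_line)
                replaced = True
            # Later duplicates of the same variable are skipped.
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

Calling it once with `MUTTER_DEBUG` and `kms`, then again with `MUTTER_DEBUG_ENABLE_ATOMIC_KMS` and `0`, yields the text for the second test run. The result still has to be written back to the file and the session restarted.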

--- Additional comment from Paul Whalen on 2021-03-09 18:30:16 UTC ---



--- Additional comment from Paul Whalen on 2021-03-09 18:30:58 UTC ---



--- Additional comment from Paul Whalen on 2021-03-09 18:32:01 UTC ---

After adding MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0, the pointer is visible on screen and the system is usable.

--- Additional comment from Jonas Ådahl on 2021-03-09 18:36:47 UTC ---

Does adding

DRIVER=="tegra", SUBSYSTEM=="platform", TAG+="mutter-device-requires-kms-modifiers"

to

/usr/lib/udev/rules.d/61-mutter.rules

then removing

MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0

from the env, then rebooting, help?

> After adding MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0, the pointer is visible on screen and the system is usable.

At least a workaround is available then; the issue occurs when using atomic mode setting.
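
A minimal sketch of adding the udev rule from this comment without duplicating it on repeated runs. The rule string is copied from the comment; the helper name `ensure_rule` is hypothetical, and reading and writing /usr/lib/udev/rules.d/61-mutter.rules (as root) is left to the caller.

```python
# Rule text copied verbatim from the comment above.
RULE = (
    'DRIVER=="tegra", SUBSYSTEM=="platform", '
    'TAG+="mutter-device-requires-kms-modifiers"'
)


def ensure_rule(rules_text, rule=RULE):
    """Return rules_text with `rule` appended unless it is already present."""
    lines = rules_text.splitlines()
    if rule not in (line.strip() for line in lines):
        lines.append(rule)
    return "\n".join(lines) + "\n"
```

Because the function is idempotent, applying it twice leaves the file content unchanged after the first application.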

--- Additional comment from Paul Whalen on 2021-03-09 18:57:08 UTC ---

(In reply to Jonas Ådahl from comment #7)
> Does adding
> 
> DRIVER=="tegra", SUBSYSTEM=="platform",
> TAG+="mutter-device-requires-kms-modifiers"
> 
> to
> 
> /usr/lib/udev/rules.d/61-mutter.rules
> 
> then removing
> 
> MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0
> 
> from the env, then rebooting, help?

The pointer was no longer visible with those changes. 

> 
> > After adding MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0, the pointer is visible on screen and the system is usable.
> 
> At least a workaround is available then; the issue occurs when
> using atomic mode setting.

There is quite a bit of graphical flashing, but it's definitely a usable workaround.

--- Additional comment from Jonas Ådahl on 2021-03-09 19:24:59 UTC ---

> There is quite a bit of graphical flashing, but it's definitely a usable workaround.

Are you saying that with MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0 there is still graphical flashing? What kind of flashing is this?

--- Additional comment from Paul Whalen on 2021-03-09 19:58:20 UTC ---

(In reply to Jonas Ådahl from comment #9)
> > There is quite a bit of graphical flashing, but it's definitely a usable workaround.
> 
> Are you saying that with MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0 there is still
> graphical flashing? What kind of flashing is this?

Yes, even with MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0, windows still flash semitransparent when moved or opened.

--- Additional comment from Karol Herbst on 2021-03-09 20:09:46 UTC ---

(In reply to Jonas Ådahl from comment #9)
> > There is quite a bit of graphical flashing, but it's definitely a usable workaround.
> 
> Are you saying that with MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0 there is still
> graphical flashing? What kind of flashing is this?

That's probably a long-outstanding bug on either tegradrm's or nouveau's side. I have no good idea how to fix it, and tagr proposed a solution that changes nouveau's UAPI. I'm still working on it, but I'm very sure that no solution will be ready for Fedora 34.

--- Additional comment from Nicolas Chauvet (kwizart) on 2021-03-09 22:24:53 UTC ---

(In reply to Jonas Ådahl from comment #9)
> There is quite a bit of graphical flashing, but it's definitely a usable workaround.
I wonder if tearing could be fixed by this series on tegra:
http://patchwork.ozlabs.org/project/linux-tegra/patch/20210302124445.29444-2-digetx@gmail.com/

This relies upon interconnect changes, so that's too much to backport for 5.11 and f34 GA, but probably doable for 5.12 if the series is good for 5.13-rc1.
Using the grate-driver kernel on jetson-tk1 (armhfp), I'm not experiencing any tearing at all (so maybe other WIP/pending patches are also needed).

--- Additional comment from Adam Williamson on 2021-03-09 22:40:33 UTC ---

So can we at least come up with a udev rules equivalent of the MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0 workaround today? This is the last thing blocking a Beta compose at present.

--- Additional comment from Adam Williamson on 2021-03-09 23:10:56 UTC ---

+3 in https://pagure.io/fedora-qa/blocker-review/issue/294 , marking accepted.

--- Additional comment from Fedora Update System on 2021-03-10 02:29:36 UTC ---

FEDORA-2021-7ff726a721 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-7ff726a721

--- Additional comment from Nicolas Chauvet (kwizart) on 2021-03-10 08:37:12 UTC ---

I confirm that the update gets rid of the "drmModeAtomicCommit: Invalid argument" message, but otherwise it doesn't fix any issue for me (using jetson-tk1, so a different device and arch).

I've filed two separate issues that might or might not be related:

1/ Using fedora kernel, there is a nouveau page fault on armhfp
https://bugzilla.redhat.com/show_bug.cgi?id=1937129

2/ Using a patched kernel (linux-next + WIP tegra patches from grate):
It cannot use the GPU acceleration and fall back to nouveau.
https://bugzilla.redhat.com/show_bug.cgi?id=1937236

I wonder if any of these can be reproduced on nano or other jetson aarch64 boards...

--- Additional comment from Karol Herbst on 2021-03-10 11:13:22 UTC ---

(In reply to Nicolas Chauvet (kwizart) from comment #12)
> (In reply to Jonas Ådahl from comment #9)
> > There is quite a bit of graphical flashing, but it's definitely a usable workaround.
> I wonder if tearing could be fixed by this series on tegra:
> http://patchwork.ozlabs.org/project/linux-tegra/patch/20210302124445.29444-2-digetx@gmail.com/
> 
> This relies upon interconnect changes, so that's too much to backport for
> 5.11 and f34 GA, but probably doable for 5.12 if the series is good for
> 5.13-rc1.
> Using the grate-driver kernel on jetson-tk1 (armhfp), I'm not experiencing
> any tearing at all (so maybe other WIP/pending patches are also needed).

Ohh, thanks for pointing that out. This might indeed explain why it only happens on low-bandwidth boards... I'll try that out on my Jetson Nano as well. Thanks!

--- Additional comment from Adam Williamson on 2021-03-10 17:17:26 UTC ---

Nicolas: the update basically does the same thing as MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0 for all systems using the "tegra" driver. The idea was to get Jetson boards to the state described by Paul when using that parameter (usable with visible cursor). If it does that and doesn't make things any *worse than they already are* on other Tegra platforms, it's doing its job.

jetson-tk1 is 32-bit, right? If so, we don't block on anything desktop-y on it for F34, AIUI.

--- Additional comment from Fedora Update System on 2021-03-10 18:51:50 UTC ---

FEDORA-2021-7ff726a721 has been pushed to the Fedora 34 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-7ff726a721`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-7ff726a721

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

--- Additional comment from Fedora Update System on 2021-03-12 01:36:18 UTC ---

FEDORA-2021-7ff726a721 has been pushed to the Fedora 34 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 1 Ben Cotton 2021-09-03 14:33:25 UTC
Removing F34 Beta blocker that was copied from original BZ.

Comment 3 RHEL Program Management 2023-03-03 07:27:53 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.