Bug 1662057

Summary: modeset(0): failed to set mode: No such file or directory | modeset(0): failed to set mode: Invalid argument
Product: [Fedora] Fedora
Reporter: Jan Kratochvil <jan.kratochvil>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 29
CC: airlied, bnater, brian, bskeggs, caillon+fedoraproject, gq, hdegoede, ichavero, itamar, jan.kratochvil, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, lyude, mcdanlj, mchehab, mihai, mjg59, ofourdan, rhughes, richard, rstrode, sandmann, skomra, steved, xgl-maint
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-09-17 20:11:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments (description, flags):
/var/log/Xorg.0.log (flags: none)
Xorg.0.log.xz (flags: none)
/var/log/Xorg.0.log-failed12 (flags: none)
xorg-x11-server-1.20.4-1.fc29 patch (flags: none)

Description Jan Kratochvil 2018-12-25 17:01:18 UTC
Created attachment 1516679 [details]
/var/log/Xorg.0.log

Description of problem:
After suspending and resuming a Lenovo X1 Carbon 6th gen, particularly when it has been reconnected to its Thunderbolt 3 docking station (both provided by Red Hat), the X server restarts.

Version-Release number of selected component (if applicable):
xorg-x11-server 1.20.3-2.fc29
XFCE

How reproducible:
See above.

Steps to Reproduce:
See above.

Actual results:
[125371.707] (EE) modeset(0): failed to set mode: No such file or directory
[125371.707] (EE)
Fatal server error:
[125371.707] (EE) EnterVT failed for screen 0
...
[125372.555] (EE) Server terminated with error (1). Closing log file.

Expected results:
No server restart.

Additional info:
Couldn't the X server keep more copies of the log file than just /var/log/Xorg.0.log.old? Commonly the log of a crash gets overwritten by a subsequent restart.
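Until the server keeps more history itself, a user-side workaround could look like the following minimal Python sketch. It only copies an existing log to a timestamped name so a later restart cannot overwrite it. The helper name `preserve_log` and the destination directory `/var/tmp/xorg-logs` are assumptions made for illustration; only /var/log/Xorg.0.log.old comes from this report.

```python
import shutil
import time
from pathlib import Path


def preserve_log(log_path, keep_dir):
    """Copy log_path into keep_dir under a timestamped name.

    Returns the path of the copy, or None if log_path does not exist.
    """
    src = Path(log_path)
    if not src.is_file():
        return None
    dest_dir = Path(keep_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Use the log's modification time so repeated runs on the same,
    # unchanged file map to the same destination name.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(src.stat().st_mtime))
    dest = dest_dir / f"{src.name}.{stamp}"
    shutil.copy2(src, dest)
    return dest


if __name__ == "__main__":
    # Hypothetical destination directory; adjust as needed.
    print(preserve_log("/var/log/Xorg.0.log.old", "/var/tmp/xorg-logs"))
```

Such a script would have to run after each X server restart (for example from a session startup hook) to be useful; that wiring is not shown here.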

Comment 1 Jan Kratochvil 2018-12-26 17:02:46 UTC
After reconnecting the laptop to the docking station and resuming it, the external display sometimes remains off (I keep the internal display lid closed). So I switch to a text console with ctrl-alt-f2, which does wake up the external display. Switching back with alt-f1 then sometimes resumes the X server, but sometimes crashes it with the error above.

Comment 2 Jan Kratochvil 2019-01-09 17:46:14 UTC
Created attachment 1519540 [details]
Xorg.0.log.xz

before:
kernel-4.19.7-300.fc29.x86_64
xorg-x11-server 1.20.3-2.fc29

now:
kernel-4.19.10-300.fc29.x86_64
xorg-x11-server 1.20.3-2.fc29

Comment 3 Jan Kratochvil 2019-01-10 08:59:39 UTC
*** Bug 1662548 has been marked as a duplicate of this bug. ***

Comment 4 Jan Kratochvil 2019-01-10 09:01:26 UTC
(In reply to Jan Kratochvil from comment #1)
> After reconnecting laptop to the docking station and resuming the laptop
> then sometimes external display remains off (internal display lid I keep
> closed).

This may be due to the error messages:
[ 21514.227] (II) modeset(0): Allocate new frame buffer 3840x2160 stride
[ 21514.251] (EE) modeset(0): failed to set mode: Invalid argument
[ 21514.253] (II) modeset(0): Allocate new frame buffer 5760x2160 stride
[ 21514.268] (EE) modeset(0): failed to set mode: Invalid argument

For example, this time I was not even able to revive the X server by switching to text mode and back, even after killing Firefox (a big application) and after running: echo 3 >/proc/sys/vm/drop_caches
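Since memory pressure was suspected here, a rough estimate of the two frame buffer sizes from the log may help put it in perspective. This is only a sketch: it assumes 4 bytes per pixel and no stride padding, because the log lines above do not show the actual stride value. The function name `framebuffer_bytes` is made up for illustration.

```python
def framebuffer_bytes(width, height, bytes_per_pixel=4):
    """Estimate the size of an unpadded frame buffer in bytes."""
    return width * height * bytes_per_pixel


# The two sizes reported by the "Allocate new frame buffer" lines.
for width, height in ((3840, 2160), (5760, 2160)):
    size = framebuffer_bytes(width, height)
    print(f"{width}x{height}: {size} bytes ({size / 2**20:.1f} MiB)")
```

Under these assumptions both buffers are in the tens of MiB, which would make a plain out-of-memory condition an unlikely explanation on its own.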

Comment 6 Jan Kratochvil 2019-02-17 16:19:21 UTC
Created attachment 1535722 [details]
/var/log/Xorg.0.log-failed12

I have now tried a trunk snapshot:
commitid=f6753c117ef0f83499d5e2d6dda226fec9ddf803
[    51.520] Kernel command line: BOOT_IMAGE=/vmlinuz-4.20.8-200.fc29.x86_64 root=/dev/mapper/luks-b5be36c4-d110-4e1c-bcbf-07ddf410db73 ro resume=/dev/mapper/luks-f4ff8e7a-db91-404d-8d2a-22c0dd3b7a58 rd.luks.uuid=luks-b5be36c4-d110-4e1c-bcbf-07ddf410db73 rd.luks.uuid=luks-f4ff8e7a-db91-404d-8d2a-22c0dd3b7a58 LANG=en_US.UTF-8
[    51.520] Build Date: 17 February 2019  04:55:55PM
[    51.520] Build ID: xorg-x11-server 1.20.3-3.20190217.fc29

It still crashes the same way. I will try later to add some debug dumps there (as the error is reported by a distant caller).

Comment 7 Jan Kratochvil 2019-02-18 21:51:10 UTC
Created attachment 1536159 [details]
libdrm debug patch

The problem happens in the kernel's DRM_IOCTL_MODE_ATOMIC ioctl, presumably in i915.ko.
It is called by libdrm's drmModeAtomicCommit().
The kernel returns -22 = EINVAL = Invalid argument.

If one clears the error (the patch does this), the X server no longer crashes but the display remains black.
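For reference, the error numbers seen in this report can be mapped to their symbolic names with a short Python sketch. It assumes a Linux host, where Python's errno table matches the kernel's numbering; the helper name `describe` is made up for illustration.

```python
import errno
import os


def describe(ret):
    """Return (symbolic name, message) for an errno or a negative return value."""
    code = -ret if ret < 0 else ret
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


# -22 is the value returned by DRM_IOCTL_MODE_ATOMIC; 2 corresponds to the
# "No such file or directory" variant from the original description.
for value in (-22, 2):
    name, message = describe(value)
    print(value, name, message)
```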

Tried these Fedora kernels but they all behave the same:
FAIL kernel-4.20.8-200.fc29.x86_64
FAIL kernel-5.0.0-0.rc6.git1.1.fc30.x86_64
FAIL kernel-4.18.18-200.fc28.x86_64
I also tried a drm-tip kernel, but it hung at the point where it should ask for the LUKS password on my system, so I could not test drm-tip.  It was built by kernel.spec as vanilla from:
  https://github.com/freedesktop/drm-tip.git
  7f6ace5f10a9d6c5d277b95e39f862eff87fdb45 = drm-tip

The data of the failed DRM_IOCTL_MODE_ATOMIC call is (I have not decoded it further):
atomic=0x7ffe3fceaa00 00 05 00 00 03 00 00 00 e0 7d ea ab 97 55 00 00 f0 c2 b1 ab 97 55 00 00 00 a9 33 ab 97 55 00 00 d0 33 d1 ab 97 55 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
objs_ptr=0x5597abea7de0 1c 00 00 00 29 00 00 00 67 00 00 00
count_props_ptr=0x5597abea7de0 0a 00 00 00 02 00 00 00 01 00 00 00
props_ptr=0x5597ab33a900 13 00 00 00 10 00 00 00 0f 00 00 00 0e 00 00 00 0d 00 00 00 0c 00 00 00 0b 00 00 00 0a 00 00 00 09 00 00 00 08 00 00 00 15 00 00 00 14 00 00 00 13 00 00 00
prop_values_ptr=0x5597abd133d0 29 00 00 00 00 00 00 00 6b 00 00 00 00 00 00 00 70 08 00 00 00 00 00 00 00 0f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 70 08 00 00 00 00 00 00 00 0f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 6e 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 29 00 00 00 00 00 00 00
user_data=(nil)
errno=4
ret=-22,errno=22
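The dump above can be decoded a little further with a small Python sketch. It assumes the `struct drm_mode_atomic` layout from the Linux UAPI header drm_mode.h (two 32-bit fields followed by six 64-bit fields, little-endian on x86_64) and the flag values defined there; the helper names `decode_atomic`, `u32_list` and `flag_names` are made up for illustration. The bytes are copied from the atomic=, objs_ptr=, count_props_ptr= and props_ptr= lines.

```python
import struct

# Bytes copied verbatim from the dump lines above.
ATOMIC_HEX = (
    "00 05 00 00 03 00 00 00 e0 7d ea ab 97 55 00 00 "
    "f0 c2 b1 ab 97 55 00 00 00 a9 33 ab 97 55 00 00 "
    "d0 33 d1 ab 97 55 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00"
)
OBJS_HEX = "1c 00 00 00 29 00 00 00 67 00 00 00"
COUNT_PROPS_HEX = "0a 00 00 00 02 00 00 00 01 00 00 00"
PROPS_HEX = (
    "13 00 00 00 10 00 00 00 0f 00 00 00 0e 00 00 00 0d 00 00 00 "
    "0c 00 00 00 0b 00 00 00 0a 00 00 00 09 00 00 00 08 00 00 00 "
    "15 00 00 00 14 00 00 00 13 00 00 00"
)

# Flag values as defined in drm_mode.h (assumed, not taken from this report).
ATOMIC_FLAGS = {
    0x0001: "DRM_MODE_PAGE_FLIP_EVENT",
    0x0002: "DRM_MODE_PAGE_FLIP_ASYNC",
    0x0100: "DRM_MODE_ATOMIC_TEST_ONLY",
    0x0200: "DRM_MODE_ATOMIC_NONBLOCK",
    0x0400: "DRM_MODE_ATOMIC_ALLOW_MODESET",
}

FIELDS = ("flags", "count_objs", "objs_ptr", "count_props_ptr",
          "props_ptr", "prop_values_ptr", "reserved", "user_data")


def decode_atomic(hex_bytes):
    """Unpack the 56-byte ioctl argument into a dict of named fields."""
    return dict(zip(FIELDS, struct.unpack("<IIQQQQQQ", bytes.fromhex(hex_bytes))))


def u32_list(hex_bytes):
    """Unpack a dump of little-endian 32-bit values into a list of ints."""
    raw = bytes.fromhex(hex_bytes)
    return list(struct.unpack("<%dI" % (len(raw) // 4), raw))


def flag_names(flags):
    """Return the names of all known flag bits set in flags."""
    return [name for bit, name in sorted(ATOMIC_FLAGS.items()) if flags & bit]


atomic = decode_atomic(ATOMIC_HEX)
objs = u32_list(OBJS_HEX)
counts = u32_list(COUNT_PROPS_HEX)
props = u32_list(PROPS_HEX)

print("flags", hex(atomic["flags"]), flag_names(atomic["flags"]))
for field in FIELDS[2:6]:
    print(field, hex(atomic[field]))

# Each object owns the next count_props[i] entries of the property list.
offset = 0
for obj_id, count in zip(objs, counts):
    print("object", obj_id, "property ids", props[offset:offset + count])
    offset += count
```

Under these assumptions the flags value 0x500 decodes to DRM_MODE_ATOMIC_TEST_ONLY plus DRM_MODE_ATOMIC_ALLOW_MODESET, the request touches three objects (IDs 28, 41 and 103) with 10, 2 and 1 properties respectively, and the decoded objs_ptr, props_ptr and prop_values_ptr match the addresses printed on the corresponding dump lines. The decoded count_props_ptr differs from the address printed on the count_props_ptr= line, which may simply be a labelling slip in the debug output.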

Comment 8 Jan Kratochvil 2019-02-18 22:08:28 UTC
Submitted it upstream: https://bugs.freedesktop.org/show_bug.cgi?id=109668

Comment 9 Jan Kratochvil 2019-05-31 13:32:01 UTC
Still valid with:
libdrm-2.4.97-2.fc29.x86_64
kernel-5.0.19-200.fc29.x86_64
xorg-x11-server-Xorg-1.20.4-1.fc29.x86_64

With the workaround from Comment 7 one can recover the X session by unplugging and replugging the cable to the docking station (sometimes multiple times).

Comment 10 Jan Kratochvil 2019-06-03 11:58:28 UTC
Created attachment 1576601 [details]
xorg-x11-server-1.20.4-1.fc29 patch

This is a patch for xorg-x11-server-1.20.4-1.fc29, although it is equivalent to what the libdrm patch 143401 does: it ignores any errors from drmModeAtomicCommit().

I can confirm it is unrelated to the docking station.  When I removed the docking station (connected to an LG 27UK650 (Xorg.log id "LG HDR 4K") by DisplayPort) and connected the display directly to the Lenovo X1 Carbon HDMI port, it behaved the same.

Besides switching consoles with ctrl-alt-Fx, the problem also happens after DPMS Off (xlock -dpmsoff). I have now changed it to DPMS Suspend (xlock -dpmssuspend) and it looks as if the locked-up display no longer happens.

With this patch the screen remains black, and one can recover it by disconnecting and reconnecting the display (after resuming from DPMS Off); the same can be done by disconnecting and reconnecting the docking station (or power-cycling it).

I have also updated the main BIOS of the Lenovo X1 Carbon to 1.38 now, with no effect (it even seems to me that it happens more often than with 1.34 before).

Comment 11 Jan Kratochvil 2019-06-11 08:34:18 UTC
In https://bugs.freedesktop.org/show_bug.cgi?id=109668 I have posted the log difference between the working and the failing case:

The failure is reported as "enabled/connectors mismatch", but where does the mismatch come from?
kernel-5.1.8-200.fc29.x86_64

-=working xlock resume
+=black/failing alt-f1
 i915 0000:00:02.0: [drm] crtc[47]: pipe A
 i915 0000:00:02.0: [drm]       enable=0
 i915 0000:00:02.0: [drm]       active=0
 i915 0000:00:02.0: [drm]       planes_changed=0
 i915 0000:00:02.0: [drm]       mode_changed=0
 i915 0000:00:02.0: [drm]       active_changed=0
 i915 0000:00:02.0: [drm]       connectors_changed=0
 i915 0000:00:02.0: [drm]       color_mgmt_changed=0
 i915 0000:00:02.0: [drm]       plane_mask=1
-i915 0000:00:02.0: [drm]       connector_mask=0
-i915 0000:00:02.0: [drm]       encoder_mask=4
+i915 0000:00:02.0: [drm]       connector_mask=1
+i915 0000:00:02.0: [drm]       encoder_mask=1
 i915 0000:00:02.0: [drm]       mode: "": 0 0 0 0 0 0 0 0 0 0 0x0 0x0
-i915 0000:00:02.0: [drm] connector[86]: DP-5
-i915 0000:00:02.0: [drm]       crtc=(null)
 [drm:drm_atomic_check_only [drm]] checking 000000004664b9ab
 [drm:drm_atomic_helper_check_modeset [drm_kms_helper]] [CRTC:47:pipe A] mode changed
 [drm:drm_atomic_helper_check_modeset [drm_kms_helper]] [CRTC:47:pipe A] enable changed
 [drm:drm_atomic_helper_check_modeset [drm_kms_helper]] [CRTC:47:pipe A] active changed
-[drm:drm_atomic_helper_check_modeset [drm_kms_helper]] Updating routing for [CONNECTOR:86:DP-5]
-[drm:drm_atomic_helper_check_modeset [drm_kms_helper]] Disabling [CONNECTOR:86:DP-5]
+[drm:drm_atomic_helper_check_modeset [drm_kms_helper]] [CRTC:47:pipe A] enabled/connectors mismatch
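A minimal Python model of the consistency check that appears to produce this message may help read the diff. It assumes that drm_atomic_helper_check_modeset() rejects a CRTC state whose enable flag disagrees with whether any connector is routed to it (connector_mask non-zero), and that the values dumped above are the ones the check sees. The function names `crtc_state_mismatch` and `mask_bits` are made up for illustration; this is not the kernel code itself.

```python
def crtc_state_mismatch(enable, connector_mask):
    """Model of the check: an enabled CRTC needs at least one connector,
    and a disabled CRTC must have none."""
    has_connectors = connector_mask != 0
    return bool(enable) != has_connectors


def mask_bits(mask):
    """Return the indexes of the bits set in a 32-bit mask."""
    return [i for i in range(32) if mask & (1 << i)]


# Values taken from the two variants of the log above (CRTC 47, pipe A).
states = {
    "working": {"enable": 0, "connector_mask": 0, "encoder_mask": 4},
    "failing": {"enable": 0, "connector_mask": 1, "encoder_mask": 1},
}

for label, state in states.items():
    print(
        label,
        "connector bits", mask_bits(state["connector_mask"]),
        "encoder bits", mask_bits(state["encoder_mask"]),
        "mismatch", crtc_state_mismatch(state["enable"], state["connector_mask"]),
    )
```

In this model the failing variant is rejected because the CRTC is marked disabled (enable=0) while a connector is still routed to it (connector_mask=1), whereas the working variant has no connector attached. Why the connector stays attached in the failing case is not answered by the sketch.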

Comment 12 Jan Kratochvil 2019-06-15 20:38:17 UTC
It has been fixed (or worked around?) by using Driver "intel" as described in Bug 1630367 Comment 18.
There is also Bug 1697591 for it.

Comment 13 Justin M. Forbes 2019-08-20 17:45:30 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 29 kernel bugs.

Fedora 29 has now been rebased to 5.2.9-100.fc29.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 30, and are still experiencing this issue, please change the version to Fedora 30.

If you experience different issues, please open a new bug report for those.

Comment 14 Justin M. Forbes 2019-09-17 20:11:48 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 3 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.