Bug 2224592 - complete I/O lockup on boot on Eaglelake 8086:2e32 with cmdline length longer than ~104 characters
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: libdrm
Version: 38
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Adam Jackson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-07-21 15:16 UTC by Felix Miata
Modified: 2024-05-12 06:01 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2024-05-12 06:01:08 UTC
Type: Bug
Embargoed:


Attachments
dmesg from Rawhide using ipv6.disable=1 net.ifnames=0 audit=0 selinux=0 consoleblank=0 mitigations=auto preempt=full drm.debug=0x1e log_buf_len=1M (177.21 KB, text/plain)
2023-07-21 15:16 UTC, Felix Miata

Description Felix Miata 2023-07-21 15:16:35 UTC
Created attachment 1976945
dmesg from Rawhide using ipv6.disable=1 net.ifnames=0 audit=0 selinux=0 consoleblank=0 mitigations=auto preempt=full drm.debug=0x1e log_buf_len=1M

Original Summary:
complete I/O lockup on boot on Eaglelake 8086:2e32 with various cmdline combinations of consoleblank=/mitigations=/preempt=

Description of problem:
Typical dmesg tail (without drm.debug=0x1e log_buf_len=1M):
[    8.253842] pci 0000:00:00.0: detected gtt size: 2097152K total, 262144K mappable
[    8.254160] pci 0000:00:00.0: detected 32768K stolen memory
[    8.254261] i915 0000:00:02.0: vgaarb: deactivate vga console
[    8.254465] Console: switching to colour dummy device 80x25
[    8.301562] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[    8.459373] fbcon: i915drmfb (fb0) is primary device
[    8.459638] Console: switching to colour frame buffer device 210x65
[    8.464110] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[    8.695036] r8169 0000:01:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[   19.418090] i915 0000:00:02.0: [drm] *ERROR* [CRTC:45:pipe A] flip_done timed out
[   29.658090] i915 0000:00:02.0: [drm] *ERROR* [CRTC:60:pipe B] flip_done timed out
[   29.658109] i915 0000:00:02.0: [drm] *ERROR* pipe A underrun
[   40.410096] i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
[   40.410103] i915 0000:00:02.0: [drm] *ERROR* [CRTC:45:pipe A] commit wait timed out
[   50.650091] i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
[   50.650098] i915 0000:00:02.0: [drm] *ERROR* [CRTC:60:pipe B] commit wait timed out

Version-Release number of selected component (if applicable):

How reproducible:
0. Always with ipv6.disable=1 net.ifnames=0 audit=0 selinux=0
1. More often than not with 6.2.15-300.fc38 and 6.3.12-200.fc38 on F38
2. According to the table below with 6.3.12-200.fc38 and 6.5rc2 on Rawhide:
Using Dual Displays:

Bad:
consoleblank=0
consoleblank=0 mitigations=auto
consoleblank=0 mitigations=auto preempt=full
consoleblank=0 mitigations=off
consoleblank=0 preempt=full
consoleblank=0 preempt=full mitigations=auto
consoleblank=0 preempt=full mitigations=off
consoleblank=0 preempt=none mitigations=auto
consoleblank=0 preempt=none mitigations=off
mitigations=auto
mitigations=auto preempt=full
mitigations=auto preempt=none
mitigations=off preempt=full
preempt=full
preempt=none

OK:
consoleblank=0 preempt=none
mitigations=off
mitigations=off preempt=none
(none of consoleblank, mitigations, preempt)

Using DVI display Only:

Bad:
consoleblank=0
consoleblank=0 mitigations=auto
consoleblank=0 mitigations=auto preempt=none
consoleblank=0 mitigations=off preempt=full
consoleblank=0 preempt=full
consoleblank=0 preempt=none
mitigations=auto preempt=full
mitigations=off preempt=full
mitigations=off preempt=none

OK:
consoleblank=0 mitigations=auto preempt=full
consoleblank=0 mitigations=off
consoleblank=0 mitigations=off preempt=none
mitigations=auto preempt=none
(none of consoleblank, mitigations, preempt)

Steps to Reproduce:
1. Connect displays to motherboard's DVI and VGA ports
2. Try to boot

Actual results:
1-DRM/CRTC errors
2-primary display flashes two white cursors at high rate on black screen
3-secondary display sleeps due to "out of range"
4-no keyboard response
5-remote login functional

Expected results:
1-Xorg/KDM/Plasma-X11 work normally
2-keyboard is normal/responsive

Additional info:
1-normal operation with one or two displays in F37/6.3.x, Tumbleweed/6.3.9, Trixie/6.3.x, Mageia/6.3
2-# pinxi -GSaz --vs --zl --hostname
pinxi 3.3.28-02 (2023-07-18)
System:
  Host: big41 Kernel: 6.3.12-100.fc37.x86_64 arch: x86_64 bits: 64
    compiler: gcc v: 2.38-27.fc37 clocksource: tsc available: hpet,acpi_pm
    parameters: ro root=LABEL=<filter> ipv6.disable=1 net.ifnames=0 audit=0
    selinux=0 noresume consoleblank=0 preempt=full mitigations=off
  Desktop: KDE Plasma v: 5.27.4 tk: Qt v: 5.15.9 wm: kwin_x11 vt: 7 dm:
    1: KDM 2: XDM Distro: Fedora release 37 (Thirty Seven)
Graphics:
  Device-1: Intel 4 Series Integrated Graphics vendor: Biostar Microtech Intl
    Corp driver: i915 v: kernel arch: Gen-5 process: Intel 45nm built: 2008
    ports: active: HDMI-A-1,VGA-1 empty: DP-1 bus-ID: 00:02.0
    chip-ID: 8086:2e32 class-ID: 0300
  Display: x11 server: X.Org v: 1.20.14 with: Xwayland v: 22.1.9
    compositor: kwin_x11 driver: X: loaded: modesetting unloaded: fbdev,vesa
    dri: crocus gpu: i915 display-ID: :0 screens: 1
  Screen-1: 0 s-res: 3600x1200 s-dpi: 120 s-size: 762x254mm (30.00x10.00")
    s-diag: 803mm (31.62")
  Monitor-1: HDMI-A-1 mapped: HDMI-1 pos: primary,left model: NEC EA243WM
    serial: <filter> built: 2011 res: 1920x1200 hz: 60 dpi: 94 gamma: 1.2
    size: 519x324mm (20.43x12.76") diag: 612mm (24.1") ratio: 16:10 modes:
    max: 1920x1200 min: 640x480
  Monitor-2: VGA-1 pos: right model: Dell P2213 serial: <filter> built: 2012
    res: 1680x1050 hz: 60 dpi: 90 gamma: 1.2 size: 473x296mm (18.62x11.65")
    diag: 558mm (22") ratio: 16:10 modes: max: 1680x1050 min: 720x400
  API: OpenGL v: 2.1 Mesa 23.0.3 renderer: Mesa Intel G41 (ELK)
    direct-render: Yes
#

Comment 1 Felix Miata 2023-10-08 15:38:23 UTC
Updates apparently fixed this on F38, but the same problem now occurs on F39 with the 6.4.16 and 6.5.5 kernels, and even fewer cmdline option combinations succeed:

Next-to-last failure was with ipv6.disable=1 net.ifnames=0 selinux=0 mitigations=off

Last success was with ipv6.disable=1 net.ifnames=0 selinux=0 ibt=off

Last failure was with ipv6.disable=1 net.ifnames=0 selinux=0 video=1440x900@60
# journalctl --no-host | tail
Oct 08 11:34:44 kernel: i915 0000:00:02.0: [drm] *ERROR* [CONNECTOR:64:HDMI-A-1] commit wait timed out
Oct 08 11:34:49 kdm[546]: X server is stuck in D state; leaving it alone
Oct 08 11:34:49 kdm[546]: X server for display :0 cannot be started, session disabled
Oct 08 11:34:54 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Oct 08 11:34:54 kernel: i915 0000:00:02.0: [drm] *ERROR* [PLANE:31:primary A] commit wait timed out
Oct 08 11:35:04 kernel: i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
Oct 08 11:35:04 kernel: i915 0000:00:02.0: [drm] *ERROR* [PLANE:46:primary B] commit wait timed out
Oct 08 11:35:14 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:45:pipe A] flip_done timed out
Oct 08 11:35:25 kernel: i915 0000:00:02.0: [drm] *ERROR* [CRTC:60:pipe B] flip_done timed out
Oct 08 11:35:25 kernel: i915 0000:00:02.0: [drm] *ERROR* pipe A underrun
# uname -r
6.5.5-300.fc39.x86_64
#

Comment 2 Felix Miata 2023-11-30 22:27:14 UTC
Problem is now gone on F39:
# cat /proc/cmdline
... ipv6.disable=1 net.ifnames=0 audit=0 selinux=0 consoleblank=0 mitigations=off ibt=off video=1440x900@60
# uname -a
Linux big41 6.6.3-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Nov 28 19:11:52 UTC 2023 x86_64 GNU/Linux
#

Comment 3 Felix Miata 2023-11-30 23:36:06 UTC
But it's back in F38:
# uname -a
Linux big41 6.6.3-100.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Nov 28 20:36:17 UTC 2023 x86_64 GNU/Linux
# cat /proc/cmdline
ro root=LABEL=i256p20f38 ipv6.disable=1 net.ifnames=0 audit=0 noresume consoleblank=0 mitigations=off selinux=0
#
Omit mitigations=off and it's OK. mitigations=auto is also bad:
# uname -a
Linux big41 6.6.3-100.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Nov 28 20:36:17 UTC 2023 x86_64 GNU/Linux
# cat /proc/cmdline
ro root=LABEL=i256p20f38 ipv6.disable=1 net.ifnames=0 audit=0 noresume consoleblank=0 mitigations=auto selinux=0
# dmesg | tail
[    8.520381] i915 0000:00:02.0: vgaarb: deactivate vga console
[    8.520577] Console: switching to colour dummy device 80x25
[    8.548068] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[    8.730188] fbcon: i915drmfb (fb0) is primary device
[    8.730417] Console: switching to colour frame buffer device 210x65
[    8.730431] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[    9.991983] r8169 0000:01:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[   20.103015] i915 0000:00:02.0: [drm] *ERROR* [CRTC:45:pipe A] flip_done timed out
[   30.343015] i915 0000:00:02.0: [drm] *ERROR* [CRTC:60:pipe B] flip_done timed out
[   30.343033] i915 0000:00:02.0: [drm] *ERROR* pipe A underrun
#
The following was OK:
# uname -a
Linux big41 6.6.3-100.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Nov 28 20:36:17 UTC 2023 x86_64 GNU/Linux
[root@big41 ~]# cat /proc/cmdline
ro root=LABEL=i256p20f38 ipv6.disable=1 net.ifnames=0 audit=0 noresume consoleblank=0 mitigations=auto
#
So the problem seems to depend not on which parameters are used but on the total cmdline length, because the following was also OK:
# uname -a
Linux big41 6.6.3-100.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Nov 28 20:36:17 UTC 2023 x86_64 GNU/Linux
[root@big41 ~]# cat /proc/cmdline
ro root=LABEL=i256p20f38 ipv6.disable=1 net.ifnames=0 noresume mitigations=off selinux=0
#
And again not:
# uname -a
Linux big41 6.6.3-100.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Nov 28 20:36:17 UTC 2023 x86_64 GNU/Linux
[root@big41 ~]# cat /proc/cmdline
ro root=LABEL=i256p20f38 ipv6.disable=1 net.ifnames=0 noresume mitigations=off selinux=0 video=1440x900@60
[root@big41 ~]# dmesg | tail
[    8.261285] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[    8.442207] fbcon: i915drmfb (fb0) is primary device
[    8.442355] Console: switching to colour frame buffer device 180x56
[    8.442364] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[    9.987429] r8169 0000:01:00.0 eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[   19.591958] i915 0000:00:02.0: [drm] *ERROR* [CRTC:45:pipe A] flip_done timed out
[   29.831963] i915 0000:00:02.0: [drm] *ERROR* [CRTC:60:pipe B] flip_done timed out
[   29.831983] i915 0000:00:02.0: [drm] *ERROR* pipe A underrun
[   40.583966] i915 0000:00:02.0: [drm] *ERROR* flip_done timed out
[   40.583974] i915 0000:00:02.0: [drm] *ERROR* [CRTC:45:pipe A] commit wait timed out
#
The same cmdline as in comment #2, 101, is also OK in F38:
# uname -a
Linux big41 6.6.3-100.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Nov 28 20:36:17 UTC 2023 x86_64 GNU/Linux
# cat /proc/cmdline
ro root=LABEL=i256p20f38 ipv6.disable=1 net.ifnames=0 audit=0 selinux=0 consoleblank=0 mitigations=off ibt=off
#
A character count of 103 is OK, but 107 is not, so I tried these two:
ro root=LABEL=i256p29f39 ipv6.disable=1 net.ifnames=0 audit=0 selinux=0 consoleblank=0 preempt=no 0123456
= fail
ro root=LABEL=i256p29f39 ipv6.disable=1 net.ifnames=0 audit=0 selinux=0 consoleblank=0 preempt=no 012345
= succeed
So, the success limit appears to be 104 characters, at least in F39.
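
The lengths of the two test strings above can be checked mechanically. The following is a minimal sketch, assuming Python 3 is available; the helper name `cmdline_length` and the constants `FAIL` and `OK` are hypothetical and were not part of the original testing. The helper ignores a trailing newline, because reading /proc/cmdline returns one.

```python
def cmdline_length(cmdline: str) -> int:
    """Return the character count of a kernel cmdline, ignoring a trailing newline."""
    return len(cmdline.rstrip("\n"))


# The two test strings from this comment, copied verbatim.
FAIL = ("ro root=LABEL=i256p29f39 ipv6.disable=1 net.ifnames=0 audit=0 "
        "selinux=0 consoleblank=0 preempt=no 0123456")
OK = ("ro root=LABEL=i256p29f39 ipv6.disable=1 net.ifnames=0 audit=0 "
      "selinux=0 consoleblank=0 preempt=no 012345")

if __name__ == "__main__":
    for label, cmdline in (("fail", FAIL), ("succeed", OK)):
        print(f"{cmdline_length(cmdline):4d}  {label}")
```

Run as a script, this reports 105 characters for the failing string and 104 for the succeeding one. To check a running system, pass the contents of /proc/cmdline to `cmdline_length`. A count taken with `wc -c` on that file would be one higher, since it includes the trailing newline.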

Comment 4 Felix Miata 2024-05-12 06:01:08 UTC
6.7.9 on F39 and 6.8.9 on F40 don't reproduce this.