Bug 1188772 - i915 driver crash hangs xorg on Fedora 21
Summary: i915 driver crash hangs xorg on Fedora 21
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-intel
Version: 22
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Adam Jackson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1222273
Depends On:
Blocks:
 
Reported: 2015-02-03 16:44 UTC by Will Foster
Modified: 2016-07-19 12:45 UTC
CC List: 18 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-07-19 12:45:49 UTC
Type: Bug


Attachments
stack trace from /var/log/messages (3.50 KB, text/plain)
2015-02-03 16:44 UTC, Will Foster

Description Will Foster 2015-02-03 16:44:45 UTC
Created attachment 987684
stack trace from /var/log/messages

Description of problem:

Occasionally, the Intel i915 chipset (Lenovo x240 Haswell) will hang.
This causes xorg and the window environment to hang completely.


Version-Release number of selected component (if applicable):

Fedora 21, latest updates as of 2015-02-03


How reproducible:

Occasional; I do not yet know what triggers it.
I have tried both with windowing effects via Compiz (which seems to trigger it more often) and with xfwm4, as I am an XFCE user.

Of note, with Compiz enabled the whole system hard-locks, whereas with plain xfwm4 and no windowing effects only xorg locks up and the machine otherwise still responds.

When using Compiz, I am unable to obtain any stack trace information, as the system freezes and the exception is not logged.

Steps to Reproduce:
1. Use laptop in runlevel 5 with i915 Haswell chipset
2. Wait a day or 5
3. Crash!

Actual results:


Expected results:


Additional info:

Here is the stacktrace:

Feb  3 16:24:55 oberschnutz kernel: ------------[ cut here ]------------
Feb  3 16:24:55 oberschnutz kernel: WARNING: CPU: 3 PID: 10861 at drivers/gpu/drm/i915/intel_pm.c:6585 intel_display_power_put+0x15c/0x170 [i915]()
Feb  3 16:24:55 oberschnutz kernel: Modules linked in: cpufreq_stats tun fuse ccm nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw arc4 xfs libcrc32c iwlmvm mac80211 intel_rapl uvcvideo x86_pkg_temp_thermal coretemp iwlwifi videobuf2_vmalloc kvm_intel videobuf2_core videobuf2_memops v4l2_common videodev iTCO_wdt iTCO_vendor_support cfg80211 media kvm btusb bluetooth snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel joydev snd_hda_controller
Feb  3 16:24:55 oberschnutz kernel: snd_hda_codec rtsx_pci_ms serio_raw i2c_i801 snd_hwdep snd_seq memstick snd_seq_device thinkpad_acpi snd_pcm wmi tpm_tis rfkill tpm shpchp snd_timer snd soundcore lpc_ich nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc dm_crypt i915 rtsx_pci_sdmmc mmc_core i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel drm ghash_clmulni_intel ptp rtsx_pci mfd_core pps_core video [last unloaded: e1000e]
Feb  3 16:24:55 oberschnutz kernel: CPU: 3 PID: 10861 Comm: kworker/3:3 Tainted: G     U  W      3.18.3-201.fc21.x86_64 #1
Feb  3 16:24:55 oberschnutz kernel: Hardware name: LENOVO 20AMS22U03/20AMS22U03, BIOS GIET75WW (2.25 ) 06/24/2014
Feb  3 16:24:55 oberschnutz kernel: Workqueue: events edp_panel_vdd_work [i915]
Feb  3 16:24:55 oberschnutz kernel: 0000000000000000 00000000e5fea85f ffff880077fa3cf8 ffffffff8175da66
Feb  3 16:24:55 oberschnutz kernel: 0000000000000000 0000000000000000 ffff880077fa3d38 ffffffff81099181
Feb  3 16:24:55 oberschnutz kernel: 0000000000000206 ffff88020ff8002c 000000000000000b ffff88020ff88848
Feb  3 16:24:55 oberschnutz kernel: Call Trace:
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff8175da66>] dump_stack+0x46/0x58
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff81099181>] warn_slowpath_common+0x81/0xa0
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff8109929a>] warn_slowpath_null+0x1a/0x20
Feb  3 16:24:55 oberschnutz kernel: [<ffffffffa0135b2c>] intel_display_power_put+0x15c/0x170 [i915]
Feb  3 16:24:55 oberschnutz kernel: [<ffffffffa01a7a1b>] edp_panel_vdd_off_sync+0xcb/0x1c0 [i915]
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff817624b6>] ? mutex_lock+0x16/0x40
Feb  3 16:24:55 oberschnutz kernel: [<ffffffffa01a7b41>] edp_panel_vdd_work+0x31/0x40 [i915]
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff810b240d>] process_one_work+0x14d/0x400
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff810b2d9b>] worker_thread+0x6b/0x4a0
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff810b2d30>] ? rescuer_thread+0x2a0/0x2a0
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff810b7fea>] kthread+0xea/0x100
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff8113aa64>] ? __audit_syscall_entry+0xc4/0x120
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff810b7f00>] ? kthread_create_on_node+0x1b0/0x1b0
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff817646fc>] ret_from_fork+0x7c/0xb0
Feb  3 16:24:55 oberschnutz kernel: [<ffffffff810b7f00>] ? kthread_create_on_node+0x1b0/0x1b0
Feb  3 16:24:55 oberschnutz kernel: ---[ end trace 061df5d943e3847b ]---
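
For anyone comparing traces in this report, here is a minimal shell sketch that lists the i915 frames from a pasted call trace, in order of first appearance. The function name i915_frames is made up for illustration, and the pattern assumes the "symbol+0xoff/0xsize [i915]" frame format seen in the 3.18 traces here; other kernels may print frames differently.

```shell
# i915_frames: read a kernel trace on stdin and print each function name
# that belongs to the i915 module, once, in order of first appearance.
# (Hypothetical helper; assumes frames look like "name+0x15c/0x170 [i915]".)
i915_frames() {
    grep -oE '[A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+/0x[0-9a-f]+ \[i915\]' \
        | sed -E 's/\+0x.*$//' \
        | awk '!seen[$0]++'
}
```

Possible usage, assuming the trace was saved to a file called trace.txt: `i915_frames < trace.txt`. For the trace above this should surface intel_display_power_put reached via the edp_panel_vdd_* path.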

Comment 1 Will Foster 2015-02-05 17:10:12 UTC
I am going to try the kernel options mentioned here on latest F21 kernel: 3.18.3-201.fc21.x86_64

https://johnlewis.ie/tentative-fixwork-around-for-i915-gpu-hangs/

e.g.

drm.debug=0 drm.vblankoffdelay=1 i915.semaphores=0 i915.modeset=1 i915.use_mmio_flip=1 i915.powersave=1 i915.enable_ips=1 i915.disable_power_well=1 i915.enable_hangcheck=1 i915.enable_cmd_parser=1 i915.fastboot=0 i915.enable_ppgtt=1 i915.reset=0 i915.lvds_use_ssc=0 i915.enable_psr=0 

I will report back if I get any more crashes. I will try first with Compiz as the window manager under XFCE4. I have all the latest F21 updates as of 2015-02-05.

-will
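
When testing a long option list like the one above, it is easy to boot a kernel entry that silently lacks one of them. Below is a small sketch that reports which options are absent from a given kernel command line. The helper name missing_kernel_args is hypothetical, and it only does a whole-word comparison; it does not check that the running driver actually accepts the option.

```shell
# missing_kernel_args CMDLINE ARG...
# Print every ARG that does not appear as a whole, space-delimited word
# in CMDLINE. Prints nothing when all arguments are present.
missing_kernel_args() {
    local cmdline=" $1 "
    shift
    local arg
    for arg in "$@"; do
        case "$cmdline" in
            *" $arg "*) ;;                 # present, nothing to report
            *) printf '%s\n' "$arg" ;;     # absent from the command line
        esac
    done
}
```

Possible usage against the running kernel: `missing_kernel_args "$(cat /proc/cmdline)" i915.semaphores=0 i915.enable_psr=0`. On Fedora, one common way to add options to all installed kernels is `sudo grubby --update-kernel=ALL --args="i915.semaphores=0 i915.enable_psr=0"`, followed by a reboot.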

Comment 2 Will Foster 2015-02-06 13:21:02 UTC
No logs this time, unlike before, as I was using Compiz as the window manager under XFCE4.

This is on newer (latest) kernel: 3.18.5-201.fc21.x86_64

In this last case it was triggered by hooking up an external HDMI monitor.  Booting with said monitor attached does not trigger it (yet).
I thought it was related to the above kernel options not being preserved through suspend/wake, but it occurred on a fresh boot after plugging in the external display and making it the primary display.

I seem to be able to trigger this every time by doing the following:

1) boot into XFCE with Compiz as the Window Manager instead of xfwm4
2) hook up HDMI monitor
3) force primary display to HDMI monitor and disable laptop display.

I cannot trigger this every time using xfwm4; it still occurs, though I do get more logging.

I will leave sshd enabled so that I can hopefully log in remotely, sync() and reboot cleanly, so that perhaps more will be logged.

Saw this in logs, but not sure if relevant:

Feb  6 13:03:40 oberschnutz kernel: [  107.562952] [drm:ivybridge_set_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
Feb  6 13:03:40 oberschnutz kernel: [  107.563006] [drm:ivb_err_int_handler] *ERROR* Pipe A FIFO underrun
Feb  6 13:03:40 oberschnutz kernel: [drm:ivybridge_set_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
Feb  6 13:03:40 oberschnutz kernel: [drm:ivb_err_int_handler] *ERROR* Pipe A FIFO underrun
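
Since the same DRM errors repeat many times in these logs, a per-message count can make reports easier to compare. The following is a rough sketch, not a definitive tool: the function name summarize_drm_errors is invented here, and it assumes the "[drm:function] *ERROR* message" layout shown in the lines above.

```shell
# summarize_drm_errors: read kernel log lines on stdin and print how often
# each distinct DRM error message occurs, most frequent first.
# (Hypothetical helper; assumes the "[drm:...] *ERROR* text" log format.)
summarize_drm_errors() {
    grep -E '\[drm:.*\*ERROR\*' \
        | sed -E 's/^.*\*ERROR\* *//' \
        | sort | uniq -c | sort -rn
}
```

Possible usage: `journalctl -k -b | summarize_drm_errors` for the current boot, or `summarize_drm_errors < /var/log/messages` as root.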

Comment 3 Will Foster 2015-02-06 15:36:19 UTC
Here are the relevant non-default xorg.conf additions I've been using.
I am disabling "sna" acceleration to see if that helps; this is the third crash today.

[root@oberschnutz ~]# cat /etc/X11/xorg.conf.d/20-intel.conf 
Section "Device"
    Identifier  "Intel Graphics"
    Driver      "intel"
#   Option      "AccelMethod"  "sna"
#   disabling sna for now
    Option      "AccelMethod"  "uxa"
#   Option      "AccelMethod"  "glamor"
#   glamor is another option to try
    Option      "TearFree"     "true"
EndSection

Section "ServerFlags"
    Option      "AIGLX"        "on"
EndSection

Section "Extensions"
    Option      "Composite"    "Enable"
EndSection

Comment 4 Will Foster 2015-02-06 18:05:18 UTC
Switching from "sna" to "uxa" acceleration doesn't seem to make a difference: I can confirm a crash after unplugging an external HDMI monitor. However, "uxa" is faster :)

I have not tried "uxa" versus "sna" with xfwm4 as the window manager instead of Compiz.

Will continue to try various kernel options and things.
Switching to vblank_mode=0 and i915.semaphores=1 for now.

I am on latest, updated F21 as of 2015-02-06.

kernel-3.18.5-201.fc21.x86_64
xorg-x11-drv-intel-2.99.916-3.20141117.fc21.x86_64
libva-intel-driver-1.4.1-1.fc21.x86_64
intel-gpu-tools-1.7-20.intel20142.x86_64

Comment 5 Will Foster 2015-02-06 18:43:47 UTC
lspci output
-------------

00:00.0 Host bridge: Intel Corporation Haswell-ULT DRAM Controller (rev 0b)
00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)
00:03.0 Audio device: Intel Corporation Haswell-ULT HD Audio Controller (rev 0b)
00:14.0 USB controller: Intel Corporation 8 Series USB xHCI HC (rev 04)
00:16.0 Communication controller: Intel Corporation 8 Series HECI #0 (rev 04)
00:16.3 Serial controller: Intel Corporation 8 Series HECI KT (rev 04)
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I218-LM (rev 04)
00:1b.0 Audio device: Intel Corporation 8 Series HD Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 8 Series PCI Express Root Port 6 (rev e4)
00:1c.1 PCI bridge: Intel Corporation 8 Series PCI Express Root Port 3 (rev e4)
00:1d.0 USB controller: Intel Corporation 8 Series USB EHCI #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation 8 Series LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 8 Series SATA Controller 1 [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 8 Series SMBus Controller (rev 04)
02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5227 PCI Express Card Reader (rev 01)
03:00.0 Network controller: Intel Corporation Wireless 7260 (rev 83)

Comment 6 Will Foster 2015-02-10 21:51:49 UTC
As an update, I believe this may be a compiz crash related to i915.
To rule this out I've switched to using kwin and have not experienced a crash yet.  I will continue to watch this and report back, especially if I reproduce this again via xfwm4 or another window manager.

Comment 7 Will Foster 2015-02-16 14:59:27 UTC
Smooth sailing thus far, no crashes using kwin / XFCE:

3.18.6-200.fc21.x86_64
xorg-x11-drv-intel-2.99.916-3.20141117.fc21.x86_64
libva-intel-driver-1.4.1-1.fc21.x86_64
intel-gpu-tools-1.7-20.intel20142.x86_64
xorg-x11-server-common-1.16.3-2.fc21.x86_64

Comment 8 prohol 2015-02-27 14:19:03 UTC
I have the same crashes:

[143112.808141] [drm:assert_plane] *ERROR* plane A assertion failure (expected on, current off)
[143113.777166] [drm:ivybridge_set_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[143113.777168] [drm:ivb_err_int_handler] *ERROR* Pipe A FIFO underrun

3.18.7-200.fc21.x86_64
xorg-x11-drv-intel-2.99.916-3.20141117.fc21.x86_64
xorg-x11-server-common-1.16.3-2.fc21.x86_64

I have these crashes in gnome shell, classic gnome and cinnamon.

Comment 9 Will Foster 2015-03-09 14:32:17 UTC
This still occurs, though I have different behavior under XFCE + KWIN.
Basically, it takes down X11.

[542600.422859] WARNING: CPU: 0 PID: 11498 at drivers/gpu/drm/i915/intel_pm.c:6560 intel_display_power_put+0x15c/0x170 [i915]()
[542600.422860] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack msdos nls_utf8 isofs loop vfat fat uas usb_storage cpufreq_stats tun ccm fuse bridge stp llc arc4 uvcvideo snd_hda_codec_realtek videobuf2_vmalloc iwlmvm snd_hda_codec_generic videobuf2_core snd_hda_codec_hdmi videobuf2_memops v4l2_common videodev mac80211 xfs media intel_rapl x86_pkg_temp_thermal libcrc32c iTCO_wdt coretemp iTCO_vendor_support btusb iwlwifi bluetooth kvm_intel kvm cfg80211 snd_hda_intel snd_hda_controller snd_hda_codec joydev snd_hwdep rtsx_pci_ms serio_raw snd_seq memstick snd_seq_device thinkpad_acpi wmi snd_pcm tpm_tis tpm i2c_i801 snd_timer snd rfkill shpchp lpc_ich ptp soundcore pps_core nfsd
[542600.422879]  auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc dm_crypt i915 rtsx_pci_sdmmc mmc_core crct10dif_pclmul crc32_pclmul i2c_algo_bit crc32c_intel drm_kms_helper drm ghash_clmulni_intel rtsx_pci mfd_core video [last unloaded: iptable_raw]
[542600.422887] CPU: 0 PID: 11498 Comm: Xorg.bin Tainted: G     U  W      3.18.7-200.fc21.x86_64 #1
[542600.422888] Hardware name: LENOVO 20AMS22U03/20AMS22U03, BIOS GIET75WW (2.25 ) 06/24/2014
[542600.422889]  0000000000000000 0000000089eb002d ffff88018f583a08 ffffffff8175e686
[542600.422891]  0000000000000000 0000000000000000 ffff88018f583a48 ffffffff810991d1
[542600.422892]  ffff8800360d8848 ffff8800360d002c ffff8800360d0000 ffff8800360d8848
[542600.422894] Call Trace:
[542600.422898]  [<ffffffff8175e686>] dump_stack+0x46/0x58
[542600.422901]  [<ffffffff810991d1>] warn_slowpath_common+0x81/0xa0
[542600.422903]  [<ffffffff810992ea>] warn_slowpath_null+0x1a/0x20
[542600.422910]  [<ffffffffa00f4a7c>] intel_display_power_put+0x15c/0x170 [i915]
[542600.422923]  [<ffffffffa014636c>] modeset_update_crtc_power_domains+0x16c/0x180 [i915]
[542600.422935]  [<ffffffffa01468be>] haswell_modeset_global_resources+0xe/0x10 [i915]
[542600.422946]  [<ffffffffa0147165>] __intel_set_mode+0x615/0x16f0 [i915]
[542600.422958]  [<ffffffffa0150746>] intel_set_mode+0x16/0x30 [i915]
[542600.422969]  [<ffffffffa01518ea>] intel_crtc_set_config+0xa9a/0xea0 [i915]
[542600.422980]  [<ffffffffa005d228>] ? drm_modeset_lock+0x78/0x110 [drm]
[542600.422988]  [<ffffffffa004de30>] drm_mode_set_config_internal+0x60/0xf0 [drm]
[542600.422995]  [<ffffffffa00525e6>] drm_mode_setcrtc+0x2b6/0x5d0 [drm]
[542600.423001]  [<ffffffffa0043aaf>] drm_ioctl+0x1df/0x680 [drm]
[542600.423004]  [<ffffffff810d4549>] ? pick_next_task_fair+0x6c9/0x8c0
[542600.423006]  [<ffffffff8122a1c0>] do_vfs_ioctl+0x2d0/0x4b0
[542600.423008]  [<ffffffff8122a421>] SyS_ioctl+0x81/0xa0
[542600.423010]  [<ffffffff8113adf6>] ? __audit_syscall_exit+0x1f6/0x2a0
[542600.423013]  [<ffffffff81765429>] system_call_fastpath+0x12/0x17
[542600.423014] ---[ end trace 534ef4638b6d4f4d ]---

Comment 10 Philip Prindeville 2015-03-10 18:14:26 UTC
(In reply to prohol from comment #8)
> I have the same crashes:
> 
> [143112.808141] [drm:assert_plane] *ERROR* plane A assertion failure
> (expected on, current off)
> [143113.777166] [drm:ivybridge_set_fifo_underrun_reporting] *ERROR*
> uncleared fifo underrun on pipe A
> [143113.777168] [drm:ivb_err_int_handler] *ERROR* Pipe A FIFO underrun
> 
> 3.18.7-200.fc21.x86_64
> xorg-x11-drv-intel-2.99.916-3.20141117.fc21.x86_64
> xorg-x11-server-common-1.16.3-2.fc21.x86_64
> 
> I have this crashes in gnome shell, classic gnome and cinnamon.

I'm also seeing that:

Mar 10 11:46:03 eng-dhcp-123 kernel: [321441.212112] [drm:assert_plane] *ERROR* plane A assertion failure (expected on, current off)
Mar 10 11:46:03 eng-dhcp-123 kernel: [drm:assert_plane] *ERROR* plane A assertion failure (expected on, current off)
Mar 10 11:48:12 eng-dhcp-123 kernel: [321570.250953] [drm:assert_plane] *ERROR* plane A assertion failure (expected on, current off)
Mar 10 11:48:12 eng-dhcp-123 kernel: [drm:assert_plane] *ERROR* plane A assertion failure (expected on, current off)

running a patched kernel (3.18.7-200) as described in bug #1120901.

Same RPMs as above.

Comment 11 Philip Prindeville 2015-03-10 18:18:21 UTC
Oh, I should add: nothing crashed, but I did lose my pointer when the display woke back up again.

Comment 12 prohol 2015-03-11 08:31:48 UTC
(In reply to Philip Prindeville from comment #11)
> Oh, I should add: nothing crashed, but I did lose my pointer when the
> display woke back up again.

I have the same problem with the pointer, but sometimes the display freezes and only restarting gdm from the console helps.

P.S. After updating the BIOS on the HP EliteBook 840G1 to the newest version, everything has worked fine for 2 days.

Comment 13 prohol 2015-03-31 06:39:24 UTC
After a few days the pointer works with no problem, but these messages sometimes appear on the console again:

Mar 10 11:46:03 eng-dhcp-123 kernel: [321441.212112] [drm:assert_plane] *ERROR* plane A assertion failure (expected on, current off)
Mar 10 11:46:03 eng-dhcp-123 kernel: [drm:assert_plane] *ERROR* plane A assertion failure (expected on, current off)
Mar 10 11:48:12 eng-dhcp-123 kernel: [321570.250953] [drm:assert_plane] *ERROR* plane A assertion failure (expected on, current off)
Mar 10 11:48:12 eng-dhcp-123 kernel: [drm:assert_plane] *ERROR* plane A assertion failure (expected on, current off)

I'm now testing the newest kernel, 3.19.2.

Comment 14 Nick Coghlan 2015-05-26 12:15:06 UTC
I believe I may be seeing this issue with Fedora 22 under KDE on a HP Spectre 360 (presumably using Compiz, as I believe that's the default now).

kernel-4.0.4-301.fc22.x86_64
xorg-x11-drv-intel-2.99.917-6.20150211.fc22.x86_64

Representative selection of errors from /var/log/messages:

May 26 18:28:14 thechalk kernel: [drm:gen8_irq_handler [i915]] *ERROR* The master control interrupt lied (SDE)!
May 26 21:16:01 thechalk kernel: [drm:assert_plane.constprop.80 [i915]] *ERROR* plane A assertion failure (expected on, current off)
May 26 21:22:41 thechalk kernel: [drm:assert_plane.constprop.80 [i915]] *ERROR* plane A assertion failure (expected on, current off)
May 26 21:23:09 thechalk kernel: [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
May 26 21:49:37 thechalk kernel: [drm:assert_plane.constprop.80 [i915]] *ERROR* plane A assertion failure (expected on, current off)
May 26 21:50:01 thechalk kernel: [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

$ lspci
00:00.0 Host bridge: Intel Corporation Broadwell-U Host Bridge -OPI (rev 09)
00:02.0 VGA compatible controller: Intel Corporation Broadwell-U Integrated Graphics (rev 09)
00:03.0 Audio device: Intel Corporation Broadwell-U Audio Controller (rev 09)
00:14.0 USB controller: Intel Corporation Wildcat Point-LP USB xHCI Controller (rev 03)
00:16.0 Communication controller: Intel Corporation Wildcat Point-LP MEI Controller #1 (rev 03)
00:1c.0 PCI bridge: Intel Corporation Wildcat Point-LP PCI Express Root Port #2 (rev e3)
00:1c.2 PCI bridge: Intel Corporation Wildcat Point-LP PCI Express Root Port #3 (rev e3)
00:1f.0 ISA bridge: Intel Corporation Wildcat Point-LP LPC Controller (rev 03)
00:1f.2 SATA controller: Intel Corporation Wildcat Point-LP SATA Controller [AHCI Mode] (rev 03)
00:1f.3 SMBus: Intel Corporation Wildcat Point-LP SMBus Controller (rev 03)
01:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5227 PCI Express Card Reader (rev 01)
02:00.0 Network controller: Intel Corporation Wireless 7265 (rev 50)

Comment 15 Nick Coghlan 2015-05-26 12:18:49 UTC
Additional note: resizing a window under KDE by dragging its lower right corner seems to have a fairly high chance of triggering this.

Comment 16 Nick Coghlan 2015-05-26 12:21:36 UTC
*** Bug 1222273 has been marked as a duplicate of this bug. ***

Comment 17 Nick Coghlan 2015-05-28 04:27:30 UTC
Inspired by Will Foster's feedback above, I'm trying turning off Compiz in the KDE System Settings (System Settings -> Display and Monitor -> Compositor -> uncheck "Enable compositor at startup")

As an interim workaround, it's currently looking promising.

Comment 18 Nick Coghlan 2015-05-28 05:56:24 UTC
And... there's another system freeze, even with the compositor switched off.

As a further data point, I'm running this system with both the laptop's internal monitor and an external HDMI monitor. When X freezes, I *can* get to a virtual console with Ctrl-Alt-F3, but:

1. When I do this, the internal monitor goes completely haywire with the screen scrolling horizontally at high speed. (The external monitor is fine)

2. If I just try restarting X via "sudo systemctl restart gdm.service", *both* monitors go haywire.

So it isn't clear at this point if the messages from the [i915] driver are related to the original system freeze or not.

Comment 19 Dave Airlie 2015-05-28 07:02:20 UTC
Can you try xorg-x11-drv-intel-2.99.917-9.20150520.fc22 to see if it still happens?

I haven't got a Broadwell here to reproduce things on.

Comment 20 AGS 2015-05-29 17:23:50 UTC
This also affects me, and when the crashes and/or lockups happen I see these messages scroll by several times.

Comment 21 Nick Coghlan 2015-06-01 04:46:22 UTC
OK, I'll try that one tonight.

Comment 23 Nick Coghlan 2015-06-03 05:41:40 UTC
I brought my system into work so Dave could have a look at it.

1. The virtual console issue goes away if the system is started with "i915.enable_ips=0". That suggests this is due to a known bug that will be fixed in a future driver release.

2. The simplest reproducer I have found for the original lockup bug is to:

a. Open Gwenview
b. Drag the bottom right corner to resize the window

(Resizing Konsole and Dolphin doesn't appear to cause this problem)

Comment 24 Jean-Christophe Baptiste 2015-06-25 18:32:33 UTC
I also encountered the bug today on Fedora 22.

I can reproduce it every time, for now, by simply watching a video in full screen.

Here is the sequence I get in dmesg :

[63143.747468] [drm:drm_edid_block_valid [drm]] *ERROR* EDID checksum is invalid, remainder is 129
[63144.010463] [drm:drm_edid_block_valid [drm]] *ERROR* EDID checksum is invalid, remainder is 129
[63156.203468] [drm:drm_edid_block_valid [drm]] *ERROR* EDID checksum is invalid, remainder is 10
[66021.579426] [drm:assert_plane.constprop.80 [i915]] *ERROR* plane A assertion failure (expected on, current off)

Comment 25 Jean-Christophe Baptiste 2015-06-25 18:38:11 UTC
I can get a full Xorg crash only if I play the video in full screen on my external HDMI screen while playing some Flash in the browser on the other screen.

If I "only" play the video on the HDMI screen, only totem crashes:

[66384.617157] totem[22391]: segfault at 208 ip 00007fc180471b8c sp 00007ffd08acb600 error 4 in i965_dri.so[7fc180105000+55f000]


I also forgot to give my config: Lenovo Thinkpad S540

Comment 26 Jean-Christophe Baptiste 2015-06-25 18:44:01 UTC
After a reboot and several tests, I can confirm that "i915.enable_ips=0" seems to fix the issue.
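
To confirm that a setting such as enable_ips=0 actually took effect after the reboot, the values the loaded module is using can be read back from sysfs. This is a minimal sketch: show_module_params is a hypothetical helper name, and which parameter files exist depends on the kernel version, so a missing file is reported rather than treated as an error.

```shell
# show_module_params DIR NAME...
# Print "name=value" for each parameter file NAME found in DIR,
# or "name=<not available>" when the file is missing or unreadable.
show_module_params() {
    local dir="$1"
    shift
    local name
    for name in "$@"; do
        if [ -r "$dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s=<not available>\n' "$name"
        fi
    done
}
```

Possible usage on an affected machine: `show_module_params /sys/module/i915/parameters enable_ips enable_psr`.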

Comment 27 Nick Coghlan 2015-06-28 02:57:34 UTC
I spent a few minutes today attempting to reproduce the original crash I observed when resizing the Gwenview window, without success. While that was never 100% reproducible, I'd always previously been able to trigger it with a minute or two of resizing the window.

I've seen a few updates to the driver go by in recent weeks, so it seems plausible one of those may have addressed the specific issue that was being triggered there. Current driver version:

$ rpm -qa xorg-x11-drv-intel
xorg-x11-drv-intel-2.99.917-12.20150615.fc22.x86_64

That particular problem used to happen even with enable_ips=0 though, so this doesn't invalidate Jean-Christophe's update above regarding ongoing problems with "enable_ips=1".

Comment 28 Danny Ciarniello 2015-08-02 07:10:29 UTC
I've just updated from F21 to F22 and been bitten by this bug.  (Actually, my situation more closely fits the description given in bug 1222273 but since that bug has been closed as a duplicate of this one, I'm adding my comment here.)

The plasma desktop will lock up on me when the screen is switched off if I have one of the "monitor" widgets (i. e. CPU Load Monitor, Hard Disk I/O Load Monitor or Network Monitor) on the desktop.  This doesn't seem to be a necessary condition since I've had lock ups without these widgets but it certainly seems to be a sufficient condition since I've had lock ups every time with one of those widgets installed.

I can get out of it without rebooting by switching to another virtual console and doing a "killall startkde".

What I haven't seen is anything in the logs, including any mention of the i915 chipset, that would indicate what the problem might be.

Comment 29 H.G.Blob 2015-09-04 19:42:56 UTC
(In reply to Danny Ciarniello from comment #28)
> The plasma desktop will lock up on me when the screen is switched off if I
> have one of the "monitor" widgets (i. e. CPU Load Monitor, Hard Disk I/O
> Load Monitor or Network Monitor) on the desktop.  This doesn't seem to be a
> necessary condition since I've had lock ups without these widgets but it
> certainly seems to be a sufficient condition since I've had lock ups every
> time with one of those widgets installed.

I have had exactly the same problem. Apparently it's related to the now infamous i915 bug that causes weird problems in plasma. The solution as suggested on the KDE mailing list at https://mail.kde.org/pipermail/kde-distro-packagers/2015-August/000088.html is to switch AccelMethod in your xorg.conf from sna to uxa.

Hope this helps.
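
With several commented-out AccelMethod lines in one file, as in comment 3, it is easy to lose track of which one is live. The sketch below prints the AccelMethod from the last uncommented Option line of a config file. The function name active_accel_method is invented for illustration, and it only inspects the file text; it assumes the option is written with the capitalization used in this report and does not verify what the running X server actually selected.

```shell
# active_accel_method [FILE...]
# Print the value of the last uncommented Option "AccelMethod" line
# in the given xorg.conf snippet(s), or on stdin when no file is given.
active_accel_method() {
    sed -nE 's/^[[:space:]]*Option[[:space:]]+"AccelMethod"[[:space:]]+"([^"]+)".*$/\1/p' "$@" \
        | tail -n 1
}
```

Possible usage with the file path from comment 3: `active_accel_method /etc/X11/xorg.conf.d/20-intel.conf`.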

Comment 30 Danny Ciarniello 2015-09-08 02:42:14 UTC
Yes, that did indeed help.  With AccelMethod set to uxa, the plasma desktop is no longer locking up when the screen is switched off.

Unfortunately, it doesn't prevent the kf5-activities bug (e. g. bug 1233653).

Comment 31 Nick Coghlan 2015-09-14 06:40:54 UTC
Interesting. I'm not getting lockups at the moment, but I *am* regularly getting Plasma crashes (with Plasma then restarting smoothly). I've applied the suggested UXA workaround, and will see if that improves things.

Comment 32 Mihai Harpau 2015-09-14 18:20:01 UTC
I would suggest applying the mesa update FEDORA-2015-15543.
In my case it cleared up all the problems related to the "infamous i915 bug".
It also seems to be the correct fix for the Plasma crashes.

Comment 33 Danny Ciarniello 2015-09-20 16:55:46 UTC
Unfortunately, that didn't work for me.  Plasma still crashes on logout.

Comment 34 EMR_Fedora 2015-09-23 22:44:35 UTC
I just got a mesa/plasma update yesterday, and even this UXA workaround doesn't work. I need to revert to LxQt to be able to get my work done. It seems that, whatever it was, it was a regression.

Comment 35 Will Foster 2015-09-24 14:18:10 UTC
As an update:

I've moved back to the default xorg settings, using SNA acceleration again, and I'm unable to reproduce this on the latest Fedora 22 with all updates, e.g.

xorg-x11-drv-intel-2.99.917-15.20150729.fc22.x86_64
xorg-x11-server-Xorg-1.17.2-2.fc22.x86_64
4.1.4-200.fc22.x86_64 (4.1.7-200 is the latest, but I have another issue with hybrid-suspend and think systemd lid-sleep has regressed in later kernels, causing logind not to perform actions; that is unrelated to this bug).

The following kernel parameters are in place, however. I'm not sure whether they fixed it, but it has not occurred for me in quite some time.

--snip /etc/default/grub --
i915.semaphores=1 i915.modeset=1 i915.use_mmio_flip=1 i915.powersave=1 i915.enable_ips=1 i915.disable_power_well=1 i915.enable_hangcheck=1 i915.enable_cmd_parser=1 i915.fastboot=0 i915.enable_ppgtt=1 i915.reset=0 i915.lvds_use_ssc=0 i915.enable_psr=0 vblank_mode=0 i915.i915_enable_rc6=1
--snip --

I am using XFCE with KWIN now for compositing/window management.

Comment 36 Nick Coghlan 2015-09-30 10:06:07 UTC
The UXA workaround did seem to help for me. I've subsequently upgraded to the F23 Beta, and haven't encountered any further problems (however, that upgrade changes so many components that it's impossible to use it to determine whether the original root cause has been addressed, or if other components are just handling it better).

Comment 37 Fedora End Of Life 2016-07-19 12:45:49 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

