Bug 1441906 - unable to handle kernel NULL pointer dereference at 0000000000000018 [i915]
Summary: unable to handle kernel NULL pointer dereference at 0000000000000018 [i915]
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-intel
Version: 25
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Adam Jackson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-04-13 04:24 UTC by Daniel Wang
Modified: 2017-12-12 13:22 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-12 10:19:20 UTC
Type: Bug
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
FreeDesktop.org 100516 0 None None None 2017-04-13 04:24:49 UTC

Description Daniel Wang 2017-04-13 04:24:50 UTC
Description of problem: A few times a day, video freezes (the mouse pointer and everything else on screen stop moving) and input hardware (USB keyboard/mouse, built-in touchpad, Ctrl-Alt-Del) stops responding. This is a known problem upstream, but the fix is not planned until kernel 4.12.
Upstream bug: https://bugs.freedesktop.org/show_bug.cgi?id=100516

log:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
 IP: gen8_ppgtt_alloc_page_directories.isra.36+0x115/0x250 [i915]
 PGD 190aa1067 
 PUD 0 
 
 Oops: 0002 [#1] SMP
 Modules linked in: rfcomm ccm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcas
  dcdbas irqbypass crct10dif_pclmul crc32_pclmul iwlmvm ghash_clmulni_intel intel_cstate intel_uncore mac80211 intel_rapl_perf snd
  i2c_designware_core int340x_thermal_zone int3406_thermal industrialio intel_soc_dts_iosf tpm_crb intel_hid tpm_tis sparse_keymap
 CPU: 3 PID: 3043 Comm: chrome Tainted: G     U     OE   4.10.5-200.fc25.x86_64 #1
 Hardware name: Dell Inc. XPS 13 9343/0310JH, BIOS A07 11/11/2015
 task: ffff904120bbcb00 task.stack: ffffb5760817c000
 RIP: 0010:gen8_ppgtt_alloc_page_directories.isra.36+0x115/0x250 [i915]
 RSP: 0018:ffffb5760817f858 EFLAGS: 00010246
 RAX: ffff9040595ef6c0 RBX: 0000000000000003 RCX: 0000000000000003
 RDX: 0000000000000000 RSI: ffff904110b2d000 RDI: ffff904190288000
 RBP: ffffb5760817f8b0 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000001 R12: ffff903fcd0ee000
 R13: ffff9040d7fe9690 R14: 00000000fffbf000 R15: 0000000000008000
 FS:  00007f54bd60cf80(0000) GS:ffff90419f580000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000018 CR3: 0000000185d28000 CR4: 00000000003426e0
 Call Trace:
  gen8_alloc_va_range_3lvl+0xfb/0x9e0 [i915]
  ? __alloc_pages_nodemask+0x122/0x2c0
  gen8_alloc_va_range+0x23d/0x470 [i915]
  i915_vma_bind+0x7e/0x170 [i915]
  __i915_vma_do_pin+0x2f1/0x4a0 [i915]
  i915_gem_execbuffer_reserve_vma.isra.30+0x144/0x1b0 [i915]
  i915_gem_execbuffer_reserve.isra.31+0x44a/0x480 [i915]
  i915_gem_do_execbuffer.isra.37+0x652/0x1820 [i915]
  ? radix_tree_lookup_slot+0x22/0x50
  ? shmem_getpage_gfp+0xdd/0xc90
  i915_gem_execbuffer2+0xc5/0x240 [i915]
  drm_ioctl+0x21b/0x4c0 [drm]
  ? file_update_time+0x5e/0x110
  ? i915_gem_execbuffer+0x310/0x310 [i915]
  do_vfs_ioctl+0xa3/0x5f0
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x67/0x180
  entry_SYSCALL64_slow_path+0x25/0x25
 RIP: 0033:0x7f54b6c81787
 RSP: 002b:00007ffea9615888 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
 RAX: ffffffffffffffda RBX: 0000000572317800 RCX: 00007f54b6c81787
 RDX: 00007ffea96158d0 RSI: 00000000c0406469 RDI: 0000000000000010
 RBP: 00007ffea96158d0 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000038 R11: 0000000000000246 R12: 00000000c0406469
 R13: 0000000000000010 R14: 0000000000000000 R15: 0000000000000000
 Code: e6 48 8b 90 20 03 00 00 48 8b b8 d8 02 00 00 48 8b 52 08 48 83 ca 03 e8 aa cc ff ff 48 8b 45 b0 48 8b 4d c8 48 8b 10 48 8b 
 RIP: gen8_ppgtt_alloc_page_directories.isra.36+0x115/0x250 [i915] RSP: ffffb5760817f858
 CR2: 0000000000000018
 ---[ end trace 108fcbb7cc9aa151 ]---
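
Several traces in this report name the same faulting function with different compiler suffixes and offsets (for example `.isra.36+0x115` here and `.isra.41+0xd9` in comment 26). A minimal Python sketch for pulling the faulting location out of an `IP:` or `RIP:` line, so traces from different builds can be compared, might look like the following. The helper name `parse_fault_location` and the set of stripped suffixes are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Matches the "symbol+0xOFFSET/0xSIZE [module]" part of an oops IP/RIP line.
LOCATION_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# Compiler-generated clone suffixes such as ".isra.36"; the number changes
# between builds, so it is stripped before comparing traces (assumption:
# only these three suffix kinds are of interest here).
CLONE_SUFFIX_RE = re.compile(r"\.(?:isra|constprop|part)\.\d+")


def parse_fault_location(line):
    """Return the faulting symbol, offset, size and module, or None."""
    match = LOCATION_RE.search(line)
    if match is None:
        return None
    symbol = match.group("symbol")
    return {
        "symbol": symbol,
        "base_symbol": CLONE_SUFFIX_RE.sub("", symbol),
        "offset": int(match.group("offset"), 16),
        "size": int(match.group("size"), 16),
        "module": match.group("module"),
    }


if __name__ == "__main__":
    line = " IP: gen8_ppgtt_alloc_page_directories.isra.36+0x115/0x250 [i915]"
    print(parse_fault_location(line))
```

The same function also accepts journal-prefixed lines such as the ones in comments 8 and 26, because it searches for the location pattern anywhere in the line.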


Version-Release number of selected component (if applicable):
4.10.5-200.fc25.x86_64
4.10.9-200.fc25.x86_64

How reproducible:
Happens multiple times a day (sometimes within 30 minutes of boot, sometimes after a few hours).

Steps to Reproduce:
1. Boot into GNOME on Wayland or Xorg (both exhibit the problem)
2. Use a browser (this seems to make things happen earlier)
3. Eventually video and input crash, but network access (ssh) remains functional

Actual results:
video, input freeze

Expected results:
no crashes

Additional info:
Hardware: Dell XPS 13 (9343)
Any chance of a bandaid fix backported?

Comment 1 Nick Byrne 2017-04-20 14:09:16 UTC
I'm seeing this as well on Lenovo S3 Yoga Thinkpad

Comment 2 Dan Trainor 2017-04-20 18:49:58 UTC
Same, on Carbon X1.

Comment 3 Robert Holmes 2017-04-23 07:34:41 UTC
Also hitting this infuriating crash on an ASUS N552VX.

The upstream bug report also offers a pre-4.12 workaround by disabling memory reclaim: https://bugs.freedesktop.org/show_bug.cgi?id=99295#c22

Comment 4 Radek Novacek 2017-04-28 07:26:42 UTC
It happens to me every day at least once on HP Spectre x360 (Intel 7500U with integrated graphic card). Backport or workaround would be much appreciated.

Comment 5 Raf 2017-05-02 12:16:35 UTC
The same on: Dell XPS 15 9560 (early 2017), intel i7 7700. + Two external monitors, one on HDMI, the second one VGA via usb C (dell's USB-C -> LAN/HDMI/VGA/USB adaptor)  

System: Fedora 25
kernel: 4.10.11-200.fc25.x86_64
DM: KDE

Random crashes, sometimes a few days without, sometimes twice a day.

Comment 6 Johannes 2017-05-03 15:03:22 UTC
Same on Dell XPS 13 9350 even without any external monitors connected.
Fedora 24.

Comment 7 Johannes 2017-05-03 15:03:57 UTC
Same on Dell XPS 13 9350 even without any external monitors connected.
Fedora 25.

Comment 8 Jan-Philip Gehrcke 2017-05-07 16:28:37 UTC
Ran into this many times with Fedora 25 on a ThinkPad T450s. Last time yesterday, with kernel 4.10.11:

    May 06 16:34:43 jp-t450s kernel: IP: gen8_ppgtt_alloc_page_directories.isra.36+0x115/0x250 [i915]
    May 06 16:34:44 jp-t450s kernel: PGD 0 
    May 06 16:34:44 jp-t450s kernel: 
    May 06 16:34:44 jp-t450s kernel: Oops: 0002 [#1] SMP
    May 06 16:34:44 jp-t450s kernel: Modules linked in: uas usb_storage mmc_block xt_nat veth nf_conntrack_netlink xt_addrtype br_netfilter overlay rfcomm fuse ccm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp llc ebtable_nat ip6table_security ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw iptable_security iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep arc4 intel_rapl mei_wdt iwlmvm x86_pkg_temp_thermal intel_powerclamp mac80211 coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support iwlwifi irqbypass intel_cstate
    May 06 16:34:44 jp-t450s kernel:  intel_uncore intel_rapl_perf joydev uvcvideo cfg80211 snd_hda_codec_realtek videobuf2_vmalloc snd_hda_codec_hdmi snd_hda_codec_generic videobuf2_memops videobuf2_v4l2 rtsx_pci_ms btusb memstick videobuf2_core btrtl i2c_i801 mei_me snd_hda_intel btbcm intel_pch_thermal lpc_ich btintel shpchp videodev mei snd_hda_codec bluetooth media snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer thinkpad_acpi wmi snd soundcore rfkill tpm_tis tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_crypt hid_logitech_hidpp hid_logitech_dj i915 rtsx_pci_sdmmc mmc_core crct10dif_pclmul i2c_algo_bit crc32_pclmul drm_kms_helper crc32c_intel ghash_clmulni_intel drm e1000e serio_raw rtsx_pci ptp pps_core fjes video
    May 06 16:34:44 jp-t450s kernel: CPU: 0 PID: 15507 Comm: chrome Not tainted 4.10.11-200.fc25.x86_64 #1
    May 06 16:34:44 jp-t450s kernel: Hardware name: LENOVO 20BX0011GE/20BX0011GE, BIOS JBET51WW (1.16 ) 07/08/2015
    May 06 16:34:44 jp-t450s kernel: task: ffff96dde3daa580 task.stack: ffffaff9c9704000
    May 06 16:34:44 jp-t450s kernel: RIP: 0010:gen8_ppgtt_alloc_page_directories.isra.36+0x115/0x250 [i915]
    May 06 16:34:44 jp-t450s kernel: RSP: 0000:ffffaff9c9707858 EFLAGS: 00010246
    May 06 16:34:44 jp-t450s kernel: RAX: ffff96ddb1af7fc0 RBX: 0000000000000003 RCX: 0000000000000003
    May 06 16:34:44 jp-t450s kernel: RDX: 0000000000000000 RSI: ffff96dd8d204000 RDI: ffff96de2a6b0000
    May 06 16:34:44 jp-t450s kernel: RBP: ffffaff9c97078b0 R08: 0000000000000000 R09: 0000000000000000
    May 06 16:34:44 jp-t450s kernel: R10: 0000000000000000 R11: 0000000000000041 R12: ffff96de2d200000
    May 06 16:34:44 jp-t450s kernel: R13: ffff96dc97bf6df0 R14: 00000000ffff5000 R15: 0000000000002000
    May 06 16:34:44 jp-t450s kernel: FS:  00007f8bff5e0f80(0000) GS:ffff96de3dc00000(0000) knlGS:0000000000000000
    May 06 16:34:44 jp-t450s kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May 06 16:34:44 jp-t450s kernel: CR2: 0000000000000018 CR3: 00000002c4f7d000 CR4: 00000000003406f0
    May 06 16:34:44 jp-t450s kernel: Call Trace:
    May 06 16:34:44 jp-t450s kernel:  gen8_alloc_va_range_3lvl+0xfb/0x9e0 [i915]
    May 06 16:34:44 jp-t450s kernel:  ? shmem_getpage_gfp+0xdd/0xc90
    May 06 16:34:44 jp-t450s kernel:  ? sg_init_table+0x1a/0x40
    May 06 16:34:44 jp-t450s kernel:  ? swiotlb_map_sg_attrs+0x49/0x110
    May 06 16:34:44 jp-t450s kernel:  gen8_alloc_va_range+0x23d/0x470 [i915]
    May 06 16:34:44 jp-t450s kernel:  i915_vma_bind+0x7e/0x170 [i915]
    May 06 16:34:44 jp-t450s kernel:  __i915_vma_do_pin+0x2f1/0x4a0 [i915]
    May 06 16:34:44 jp-t450s kernel:  i915_gem_execbuffer_reserve_vma.isra.30+0x144/0x1b0 [i915]
    May 06 16:34:44 jp-t450s kernel:  i915_gem_execbuffer_reserve.isra.31+0x44a/0x480 [i915]
    May 06 16:34:44 jp-t450s kernel:  i915_gem_do_execbuffer.isra.37+0x652/0x16c0 [i915]
    May 06 16:34:44 jp-t450s kernel:  ? ___slab_alloc+0x294/0x540
    May 06 16:34:44 jp-t450s kernel:  ? radix_tree_lookup_slot+0x22/0x50
    May 06 16:34:44 jp-t450s kernel:  ? shmem_getpage_gfp+0xdd/0xc90
    May 06 16:34:44 jp-t450s kernel:  ? drm_gem_object_free+0x29/0x70 [drm]
    May 06 16:34:44 jp-t450s kernel:  i915_gem_execbuffer2+0xc5/0x240 [i915]
    May 06 16:34:44 jp-t450s kernel:  drm_ioctl+0x21b/0x4c0 [drm]
    May 06 16:34:44 jp-t450s kernel:  ? file_update_time+0x5e/0x110
    May 06 16:34:44 jp-t450s kernel:  ? i915_gem_execbuffer+0x310/0x310 [i915]
    May 06 16:34:44 jp-t450s kernel:  do_vfs_ioctl+0xa3/0x5f0
    May 06 16:34:44 jp-t450s kernel:  SyS_ioctl+0x79/0x90
    May 06 16:34:44 jp-t450s kernel:  do_syscall_64+0x67/0x180
    May 06 16:34:44 jp-t450s kernel:  entry_SYSCALL64_slow_path+0x25/0x25

Comment 9 Larry O'Leary 2017-05-12 21:20:42 UTC
I'm seeing this on Lenovo T460p. Has happened a few times now.

Comment 10 Benjamin Herrenschmidt 2017-05-17 04:27:35 UTC
This is happening on all the ThinkPads in the lab too. This is basically making every Intel-based machine lock up randomly.

I don't understand how somebody thinks it's ok to have two major kernel versions do that without any intention to backport the fix. Ugh.

Comment 11 Josh Boyer 2017-05-17 12:19:38 UTC
commit e2b763caa6eb68ea56918ee6f79b40b82bdcf7c9
Author: Chris Wilson <chris.uk>
Date:   Wed Feb 15 08:43:48 2017 +0000

    drm/i915: Remove bitmap tracking for used-pdpes

is the highlighted fix from the upstream bug

commit bf75d59eff679d2e2b7af5c6958a088f8a458f7a
Author: Chris Wilson <chris.uk>
Date:   Mon Feb 27 12:26:52 2017 +0000

    drm/i915: Only unwind the local pgtable layer if empty

is a follow on fix.  Both are in 4.12-rc1.  It would likely be a good test if someone could try the rawhide 4.12-rc1 kernel and see if this issue is resolved.

Comment 12 Tim Niemueller 2017-05-17 12:27:21 UTC
I'm on 4.9.10-200.fc25.x86_64 in the meantime and I haven't experienced any crashes so far (in about a day which already is above average compared to 4.10). I currently do not have the time to try 4.12-rc1.

Comment 13 Jan-Philip Gehrcke 2017-05-17 12:31:02 UTC
I am on 4.10.x and use my system daily for many hours. I have experienced said crash only ~10 times in many weeks. So, in certain environments the probability to run into this is very low. Not experiencing this during "one day of testing" does not allow for drawing conclusions.
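
The point about one day of testing can be made concrete with a rough calculation. The sketch below assumes crashes arrive independently at a constant rate (a Poisson model) and reads "~10 times in many weeks" as 10 crashes in 6 weeks; both the model and the 6-week figure are assumptions for illustration, not measurements from this report.

```python
import math


def prob_no_crash(rate_per_day, days):
    """Chance of seeing zero crashes in `days` if the bug is still present."""
    return math.exp(-rate_per_day * days)


def days_for_confidence(rate_per_day, confidence=0.95):
    """Crash-free days needed before 'no crash' is unlikely to be luck."""
    return -math.log(1.0 - confidence) / rate_per_day


# Assumed figures: 10 crashes over 6 weeks of daily use.
ASSUMED_CRASHES = 10
ASSUMED_DAYS = 6 * 7
rate = ASSUMED_CRASHES / ASSUMED_DAYS

if __name__ == "__main__":
    print(f"P(no crash in 1 day): {prob_no_crash(rate, 1):.2f}")       # 0.79
    print(f"Days for 95% confidence: {days_for_confidence(rate):.1f}")  # 12.6
```

Under these assumptions a still-buggy kernel would get through a single day without a crash about 79% of the time, and roughly 12 to 13 crash-free days would be needed before the absence of crashes says much.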

Comment 14 Gordon Messmer 2017-05-17 14:02:24 UTC
I don't see rc1 in the mirrors yet.  Is this a good kernel to test?

https://kojipkgs.fedoraproject.org//packages/kernel/4.12.0/0.rc1.git1.1.fc27/x86_64/

Comment 15 Josh Boyer 2017-05-17 14:11:05 UTC
(In reply to Gordon Messmer from comment #14)
> I don't see rc1 in the mirrors yet.  Is this a good kernel to test?
> 
> https://kojipkgs.fedoraproject.org//packages/kernel/4.12.0/0.rc1.git1.1.fc27/
> x86_64/

Yes.  Or alternatively the nodebug version of that, available as described here:


https://fedoraproject.org/wiki/RawhideKernelNodebug

Comment 16 Gordon Messmer 2017-05-17 23:46:15 UTC
Thanks, I wasn't aware of that option!

I'd already loaded the rawhide kernel, this time.  8 hours, so far.  No crashes or hang yet after several tty switches and suspend/resume cycles.  I'll check later whether or not I have logs that might indicate how frequently I was seeing this problem before.  The other symptom I saw under 4.10 was a hang if I left the screen locked overnight.

So far, so good.

Comment 17 Gordon Messmer 2017-05-19 06:55:05 UTC
I think this bug may have been a lot less frequent than I previously thought.  My first record of this bug actually happened under 4.9.9, which I later mistakenly believed to be unaffected.  I thought only 4.10 had the problem.  Between the 5th and the 17th, I continued running 4.9.9 without seeing a hang, whereas I ran 4.10 for just two days before it hung.

...which is to say that it'll take a good deal longer before I can say whether or not the problem appears to be fixed in 4.12.

Comment 18 Jean-Christophe Baptiste 2017-05-23 10:11:54 UTC
I am experiencing this bug daily, especially during heavy I/O tasks (usb transfer, multiple vms, etc.).

Comment 19 Jean-Christophe Baptiste 2017-05-23 10:12:38 UTC
I am experiencing this bug daily (Fedora 25 up-to-date), especially during heavy I/O tasks (usb transfer, multiple vms, etc.).
Testing rawhide 4.12 kernel for now.

Comment 20 Dave Airlie 2017-05-23 19:48:42 UTC
https://kojipkgs.fedoraproject.org/scratch/airlied/task_19697417/

can someone give this 4.11 kernel a try?

Comment 21 Gordon Messmer 2017-05-24 21:42:13 UTC
Dave, I'll start running your kernel today.

Comment 22 Aaron Sowry 2017-05-25 04:19:35 UTC
(In reply to Gordon Messmer from comment #21)
> Dave, I'll start running your kernel today.

Ditto.

Comment 23 Jean-Christophe Baptiste 2017-05-25 15:15:15 UTC
Dave, I have not tried your kernel, but since then I have upgraded to F26 and the 4.11 kernel in there.
So far, the bug hasn't occurred.

Comment 24 Radek Novacek 2017-05-26 09:57:56 UTC
I've been running the kernel from Dave (comment #20) for two days without any problem. It used to crash at least once a day with the affected kernel, and it has now run twice as long without a crash.

I had to disable Secure Boot to use that kernel. I guess that's expected for unofficial kernel.

Comment 25 Radek Novacek 2017-05-31 06:57:15 UTC
It's been more than six days straight without a crash. It seems that the kernel from comment #20 indeed fixes this issue. Thanks a lot, those crashes were really annoying.

When can we expect this fix in official kernel?

Comment 26 Joe Doss 2017-06-03 21:14:30 UTC
I am still seeing this issue with Fedora 26 and 4.11.3-300.fc26.x86_64 with any heavy I/O on my ThinkPad X1 Carbon (gen 2).

[jdoss@sts1 ~]$ uname -a
Linux sts1.inf7.net 4.11.3-300.fc26.x86_64 #1 SMP Thu May 25 18:43:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
 
[jdoss@sts1 ~]$ cat /etc/redhat-release 
Fedora release 26 (Twenty Six)


Jun 03 16:01:27 sts1.inf7.net kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
Jun 03 16:01:29 sts1.inf7.net kernel: IP: gen8_ppgtt_alloc_page_directories.isra.41+0xd9/0x250 [i915]
Jun 03 16:01:29 sts1.inf7.net kernel: PGD 0 
Jun 03 16:01:33 sts1.inf7.net kernel: 
Jun 03 16:01:33 sts1.inf7.net kernel: Oops: 0002 [#1] SMP
Jun 03 16:01:33 sts1.inf7.net kernel: Modules linked in: rfcomm fuse ccm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun xt_addrtype nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ip_set nfnetlink xt_conntrack ebtable_nat ebtable_broute br_netfilter bridge stp llc overlay ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep sunrpc vfat fat arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iwlmvm kvm mac80211 uvcvideo iTCO_wdt irqbypass videobuf2_vmalloc iTCO_vendor_support mei_wdt videobuf2_memops videobuf2_v4l2
Jun 03 16:01:33 sts1.inf7.net kernel:  intel_cstate intel_uncore videobuf2_core iwlwifi videodev intel_rapl_perf btusb media btrtl btbcm snd_hda_codec_hdmi btintel cfg80211 snd_hda_codec_realtek bluetooth snd_hda_codec_generic joydev snd_hda_intel snd_hda_codec thinkpad_acpi snd_hda_core wmi snd_hwdep rfkill snd_seq snd_seq_device snd_pcm mei_me snd_timer snd mei intel_pch_thermal soundcore i2c_i801 lpc_ich shpchp tpm_tis tpm_tis_core tpm dm_crypt i915 crct10dif_pclmul crc32_pclmul crc32c_intel i2c_algo_bit ghash_clmulni_intel drm_kms_helper drm e1000e serio_raw ptp pps_core video
Jun 03 16:01:34 sts1.inf7.net kernel: CPU: 3 PID: 2646 Comm: chrome Not tainted 4.11.3-300.fc26.x86_64 #1
Jun 03 16:01:34 sts1.inf7.net kernel: Hardware name: LENOVO 20BSCTO1WW/20BSCTO1WW, BIOS N14ET36W (1.14 ) 07/14/2016
Jun 03 16:01:34 sts1.inf7.net kernel: task: ffff97855d6a4b00 task.stack: ffffb3cbc2de8000
Jun 03 16:01:34 sts1.inf7.net kernel: RIP: 0010:gen8_ppgtt_alloc_page_directories.isra.41+0xd9/0x250 [i915]
Jun 03 16:01:34 sts1.inf7.net kernel: RSP: 0018:ffffb3cbc2deb890 EFLAGS: 00010246
Jun 03 16:01:34 sts1.inf7.net kernel: RAX: ffff97851269c000 RBX: 0000000000008000 RCX: 0000000000000018
Jun 03 16:01:34 sts1.inf7.net kernel: RDX: 0000000000000000 RSI: ffff978536da2000 RDI: ffff9785be6f8000
Jun 03 16:01:34 sts1.inf7.net kernel: RBP: ffffb3cbc2deb8f0 R08: 0000000000000000 R09: 0000000000000000
Jun 03 16:01:34 sts1.inf7.net kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003
Jun 03 16:01:34 sts1.inf7.net kernel: R13: 0000000000000003 R14: ffff97855d59a000 R15: 00000000ffff7000
Jun 03 16:01:34 sts1.inf7.net kernel: FS:  00007f3c043c2ac0(0000) GS:ffff9785cdcc0000(0000) knlGS:0000000000000000
Jun 03 16:01:35 sts1.inf7.net kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 03 16:01:35 sts1.inf7.net kernel: CR2: 0000000000000018 CR3: 00000001d606a000 CR4: 00000000003426e0
Jun 03 16:01:35 sts1.inf7.net kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 03 16:01:35 sts1.inf7.net kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jun 03 16:01:35 sts1.inf7.net kernel: Call Trace:
Jun 03 16:01:35 sts1.inf7.net kernel:  gen8_alloc_va_range_3lvl+0xcc/0x950 [i915]
Jun 03 16:01:35 sts1.inf7.net kernel:  ? __kmalloc+0x185/0x210
Jun 03 16:01:35 sts1.inf7.net kernel:  ? sg_kmalloc+0x19/0x30
Jun 03 16:01:35 sts1.inf7.net kernel:  gen8_alloc_va_range+0x282/0x440 [i915]
Jun 03 16:01:35 sts1.inf7.net kernel:  i915_vma_bind+0x7e/0x170 [i915]
Jun 03 16:01:35 sts1.inf7.net kernel:  __i915_vma_do_pin+0x396/0x450 [i915]
Jun 03 16:01:35 sts1.inf7.net kernel:  i915_gem_execbuffer_reserve_vma.isra.29+0xbe/0x1b0 [i915]
Jun 03 16:01:35 sts1.inf7.net kernel:  i915_gem_execbuffer_reserve.isra.30+0x41b/0x4a0 [i915]
Jun 03 16:01:35 sts1.inf7.net kernel:  i915_gem_do_execbuffer.isra.36+0x4f3/0x1540 [i915]
Jun 03 16:01:35 sts1.inf7.net kernel:  ? find_get_entry+0x20/0x170
Jun 03 16:01:35 sts1.inf7.net kernel:  ? find_lock_entry+0x5b/0x150
Jun 03 16:01:35 sts1.inf7.net kernel:  i915_gem_execbuffer2+0xc5/0x240 [i915]
Jun 03 16:01:35 sts1.inf7.net kernel:  drm_ioctl+0x212/0x4d0 [drm]
Jun 03 16:01:35 sts1.inf7.net kernel:  ? i915_gem_execbuffer+0x320/0x320 [i915]
Jun 03 16:01:35 sts1.inf7.net kernel:  do_vfs_ioctl+0xa5/0x600
Jun 03 16:01:35 sts1.inf7.net kernel:  ? security_file_ioctl+0x43/0x60
Jun 03 16:01:35 sts1.inf7.net kernel:  SyS_ioctl+0x79/0x90
Jun 03 16:01:35 sts1.inf7.net kernel:  do_syscall_64+0x67/0x170
Jun 03 16:01:35 sts1.inf7.net kernel:  entry_SYSCALL64_slow_path+0x25/0x25
Jun 03 16:01:35 sts1.inf7.net kernel: RIP: 0033:0x7f3bfda0f837
Jun 03 16:01:35 sts1.inf7.net kernel: RSP: 002b:00007ffed1c53cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jun 03 16:01:35 sts1.inf7.net kernel: RAX: ffffffffffffffda RBX: 000003ffb3985030 RCX: 00007f3bfda0f837
Jun 03 16:01:35 sts1.inf7.net kernel: RDX: 00007ffed1c53d50 RSI: 0000000040406469 RDI: 000000000000000e
Jun 03 16:01:35 sts1.inf7.net kernel: RBP: 00007ffed1c53d50 R08: 0000000000000000 R09: 000003ffb38b7a00
Jun 03 16:01:35 sts1.inf7.net kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000040406469
Jun 03 16:01:35 sts1.inf7.net kernel: R13: 000000000000000e R14: ffffffffffffffff R15: 000003ffb3590f00
Jun 03 16:01:35 sts1.inf7.net kernel: Code: 00 49 8b be e0 02 00 00 48 89 c6 48 89 45 c0 48 8b 52 08 48 83 ca 03 e8 76 d0 ff ff 48 8b 45 a8 48 8b 4d d0 48 8b 10 48 8b 45 c0 <48> 89 04 0a 48 8b 45 b0 4c 0f ab 28 0f 1f 44 00 00 49 8d 87 00 
Jun 03 16:01:35 sts1.inf7.net kernel: RIP: gen8_ppgtt_alloc_page_directories.isra.41+0xd9/0x250 [i915] RSP: ffffb3cbc2deb890
Jun 03 16:01:35 sts1.inf7.net kernel: CR2: 0000000000000018
Jun 03 16:01:35 sts1.inf7.net kernel: ---[ end trace 6a84366dfb54cad9 ]---
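
For readers comparing these traces: the `Oops: 0002` value is the x86 page-fault error code, and its low bits describe the faulting access. A small Python sketch that decodes it is shown below; the function name `decode_oops_error_code` and the dictionary keys are illustrative assumptions.

```python
def decode_oops_error_code(code_hex):
    """Decode the hex value printed after 'Oops:' on x86 page faults."""
    code = int(code_hex, 16)
    return {
        # Bit 0: 0 means the page was not present, 1 means a protection fault.
        "page_present": bool(code & 0x01),
        # Bit 1: 0 means a read access, 1 means a write access.
        "write": bool(code & 0x02),
        # Bit 2: 0 means the CPU was in kernel mode, 1 means user mode.
        "user_mode": bool(code & 0x04),
        # Bit 3: a reserved bit was set in a page-table entry.
        "reserved_bit": bool(code & 0x08),
        # Bit 4: the fault happened on an instruction fetch.
        "instruction_fetch": bool(code & 0x10),
    }


if __name__ == "__main__":
    print(decode_oops_error_code("0002"))
```

For `0002` this gives a kernel-mode write to a page that is not present. In the register dump above, RDX is 0 and RCX is 0x18, and CR2 (the faulting address) is 0x18.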

Comment 27 Gordon Messmer 2017-06-03 23:14:41 UTC
I suspect that the fix isn't in F26's builds yet.  Try using the kernel Dave provided in #20.

Comment 28 Oskari Saarenmaa 2017-06-08 05:18:15 UTC
kernel-4.11.2-200.fdo99295.fc25.x86_64 fixed the issue for me on my Thinkpad X230 & Intel NUC6I5SYH when connected to an external monitor.  The package is now gone from Koji so it'd be nice to get the fix in official F25 packages.

Comment 29 Pawel Jakubowski 2017-06-08 07:52:47 UTC
Problem still occurs for me on 4.11 kernel

$ uname -a
Linux devbook 4.11.3-200.fc25.x86_64 #1 SMP Thu May 25 19:03:07 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/redhat-release 
Fedora release 25 (Twenty Five)

Comment 30 Michael Lippens 2017-06-08 10:13:00 UTC
I second Pawel Jakubowski, I have the same kernel running wayland.

Comment 31 Axel Nagel 2017-06-08 13:39:41 UTC
I'm also waiting for the fix while on 4.11.3-200.fc25.x86_64. 
Could the kernel-4.11.2-200.fdo99295.fc25.x86_64 be made available again ?!

Cheers

Comment 32 Michael Lippens 2017-06-08 14:15:52 UTC
Axel Nagel, isn't it still in your list of kernels when booting? I think I still have 4.11.2-200 but also had this issue already.

Comment 33 Gordon Messmer 2017-06-08 14:34:11 UTC
Pawel and Michael: The fix isn't in the kernel you're using.  It appeared in a build that was prepared to test a solution, version kernel-4.11.2-200.fdo99295.fc25.x86_64

It should also appear in 4.12 kernels, and several of those are available here:
https://kojipkgs.fedoraproject.org/packages/kernel/4.12.0/

Comment 34 Gordon Messmer 2017-06-08 15:05:57 UTC
... or as Josh pointed out, you could follow these instructions to get a "nodebug" version:

https://fedoraproject.org/wiki/RawhideKernelNodebug
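
To check quickly whether a given kernel release string falls in the range that comments 11 and 33 describe as carrying the fix (4.12 or later, or the `fdo99295` test build), a minimal Python sketch could be used. The helper names are hypothetical, and the rule encodes only what those two comments state; it says nothing about later 4.11.x stable updates.

```python
import re


def kernel_version(release):
    """Parse the leading 'X.Y.Z' of a release string into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def likely_has_fix(release):
    """Apply the rule from comments 11 and 33 (an assumption, not a guarantee)."""
    # The scratch build prepared for testing carried the fix on top of 4.11.2.
    if "fdo99295" in release:
        return True
    # Both upstream commits were reported to be in 4.12-rc1.
    return kernel_version(release)[:2] >= (4, 12)


if __name__ == "__main__":
    for release in (
        "4.11.3-200.fc25.x86_64",
        "4.11.2-200.fdo99295.fc25.x86_64",
        "4.12.0-0.rc1.git1.1.fc27.x86_64",
    ):
        print(release, likely_has_fix(release))
```

On a running system, the release string could be taken from `uname -r` or from Python's `platform.release()`.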

Comment 35 Axel Nagel 2017-06-08 15:59:12 UTC
(In reply to Gordon Messmer from comment #33)
> Pawel and Michael: The fix isn't in the kernel you're using.  It appeared in
> a build that was prepared to test a solution, version
> kernel-4.11.2-200.fdo99295.fc25.x86_64
> 
> It should also appear in 4.12 kernels, and several of those are available
> here:
> https://kojipkgs.fedoraproject.org/packages/kernel/4.12.0/

Hi Gordon, 

thanks for suggesting 4.12. I would probably have gone the rawhide route.
I would "just" need to compile VMWare Workstation which isn't compatible yet. So the backport to 4.11 would be just what I need.

Comment 36 Mirek Svoboda 2017-06-09 09:02:18 UTC
Experiencing crashes on HP Elitebook 850 G4, Intel i5-7200U with integrated GPU.

Comment 37 Michael Lippens 2017-06-09 11:38:35 UTC
I just had an update (from 4.11.3-200 => 4.11.3-202), does anyone have an idea if this also fixes the issue?

Comment 38 Mirek Svoboda 2017-06-09 11:41:40 UTC
(In reply to Michael Lippens from comment #37)
> I just had an update (from 4.11.3-200 => 4.11.3-202), does anyone have an
> idea if this also fixes the issue?

I do not think so, as it is not mentioned among the bugs included in the release.

Comment 39 Axel Nagel 2017-07-21 10:45:36 UTC
I'm on 4.11.10-200 and it seems to be fixed.

Comment 41 Aaron Sowry 2017-07-23 02:52:04 UTC
(In reply to Axel Nagel from comment #39)
> I'm on 4.11.10-200 and it seems to be fixed.

I'm still hitting it on 4.11.10-300.fc26.x86_64

Comment 42 Mirek Svoboda 2017-07-23 08:26:56 UTC
It did not happen to me for many weeks, definitely not on 4.11.10 and 4.11.11.

I run the latest available x86_64 kernels for FC26 from Koji, even before these are pushed to testing/stable. My HW is i5-7200U w/integrated GPU.

Comment 43 Fedora End Of Life 2017-11-16 19:20:12 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 44 Fedora End Of Life 2017-12-12 10:19:20 UTC
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

