Bug 1575721 - random screen freezes nouveau NULL pointer dereference
Summary: random screen freezes nouveau NULL pointer dereference
Keywords:
Status: CLOSED DUPLICATE of bug 1559178
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 27
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-05-07 18:27 UTC by Ldap Tester
Modified: 2018-05-29 15:51 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-05-29 15:51:24 UTC
Type: Bug
Embargoed:



Description Ldap Tester 2018-05-07 18:27:04 UTC
Description of problem:
Random screen freezes on a Sony laptop with NVIDIA GT218M [GeForce 310M] graphics, accompanied by these kernel messages:
kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
kernel: IP: nouveau_mem_host+0x47/0x1b0 [nouveau]

Version-Release number of selected component (if applicable):
4.16.5-200.fc27.x86_64 and many previous kernel versions

How reproducible:
Not reproducible on demand; the freezes occur at random.

Steps to Reproduce:
1. No known trigger; the freezes occur at random.
2. Use the laptop normally and wait.

Actual results:
random screen freezes

Expected results:
no screen freezes

Additional info:
I have a Sony laptop with NVIDIA GT218M [GeForce 310M] graphics hardware.

For some months now (I don't remember how many), I have been getting random screen freezes. The screen stops updating, and I have to ssh in to shut down the laptop. After a reboot, everything works fine until the next random freeze. The time between freezes varies from hours to weeks. I haven't seen any correlation with whatever I am doing at the moment of a freeze. Before the freezes started, the graphics and the laptop in general had worked fine for years. Here are the messages from one freeze:

May  2 12:27:03 stan kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
May  2 12:27:03 stan kernel: IP: nouveau_mem_host+0x47/0x1b0 [nouveau]
May  2 12:27:03 stan kernel: PGD 8000000223798067 P4D 8000000223798067 PUD 21f79d067 PMD 0 
May  2 12:27:03 stan kernel: Oops: 0000 [#1] SMP PTI
May  2 12:27:03 stan kernel: Modules linked in: bnep fuse tun ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter nf_conntrack ip6_tables libcrc32c usblp snd_hda_codec_hdmi snd_hda_codec_realtek uvcvideo btusb snd_hda_codec_generic intel_powerclamp videobuf2_vmalloc coretemp snd_hda_intel btrtl arc4 snd_hda_codec btbcm ath9k ath9k_common videobuf2_memops videobuf2_v4l2 btintel ath9k_hw bluetooth snd_hda_core videobuf2_common kvm videodev irqbypass ecdh_generic mac80211 snd_hwdep media ath cfg80211 intel_cstate iTCO_wdt iTCO_vendor_support snd_seq intel_uncore snd_seq_device snd_pcm sony_laptop joydev wmi_bmof i7core_edac snd_timer rfkill snd soundcore shpchp lpc_ich i2c_i801 acpi_cpufreq nouveau i2c_algo_bit drm_kms_helper mxm_wmi ttm crc32c_intel
May  2 12:27:03 stan kernel: drm sdhci_pci cqhci uas serio_raw firewire_ohci sdhci usb_storage firewire_core mmc_core sky2 crc_itu_t wmi video
May  2 12:27:03 stan kernel: CPU: 1 PID: 1393 Comm: Xorg Not tainted 4.16.5-200.fc27.x86_64 #1
May  2 12:27:03 stan kernel: Hardware name: Sony Corporation VPCF1390X/VAIO, BIOS R0190Y9 10/20/2010
May  2 12:27:03 stan kernel: RIP: 0010:nouveau_mem_host+0x47/0x1b0 [nouveau]
May  2 12:27:03 stan kernel: RSP: 0018:ffff95360306f7f0 EFLAGS: 00010246
May  2 12:27:03 stan kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000ba7bd200
May  2 12:27:03 stan kernel: RDX: ffff894b1f414180 RSI: ffff894a7b002080 RDI: ffff95360306f948
May  2 12:27:03 stan kernel: RBP: ffff894a7b002600 R08: ffff894a7b0020d8 R09: ffffffffc057d34a
May  2 12:27:03 stan kernel: R10: ffffe2e8088371c0 R11: 00000000000004f8 R12: ffff95360306f948
May  2 12:27:03 stan kernel: R13: ffff894b261ec518 R14: ffff894a7b002600 R15: ffff95360306f948
May  2 12:27:03 stan kernel: FS:  00007f2dc2fb3a80(0000) GS:ffff894b2fc40000(0000) knlGS:0000000000000000
May  2 12:27:03 stan kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May  2 12:27:03 stan kernel: CR2: 0000000000000040 CR3: 00000001e9c18000 CR4: 00000000000006e0
May  2 12:27:03 stan kernel: Call Trace:
May  2 12:27:03 stan kernel: nv50_sgdma_bind+0x18/0x30 [nouveau]
May  2 12:27:03 stan kernel: ttm_tt_bind+0x42/0x60 [ttm]
May  2 12:27:03 stan kernel: ttm_bo_handle_move_mem+0x577/0x5b0 [ttm]
May  2 12:27:03 stan kernel: ttm_bo_evict+0x13f/0x320 [ttm]
May  2 12:27:03 stan kernel: ? drm_add_display_info+0x4b1/0x800 [drm]
May  2 12:27:03 stan kernel: ttm_mem_evict_first+0x193/0x200 [ttm]
May  2 12:27:03 stan kernel: ttm_bo_mem_space+0x328/0x4a0 [ttm]
May  2 12:27:03 stan kernel: ttm_bo_validate+0xbc/0x130 [ttm]
May  2 12:27:03 stan kernel: ? drm_add_display_info+0x4ee/0x800 [drm]
May  2 12:27:03 stan kernel: ttm_bo_init_reserved+0x378/0x420 [ttm]
May  2 12:27:03 stan kernel: ttm_bo_init+0x62/0xd0 [ttm]
May  2 12:27:03 stan kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
May  2 12:27:03 stan kernel: nouveau_bo_new+0x416/0x590 [nouveau]
May  2 12:27:03 stan kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
May  2 12:27:03 stan kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
May  2 12:27:03 stan kernel: nouveau_gem_new+0x5d/0x120 [nouveau]
May  2 12:27:03 stan kernel: nouveau_gem_ioctl_new+0x51/0xd0 [nouveau]
May  2 12:27:03 stan kernel: drm_ioctl_kernel+0x5b/0xb0 [drm]
May  2 12:27:03 stan kernel: drm_ioctl+0x2d5/0x370 [drm]
May  2 12:27:03 stan kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
May  2 12:27:03 stan kernel: ? vfs_writev+0xb9/0x110
May  2 12:27:03 stan kernel: nouveau_drm_ioctl+0x64/0xc0 [nouveau]
May  2 12:27:03 stan kernel: do_vfs_ioctl+0xa4/0x620
May  2 12:27:03 stan kernel: SyS_ioctl+0x74/0x80
May  2 12:27:03 stan kernel: do_syscall_64+0x74/0x180
May  2 12:27:03 stan kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
May  2 12:27:03 stan kernel: RIP: 0033:0x7f2dc02bb0f7
May  2 12:27:03 stan kernel: RSP: 002b:00007fff4e161fd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May  2 12:27:03 stan kernel: RAX: ffffffffffffffda RBX: 00000000025ca530 RCX: 00007f2dc02bb0f7
May  2 12:27:03 stan kernel: RDX: 00007fff4e162030 RSI: 00000000c0306480 RDI: 000000000000000f
May  2 12:27:03 stan kernel: RBP: 00007fff4e162030 R08: 0000000000000000 R09: 00000000025ca3e8
May  2 12:27:03 stan kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c0306480
May  2 12:27:03 stan kernel: R13: 000000000000000f R14: 00000000025ca3e8 R15: 000000000141bc00
May  2 12:27:03 stan kernel: Code: 04 25 28 00 00 00 48 89 44 24 20 31 c0 49 8b 1e 48 c7 44 24 08 00 00 00 00 48 c7 44 24 10 00 00 00 00 48 c7 44 24 18 00 00 00 00 <48> 8b 7b 40 48 8d 83 f8 00 00 00 44 0f b6 6b 39 48 89 04 24 48 
May  2 12:27:03 stan kernel: RIP: nouveau_mem_host+0x47/0x1b0 [nouveau] RSP: ffff95360306f7f0
May  2 12:27:03 stan kernel: CR2: 0000000000000040
May  2 12:27:04 stan kernel: ---[ end trace e985916e8a03f0c3 ]---
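
For anyone reading the trace above, here is a minimal sketch of how the reported fault address follows from the register dump. It assumes that the byte marked <48> in the Code: line is the first byte of the faulting instruction, and that the four bytes 48 8b 7b 40 are a 64-bit load from RBX plus an 8-bit displacement (mov 0x40(%rbx),%rdi in AT&T syntax). The helper names are invented for illustration, and the decoder deliberately handles only this one instruction form.

```python
# Values copied from the oops above (Code: line shortened to its tail).
CODE_LINE = "48 c7 44 24 18 00 00 00 00 <48> 8b 7b 40 48 8d 83 f8 00 00 00"
RBX = 0x0000000000000000
CR2 = 0x0000000000000040

# x86-64 register numbers 0..7 as used in the ModRM byte (no REX extension).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code_line):
    """Return the instruction bytes starting at the <..> marker."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]


def decode_mov_load_disp8(insn):
    """Decode only 'REX.W 8B /r' with mod=01 (base register + disp8).

    Hypothetical helper: anything else raises ValueError.
    """
    rex, opcode, modrm, disp = insn[:4]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 0b111, modrm & 0b111
    if rex != 0x48 or opcode != 0x8B or mod != 0b01 or rm == 0b100:
        raise ValueError("instruction form not handled by this sketch")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REG64[reg], REG64[rm], disp


dest, base, disp = decode_mov_load_disp8(faulting_bytes(CODE_LINE))
fault_address = RBX + disp
print(f"load into {dest} from [{base} + {disp:#x}] -> address {fault_address:#x}")
print("matches CR2:", fault_address == CR2)
```

Under these assumptions the load reads from RBX + 0x40. With RBX equal to zero, that is the address 0x40 reported in CR2 and in the "NULL pointer dereference at 0000000000000040" line.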

Over the past few days, I have also been seeing numerous mini-freezes, where screen updates pause for a few seconds and then everything continues normally. This happens ~100 times per day. The messages are:

May  7 09:49:06 stan kernel: nouveau 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes)
May  7 09:49:06 stan kernel: nouveau 0000:01:00.0: swiotlb: coherent allocation failed, size=2097152
May  7 09:49:06 stan kernel: CPU: 4 PID: 1407 Comm: Xorg Not tainted 4.16.5-200.fc27.x86_64 #1
May  7 09:49:06 stan kernel: Hardware name: Sony Corporation VPCF1390X/VAIO, BIOS R0190Y9 10/20/2010
May  7 09:49:06 stan kernel: Call Trace:
May  7 09:49:06 stan kernel: dump_stack+0x5c/0x85
May  7 09:49:06 stan kernel: swiotlb_alloc_coherent+0x1be/0x1d0
May  7 09:49:06 stan kernel: ttm_dma_pool_get_pages+0x235/0x620 [ttm]
May  7 09:49:06 stan kernel: ttm_dma_populate+0x25e/0x350 [ttm]
May  7 09:49:06 stan kernel: ttm_tt_bind+0x2c/0x60 [ttm]
May  7 09:49:06 stan kernel: ttm_bo_handle_move_mem+0x577/0x5b0 [ttm]
May  7 09:49:06 stan kernel: ttm_bo_validate+0x120/0x130 [ttm]
May  7 09:49:06 stan kernel: ? drm_mode_prune_invalid+0x8e/0x100 [drm]
May  7 09:49:06 stan kernel: ttm_bo_init_reserved+0x378/0x420 [ttm]
May  7 09:49:06 stan kernel: ttm_bo_init+0x62/0xd0 [ttm]
May  7 09:49:06 stan kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
May  7 09:49:06 stan kernel: nouveau_bo_new+0x416/0x590 [nouveau]
May  7 09:49:06 stan kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
May  7 09:49:06 stan kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
May  7 09:49:06 stan kernel: nouveau_gem_new+0x5d/0x120 [nouveau]
May  7 09:49:06 stan kernel: nouveau_gem_ioctl_new+0x51/0xd0 [nouveau]
May  7 09:49:06 stan kernel: drm_ioctl_kernel+0x5b/0xb0 [drm]
May  7 09:49:06 stan kernel: drm_ioctl+0x2d5/0x370 [drm]
May  7 09:49:06 stan kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
May  7 09:49:06 stan kernel: ? __handle_mm_fault+0xd2b/0x12f0
May  7 09:49:06 stan kernel: nouveau_drm_ioctl+0x64/0xc0 [nouveau]
May  7 09:49:06 stan kernel: do_vfs_ioctl+0xa4/0x620
May  7 09:49:06 stan kernel: ? handle_mm_fault+0xdc/0x210
May  7 09:49:06 stan kernel: ? __do_page_fault+0x279/0x4e0
May  7 09:49:06 stan kernel: SyS_ioctl+0x74/0x80
May  7 09:49:06 stan kernel: do_syscall_64+0x74/0x180
May  7 09:49:06 stan kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
May  7 09:49:06 stan kernel: RIP: 0033:0x7f2427c5f0f7
May  7 09:49:06 stan kernel: RSP: 002b:00007ffc664cc198 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May  7 09:49:06 stan kernel: RAX: ffffffffffffffda RBX: 00000000036e9b10 RCX: 00007f2427c5f0f7
May  7 09:49:06 stan kernel: RDX: 00007ffc664cc1f0 RSI: 00000000c0306480 RDI: 000000000000000f
May  7 09:49:06 stan kernel: RBP: 00007ffc664cc1f0 R08: 0000000000000004 R09: 00007f2427f23c60
May  7 09:49:06 stan kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 00000000c0306480
May  7 09:49:06 stan kernel: R13: 000000000000000f R14: 000000000331ddc8 R15: 0000000002722c00
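
To put a number on the "~100 times per day" estimate, the "swiotlb buffer is full" messages can be counted per day. The following is a rough sketch, assuming syslog-style lines like the ones above. The two sample lines are copied from this report, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Two lines copied from the log above; a real run would read a whole log file.
SAMPLE_LINES = [
    "May  7 09:49:06 stan kernel: nouveau 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes)",
    "May  7 09:49:06 stan kernel: nouveau 0000:01:00.0: swiotlb: coherent allocation failed, size=2097152",
]

# Month, day of month, and the requested size from a "buffer is full" message.
FULL_RE = re.compile(r"^(\w{3})\s+(\d+) .*swiotlb buffer is full \(sz: (\d+) bytes\)")


def swiotlb_full_per_day(lines):
    """Count 'swiotlb buffer is full' messages per day and collect the sizes seen."""
    per_day = Counter()
    sizes = set()
    for line in lines:
        match = FULL_RE.match(line)
        if match:
            month, day, size = match.groups()
            per_day[f"{month} {day}"] += 1
            sizes.add(int(size))
    return per_day, sizes


per_day, sizes = swiotlb_full_per_day(SAMPLE_LINES)
for day, count in per_day.items():
    print(day, count)
print("sizes in MiB:", sorted(size / 2**20 for size in sizes))
```

Any iterable of lines works as input, for example an open file holding the saved kernel log. The failed allocation in the messages above is 2097152 bytes, which is 2 MiB.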

Comment 1 Ldap Tester 2018-05-09 15:48:36 UTC
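Another total screen freeze, this time on kernel 4.16.6-202.fc27.x86_64 and with slightly different messages (a general protection fault instead of a NULL pointer dereference, at the same nouveau_mem_host+0x47):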

May  7 16:13:05 stan kernel: general protection fault: 0000 [#1] SMP PTI
May  7 16:13:05 stan kernel: Modules linked in: fuse tun bnep ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv4 nf_conntrack_ipv6 nf_defrag_ipv4 nf_defrag_ipv6 xt_conntrack nf_conntrack libcrc32c ip6table_filter ip6_tables usblp arc4 ath9k ath9k_common ath9k_hw intel_powerclamp coretemp uvcvideo videobuf2_vmalloc videobuf2_memops btusb videobuf2_v4l2 btrtl btbcm videobuf2_common mac80211 snd_hda_codec_hdmi btintel ath bluetooth videodev snd_hda_codec_realtek cfg80211 ecdh_generic snd_hda_codec_generic media kvm irqbypass snd_hda_intel snd_hda_codec iTCO_wdt iTCO_vendor_support snd_hda_core intel_cstate i7core_edac snd_hwdep sony_laptop intel_uncore snd_seq snd_seq_device snd_pcm rfkill snd_timer joydev wmi_bmof shpchp acpi_cpufreq lpc_ich snd i2c_i801 soundcore nouveau i2c_algo_bit drm_kms_helper ttm crc32c_intel sdhci_pci
May  7 16:13:05 stan kernel: drm cqhci mxm_wmi serio_raw sdhci uas usb_storage firewire_ohci mmc_core firewire_core sky2 crc_itu_t video wmi
May  7 16:13:05 stan kernel: CPU: 3 PID: 1396 Comm: Xorg Not tainted 4.16.6-202.fc27.x86_64 #1
May  7 16:13:05 stan kernel: Hardware name: Sony Corporation VPCF1390X/VAIO, BIOS R0190Y9 10/20/2010
May  7 16:13:05 stan kernel: RIP: 0010:nouveau_mem_host+0x47/0x1b0 [nouveau]
May  7 16:13:05 stan kernel: RSP: 0018:ffffa27b4320f7f0 EFLAGS: 00010246
May  7 16:13:05 stan kernel: RAX: 0000000000000000 RBX: 2f726567616e614d RCX: 00000000ba7bd200
May  7 16:13:05 stan kernel: RDX: ffff8ea9a5dee300 RSI: ffff8ea9735d3380 RDI: ffffa27b4320f948
May  7 16:13:05 stan kernel: RBP: ffff8ea9735d3100 R08: ffff8ea9735d33d8 R09: ffffffffc046e34a
May  7 16:13:05 stan kernel: R10: ffffd17085489140 R11: 0000000000000f78 R12: ffffa27b4320f948
May  7 16:13:05 stan kernel: R13: ffff8ea9a5e18518 R14: ffff8ea9735d3100 R15: ffffa27b4320f948
May  7 16:13:05 stan kernel: FS:  00007fd1ca5d2a80(0000) GS:ffff8ea9afcc0000(0000) knlGS:0000000000000000
May  7 16:13:05 stan kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May  7 16:13:05 stan kernel: CR2: 00005614b899b710 CR3: 0000000222aee000 CR4: 00000000000006e0
May  7 16:13:05 stan kernel: Call Trace:
May  7 16:13:05 stan kernel: nv50_sgdma_bind+0x18/0x30 [nouveau]
May  7 16:13:05 stan kernel: ttm_tt_bind+0x42/0x60 [ttm]
May  7 16:13:05 stan kernel: ttm_bo_handle_move_mem+0x577/0x5b0 [ttm]
May  7 16:13:05 stan kernel: ttm_bo_evict+0x13f/0x320 [ttm]
May  7 16:13:05 stan kernel: ? drm_mm_insert_node_in_range+0x261/0x500 [drm]
May  7 16:13:05 stan kernel: ttm_mem_evict_first+0x193/0x200 [ttm]
May  7 16:13:05 stan kernel: ttm_bo_mem_space+0x328/0x4a0 [ttm]
May  7 16:13:05 stan kernel: ttm_bo_validate+0xbc/0x130 [ttm]
May  7 16:13:05 stan kernel: ? drm_mm_insert_node_in_range+0x29e/0x500 [drm]
May  7 16:13:05 stan kernel: ttm_bo_init_reserved+0x378/0x420 [ttm]
May  7 16:13:05 stan kernel: ttm_bo_init+0x62/0xd0 [ttm]
May  7 16:13:05 stan kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
May  7 16:13:05 stan kernel: nouveau_bo_new+0x416/0x590 [nouveau]
May  7 16:13:05 stan kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
May  7 16:13:05 stan kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
May  7 16:13:05 stan kernel: nouveau_gem_new+0x5d/0x120 [nouveau]
May  7 16:13:05 stan kernel: nouveau_gem_ioctl_new+0x51/0xd0 [nouveau]
May  7 16:13:05 stan kernel: drm_ioctl_kernel+0x5b/0xb0 [drm]
May  7 16:13:05 stan kernel: drm_ioctl+0x2d5/0x370 [drm]
May  7 16:13:05 stan kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
May  7 16:13:05 stan kernel: ? vfs_writev+0xb9/0x110
May  7 16:13:05 stan kernel: nouveau_drm_ioctl+0x64/0xc0 [nouveau]
May  7 16:13:05 stan kernel: do_vfs_ioctl+0xa4/0x620
May  7 16:13:05 stan kernel: SyS_ioctl+0x74/0x80
May  7 16:13:05 stan kernel: do_syscall_64+0x74/0x180
May  7 16:13:05 stan kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
May  7 16:13:05 stan kernel: RIP: 0033:0x7fd1c78da0f7
May  7 16:13:05 stan kernel: RSP: 002b:00007ffee3aeed88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May  7 16:13:05 stan kernel: RAX: ffffffffffffffda RBX: 00000000031b7410 RCX: 00007fd1c78da0f7
May  7 16:13:05 stan kernel: RDX: 00007ffee3aeede0 RSI: 00000000c0306480 RDI: 000000000000000f
May  7 16:13:05 stan kernel: RBP: 00007ffee3aeede0 R08: 0000000000000000 R09: 0000000000000006
May  7 16:13:05 stan kernel: R10: 0000000002602010 R11: 0000000000000246 R12: 00000000c0306480
May  7 16:13:05 stan kernel: R13: 000000000000000f R14: 000000000358c888 R15: 0000000002680ce0
May  7 16:13:05 stan kernel: Code: 04 25 28 00 00 00 48 89 44 24 20 31 c0 49 8b 1e 48 c7 44 24 08 00 00 00 00 48 c7 44 24 10 00 00 00 00 48 c7 44 24 18 00 00 00 00 <48> 8b 7b 40 48 8d 83 f8 00 00 00 44 0f b6 6b 39 48 89 04 24 48 
May  7 16:13:05 stan kernel: RIP: nouveau_mem_host+0x47/0x1b0 [nouveau] RSP: ffffa27b4320f7f0
May  7 16:13:05 stan kernel: ---[ end trace 24e99ea45dbaee56 ]---
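
One detail in this second trace may be worth noting: the Code: line is identical to the first oops, but RBX holds 2f726567616e614d instead of zero. The sketch below interprets that value as eight little-endian bytes and checks whether it is a canonical x86-64 address. It assumes 48-bit virtual addresses (4-level paging), and the helper names are made up. This is only an observation about the register contents, not a diagnosis.

```python
# RBX as printed in the general protection fault above.
RBX = 0x2F726567616E614D


def as_little_endian_ascii(value):
    """Render a 64-bit value as 8 little-endian bytes, '.' for non-printables."""
    raw = value.to_bytes(8, "little")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


def is_canonical(addr):
    """Assuming 48-bit virtual addresses: bits 63..47 must be all 0 or all 1."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


text = as_little_endian_ascii(RBX)
print(repr(text))
print("canonical address:", is_canonical(RBX))
```

The bytes spell "Manager/", so the value looks like string data rather than a pointer. It is also not a canonical address, which would explain why the same instruction raised a general protection fault here instead of a page fault at address 0x40.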

Comment 2 Edgar Hoch 2018-05-14 10:01:07 UTC
This may be a duplicate of bug 1559178.

Comment 3 Jeremy Cline 2018-05-29 15:51:24 UTC
Thanks for catching that, Edgar!

*** This bug has been marked as a duplicate of bug 1559178 ***

