Bug 1559178

Summary: repeated lockups of the entire X session - nv50_sgdma_bind
Product: [Fedora] Fedora Reporter: Wayne Walker <wwalker>
Component: xorg-x11-drv-nouveau Assignee: Ben Skeggs <bskeggs>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 27 CC: airlied, ajax, bskeggs, dlucas, edgar.hoch, jcline, jglisse, ldap.tester, loening, mnk, stefano.biagiotti
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2018-05-30 13:20:13 UTC Type: Bug
Attachments:
Description                                                             Flags
journalctl -k                                                           none
journalctl -k                                                           none
Output of journalctl -k -b -1 --no-hostname --no-pager in attachment.   none
Output of journalctl -k -b -1 --no-hostname --no-pager in attachment.   none

Description Wayne Walker 2018-03-21 22:04:34 UTC
Description of problem:
X locks up completely, usually after a few hundred lines output to an xterm.

Version-Release number of selected component (if applicable):
xorg-x11-drv-nouveau-1.0.15-3.fc27.x86_64


How reproducible:
Happens every day or two

Steps to Reproduce:
1. Login (lightdm into an xfce session)
2. Work normally (mostly in xterm)
3. cat a big file

Actual results: Total lockup of X, no cursor movement. Ctrl-Alt-F2 does not switch to the console.


Expected results: X to stay running.


Additional info: 

Mar 19 14:56:55 polonium kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
Mar 19 14:56:55 polonium kernel: IP: nouveau_mem_host+0x47/0x1b0 [nouveau]
Mar 19 14:56:55 polonium kernel: PGD 8000000805f82067 P4D 8000000805f82067 PUD 7f9ac6067 PMD 0
Mar 19 14:56:55 polonium kernel: Oops: 0000 [#1] SMP PTI
Mar 19 14:56:55 polonium kernel: Modules linked in: snd_seq_dummy rfcomm xt_nat veth vxlan ip6_udp_tunnel udp_tunnel xt_mark nf_conntrack_netlink xt_addrtype br_netfilter overlay xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack binfmt_misc ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables vmnet(OE) ppdev parport_pc parport ip6table_filter ip6_tables fuse vmw_vsock_vmci_transport vsock vmw_vmci vmmon(OE) cmac bnep sunrpc vfat fat arc4 snd_hda_codec_hdmi btusb intel_rapl btrtl btbcm mei_wdt x86_pkg_temp_thermal
Mar 19 14:56:55 polonium kernel: intel_powerclamp btintel coretemp bluetooth kvm_intel iwlmvm kvm snd_soc_rt5640 snd_hda_codec_realtek mac80211 snd_hda_codec_generic snd_soc_rl6231 snd_soc_core ecdh_generic irqbypass joydev snd_hda_intel crct10dif_pclmul snd_hda_codec crc32_pclmul ghash_clmulni_intel iwlwifi snd_compress intel_cstate snd_hda_core snd_pcm_dmaengine ac97_bus snd_hwdep intel_uncore acpi_als cfg80211 snd_seq intel_rapl_perf kfifo_buf snd_seq_device industrialio snd_pcm rfkill i2c_i801 wdat_wdt intel_pch_thermal intel_wmi_thunderbolt snd_timer mei_me snd acpi_pad mei soundcore shpchp lpc_ich vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nouveau mxm_wmi drm_kms_helper ttm drm igb e1000e crc32c_intel serio_raw dca ptp i2c_algo_bit pps_core wmi video
Mar 19 14:56:55 polonium kernel: CPU: 0 PID: 1826 Comm: Xorg Tainted: G           OE    4.15.7-300.fc27.x86_64 #1
Mar 19 14:56:55 polonium kernel: Hardware name: CompuLab Airtop/Airtop, BIOS ARTP-3.1.0.637.3.0 X64 09/19/2016
Mar 19 14:56:55 polonium kernel: RIP: 0010:nouveau_mem_host+0x47/0x1b0 [nouveau]
Mar 19 14:56:55 polonium kernel: RSP: 0018:ffffb7b688e83800 EFLAGS: 00010246
Mar 19 14:56:55 polonium kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000002f03b6200
Mar 19 14:56:55 polonium kernel: RDX: ffff8fb2aa4cb280 RSI: ffff8fb26cfd8280 RDI: ffffb7b688e83958
Mar 19 14:56:55 polonium kernel: RBP: ffff8fb26cfd8e80 R08: ffff8fb26cfd82d8 R09: ffffffffc0575133
Mar 19 14:56:55 polonium kernel: R10: ffffdc09d53c5880 R11: 0000000000000000 R12: ffffb7b688e83958
Mar 19 14:56:55 polonium kernel: R13: 0000000000000000 R14: ffff8fb26cfd8e80 R15: ffffb7b688e83958
Mar 19 14:56:55 polonium kernel: FS:  00007f3730613a80(0000) GS:ffff8fb2d5c00000(0000) knlGS:0000000000000000
Mar 19 14:56:55 polonium kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 19 14:56:55 polonium kernel: CR2: 0000000000000040 CR3: 000000082c3da003 CR4: 00000000003606f0
Mar 19 14:56:55 polonium kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 19 14:56:55 polonium kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 19 14:56:55 polonium kernel: Call Trace:
Mar 19 14:56:55 polonium kernel: nv50_sgdma_bind+0x18/0x30 [nouveau]
Mar 19 14:56:55 polonium kernel: ttm_tt_bind+0x3f/0x60 [ttm]
Mar 19 14:56:55 polonium kernel: ttm_bo_handle_move_mem+0x5da/0x610 [ttm]
Mar 19 14:56:55 polonium kernel: ttm_bo_evict+0x14d/0x330 [ttm]
Mar 19 14:56:55 polonium kernel: ttm_mem_evict_first+0x161/0x1d0 [ttm]
Mar 19 14:56:55 polonium kernel: ttm_bo_mem_space+0x344/0x4c0 [ttm]
Mar 19 14:56:55 polonium kernel: ttm_bo_validate+0xce/0x150 [ttm]
Mar 19 14:56:55 polonium kernel: ttm_bo_init_reserved+0x385/0x430 [ttm]
Mar 19 14:56:55 polonium kernel: ttm_bo_init+0x2f/0x90 [ttm]
Mar 19 14:56:55 polonium kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
Mar 19 14:56:55 polonium kernel: ? _cond_resched+0x15/0x40
Mar 19 14:56:55 polonium kernel: nouveau_bo_new+0x416/0x590 [nouveau]
Mar 19 14:56:55 polonium kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
Mar 19 14:56:55 polonium kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
Mar 19 14:56:55 polonium kernel: nouveau_gem_new+0x5d/0x120 [nouveau]
Mar 19 14:56:55 polonium kernel: nouveau_gem_ioctl_new+0x51/0xd0 [nouveau]
Mar 19 14:56:55 polonium kernel: drm_ioctl_kernel+0x5b/0xb0 [drm]
Mar 19 14:56:55 polonium kernel: drm_ioctl+0x2d5/0x370 [drm]
Mar 19 14:56:55 polonium kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
Mar 19 14:56:55 polonium kernel: nouveau_drm_ioctl+0x64/0xc0 [nouveau]
Mar 19 14:56:55 polonium kernel: do_vfs_ioctl+0xa4/0x620
Mar 19 14:56:55 polonium kernel: ? __sys_recvmsg+0x4e/0x90
Mar 19 14:56:55 polonium kernel: ? __sys_recvmsg+0x7d/0x90
Mar 19 14:56:55 polonium kernel: SyS_ioctl+0x74/0x80
Mar 19 14:56:55 polonium kernel: do_syscall_64+0x74/0x180
Mar 19 14:56:55 polonium kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Mar 19 14:56:55 polonium kernel: RIP: 0033:0x7f372d8f2b87
Mar 19 14:56:55 polonium kernel: RSP: 002b:00007ffd6a8399c8 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
Mar 19 14:56:55 polonium kernel: RAX: ffffffffffffffda RBX: 00000000022f2760 RCX: 00007f372d8f2b87
Mar 19 14:56:55 polonium kernel: RDX: 00007ffd6a839a20 RSI: 00000000c0306480 RDI: 0000000000000010
Mar 19 14:56:55 polonium kernel: RBP: 00007ffd6a839a20 R08: 0000000000000000 R09: 00007f372dbc3c20
Mar 19 14:56:55 polonium kernel: R10: 0000000000000007 R11: 0000000000003246 R12: 00000000c0306480
Mar 19 14:56:55 polonium kernel: R13: 0000000000000010 R14: 00000000022f40f8 R15: 0000000001891400
Mar 19 14:56:55 polonium kernel: Code: 04 25 28 00 00 00 48 89 44 24 20 31 c0 49 8b 1e 48 c7 44 24 08 00 00 00 00 48 c7 44 24 10 00 00 00 00 48 c7 44 24 18 00 00 00 00 <48> 8b 7b 40 48 8d 83 f8 00 00 00 44 0f b6 6b 39 48 89 04 24 48
Mar 19 14:56:55 polonium kernel: RIP: nouveau_mem_host+0x47/0x1b0 [nouveau] RSP: ffffb7b688e83800
Mar 19 14:56:55 polonium kernel: CR2: 0000000000000040
Mar 19 14:56:55 polonium kernel: ---[ end trace 2d095211d37610d1 ]---

Kernel - 4.15.7-300.fc27.x86_64
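
For anyone triaging the trace above, here is a minimal sketch (not part of the original report) of how the fault address can be cross-checked against the register dump. It assumes that the bytes starting at the `<48>` marker in the Code: line, `48 8b 7b 40`, encode a 64-bit load from RBX plus an 8-bit displacement; the helper names `register`, `faulting_bytes` and `fault_address` are hypothetical, and the `OOPS` text is only an excerpt of the log. With RBX equal to zero the load address works out to 0x40, which agrees with CR2.

```python
import re

# Excerpt of the oops above: only the lines this sketch needs.
OOPS = """\
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000002f03b6200
CR2: 0000000000000040 CR3: 000000082c3da003 CR4: 00000000003606f0
Code: 48 c7 44 24 18 00 00 00 00 <48> 8b 7b 40 48 8d 83 f8 00 00 00
"""


def register(text, name):
    """Return the value of a register such as RBX or CR2 from the dump."""
    match = re.search(rf"\b{name}: ([0-9a-f]+)", text)
    return int(match.group(1), 16)


def faulting_bytes(text, count=4):
    """Return the first `count` bytes starting at the <..> marker in Code:."""
    tokens = re.search(r"Code: (.*)", text).group(1).split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:start + count]]


def fault_address(text):
    """Compute the address read by the faulting instruction.

    Assumption: this handles only the single encoding seen in this trace,
    REX.W (0x48) + MOV r64, r/m64 (0x8b) + ModRM 0x7b ([rbx + disp8]).
    """
    rex, opcode, modrm, disp8 = faulting_bytes(text)
    if (rex, opcode, modrm) != (0x48, 0x8B, 0x7B):
        raise ValueError("unexpected instruction encoding")
    # disp8 is a signed byte; 0x40 is positive, so no sign extension needed.
    return register(text, "RBX") + disp8


if __name__ == "__main__":
    print(hex(fault_address(OOPS)), hex(register(OOPS, "CR2")))
```

The same check applies to the later traces in this bug, since they show the same instruction bytes at the marker and the same CR2 value.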

Comment 1 Wayne Walker 2018-03-21 22:11:13 UTC
Forgot to identify the video card:

01:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 950] (rev a1)

Comment 2 Wayne Walker 2018-03-21 22:14:43 UTC
Initialization at boot:

Mar 21 16:02:09 polonium kernel: fb: switching to nouveaufb from EFI VGA
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: NVIDIA GM206 (126020a1)
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: bios: version 84.06.3d.00.87
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: fb: 2048 MiB GDDR5
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: bus: MMIO write of 80000104 FAULT at 10eb14 [ IBUS ]
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: TMDS table version 2.0
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB version 4.1
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB outp 01: 02000f00 00000000
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB outp 02: 02811f76 04400020
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB outp 03: 02011f72 00020020
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB outp 04: 04822f86 04400010
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB outp 05: 04022f82 00020010
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB outp 06: 04833f96 04400020
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB outp 07: 04033f92 00020020
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB outp 08: 02044f62 00020010
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB outp 15: 01df5ff8 00000000
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB conn 00: 00001030
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB conn 01: 00020146
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB conn 02: 01000246
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB conn 03: 02000346
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB conn 04: 00010461
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: DCB conn 05: 00000570
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: failed to create encoder 1/8/0: -19
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: Virtual-1 has no encoders, removing
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
Mar 21 16:02:09 polonium kernel: nouveau 0000:01:00.0: DRM: allocated 3840x2160 fb: 0x60000, bo 00000000f92a2e22
Mar 21 16:02:09 polonium kernel: fbcon: nouveaufb (fb0) is primary device
Mar 21 16:02:10 polonium kernel: nouveau 0000:01:00.0: disp: 0x00005b6b[0]: INIT_GENERIC_CONDITON: unknown 0x07
Mar 21 16:02:10 polonium kernel: nouveau 0000:01:00.0: disp: outp 02:0006:0f82: link rate unsupported by sink
Mar 21 16:02:10 polonium kernel: nouveau 0000:01:00.0: disp: outp 02:0006:0f82: training failed
Mar 21 16:02:10 polonium kernel: nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
Mar 21 16:02:10 polonium kernel: [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
Mar 21 16:02:11 polonium kernel: nouveau 0000:01:00.0: disp: outp 02:0006:0f82: link rate unsupported by sink
Mar 21 16:02:11 polonium kernel: nouveau 0000:01:00.0: disp: outp 02:0006:0f82: training failed
Mar 21 16:02:12 polonium kernel: nouveau 0000:01:00.0: disp: outp 02:0006:0f82: link rate unsupported by sink
Mar 21 16:02:12 polonium kernel: nouveau 0000:01:00.0: disp: outp 02:0006:0f82: training failed

Comment 3 Wayne Walker 2018-03-22 21:44:20 UTC
I switched to rawhide and the bug still exists.  I'm now running:

xorg-x11-drv-nouveau-1.0.15-4.fc28.x86_64

Comment 4 Wayne Walker 2018-03-22 21:48:12 UTC
Trace after switching to rawhide:


Mar 22 16:25:51 polonium kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
Mar 22 16:25:51 polonium kernel: IP: nouveau_mem_host+0x3e/0x1b0 [nouveau]
Mar 22 16:25:51 polonium kernel: PGD 8000000806f75067 P4D 8000000806f75067 PUD 8094d1067 PMD 0 
Mar 22 16:25:51 polonium kernel: Oops: 0000 [#1] SMP PTI
Mar 22 16:25:51 polonium kernel: Modules linked in: rfcomm vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) binfmt_misc vmnet(OE) vmmon(OE) xt_nat veth vxlan ip6_udp_tunnel udp_tunnel xt_mark nf_conntrack_netlink xt_addrtype br_netfilter overlay xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink devlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ppdev parport_pc parport ip6table_filter ip6_tables fuse vmw_vsock_vmci_transport vsock vmw_vmci cmac bnep sunrpc vfat fat arc4 snd_hda_codec_hdmi
Mar 22 16:25:51 polonium kernel: intel_rapl x86_pkg_temp_thermal intel_powerclamp iwlmvm coretemp mac80211 kvm_intel btusb btrtl snd_soc_rt5640 btbcm snd_hda_codec_realtek kvm mei_wdt snd_hda_codec_generic snd_soc_rl6231 btintel snd_hda_intel snd_soc_core bluetooth snd_hda_codec irqbypass crct10dif_pclmul crc32_pclmul iwlwifi snd_hda_core snd_compress snd_pcm_dmaengine ac97_bus snd_hwdep ghash_clmulni_intel snd_seq intel_cstate joydev snd_seq_device ecdh_generic intel_uncore cfg80211 snd_pcm intel_rapl_perf intel_pch_thermal i2c_i801 mei_me wdat_wdt rfkill lpc_ich mei intel_wmi_thunderbolt snd_timer snd soundcore acpi_als kfifo_buf shpchp industrialio acpi_pad nouveau mxm_wmi drm_kms_helper ttm e1000e igb drm crc32c_intel serio_raw dca ptp i2c_algo_bit pps_core wmi video
Mar 22 16:25:51 polonium kernel: CPU: 3 PID: 11789 Comm: Xorg Tainted: G           OE    4.16.0-0.rc4.git0.1.fc28.x86_64 #1
Mar 22 16:25:51 polonium kernel: Hardware name: CompuLab Airtop/Airtop, BIOS ARTP-3.1.0.637.3.0 X64 09/19/2016
Mar 22 16:25:51 polonium kernel: RIP: 0010:nouveau_mem_host+0x3e/0x1b0 [nouveau]
Mar 22 16:25:51 polonium kernel: RSP: 0018:ffffb464c87937f0 EFLAGS: 00010246
Mar 22 16:25:51 polonium kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000002f03a1e00
Mar 22 16:25:51 polonium kernel: RDX: ffff99d96a480400 RSI: ffff99d8e747f300 RDI: ffffb464c8793950
Mar 22 16:25:51 polonium kernel: RBP: ffff99d8e747f780 R08: ffff99d8e747f358 R09: ffffffffc03ab20e
Mar 22 16:25:51 polonium kernel: R10: ffffd8f89ea72440 R11: 0000000000000000 R12: ffffb464c8793950
Mar 22 16:25:51 polonium kernel: R13: ffff99d969ddb800 R14: ffffb464c8793950 R15: ffff99d8e747f780
Mar 22 16:25:51 polonium kernel: FS:  00007fdbaf889ac0(0000) GS:ffff99d995cc0000(0000) knlGS:0000000000000000
Mar 22 16:25:51 polonium kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 22 16:25:51 polonium kernel: CR2: 0000000000000040 CR3: 000000080120e004 CR4: 00000000003606e0
Mar 22 16:25:51 polonium kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 22 16:25:51 polonium kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Mar 22 16:25:51 polonium kernel: Call Trace:
Mar 22 16:25:51 polonium kernel: ? ttm_dma_populate+0x20c/0x390 [ttm]
Mar 22 16:25:51 polonium kernel: nv50_sgdma_bind+0x18/0x30 [nouveau]
Mar 22 16:25:51 polonium kernel: ttm_tt_bind+0x44/0x60 [ttm]
Mar 22 16:25:51 polonium kernel: ttm_bo_handle_move_mem+0x4cf/0x550 [ttm]
Mar 22 16:25:51 polonium kernel: ttm_bo_evict+0x140/0x1a0 [ttm]
Mar 22 16:25:51 polonium kernel: ? nouveau_mem_new+0x2c/0x50 [nouveau]
Mar 22 16:25:51 polonium kernel: ? drm_ht_just_insert_please+0x31/0xb0 [drm]
Mar 22 16:25:51 polonium kernel: ttm_mem_evict_first+0x193/0x200 [ttm]
Mar 22 16:25:51 polonium kernel: ttm_bo_mem_space+0x2de/0x4a0 [ttm]
Mar 22 16:25:51 polonium kernel: ttm_bo_validate+0xc7/0x130 [ttm]
Mar 22 16:25:51 polonium kernel: ? drm_ht_create+0x46/0x70 [drm]
Mar 22 16:25:51 polonium kernel: ttm_bo_init_reserved+0x334/0x380 [ttm]
Mar 22 16:25:51 polonium kernel: ? ttm_bo_init+0x62/0xd0 [ttm]
Mar 22 16:25:51 polonium kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
Mar 22 16:25:51 polonium kernel: ? nouveau_bo_new+0x401/0x580 [nouveau]
Mar 22 16:25:51 polonium kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
Mar 22 16:25:51 polonium kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
Mar 22 16:25:51 polonium kernel: ? nouveau_gem_new+0x5d/0x120 [nouveau]
Mar 22 16:25:51 polonium kernel: ? nouveau_gem_ioctl_new+0x53/0xe0 [nouveau]
Mar 22 16:25:51 polonium kernel: ? drm_ioctl_kernel+0x5b/0xb0 [drm]
Mar 22 16:25:51 polonium kernel: ? drm_ioctl+0x1c4/0x380 [drm]
Mar 22 16:25:51 polonium kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
Mar 22 16:25:51 polonium kernel: ? do_iter_write+0xdc/0x190
Mar 22 16:25:51 polonium kernel: ? nouveau_drm_ioctl+0x65/0xc0 [nouveau]
Mar 22 16:25:51 polonium kernel: ? do_vfs_ioctl+0xa4/0x610
Mar 22 16:25:51 polonium kernel: ? __fput+0x147/0x220
Mar 22 16:25:51 polonium kernel: ? SyS_ioctl+0x74/0x80
Mar 22 16:25:51 polonium kernel: ? do_syscall_64+0x74/0x180
Mar 22 16:25:51 polonium kernel: ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Mar 22 16:25:51 polonium kernel: Code: 83 ec 28 4c 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 20 31 c0 49 8b 1f 48 c7 44 24 08 00 00 00 00 48 c7 44 24 10 00 00 00 00 <48> 8b 7b 40 48 8d 83 f8 00 00 00 44 0f b6 6b 39 48 c7 44 24 18 
Mar 22 16:25:51 polonium kernel: RIP: nouveau_mem_host+0x3e/0x1b0 [nouveau] RSP: ffffb464c87937f0
Mar 22 16:25:51 polonium kernel: CR2: 0000000000000040
Mar 22 16:25:51 polonium kernel: ---[ end trace bcb4857b1735b79d ]---
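
A note for readers comparing this trace with the first one: frames prefixed with `?` are entries the kernel's stack unwinder could not confirm as part of the call chain, so only the unprefixed frames should be treated as the reliable path. The following is a minimal sketch of separating the two; the `FRAME` pattern and the `split_frames` helper are hypothetical, and `TRACE` holds just a few lines copied from the trace above.

```python
import re

# A few lines copied from the call trace above, for illustration only.
TRACE = """\
Mar 22 16:25:51 polonium kernel: ? ttm_dma_populate+0x20c/0x390 [ttm]
Mar 22 16:25:51 polonium kernel: nv50_sgdma_bind+0x18/0x30 [nouveau]
Mar 22 16:25:51 polonium kernel: ttm_tt_bind+0x44/0x60 [ttm]
Mar 22 16:25:51 polonium kernel: ? nouveau_mem_new+0x2c/0x50 [nouveau]
Mar 22 16:25:51 polonium kernel: ttm_mem_evict_first+0x193/0x200 [ttm]
"""

# Matches "symbol+0xOFFSET/0xSIZE", optionally preceded by "? ".
FRAME = re.compile(r"kernel: (\? )?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def split_frames(text):
    """Return (reliable, uncertain) lists of symbol names, in trace order."""
    reliable, uncertain = [], []
    for line in text.splitlines():
        match = FRAME.search(line)
        if match is None:
            continue  # not a stack frame line
        target = uncertain if match.group(1) else reliable
        target.append(match.group(2))
    return reliable, uncertain


if __name__ == "__main__":
    reliable, uncertain = split_frames(TRACE)
    print("reliable: ", reliable)
    print("uncertain:", uncertain)
```

In both the Fedora 27 and rawhide traces the unprefixed frames near the top (nv50_sgdma_bind, ttm_tt_bind, ttm_bo_handle_move_mem, ttm_bo_evict) are the same, which is why they are treated as the same bug here.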

Comment 5 Wayne Walker 2018-03-26 15:09:50 UTC
(In reply to Wayne Walker from comment #0)
> Description of problem:
> X locks up completely, usually after a few hundred lines output to an xterm.


X locks up completely, usually after a few hundred lines of very rapid output to an xterm.

Comment 6 Wayne Walker 2018-03-31 17:24:48 UTC
I am also seeing this on my notebook, which has an NVIDIA K1000M:

Mar 29 13:34:58 plutonium kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
Mar 29 13:34:58 plutonium kernel: IP: nouveau_mem_host+0x47/0x1b0 [nouveau]
Mar 29 13:34:58 plutonium kernel: PGD 80000007e4cfe067 P4D 80000007e4cfe067 PUD 7e2825067 PMD 0 
Mar 29 13:34:58 plutonium kernel: Oops: 0000 [#1] SMP PTI
Mar 29 13:34:58 plutonium kernel: Modules linked in: ccm rfcomm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack fuse ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables bnep sunrpc snd_hda_codec_hdmi intel_rapl arc4 x86_pkg_temp_thermal intel_powerclamp coretemp mei_wdt iTCO_wdt iTCO_vendor_support iwldvm mac80211 snd_hda_codec_realtek kvm snd_hda_codec_generic btusb snd_hda_intel btrtl btbcm btintel uvcvideo irqbypass snd_hda_codec bluetooth videobuf2_vmalloc snd_hda_core
Mar 29 13:34:58 plutonium kernel: videobuf2_memops iwlwifi videobuf2_v4l2 snd_hwdep videobuf2_core snd_seq intel_cstate intel_uncore snd_seq_device videodev intel_rapl_perf snd_pcm cfg80211 ecdh_generic media i2c_i801 thinkpad_acpi wmi_bmof mei_me snd_timer joydev tpm_tis lpc_ich mei tpm_tis_core snd shpchp soundcore rfkill tpm dm_crypt nouveau mxm_wmi i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ttm e1000e sdhci_pci firewire_ohci sdhci ghash_clmulni_intel drm firewire_core serio_raw ptp mmc_core crc_itu_t pps_core wmi video
Mar 29 13:34:58 plutonium kernel: CPU: 6 PID: 1482 Comm: Xorg Not tainted 4.15.10-300.fc27.x86_64 #1
Mar 29 13:34:58 plutonium kernel: Hardware name: LENOVO 243852U/243852U, BIOS G5ET96WW (2.56 ) 11/27/2013
Mar 29 13:34:58 plutonium kernel: RIP: 0010:nouveau_mem_host+0x47/0x1b0 [nouveau]
Mar 29 13:34:58 plutonium kernel: RSP: 0018:ffffb64e08157800 EFLAGS: 00010246
Mar 29 13:34:58 plutonium kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000002e8f7e000
Mar 29 13:34:58 plutonium kernel: RDX: ffff9e7235769800 RSI: ffff9e723494cf00 RDI: ffffb64e08157958
Mar 29 13:34:58 plutonium kernel: RBP: ffff9e723494c900 R08: ffff9e723494cf58 R09: ffffffffc050a133
Mar 29 13:34:58 plutonium kernel: R10: ffffe81d9fd54f80 R11: 0000000000000908 R12: ffffb64e08157958
Mar 29 13:34:58 plutonium kernel: R13: 0000000000000000 R14: ffff9e723494c900 R15: ffffb64e08157958
Mar 29 13:34:58 plutonium kernel: FS:  00007f615caa5a80(0000) GS:ffff9e725dd80000(0000) knlGS:0000000000000000
Mar 29 13:34:58 plutonium kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 29 13:34:58 plutonium kernel: CR2: 0000000000000040 CR3: 00000007e37be003 CR4: 00000000001606e0
Mar 29 13:34:58 plutonium kernel: Call Trace:
Mar 29 13:34:58 plutonium kernel: nv50_sgdma_bind+0x18/0x30 [nouveau]
Mar 29 13:34:58 plutonium kernel: ttm_tt_bind+0x3f/0x60 [ttm]
Mar 29 13:34:58 plutonium kernel: ttm_bo_handle_move_mem+0x5da/0x610 [ttm]
Mar 29 13:34:58 plutonium kernel: ttm_bo_evict+0x14d/0x330 [ttm]
Mar 29 13:34:58 plutonium kernel: ttm_mem_evict_first+0x161/0x1d0 [ttm]
Mar 29 13:34:58 plutonium kernel: ttm_bo_mem_space+0x344/0x4c0 [ttm]
Mar 29 13:34:58 plutonium kernel: ttm_bo_validate+0xce/0x150 [ttm]
Mar 29 13:34:58 plutonium kernel: ttm_bo_init_reserved+0x385/0x430 [ttm]
Mar 29 13:34:58 plutonium kernel: ttm_bo_init+0x2f/0x90 [ttm]
Mar 29 13:34:58 plutonium kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
Mar 29 13:34:58 plutonium kernel: nouveau_bo_new+0x416/0x590 [nouveau]
Mar 29 13:34:58 plutonium kernel: ? nouveau_bo_invalidate_caches+0x10/0x10 [nouveau]
Mar 29 13:34:58 plutonium kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
Mar 29 13:34:58 plutonium kernel: nouveau_gem_new+0x5d/0x120 [nouveau]
Mar 29 13:34:58 plutonium kernel: nouveau_gem_ioctl_new+0x51/0xd0 [nouveau]
Mar 29 13:34:58 plutonium kernel: drm_ioctl_kernel+0x5b/0xb0 [drm]
Mar 29 13:34:58 plutonium kernel: drm_ioctl+0x2d5/0x370 [drm]
Mar 29 13:34:58 plutonium kernel: ? nouveau_gem_new+0x120/0x120 [nouveau]
Mar 29 13:34:58 plutonium kernel: nouveau_drm_ioctl+0x64/0xc0 [nouveau]
Mar 29 13:34:58 plutonium kernel: do_vfs_ioctl+0xa4/0x620
Mar 29 13:34:58 plutonium kernel: ? __sys_recvmsg+0x4e/0x90
Mar 29 13:34:58 plutonium kernel: ? __sys_recvmsg+0x7d/0x90
Mar 29 13:34:58 plutonium kernel: SyS_ioctl+0x74/0x80
Mar 29 13:34:58 plutonium kernel: do_syscall_64+0x74/0x180
Mar 29 13:34:58 plutonium kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Mar 29 13:34:58 plutonium kernel: RIP: 0033:0x7f6159d9b0f7
Mar 29 13:34:58 plutonium kernel: RSP: 002b:00007fff7e172808 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
Mar 29 13:34:58 plutonium kernel: RAX: ffffffffffffffda RBX: 00000000028c3f40 RCX: 00007f6159d9b0f7
Mar 29 13:34:58 plutonium kernel: RDX: 00007fff7e172860 RSI: 00000000c0306480 RDI: 0000000000000010
Mar 29 13:34:58 plutonium kernel: RBP: 00007fff7e172860 R08: 0000000000000000 R09: 0000000002864018
Mar 29 13:34:58 plutonium kernel: R10: 0000000000000000 R11: 0000000000003246 R12: 00000000c0306480
Mar 29 13:34:58 plutonium kernel: R13: 0000000000000010 R14: 0000000002864018 R15: 0000000001dc4c10
Mar 29 13:34:58 plutonium kernel: Code: 04 25 28 00 00 00 48 89 44 24 20 31 c0 49 8b 1e 48 c7 44 24 08 00 00 00 00 48 c7 44 24 10 00 00 00 00 48 c7 44 24 18 00 00 00 00 <48> 8b 7b 40 48 8d 83 f8 00 00 00 44 0f b6 6b 39 48 89 04 24 48 
Mar 29 13:34:58 plutonium kernel: RIP: nouveau_mem_host+0x47/0x1b0 [nouveau] RSP: ffffb64e08157800
Mar 29 13:34:58 plutonium kernel: CR2: 0000000000000040
Mar 29 13:34:58 plutonium kernel: ---[ end trace 1e4e04e02a03b482 ]---

Comment 7 Wayne Walker 2018-04-03 19:29:49 UTC
In both cases, I have two 4K monitors attached.  This used to work without hangs.  I've used this setup for about 1.5 years, and it has only recently become a problem.

Comment 8 Stefano Biagiotti 2018-04-04 15:03:51 UTC
Created attachment 1417305 [details]
journalctl -k

Same bug here, though on a different GPU family, with kernel-4.15.13-300.fc27 and xorg-x11-drv-nouveau-1.0.15-3.fc27.

Kernel-4.14.16-300.fc27.x86_64 works fine.

I have two monitors connected to this card (from lspci):
01:00.0 VGA compatible controller: NVIDIA Corporation G98 [GeForce 8400 GS Rev. 2] (rev a1)

The freeze happens unpredictably, but often right after login from lightdm.

When the freeze happens the mouse pointer is still alive (I can move it around), and I can use ssh to log into the system and reboot.

Comment 9 Wayne Walker 2018-04-04 16:45:15 UTC
Thanks Stefano.

I rolled back the kernel to 4.13.9-300.fc27.x86_64 and the problem seems fixed.

Comment 10 Wayne Walker 2018-04-05 16:11:41 UTC
I spoke too soon. I have had 3 lock ups today on the 4.13.9-300.fc27.x86_64 kernel.

Stefano - is Kernel-4.14.16-300.fc27.x86_64 still working for you?  Where did you get that package?

Comment 11 Stefano Biagiotti 2018-04-07 13:48:00 UTC
(In reply to Wayne Walker from comment #10)
> I spoke too soon. I have had 3 lock ups today on the 4.13.9-300.fc27.x86_64
> kernel.
> 
> Stefano - is Kernel-4.14.16-300.fc27.x86_64 still working for you?  Where
> did you get that package?

Yes, it is. The package was installed some time ago by a regular dnf update, and it has survived subsequent updates.

You may want to try this (found after some googling):
 https://koji.fedoraproject.org/koji/buildinfo?buildID=1022743

Comment 12 Wayne Walker 2018-04-08 14:57:02 UTC
Thank you Stefano!

My primary workstation is a Lenovo W530 with a K1000M card.

When using the main LCD display (1080p), it has only locked up once this week.
With the external 4K monitors (two at 3840x2160), it locks up after anywhere from one minute to a couple of hours.
Running the 4K monitors at plain HD resolution for the last 24 hours: no lockups. :-)

Comment 13 Wayne Walker 2018-04-12 02:17:02 UTC
More lockups at plain HD.

Comment 14 Stefano Biagiotti 2018-04-13 09:55:51 UTC
Created attachment 1421305 [details]
journalctl -k

Bug still present with kernel-4.15.15-300.fc27.x86_64.

Comment 15 Andreas Loening 2018-04-24 11:50:56 UTC
Bug still present with kernel 4.15.17-300.fc27.x86_64

Comment 16 Andreas Loening 2018-05-01 04:39:34 UTC
Bug still present with kernel 4.16.3-200.fc27.x86_64

Comment 17 Stefano Biagiotti 2018-05-09 15:18:59 UTC
This should fix the nouveau_mem_host issue:
https://bugs.freedesktop.org/show_bug.cgi?id=105687#c6

Comment 18 Stefano Biagiotti 2018-05-10 10:47:08 UTC
Created attachment 1434302 [details]
Output of journalctl -k -b -1 --no-hostname --no-pager in attachment.

Bug still present with kernel-4.16.6-202.fc27.x86_64.

Comment 19 Wayne Walker 2018-05-11 20:00:32 UTC
Does anyone know where to get a trustworthy RPM that has the patch referred to in https://bugs.freedesktop.org/show_bug.cgi?id=105687#c6 ?

Comment 20 Stefano Biagiotti 2018-05-15 15:09:33 UTC
Created attachment 1436813 [details]
Output of journalctl -k -b -1 --no-hostname --no-pager in attachment.

Bug still present in kernel-4.16.7-200.fc27.x86_64.

Comment 21 Jeremy Cline 2018-05-29 15:51:24 UTC
*** Bug 1575721 has been marked as a duplicate of this bug. ***

Comment 22 Jeremy Cline 2018-05-29 15:52:18 UTC
Hi folks,

The patch referenced in https://bugs.freedesktop.org/show_bug.cgi?id=105687#c6 is included in v4.16.9. Can you please test the latest kernels available in updates and report back? Thanks!
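
For anyone unsure whether an installed kernel is new enough to test, here is a minimal sketch that compares a Fedora kernel release string against 4.16.9. The names `FIXED_IN`, `kernel_version` and `has_fix` are hypothetical, and the check assumes the patch is present in every release at or above 4.16.9 and was not backported to an earlier Fedora build.

```python
import re

# Upstream stable release said to include the patch (from the comment above).
FIXED_IN = (4, 16, 9)


def kernel_version(release):
    """Parse the leading 'X.Y.Z' of a release string such as `uname -r` prints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(release):
    """True if the release is at or above FIXED_IN (assumption: no backports)."""
    return kernel_version(release) >= FIXED_IN


if __name__ == "__main__":
    # Kernel releases mentioned in this bug.
    for release in (
        "4.15.17-300.fc27.x86_64",
        "4.16.0-0.rc4.git0.1.fc28.x86_64",
        "4.16.7-200.fc27.x86_64",
        "4.16.9-300.fc27.x86_64",
    ):
        print(release, "->", "has fix" if has_fix(release) else "predates fix")
```

To check the running system instead, pass `platform.release()` from the standard library to `has_fix`.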

Comment 23 Wayne Walker 2018-05-29 21:32:52 UTC
I'm running 4.16.9-300 with two external 4K monitors attached again, and I can't trigger a failure.  Thank you.

Comment 24 Jeremy Cline 2018-05-30 13:20:13 UTC
Thanks for confirming.