Description of problem:
After upgrading to Fedora 26 on a 2012 27" iMac with the GTX 680MX GPU (default nouveau driver), logging in to GNOME under Wayland causes the screen to fill with artifacts and become unresponsive. I can still SSH in and reboot, but I cannot recover the console. Falling back to GNOME under Xorg works fine. Wayland worked under Fedora 25. To my knowledge no special GNOME extensions are enabled, and the desktop is pretty standard (this is just a file server where the desktop is seldom used).

Version-Release number of selected component (if applicable):
Fedora 26

How reproducible:
Completely reproducible as of the date of this filing.

Steps to Reproduce:
1. Boot to GDM on an iMac 27" 2012 with a GTX 680MX
2. Use the default GNOME (with Wayland) session
3. Screen hard-locks with artifacts

Actual results:
Hard-lock on console with artifacts on the screen.

Expected results:
Normal GNOME session as with Fedora 25.

Additional info:
Here is a log extract (note that the log fills up extremely fast with these messages):

17:34:46 hostname kernel: nouveau 0000:01:00.0: Xwayland[1901]: channel 19 killed!
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: read fault at 0000011000 engine 07 [HOST0] client 06 [HOST] reason 0c [UNSUPPORTED_KIND] on channel 25 [007e198000 Xwayland[1901]]
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: channel 25: killed
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: Xwayland[1901]: channel 25 killed!
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: read fault at 0000011000 engine 07 [HOST0] client 06 [HOST] reason 0c [UNSUPPORTED_KIND] on channel 27 [007df9a000 Xwayland[1901]]
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: channel 27: killed
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: Xwayland[1901]: channel 27 killed!
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80000000 [SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 7 mthd 3ffc data ffffffff
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80000000 [SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 7 mthd 3ffc data ffffffff
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80046000 [GPFIFO GPPTR PBENTRY SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 7 mthd 3ffc data ffffffff
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80046000 [GPFIFO GPPTR PBENTRY SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 7 mthd 3ffc data ffffffff
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80046000 [GPFIFO GPPTR PBENTRY SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 0 mthd 0000 data 00000000
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80006000 [GPFIFO GPPTR SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 0 mthd 0000 data 00000000
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80006000 [GPFIFO GPPTR SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 0 mthd 0000 data 00000000
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80006000 [GPFIFO GPPTR SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 0 mthd 0000 data 00000000
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80006000 [GPFIFO GPPTR SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 0 mthd 0000 data 00000000
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80006000 [GPFIFO GPPTR SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 0 mthd 0000 data 00000000
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80006000 [GPFIFO GPPTR SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 0 mthd 0000 data 00000000
Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: PBDMA0: 80006000 [GPFIFO GPPTR SIGNATURE] ch 28 [007dbd3000 Xwayland[1901]] subc 0 mthd 0000 data 00000000
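Since the log floods with near-identical messages, it can help to collapse the read faults before reading the rest. The following is a minimal Python sketch, assuming journal lines shaped like the extract above. The regular expression and the name `summarize_faults` are illustrative only and are not part of nouveau or any existing tool.

```python
import re
from collections import Counter

# Pattern for the nouveau "read fault" lines quoted in this report.
# The field layout is taken from those lines only (an assumption about
# this kernel version's message format, not a stable interface).
FAULT_RE = re.compile(
    r"fifo: read fault at (?P<addr>[0-9a-f]+) "
    r"engine (?P<engine>[0-9a-f]+) \[(?P<engine_name>[^\]]*)\] "
    r"client (?P<client>[0-9a-f]+) \[(?P<client_name>[^\]]*)\] "
    r"reason (?P<reason>[0-9a-f]+) \[(?P<reason_name>[^\]]*)\] "
    r"on channel (?P<chan>-?\d+)"
)


def summarize_faults(lines):
    """Count read faults grouped by (reason name, faulting address)."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[(match["reason_name"], match["addr"])] += 1
    return counts


# Sample lines copied from the log extract in this report.
sample = [
    "Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: read fault at 0000011000 engine 07 [HOST0] client 06 [HOST] reason 0c [UNSUPPORTED_KIND] on channel 25 [007e198000 Xwayland[1901]]",
    "Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: channel 25: killed",
    "Jul 13 17:34:46 hostname kernel: nouveau 0000:01:00.0: fifo: read fault at 0000011000 engine 07 [HOST0] client 06 [HOST] reason 0c [UNSUPPORTED_KIND] on channel 27 [007df9a000 Xwayland[1901]]",
]

for (reason, addr), count in summarize_faults(sample).items():
    print(f"{count} x read fault at {addr}: {reason}")
```

Grouping by reason and address rather than by channel makes it easier to see that successive channels fault the same way at the same address.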
Hey all,

Apparently I am hitting this too. Falling back to GNOME on X works fine, but the system seems to lock up under Wayland. I did some vmcore analysis, but I am extremely unfamiliar with the nvidia territory of the code, so the following may be useful or useless.

The best I can ascertain, from my limited knowledge of how we interact with nvidia, is that Xwayland is attempting a method on an object bound to a specific subchannel as part of an interrupt handling sequence. My kernel ring buffer is inundated with these messages, and the values stop changing after the first few, which implies the same method is being spammed. However, when switching from X to Wayland by logging out and back in, I also see a couple of errors about read faults, channels being killed, and something with disp reporting an unknown error. This leads me to believe a vmcore will not suffice: I am looking at the result of something that has already gone awry, and I haven't caught it in the act.

I can confirm the card worked with multiple monitors before the update from Fedora 25 to 26; the problem occurred only afterwards. Unfortunately, no other kernels work with Wayland now, as they all exhibit this issue. The nvidia modules are not, and have never been, installed on any kernel.

I am up for troubleshooting if someone can give guidance. I've seen 'nomodeset' thrown around on the internet along with a few other options; if something specific would be useful here, I can certainly try it. I've also thought about installing the F27 beta to see whether the issue reproduces. I am not super keen on that approach, but I am open to installing a few beta packages that may be useful to try.

I can provide my vmcore as well if needed. I have no info that needs scrubbing from it. Below is my analysis.

      KERNEL: /usr/lib/debug/lib/modules/4.11.10-300.fc26.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2017-07-21-20:15:51/vmcore  [PARTIAL DUMP]
        CPUS: 4
        DATE: Fri Jul 21 20:15:15 2017
      UPTIME: 00:41:05
LOAD AVERAGE: 0.63, 0.27, 0.21
       TASKS: 726
    NODENAME: eden
     RELEASE: 4.11.10-300.fc26.x86_64
     VERSION: #1 SMP Wed Jul 12 17:05:39 UTC 2017
     MACHINE: x86_64  (3297 Mhz)
      MEMORY: 16 GB
       PANIC: "sysrq: SysRq : Trigger a crash"
         PID: 1105
     COMMAND: "gnome-shell"
        TASK: ffff9e9e5d6c0000  [THREAD_INFO: ffff9e9e5d6c0000]
         CPU: 0
       STATE: TASK_RUNNING (SYSRQ)

crash> sys -i | grep -i -e bios -e board
        DMI_BIOS_VENDOR: American Megatrends Inc.
       DMI_BIOS_VERSION: 2205
          DMI_BIOS_DATE: 02/12/2015
       DMI_BOARD_VENDOR: ASUSTeK COMPUTER INC.
         DMI_BOARD_NAME: Z97-A
      DMI_BOARD_VERSION: Rev 1.xx

crash> mod -t
no tainted modules

> lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 960] (rev a1)

[I.0] Hardware overview:
  - Physical with Z97-A motherboard
  - NVIDIA GeForce GTX 960
  - System was up for a minute or so before the crash.
I was jumping around to ensure the system would crash before initiating the crash crash> log - - - - - - - - - - - [SNIP] - - - - - - - - - - - [ 0.258760] pci 0000:01:00.0: vgaarb: setting as boot VGA device [ 0.258912] pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none [ 0.259171] pci 0000:01:00.0: vgaarb: bridge control possible - - - - - - - - - - - [SNIP] - - - - - - - - - - - [ 0.276778] system 00:01: [io 0x0800-0x087f] has been reserved [ 0.276931] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active) - - - - - - - - - - - [SNIP] - - - - - - - - - - - [ 0.287193] pci_bus 0000:01: resource 0 [io 0xe000-0xefff] [ 0.287194] pci_bus 0000:01: resource 1 [mem 0xde000000-0xdf0fffff] [ 0.287195] pci_bus 0000:01: resource 2 [mem 0xc0000000-0xd1ffffff 64bit pref] - - - - - - - - - - - [SNIP] - - - - - - - - - - - [ 1.324484] nouveau 0000:01:00.0: bios: version 84.06.32.00.27 [ 1.324803] nouveau 0000:01:00.0: disp: dcb 15 type 8 unknown [ 1.325491] nouveau 0000:01:00.0: fb: 2048 MiB GDDR5 [ 1.325662] nouveau 0000:01:00.0: bus: MMIO write of 800000f0 FAULT at 10eb14 [ IBUS ] - - - - - - - - - - - [SNIP] - - - - - - - - - - - [ 1.333635] nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB [ 1.333838] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB [ 1.333997] nouveau 0000:01:00.0: DRM: TMDS table version 2.0 [ 1.334148] nouveau 0000:01:00.0: DRM: DCB version 4.1 [ 1.334308] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030 [ 1.334460] nouveau 0000:01:00.0: DRM: DCB outp 01: 02000f00 00000000 [ 1.334611] nouveau 0000:01:00.0: DRM: DCB outp 02: 02811f76 04400020 [ 1.334767] nouveau 0000:01:00.0: DRM: DCB outp 03: 02011f72 00020020 [ 1.334926] nouveau 0000:01:00.0: DRM: DCB outp 04: 04822f86 04400010 [ 1.335084] nouveau 0000:01:00.0: DRM: DCB outp 05: 04022f82 00020010 [ 1.335242] nouveau 0000:01:00.0: DRM: DCB outp 06: 04833f96 04400020 [ 1.335401] nouveau 0000:01:00.0: DRM: DCB outp 07: 04033f92 00020020 [ 1.338050] nouveau 0000:01:00.0: 
DRM: DCB outp 08: 02044f62 00020010 [ 1.338208] nouveau 0000:01:00.0: DRM: DCB outp 15: 01df5ff8 00000000 [ 1.338360] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001030 [ 1.338511] nouveau 0000:01:00.0: DRM: DCB conn 01: 00020146 [ 1.338667] nouveau 0000:01:00.0: DRM: DCB conn 02: 01000246 [ 1.338824] nouveau 0000:01:00.0: DRM: DCB conn 03: 02000346 [ 1.338982] nouveau 0000:01:00.0: DRM: DCB conn 04: 00010461 [ 1.339140] nouveau 0000:01:00.0: DRM: DCB conn 05: 00000570 [ 1.339298] nouveau 0000:01:00.0: DRM: Pointer to flat panel table invalid [ 1.407607] nouveau 0000:01:00.0: DRM: unknown connector type 70 [ 1.407795] nouveau 0000:01:00.0: DRM: failed to create encoder 1/8/0: -19 [ 1.407958] nouveau 0000:01:00.0: DRM: Unknown-1 has no encoders, removing - - - - - - - - - - - [SNIP] - - - - - - - - - - - [ 1.578921] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies [ 1.648874] nouveau 0000:01:00.0: priv: GPC0: 419df4 00000000 (1840820e) [ 1.649041] nouveau 0000:01:00.0: priv: GPC1: 419df4 00000000 (1840820e) - - - - - - - - - - - [SNIP] - - - - - - - - - - - [ 1.757805] nouveau 0000:01:00.0: DRM: allocated 2560x1440 fb: 0x60000, bo ffff9e9e66d70000 [ 1.858899] nouveau 0000:01:00.0: disp: 0x5c73[0]: INIT_GENERIC_CONDITON: unknown 0x07 [ 2.028270] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device - - - - - - - - - - - [SNIP] - - - - - - - - - - - [ 3.159573] ACPI Warning: SystemIO range 0x0000000000001828-0x000000000000182F conflicts with OpRegion 0x0000000000001800-0x000000000000187F (\PMIO) (20170119/utaddress-247) [ 3.159587] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver [ 3.159593] ACPI Warning: SystemIO range 0x0000000000001C40-0x0000000000001C4F conflicts with OpRegion 0x0000000000001C00-0x0000000000001FFF (\GPR) (20170119/utaddress-247) [ 3.159599] ACPI Warning: SystemIO range 0x0000000000001C40-0x0000000000001C4F conflicts with OpRegion 0x0000000000001C00-0x0000000000001C7F (\_GPE.GPBX) 
(20170119/utaddress-247) [ 3.159608] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver [ 3.159613] ACPI Warning: SystemIO range 0x0000000000001C30-0x0000000000001C3F conflicts with OpRegion 0x0000000000001C00-0x0000000000001C3F (\GPRL) (20170119/utaddress-247) [ 3.159619] ACPI Warning: SystemIO range 0x0000000000001C30-0x0000000000001C3F conflicts with OpRegion 0x0000000000001C00-0x0000000000001FFF (\GPR) (20170119/utaddress-247) [ 3.159624] ACPI Warning: SystemIO range 0x0000000000001C30-0x0000000000001C3F conflicts with OpRegion 0x0000000000001C00-0x0000000000001C7F (\_GPE.GPBX) (20170119/utaddress-247) [ 3.159630] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver [ 3.159633] ACPI Warning: SystemIO range 0x0000000000001C00-0x0000000000001C2F conflicts with OpRegion 0x0000000000001C00-0x0000000000001C3F (\GPRL) (20170119/utaddress-247) [ 3.159639] ACPI Warning: SystemIO range 0x0000000000001C00-0x0000000000001C2F conflicts with OpRegion 0x0000000000001C00-0x0000000000001FFF (\GPR) (20170119/utaddress-247) [ 3.159644] ACPI Warning: SystemIO range 0x0000000000001C00-0x0000000000001C2F conflicts with OpRegion 0x0000000000001C00-0x0000000000001C7F (\_GPE.GPBX) (20170119/utaddress-247) - - - - - - - - - - - [SNIP] - - - - - - - - - - - [ 8.662027] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 8.662067] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready [ 21.024303] systemd-journald[572]: File /var/log/journal/fe0cee7971a44dbb824922b171e63813/user-1000.journal corrupted or uncleanly shut down, renaming and replacing. 
[ 25.373328] rfkill: input handler disabled [ 26.635583] ISO 9660 Extensions: Microsoft Joliet Level 3 [ 26.641173] ISOFS: changing to secondary root [ 2449.110087] rfkill: input handler enabled [ 2449.398957] nouveau 0000:01:00.0: disp: 0x5c73[0]: INIT_GENERIC_CONDITON: unknown 0x07 [ 2449.448983] nouveau 0000:01:00.0: disp: 0x5c73[0]: INIT_GENERIC_CONDITON: unknown 0x07 [ 2464.344588] rfkill: input handler disabled [ 2464.655497] nouveau 0000:01:00.0: fifo: read fault at 0000011000 engine 07 [HOST0] client 06 [HOST] reason 0c [UNSUPPORTED_KIND] on channel 24 [007c994000 Xwayland[15818]] [ 2464.655507] nouveau 0000:01:00.0: fifo: channel 24: killed [ 2464.655509] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery [ 2464.655515] nouveau 0000:01:00.0: Xwayland[15818]: channel 24 killed! [ 2464.688477] nouveau 0000:01:00.0: fifo: read fault at 0000011000 engine 07 [HOST0] client 06 [HOST] reason 02 [PTE] on channel 25 [007c2ef000 Xwayland[15818]] [ 2464.688487] nouveau 0000:01:00.0: fifo: channel 25: killed [ 2464.688490] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery [ 2464.725441] nouveau 0000:01:00.0: Xwayland[15818]: channel 25 killed! 
[ 2464.725476] nouveau 0000:01:00.0: fifo: read fault at 5555555000 engine 1f [] client 07 [HOST_CPU] reason 0d [REGION_VIOLATION] on channel -1 [0000000000 unknown] [ 2464.725486] nouveau 0000:01:00.0: fifo: PBDMA0: 80044000 [GPPTR PBENTRY SIGNATURE] ch 26 [007c2de000 Xwayland[15818]] subc 5 mthd 1554 data 55555555 [ 2464.725506] nouveau 0000:01:00.0: fifo: PBDMA0: 00044000 [GPPTR PBENTRY] ch 26 [007c2de000 Xwayland[15818]] subc 0 mthd 0000 data 00000000 [ 2464.725524] nouveau 0000:01:00.0: fifo: PBDMA0: 00044000 [GPPTR PBENTRY] ch 26 [007c2de000 Xwayland[15818]] subc 0 mthd 0000 data 00000000 [ 2464.725543] nouveau 0000:01:00.0: fifo: PBDMA0: 00044000 [GPPTR PBENTRY] ch 26 [007c2de000 Xwayland[15818]] subc 0 mthd 0000 data 00000000 [ 2464.725562] nouveau 0000:01:00.0: fifo: PBDMA0: 00004000 [GPPTR] ch 26 [007c2de000 Xwayland[15818]] subc 0 mthd 0000 data 00000000 [ 2464.725582] nouveau 0000:01:00.0: fifo: PBDMA0: 00004000 [GPPTR] ch 26 [007c2de000 Xwayland[15818]] subc 0 mthd 0000 data 00000000 [ 2464.725601] nouveau 0000:01:00.0: fifo: PBDMA0: 00004000 [GPPTR] ch 26 [007c2de000 Xwayland[15818]] subc 0 mthd 0000 data 00000000 [ 2464.725620] nouveau 0000:01:00.0: fifo: PBDMA0: 00004000 [GPPTR] ch 26 [007c2de000 Xwayland[15818]] subc 0 mthd 0000 data 00000000 - - - - - - - - - - - [SNIP] - - - - - - - - - - - [ 2465.317314] nouveau 0000:01:00.0: fifo: PBDMA0: 00004000 [GPPTR] ch 26 [007c2de000 Xwayland[15818]] subc 0 mthd 0000 data 00000000 [ 2465.317326] nouveau 0000:01:00.0: fifo: PBDMA0: 00004000 [GPPTR] ch 26 [007c2de000 Xwayland[15818]] subc 0 mthd 0000 data 00000000 [ 2465.317339] nouveau 0000:01:00.0: fifo: PBDMA0: 00004000 [GPPTR] ch 26 [007c2de000 Xwayland[15818]] subc 0 mthd 0000 data 00000000 [ 2465.317354] sysrq: SysRq : Trigger a crash [I.1] See a fair bit of logging from nouveau and not sure what it is. 
Also some ACPI memory reservation conflicts PID: 1105 TASK: ffff9e9e5d6c0000 CPU: 0 COMMAND: "gnome-shell" __handle_sysrq -> machine_kexec bt: cannot transition from IRQ stack to current process stack: IRQ stack pointer: ffff9e9e7dc036f8 process stack pointer: ffffffff9c878afe current stack base: ffffc31c82c84000 PID: 16280 TASK: ffff9e9e24d44b00 CPU: 1 COMMAND: "tracker-extract" <userspace> -> do_nmi -> crash_nmi_callback PID: 16336 TASK: ffff9e9e68ed2580 CPU: 2 COMMAND: "pkla-check-auth" <userspace> -> do_nmi -> crash_nmi_callback PID: 572 TASK: ffff9e9e66e24b00 CPU: 3 COMMAND: "systemd-journal" do_filp_open -> do_dentry_open -> ext4_xattr_security_get -> __bpf_prog_run+2070 -> do_nmi -> crash_nmi_callback crash> runq CPU 0 RUNQUEUE: ffff9e9e7dc195c0 CURRENT: PID: 1105 TASK: ffff9e9e5d6c0000 COMMAND: "gnome-shell" RT PRIO_ARRAY: ffff9e9e7dc19770 [no tasks queued] CFS RB_ROOT: ffff9e9e7dc19658 [120] PID: 16334 TASK: ffff9e9cb4afcb00 COMMAND: "pool" [120] PID: 1 TASK: ffff9e9e6b64a580 COMMAND: "systemd" [120] PID: 2545 TASK: ffff9e9e5d724b00 COMMAND: "gdbus" [120] PID: 16268 TASK: ffff9e9cb3004b00 COMMAND: "systemd-localed" [120] PID: 16275 TASK: ffff9e9d9f2da580 COMMAND: "tracker-miner-a" CPU 1 RUNQUEUE: ffff9e9e7dc995c0 CURRENT: PID: 16280 TASK: ffff9e9e24d44b00 COMMAND: "tracker-extract" RT PRIO_ARRAY: ffff9e9e7dc99770 [no tasks queued] CFS RB_ROOT: ffff9e9e7dc99658 [no tasks queued] CPU 2 RUNQUEUE: ffff9e9e7dd195c0 CURRENT: PID: 16336 TASK: ffff9e9e68ed2580 COMMAND: "pkla-check-auth" RT PRIO_ARRAY: ffff9e9e7dd19770 [no tasks queued] CFS RB_ROOT: ffff9e9e7dd19658 [120] PID: 15715 TASK: ffff9e9e46a0a580 COMMAND: "gnome-session-b" [120] PID: 16274 TASK: ffff9e9d9f2d8000 COMMAND: "gdbus" [120] PID: 16308 TASK: ffff9e9e25fa4b00 COMMAND: "gdbus" [120] PID: 16333 TASK: ffff9e9e68ed4b00 COMMAND: "xdg-user-dirs-g" CPU 3 RUNQUEUE: ffff9e9e7dd995c0 CURRENT: PID: 572 TASK: ffff9e9e66e24b00 COMMAND: "systemd-journal" RT PRIO_ARRAY: ffff9e9e7dd99770 [no tasks queued] CFS 
RB_ROOT: ffff9e9e7dd99658 [no tasks queued] [I.2] CPU 0 handled the sysrq interrupt, CPUs 1/2 are in userspace and received the nmi to crash, and CPU 3 was attempting to open a directory on an ext4 fs and entered an eBPF program None of the runqueues are saturated with processes and it looks like Xwayland is not running or queued to run. crash> ps -S RU: 17 IN: 708 ZO: 1 crash> ps -m | grep RU | grep -v swapper [0 00:00:00.000] [RU] PID: 16336 TASK: ffff9e9e68ed2580 CPU: 2 COMMAND: "pkla-check-auth" [0 00:00:00.000] [RU] PID: 16280 TASK: ffff9e9e24d44b00 CPU: 1 COMMAND: "tracker-extract" [0 00:00:00.001] [RU] PID: 16333 TASK: ffff9e9e68ed4b00 CPU: 2 COMMAND: "xdg-user-dirs-g" [0 00:00:00.001] [RU] PID: 16274 TASK: ffff9e9d9f2d8000 CPU: 2 COMMAND: "gdbus" [0 00:00:00.001] [RU] PID: 16308 TASK: ffff9e9e25fa4b00 CPU: 2 COMMAND: "gdbus" [0 00:00:00.004] [RU] PID: 572 TASK: ffff9e9e66e24b00 CPU: 3 COMMAND: "systemd-journal" [0 00:00:00.003] [RU] PID: 1105 TASK: ffff9e9e5d6c0000 CPU: 0 COMMAND: "gnome-shell" [0 00:00:00.003] [RU] PID: 16334 TASK: ffff9e9cb4afcb00 CPU: 0 COMMAND: "pool" [0 00:00:00.003] [RU] PID: 16275 TASK: ffff9e9d9f2da580 CPU: 0 COMMAND: "tracker-miner-a" [0 00:00:00.006] [RU] PID: 15715 TASK: ffff9e9e46a0a580 CPU: 2 COMMAND: "gnome-session-b" [0 00:00:00.005] [RU] PID: 16268 TASK: ffff9e9cb3004b00 CPU: 0 COMMAND: "systemd-localed" [0 00:00:00.009] [RU] PID: 2545 TASK: ffff9e9e5d724b00 CPU: 0 COMMAND: "gdbus" [0 00:00:00.012] [RU] PID: 1 TASK: ffff9e9e6b64a580 CPU: 0 COMMAND: "systemd" crash> ps -m | grep -i wayland [0 00:00:00.024] [IN] PID: 15818 TASK: ffff9e9e258aa580 CPU: 2 COMMAND: "Xwayland" [0 00:00:04.455] [IN] PID: 1180 TASK: ffff9e9e5e1da580 CPU: 1 COMMAND: "Xwayland" [0 00:00:04.915] [IN] PID: 15709 TASK: ffff9e9e66a58000 CPU: 0 COMMAND: "gdm-wayland-ses" [0 00:40:59.654] [IN] PID: 1076 TASK: ffff9e9e5ce82580 CPU: 0 COMMAND: "gdm-wayland-ses" crash> bt 15818 PID: 15818 TASK: ffff9e9e258aa580 CPU: 2 COMMAND: "Xwayland" #0 [ffffc31c8250fd08] 
__schedule at ffffffff9c870064 #1 [ffffc31c8250fda0] schedule at ffffffff9c870736 #2 [ffffc31c8250fdb8] schedule_hrtimeout_range_clock at ffffffff9c874b49 #3 [ffffc31c8250fe48] schedule_hrtimeout_range at ffffffff9c874c33 #4 [ffffc31c8250fe58] ep_poll at ffffffff9c2b825a #5 [ffffc31c8250ff10] sys_epoll_wait at ffffffff9c2b9dfe #6 [ffffc31c8250ff50] entry_SYSCALL_64_fastpath at ffffffff9c875ff7 RIP: 00007f2d302960f3 RSP: 00007ffd967500b0 RFLAGS: 00000293 RAX: ffffffffffffffda RBX: 0000000002a2fde0 RCX: 00007f2d302960f3 RDX: 0000000000000100 RSI: 00007ffd967500c0 RDI: 0000000000000007 RBP: 0000000002688d80 R8: 0000000000000002 R9: 0000000000000000 R10: 0000000000091649 R11: 0000000000000293 R12: 00000000025f9eb0 R13: 0000000000000000 R14: 000000000082b060 R15: 0000000002a2fde0 ORIG_RAX: 00000000000000e8 CS: 0033 SS: 002b [I.3] Ah it is asleep waiting on events. Let's check where the error message is printed and what conditions it could be printed in > grep -ir 'subc.*mthd' (1) ./drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c: if (nvkm_sw_mthd(device->sw, chid, subc, mthd, data)) ./drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c: "subc %d mthd %04x data %08x\n", ./drivers/gpu/drm/nouveau/nvkm/engine/fifo/gf100.c: subc, mthd, data); (2) ./drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c: if (nvkm_sw_mthd(device->sw, chid, subc, mthd, data)) ./drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c: "subc %d mthd %04x data %08x\n", ./drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c: subc, mthd, data); (3) ./drivers/gpu/drm/nouveau/nvkm/engine/fifo/nv04.c: handled = nvkm_sw_mthd(sw, chid, subc, mthd, data); ./drivers/gpu/drm/nouveau/nvkm/engine/fifo/nv04.c: "ch %d [%s] subc %d mthd %04x data %08x\n", > grep -ir -e GPPTR -e PBENTRY . 
./drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c:	{ 0x00004000, "GPPTR" },
./drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c:	{ 0x00040000, "PBENTRY" },

[I.4] We are almost certainly looking at gk104.c since the "msg" includes GPPTR and PBENTRY.
      Now walk the chain up to see what calls us.

./drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c:

static void gk104_fifo_intr_pbdma_0(struct gk104_fifo *fifo, int unit)
{
	struct nvkm_subdev *subdev = &fifo->base.engine.subdev;
	struct nvkm_device *device = subdev->device;
	u32 mask = nvkm_rd32(device, 0x04010c + (unit * 0x2000));
	u32 stat = nvkm_rd32(device, 0x040108 + (unit * 0x2000)) & mask;
	u32 addr = nvkm_rd32(device, 0x0400c0 + (unit * 0x2000));
	u32 data = nvkm_rd32(device, 0x0400c4 + (unit * 0x2000));
	u32 chid = nvkm_rd32(device, 0x040120 + (unit * 0x2000)) & 0xfff;
	u32 subc = (addr & 0x00070000) >> 16;
	u32 mthd = (addr & 0x00003ffc);
	u32 show = stat;
	struct nvkm_fifo_chan *chan;
	unsigned long flags;
	char msg[128];

	if (stat & 0x00800000) {
		if (device->sw) {
			if (nvkm_sw_mthd(device->sw, chid, subc, mthd, data))
				show &= ~0x00800000;
		}
	}

	nvkm_wr32(device, 0x0400c0 + (unit * 0x2000), 0x80600008);

	if (show) {
		nvkm_snprintbf(msg, sizeof(msg), gk104_fifo_pbdma_intr_0, show);
		chan = nvkm_fifo_chan_chid(&fifo->base, chid, &flags);
		nvkm_error(subdev, "PBDMA%d: %08x [%s] ch %d [%010llx %s] "      | Error
				   "subc %d mthd %04x data %08x\n",              | is
			   unit, show, msg, chid, chan ? chan->inst->addr : 0,   | printed
			   chan ? chan->object.client->name : "unknown",         | here
			   subc, mthd, data);                                    |
		nvkm_fifo_chan_put(&fifo->base, flags, &chan);
	}

	nvkm_wr32(device, 0x040108 + (unit * 0x2000), stat);
}

./drivers/gpu/drm/nouveau/nvkm/engine/fifo/gk104.c:

static void gk104_fifo_intr(struct nvkm_fifo *base)
{
	struct gk104_fifo *fifo = gk104_fifo(base);
	struct nvkm_subdev *subdev = &fifo->base.engine.subdev;
	struct nvkm_device *device = subdev->device;
	u32 mask = nvkm_rd32(device, 0x002140);
	u32 stat = nvkm_rd32(device, 0x002100) & mask;
- - - - - - - - - - - [SNIP] - - - - - - - - - - -
	if (stat & 0x20000000) {
		u32 mask = nvkm_rd32(device, 0x0025a0);
		while (mask) {
			u32 unit = __ffs(mask);
			gk104_fifo_intr_pbdma_0(fifo, unit);    <--- this is the only place
			gk104_fifo_intr_pbdma_1(fifo, unit);         gk104_fifo_intr_pbdma_0 is called
			nvkm_wr32(device, 0x0025a0, (1 << unit));
			mask &= ~(1 << unit);
		}
		stat &= ~0x20000000;
	}
- - - - - - - - - - - [SNIP] - - - - - - - - - - -

static const struct nvkm_fifo_func gk104_fifo_ = {
	.dtor = gk104_fifo_dtor,
	.oneinit = gk104_fifo_oneinit,
	.init = gk104_fifo_init,
	.fini = gk104_fifo_fini,
	.intr = gk104_fifo_intr,    <--- passed in as a control block. Interrupt handler?
	                                 Maybe this is the command pushed to the fifo?
	.uevent_init = gk104_fifo_uevent_init,
	.uevent_fini = gk104_fifo_uevent_fini,
	.recover_chan = gk104_fifo_recover_chan,
	.class_get = gk104_fifo_class_get,

And the only location it is worked with is the following:

int gk104_fifo_new_(const struct gk104_fifo_func *func, struct nvkm_device *device,
		    int index, int nr, struct nvkm_fifo **pfifo)
{
	struct gk104_fifo *fifo;

	if (!(fifo = kzalloc(sizeof(*fifo), GFP_KERNEL)))
		return -ENOMEM;
	fifo->func = func;
	INIT_WORK(&fifo->recover.work, gk104_fifo_recover_work);
	*pfifo = &fifo->base;

	return nvkm_fifo_ctor(&gk104_fifo_, device, index, nr, &fifo->base);
}

Not sure where to go from here.
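To make the PBDMA messages easier to read, the bit decoding that gk104_fifo_intr_pbdma_0 performs can be mirrored in a few lines of Python. This is only a sketch under stated assumptions. The GPPTR and PBENTRY values come from the grep output above, while the GPFIFO and SIGNATURE values are inferred from the logged masks (80006000, 80046000, 80000000) rather than read from the driver table. The raw `addr` values in the example are hypothetical, chosen so that they reproduce the subc/mthd pairs seen in the logs, and `decode_status` and `split_addr` are made-up helper names.

```python
# Bit names for the PBDMA interrupt status word.
# GPPTR and PBENTRY: from the gk104.c grep output quoted above.
# GPFIFO and SIGNATURE: inferred from the logged masks (assumption).
PBDMA_INTR_0_BITS = {
    0x00002000: "GPFIFO",
    0x00004000: "GPPTR",
    0x00040000: "PBENTRY",
    0x80000000: "SIGNATURE",
}


def decode_status(show):
    """Return (names of known set bits, remaining unknown bits)."""
    names = []
    leftover = show
    for bit, name in sorted(PBDMA_INTR_0_BITS.items()):
        if show & bit:
            names.append(name)
            leftover &= ~bit
    return names, leftover


def split_addr(addr):
    """Split the raw address register the same way the handler does."""
    subc = (addr & 0x00070000) >> 16  # subchannel: bits 16..18
    mthd = addr & 0x00003FFC          # method offset: bits 2..13
    return subc, mthd


if __name__ == "__main__":
    for status in (0x80046000, 0x80006000, 0x00004000):
        names, leftover = decode_status(status)
        print(f"{status:08x}: {' '.join(names)} (unknown bits: {leftover:08x})")
    # Hypothetical raw addr value matching "subc 5 mthd 1554" in the log.
    print(split_addr(0x00051554))
```

Read this way, the flood of "00004000 [GPPTR] ... subc 0 mthd 0000 data 00000000" lines is the same single status bit being reported over and over, with no method information attached.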
I forgot to add: it appears the upgrade downgraded nouveau:

> grep nouv /var/log/dnf.log-20170721
Jul 21 17:38:03 DEBUG ---> Package xorg-x11-drv-nouveau.x86_64 1:1.0.15-2.fc25 will be downgraded
Jul 21 17:38:03 DEBUG ---> Package xorg-x11-drv-nouveau.x86_64 1:1.0.15-1.fc26 will be a downgrade
 xorg-x11-drv-nouveau x86_64 1:1.0.15-1.fc26 fedora 100 k
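To see what actually changed in that "downgrade", the two package strings can be split into their epoch, version and release fields. The following is a minimal sketch that only splits the fields as they are conventionally written (epoch:version-release). It does not implement RPM's version comparison, and `split_evr` is a made-up helper name.

```python
def split_evr(evr):
    """Split 'epoch:version-release' into its three fields.

    A missing epoch is reported as "0". This only splits the string;
    it does not reproduce RPM's ordering rules.
    """
    epoch, _, rest = evr.rpartition(":")
    version, _, release = rest.partition("-")
    return (epoch or "0", version, release)


old = split_evr("1:1.0.15-2.fc25")
new = split_evr("1:1.0.15-1.fc26")

print("old:", old)
print("new:", new)
print("same upstream version:", old[1] == new[1])
```

Both strings carry the same upstream version (1.0.15); only the release field differs (2.fc25 versus 1.fc26).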
Interestingly, I logged into the Sway WM, which uses Wayland, and for some reason it works fine.
This is a known problem, but without a good answer:
https://bugzilla.redhat.com/show_bug.cgi?id=1457626
and
https://fedoraproject.org/wiki/Common_F26_bugs#Wayland_sessions_crash_on_start_on_some_NVIDIA_graphics_cards.2C_logs_show_fifo:_read_fault_error
(In reply to Johnny B. Goode from comment #4)
> This is a known problem, but without a good answer

Ah, then should we close this as DUPLICATE?
This is a question for Adam or Ben.

New patches are here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0b5477d9dabd96ded4c5ef7a5f08b00188fc1dec
but some of these are in kernel-4.12.5, so we will probably see pretty quickly.
(In reply to Johnny B. Goode from comment #6)
> This is a question for Adam or Ben.
>
> New patches are here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0b5477d9dabd96ded4c5ef7a5f08b00188fc1dec
> but some of these are in kernel-4.12.5, so we will probably see pretty
> quickly.

Awesome, I can attempt an update and switch over to Wayland when it is released, then report whether the patch helped. If you would like, I don't mind trying out a test kernel if you can build one with the patches in place.
This message is a reminder that Fedora 26 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '26'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 26 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to this bug being closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.