Bug 1470371 - Lenovo Laptop locks up with RIP: 0010:ioread32+0x30/0x40
Status: NEW
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 27
Hardware: All
OS: Linux
Priority: high
Severity: high
Assigned To: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
Keywords: Regression
Reported: 2017-07-12 16:26 EDT by Sam Roza
Modified: 2017-12-07 19:30 EST
CC List: 4 users
Type: Bug

Attachments
most recent crash (10.98 KB, text/plain)
2017-12-07 14:19 EST, Sam Roza
updated journalctl entries from most recent crash (95.36 KB, text/plain)
2017-12-07 17:13 EST, Sam Roza

Description Sam Roza 2017-07-12 16:26:54 EDT
Description of problem:

Laptop locks up completely. The kernel reports a soft lockup with RIP: 0010:ioread32+0x30/0x40.

Version-Release number of selected component (if applicable):


How reproducible:
The computer was idle; when I returned, the mouse would not move. A hard power-off was required to reboot.

A 'dnf update' was performed approximately 25 minutes before.

Here are the journalctl logs:

~~~
Jul 12 15:37:08 ypestis2 kernel: NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [libvirtd:1612]
Jul 12 15:37:08 ypestis2 kernel: Modules linked in: pcc_cpufreq(-) rfcomm fuse ccm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT n
Jul 12 15:37:08 ypestis2 kernel:  btrtl btbcm snd_hda_codec_realtek btintel videobuf2_core intel_rapl_perf cfg80211 joydev bluetooth snd_hda_codec_hdmi snd_hda_codec_generic videodev media snd_hda_intel snd_hda_
Jul 12 15:37:08 ypestis2 kernel: CPU: 7 PID: 1612 Comm: libvirtd Not tainted 4.10.14-200.fc25.x86_64 #1
Jul 12 15:37:08 ypestis2 kernel: Hardware name: LENOVO 20EGS0R600/20EGS0R600, BIOS GNET71WW (2.19 ) 02/05/2015
Jul 12 15:37:08 ypestis2 kernel: task: ffff8dbd0253a580 task.stack: ffffb5130819c000
Jul 12 15:37:08 ypestis2 kernel: RIP: 0010:ioread32+0x30/0x40
Jul 12 15:37:08 ypestis2 kernel: RSP: 0018:ffffb5130819f9f8 EFLAGS: 00000292 ORIG_RAX: ffffffffffffff10
Jul 12 15:37:08 ypestis2 kernel: RAX: 00000000ffffffff RBX: ffff8dbd144ba400 RCX: 0000000000000018
Jul 12 15:37:08 ypestis2 kernel: RDX: 0000046620951d38 RSI: ffffb5130910a014 RDI: ffffb51309009400
Jul 12 15:37:08 ypestis2 kernel: RBP: ffffb5130819fa18 R08: 0000000000000002 R09: ffffb5130819fa04
Jul 12 15:37:08 ypestis2 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffff
Jul 12 15:37:08 ypestis2 kernel: R13: ffff8dbd12f187e0 R14: ffffffffffffffff R15: ffff8dbd12f1cd80
Jul 12 15:37:08 ypestis2 kernel: FS:  00007ff964fdedc0(0000) GS:ffff8dbd3e3c0000(0000) knlGS:0000000000000000
Jul 12 15:37:08 ypestis2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 12 15:37:08 ypestis2 kernel: CR2: 00007f68cefe1630 CR3: 0000000840f2a000 CR4: 00000000001406e0
Jul 12 15:37:08 ypestis2 kernel: Call Trace:
Jul 12 15:37:08 ypestis2 kernel:  ? nv04_timer_read+0x35/0x60 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvkm_timer_read+0xf/0x20 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvkm_pmu_reset+0x71/0x160 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvkm_pmu_preinit+0x12/0x20 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvkm_subdev_preinit+0x34/0x110 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvkm_device_init+0x62/0x280 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvkm_udevice_init+0x48/0x60 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvkm_object_init+0x40/0x190 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvkm_object_init+0xb4/0x190 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvkm_client_init+0xe/0x10 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvkm_client_resume+0xe/0x10 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nvif_client_resume+0x17/0x20 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nouveau_do_resume+0x4b/0x130 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  nouveau_pmops_runtime_resume+0x78/0x150 [nouveau]
Jul 12 15:37:08 ypestis2 kernel:  ? pci_restore_standard_config+0x40/0x40
Jul 12 15:37:08 ypestis2 kernel:  pci_pm_runtime_resume+0x7b/0xa0
Jul 12 15:37:08 ypestis2 kernel:  __rpm_callback+0xc2/0x200
Jul 12 15:37:08 ypestis2 kernel:  rpm_callback+0x24/0x80
Jul 12 15:37:08 ypestis2 kernel:  ? pci_restore_standard_config+0x40/0x40
Jul 12 15:37:08 ypestis2 kernel:  rpm_resume+0x4a4/0x6b0
Jul 12 15:37:08 ypestis2 kernel:  ? mntput_no_expire+0x183/0x190
Jul 12 15:37:08 ypestis2 kernel:  ? terminate_walk+0xe0/0xf0
Jul 12 15:37:08 ypestis2 kernel:  __pm_runtime_resume+0x4e/0x80
Jul 12 15:37:08 ypestis2 kernel:  pci_config_pm_runtime_get+0x53/0x60
Jul 12 15:37:08 ypestis2 kernel:  pci_read_config+0x8f/0x280
Jul 12 15:37:08 ypestis2 kernel:  sysfs_kf_bin_read+0x4a/0x70
Jul 12 15:37:08 ypestis2 kernel:  kernfs_fop_read+0xae/0x180
Jul 12 15:37:08 ypestis2 kernel:  __vfs_read+0x37/0x150
Jul 12 15:37:08 ypestis2 kernel:  ? security_file_permission+0x9b/0xc0
Jul 12 15:37:08 ypestis2 kernel:  vfs_read+0x96/0x130
Jul 12 15:37:08 ypestis2 kernel:  SyS_read+0x55/0xc0
Jul 12 15:37:08 ypestis2 kernel:  entry_SYSCALL_64_fastpath+0x1a/0xa9
Jul 12 15:37:08 ypestis2 kernel: RIP: 0033:0x7ff960a5563d
~~~

Similar RIPs have been reported here: https://bugs.freedesktop.org/show_bug.cgi?id=100035

Not sure what other information would be valuable for this bug. Please let me know if you need any information.

I have been having other hang issues related to the Nouveau driver and CPU power states when accessing the GPU (for example, watching a video while opening the Sound applet in the tools menu locks the laptop up completely).
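
For anyone comparing several of these reports, here is a minimal Python sketch that pulls the soft-lockup header and the call-trace frames out of saved journalctl text. The function names (`summarize_lockups`, `in_runtime_resume`) are made up for illustration, and the regular expressions only assume the line shapes visible in the log above; other kernel versions may format these lines differently.

```python
import re

# Header line of a soft-lockup report, with or without the "NMI" prefix.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)

# Call-trace frame, e.g. " ? nv04_timer_read+0x35/0x60 [nouveau]".
FRAME_RE = re.compile(
    r"kernel:\s+(?:\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?"
)


def summarize_lockups(journal_text):
    """Return one dict per soft-lockup report found in journal_text."""
    reports = []
    current = None
    for line in journal_text.splitlines():
        header = LOCKUP_RE.search(line)
        if header:
            current = {
                "cpu": int(header["cpu"]),
                "seconds": int(header["secs"]),
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "frames": [],  # (function, module-or-None) tuples
            }
            reports.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first lockup header
        frame = FRAME_RE.search(line)
        if frame:
            current["frames"].append((frame["func"], frame["mod"]))
    return reports


def in_runtime_resume(report):
    """True if the trace passes through nouveau's runtime-resume handler."""
    return any(
        func == "nouveau_pmops_runtime_resume" for func, _ in report["frames"]
    )
```

Feeding it the output of `journalctl -k` saved to a file should give one entry per lockup; `in_runtime_resume` then tells you whether a given trace went through the GPU runtime-resume path seen in the trace above.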
Comment 1 Sam Roza 2017-09-04 23:30:11 EDT
Still happening. The kernel is fully updated:

[root@ypestis2 ~]# uname -a
Linux ypestis2 4.11.11-300.fc26.x86_64 #1 SMP Mon Jul 17 16:32:11 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux


~~~
Sep 04 20:09:54 ypestis2 kernel: NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [Xorg:2232]
Sep 04 20:09:54 ypestis2 kernel: Modules linked in: rfcomm fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ccm ip6t
Sep 04 20:09:54 ypestis2 kernel:  btintel intel_uncore videobuf2_vmalloc intel_rapl_perf videobuf2_memops snd_rawmidi videobuf2_v4l2 videobuf2_core snd_hda_codec_realtek bluetooth i2c_i801 iwlwifi videodev snd_h
Sep 04 20:09:54 ypestis2 kernel: CPU: 5 PID: 2232 Comm: Xorg Not tainted 4.11.11-300.fc26.x86_64 #1
Sep 04 20:09:54 ypestis2 kernel: Hardware name: LENOVO 20EGS0R600/20EGS0R600, BIOS GNET71WW (2.19 ) 02/05/2015
Sep 04 20:09:54 ypestis2 kernel: task: ffff9a670c6f0000 task.stack: ffffbbf04c4b0000
Sep 04 20:09:54 ypestis2 kernel: RIP: 0010:ioread32+0x19/0x40
Sep 04 20:09:54 ypestis2 kernel: RSP: 0018:ffffbbf04c4b3b18 EFLAGS: 00000296 ORIG_RAX: ffffffffffffff10
Sep 04 20:09:54 ypestis2 kernel: RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000018
Sep 04 20:09:54 ypestis2 kernel: RDX: 00000d50229e3674 RSI: ffffbbf04910a014 RDI: ffffbbf04910a04c
Sep 04 20:09:54 ypestis2 kernel: RBP: ffffbbf04c4b3b48 R08: 0000000000000002 R09: ffffbbf04c4b3aec
Sep 04 20:09:54 ypestis2 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff9a6754034000
Sep 04 20:09:54 ypestis2 kernel: R13: ffff9a6753bc8240 R14: ffff9a6753bc43c0 R15: ffffffffffffffff
Sep 04 20:09:54 ypestis2 kernel: FS:  00007fb8ace642c0(0000) GS:ffff9a677e340000(0000) knlGS:0000000000000000
Sep 04 20:09:54 ypestis2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 04 20:09:54 ypestis2 kernel: CR2: 00007f6b3c7678f0 CR3: 0000000814df9000 CR4: 00000000001406e0
Sep 04 20:09:54 ypestis2 kernel: Call Trace:
Sep 04 20:09:54 ypestis2 kernel:  ? nvkm_pmu_reset+0x94/0x180 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  nvkm_pmu_preinit+0x12/0x20 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  nvkm_subdev_preinit+0x34/0x110 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  nvkm_device_init+0x60/0x270 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  nvkm_udevice_init+0x48/0x60 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  nvkm_object_init+0x3f/0x190 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  nvkm_object_init+0xa3/0x190 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  nvkm_object_init+0xa3/0x190 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  ? pci_restore_standard_config+0x40/0x40
Sep 04 20:09:54 ypestis2 kernel:  nvkm_client_resume+0xe/0x10 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  nvif_client_resume+0x17/0x20 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  nouveau_do_resume+0x40/0xe0 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  nouveau_pmops_runtime_resume+0x77/0x160 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  ? pci_restore_standard_config+0x40/0x40
Sep 04 20:09:54 ypestis2 kernel:  pci_pm_runtime_resume+0x7f/0xa0
Sep 04 20:09:54 ypestis2 kernel:  __rpm_callback+0xc2/0x200
Sep 04 20:09:54 ypestis2 kernel:  rpm_callback+0x24/0x80
Sep 04 20:09:54 ypestis2 kernel:  ? pci_restore_standard_config+0x40/0x40
Sep 04 20:09:54 ypestis2 kernel:  rpm_resume+0x4aa/0x790
Sep 04 20:09:54 ypestis2 kernel:  ? __handle_mm_fault+0x8c2/0x10a0
Sep 04 20:09:54 ypestis2 kernel:  __pm_runtime_resume+0x4e/0x80
Sep 04 20:09:54 ypestis2 kernel:  nouveau_drm_ioctl+0x3d/0xc0 [nouveau]
Sep 04 20:09:54 ypestis2 kernel:  do_vfs_ioctl+0xa5/0x600
Sep 04 20:09:54 ypestis2 kernel:  SyS_ioctl+0x79/0x90
Sep 04 20:09:54 ypestis2 kernel:  entry_SYSCALL_64_fastpath+0x1a/0xa9
Sep 04 20:09:54 ypestis2 kernel: RIP: 0033:0x7fb8aa6da5e7
Sep 04 20:09:54 ypestis2 kernel: RSP: 002b:00007fff07744298 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
Sep 04 20:09:54 ypestis2 kernel: RAX: ffffffffffffffda RBX: 000000000139e260 RCX: 00007fb8aa6da5e7
Sep 04 20:09:54 ypestis2 kernel: RDX: 00007fff077442d0 RSI: 00000000c05064a7 RDI: 000000000000000d
Sep 04 20:09:54 ypestis2 kernel: RBP: 00000000011701d0 R08: 0000000001393500 R09: 0000000000000000
Sep 04 20:09:54 ypestis2 kernel: R10: 0000000000000000 R11: 0000000000003246 R12: 000000000084e100
Sep 04 20:09:54 ypestis2 kernel: R13: 0000000000000002 R14: 0000000000000000 R15: 000000000084e100
Sep 04 20:09:54 ypestis2 kernel: Code: 4e ff ff ff b8 ff ff 00 00 5d c3 0f 1f 80 00 00 00 00 48 81 ff ff ff 03 00 77 0e 48 81 ff 00 00 01 00 76 08 0f b7 d7 ed c3 8b 07 <c3> 55 48 c7 c6 99 e2 c9 83 48 89 e5 e8 16
Sep 04 20:09:56 ypestis2 abrt-dump-journal-oops[1955]: abrt-dump-journal-oops: Found oopses: 1
Sep 04 20:09:56 ypestis2 abrt-dump-journal-oops[1955]: abrt-dump-journal-oops: Creating problem directories
Sep 04 20:09:57 ypestis2 abrt-dump-journal-oops[1955]: Reported 1 kernel oopses to Abrt
~~~

Again, this was triggered by watching a video while opening up the sound menu (wanted to boost the audio somewhat). 

This has become less common since the driver began allowing me to keep my laptop lid closed and drive only my 2 external displays, but it obviously still occurs somewhat regularly.
Comment 2 Fedora End Of Life 2017-11-16 14:36:57 EST
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 reaches end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
Comment 3 Sam Roza 2017-11-30 21:07:30 EST
Is there any good reason why this is still occurring? It has now been carried forward to F27.

~~~
Nov 30 16:37:04 ypestis2 abrt-dump-journal-oops[2086]: abrt-dump-journal-oops: Found oopses: 1
Nov 30 16:37:04 ypestis2 abrt-dump-journal-oops[2086]: abrt-dump-journal-oops: Creating problem directories
Nov 30 16:37:18 ypestis2 kernel: INFO: rcu_sched self-detected stall on CPU
Nov 30 16:37:18 ypestis2 kernel:         2-...: (4560074 ticks this GP) idle=71e/140000000000001/0 softirq=3529108/3529108 fqs=1139870 
Nov 30 16:37:18 ypestis2 kernel:          (t=4560075 jiffies g=3160559 c=3160558 q=1692098)
Nov 30 16:37:18 ypestis2 kernel: NMI backtrace for cpu 2
Nov 30 16:37:18 ypestis2 kernel: CPU: 2 PID: 2172 Comm: Xorg Tainted: G             L  4.13.13-300.fc27.x86_64 #1
Nov 30 16:37:18 ypestis2 kernel: Hardware name: LENOVO 20EGS0R600/20EGS0R600, BIOS GNET71WW (2.19 ) 02/05/2015
Nov 30 16:37:18 ypestis2 kernel: Call Trace:
Nov 30 16:37:18 ypestis2 kernel:  <IRQ>
Nov 30 16:37:18 ypestis2 kernel:  dump_stack+0x63/0x8b
Nov 30 16:37:18 ypestis2 kernel:  nmi_cpu_backtrace+0xbe/0xc0
Nov 30 16:37:18 ypestis2 kernel:  ? irq_force_complete_move+0x130/0x130
Nov 30 16:37:18 ypestis2 kernel:  nmi_trigger_cpumask_backtrace+0xe6/0x120
Nov 30 16:37:18 ypestis2 kernel:  arch_trigger_cpumask_backtrace+0x19/0x20
Nov 30 16:37:18 ypestis2 kernel:  rcu_dump_cpu_stacks+0xa5/0xe8
Nov 30 16:37:18 ypestis2 kernel:  rcu_check_callbacks+0x6cb/0x8e0
Nov 30 16:37:18 ypestis2 kernel:  ? tick_sched_do_timer+0x60/0x60
Nov 30 16:37:18 ypestis2 kernel:  update_process_times+0x2f/0x60
Nov 30 16:37:18 ypestis2 kernel:  tick_sched_handle+0x26/0x70
Nov 30 16:37:18 ypestis2 kernel:  ? tick_sched_do_timer+0x44/0x60
Nov 30 16:37:18 ypestis2 kernel:  tick_sched_timer+0x39/0x80
Nov 30 16:37:18 ypestis2 kernel:  __hrtimer_run_queues+0xe0/0x210
Nov 30 16:37:18 ypestis2 kernel:  hrtimer_interrupt+0xa0/0x1c0
Nov 30 16:37:18 ypestis2 kernel:  local_apic_timer_interrupt+0x38/0x60
Nov 30 16:37:18 ypestis2 kernel:  smp_apic_timer_interrupt+0x38/0x50
Nov 30 16:37:18 ypestis2 kernel:  apic_timer_interrupt+0x93/0xa0
Nov 30 16:37:18 ypestis2 kernel: RIP: 0010:ioread32+0x19/0x40
Nov 30 16:37:18 ypestis2 kernel: RSP: 0018:ffffb06d0c78bad8 EFLAGS: 00000296 ORIG_RAX: ffffffffffffff10
Nov 30 16:37:18 ypestis2 kernel: RAX: 00000000ffffffff RBX: ffff8da0d516dc00 RCX: 0000000000000018
Nov 30 16:37:18 ypestis2 kernel: RDX: 0000099bbe6db3a8 RSI: ffffb06d0910a014 RDI: ffffb06d09009410
Nov 30 16:37:18 ypestis2 kernel: RBP: ffffb06d0c78baf8 R08: 0000000000000002 R09: ffffb06d0c78bae4
Nov 30 16:37:18 ypestis2 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff8da0d516dc00
Nov 30 16:37:18 ypestis2 kernel: R13: ffff8da0d5241060 R14: ffffffffffffffff R15: ffff8da0d4b96cc0
~~~

It's extremely frustrating, and it's hard to support our products in front of customers, when I CAN'T EVEN REBOOT MY MACHINE WITHOUT HARD-POWERING IT OFF VIA THE POWER BUTTON!
Comment 5 Sam Roza 2017-12-02 12:06:03 EST
Another occurrence; this time I was just waking my laptop up from sleep:

Dec 01 15:34:19 ypestis2 abrt-dump-journal-oops[2021]: abrt-dump-journal-oops: Found oopses: 1
Dec 01 15:34:19 ypestis2 abrt-dump-journal-oops[2021]: abrt-dump-journal-oops: Creating problem directories
Dec 01 15:34:20 ypestis2 abrt-dump-journal-oops[2021]: Reported 1 kernel oopses to Abrt
Dec 01 15:34:31 ypestis2 abrt-server[6594]: Lock file '.lock' is locked by process 2713
Dec 01 15:34:31 ypestis2 abrt-notification[6722]: System encountered a non-fatal error in ??()
Dec 01 15:34:46 ypestis2 kernel: watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [Xorg:2128]
Dec 01 15:34:46 ypestis2 kernel: Modules linked in: xt_addrtype br_netfilter overlay rfcomm fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
Dec 01 15:34:46 ypestis2 kernel:  intel_rapl_perf snd_hda_codec_realtek snd_hda_codec_hdmi uvcvideo btusb snd_hda_codec_generic videobuf2_vmalloc iwlwifi videobuf2_memops btrtl btbcm videobuf2_v4l2 btintel video
Dec 01 15:34:46 ypestis2 kernel: CPU: 4 PID: 2128 Comm: Xorg Tainted: G             L  4.13.15-300.fc27.x86_64 #1
Dec 01 15:34:46 ypestis2 kernel: Hardware name: LENOVO 20EGS0R600/20EGS0R600, BIOS GNET71WW (2.19 ) 02/05/2015
Dec 01 15:34:46 ypestis2 kernel: task: ffff8aea56eea640 task.stack: ffff9f770a3cc000
Dec 01 15:34:46 ypestis2 kernel: RIP: 0010:ioread32+0x19/0x40
Dec 01 15:34:46 ypestis2 kernel: RSP: 0018:ffff9f770a3cfad8 EFLAGS: 00000292 ORIG_RAX: ffffffffffffff10
Dec 01 15:34:46 ypestis2 kernel: RAX: 00000000ffffffff RBX: ffff8aea9576e800 RCX: 0000000000000018
Dec 01 15:34:46 ypestis2 kernel: RDX: 00000cbffdb1a070 RSI: ffff9f770910a014 RDI: ffff9f7709009400
Dec 01 15:34:46 ypestis2 kernel: RBP: ffff9f770a3cfaf8 R08: 0000000000000002 R09: ffff9f770a3cfae4
Dec 01 15:34:46 ypestis2 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: 00000000ffffffff
Dec 01 15:34:46 ypestis2 kernel: R13: ffff8aea9395b420 R14: ffffffffffffffff R15: ffff8aea93960000
Dec 01 15:34:46 ypestis2 kernel: FS:  00007ff22e61fa80(0000) GS:ffff8aeabe300000(0000) knlGS:0000000000000000
Dec 01 15:34:46 ypestis2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 01 15:34:46 ypestis2 kernel: CR2: 00001844e0ae0000 CR3: 00000008172d2000 CR4: 00000000001406e0
Dec 01 15:34:46 ypestis2 kernel: Call Trace:
Dec 01 15:34:46 ypestis2 kernel:  ? nv04_timer_read+0x35/0x60 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvkm_timer_read+0xf/0x20 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvkm_pmu_reset+0x71/0x170 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvkm_pmu_preinit+0x12/0x20 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvkm_subdev_preinit+0x34/0x110 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvkm_device_init+0x60/0x270 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvkm_udevice_init+0x48/0x60 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvkm_object_init+0x3f/0x190 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvkm_object_init+0xa3/0x190 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvkm_object_init+0xa3/0x190 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvkm_client_resume+0xe/0x10 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nvif_client_resume+0x17/0x20 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nouveau_do_resume+0x40/0xe0 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  nouveau_pmops_runtime_resume+0x91/0x150 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  ? pci_restore_standard_config+0x50/0x50
Dec 01 15:34:46 ypestis2 kernel:  pci_pm_runtime_resume+0x7a/0xa0
Dec 01 15:34:46 ypestis2 kernel:  __rpm_callback+0xc2/0x200
Dec 01 15:34:46 ypestis2 kernel:  rpm_callback+0x24/0x80
Dec 01 15:34:46 ypestis2 kernel:  ? pci_restore_standard_config+0x50/0x50
Dec 01 15:34:46 ypestis2 kernel:  rpm_resume+0x4bf/0x7b0
Dec 01 15:34:46 ypestis2 kernel:  __pm_runtime_resume+0x4e/0x80
Dec 01 15:34:46 ypestis2 kernel:  nouveau_drm_ioctl+0x3d/0xc0 [nouveau]
Dec 01 15:34:46 ypestis2 kernel:  do_vfs_ioctl+0xa5/0x600
Dec 01 15:34:46 ypestis2 kernel:  SyS_ioctl+0x79/0x90
Dec 01 15:34:46 ypestis2 kernel:  entry_SYSCALL_64_fastpath+0x1a/0xa5
Dec 01 15:34:46 ypestis2 kernel: RIP: 0033:0x7ff22b8f4dc7
Dec 01 15:34:46 ypestis2 kernel: RSP: 002b:00007ffebd12aea8 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
Dec 01 15:34:46 ypestis2 kernel: RAX: ffffffffffffffda RBX: 00000000015a3a58 RCX: 00007ff22b8f4dc7
Dec 01 15:34:46 ypestis2 kernel: RDX: 00007ffebd12aee0 RSI: 00000000c05064a7 RDI: 000000000000000d
Comment 6 Sam Roza 2017-12-07 13:53:07 EST
I updated the BIOS, hoping the ACPI code was the issue, but no dice:

~~~
Dec  7 10:27:47 ypestis2 kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [Xorg:2583]
Dec  7 10:27:47 ypestis2 kernel: Modules linked in: nls_utf8 isofs vfat fat uas usb_storage rfcomm fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep sunrpc rmi_smbus rmi_core arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp iTCO_wdt iTCO_vendor_support mei_wdt coretemp kvm_intel kvm iwlmvm snd_hda_codec_realtek snd_hda_codec_hdmi
Dec  7 10:27:47 ypestis2 kernel: snd_hda_codec_generic irqbypass intel_cstate intel_uncore mac80211 snd_hda_intel intel_rapl_perf snd_hda_codec snd_usb_audio btusb snd_hda_core snd_usbmidi_lib snd_rawmidi snd_hwdep btrtl snd_seq btbcm btintel snd_seq_device uvcvideo bluetooth iwlwifi snd_pcm videobuf2_vmalloc cfg80211 videobuf2_memops videobuf2_v4l2 videobuf2_core i2c_i801 wmi_bmof lpc_ich videodev joydev thinkpad_acpi snd_timer snd media mei_me ecdh_generic tpm_tis mei tpm_tis_core ie31200_edac soundcore shpchp rfkill tpm dm_crypt hid_logitech_hidpp hid_logitech_dj i915 nouveau crct10dif_pclmul crc32_pclmul mxm_wmi ttm crc32c_intel e1000e i2c_algo_bit drm_kms_helper sdhci_pci sdhci ghash_clmulni_intel serio_raw drm mmc_core ptp pps_core wmi video
Dec  7 10:27:47 ypestis2 kernel: CPU: 2 PID: 2583 Comm: Xorg Not tainted 4.13.15-300.fc27.x86_64 #1
Dec  7 10:27:47 ypestis2 kernel: Hardware name: LENOVO 20EGS0R600/20EGS0R600, BIOS GNET84WW (2.32 ) 09/18/2017
Dec  7 10:27:47 ypestis2 kernel: task: ffff9742e305cc80 task.stack: ffffb5194a3dc000
Dec  7 10:27:47 ypestis2 kernel: RIP: 0010:ioread32+0x19/0x40
Dec  7 10:27:47 ypestis2 kernel: RSP: 0018:ffffb5194a3dfb10 EFLAGS: 00000296 ORIG_RAX: ffffffffffffff10
Dec  7 10:27:47 ypestis2 kernel: RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000018
Dec  7 10:27:47 ypestis2 kernel: RDX: 000003b2ae402b0f RSI: ffffb5194910a014 RDI: ffffb5194910a04c
Dec  7 10:27:47 ypestis2 kernel: RBP: ffffb5194a3dfb40 R08: 0000000000000002 R09: ffffb5194a3dfae4
Dec  7 10:27:47 ypestis2 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff97431523f800
Dec  7 10:27:47 ypestis2 kernel: R13: ffff97431a224720 R14: ffffffffffffffff R15: ffff974315631480
Dec  7 10:27:47 ypestis2 kernel: FS:  00007f8ff41b3a80(0000) GS:ffff97433e280000(0000) knlGS:0000000000000000
Dec  7 10:27:47 ypestis2 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec  7 10:27:47 ypestis2 kernel: CR2: 00007f5ce962d260 CR3: 0000000833e5e000 CR4: 00000000001406e0
Dec  7 10:27:47 ypestis2 kernel: Call Trace:
Dec  7 10:27:47 ypestis2 kernel: ? nvkm_pmu_reset+0x94/0x170 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nvkm_pmu_preinit+0x12/0x20 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nvkm_subdev_preinit+0x34/0x110 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nvkm_device_init+0x60/0x270 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nvkm_udevice_init+0x48/0x60 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nvkm_object_init+0x3f/0x190 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nvkm_object_init+0xa3/0x190 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nvkm_object_init+0xa3/0x190 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nvkm_client_resume+0xe/0x10 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nvif_client_resume+0x17/0x20 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nouveau_do_resume+0x40/0xe0 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: nouveau_pmops_runtime_resume+0x91/0x150 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: ? pci_restore_standard_config+0x50/0x50
Dec  7 10:27:47 ypestis2 kernel: pci_pm_runtime_resume+0x7a/0xa0
Dec  7 10:27:47 ypestis2 kernel: __rpm_callback+0xc2/0x200
Dec  7 10:27:47 ypestis2 kernel: rpm_callback+0x24/0x80
Dec  7 10:27:47 ypestis2 kernel: ? pci_restore_standard_config+0x50/0x50
Dec  7 10:27:47 ypestis2 kernel: rpm_resume+0x4bf/0x7b0
Dec  7 10:27:47 ypestis2 kernel: ? __handle_mm_fault+0xb24/0x10b0
Dec  7 10:27:47 ypestis2 kernel: __pm_runtime_resume+0x4e/0x80
Dec  7 10:27:47 ypestis2 kernel: nouveau_drm_ioctl+0x3d/0xc0 [nouveau]
Dec  7 10:27:47 ypestis2 kernel: do_vfs_ioctl+0xa5/0x600
Dec  7 10:27:47 ypestis2 kernel: SyS_ioctl+0x79/0x90
Dec  7 10:27:47 ypestis2 kernel: entry_SYSCALL_64_fastpath+0x1a/0xa5
Dec  7 10:27:47 ypestis2 kernel: RIP: 0033:0x7f8ff1488dc7
Dec  7 10:27:47 ypestis2 kernel: RSP: 002b:00007ffe011c70d8 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
Dec  7 10:27:47 ypestis2 kernel: RAX: ffffffffffffffda RBX: 0000000001a58310 RCX: 00007f8ff1488dc7
Dec  7 10:27:47 ypestis2 kernel: RDX: 00007ffe011c7110 RSI: 00000000c05064a7 RDI: 000000000000000d
Dec  7 10:27:47 ypestis2 kernel: RBP: 000000000181f230 R08: 0000000001a416f0 R09: 0000000000000000
Dec  7 10:27:47 ypestis2 kernel: R10: 0000000000000000 R11: 0000000000003246 R12: 000000000084bf80
Dec  7 10:27:47 ypestis2 kernel: R13: 0000000000000179 R14: 0000000000000000 R15: 000000000084bf80
Dec  7 10:27:47 ypestis2 kernel: Code: 5e ff ff ff b8 ff ff 00 00 5d c3 0f 1f 80 00 00 00 00 48 81 ff ff ff 03 00 77 0e 48 81 ff 00 00 01 00 76 08 0f b7 d7 ed c3 8b 07 <c3> 55 48 c7 c6 89 ae cc 9e 48 89 e5 e8 26 ff ff ff b8 ff ff ff 
Dec  7 10:27:48 ypestis2 abrt-dump-journal-oops[2471]: abrt-dump-journal-oops: Found oopses: 1
Dec  7 10:27:48 ypestis2 abrt-dump-journal-oops[2471]: abrt-dump-journal-oops: Creating problem directories
Dec  7 10:27:49 ypestis2 abrt-dump-journal-oops[2471]: Reported 1 kernel oopses to Abrt
Dec  7 10:28:15 ypestis2 kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [Xorg:2583]
Dec  7 10:28:15 ypestis2 kernel: Modules linked in: nls_utf8 isofs vfat fat uas usb_storage rfcomm fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep sunrpc rmi_smbus rmi_core arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp iTCO_wdt iTCO_vendor_support mei_wdt coretemp kvm_intel kvm iwlmvm snd_hda_codec_realtek snd_hda_codec_hdmi
Dec  7 10:28:15 ypestis2 kernel: snd_hda_codec_generic irqbypass intel_cstate intel_uncore mac80211 snd_hda_intel intel_rapl_perf snd_hda_codec snd_usb_audio btusb snd_hda_core snd_usbmidi_lib snd_rawmidi snd_hwdep btrtl snd_seq btbcm btintel snd_seq_device uvcvideo bluetooth iwlwifi snd_pcm videobuf2_vmalloc cfg80211 videobuf2_memops videobuf2_v4l2 videobuf2_core i2c_i801 wmi_bmof lpc_ich videodev joydev thinkpad_acpi snd_timer snd media mei_me ecdh_generic tpm_tis mei tpm_tis_core ie31200_edac soundcore shpchp rfkill tpm dm_crypt hid_logitech_hidpp hid_logitech_dj i915 nouveau crct10dif_pclmul crc32_pclmul mxm_wmi ttm crc32c_intel e1000e i2c_algo_bit drm_kms_helper sdhci_pci sdhci ghash_clmulni_intel serio_raw drm mmc_core ptp pps_core wmi video
~~~
Comment 7 Ben Skeggs 2017-12-07 14:02:03 EST
Can you upload the full journalctl log from after this has occurred?
Comment 8 Sam Roza 2017-12-07 14:19:07 EST
I will upload this most recent crash/hang for you.

Thanks.

-SR
Comment 9 Sam Roza 2017-12-07 14:19 EST
Created attachment 1364445 [details]
most recent crash
Comment 10 Ben Skeggs 2017-12-07 17:06:52 EST
I was somewhat interested in the lines *before* the backtrace too :)
Comment 11 Sam Roza 2017-12-07 17:12:59 EST
Ben, you're right; I got way too focused on that last attachment. My apologies.

Looking at the logs, I think you'll see some thermal warnings at 10:20, but as you are probably well aware, that's just a "feature" of using Bluejeans. :)

Let me know if you want a bigger scope, but this should give you ~7 minutes before, until just after the reboot.

-SR
Comment 12 Sam Roza 2017-12-07 17:13 EST
Created attachment 1364508 [details]
updated journalctl entries from most recent crash
Comment 13 Ben Skeggs 2017-12-07 17:28:10 EST
"Dec  7 10:27:20 ypestis2 kernel: nouveau 0000:01:00.0: Refused to change power state, currently in D3"

This is what I was looking for, and I was wondering whether it was happening. There are a number of later-model laptops where Linux's PCI and/or ACPI layers (I don't really know where the bug lies exactly, TBH) don't correctly handle PCI power states, and as far as I know, this hasn't been resolved yet.

I was just attempting to find the relevant kernel bug for it, where the appropriate developers are/were watching, but couldn't find it.

The end result is that when nouveau is handed control of the device again after it's been powered-down for inactivity, it hasn't actually been powered on correctly, resulting in the GPU being in a very weird state.
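
To check whether a saved journal shows the same sequence (a refused power-state change logged before the lockup), something like the following Python sketch could be used. The helper name `refused_before_lockup` is hypothetical, and the pattern only assumes the message format quoted above.

```python
import re

# Matches e.g.
# "kernel: nouveau 0000:01:00.0: Refused to change power state, currently in D3"
REFUSED_RE = re.compile(
    r"kernel: (?P<drv>\S+) "
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"Refused to change power state, currently in (?P<state>D\w+)"
)


def refused_before_lockup(journal_text):
    """Return (driver, PCI address, state) for each refused power-state
    change logged before the first soft lockup in journal_text."""
    hits = []
    for line in journal_text.splitlines():
        if "BUG: soft lockup" in line:
            break  # only lines preceding the first lockup are of interest
        match = REFUSED_RE.search(line)
        if match:
            hits.append((match["drv"], match["dev"], match["state"]))
    return hits
```

A non-empty result for the `nouveau` driver would suggest the GPU was not brought back to a working power state before the driver touched it, which matches the explanation above. An empty result does not rule the problem out, since the message may have been logged outside the captured window.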

You should be able to work around this by appending "nouveau.runpm=0" to your kernel options, which will prevent the GPU being powered down.
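
After rebooting with the option added, it is worth confirming that the running kernel actually received it. A small Python sketch, assuming the usual `/proc/cmdline` interface; the function names and the sample command line in the usage note are illustrative only:

```python
def runpm_setting(cmdline):
    """Return the value given to nouveau.runpm on a kernel command line,
    or None if the option is absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "nouveau.runpm" and sep:
            value = val  # keep the last occurrence if the option is repeated
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()
```

For example, `runpm_setting(read_cmdline())` should return `"0"` once the workaround is in effect. For a hypothetical command line such as `ro rhgb quiet nouveau.runpm=0`, `runpm_setting` returns `"0"`; without the option it returns `None`.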
Comment 14 Sam Roza 2017-12-07 19:30:42 EST
Yep, that works around the issue. It has been present for probably 2 years at this point, and in that whole time I've never been able to reboot my laptop without having to power it off using the power button. It now reboots without protest.

Part of me would like to see a properly filed bug that can be followed up on. I know that I'm not the only one who suffers from this issue, per memo-list.

-SR
