Bug 1463157
Summary: | [GK106] GTX 660 freeze computer shortly after login | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Tomas Pelka <tpelka> |
Component: | xorg-x11-drv-nouveau | Assignee: | Ben Skeggs <bskeggs> |
Status: | CLOSED WONTFIX | QA Contact: | Desktop QE <desktop-qa-list> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | | |
Version: | 7.4 | CC: | bskeggs, jan.public, kherbst, tpelka |
Target Milestone: | rc | | |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2020-11-11 21:47:24 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1547138 | | |
Description
Tomas Pelka
2017-06-20 09:10:29 UTC
This freeze is actually also followed by a crash:

Jun 20 11:05:31 localhost.localdomain kernel: kworker/u16:3 D 0000000000000246 0 339 2 0x00000000
Jun 20 11:05:31 localhost.localdomain kernel: Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
Jun 20 11:05:31 localhost.localdomain kernel: ffff880506acfc00 0000000000000046 ffff880506ad0000 ffff880506acffd8
Jun 20 11:05:31 localhost.localdomain kernel: ffff880506acffd8 ffff880506acffd8 ffff880506ad0000 0000000000000000
Jun 20 11:05:31 localhost.localdomain kernel: ffff880506ad0000 7fffffffffffffff ffff8804eeafe540 0000000000000246
Jun 20 11:05:31 localhost.localdomain kernel: Call Trace:
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff816a6f09>] schedule+0x29/0x70
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff816a4a19>] schedule_timeout+0x239/0x2c0
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff811de381>] ? __slab_free+0x81/0x2f0
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff8145ec9f>] dma_fence_default_wait+0x1cf/0x230
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff8145e9a0>] ? dma_fence_free+0x20/0x20
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff8145e889>] dma_fence_wait_timeout+0x39/0xd0
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffffc018cc0d>] drm_atomic_helper_wait_for_fences+0x7d/0x100 [drm_kms_helper]
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffffc028e095>] nv50_disp_atomic_commit_tail+0x55/0x1180 [nouveau]
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffffc028f1d2>] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff810a87fa>] process_one_work+0x17a/0x440
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff810a94c6>] worker_thread+0x126/0x3c0
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff810a93a0>] ? manage_workers.isra.24+0x2a0/0x2a0
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff810b096f>] kthread+0xcf/0xe0
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff810b08a0>] ? insert_kthread_work+0x40/0x40
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff816b2958>] ret_from_fork+0x58/0x90
Jun 20 11:05:31 localhost.localdomain kernel: [<ffffffff810b08a0>] ? insert_kthread_work+0x40/0x40
Jun 2

One more thing: it seems I can reproduce this 100% of the time by logging in to gnome-session and playing a video (Big Buck Bunny trailer, ogv) in totem. Kernel shows:

nouveau 0000:01:00.0: gr: TRAP ch 2 [023fad6000 X[1330]]
Jun 20 11:14:00 localhost.localdomain kernel: nouveau 0000:01:00.0: gr: GPC0/PROP trap: 00000080 [ZETA_STORAGE_TYPE_MISMATCH] x = 80, y = 96, format = 0, storage type = fe
Jun 20 11:14:00 localhost.localdomain kernel: nouveau 0000:01:00.0: gr: TRAP ch 2 [023fad6000 X[1330]]
Jun 20 11:14:00 localhost.localdomain kernel: nouveau 0000:01:00.0: gr: GPC0/PROP trap: 00000080 [ZETA_STORAGE_TYPE_MISMATCH] x = 160, y = 320, format = 0, storage type = fe
Jun 20 11:14:04 localhost.localdomain kernel: nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Jun 20 11:14:04 localhost.localdomain kernel: nouveau 0000:01:00.0: fifo: gr engine fault on channel 4, recovering...

and the desktop freezes.

I was also able to trigger this issue with LibreOffice presentation mode.

I can reproduce this on 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK110 [GeForce GTX 780] [10de:1004] (rev a1) too.

Tomas,

Can you reproduce this on 7.5?

Thanks,
Ben.

(In reply to Ben Skeggs from comment #7)
> Tomas,
>
> Can you reproduce this on 7.5?
>
> Thanks,
> Ben.

Tomas, please have a look.

Thanks
-Tom

I can reproduce it on 7.5 with kernel-3.10.0-862.el7.x86_64. The desktop froze when playing video, installing libreoffice-impress and moving the lo-impress window.

Kernel call trace from journalctl:

Apr 17 14:04:38 localhost.localdomain kernel: INFO: task kworker/u16:5:343 blocked for more than 120 seconds.
Apr 17 14:04:38 localhost.localdomain kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 17 14:04:38 localhost.localdomain kernel: kworker/u16:5 D ffff943407281fa0 0 343 2 0x00000000
Apr 17 14:04:38 localhost.localdomain kernel: Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
Apr 17 14:04:38 localhost.localdomain kernel: Call Trace:
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffffc02fd307>] ? nvkm_client_notify_get+0x27/0x40 [nouveau]
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffffc02feb5a>] ? nvkm_ioctl_ntfy_get+0x6a/0xc0 [nouveau]
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff86512f49>] schedule+0x29/0x70
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff865108b9>] schedule_timeout+0x239/0x2c0
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffffc03af912>] ? nvkm_client_ioctl+0x12/0x20 [nouveau]
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffffc02fc048>] ? nvif_object_ioctl+0x48/0x60 [nouveau]
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffffc03b266c>] ? nouveau_bo_rd32+0x2c/0x30 [nouveau]
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffffc03cea2e>] ? nv84_fence_read+0x2e/0x30 [nouveau]
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffffc03ccbfc>] ? nouveau_fence_no_signaling+0x2c/0x90 [nouveau]
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff86295adc>] dma_fence_default_wait+0x1cc/0x220
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff862956a0>] ? dma_fence_release+0xa0/0xa0
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff862954df>] dma_fence_wait_timeout+0x3f/0xe0
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffffc02dc869>] drm_atomic_helper_wait_for_fences+0x69/0xe0 [drm_kms_helper]
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffffc03c27b5>] nv50_disp_atomic_commit_tail+0x55/0x1200 [nouveau]
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff8651291c>] ? __schedule+0x41c/0xa20
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffffc03c3972>] nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff85eb2dff>] process_one_work+0x17f/0x440
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff85eb3ac6>] worker_thread+0x126/0x3c0
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff85eb39a0>] ? manage_workers.isra.24+0x2a0/0x2a0
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff85ebae31>] kthread+0xd1/0xe0
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff85ebad60>] ? insert_kthread_work+0x40/0x40
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff8651f637>] ret_from_fork_nospec_begin+0x21/0x21
Apr 17 14:04:38 localhost.localdomain kernel: [<ffffffff85ebad60>] ? insert_kthread_work+0x40/0x40

Red Hat Enterprise Linux 7 shipped its final minor release on September 29th, 2020. 7.9 was the last minor release scheduled for RHEL 7.

From initial triage it does not appear the remaining Bugzillas meet the inclusion criteria for Maintenance Phase 2 and will now be closed.

From the RHEL life cycle page:
https://access.redhat.com/support/policy/updates/errata#Maintenance_Support_2_Phase
"During Maintenance Support 2 Phase for Red Hat Enterprise Linux version 7, Red Hat defined Critical and Important impact Security Advisories (RHSAs) and selected (at Red Hat discretion) Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available."

If this BZ was closed in error and meets the above criteria, please re-open it, flag it for 7.9.z, provide suitable business and technical justifications, and follow the process for Accelerated Fixes:
https://source.redhat.com/groups/public/pnt-cxno/pnt_customer_experience_and_operations_wiki/support_delivery_accelerated_fix_release_handbook

Feature Requests can be re-opened and moved to RHEL 8 if the desired functionality is not already present in the product.
Please reach out to the applicable Product Experience Engineer[0] if you have any questions or concerns.

[0] https://bugzilla.redhat.com/page.cgi?id=agile_component_mapping.html&product=Red+Hat+Enterprise+Linux+7
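
For anyone checking whether another machine shows the same symptoms, the kernel messages quoted in the comments above can be matched mechanically. The following is only a minimal sketch, not part of any Red Hat or nouveau tooling: the function name classify and the signature labels are hypothetical, and the regular expressions are assumptions derived solely from the log lines pasted in this report, so other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns derived from the kernel messages quoted in this bug report.
# The label names on the left are hypothetical, chosen for illustration.
SIGNATURES = {
    "hung_task": re.compile(
        r"INFO: task \S+ blocked for more than \d+ seconds"),
    "gr_trap": re.compile(
        r"nouveau [0-9a-f:.]+: gr: GPC\d+/PROP trap: [0-9a-f]+ \[\w+\]"),
    "fifo_sched_error": re.compile(
        r"nouveau [0-9a-f:.]+: fifo: SCHED_ERROR [0-9a-f]+ \[\w+\]"),
    "engine_fault": re.compile(
        r"nouveau [0-9a-f:.]+: fifo: \w+ engine fault on channel \d+"),
}


def classify(lines):
    """Count how many log lines match each known failure signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

One possible use is to feed it kernel log lines, for example by piping the output of journalctl -k into a small wrapper that calls classify(sys.stdin) and prints the counts. A non-zero hung_task count together with gr_trap or fifo_sched_error hits would resemble the failure described here, but it does not prove the same root cause.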