Bug 1494191
| Summary: | WARNING: CPU: 0 PID: 1300 at drivers/gpu/drm/nouveau/nouveau_bo.c:137 nouveau_bo_del_ttm+0x79/0x80 [nouveau] | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Alan Matsuoka <alanm> |
| Component: | xorg-x11-drv-nouveau | Assignee: | Lyude <lyude> |
| Status: | CLOSED WONTFIX | QA Contact: | Desktop QE <desktop-qa-list> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.7 | CC: | alanm, bskeggs, dkochuka, fernando, jkoten, lyude |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-11-11 21:55:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1547138 | ||
|
Description
Alan Matsuoka
2017-09-21 15:51:34 UTC
abrt has been picking up these backtraces:

bash-4.2$ cat oops-2017-09-15-17:51:12-56773-0/backtrace
WARNING: CPU: 0 PID: 1300 at drivers/gpu/drm/nouveau/nouveau_bo.c:137 nouveau_bo_del_ttm+0x79/0x80 [nouveau]
Modules linked in: fuse ipheth binfmt_misc nfsv3 nfs xt_CHECKSUM fscache iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter mvfs(OE) ip6_tables iptable_filter dm_mirror dm_region_hash dm_log dm_mod intel_powerclamp snd_hda_codec_analog snd_hda_codec_generic coretemp snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep kvm snd_seq snd_seq_device snd_pcm snd_timer irqbypass crc32_pclmul ghash_clmulni_intel snd iTCO_wdt aesni_intel dell_wmi dell_smbios sparse_keymap ppdev lrw gpio_ich iTCO_vendor_support sg soundcore pcspkr gf128mul glue_helper i7core_edac dcdbas ablk_helper i2c_i801 edac_core cryptd parport_pc parport shpchp lpc_ich nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 sd_mod sr_mod cdrom crc_t10dif crct10dif_generic nouveau video mxm_wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci libahci drm libata tg3 crct10dif_pclmul crct10dif_common crc32c_intel serio_raw ptp pps_core i2c_core wmi
CPU: 0 PID: 1300 Comm: X Tainted: G OE ------------ 3.10.0-693.2.1.el7.x86_64 #1
Hardware name: Dell Inc. Precision WorkStation T5500 /0CRH6C, BIOS A16 05/28/2013
0000000000000000 00000000b320290b ffff880036353b60 ffffffff816a3db1
ffff880036353ba0 ffffffff810879c8 0000008900000000 ffff8802ffb3b400
ffff88017a9dc000 ffff880306e5a1e8 ffff8802ffb3b400 ffff8802ffb3b400
Call Trace:
[<ffffffff816a3db1>] dump_stack+0x19/0x1b
[<ffffffff810879c8>] __warn+0xd8/0x100
[<ffffffff81087b0d>] warn_slowpath_null+0x1d/0x20
[<ffffffffc023dfe9>] nouveau_bo_del_ttm+0x79/0x80 [nouveau]
[<ffffffffc0142ebb>] ttm_bo_release_list+0xbb/0x1a0 [ttm]
[<ffffffffc01432bc>] ttm_bo_release+0xfc/0x220 [ttm]
[<ffffffffc0143409>] ttm_bo_unref+0x29/0x30 [ttm]
[<ffffffffc024192e>] nouveau_gem_object_del+0x8e/0xf0 [nouveau]
[<ffffffffc00e9869>] drm_gem_object_free+0x29/0x70 [drm]
[<ffffffffc00e9bd8>] drm_gem_object_unreference_unlocked+0x48/0xb0 [drm]
[<ffffffffc00e9cc9>] drm_gem_object_handle_unreference_unlocked+0x69/0xb0 [drm]
[<ffffffffc00e9d63>] drm_gem_object_release_handle+0x53/0x90 [drm]
[<ffffffffc00e9dff>] drm_gem_handle_delete+0x5f/0x90 [drm]
[<ffffffffc00ea5d5>] drm_gem_close_ioctl+0x25/0x30 [drm]
[<ffffffffc00eaedc>] drm_ioctl+0x20c/0x4b0 [drm]
[<ffffffffc00ea5b0>] ? drm_gem_handle_create+0x40/0x40 [drm]
[<ffffffff8109e922>] ? __set_current_blocked+0x42/0x70
[<ffffffff8103528e>] ? fpu_finit+0x1e/0x30
[<ffffffffc023a404>] nouveau_drm_ioctl+0x54/0xc0 [nouveau]
[<ffffffff8121524d>] do_vfs_ioctl+0x33d/0x540
[<ffffffff812b780f>] ? file_has_perm+0x9f/0xb0
[<ffffffff812154f1>] SyS_ioctl+0xa1/0xc0
[<ffffffff816b5009>] system_call_fastpath+0x16/0x1b
bash-4.2$

This appears to have been reported on other platforms elsewhere.
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-nouveau/+bug/1698450
https://bugzilla.redhat.com/show_bug.cgi?id=1449961

bash-4.2$ cat backtrace
WARNING: CPU: 16 PID: 4286 at drivers/gpu/drm/nouveau/nouveau_bo.c:1212 nouveau_bo_move_ntfy+0xb8/0xc0 [nouveau]
Modules linked in: mvfs(OE) nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter binfmt_misc dm_mirror dm_region_hash dm_log dm_mod sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi crc32_pclmul ghash_clmulni_intel snd_hda_intel aesni_intel dcdbas lrw gf128mul glue_helper ablk_helper cryptd snd_hda_codec iTCO_wdt sg snd_hda_core mei_wdt snd_hwdep iTCO_vendor_support pcspkr snd_seq snd_seq_device snd_pcm snd_timer snd shpchp soundcore i2c_i801 lpc_ich mei_me mei nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif crct10dif_generic nouveau video mxm_wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ahci e1000e libahci libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw ptp i2c_core pps_core wmi
CPU: 16 PID: 4286 Comm: gnome-shell Tainted: G W OE ------------ 3.10.0-693.2.1.el7.x86_64 #1
Hardware name: Dell Inc. Precision T5610/0WN7Y6, BIOS A03 09/05/2013
0000000000000000 0000000098821002 ffff88043dbc38c0 ffffffff816a3db1
ffff88043dbc3900 ffffffff810879c8 000004bc506cbc00 ffff88085265cc00
ffff88043dbc3a00 ffff8808506cbc00 ffff8808506cbec8 ffff8808506cbc00
Call Trace:
[<ffffffff816a3db1>] dump_stack+0x19/0x1b
[<ffffffff810879c8>] __warn+0xd8/0x100
[<ffffffff81087b0d>] warn_slowpath_null+0x1d/0x20
[<ffffffffc026f508>] nouveau_bo_move_ntfy+0xb8/0xc0 [nouveau]
[<ffffffffc00b1b0e>] ttm_bo_handle_move_mem+0x22e/0x5a0 [ttm]
[<ffffffffc00b26f3>] ? ttm_bo_mem_space+0x3b3/0x460 [ttm]
[<ffffffff811df73c>] ? kmem_cache_alloc_trace+0x3c/0x200
[<ffffffffc00b1fc2>] ttm_bo_evict+0x142/0x2e0 [ttm]
[<ffffffff81460019>] ? dma_fence_wait_timeout+0x39/0xd0
[<ffffffffc00b22c6>] ttm_mem_evict_first+0x166/0x1e0 [ttm]
[<ffffffffc00b261d>] ttm_bo_mem_space+0x2dd/0x460 [ttm]
[<ffffffffc00b2bea>] ttm_bo_validate+0xda/0x160 [ttm]
[<ffffffffc00b2ea0>] ttm_bo_init+0x230/0x4b0 [ttm]
[<ffffffffc02704ec>] nouveau_bo_new+0x1fc/0x340 [nouveau]
[<ffffffffc026ef70>] ? nv10_bo_put_tile_region+0x80/0x80 [nouveau]
[<ffffffffc0272da2>] nouveau_gem_new+0x82/0x140 [nouveau]
[<ffffffffc0272ee9>] nouveau_gem_ioctl_new+0x89/0x160 [nouveau]
[<ffffffffc010eedc>] drm_ioctl+0x20c/0x4b0 [drm]
[<ffffffffc0272e60>] ? nouveau_gem_new+0x140/0x140 [nouveau]
[<ffffffffc026b404>] nouveau_drm_ioctl+0x54/0xc0 [nouveau]
[<ffffffff8121524d>] do_vfs_ioctl+0x33d/0x540
[<ffffffff812b780f>] ? file_has_perm+0x9f/0xb0
[<ffffffff812154f1>] SyS_ioctl+0xa1/0xc0
[<ffffffff816b5009>] system_call_fastpath+0x16/0x1b

Different backtrace, but quite possibly the same problem.

Similar behavior was experienced with NVIDIA Quadro NVS 295, NVS 310 and FX 1800 cards running the nouveau driver with RH kernel 7.4, kernel 3.10.0-693.2.1.el7.x86_64.

Do you still see this in 7.5?
Actually going to dev_ack+ this bug because I'm able to reproduce this 100% of the time with Rob's latest DRM backport on the ThinkPad W530 I've got over here in bss (which I can set you up with ssh credentials for, it's already in the Red Hat intranet). We've got some other bugs on the RPL that depend on this machine working: https://bugzilla.redhat.com/show_bug.cgi?id=1305618.

(In reply to Lyude from comment #8)
> Actually going to dev_ack+ this bug because I'm able to reproduce this 100%
> of the time with rob's latest DRM backport on the ThinkPad W530 I've got
> over here in bss (which I can set you up with ssh credentials for, it's
> already in the red hat intranet). We've got some other bugs that are
> depending on this machine being able to work that are on the RPL:
> https://bugzilla.redhat.com/show_bug.cgi?id=1305618.

I think that is yet a 3rd issue, with an inadvertent/bogus warn_on splat in the rhel76 backport ;-) (The first two splats don't appear to be the same issue; the one in #c3 looks like it could be a userspace issue.)

(In reply to Ben Skeggs from comment #5)
> Do you still see this in 7.5?

Reinstating the needinfo, which was lost around the #c8 - #c11 confusion about an unrelated bug.

Red Hat Enterprise Linux 7 shipped its final minor release on September 29th, 2020. 7.9 was the last minor release scheduled for RHEL 7. From initial triage it does not appear the remaining Bugzillas meet the inclusion criteria for Maintenance Phase 2, and they will now be closed.

From the RHEL life cycle page: https://access.redhat.com/support/policy/updates/errata#Maintenance_Support_2_Phase
"During Maintenance Support 2 Phase for Red Hat Enterprise Linux version 7, Red Hat defined Critical and Important impact Security Advisories (RHSAs) and selected (at Red Hat discretion) Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available."

If this BZ was closed in error and meets the above criteria, please re-open it, flag it for 7.9.z, provide suitable business and technical justifications, and follow the process for Accelerated Fixes: https://source.redhat.com/groups/public/pnt-cxno/pnt_customer_experience_and_operations_wiki/support_delivery_accelerated_fix_release_handbook

Feature Requests can be re-opened and moved to RHEL 8 if the desired functionality is not already present in the product.

Please reach out to the applicable Product Experience Engineer[0] if you have any questions or concerns.

[0] https://bugzilla.redhat.com/page.cgi?id=agile_component_mapping.html&product=Red+Hat+Enterprise+Linux+7

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days |
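
For anyone triaging similar reports, the question raised in the thread (whether the two WARNING splats in the description share a cause) can be approached mechanically by comparing the warning site and the call-trace frames. The following is a minimal sketch, assuming Python 3 and the 3.10-era trace format quoted in this report. `parse_splat`, `WARN_RE` and `FRAME_RE` are hypothetical helpers written for illustration, not part of abrt or any kernel tooling, and the sample strings are abbreviated excerpts of the two backtraces above.

```python
import re

# Header line the kernel prints for a WARN_ON, e.g.
# "WARNING: CPU: 0 PID: 1300 at <file>:<line> <func>+0x79/0x80 [nouveau]"
WARN_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# One call-trace frame, e.g. "[<ffffffffc0143409>] ttm_bo_unref+0x29/0x30 [ttm]".
# A "? " before the symbol marks a frame the unwinder is unsure about.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<unsure>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_splat(text):
    """Return ((file, line, func), [reliable frame names]) for one splat."""
    warn = WARN_RE.search(text)
    if warn is None:
        raise ValueError("no WARNING header found")
    site = (warn["file"], int(warn["line"]), warn["func"])
    # Only frames after the "Call Trace:" marker are considered.
    _, _, trace = text.partition("Call Trace:")
    frames = [m["func"] for m in FRAME_RE.finditer(trace) if not m["unsure"]]
    return site, frames


# Abbreviated excerpts of the two backtraces quoted in this report.
FIRST = (
    "WARNING: CPU: 0 PID: 1300 at drivers/gpu/drm/nouveau/nouveau_bo.c:137 "
    "nouveau_bo_del_ttm+0x79/0x80 [nouveau] Call Trace: "
    "[<ffffffffc023dfe9>] nouveau_bo_del_ttm+0x79/0x80 [nouveau] "
    "[<ffffffffc0142ebb>] ttm_bo_release_list+0xbb/0x1a0 [ttm] "
    "[<ffffffffc00ea5b0>] ? drm_gem_handle_create+0x40/0x40 [drm] "
    "[<ffffffffc023a404>] nouveau_drm_ioctl+0x54/0xc0 [nouveau]"
)
SECOND = (
    "WARNING: CPU: 16 PID: 4286 at drivers/gpu/drm/nouveau/nouveau_bo.c:1212 "
    "nouveau_bo_move_ntfy+0xb8/0xc0 [nouveau] Call Trace: "
    "[<ffffffffc026f508>] nouveau_bo_move_ntfy+0xb8/0xc0 [nouveau] "
    "[<ffffffffc00b1fc2>] ttm_bo_evict+0x142/0x2e0 [ttm] "
    "[<ffffffffc026b404>] nouveau_drm_ioctl+0x54/0xc0 [nouveau]"
)

site_a, frames_a = parse_splat(FIRST)
site_b, frames_b = parse_splat(SECOND)
print("warning sites differ:", site_a != site_b)
print("shared frames:", sorted(set(frames_a) & set(frames_b)))
```

This is only a triage aid. Differing warning sites, with nothing in common beyond the ioctl entry frames, are consistent with the later comment that the first two splats are separate issues, but they do not prove it on their own.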