Bug 1982746

Summary: Mesa reuses shader cache created for different CPU causing gnome-shell crash
Product: Red Hat Enterprise Linux 8
Component: mesa
Version: 8.4
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Target Milestone: beta
Hardware: Unspecified
OS: Unspecified
Reporter: zzambers
Assignee: Dave Airlie <airlied>
QA Contact: Peter Kopec <pekopec>
CC: csoriano, tpelka
Type: Bug
Deadline: 2021-07-26
Last Closed: 2021-11-09 18:37:11 UTC
Attachments:
Description Flags
gnome-shell-crash-stacks.txt none

Description zzambers 2021-07-15 15:26:45 UTC
Mesa (llvmpipe) reuses a shader cache created on a different CPU, causing gnome-shell to crash. The gnome-shell process hit the problem in a scenario where a VM image with a GUI was created/updated on one machine and then used to create a VM on another machine (older/different CPU). The gnome-shell process then crashes because it executes an instruction the CPU does not support.

from dmesg:
[   13.558905] traps: gnome-shell[1543] trap invalid opcode ip:7fe25c0db024 sp:7ffd73cbda20 error:0

The issue can be worked around by manually clearing the shader cache in:
~/.cache/mesa_shader_cache
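
A minimal Python sketch of that manual cleanup, for anyone who wants to script it (e.g. as a first-boot step for an exported image). The helper names `default_cache_dir` and `clear_shader_cache` are made up for illustration, and the assumption that the cache lives under `$XDG_CACHE_HOME` (falling back to `~/.cache`) should be checked against the Mesa version in use:

```python
import os
import shutil
from pathlib import Path


def default_cache_dir(env=None):
    """Guess the shader cache location (assumed lookup order, not verified
    against Mesa): $XDG_CACHE_HOME if set, otherwise ~/.cache."""
    env = os.environ if env is None else env
    base = env.get("XDG_CACHE_HOME") or os.path.join(os.path.expanduser("~"), ".cache")
    return Path(base) / "mesa_shader_cache"


def clear_shader_cache(cache_dir):
    """Remove the cache directory and everything in it.

    Returns True if a directory was removed, False if there was nothing to do.
    """
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True


# Example use (run as the affected user, before the session starts):
#   clear_shader_cache(default_cache_dir())
```

Removing the directory is safe in the sense that it only holds cached data; it gets repopulated on the next run.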


It seems that mesa generates shader code for the newer CPU and then tries to reuse that code on the older CPU, even though the older CPU does not support all the necessary extensions. Mesa should detect this and clear its cache automatically.
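
To illustrate the kind of detection meant here, below is a rough Python sketch that mixes the CPU feature flags into the cache key, so that entries compiled for one CPU are never looked up on another. This is not Mesa's code; `cpu_flags` and `cache_key` are hypothetical names, and reading the `flags` line of `/proc/cpuinfo` is an x86 Linux assumption (other architectures name that line differently):

```python
import hashlib


def cpu_flags(cpuinfo_text):
    """Return the feature flags of the first CPU found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return frozenset(value.split())
    return frozenset()


def cache_key(shader_source, flags):
    """Hash the shader source together with the sorted CPU flags.

    A different flag set gives a different key, so code generated for a
    newer CPU simply misses the cache on an older one.
    """
    h = hashlib.sha1()
    h.update(" ".join(sorted(flags)).encode())
    h.update(b"\0")  # separator between the flag list and the source
    h.update(shader_source)
    return h.hexdigest()


# Example use on a running system:
#   with open("/proc/cpuinfo") as f:
#       key = cache_key(b"...shader...", cpu_flags(f.read()))
```

Keying on the flags avoids having to clear anything: stale entries are just never matched.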


Steps to reproduce:
- start rhel-8 with gui on a machine with a newer CPU, so that the shader cache is populated, then shut it down
- export its image (disk)
- use the exported image on a machine with a different/older CPU

packages:

mesa:
mesa-dri-drivers-20.3.3-2.el8.x86_64
mesa-filesystem-20.3.3-2.el8.x86_64
mesa-libEGL-20.3.3-2.el8.x86_64
mesa-libgbm-20.3.3-2.el8.x86_64
mesa-libGL-20.3.3-2.el8.x86_64
mesa-libglapi-20.3.3-2.el8.x86_64
mesa-libGLU-9.0.0-15.el8.x86_64
mesa-libxatracker-20.3.3-2.el8.x86_64

gnome-shell:
gnome-shell-3.32.2-30.el8.x86_64

Comment 1 zzambers 2021-07-15 15:33:12 UTC
Created attachment 1801912 [details]
gnome-shell-crash-stacks.txt

Comment 2 Dave Airlie 2021-07-15 23:54:54 UTC
this has likely been fixed upstream by


commit 9520b70f75d7a695966f36ff619557c88c25a0dc
Author: Dave Airlie <airlied>
Date:   Mon May 24 05:19:43 2021 +1000

    llvmpipe: add the interesting bit of cpu detection to the cache.
    
    This should detect if someone changes CPU configuration that matters like in a VM
    
    Reviewed-by: Emma Anholt <emma>
    Fixes: 6c0c61cb48e8 ("llvmpipe: add infrastructure for disk cache support")
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10946>

This should be in the 21.1.3 packages we are planning for RHEL 8.5

MESA_GLSL_CACHE_DISABLE=1 in the env should also work around it.
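
A small Python sketch of applying that environment workaround to a single command. The helper names are hypothetical; only the `MESA_GLSL_CACHE_DISABLE=1` variable comes from the comment above, and whether a given Mesa build honours it should be verified:

```python
import os
import subprocess


def env_without_shader_cache(base_env=None):
    """Return a copy of the environment with the shader cache disabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_GLSL_CACHE_DISABLE"] = "1"
    return env


def run_without_shader_cache(argv):
    """Run a command with the shader cache disabled; return its exit code."""
    return subprocess.run(argv, env=env_without_shader_cache()).returncode


# Example use (glxinfo is just an example command):
#   run_without_shader_cache(["glxinfo", "-B"])
```

For gnome-shell itself the variable has to be set in the session's environment before the shell starts, which this wrapper does not cover.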

Comment 5 Peter Kopec 2021-07-27 15:01:25 UTC
This issue is fixed with packages from the RHEL 8.5 compose; setting Verified: Tested.

Comment 7 Peter Kopec 2021-07-30 07:58:52 UTC
tested with latest packages, issue is fixed

Comment 9 errata-xmlrpc 2021-11-09 18:37:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (mesa and related packages bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:4234