Hi, When starting a domain with virgl support enabled, qemu complains that /var/lib/libvirt/.cache doesn't exist. (On Debian, /var/lib/libvirt/ is the home directory of the libvirt user.) AFAICS, mesa uses this directory to cache the generated shaders. Mesa tries this location because it is the XDG location for the shader cache files. IMVHO, the cache should be moved to /var/cache/libvirt/ when running a system domain, by setting XDG_CACHE_HOME to that value.
Looking at the mesa code, MESA_GLSL_CACHE_DIR could be set as well. But I'm starting to wonder whether this is a mesa bug, because the error message seems to suggest that mesa is disabling the feature rather than failing completely.
I'm wondering why no one else has reported this problem yet, because on Fedora the 'qemu' user's home directory is "/", so QEMU should get the same permission denied error when attempting to use $HOME/.cache
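To make the lookup order discussed above concrete, here is a minimal sketch of how the cache base directory might be resolved. It is not Mesa's code: the function name `shader_cache_base` is hypothetical, and the precedence (MESA_GLSL_CACHE_DIR, then XDG_CACHE_HOME, then the home directory plus "/.cache") is an assumption based on the comments in this report.

```python
from typing import Mapping


def shader_cache_base(env: Mapping[str, str], home: str) -> str:
    """Return the directory a shader cache would be created under.

    Assumed precedence (taken from this thread, not from Mesa's source):
    MESA_GLSL_CACHE_DIR, then XDG_CACHE_HOME, then <home>/.cache.
    """
    explicit = env.get("MESA_GLSL_CACHE_DIR")
    if explicit:
        return explicit
    xdg_cache = env.get("XDG_CACHE_HOME")
    if xdg_cache:
        return xdg_cache
    # Plain concatenation: a home directory of "/" yields "//.cache".
    return home + "/.cache"
```

With a home directory of "/" and neither variable set, the sketch yields "//.cache", the path that appears in the error messages reported below. Setting XDG_CACHE_HOME redirects it without touching the home directory.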
Could be related to mesa version? Debian unstable has 18.2.6
I am running into this issue on ArchLinux as well when I am trying to create a Fedora 29 VM.

Unable to complete install: 'internal error: qemu unexpectedly closed the monitor: Failed to create //.cache for shader cache (Permission denied)---disabling.'

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 75, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/create.py", line 2119, in _do_async_install
    guest.installer_instance.start_install(guest, meter=meter)
  File "/usr/share/virt-manager/virtinst/installer.py", line 419, in start_install
    doboot, transient)
  File "/usr/share/virt-manager/virtinst/installer.py", line 362, in _create_guest
    domain = self.conn.createXML(install_xml or final_xml, 0)
  File "/usr/lib/python3.7/site-packages/libvirt.py", line 3726, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: Failed to create //.cache for shader cache (Permission denied)---disabling.
I can confirm this behaviour on Arch Linux when trying to create VMs of any distro, but only when I enable "OpenGL" on the Virtio-GPU. When I start the qcow2 image file with QEMU directly from the shell, it works with Virtio-GPU and OpenGL enabled! Here is another user (in the Arch forum) with the exact same issue: https://bbs.archlinux.org/viewtopic.php?id=243555

**Here is the error message when trying to start the VM in virt-manager with OpenGL enabled:**

Error starting domain: internal error: qemu unexpectedly closed the monitor: Failed to create //.cache for shader cache (Permission denied)---disabling.

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 75, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 111, in tmpcb
    callback(*args, **kwargs)
  File "/usr/share/virt-manager/virtManager/libvirtobject.py", line 66, in newfn
    ret = fn(self, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1420, in startup
    self._backend.create()
  File "/usr/lib/python3.7/site-packages/libvirt.py", line 1080, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: Failed to create //.cache for shader cache (Permission denied)---disabling.

**And when starting the same qcow2 VM directly with QEMU in the shell, it works with OpenGL with this command:**

sudo qemu-system-x86_64 -enable-kvm -M q35 -smp 2 -m 2G -hda test.qcow2 -net nic,model=virtio -net user,hostfwd=tcp::2222-:22 -vga virtio -display sdl,gl=on

This is my very first Linux bug report, so please just tell me if I missed something or did something wrong! Thank you for your great work with virt-manager, and in advance for any help with this issue, whether it is a real bug or not!
I tried to include all info that could be relevant:

**Here is the config of Virt-Manager for this VM:**
IMGUR: https://imgur.com/a/QbFvmy6

**Here are all 3 different settings I tried on the display page to enable OpenGL and their corresponding errors:**
IMGUR: https://imgur.com/a/oNM6Jsj

**Here are my settings as text (the pictures provided above show more detail):**
GUEST OS: Antergos KDE, Budgie, Mate
vCPU: 2 - Copy from Host
RAM: 2/3GB
CHIPSET: Q35
DISK: Virtio 20GB - Default Settings
GPU: Virtio w/ 3D-acceleration
DISPLAY: Spice w/ OpenGL
NETWORK: Virtio - Standard NAT
CDROM: SATA - Only for Setup
USB: 3.0
OTHERS: Defaults
PASSTHROUGH: None

**INFOS - SYSTEM - SOFTWARE - DESKTOP:**
OS: Arch
DE: KDE - Modded with Latte Dock + Extensions. No extra themes ATM. Suru Icon pack.
BOOTLOADER: systemD UEFI boot
SOFTWARE: Mix of all types, KDE+GTK+Java+Electron, you name it. Around some hundred user installed packages.
GPU-DRIVER: AMD open source
UEFI OR BIOS: Latest UEFI
VIRTUALIZATION ENABLED IN UEFI: Yes
KERNEL: Latest stable Arch releases
MESA: Latest stable Arch releases

**INFOS - SYSTEM - HARDWARE - DESKTOP:**
MAINBOARD: AMD AM4 - ASUS ROG Strix B450-F Gaming
CPU: AMD Ryzen 7 1700 (8C/16T + NO IGP)
GPU: Sapphire Radeon RX 570 Nitro 8GB
RAM: 16GB = 2X8GB DDR4 non-ECC 2666MHZ Crucial Ballistix Sport LT
LAN: 1Gb/S Intel (Luckily NO Realtek)
SYSTEM-DISK: Samsung EVO 960 M2 NVME 250GB - LUKS
HOME DIR DISK: Samsung EVO 850 SATA 500GB - LUKS
HDDS: 4 X 8TB Seagate Ironwolf NAS HDDs 7200 RPM (Only attached to power when needed for Backup from my FreeNAS. LVM RAID5)
DISPLAY: LG 4K HDR WebOS TV @ HDMI 2.0
INPUT: USB-wired Mouse + Keyboard
(In reply to Laurent Bigonville from comment #0)
> Hi,
>
> When starting a domain with virgl support enabled, qemu complains that
> /var/lib/libvirt/.cache doesn't exits. (On debian /var/lib/libvirt/ is the
> home for the libvirt user)
>
> AFAICS, mesa is using this directory to cache the generated shaders.
>
> Mesa tries this location because it the XDG location for the shader cache
> files.
>
> IMVHO, this should be moved to /var/cache/libvirt/ when running a

I think we could override XDG_CACHE_HOME to point to /var/lib/libvirt/qemu/<domain_name>/.cache (possibly overriding more of the XDG envs).
(In reply to Laurent Bigonville from comment #3)
> Could be related to mesa version? Debian unstable has 18.2.6

I'm running Fedora 29, and with both mesa 18.2.8 and 18.3.3 I can launch a VM with a virtio GPU. However, I don't see the "mesa_shader_cache" directory being created anywhere, which I suppose means that the on-disk cache is disabled, either by default on my distro or because something explicitly defines MESA_GLSL_CACHE_DISABLE, although I can't find it defined anywhere on my system. Gerd, do you have any idea what I'm missing here?
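One way to check whether any of these variables is actually defined for a running QEMU process is to read its initial environment from /proc on Linux. The following is only a diagnostic sketch: the helper names `parse_environ` and `mesa_cache_vars` are made up for illustration, and reading another user's /proc/<pid>/environ normally requires root.

```python
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=VALUE format of /proc/<pid>/environ."""
    env: Dict[str, str] = {}
    for entry in raw.split(b"\0"):
        if b"=" not in entry:
            continue  # skip empty trailing entries and malformed ones
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def mesa_cache_vars(pid: int) -> Dict[str, str]:
    """Return the cache-related variables set for the given process, if any."""
    with open(f"/proc/{pid}/environ", "rb") as fh:
        env = parse_environ(fh.read())
    names = ("HOME", "XDG_CACHE_HOME",
             "MESA_GLSL_CACHE_DIR", "MESA_GLSL_CACHE_DISABLE")
    return {name: env[name] for name in names if name in env}
```

Calling `mesa_cache_vars(pid)` with the PID of the qemu process for the affected domain would show whether MESA_GLSL_CACHE_DISABLE is present in its environment and which HOME it was started with.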
(In reply to Erik Skultety from comment #7)
> (In reply to Laurent Bigonville from comment #3)
> > Could be related to mesa version? Debian unstable has 18.2.6
>
> I'm running fedora 29 and with both mesa 18.2.8 and 18.3.3 I can launch a VM
> with a virtio GPU. However, I don't see the "mesa_shader_cache" directory
> being created anywhere which I suppose means that the on-disk cache is
> disabled, either by default on my distro or something explicitly defines
> MESA_GLSL_CACHE_DISABLE which I don't see anywhere to be defining on my
> system, Gerd do you have any idea what I'm missing here?

No clue, never seen that. Possibly only some mesa drivers use this cache, so whether you run into this or not could depend on the host GPU hardware. Intel graphics here. Comment 9 says radeon (so ati/amd).
This problem suddenly popped-up for me today on Fedora 29 4.20.8-200.fc29.x86_64 after months of trouble free usage. Ryzen 2400G. G
(In reply to gobbledegeek from comment #9)
> This problem suddenly popped-up for me today on Fedora 29
> 4.20.8-200.fc29.x86_64 after months of trouble free usage. Ryzen 2400G.
>
> G

Thanks for the input. However, I don't have access to any HW with AMD/ATI graphics at the moment, so I need to look around our machine reservation system to reproduce.
My bad - looking closer the errors that popped up for me are a little different:

Error starting domain: internal error: qemu unexpectedly closed the monitor: Failed to create //.cache/mesa_shader_cache for shader cache (Permission denied)---disabling.

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 75, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 111, in tmpcb
    callback(*args, **kwargs)
  File "/usr/share/virt-manager/virtManager/libvirtobject.py", line 66, in newfn
    ret = fn(self, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1420, in startup
    self._backend.create()
  File "/usr/lib64/python3.7/site-packages/libvirt.py", line 1080, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: Failed to create //.cache/mesa_shader_cache for shader cache (Permission denied)---disabling.

Does this need a separate CR? G
I've just triggered this on my home machine: 'Failed to create //.cache for shader cache (Permission denied)--disabling.' This is f29 with a Radeon RX550 using the default/open/radv mesa drivers.
(In reply to Erik Skultety from comment #6) > I think we could override XDG_CACHE_HOME to point to > /var/lib/libvirt/qemu/<domain_name>/.cache (possibly overriding more of the > XDG envs). Yes, we should definitely set XDG_CACHE_HOME, even for the unprivileged libvirtd, since we need to ensure that each guest gets its own distinct cache directory. We can't have a single shared cache as SELinux will block that.
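As a rough illustration of the per-guest layout proposed above, the sketch below derives a distinct cache directory for each domain. The helper name `per_guest_cache_dir` is hypothetical, the base path is the one suggested in comment #6, and the name validation is only a precaution in this sketch, not a description of what libvirt does.

```python
import os


def per_guest_cache_dir(base: str, domain_name: str) -> str:
    """Return <base>/<domain_name>/.cache, one distinct directory per guest."""
    # Precaution for this sketch: refuse names that would escape `base`.
    if not domain_name or "/" in domain_name or domain_name in (".", ".."):
        raise ValueError(f"unsuitable domain name: {domain_name!r}")
    return os.path.join(base, domain_name, ".cache")
```

Two guests then never share a cache directory: `per_guest_cache_dir("/var/lib/libvirt/qemu", "f29")` and the same call with another domain name resolve to different paths, which is the property the SELinux concern requires.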
KVM/QEMU/Virtual Machine Manager: Error starting domain: internal error: process exited while connecting to monitor: Failed to create //.cache for shader cache (Permission denied)---disabling. This Fedora29 VM was working properly on a Fedora29 host until I ran dnf update on 27 Feb (which updated the host). The VM still runs with virtio video, but I must disable OpenGL in the Spice Server settings, which reduces the performance significantly. I'm a noob, but can follow instructions to retrieve logs, etc. My system details can be found here, in case that helps: https://linux-hardware.org/?probe=e6dca4d6fd Sincere Thanks, Adam
I don't have AMD HW personally, so I'm trying to find some to test reliably. I can see the error with Intel too, except that with Intel, Mesa disables the cache if it fails to create one, so the VM still starts. Anyhow, I prepared a simple patch for libvirt, but it crashes in Mesa's Intel driver i965_dri, so I can't verify whether the fix will suffice. If anyone can give my patch a try and verify that it works with AMD HW, I can proceed with proposing it upstream and file a bug against Mesa for the Intel driver. See my GitHub branch for the patch: https://github.com/eskultety/libvirt/commits/xdg-vars
Tested on my home box; with that libvirt it gets further and falls over on mesa trying to do a pthread_setaffinity (which I saw a different report of). If I turn off sandboxing in qemu.conf (which is of course a bad thing), the VM starts; I have not tried a modern guest yet.
There are some other bugs somewhere as well; with an f29 boot ISO: a) the BIOS boot just shows as mush, b) and the main OS still shows mush.
Proposed the libvirt changes: https://www.redhat.com/archives/libvir-list/2019-March/msg00323.html
The patches are now upstream:

commit 2d69af29073aa3dc2dc5b79afcecaa703b81125a
Refs: v5.1.0-248-g2d69af2907
Author:     Erik Skultety <eskultet>
AuthorDate: Mon Mar 4 12:47:08 2019 +0100
Commit:     Erik Skultety <eskultet>
CommitDate: Fri Mar 15 16:41:26 2019 +0100

    util: command: Introduce virCommandAddEnvXDG helper

    Some modules/libraries within QEMU could make use of the XDG_ vars when
    writing their data to the disk. Define the most common XDG variables and
    point them to the specific driver's libDir, i.e.

    XDG_CACHE_HOME -> /var/lib/libvirt/<driver>/.cache
    XDG_DATA_HOME -> /var/lib/libvirt/<driver>/.local/share
    XDG_CONFIG_HOME -> /var/lib/libvirt/<driver>/.config

    Signed-off-by: Erik Skultety <eskultet>
    Reviewed-by: Daniel P. Berrangé <berrange>

commit 7e73137495334c9995543e87897237e8a88f6c9b
Refs: v5.1.0-249-g7e73137495
Author:     Erik Skultety <eskultet>
AuthorDate: Fri Mar 8 12:15:07 2019 +0100
Commit:     Erik Skultety <eskultet>
CommitDate: Fri Mar 15 16:41:26 2019 +0100

    qemu: command: Enforce setting XDG variables for system QEMU

    For session mode, only XDG_CACHE_HOME is set, because we want to remain
    integrating with services in user session, but for system mode, this would
    have become reading/writing to '/' which carries the obvious issue with
    permissions (also, '/' is the wrong location in 99.9% cases anyway).

    Signed-off-by: Erik Skultety <eskultet>
    Reviewed-by: Daniel P. Berrangé <berrange>

commit 75a916988100b8ecdd72b4715ff3b20fa6e99f53
Refs: v5.1.0-250-g75a9169881
Author:     Erik Skultety <eskultet>
AuthorDate: Wed Mar 6 13:29:01 2019 +0100
Commit:     Erik Skultety <eskultet>
CommitDate: Fri Mar 15 16:41:26 2019 +0100

    qemu: command: Override HOME variable for system QEMU

    By default, qemu user's home dir points to '/' which shouldn't be used at
    all. We therefore pass the HOME variable from the current variable iff not
    running as SUID, which means that for systemd we never set it.
    This patch makes sure, that for system QEMU this is always set to
    libDir/<driver>, session mode is left untouched.

    Signed-off-by: Erik Skultety <eskultet>
    Reviewed-by: Daniel P. Berrangé <berrange>

Beware, seccomp-sandboxing still needs to be turned off as mentioned in comment 16 in order for this to work properly.
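The effect of these three commits on the QEMU process environment can be summarised in a short sketch. This is an illustration in Python, not libvirt's C implementation: the function `qemu_xdg_env` and its parameters are hypothetical, and the mapping simply follows the commit messages (all three XDG variables plus HOME for system mode, only XDG_CACHE_HOME for session mode).

```python
import os
from typing import Dict


def qemu_xdg_env(lib_dir: str, system_mode: bool) -> Dict[str, str]:
    """Environment overrides described by the commit messages above.

    lib_dir stands for the driver's libDir, e.g. /var/lib/libvirt/<driver>.
    """
    env = {"XDG_CACHE_HOME": os.path.join(lib_dir, ".cache")}
    if system_mode:
        # System QEMU: keep everything away from '/', the qemu user's home.
        env["XDG_DATA_HOME"] = os.path.join(lib_dir, ".local", "share")
        env["XDG_CONFIG_HOME"] = os.path.join(lib_dir, ".config")
        env["HOME"] = lib_dir
    return env
```

For example, `qemu_xdg_env("/var/lib/libvirt/qemu", system_mode=True)` gives a cache path under the driver's libDir instead of "//.cache", while session mode leaves HOME and the other XDG variables to the user's session.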
Erik,

>>Beware, seccomp-sandboxing still needs to be turned off as mentioned in comment 16 in order for this to work properly.

Thanks for your work on this, but IMHO we do not have a complete resolution yet if security has to be compromised in order for libvirtd to run. Is there a case for opening a different CR to address the need to disable qemu sandboxing? Thanks G
There is a patch pending in QEMU to resolve the seccomp problem by returning EPERM from the syscall instead of killing QEMU https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg04413.html
(In reply to gobbledegeek from comment #20)
> Erik,
>
> >>Beware, seccomp-sandboxing still needs to be turned off as mentioned in comment 16 in order for this to work properly.
>
> Thanks for your work on this. but IMHO we do not have a complete resolution
> to this as yet - if security has to be compromised in order for libvirtd to
> run.
> Is there a case for a different CR to be opened in order to plug the need to
> disable qemu sandboxing?

Not sure I understand your thoughts. From libvirt's POV, I don't think we could do more here. One thing I forgot to do above was to link the Ubuntu issue filed for QEMU mentioned by Dave: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889 Apparently, Mesa reverted setting the thread affinity, which means the qemu process should not get killed anymore, so you might want to give that Mesa build a try with seccomp enabled in the qemu config and see whether it works for you.