Bug 1957758
| Field | Value |
|---|---|
| Summary | [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO |
| Product | Red Hat Enterprise Linux 9 |
| Reporter | Michal Odehnal <modehnal> |
| Component | kernel |
| Assignee | Lyude <lyude> |
| kernel sub component | Graphics |
| QA Contact | Desktop QE <desktop-qa-list> |
| Status | CLOSED CURRENTRELEASE |
| Docs Contact | |
| Severity | high |
| Priority | unspecified |
| CC | airlied, coolmathgamesx, csoriano, dgilbert, kraxel, lyude, mkrajnak, philipp, pvlasin, rduda, wadehamptoniv |
| Version | 9.0 |
| Keywords | Regression, TestBlocker, Triaged |
| Target Milestone | beta |
| Target Release | --- |
| Hardware | x86_64 |
| OS | All |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | If docs needed, set a value |
| Doc Text | |
| Story Points | --- |
| Clone Of | |
| Environment | |
| Last Closed | 2021-09-10 09:49:28 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Attachments | |

Description
Michal Odehnal
2021-05-06 12:20:13 UTC
Reproduced with xorg-x11-drv-qxl-0.1.5-21.el9.x86_64 and kernel-5.12.0-1.el9.x86_64. The VM guest is unusable after a while.

Created attachment 1780645 [details]
dmesg

This is not reproducible without spice graphics in guest qemu cmd-line

(In reply to Radek Duda from comment #3)
> This is not reproducible without spice graphics in guest qemu cmd-line

Not true, I can reproduce without spice now

Fix (landed upstream in 5.13-rc1):

commit 4fff19ae427548d8c37260c975a4b20d3c040ec6
Author: Gerd Hoffmann <kraxel>
Date:   Wed Feb 17 13:32:05 2021 +0100

    drm/qxl: use ttm bo priorities

    Allow to set priorities for buffer objects.  Use priority 1 for surface
    and cursor command releases.  Use priority 0 for drawing command
    releases.  That way the short-living drawing commands are first in line
    when it comes to eviction, making it *much* less likely that
    ttm_bo_mem_force_space() picks something which can't be evicted and
    throws an error after waiting a while without success.

> Fix (landed upstream in 5.13-rc1):
note/patch sent to stable@, so the fix should land in 5.{10,11,12} stable branches soon.
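
To make the idea in the commit message concrete, here is a toy model of priority-ordered eviction. It is only a sketch and not the kernel's TTM code: the `ToyBO` and `ToyVram` names, the sizes, and the "skip unevictable objects" shortcut are assumptions made for illustration. Only the two priority values (0 for drawing command releases, 1 for surface and cursor command releases) come from the commit message.

```python
from collections import deque
from dataclasses import dataclass

# Priority values taken from the commit message; everything else is a toy model.
PRIO_DRAW = 0      # drawing command releases: short-lived, first in line for eviction
PRIO_SURFACE = 1   # surface and cursor command releases: evicted only afterwards


@dataclass
class ToyBO:
    """Hypothetical stand-in for a buffer object; not the kernel structure."""
    name: str
    size: int
    priority: int
    evictable: bool = True


class ToyVram:
    """Toy VRAM manager that keeps one LRU queue per priority level."""

    def __init__(self, capacity, num_priorities=2):
        self.capacity = capacity
        self.used = 0
        self.lru = [deque() for _ in range(num_priorities)]

    def _evict_one(self):
        # Walk priorities from lowest to highest, oldest object first.
        # Simplification: unevictable objects are skipped rather than waited on.
        for queue in self.lru:
            for bo in queue:
                if bo.evictable:
                    queue.remove(bo)
                    self.used -= bo.size
                    return bo
        return None

    def alloc(self, bo):
        """Place bo, evicting as needed; return the names of evicted objects."""
        evicted = []
        while self.used + bo.size > self.capacity:
            victim = self._evict_one()
            if victim is None:
                raise MemoryError("failed to allocate VRAM BO (toy model)")
            evicted.append(victim.name)
        self.lru[bo.priority].append(bo)
        self.used += bo.size
        return evicted
```

In this model, filling a 4-unit pool with alternating cursor/surface and drawing objects and then requesting 2 more units evicts the two drawing objects and leaves the surface and cursor objects in place, which is the ordering the commit message describes.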
In latest RHEL-9 compose I am now seeing similar issue under Xorg, may I assume it is related or is this a different bug?

[ 2000.095786] f 4026531864#17936: failed to wait on release 24 after spincount 301
[ 2000.419825] f 4026531864#17936: failed to wait on release 24 after spincount 301
[ 2000.490606] [TTM] Buffer eviction failed
[ 2000.490882] qxl 0000:00:01.0: object_init failed for (262144, 0x00000001)
[ 2000.491346] [drm:qxl_gem_object_create [qxl]] *ERROR* Failed to allocate GEM object (258580, 1, 4096, -12)
[ 2000.491983] [drm:qxl_alloc_ioctl [qxl]] *ERROR* qxl_alloc_ioctl: failed to create gem ret=-12

(In reply to Michal Odehnal from comment #7)
> In latest RHEL-9 compose I am now seeing similar issue under Xorg, may I
> assume it is related or is this a different bug?
>
> [ 2000.095786] f 4026531864#17936: failed to wait on release 24 after spincount 301
> [ 2000.419825] f 4026531864#17936: failed to wait on release 24 after spincount 301
> [ 2000.490606] [TTM] Buffer eviction failed
> [ 2000.490882] qxl 0000:00:01.0: object_init failed for (262144, 0x00000001)
> [ 2000.491346] [drm:qxl_gem_object_create [qxl]] *ERROR* Failed to allocate GEM object (258580, 1, 4096, -12)
> [ 2000.491983] [drm:qxl_alloc_ioctl [qxl]] *ERROR* qxl_alloc_ioctl: failed to create gem ret=-12

Same thing most likely, this is a kernel issue affecting both xorg and wayland.

> note/patch sent to stable@, so the fix should land in 5.{10,11,12} stable
> branches soon.
v5.12.4 has the fix now (v5.11.21 too, v5.10.x should follow shortly).
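
For anyone triaging a guest, the version facts stated so far can be folded into a small check. This is a sketch under assumptions: the function name `has_qxl_priority_fix` is made up here, it only understands upstream version numbering (distribution kernels may carry or lack backports regardless of their version string), and it returns `None` for any series this thread gives no data for, including 5.10.x.

```python
import re

# First stable releases carrying the fix, as stated in this thread.
# 5.10.x is deliberately absent: the thread only says it "should follow shortly".
FIRST_FIXED_STABLE = {(5, 12): 4, (5, 11): 21}
FIXED_MAINLINE = (5, 13)  # fix landed upstream in 5.13-rc1


def has_qxl_priority_fix(version):
    """Return True or False for upstream versions, None when the thread does not say."""
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {version!r}")
    series = (int(match.group(1)), int(match.group(2)))
    patch = int(match.group(3) or 0)
    if series >= FIXED_MAINLINE:
        return True
    if series in FIRST_FIXED_STABLE:
        return patch >= FIRST_FIXED_STABLE[series]
    return None  # no information in this report for that series
```

A `True` result only means the priority patch is present; it does not rule out the allocation error in every case.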
I believe this will be part of the stable backport. Lyude, make sure you include this one.

My bad, this is RHEL 9. It should be fixed in any future kernel rebase that includes that kernel version, which I don't know when is planned to happen. From the graphics team perspective, we are not currently backporting fixes in RHEL 9 beta kernel, due to capacity limitations. We are relying on the regular kernel rebases.

hmm, I'm seeing something similar in 5.14.0-0.rc4.35.el9 guest, just at the text console; just with lots of stuff scrolling past it'll pause and get a Buffer eviction failed.
object_init failed for (3149824, 0x0....1)
qxl_alloc_bo_reserved [qxl]]] *ERROR* failed to allocate VRAM BO

Is this bug still being seen on the latest RHEL9 kernels?

(In reply to Lyude from comment #16)
> Is this bug still being seen on the latest RHEL9 kernels?

Seems there are some rare cases where the priority (comment #5) doesn't help.

Test case: run "for i in $(seq 1 1000); do dmesg; done" on fbcon. Hangs now and then for a short time, logging an eviction and allocation failure (see also comment #13):

[ 582.893166] [TTM] Buffer eviction failed
[ 582.899206] qxl 0000:00:01.0: object_init failed for (3149824, 0x00000001)
[ 582.900706] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO

The good news: Error handling is solid now, can't see any bad effects on driver stability.

I have not seen this for quite a while so I would consider this solved from my point of view.

I did not meet the bug using latest rhel-9. Considering my findings and comment #18, closing this.

(In reply to Dr. David Alan Gilbert from comment #13)
> hmm, I'm seeing something similar in 5.14.0-0.rc4.35.el9 guest, just at the
> text console; just with lots of stuff scrolling past it'll pause
> and get a Buffer eviction failed.
> object_init failed for (3149824, 0x0....1)
> qxl_alloc_bo_reserved [qxl]]] *ERROR* failed to allocate VRAM BO

Yup, I'm seeing this on F34 guest running in Qemu/KVM on a CentOS 8 host.

5.14.16-201.fc34.x86_64 is my kernel.

Updated as of last Friday to latest on F34 repo.

Also on a text-only console for a server.

Please reopen.

(In reply to Philip Prindeville from comment #21)
> (In reply to Dr. David Alan Gilbert from comment #13)
> > hmm, I'm seeing something similar in 5.14.0-0.rc4.35.el9 guest, just at the
> > text console; just with lots of stuff scrolling past it'll pause
> > and get a Buffer eviction failed.
> > object_init failed for (3149824, 0x0....1)
> > qxl_alloc_bo_reserved [qxl]]] *ERROR* failed to allocate VRAM BO
>
> Yup, I'm seeing this on F34 guest running in Qemu/KVM on a CentOS 8 host.
>
> 5.14.16-201.fc34.x86_64 is my kernel.
>
> Updated as of last Friday to latest on F34 repo.
>
> Also on a text-only console for a server.
>
> Please reopen.

I should add, I'm *not* seeing this line anywhere in dmesg:

[TTM] Buffer eviction failed

I am seeing this message on an AlmaLinux 8.5 guest running on an AlmaLinux 8.5 host, both fully updated (as of a few days ago).
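
Since reports in this thread differ on which messages appear (some see the allocation error without the eviction line), a small tally over saved dmesg output can help when comparing guests. This is a minimal sketch: the function name `count_qxl_failures` and the dictionary keys are made up for illustration, and the matched fragments are copied from the log excerpts quoted in this report.

```python
from collections import Counter

# Message fragments copied from the logs quoted in this report.
SIGNATURES = {
    "eviction_failed": "[TTM] Buffer eviction failed",
    "object_init_failed": "object_init failed for",
    "vram_bo_alloc_failed": "failed to allocate VRAM BO",
    "gem_alloc_failed": "Failed to allocate GEM object",
    "release_wait_failed": "failed to wait on release",
}


def count_qxl_failures(dmesg_text):
    """Count how many lines of dmesg_text contain each known failure fragment."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        for key, fragment in SIGNATURES.items():
            if fragment in line:
                counts[key] += 1
    return counts
```

Feeding it the output of `dmesg` captured after the fbcon test loop gives one count per message type, so a guest that logs allocation failures with zero eviction failures stands out immediately.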