Description of problem:
Mesa 22.2 introduces a build option that allows building with or without support for some patent encumbered codecs such as h264 and h265 for encoding and decoding, and VC-1 for decoding. By default, none of these codecs are enabled, and this is the case with the most recent Mesa 22.2 RC that is currently in Fedora 37. Prior to this change, support for these codecs was built by default, which is the case with Mesa 22.1.x in Fedora 36. Building without support for these codecs has implications for hardware accelerated encode/decode support via VA-API.

Version-Release number of selected component (if applicable):
mesa-22.2.0~rc3-1.fc37

How reproducible:
Always

Steps to Reproduce:
1. Ensure libva and libva-utils are installed on Fedora 37 with the most recent version of mesa
2. Run vainfo in terminal of choice
3. Notice lack of support for the aforementioned codecs (h264, h265, VC-1)

Actual results:
libva info: VA-API version 1.15.0
libva info: Trying to open /usr/lib64/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_15
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.15 (libva 2.15.0)
vainfo: Driver version: Mesa Gallium driver 22.2.0-rc3 for RENOIR (renoir, LLVM 14.0.5, DRM 3.47, 5.19.6-300.fc37.x86_64)
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileVP9Profile2 : VAEntrypointVLD
VAProfileNone : VAEntrypointVideoProc

Expected results:
libva info: VA-API version 1.15.0
libva info: Trying to open /usr/lib64/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_15
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.15 (libva 2.15.0)
vainfo: Driver version: Mesa Gallium driver 22.2.0-rc3 for RENOIR (renoir, LLVM 14.0.5, DRM 3.47, 5.19.6-300.fc37.x86_64)
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointEncSlice
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileVP9Profile2 : VAEntrypointVLD
VAProfileNone : VAEntrypointVideoProc
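For anyone who wants to check a machine for this regression without comparing the vainfo listings by eye, here is a minimal Python sketch. It assumes vainfo output in the same "VAProfile... : VAEntrypoint..." layout shown in this report; the function names and the prefix list are made up for illustration and are not part of libva or Mesa.

```python
import re

# Profile-name prefixes for the codecs this report is about (illustrative list).
ENCUMBERED_PREFIXES = ("VAProfileH264", "VAProfileHEVC", "VAProfileVC1")

# Matches lines such as "VAProfileH264Main : VAEntrypointVLD", with or
# without whitespace before the colon.
PROFILE_LINE = re.compile(r"^\s*(VAProfile\w+)\s*:\s*(VAEntrypoint\w+)\s*$")


def parse_profiles(vainfo_output):
    """Return a dict mapping each profile name to its set of entrypoints."""
    profiles = {}
    for line in vainfo_output.splitlines():
        match = PROFILE_LINE.match(line)
        if match:
            profiles.setdefault(match.group(1), set()).add(match.group(2))
    return profiles


def missing_codec_families(vainfo_output):
    """Return the prefixes from ENCUMBERED_PREFIXES that no profile matches."""
    names = parse_profiles(vainfo_output)
    return [prefix for prefix in ENCUMBERED_PREFIXES
            if not any(name.startswith(prefix) for name in names)]
```

Passing the text captured from a vainfo run to `missing_codec_families` returns an empty list when all three codec families are advertised, and the missing prefixes otherwise.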
FEDORA-2022-7aafc1efd1 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2022-7aafc1efd1
FEDORA-2022-d6edd4beb0 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2022-d6edd4beb0
FEDORA-2022-7aafc1efd1 has been pushed to the Fedora 38 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2022-d6edd4beb0 has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-d6edd4beb0` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-d6edd4beb0 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2022-d6edd4beb0 has been pushed to the Fedora 37 stable repository. If problem still persists, please make note of it in this bug report.
This was disabled again in https://src.fedoraproject.org/rpms/mesa/c/94ef544b3f2125912dfbff4c6ef373fe49806b52?branch=rawhide
*** Bug 2130286 has been marked as a duplicate of this bug. ***
Question for mesa maintainers. To make it easier to ship a 3rd party build of mesa that contains only the missing codecs (vaapi backends), I think it would be wise to remove the existing mesa backend that has only the MPEG-2 (mp2) codec enabled. Thanks in advance for your understanding.
Or another way would be to put the vaapi-backend into a separate sub-package (so we can later swap with another sub-package without having to build the whole mesa package).
(In reply to Nicolas Chauvet (kwizart) from comment #9)
> Or another way would be to put the vaapi-backend into a separate sub-package
> (so we can later swap with another sub-package without having to build the
> whole mesa package).

Done in https://src.fedoraproject.org/rpms/mesa/c/07e1e0b1628d9c55d3858c4655409768c5c0b5de?branch=f37
FEDORA-2022-1a1059f24e has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2022-1a1059f24e
FEDORA-2022-1a1059f24e has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-1a1059f24e` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-1a1059f24e See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
The now-obsolete update broke both AMD hardware and Nvidia hardware running the nouveau driver, because the vaapi drivers that were split out of the mesa package were not installed. The suggested fix is to add the missing dependency.
Proposed as a Blocker for 37-final by Fedora user pwalter using the blocker tracking app because: https://bodhi.fedoraproject.org/updates/FEDORA-2022-494754fe0f broke mesa for radeon and nouveau users due to being pushed without matching libva build. This needs fixing one way or another before release.
Luya, Pete, can you please clarify what exactly is currently broken? Does it mean that desktop now doesn't start at all for AMD and nouveau? Or attempt to use VAAPI now crashes? Please clarify, thank you.
After updating to the latest mesa my system failed to start (it got stuck on switching to GDM) because it was missing mesa-va-drivers which didn't get installed automatically. After I installed it the system booted just fine. I'm using AMD RX570.
@walter.pete @kwizart What happened in https://bodhi.fedoraproject.org/updates/FEDORA-2022-494754fe0f was very unfortunate. F37 now seems broken for all Radeon/Nouveau users, if I understand it correctly. Because the libva update was pulled back, nothing now pulls in mesa-va-drivers, and it seems that having libva installed (that's the default) while mesa-va-drivers is missing results in GDM not starting (see comment 16). This is very bad. If I described the situation correctly, we need an IMMEDIATE fix (sorry for the caps, but I really want to highlight this). With each passing hour, more users get affected. Either we need to revert the mesa-va-drivers split, or we need libva to *require* (not recommend, because it doesn't work without it, and some people don't install recommended packages) mesa-va-drivers, or we need some other way to ensure mesa-va-drivers is installed for everyone by default. Can we please resolve this in the fastest way possible, and only then discuss the best approach to do this in an rpmfusion-friendly way? Thanks!
I can only do that in a coordinated manner, at its "own pace"! See also https://bugzilla.rpmfusion.org/show_bug.cgi?id=6426#c15 (and later comments)
Created attachment 1916287 [details] system journal when gdm fails to start

I worked with Jiri to retrieve system journal for the failed boot when gdm doesn't start (comment 16). We've had troubles (might be a systemd bug related to timezones), but the log is now attached. Unfortunately I don't see an exact cause of the failure, only this:

říj 05 18:47:52 fedora-workstation gnome-session[1244]: gnome-session-binary[1244]: WARNING: Application 'org.gnome.Shell.desktop' failed to register before timeout
říj 05 18:47:52 fedora-workstation gnome-session-binary[1244]: WARNING: Application 'org.gnome.Shell.desktop' failed to register before timeout
říj 05 18:47:52 fedora-workstation gnome-session-binary[1244]: Unrecoverable failure in required component org.gnome.Shell.desktop

However, installing mesa-va-drivers (no other package updated) immediately fixed the problem.
I cannot reproduce this. I just installed an f37 system with gnome where libva is installed and mesa-va-drivers isn't, and it works fine on my AMD FIJI GPU. Is there any sign of a gnome-shell core file or anything with a backtrace?
I can boot fine on a P14s G2 AMD laptop running Silverblue 37 without mesa-va-drivers (Ryzen 7 PRO 5850U and RENOIR GPU)
OK, I tested booting Fedora-Workstation-Live-x86_64-37-20221005.n.0.iso (which doesn't have mesa-va-drivers present) on my PC with Radeon 580, and it boots fine. I also tested moving away radeonsi_drv_video.so on F36 which is currently installed there, and it also booted OK (and only vainfo complained, no other problem). So this issue is not universal, which is great news, thanks for verification. At the same time, I don't believe that Jiri simply had a random error which conveniently appeared at the same time we're shuffling VA drivers around, and which was magically fixed right after installing mesa-va-drivers. There has to be some connection to it. I talked to Jiri, he tried rebooting several times, with different kernels, and with the splash screen disabled. Nothing helped, until I advised him to boot into runlevel 3 and install that mesa subpackage. Unfortunately, there's nothing in his ABRT or coredumpctl, and the logs don't seem to be helpful. We'll try to dig more into it on his computer. But I'm afraid we have some corner case here which can affect certain users in some unknown cases.
Created attachment 1916479 [details] journal logs
Created attachment 1916480 [details] list of packages
I removed the package and the OS failed to boot into the login screen again. It's an upgraded installation from 2019. When I tried to boot the live image of F37 it booted correctly. So it's something that is in my installation and not on the default F37 that triggers the problem. I'm attaching the journal logs from the boot and the list of packages installed. I haven't found any other clues. gnome-shell is among the running processes, so it doesn't crash, which explains why there is no coredump. When the boot gets stuck, there is only the boot splash screen and no input events (shortcuts etc.) make any change. When I boot with the splash screen off, it switches from the screen with boot messages to a blank screen with a cursor in the upper left corner and gets stuck; again, no shortcuts work.
Hi, I got the same error as Mr Páral mentioned after updating today. I got an AMD RX 6600 (non XT). Installing mesa-va-drivers seemingly fixed GDM.
any gnome shell extensions installed on those systems? something that might be using video enc or dec?
The only enabled gnome shell extension on my system is the default Background Logo.
Hello Mr. Airlie Installed are: trayIconsReloaded apps-menu.github.com background-logo launch-new-instance.github.com places-menu.github.com window-list.github.com gamemode.me appindicatorsupport.com Enabled on my user account are: trayIconsReloaded Here also a list of all installed packages on my system (sudo dnf list --installed): https://paste.centos.org/view/a2215cfc
So...comparing Jiri's logs to mine from a working system, it seems like this is where things get kinda stuck:

říj 06 11:36:22 fedora-workstation gnome-shell[1253]: Using Wayland display name 'wayland-0'

that's the last message from gnome-shell on his system. On mine, it prints a lot of stuff after that:

Oct 10 07:52:19 t16.happyassassin.net gnome-shell[1231]: Using Wayland display name 'wayland-0'
Oct 10 07:52:20 t16.happyassassin.net gnome-shell[1231]: JS WARNING: [resource:///org/gnome/gjs/modules/core/overrides/Gio.js 287]: Too many arguments to method Gio.AsyncInitable.init_async: expected 3, got 4
Oct 10 07:52:20 t16.happyassassin.net gnome-shell[1231]: JS WARNING: [resource:///org/gnome/gjs/modules/core/overrides/Gio.js 287]: Too many arguments to method Gio.AsyncInitable.init_async: expected 3, got 4
Oct 10 07:52:20 t16.happyassassin.net gnome-shell[1231]: Unset XDG_SESSION_ID, getCurrentSessionProxy() called outside a user session. Asking logind directly.
Oct 10 07:52:20 t16.happyassassin.net gnome-shell[1231]: Will monitor session c1

and so on. Looking at where we are when that message gets printed, we're at the end of `meta_wayland_compositor_new` in mutter src/wayland/meta-wayland.c ; after printing that message it sets a few environment vars and returns. It looks like it's called from `meta_context_start` in src/core/meta-context.c, and the next thing that does is:

    priv->display = meta_display_new (context, error);
    if (!priv->display)
      {
        priv->state = META_CONTEXT_STATE_TERMINATED;
        return FALSE;
      }

    priv->main_loop = g_main_loop_new (NULL, FALSE);

    priv->state = META_CONTEXT_STATE_STARTED;

and then return. So I'm gonna guess it's getting stuck there, somehow.
I say Shell gets "stuck" because it doesn't seem to proceed any further, doesn't log anything else, and eventually gnome-session gets tired of waiting for it: říj 06 11:37:51 fedora-workstation gnome-session[1245]: gnome-session-binary[1245]: WARNING: Application 'org.gnome.Shell.desktop' failed to register before timeout but Jiri says it doesn't *crash*, the process is still there. So, it seems like it's somehow stuck.
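The comparison above (find the last thing gnome-shell logged on the hung system versus a working one) can be sketched in a few lines of Python. This is only an illustration, assuming the short syslog-style layout that `journalctl` prints by default, as seen in the excerpts in this bug; the function name is made up.

```python
import re

# "<month> <day> <time> <host> <ident>[<pid>]: <message>"
# The month is matched as any non-space token so localized names work too.
JOURNAL_LINE = re.compile(
    r"^(\S+ +\d+ [\d:]+) (\S+) ([^\s\[]+)\[(\d+)\]: (.*)$")


def last_message(journal_text, ident):
    """Return (timestamp, message) of the last line logged by ident, or None."""
    last = None
    for line in journal_text.splitlines():
        match = JOURNAL_LINE.match(line)
        if match and match.group(3) == ident:
            last = (match.group(1), match.group(5))
    return last
```

On the hung boot this would report "Using Wayland display name 'wayland-0'" as the final gnome-shell message, while a working boot keeps logging after it.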
Jiri: could you possibly get gdm to run with MUTTER_VERBOSE=1 and MUTTER_DEBUG=1 set? It might give us more idea exactly where things are going wrong...
Created attachment 1917099 [details] backtrace of gnome-shell
I've submitted the backtrace of gnome-shell I got from gdb. I've also tried to set SELinux to permissive, didn't help, also tried to switch to runlevel 3 and back to 5, didn't help either.
(To be clear, the above is not a backtrace of a crash, but a backtrace of a running gnome-shell process (under gdm) which seems to be stuck and waiting for something).
#0 0x00007f41ee922e26 in ppoll () at /lib64/libc.so.6
#1 0x00007f41c7ee5bff in gst_poll_wait () at /lib64/libgstreamer-1.0.so.0
#2 0x00007f41c7eeba1e in exchange_packets () at /lib64/libgstreamer-1.0.so.0
#3 0x00007f41c7eecfbf in plugin_loader_free.lto_priv () at /lib64/libgstreamer-1.0.so.0
#4 0x00007f41c7ef8353 in gst_update_registry () at /lib64/libgstreamer-1.0.so.0
#5 0x00007f41c7e8a48a in init_post () at /lib64/libgstreamer-1.0.so.0
#6 0x00007f41ef947af1 in g_option_context_parse () at /lib64/libglib-2.0.so.0
#7 0x00007f41c7e829ff in gst_init_check () at /lib64/libgstreamer-1.0.so.0

Seems to be trying to rebuild the gdm gstreamer registry and it's getting stuck waiting for the rebuild to finish?
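To pull the call chain out of a gdb backtrace like the one above, a small Python sketch along these lines could be used. It assumes gdb's usual "#N [0xADDR in] function (" frame layout and is not tied to any particular tool; the function name is hypothetical.

```python
import re

# Matches gdb frames such as
#   "#4 0x00007f41c7ef8353 in gst_update_registry () at /lib64/..."
# and frames printed without an address, such as "#2 read_block (len=8, ...)".
FRAME = re.compile(
    r"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([A-Za-z_][\w.]*)\s*\(")


def frame_functions(backtrace_text):
    """Return the function names in the order the frames appear."""
    return [name for _number, name in FRAME.findall(backtrace_text)]
```

Checking whether `"gst_update_registry"` is in the returned list is a quick way to tell that a stuck process is waiting on the GStreamer registry rebuild.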
Discussed at the 2022-10-10 blocker review meeting: https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2022-10-10/f37-blocker-review.2022-10-10-16.01.html . It was agreed to accept this as a blocker as a conditional violation of the Basic "graphical systems must boot to the login screen" requirement, at least while we continue to investigate it - we expect to revisit this at Thursday's go/no-go.
jiri can you post a backtrace of the hung gst-plugin-scanner process too?
just reading through the gstreamer1-vaapi code I see this bit:

    /* If no neighboor, or application not interested, use system default */
    if (plugin->gl_context) {
      display = gst_vaapi_create_display_from_gl_context (plugin->gl_context);
      /* Cannot instantiate VA display based on GL context. Reset the
       * requested display type to ANY to try again */
      if (!display)
        gst_vaapi_plugin_base_set_display_type (plugin,
            GST_VAAPI_DISPLAY_TYPE_ANY);
    }

I'm just completely guessing here (not knowing this part of the stack at all), but it seems at least conceivable to me that the gst_vaapi_create_display_from_gl_context (...) fails if mesa-va-drivers isn't installed, and that retrying with GST_VAAPI_DISPLAY_TYPE_ANY makes it try to talk to the wayland socket that gnome-shell isn't managing yet because it's stuck waiting on this process to finish. anyway, that's one theory...
jiri sent me the backtrace i asked for in comment 37 through irc. It shows that gst-plugin-scanner is trying to connect to Xwayland. the full trace is here: Thread 1 (Thread 0x7f9086d42740 (LWP 1281) "gst-plugin-scan"): #0 0x00007f9087229cf4 in __GI___poll (fds=fds@entry=0x7ffe9a958a10, nfds=nfds@entry=1, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/poll.c:29 sc_ret = -516 #1 0x00007f908657f583 in poll (__timeout=-1, __nfds=1, __fds=0x7ffe9a958a10) at /usr/include/bits/poll2.h:39 pfd = {fd = 7, events = 1, revents = 0} ret = <optimized out> done = 0 ret = <optimized out> done = 0 #2 read_block (len=8, buf=0x55cc604519d0, fd=7) at /usr/src/debug/libxcb-1.13.1-10.fc37.x86_64/src/xcb_in.c:388 pfd = {fd = 7, events = 1, revents = 0} ret = <optimized out> done = 0 ret = <optimized out> done = 0 #3 _xcb_in_read_block (c=c@entry=0x55cc60453330, buf=0x55cc604519d0, len=len@entry=8) at /usr/src/debug/libxcb-1.13.1-10.fc37.x86_64/src/xcb_in.c:1075 ret = <optimized out> done = 0 #4 0x00007f9086582bc2 in read_setup (c=0x55cc60453330) at /usr/src/debug/libxcb-1.13.1-10.fc37.x86_64/src/xcb_conn.c:177 c = 0x55cc60453330 #5 xcb_connect_to_fd (fd=fd@entry=7, auth_info=auth_info@entry=0x7ffe9a958b50) at /usr/src/debug/libxcb-1.13.1-10.fc37.x86_64/src/xcb_conn.c:359 c = 0x55cc60453330 #6 0x00007f90865834d6 in xcb_connect_to_display_with_auth_info (displayname=displayname@entry=0x0, auth=auth@entry=0x0, screenp=screenp@entry=0x0) at /usr/src/debug/libxcb-1.13.1-10.fc37.x86_64/src/xcb_util.c:532 fd = <optimized out> display = 1024 host = 0x55cc60451b00 "" protocol = 0x0 ourauth = {namelen = 18, name = 0x55cc60451a60 "MIT-MAGIC-COOKIE-1", datalen = 16, data = 0x55cc60451a40 "\005\346wnj\246@.\370\246\274\352\004\245\375*"} c = <optimized out> parsed = <optimized out> #7 0x00007f908658357e in xcb_connect (displayname=displayname@entry=0x0, screenp=screenp@entry=0x0) at /usr/src/debug/libxcb-1.13.1-10.fc37.x86_64/src/xcb_util.c:489 #8 0x00007f908682115a in _XConnectXCB 
(dpy=0x55cc604520c0, display=0x0, screenp=0x7ffe9a958d6c) at /usr/src/debug/libX11-1.8.1-2.fc37.x86_64/src/xcb_disp.c:78 host = 0x55cc60451b00 "" n = 1024 c = <optimized out> #9 0x00007f9086810f87 in XOpenDisplay (display=display@entry=0x0) at /usr/src/debug/libX11-1.8.1-2.fc37.x86_64/src/OpenDis.c:129 dpy = 0x55cc604520c0 i = <optimized out> j = <optimized out> k = <optimized out> display_name = 0x7ffe9a95af76 ":1024" setup = 0x0 iscreen = 0 vendorlen = <optimized out> u = {setup = <optimized out>, failure = <optimized out>, vendor = <optimized out>, sf = <optimized out>, rp = <optimized out>, dp = <optimized out>, vp = <optimized out>} setuplength = <optimized out> usedbytes = 0 mask = <optimized out> conn_buf_size = <optimized out> xlib_buffer_size = <optimized out> #10 0x00007f908696a694 in drm_auth_x11_init (auth=0x7ffe9a958e80) at drm/va_drm_auth_x11.c:114 vtable = 0x7ffe9a958e88 libva_x11_name = "libva-x11.so.2\000\300" ret = 14 auth = {handle = 0x55cc60440040, vtable = {x11_open_display = 0x7f9086810eb0 <XOpenDisplay>, x11_close_display = 0x7f9086802e00 <XCloseDisplay>, va_dri2_query_extension = 0x7f9086923d70 <VA_DRI2QueryExtension>, va_dri2_query_version = 0x7f9086923db0 <VA_DRI2QueryVersion>, va_dri2_authenticate = 0x7f90869240b0 <VA_DRI2Authenticate>}, display = 0x0, window = 0} success = false ctx = <optimized out> drm_state = 0x55cc604513d0 magic = 1 ret = <optimized out> #11 va_drm_authenticate_x11 (fd=6, magic=1) at drm/va_drm_auth_x11.c:163 auth = {handle = 0x55cc60440040, vtable = {x11_open_display = 0x7f9086810eb0 <XOpenDisplay>, x11_close_display = 0x7f9086802e00 <XCloseDisplay>, va_dri2_query_extension = 0x7f9086923d70 <VA_DRI2QueryExtension>, va_dri2_query_version = 0x7f9086923db0 <VA_DRI2QueryVersion>, va_dri2_authenticate = 0x7f90869240b0 <VA_DRI2Authenticate>}, display = 0x0, window = 0} success = false ctx = <optimized out> drm_state = 0x55cc604513d0 magic = 1 ret = <optimized out> #12 va_drm_authenticate (magic=1, fd=6) at 
drm/va_drm_auth.c:37 ctx = <optimized out> drm_state = 0x55cc604513d0 magic = 1 ret = <optimized out> #13 va_DisplayContextGetNumCandidates (pDisplayContext=<optimized out>, num_candidates=<optimized out>) at drm/va_drm.c:73 ctx = <optimized out> drm_state = 0x55cc604513d0 magic = 1 ret = <optimized out> #14 0x00007f9086aaf1af in va_getDriverNumCandidates (num_candidates=0x7ffe9a958f4c, dpy=0x55cc60451400) at /usr/src/debug/libva-2.15.0-2.fc37.x86_64/va/va.c:357 pDisplayContext = 0x55cc60451400 driver_name_env = 0x0 vaStatus = 0 ctx = 0x55cc60451570 driver_name = 0x0 num_candidates = 1 candidate_index = 0 vaStatus = <optimized out> __func__ = "vaInitialize" #15 vaInitialize (dpy=dpy@entry=0x55cc60451400, major_version=major_version@entry=0x7ffe9a958fa4, minor_version=minor_version@entry=0x7ffe9a958fa0) at /usr/src/debug/libva-2.15.0-2.fc37.x86_64/va/va.c:730 driver_name = 0x0 num_candidates = 1 candidate_index = 0 vaStatus = <optimized out> __func__ = "vaInitialize" #16 0x00007f9086ccec6c in vaapi_initialize (dpy=0x55cc60451400) at ../gst-libs/gst/vaapi/gstvaapiutils.c:113 major_version = 21964 minor_version = 1615139840 status = <optimized out> __func__ = "vaapi_initialize" #17 0x00007f9086cecb34 in supports_vaapi (fd=6) at ../gst-libs/gst/vaapi/gstvaapidisplay_drm.c:77 ret = <optimized out> va_dpy = 0x55cc60451400 parent = <optimized out> priv = 0x55cc6044c120 devpath = 0x55cc60451090 "/dev/dri/card0" e = 0x55cc6044d000 fd = 6 syspath = <optimized out> udev = 0x55cc6044ed80 device = 0x55cc6044e300 l = 0x55cc60410ca0 i = <optimized out> priv = 0x55cc6044c120 priv = 0x55cc6044c120 #18 get_default_device_path (display=0x55cc6044c1d0) at ../gst-libs/gst/vaapi/gstvaapidisplay_drm.c:140 parent = <optimized out> priv = 0x55cc6044c120 devpath = 0x55cc60451090 "/dev/dri/card0" e = 0x55cc6044d000 fd = 6 syspath = <optimized out> udev = 0x55cc6044ed80 device = 0x55cc6044e300 l = 0x55cc60410ca0 i = <optimized out> priv = 0x55cc6044c120 priv = 0x55cc6044c120 #19 
set_device_path (device_path=<optimized out>, display=0x55cc6044c1d0) at ../gst-libs/gst/vaapi/gstvaapidisplay_drm.c:181 priv = 0x55cc6044c120 priv = 0x55cc6044c120 #20 gst_vaapi_display_drm_open_display (display=0x55cc6044c1d0, name=<optimized out>) at ../gst-libs/gst/vaapi/gstvaapidisplay_drm.c:247 priv = 0x55cc6044c120 #21 0x00007f9086cc3eb8 in gst_vaapi_display_create (data=0x0, init_type=GST_VAAPI_DISPLAY_INIT_FROM_DISPLAY_NAME, display=0x55cc6044c1d0) at ../gst-libs/gst/vaapi/gstvaapidisplay.c:958 info = {display = 0x55cc6044c1d0, display_name = 0x0, va_display = 0x0, native_display = 0x0} priv = 0x55cc6044c140 klass = 0x55cc6044bbe0 __func__ = "gst_vaapi_display_config" #22 gst_vaapi_display_config (display=0x55cc6044c1d0, init_type=GST_VAAPI_DISPLAY_INIT_FROM_DISPLAY_NAME, init_value=0x0) at ../gst-libs/gst/vaapi/gstvaapidisplay.c:1265 __func__ = "gst_vaapi_display_config" #23 0x00007f9086cf12c5 in gst_vaapi_display_drm_new (device_path=0x0) at ../gst-libs/gst/vaapi/gstvaapidisplay_drm.c:367 display = <optimized out> types = {2, 1, 2261812962} i = 1 num_types = <optimized out> device_paths = {0x0, 0x0, 0x7f908754e7a0 <known_licenses>} #24 0x00007f9086c892cb in gst_vaapi_create_test_display () at ../gst/vaapi/gstvaapipluginutil.c:929 i = 0 display = 0x0 display = <optimized out> decoders = <optimized out> rank = <optimized out> __func__ = "plugin_init" #25 plugin_init (plugin=0x55cc60439190) at ../gst/vaapi/gstvaapi.c:191 display = <optimized out> decoders = <optimized out> rank = <optimized out> __func__ = "plugin_init" #26 0x00007f90874dd512 in gst_plugin_register_func (plugin=plugin@entry=0x55cc60439190, desc=desc@entry=0x7f9086d3c000 <gst_plugin_desc>, user_data=user_data@entry=0x0) at ../gst/gstplugin.c:532 __func__ = "gst_plugin_register_func" #27 0x00007f90874e50f7 in _priv_gst_plugin_load_file_for_registry (filename=filename@entry=0x55cc60424ffc "/usr/lib64/gstreamer-1.0/libgstvaapi.so", registry=<optimized out>, registry@entry=0x0, 
error=error@entry=0x0) at ../gst/gstplugin.c:971 desc = 0x7f9086d3c000 <gst_plugin_desc> plugin = 0x55cc60439190 symname = <optimized out> module = 0x55cc60426e30 ret = <optimized out> ptr = 0x7f9086d3c000 <gst_plugin_desc> file_status = {st_dev = 2053, st_ino = 5244013, st_nlink = 1, st_mode = 33261, st_uid = 0, st_gid = 0, __pad0 = 0, st_rdev = 0, st_size = 887528, st_blksize = 4096, st_blocks = 1744, st_atim = {tv_sec = 1665343425, tv_nsec = 730456412}, st_mtim = {tv_sec = 1658410763, tv_nsec = 0}, st_ctim = {tv_sec = 1663168673, tv_nsec = 264324775}, __glibc_reserved = {0, 0, 0}} new_plugin = 1 flags = <optimized out> __func__ = "_priv_gst_plugin_load_file_for_registry" #28 0x00007f90874e590e in gst_plugin_load_file (filename=filename@entry=0x55cc60424ffc "/usr/lib64/gstreamer-1.0/libgstvaapi.so", error=error@entry=0x0) at ../gst/gstplugin.c:689 #29 0x00007f90874e60f6 in do_plugin_load (tag=0, filename=0x55cc60424ffc "/usr/lib64/gstreamer-1.0/libgstvaapi.so", l=0x55cc6041cc30) at ../gst/gstpluginloader.c:845 newplugin = <optimized out> chunks = 0x0 res = 1 magic = <optimized out> packet_len = <optimized out> to_read = <optimized out> tag = 0 in = <optimized out> res = <optimized out> res = <optimized out> __func__ = "exchange_packets" #30 handle_rx_packet (payload_len=<optimized out>, payload=0x55cc60424ffc "/usr/lib64/gstreamer-1.0/libgstvaapi.so", tag=0, pack_type=<optimized out>, l=0x55cc6041cc30) at ../gst/gstpluginloader.c:953 res = 1 magic = <optimized out> packet_len = <optimized out> to_read = <optimized out> tag = 0 in = <optimized out> res = <optimized out> res = <optimized out> __func__ = "exchange_packets" #31 read_one (l=0x55cc6041cc30) at ../gst/gstpluginloader.c:1123 magic = <optimized out> packet_len = <optimized out> to_read = <optimized out> tag = 0 in = <optimized out> res = <optimized out> res = <optimized out> __func__ = "exchange_packets" #32 exchange_packets (l=l@entry=0x55cc6041cc30) at ../gst/gstpluginloader.c:1151 res = <optimized out> 
__func__ = "exchange_packets" #33 0x00007f90874e7238 in _gst_plugin_loader_client_run () at ../gst/gstpluginloader.c:700 res = 1 l = 0x55cc6041cc30 __func__ = "_gst_plugin_loader_client_run" #34 0x000055cc5fb061e9 in main (argc=<optimized out>, argv=0x7ffe9a959668) at ../libs/gst/helpers/gst-plugin-scanner.c:67 res = 1 my_argv = 0x55cc60414830 my_argc = 1 Detaching from program: /usr/libexec/gstreamer-1.0/gst-plugin-scanner, process 1281
So what's happening I think is, libva wants to talk to X11 as part of the drm auth protocol for the va api. Incidentally, it's missing wayland support:

./va/drm/va_drm_auth.c: /* XXX: try to authenticate through Wayland, etc. */

but trying to talk to X11 means having to start Xwayland. mutter does that when it detects activity on the X11 socket. It won't detect that activity though if gnome-shell is in the middle of a synchronous call waiting for gst-plugin-scanner to finish. I don't know why the problem goes away when mesa-va-drivers is installed (maybe it's only doing this in a fallback path?), but since it does go away when mesa-va-drivers is installed, I think the easiest fix is to just add a:

Requires: mesa-va-drivers

to gstreamer1-vaapi and maybe a

Recommends: mesa-va-drivers

to whatever package it got split off from (mesa-dri-drivers ?)
I just want to add Jiri confirmed on IRC removing gstreamer1-vaapi makes the problem go away.
> I don't know why the problem goes away when mesa-va-drivers is installed

So just reading more through code, I think what's probably happening is in the working case it's using a drm render node for va which doesn't require the drm auth stuff, but if mesa-va-drivers isn't installed, trying to use the render node probably fails, so it futilely falls back to using the card device and the legacy X11 authentication bits. This hangs because, as mentioned before, it's waiting for Xwayland to start, which won't happen if gnome-shell is blocked in a sync call.
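The circular wait described here can be illustrated with a toy Python model. It is purely a sketch of the theory in this comment, not code from mutter, gstreamer or libva: two `threading.Event` objects stand in for "Xwayland has started" and "the plugin scan has finished", and the `give_up_after` parameter is a made-up stand-in for gnome-session's registration timeout.

```python
import threading


def simulate_startup(render_node_works, give_up_after=0.1):
    """Toy model of the startup sequence; returns "started" or "hung"."""
    xwayland_ready = threading.Event()
    scan_finished = threading.Event()

    def plugin_scanner():
        if not render_node_works:
            # Fallback path: card node plus legacy X11 DRM auth, which
            # needs an X server, so the scanner waits for Xwayland.
            xwayland_ready.wait()
        scan_finished.set()

    scanner = threading.Thread(target=plugin_scanner, daemon=True)
    scanner.start()

    # The shell blocks here in a synchronous call. Its main loop, which is
    # what would start Xwayland on demand, does not run in the meantime.
    started = scan_finished.wait(give_up_after)

    # Clean-up for the simulation only: release the scanner thread.
    xwayland_ready.set()
    scanner.join()
    return "started" if started else "hung"
```

With `render_node_works=True` the scanner never needs X11 and the model returns "started"; with `False` each side waits on the other until the timeout, which matches the observed "failed to register before timeout" message.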
I'm doing gstreamer1-vaapi and mesa builds now with my suggestion from comment 40 but the builders seem hung up on s390x (might be related to an outage in the westford lab over the weekend, not sure). I don't know if it'll eventually complete or i'll have to retry the builds later.
Thanks for investigating, Ray! We should probably make sure this is in line with the wider issue here, though. If I'm understanding correctly, the point of splitting the drivers out in the first place is to allow that package to be potentially replaced with a different third-party package that might contain things Fedora cannot. Do we know if adding the requirement or recommend will cause any problems with that? At least, I think we should potentially just do the requires in gstreamer1-vaapi as a minimal fix here and maybe not bundle it with the recommends in mesa, at least until we're sure it's OK in the wider context. The s390 issue is a general one, there's an outage at the data center where the s390 builders live, AIUI.
I actually took a peek at the alternative mesa package here https://www.thefinalzone.net/packages/mesa-freeworld.spec before initiating the builds. note it has:

%package -n %{srcname}-va-drivers-freeworld
...
Provides: %{srcname}-va-drivers = %{?epoch:%{epoch}:}%{version}-%{release}

So the Recommends should cause no issues. I also think the Recommends is a good idea in general because it keeps dependencies close to as they were before the split, so it's the more surgical change.
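Why the `Provides:` line makes the dependency harmless for the third-party package can be shown with a deliberately simplified Python model. This is an assumption-laden toy: it ignores versions, epochs and everything else RPM and DNF actually evaluate, and the package table below is invented for the example.

```python
def providers(requirement, packages):
    """Return the names of packages that satisfy a requirement.

    packages maps a package name to the set of extra names it Provides.
    As in RPM, a package always provides its own name.
    """
    return sorted(name for name, provides in packages.items()
                  if requirement == name or requirement in provides)


# Hypothetical package set modelled on the names used in this bug.
EXAMPLE_PACKAGES = {
    "mesa-va-drivers": set(),
    "mesa-va-drivers-freeworld": {"mesa-va-drivers"},
    "mesa-dri-drivers": set(),
}
```

In this model, `providers("mesa-va-drivers", EXAMPLE_PACKAGES)` lists both the Fedora package and the freeworld one, so a `Requires:` or `Recommends:` on `mesa-va-drivers` is satisfied by either.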
I think the potential issue is that it's harder to swap a package than install one where no package currently exists. If we recommend mesa-va-drivers from mesa-dri-drivers so that basically everyone gets the Fedora one installed on fresh install, it makes it somewhat harder to switch to a third party one. GUI apps don't have an equivalent of `dnf swap` (and even CLI users don't always know about `dnf swap`). Still it's less of an issue if it's just a Recommends, I guess, as apps shouldn't refuse to let you remove the Fedora one in that case.
I think we should have mesa-va-drivers installed by default (at least in a Workstation), so the Recommends in mesa-dri-drivers sounds reasonable to me. First, we used to have it installed by default, so this approach keeps it the same way. Second, mesa-va-drivers contains VP9 and possibly also AV1 (for the latest hardware) decoders, which is quite important these days (YouTube), and Firefox finally ships with hw accelerated decoding enabled by default. It would be sad to regress on that. Third, while inconvenient, the solution with "dnf swap" and similar when using alternative repos is good enough for the moment, I believe (GNOME Software doesn't show included packages out-of-the-box anyway, you need to install AppStream metadata first, which requires using a cmdline). A better approach than overwriting .so files will be needed anyway to make the maintenance easier, and there are some suggestions proposed already [1]. Overall, the proposed approach in comment 40 seems to me to be the best we can do at the moment. In the long run, IIUIC, we should have a look whether gnome-shell can initialize gstreamer later (once it's fully started up), or whether gstreamer should have some checks for being used while the lower stack is not fully started yet, in order to avoid this deadlock. Should I file some upstream issues for that? [1] https://github.com/intel/libva/issues/639
Sure, that's a good argument. I'll go with that. Marking this as MODIFIED since builds are there, just waiting for the s390x outage to be resolved.
I don't think we should change gnome-shell. I don't think doing gst_init() later is really a great option. Synchronous calls at startup are generally fine; synchronous calls in the middle of when a machine is getting used are bad juju. Also, a delay doesn't buy us much. If the user doesn't install mesa-va-drivers, or they don't start another gstreamer app to get the registry built, just delaying could lead to a session lockup when we finally get around to doing it. We could pass `['--gst-disable-registry-update']` instead of null for the args to gst_init(), but then that might break screen recording for fresh accounts until they start a gstreamer app to rebuild the registry, so that's not a great option either. We could run it *earlier*. If there's no DISPLAY set yet then it's not going to try to talk to the X server, so doing it at the top of main() might work. At first blush, that sort of strikes me as a workaround, though. It might be a pragmatic workaround, but I personally don't see the need if we can avoid it. I think maybe gstreamer1-vaapi could, e.g., avoid falling back to trying /dev/dri/card0 (and legacy X DRM auth stuff) if /dev/dri/renderD128 already failed and it's associated with the same PCI device as /dev/dri/card0. That would allow it to fail more gracefully when run from gnome-shell if mesa-va-drivers aren't installed, I think. But it's kind of a niche case to handle, and also not obviously a better solution to me than the "make the gstreamer plugin that needs va drivers require va drivers" solution we've come up with. So I don't think filing a gnome-shell issue upstream is right. You could file an upstream gstreamer bug if you want, but I don't think it's strictly necessary, either. I really think deps are the best way forward here, personally. Of course I kind of got pulled into this from the side... Jiri pinged me for help. If those closer to mesa (pwalter, airlied, etc.) don't like the mesa deps, and if those closer to gstreamer (e.g. wtay) would rather have a dep-less solution too, I'm not trying to step on toes or anything... we can come up with a code solution if we need to.
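For anyone who wants to see what "associated with the same PCI device" means in practice for the renderD128/card0 idea above, here is a rough shell sketch. It is only an illustration, not how gstreamer1-vaapi does or would do it (a real fix would live in the plugin's C code). The helper name `same_drm_device` and the `SYSFS_DRM` override are made up for this example; the only assumption about the system is that each DRM node under /sys/class/drm has a `device` symlink pointing at its parent device.

```shell
#!/bin/sh
# Hypothetical helper (not part of gstreamer or libva): succeeds when two
# DRM nodes resolve to the same parent device in sysfs.
# SYSFS_DRM is an invented override so the lookup can be pointed elsewhere.
same_drm_device() {
    sysfs="${SYSFS_DRM:-/sys/class/drm}"
    # Return 2 when either node has no "device" link, so callers can tell
    # "different device" (1) apart from "could not check" (2).
    [ -e "$sysfs/$1/device" ] && [ -e "$sysfs/$2/device" ] || return 2
    a=$(readlink -f "$sysfs/$1/device")
    b=$(readlink -f "$sysfs/$2/device")
    [ "$a" = "$b" ]
}
```

On a single-GPU machine, `same_drm_device renderD128 card0 && echo same` would be expected to print "same"; node numbering differs between systems, so treat the names as placeholders.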
FEDORA-2022-9ee52e6983 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2022-9ee52e6983
The builds finished and this is a blocker, so I've started the update process. On the off chance we do get objections, we can reroll if we need to, of course.
(In reply to Kamil Páral from comment #47)
> A better approach than overwriting .so files will be needed anyway to make the maintenance easier, and there are some suggestions proposed already [1]. Overall, the proposed approach in comment 40 seems to me to be the best we can do at the moment.
Here is also a possible solution proposed by RPM Fusion folks: https://bugzilla.rpmfusion.org/show_bug.cgi?id=6426#c36
FEDORA-2022-9ee52e6983 has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-9ee52e6983` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-9ee52e6983 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2022-9ee52e6983 has been pushed to the Fedora 37 stable repository. If the problem still persists, please make note of it in this bug report.
I have two systems that hang before gdm comes up if I install the gstreamer1-vaapi package(s). Both have most of the mesa packages installed, including the mesa-va-drivers package(s). This is happening with mesa-2.21.1-1 and gstreamer1-vaapi-1.20.3-3.
Brian, can you please ssh into the hung system and collect the system journal (`journalctl -b`) and possibly also a gdb backtrace from the running gnome-shell process (`gdb -p PID`, then `set logging enabled on` and `t a a bt full`), and attach the output? I assume yours might be a different problem, but we can't tell without any logs. Also attach `rpm -qa | sort`, thanks.
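In case it helps, the three things requested above can be gathered in one go with something like the sketch below. The function name `collect_hang_logs` and the output file names are invented for this example. It uses gdb's batch mode and redirects the output instead of `set logging enabled on`, and it assumes the hung process is named exactly `gnome-shell`. Attaching to a gnome-shell owned by another user (e.g. the gdm greeter) will most likely need root.

```shell
#!/bin/sh
# Sketch only: collect the journal, package list and a gnome-shell
# backtrace into one directory. Names below are illustrative.
collect_hang_logs() {
    out="${1:-hang-logs}"
    mkdir -p "$out" || return 1
    # Journal for the current boot.
    journalctl -b > "$out/journal.txt"
    # Sorted list of installed packages.
    rpm -qa | sort > "$out/rpm-qa.txt"
    # Oldest process named exactly gnome-shell (assumption: that is the hung one).
    pid=$(pgrep -o -x gnome-shell) || { echo "gnome-shell not running" >&2; return 1; }
    # Non-interactive equivalent of typing "t a a bt full" at the gdb prompt.
    gdb -p "$pid" -batch -ex "thread apply all bt full" > "$out/gdb-bt.txt" 2>&1
}
```

Running `collect_hang_logs /tmp/hang-logs` over ssh would leave three text files in that directory, ready to attach.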
Are you sure this is the correct command for gdb? "t a a bt full" Not an expert but it seems wrong somehow. Will provide the info in the next day or so.
Created attachment 1918085 [details] journalctl -b output As requested
Created attachment 1918086 [details] rpm -qa | sort output As requested
I also found and installed all the gstreamer1-*-1.20.4-1 rpms, described as a bugfix release, before collecting the attachments.
`t a a bt full` is correct; it's short for `thread apply all bt full`, which means 'get a full backtrace for all threads'.
Created attachment 1918159 [details] Hung gnome-shell gdb output This is the `t a a bt full` output from gdb after gnome-shell hangs with the gstreamer1-vaapi rpms installed (both i686 and x86_64). I have no way of knowing whether this is adequate, but I did notice a whole lot of debuginfo rpms were listed as missing, so if necessary I can install some to get more symbol output.
Yes, that would likely help. Did gdb offer to turn on 'debuginfod'? If so, do that; it will automatically download the required symbols (note the downloads may be quite large, so don't do this on a metered connection).
So just to follow up, in comment 49 I advocated against a gnome-shell fix, but one got filed upstream independently anyway: https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/5710
Brian, does installing the libva-intel-driver package from rpmfusion fix things for you?
Installed libva-intel-driver-2.4.1-9.fc37.x86_64 from rpmfusion on one of my machines, then installed gstreamer1-vaapi for both x86_64 and i686. After a reboot, gdm never gets to the point of showing a login prompt or user box to click on. Installed libva-intel-driver-2.4.1-9.fc37.i686 as well and reinstalled gstreamer1-vaapi for both x86_64 and i686. After another reboot, gdm still never gets to the point of showing a login prompt or user box to click on. Uninstalled gstreamer1-vaapi for both x86_64 and i686, leaving libva-intel-driver installed. After a reboot, gdm is back to normal with the user selection shown.
https://bugzilla.redhat.com/show_bug.cgi?id=2123998#c65 Do I need to try uninstalling the mesa-va-drivers packages?
After installing more updates as they went into updates-testing or pending->testing I can now install the gstreamer1-vaapi packages without seeing hangs during gdm startup. It's fixed, but I don't know exactly how.