Bug 1637561 - [abrt] xorg-x11-server-Xorg: Xorg server crashed [NEEDINFO]
Summary: [abrt] xorg-x11-server-Xorg: Xorg server crashed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: xorg-x11-drv-ati
Version: 7.6
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Adam Jackson
QA Contact: Desktop QE
URL:
Whiteboard: abrt_hash:c1a1da360712a337a69f3d4e63c...
Depends On:
Blocks:
Reported: 2018-10-09 13:35 UTC by Pavel Holica
Modified: 2019-02-06 16:34 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-06 10:58:32 UTC
tpelka: needinfo? (alanm)


Attachments
File: Xorg.0.log (44.41 KB, text/plain)
2018-10-09 13:35 UTC, Pavel Holica
xorg-report.tar.gz (9.25 MB, application/x-gzip)
2018-10-09 13:41 UTC, Pavel Holica
debug-dmesg (442.22 KB, text/plain)
2018-10-09 13:41 UTC, Pavel Holica
debug-journalctl (962.89 KB, text/x-vhdl)
2018-10-09 13:42 UTC, Pavel Holica
lspci (2.28 KB, text/plain)
2018-10-09 13:42 UTC, Pavel Holica
screenshot (165.51 KB, image/png)
2018-10-10 16:24 UTC, Adam Jackson
Xorg crash report (139.19 KB, text/plain)
2018-10-31 11:42 UTC, Ben
xorglog (30.77 KB, application/x-gzip)
2018-11-21 11:37 UTC, cnman

Description Pavel Holica 2018-10-09 13:35:42 UTC
Version-Release number of selected component:
xorg-x11-server-Xorg-1.20.1-3.el7

Additional info:
reporter:       libreport-2.1.11.1
executable:     /usr/bin/X
kernel:         3.10.0-957.el7.x86_64
pkg_fingerprint: 199E 2F91 FD43 1D51
pkg_vendor:     Red Hat, Inc.
reproducible:   Not sure how to reproduce the problem
runlevel:       N 5
type:           xorg
uid:            0

Truncated backtrace:
0: /usr/bin/X (xorg_backtrace+0x55) [0x56012924b155]
1: /usr/bin/X (0x56012909a000+0x1b4dd9) [0x56012924edd9]
2: /lib64/libpthread.so.0 (0x7fb6f0b0a000+0xf5d0) [0x7fb6f0b195d0]
3: /usr/lib64/xorg/modules/libglamoregl.so (0x7fb6ebc1f000+0x1507c) [0x7fb6ebc3407c]
4: /usr/lib64/xorg/modules/libglamoregl.so (0x7fb6ebc1f000+0x1661c) [0x7fb6ebc3561c]
5: /usr/bin/X (0x56012909a000+0x13a8e8) [0x5601291d48e8]
6: /usr/bin/X (0x56012909a000+0x12e8ba) [0x5601291c88ba]
7: /usr/bin/X (0x56012909a000+0x5c35b) [0x5601290f635b]
8: /usr/bin/X (0x56012909a000+0x603aa) [0x5601290fa3aa]
9: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x7fb6f075f3d5]
10: /usr/bin/X (0x56012909a000+0x4a4ce) [0x5601290e44ce]
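
Most frames in the backtrace above are unsymbolized: they carry only a module path and an offset from that module's load base. A minimal sketch, assuming GNU sed, of how those (module, offset) pairs could be pulled out for later symbol lookup. `frame_offsets` is a hypothetical helper name, not part of abrt or Xorg:

```shell
# Hypothetical helper: read an Xorg backtrace on stdin and print
# "<module> <offset>" for every frame that has no symbol name.
# Frames that already carry a symbol, e.g. (xorg_backtrace+0x55), are skipped.
frame_offsets() {
    sed -n -E 's/^[0-9]+: ([^ ]+) \(0x[0-9a-f]+\+(0x[0-9a-f]+)\) \[0x[0-9a-f]+\][[:space:]]*$/\1 \2/p'
}
```

With the matching debuginfo packages installed (an assumption about the local setup), each pair could then be passed to `addr2line -f -e <module> <offset>` to look up a function name.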

Comment 1 Pavel Holica 2018-10-09 13:35:46 UTC
Created attachment 1492052 [details]
File: Xorg.0.log

Comment 3 Pavel Holica 2018-10-09 13:41:05 UTC
Created attachment 1492055 [details]
xorg-report.tar.gz

Unfortunately reporter-bugzilla crashed while uploading more files. Uploading the archive produced by reporter-upload.

Comment 4 Pavel Holica 2018-10-09 13:41:56 UTC
Created attachment 1492056 [details]
debug-dmesg

dmesg collected from another boot with: drm.debug=0xe log_buf_len=1M
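
For readers unfamiliar with `drm.debug`, the value is a bitmask of debug categories. A rough sketch of decoding it, assuming the bit assignments used by the upstream DRM core (0x1 core, 0x2 driver, 0x4 KMS, 0x8 PRIME); the RHEL 7 kernel's backport may differ, and `decode_drm_debug` is a hypothetical helper name:

```shell
# Hypothetical helper: print the DRM debug categories enabled by a mask.
# Bit meanings are an assumption taken from the upstream DRM core.
decode_drm_debug() {
    mask=$(( $1 ))          # accepts decimal or 0x-prefixed hex
    out=""
    for pair in 1:core 2:driver 4:kms 8:prime; do
        bit=${pair%%:*}     # numeric bit value
        name=${pair#*:}     # category name
        if [ $(( mask & bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}
```

Under that assumption, `decode_drm_debug 0xe` lists the driver, KMS, and PRIME categories, i.e. everything except the DRM core messages.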

Comment 5 Pavel Holica 2018-10-09 13:42:35 UTC
Created attachment 1492057 [details]
debug-journalctl

journal collected from second boot with drm.debug=0xe log_buf_len=1M

Comment 6 Pavel Holica 2018-10-09 13:42:53 UTC
Created attachment 1492058 [details]
lspci

Comment 7 Pavel Holica 2018-10-09 13:46:01 UTC
The system has:
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Turks PRO [Radeon HD 7570] [1002:675d]

The system booted to GDM fine, but after login, when GNOME should appear, the graphical output starts behaving strangely. The cursor is displayed correctly the whole time, but the desktop is sometimes shown frozen and sometimes replaced by random colours (like gravel), and the screen flickers.

It's not possible to work on such a system at all. This didn't happen on RHEL-7.5.

Comment 13 Adam Jackson 2018-10-10 16:24:02 UTC
Created attachment 1492632 [details]
screenshot

I've logged into the machine and exported the display with vnc (x0vncserver), and things appear to look fine, see attached screenshot. I don't doubt that there's a bug involved here, but if vnc looks correct then the issue must be about the actual display output (vga, dvi, whatever), which would make this a kernel bug.

The initial report here included a crash with a backtrace, but without any steps to reproduce. I've not been able to trigger a crash with light desktop usage. If there's something specific I should try, let me know.

Comment 16 Tomas Pelka 2018-10-11 14:24:59 UTC
Also, regarding a reproducer: the screen just gets scrambled after GDM starts, so there is no real reproducer; just boot.

@jstodola/pholica let's just try a different monitor and cable (even digital vs. VGA) to rule out a HW issue.

Comment 17 Tomas Pelka 2018-10-11 14:30:00 UTC
Ajax, is there anything we can do locally to help debug?

Comment 18 Jan Stodola 2018-10-11 15:24:50 UTC
Retested using a different cable and a different monitor with a different resolution - the problem is still present.

Comment 19 Tomas Pelka 2018-10-15 06:51:45 UTC
One more thing I remember from when I observed this issue with pholica: when I ran startx, the session came up, but using llvmpipe.

Comment 20 Tomas Pelka 2018-10-16 14:14:37 UTC
Pavel or Jan, would it be possible to install the latest Xorg from brew (-5, I think)? We have a 0-day update with some additional fixes.

Comment 21 Adam Jackson 2018-10-16 18:45:13 UTC
Please try this scratch build:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=18811091

Comment 22 Pavel Holica 2018-10-17 13:37:57 UTC
Ok, so I tried the scratch build from comment 21 and if I'm not mistaken the issue is gone. glxinfo also says it's running on AMD not llvmpipe:
$ glxinfo | grep render
direct rendering: Yes
    GLX_MESA_multithread_makecurrent, GLX_MESA_query_renderer, 
    GLX_MESA_multithread_makecurrent, GLX_MESA_query_renderer, 
Extended renderer info (GLX_MESA_query_renderer):
OpenGL renderer string: AMD TURKS (DRM 2.50.0 / 3.10.0-957.el7.x86_64, LLVM 6.0.1)
    GL_ARB_conditional_render_inverted, GL_ARB_conservative_depth, 
    GL_NV_conditional_render, GL_NV_depth_clamp, GL_NV_packed_depth_stencil, 
    GL_ARB_conditional_render_inverted, GL_ARB_conservative_depth, 
    GL_NV_blend_square, GL_NV_conditional_render, GL_NV_depth_clamp, 
    GL_OES_element_index_uint, GL_OES_fbo_render_mipmap,

I've tried rebooting several times, also cold boot and didn't see the issue.

Going to try the other brew build (I expect that's the one for zero day) tomorrow.
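
The renderer check above (AMD hardware vs. the llvmpipe software fallback seen in comment 19) can be scripted. A minimal sketch, assuming GNU grep and the `OpenGL renderer string:` line format shown in the output above; `renderer_kind` is a hypothetical helper name:

```shell
# Hypothetical helper: read `glxinfo` output on stdin and report whether
# the OpenGL renderer looks like a software rasterizer or real hardware.
renderer_kind() {
    renderer=$(grep -m1 '^OpenGL renderer string:' | sed 's/^OpenGL renderer string: *//')
    case "$renderer" in
        "")                    echo "unknown" ;;   # no renderer line found
        *llvmpipe*|*softpipe*) echo "software" ;;  # Mesa software rasterizers
        *)                     echo "hardware" ;;
    esac
}
```

It would be used as `glxinfo | renderer_kind`. The match is case-sensitive, so the "LLVM 6.0.1" suffix in the AMD TURKS string is not mistaken for llvmpipe.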

Comment 23 Tomas Pelka 2018-10-17 14:05:44 UTC
I can't speak for ajax, but it seems this is a fix in the ati userspace driver, and we have a 0-day for the xorg server only. Seems like a candidate for z-stream.

Comment 24 Tomas Pelka 2018-10-17 15:35:44 UTC
Alan, may I ask you for GSSApproved for a future z-stream?

Thanks
-Tom

Comment 25 Pavel Holica 2018-10-18 06:53:45 UTC
As for the question in comment 20: I tried updating only Xorg, using xorg-x11-server-1.20.1-5.el7 https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=779167, with no change.

Fix from scratch build in comment 21 works.

Comment 26 Ben 2018-10-31 11:26:03 UTC
I've got two Radeon cards in two different Dell workstations:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV610 [Radeon HD 2400 PRO/XT]
and
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland XT [Radeon HD 8670 / R7 250/350] (rev 81)

Both of them work perfectly with xorg-x11-drv-ati-7.7.1-3.20160928git3fc839ff.el7.x86_64 (and associated xorg-* package dependencies), but DO NOT work with anything greater than that, such as

xorg-x11-drv-ati-7.10.0-1.el7.x86_64 or
xorg-x11-drv-ati-18.0.1-1.el7.x86_64

The symptoms are that you can boot, get GDM, and even log in (I use KDE, but I've seen this under GNOME too), but then while the pointer continues to work, the desktop becomes unresponsive (the clock doesn't update either).  The only way I've been able to get things working again is to downgrade to xorg-x11-drv-ati-7.7.1-3.20160928git3fc839ff.el7.x86_64 again.

Please help.

Comment 27 Ben 2018-10-31 11:42:46 UTC
Created attachment 1499340 [details]
Xorg crash report

Attached crash report for Xorg when using the new Xorg ati driver.

Comment 28 Tomas Pelka 2018-10-31 12:51:26 UTC
Raising severity, as it seems a customer is seeing this too.

Comment 29 Christian Labisch 2018-11-06 12:04:55 UTC
I can confirm the bug, GUI is not usable after upgrading from RHEL 7.5 to 7.6 :
AMD RV635/M86 [Mobility Radeon HD 3650] -> xorg-x11-drv-ati-18.0.1-1.el7.x86_64

I've added nomodeset or radeon.modeset=0 to the boot parameters as a workaround,
although the GUI then of course runs only at the basic screen resolution.

The scratch build link provided in Comment 21 seems broken : Server Not Found
I've tested the latest stable fedora package : xorg-x11-drv-ati-18.1.0-1.fc29

Package source -> https://koji.fedoraproject.org/koji/buildinfo?buildID=1153920
GUI seems to work correctly with these drivers ... though ABRT reports crashes.

Comment 30 cnman 2018-11-21 11:36:19 UTC
I can confirm the bug, GUI is not usable after upgrading from RHEL 7.5 to 7.6

$ lspci | grep ATI
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV710/M92 [Mobility Radeon HD 4530/4570/545v]
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] RV710/730 HDMI Audio [Radeon HD 4000 series]

Temporarily installing xorg-x11-drv-ati-18.1.0-1.fc30.x86_64.rpm or xorg-x11-drv-ati-18.1.0-1.el8.x86_64.rpm works; xorg-x11-drv-ati-18.0.1-2.fc29.x86_64.rpm does not work.

Comment 31 cnman 2018-11-21 11:37:12 UTC
Created attachment 1507632 [details]
xorglog

Comment 32 Shing-Shong Shei 2018-12-06 14:59:01 UTC
(In reply to cnman from comment #30)
> I can confirm the bug, GUI is not usable after upgrading from RHEL 7.5 to 7.6
> 
> $ lspci | grep ATI
> 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
> RV710/M92 [Mobility Radeon HD 4530/4570/545v]
> 01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] RV710/730 HDMI
> Audio [Radeon HD 4000 series]
> 
> temporarily install:
> xorg-x11-drv-ati-18.1.0-1.fc30.x86_64.rpm or
> xorg-x11-drv-ati-18.1.0-1.el8.x86_64.rpm works
> xorg-x11-drv-ati-18.0.1-2.fc29.x86_64.rpm don't work

Ditto here; i.e., upgrading from RHEL 7.5 to 7.6 broke the GUI. But building the RPM from
https://kojipkgs.fedoraproject.org//packages/xorg-x11-drv-ati/18.1.0/1.fc29/src/xorg-x11-drv-ati-18.1.0-1.fc29.src.rpm

and installing it solved the problem. Thanks.

Comment 33 Ben 2018-12-06 15:06:53 UTC
Looks like there's a solution.  Is there any chance RH will roll the changes into an EL7 RPM before Christmas?

Comment 34 cnman 2018-12-07 12:49:24 UTC
(In reply to Ben from comment #33)
> Looks like there's a solution.  Is there any chance RH will roll the changes
> into an EL7 RPM before Christmas?

That's not a final solution; xorg-x11-server-Xorg keeps crashing.

Comment 35 Martin Jürgens 2018-12-24 00:30:48 UTC
Same issue for me. As a temporary workaround, installing xorg-x11-drv-ati-18.1.0-1.fc29.x86_64 fixes this.

01:05.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RS880 [Radeon HD 4200]


0: /usr/bin/X (xorg_backtrace+0x55) [0x55c9ac452185]
1: /usr/bin/X (0x55c9ac2a1000+0x1b4e09) [0x55c9ac455e09]
2: /lib64/libpthread.so.0 (0x7ff48c19d000+0xf5d0) [0x7ff48c1ac5d0]
3: /lib64/libc.so.6 (gsignal+0x37) [0x7ff48be06207]
4: /lib64/libc.so.6 (abort+0x148) [0x7ff48be078f8]
5: /lib64/libc.so.6 (0x7ff48bdd0000+0x2f026) [0x7ff48bdff026]
6: /lib64/libc.so.6 (0x7ff48bdd0000+0x2f0d2) [0x7ff48bdff0d2]
7: /usr/lib64/xorg/modules/libfb.so (0x7ff4874e4000+0x4ac0) [0x7ff4874e8ac0]
8: /usr/lib64/xorg/modules/libfb.so (0x7ff4874e4000+0xf1fa) [0x7ff4874f31fa]
9: /usr/lib64/xorg/modules/libglamoregl.so (glamor_validate_gc+0x185) [0x7ff4872be805]
10: /usr/bin/X (0x55c9ac2a1000+0x138b54) [0x55c9ac3d9b54]
11: /usr/bin/X (ValidateGC+0x1b) [0x55c9ac31106b]
12: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0x7ff48833a000+0x50680) [0x7ff48838a680]
13: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0x7ff48833a000+0x4dc9d) [0x7ff488387c9d]
14: /usr/bin/X (AbortDDX+0x85) [0x55c9ac33c525]
15: /usr/bin/X (0x55c9ac2a1000+0x1bd772) [0x55c9ac45e772]
16: /usr/bin/X (0x55c9ac2a1000+0x1be5dd) [0x55c9ac45f5dd]
17: /usr/bin/X (0x55c9ac2a1000+0x1b4e69) [0x55c9ac455e69]
18: /lib64/libpthread.so.0 (0x7ff48c19d000+0xf5d0) [0x7ff48c1ac5d0]
19: /lib64/libc.so.6 (gsignal+0x37) [0x7ff48be06207]
20: /lib64/libc.so.6 (abort+0x148) [0x7ff48be078f8]
21: /lib64/libc.so.6 (0x7ff48bdd0000+0x2f026) [0x7ff48bdff026]
22: /lib64/libc.so.6 (0x7ff48bdd0000+0x2f0d2) [0x7ff48bdff0d2]
23: /usr/lib64/xorg/modules/libfb.so (0x7ff4874e4000+0x4ac0) [0x7ff4874e8ac0]
24: /usr/lib64/xorg/modules/libfb.so (0x7ff4874e4000+0xf1fa) [0x7ff4874f31fa]
25: /usr/lib64/xorg/modules/libglamoregl.so (glamor_validate_gc+0x185) [0x7ff4872be805]
26: /usr/bin/X (0x55c9ac2a1000+0x138b54) [0x55c9ac3d9b54]
27: /usr/bin/X (ValidateGC+0x1b) [0x55c9ac31106b]
28: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0x7ff48833a000+0x50680) [0x7ff48838a680]
29: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0x7ff48833a000+0x4dc9d) [0x7ff488387c9d]
30: /usr/bin/X (0x55c9ac2a1000+0xac274) [0x55c9ac34d274]
31: /usr/bin/X (0x55c9ac2a1000+0xc88dc) [0x55c9ac3698dc]
32: /usr/bin/X (0x55c9ac2a1000+0xe7bf8) [0x55c9ac388bf8]
33: /usr/bin/X (0x55c9ac2a1000+0x604c4) [0x55c9ac3014c4]
34: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x7ff48bdf23d5]
35: /usr/bin/X (0x55c9ac2a1000+0x4a4ce) [0x55c9ac2eb4ce]

Comment 37 Christian Labisch 2019-01-12 15:42:28 UTC
It has been about two and a half months since RHEL 7.6 was released.
I'm still using the xorg-x11-drv-ati-18.1.0-1.el8.x86_64 beta drivers.
To avoid the annoying error messages, I've disabled abrt-xorg.service.
It would be nice to get a fixed drivers package in the near future. :)

Comment 38 Levente Farkas 2019-01-13 16:43:52 UTC
IMHO it's a serious regression. We also use the 18.1.0-1 from Fedora to fix all of our servers. I don't really understand why RH does not release a patch/update???

Comment 39 Ben 2019-01-13 18:49:22 UTC
I've a number of workstations with ATI cards, all of which are stuck at 7.7.1-3.20160928git3fc839ff until something is done.  I guess there's a queue for things like this?  If so, can someone say where it is in it?

Comment 40 Christian Labisch 2019-01-30 10:38:47 UTC
The bug report can be closed because the drivers 18.1.0-1.el7_6 are available in rhel-7-server-rpms.
https://access.redhat.com/downloads/content/xorg-x11-drv-ati/18.1.0-1.el7_6/x86_64/fd431d51/package
ABRT still reports (false positive?) crashes, but the main issue is fixed and the GUI works fine.
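
As a recap of the driver versions mentioned in this thread, a small sketch that classifies an installed package name by what commenters reported. It only encodes anecdotes from the comments above and is not an official compatibility list; `classify_ati_driver` is a hypothetical helper name:

```shell
# Hypothetical helper: map an xorg-x11-drv-ati package name to the
# outcome reported by commenters in this bug (not an official list).
classify_ati_driver() {
    case "$1" in
        xorg-x11-drv-ati-7.7.1-*)  echo "reported working" ;;  # RHEL 7.5 era driver
        xorg-x11-drv-ati-7.10.0-*) echo "reported broken" ;;
        xorg-x11-drv-ati-18.0.1-*) echo "reported broken" ;;
        xorg-x11-drv-ati-18.1.0-*) echo "reported working" ;;  # includes 18.1.0-1.el7_6
        *)                         echo "not mentioned" ;;
    esac
}
```

On an affected machine it could be called as `classify_ati_driver "$(rpm -q xorg-x11-drv-ati)"`.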

Comment 41 Tomas Pelka 2019-02-06 10:58:32 UTC
Closing based on c40.

