Bug 1010357 - lock up of Xorg on start when openssl-fips is installed
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openssl
Version: 7.0
Hardware: x86_64 Linux
Priority: high Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Tomas Mraz
QA Contact: Hubert Kario
Whiteboard: [cat:lockup]
Keywords: Triaged
Depends On:
Blocks:
Reported: 2013-09-20 11:18 EDT by Matěj Cepl
Modified: 2014-06-17 23:51 EDT (History)

See Also:
Fixed In Version: openssl-1.0.1e-21.el7
Doc Type: Bug Fix
Last Closed: 2014-06-13 06:43:58 EDT
Type: Bug


Attachments:
dmesg output (73.23 KB, text/plain), 2013-09-20 11:18 EDT, Matěj Cepl
/var/log/Xorg.0.log (21.18 KB, text/plain), 2013-09-20 11:19 EDT, Matěj Cepl

Description Matěj Cepl 2013-09-20 11:18:18 EDT
Created attachment 800532
dmesg output

Description of problem:
I don't know what actually happened, but I am not able to get Xorg running. gdm, startx, and plain /usr/bin/X all lock up the screen. I can connect via ssh, but I cannot get a usable display by any means (not even a text console via Ctrl-Alt-F2).

On Adam's advice I have tried to play with

Option     "AccelMethod"      "UXA"

and now I have

Option     "NoAccel"           "yes"

but it doesn't seem to make any difference; I still get a backtrace which, to my naive eyes, looks the same (shown below).

Version-Release number of selected component (if applicable):
xorg-x11-drv-intel-2.21.12-2.el7.x86_64
xorg-x11-server-common-1.14.2-10.el7.x86_64
xorg-x11-server-Xorg-1.14.2-10.el7.x86_64
libdrm-2.4.46-1.el7.x86_64
mesa-libGL-9.2-1.20130902.el7.i686


How reproducible:
Unfortunately 100%, so I don't have my work computer :()

Steps to Reproduce:
1. See above.

Actual results:
Black screen (or a whitish-grey one when running gdm).

Expected results:
either perfectly working Xorg or some kind of degraded experience, but SOMETHING, please ... please.

Additional info:

Full backtrace of the Xorg server:

0x00007fe968d19bfc in __pthread_mutex_unlock_usercnt (decr=1, mutex=0x7fe969d6d908 <_rtld_local+2312>) at pthread_mutex_unlock.c:52
52	      lll_unlock (mutex->__data.__lock, PTHREAD_MUTEX_PSHARED (mutex));
Missing separate debuginfos, use: debuginfo-install freetype-2.4.11-6.el7.x86_64 libXdamage-1.1.4-3.el7.x86_64 libXext-1.3.2-1.el7.x86_64 libXfixes-5.0.1-1.el7.x86_64 libXxf86vm-1.1.3-1.el7.x86_64 libfontenc-1.1.1-3.el7.x86_64 libgcc-4.8.1-9.el7.x86_64 libxcb-1.9-3.el7.x86_64 mesa-libEGL-9.2-1.20130902.el7.x86_64 mesa-libGL-9.2-1.20130902.el7.x86_64 mesa-libgbm-9.2-1.20130902.el7.x86_64 mesa-libglapi-9.2-1.20130902.el7.x86_64 pcre-8.32-7.el7.x86_64 xorg-x11-drv-intel-2.21.12-2.el7.x86_64 zlib-1.2.7-10.el7.x86_64
(gdb) bt full
#0  0x00007fe968d19bfc in __pthread_mutex_unlock_usercnt (decr=1, mutex=0x7fe969d6d908 <_rtld_local+2312>) at pthread_mutex_unlock.c:52
        type = 1
#1  __GI___pthread_mutex_unlock (mutex=0x7fe969d6d908 <_rtld_local+2312>) at pthread_mutex_unlock.c:297
No locals.
#2  0x00007fe969b4d069 in tls_get_addr_tail (ti=0x7fe969936f58, dtv=0x7fe969d34290, the_map=0x7fe969d6bb20) at dl-tls.c:753
No locals.
#3  0x00007fe96972077e in init_thread_destructor () at procattr.c:64
No locals.
#4  getprocattrcon_raw (context=context@entry=0x7fff5c993fa8, pid=pid@entry=0, attr=attr@entry=0x7fe96972f3e6 "current") at procattr.c:112
        buf = <optimized out>
        size = <optimized out>
        fd = <optimized out>
        ret = <optimized out>
        errno_hold = <optimized out>
        prev_context = <optimized out>
#5  0x00007fe969720a3e in getcon_raw_internal (c=c@entry=0x7fff5c993fa8) at procattr.c:325
No locals.
#6  0x00007fe96972eba0 in is_selinux_enabled_internal () at enabled.c:26
        enabled = 1
        con = 0x7fe969d6e208 ""
#7  0x00000000004f5e29 in SELinuxExtensionInit () at xselinux_ext.c:695
        extEntry = <optimized out>
#8  0x00000000004b9cb1 in InitExtensions (argc=argc@entry=12, argv=argv@entry=0x7fff5c994138) at ../../../mi/miinitext.c:337
        i = <optimized out>
        ext = <optimized out>
#9  0x00000000004260c0 in main (argc=12, argv=0x7fff5c994138, envp=<optimized out>) at main.c:208
        i = <optimized out>
        alwaysCheckForInput = {0, 1}
(gdb)
Comment 1 Matěj Cepl 2013-09-20 11:19:33 EDT
Created attachment 800542
/var/log/Xorg.0.log
Comment 3 Matěj Cepl 2013-09-23 09:44:59 EDT
After removing openssl-fips (and the packages depending on it), Xorg works like a charm.
Comment 4 Tomas Mraz 2013-09-23 10:23:28 EDT
The situation as it stands:

1. openssl needs to do some initialization in a constructor due to new FIPS requirements.

2. Unless the code run in the constructor is trivial, X hangs in a loop in tls_get_addr_tail() from glibc, which is called from libselinux. This happens after the openssl constructor has already run.
Comment 5 Tomas Mraz 2013-09-23 10:26:50 EDT
Carlos, do you have any idea what could be the cause or at least how to find it?
Comment 6 Tomas Mraz 2013-09-23 12:29:17 EDT
So the cause was that I was dlopening libssl.so from libcrypto in the constructor. Which I can avoid. Although I'd still like to know from some glibc guru why it must not be done.
Comment 7 Carlos O'Donell 2013-09-24 04:29:38 EDT
(In reply to Tomas Mraz from comment #6)
> So the cause was that I was dlopening libssl.so from libcrypto in the
> constructor. Which I can avoid. Although I'd still like to know from some
> glibc guru why it must not be done.

There should be no problem using dlopen to load libssl.so from a constructor in libcrypto e.g. __attribute__ ((constructor)).

Your particular backtrace doesn't show any dlopen-related calls. It doesn't show that the particular thread (are there threads?) is blocked, just that it's unlocking a mutex (which is normal). Where is the cycle?
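
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a GCC constructor that calls dlopen before main() runs. It is only an illustration, not the openssl code: the names demo_init, demo_constructor_ran and demo_library_loaded are hypothetical, and libm.so.6 is an assumed stand-in for the library being loaded (the report concerns libcrypto loading libssl.so).

```c
#include <dlfcn.h>
#include <stddef.h>

/* Assumption: stand-in for the library loaded from the constructor.
   In the report, libcrypto was loading libssl.so. */
#define DEMO_LIBRARY "libm.so.6"

static int constructor_ran = 0;   /* set once the constructor has executed */
static void *demo_handle = NULL;  /* dlopen result; NULL if loading failed */

/* Hypothetical constructor: runs when the object is loaded, before main(). */
__attribute__((constructor))
static void demo_init(void)
{
    demo_handle = dlopen(DEMO_LIBRARY, RTLD_NOW | RTLD_LOCAL);
    constructor_ran = 1;
}

/* Report whether the constructor has run (1) or not (0). */
int demo_constructor_ran(void)
{
    return constructor_ran;
}

/* Report whether the dlopen call in the constructor succeeded. */
int demo_library_loaded(void)
{
    return demo_handle != NULL;
}
```

Linked into a program or shared object with GCC on Linux (adding -ldl on glibc older than 2.34), the constructor has already run by the time main() starts, so a caller can check demo_constructor_ran() and demo_library_loaded() right away.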
Comment 8 Tomas Mraz 2013-09-24 04:44:17 EDT
Actually the mutex lock/unlock returns immediately, and the cycle is done by the goto again; statement in tls_get_addr_tail(). Also note that there is only a single thread, so there is no reason why the mutex should block.

The problem might be that libssl.so depends on many other shared libraries and something gets messed up during loading. I don't know. Anyway, it seems much safer not to load it, as it is not strictly necessary.
Comment 9 Carlos O'Donell 2013-09-24 05:46:54 EDT
(In reply to Tomas Mraz from comment #8)
> Actually the mutex lock/unlock returns immediately, and the cycle is done by
> the goto again; statement in tls_get_addr_tail(). Also note that there is only
> a single thread, so there is no reason why the mutex should block.

That's certainly odd. The only way for this to happen would be for l_tls_offset to be non-zero and positive while the module's block remains at TLS_DTV_UNALLOCATED. That should never happen.
Comment 10 Tomas Mraz 2013-09-24 06:06:47 EDT
As I said, there is only a single thread, so the values can hardly change in the loop.
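
The condition described in the last few comments can be modelled with a toy retry loop. This is a sketch under stated assumptions and is not glibc's tls_get_addr_tail(): struct toy_module, TOY_UNALLOCATED and toy_lookup_spins are hypothetical names, the locking is left out, and a spin limit is added so the example terminates. It only shows why a goto-based retry cannot make progress when nothing else can change the values it tests.

```c
#include <stddef.h>

/* Hypothetical marker, standing in for the "unallocated" state
   mentioned in the comments. Not glibc's definition. */
#define TOY_UNALLOCATED ((void *) -1)

/* Toy model of the two values named in the comments. */
struct toy_module {
    long tls_offset;  /* models l_tls_offset */
    void *block;      /* models the module's TLS block pointer */
};

/* Retry while the offset is positive but the block is still unallocated.
   Returns the number of retries taken, or -1 once max_spins retries have
   been made without the block becoming available. */
long toy_lookup_spins(const struct toy_module *m, long max_spins)
{
    long spins = 0;

again:
    if (m->tls_offset > 0 && m->block == TOY_UNALLOCATED) {
        /* With a single thread nothing can update m->block here,
           so every retry sees exactly the same state. */
        if (++spins >= max_spins)
            return -1;
        goto again;
    }
    return spins;
}
```

With a module whose offset is positive and whose block is TOY_UNALLOCATED, the function exhausts the spin limit and returns -1; with an allocated block it returns 0 on the first pass. Without the added limit, the first case would spin forever, which matches the hang reported here.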
Comment 15 Ludek Smid 2014-06-13 06:43:58 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
