Bug 1264556 - Make dlclose and exit concurrent calls work correctly per POSIX.
Summary: Make dlclose and exit concurrent calls work correctly per POSIX.
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: rawhide
Hardware: Unspecified
OS: Linux
unspecified
unspecified
Target Milestone: ---
Assignee: Carlos O'Donell
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
: 1189856 (view as bug list)
Depends On:
Blocks: 1189856 1274922
TreeView+ depends on / blocked
 
Reported: 2015-09-18 19:31 UTC by Robbie Harwood
Modified: 2019-04-09 14:43 UTC (History)
33 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-17 23:59:16 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
rpc-gssd output with LD_DEBUG=all (1.46 MB, text/plain)
2016-11-23 11:08 UTC, nicolas.rolans
no flags Details
Output of ssh session with LD_DEBUG=all (7.51 MB, text/plain)
2016-11-23 11:09 UTC, Lorenzo Dalrio
no flags Details
ssh with LD_DEBUG=all (7.88 MB, text/plain)
2016-11-23 16:18 UTC, Viorel Tabara
no flags Details

Description Robbie Harwood 2015-09-18 19:31:28 UTC
Hello,

krb5 (libgssapi) dlopen()s a file which dlopen()s krb5.  This has in the past caused problems[0] and so in [1] we started dlopen()ing with RTLD_NODELETE.

However, in rawhide the problem is back.  Here's a trace:

(gdb) bt
#0  _dl_close (_map=0xafa160) at dl-close.c:811
#1  0x00007f7b0ca85164 in _dl_catch_error (objname=0xafa0c0, errstring=0xafa0c8, mallocedp=0xafa0b8, operate=0x7f7b0ba07090 <dlclose_doit>, args=0xafa160) at dl-error.c:187
#2  0x00007f7b0ba07631 in _dlerror_run (operate=operate@entry=0x7f7b0ba07090 <dlclose_doit>, args=0xafa160) at dlerror.c:163
#3  0x00007f7b0ba070bf in __dlclose (handle=<optimized out>) at dlclose.c:46
#4  0x00007f7b0b0e17c5 in krb5int_close_plugin (h=0xb020e0) at plugins.c:416
#5  0x00007f7b0c4275aa in ?? ()
#6  0x000000000cc8c040 in ?? ()
#7  0x335b2b426230eb00 in ?? ()
#8  0x0000000000afa0b0 in ?? ()
#9  0x0000000000000000 in ?? ()
(gdb) p map->l_init_called
$6 = 0
(gdb) 

To reiterate, this is during the exit phase of the program, after main has returned, and the program otherwise ran correctly.

[0]: http://krbdev.mit.edu/rt/Ticket/Display.html?id=7135
[1]: http://krbdev.mit.edu/rt/Ticket/Display.html?id=7947

Comment 1 Carlos O'Donell 2015-09-19 02:40:10 UTC
This looks like dlclose racing against exit. It is undefined behaviour to call anything after exit has been started (for many reasons including the fact that destructors are starting to run for all not-yet closed objects).

Only exit will call _dl_fini which will set map->l_init_called to zero, to indicate that the shtudown processing has handled that particular finalization for the object.

Therefore this looks like a user bug. You're going to have to track down if there is a thread running that's still active and calling dlclose, while the rest of the program has already called exit to shutdown. For the sake of robustness you should ensure all threads are stopped and joined before you call exit.

Given my description, is it possible you have a dlclose and an exit happening near eachother in the way I've described?

Comment 2 Carlos O'Donell 2015-09-19 02:54:30 UTC
Sorry, I mentioned "undefined behaviour" and that's not quite true. In the broadest sense you get data races which make your program undefined per the C standard, but that need not be the case. You could in theory have a thread that survives the exit process and is terminated only by the final kill signal. An atexit registered function might even load a value from that thread using an atomic read of some shared global state. Either way, I just wanted to correct the slight language issue.

There is still the real issue that dlclose and exit without any happens-before relationships is going to cause problems. However, because I know nothing about krb5 and libgssapi I have no idea about the threads you might have active. I'm relying on your help there.

Comment 3 Carlos O'Donell 2015-09-22 11:43:02 UTC
Upstream discussion here:
https://www.sourceware.org/ml/libc-alpha/2015-09/msg00536.html

I had my suspicious about glibc's behaviour here, and upstream agrees with me, that while this is dangerous to do in glibc, it is standards mandated behaviour via POSIX that dlclose and exit shall both be thread safe. Therefore any data races are flaws in glibc

Here is a build with fixes for this issue:
http://koji.fedoraproject.org/koji/taskinfo?taskID=11172534

Comment 4 Carlos O'Donell 2015-10-30 14:40:24 UTC
The next step here is to get access again to a machine that reproduce the issue and work out the concurrency fixes required to fix exit and dlclose racing.

Comment 8 Jan Kurik 2016-02-24 13:46:28 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle.
Changing version to '24'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase

Comment 9 Viorel Tabara 2016-08-21 04:39:13 UTC
I've hit this in ipa-client-install and ssh:

   [root@omiday ~]# ipa-client-install 
   Using existing certificate '/etc/ipa/ca.crt'.
   Discovery was successful!
   Client hostname: omiday.can.local
   Realm: CAN.LOCAL
   DNS Domain: can.local
   IPA Server: stuff.can.local
   BaseDN: dc=can,dc=local

   Continue to configure the system with these values? [no]: yes
   Removed old keys for realm CAN.LOCAL from /etc/krb5.keytab
   Synchronizing time with KDC...
   Attempting to sync time using ntpd.  Will timeout after 15 seconds
   Attempting to sync time using ntpd.  Will timeout after 15 seconds
   Unable to sync time with NTP server, assuming the time is in sync. Please check that 123 UDP port is opened.
   User authorized to enroll computers: admin
   Password for admin: 
   Joining realm failed: Host is already joined.
   Inconsistency detected by ld.so: dl-close.c: 811: _dl_close: Assertion `map->l_init_called' failed!

   Installation failed. Rolling back changes.
   IPA client is not configured on this system.


   omiday ~ $ ssh -S none stuff.can.local 
   Last login: Sat Aug 20 22:03:41 2016 from 192.168.0.11
   [root@stuff ~]# logout
   Connection to 192.168.0.9 closed.
   Inconsistency detected by ld.so: dl-close.c: 811: _dl_close: Assertion `map->l_init_called' failed!
 
This is on a system upgraded earlier this month from F24 to F25-alpha:

   [root@omiday ~]# dnf history list 100
   ID     | Command line             | Date and time    | Action(s)      | Altered
   -------------------------------------------------------------------------------
      100 | --releasever=25 system-u | 2016-08-11 13:56 | D, E, I, O, U  | 1492 EE
 

Running glibc-2.24-3.fc25.x86_64.

Comment 10 Viorel Tabara 2016-10-06 05:23:59 UTC
While investigating hit-and-miss Ansible playbooks connection errors I 
discovered that disabling 'gssapi-with-mic' works around the ssh issue:

   omiday ~ $ ssh -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey stuff.can.local ; echo $?
   Last login: Wed Oct  5 22:53:08 2016 from 192.168.0.11
   [root@stuff ~]# logout
   Connection to 192.168.0.9 closed.
   Inconsistency detected by ld.so: dl-close.c: 811: _dl_close: Assertion `map->l_init_called' failed!
   127
   omiday ~ $ 
   omiday ~ $ 
   omiday ~ $ 
   omiday ~ $ ssh -o PreferredAuthentications=gssapi-keyex,hostbased,publickey stuff.can.local ; echo $?
   Last login: Wed Oct  5 22:55:16 2016 from 192.168.0.11
   [root@stuff ~]# logout
   Connection to 192.168.0.9 closed.
   0

So I don't know if this helps in any way with the request in comment 4 but if 
there is anything else I can help with let me know.

Comment 11 Robbie Harwood 2016-10-12 18:07:42 UTC
*** Bug 1189856 has been marked as a duplicate of this bug. ***

Comment 12 Baptiste Mille-Mathias 2016-10-31 17:40:16 UTC
to comment #4 the ssh server git.gnome.org exhibits the issue. I have an error each time I clone a repository. (I've open the related bug #

Comment 13 Viorel Tabara 2016-10-31 18:13:48 UTC
(In reply to Baptiste Mille-Mathias from comment #12)
> to comment #4 the ssh server git.gnome.org exhibits the issue. I have an
> error each time I clone a repository. (I've open the related bug #

I've worked around it by setting PreferredAuthentications=publickey in ssh_config so scripts relying on ssh exit code still work. It's a client side issue.

Comment 14 Bojan Smojver 2016-11-12 06:47:16 UTC
I'm seeing this after upgrade to F25. But only on one x86_64 system, which is a VM running on ESXi. The other two x86_64 systems, which are bare metal, do not seem to be showing this problem.

It appears for both ssh and rcp.gssd, if it matters.

Comment 15 Bojan Smojver 2016-11-16 03:16:06 UTC
Does anyone know how to fix this for good in glibc? Or krb5?

This is 100% repeatable on my F25 x86_64 VM that uses Kerberos against AD.

PS. Have another F25 x86_64 machine (bare metal) that uses Kerberos against MIT krb5 server and I haven't seen it there. In fact, rpc-gssd.service fires up on it just fine.

Comment 16 Robbie Harwood 2016-11-16 04:15:51 UTC
(In reply to Bojan Smojver from comment #15)
> Does anyone know how to fix this for good in glibc? Or krb5?

This is a glibc bug, not a krb5 bug.  It is a race condition between dlclose() and exit() that trips an assert in glibc.

A possible workaround might be to LD_PRELOAD something to turn dlclose() into a no-op, but I can't speak to what the consequences of this are.

Alternately, a build with the assert patched out may work; I do not know.

Comment 17 Bojan Smojver 2016-11-16 05:37:55 UTC
Hmm, wouldn't patching out assert just mask the real issue?

Anyhow, if this is a race, is there a way to fix the race or is the nature of the problem such that this is simply unfixable?

Comment 18 Carlos O'Donell 2016-11-16 13:04:18 UTC
(In reply to Bojan Smojver from comment #17)
> Hmm, wouldn't patching out assert just mask the real issue?
> 
> Anyhow, if this is a race, is there a way to fix the race or is the nature
> of the problem such that this is simply unfixable?

It is fixable, we (Fedora glibc team) have just prioritized other issues over this one. Thank you for commenting on this issue though since it helps us keep the priority current.

Comment 19 Bojan Smojver 2016-11-16 21:58:58 UTC
(In reply to Carlos O'Donell from comment #18)

> It is fixable, we (Fedora glibc team) have just prioritized other issues
> over this one. Thank you for commenting on this issue though since it helps
> us keep the priority current.

No worries. I can pester you with questions all you want. :-)

PS. For people that are hitting this problem, the system runs in a degraded state, because rpc-gssd.service won't start.

Comment 20 Lorenzo Dalrio 2016-11-20 10:39:42 UTC
I am hitting this one after upgrading from f24 to f25 yesterday on a t450 laptop.

Comment 21 Florian Weimer 2016-11-20 13:36:59 UTC
Have we actually confirmed this is a race involving multiple threads?

Comment 22 Viorel Tabara 2016-11-20 19:59:52 UTC
(In reply to Florian Weimer from comment #21)
> Have we actually confirmed this is a race involving multiple threads?

How do you do that?

Comment 23 Florian Weimer 2016-11-20 20:29:20 UTC
(In reply to Viorel Tabara from comment #22)
> (In reply to Florian Weimer from comment #21)
> > Have we actually confirmed this is a race involving multiple threads?
> 
> How do you do that?

Run “info threads” if you can reproduce the assertion failure under GDB, or when examining the core file.  If there is just one thread, the program may still have been multi-threaded at the time the bug was trigger, so it is only a reliable indicator when multiple threads are present.  If you can trigger the bug while running under GDB, then you can watch out for the LWP messages.

(The question was more directed towards Carlos.)

Comment 24 Michael Cronenworth 2016-11-23 05:45:28 UTC
Downgrading to glibc-2.23.1-11.fc24 makes the assert stop.

OpenSSH does not have multiple threads running when this occurs.

It seems to only affect OpenSSH. Other apps that dlopen() gssapi krb5 libs, such as the OpenSSL and cURL clients, do not hit this assert.

Comment 25 Bojan Smojver 2016-11-23 06:06:21 UTC
(In reply to Michael Cronenworth from comment #24)

> It seems to only affect OpenSSH.

Try starting rpc-gssd.service. I have a machine where it works and one where it won't start because of the assert.

Comment 26 Florian Weimer 2016-11-23 10:55:26 UTC
I think we have two different bugs here.  The first one is the multi-threading race between dlclose and process exit.

The other bug is some other form of link map corruption, probably along the lines of bug 1395758.

Can those who encounter the other bug (with ssh and rpc-ssd.service) please run these processes with LD_DEBUG=all and attach the output?  Thanks.

Comment 27 nicolas.rolans 2016-11-23 11:08:18 UTC
Created attachment 1223171 [details]
rpc-gssd output with LD_DEBUG=all

As requested by Florian Weimer.

Comment 28 Lorenzo Dalrio 2016-11-23 11:09:59 UTC
Created attachment 1223172 [details]
Output of ssh session with LD_DEBUG=all

This is the output of ssh with LD_DEBUG=all.

Comment 29 Fernando Herrera 2016-11-23 15:57:30 UTC
I think the problem with ssh and krb is that _exit calls _dl_fini for /usr/lib64/gssntlmssp/gssntlmssp.so first and then it calls _dl_fini for /lib64/libgssapi_krb5.so.2 which does freeMechList -> releaseMechInfo which calls krb5int_close_plugin for the already closed gssntlmssp.so

Comment 30 Viorel Tabara 2016-11-23 16:18:00 UTC
Created attachment 1223334 [details]
ssh with LD_DEBUG=all

(In reply to Florian Weimer from comment #26)
> > Can those who encounter the other bug (with ssh and rpc-ssd.service) please
> run these processes with LD_DEBUG=all and attach the output?  Thanks.

There you go. Thanks!

Comment 31 Fernando Herrera 2016-11-23 16:46:38 UTC
gssntlmssp.so plugin is opened at gss_init_sec_context -> gssint_get_mechanism but as mech->gss_init_sec_context is NULL the function returns without closing the plugin

Comment 32 Florian Weimer 2016-11-24 15:15:18 UTC
The ssh client assert has been filed as bug 1398370.

Comment 33 Colin.Simpson 2016-11-25 14:17:58 UTC
I see rpc.gssd failing to start from boot in the F25.

Do you still need debug on this?



systemctl status rpc-gssd.service
● rpc-gssd.service - RPC security service for NFS client and server
   Loaded: loaded (/usr/lib/systemd/system/rpc-gssd.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2016-11-25 14:15:34 GMT; 7s ago
  Process: 5528 ExecStart=/usr/sbin/rpc.gssd $RPCSVCGSSDARGS (code=exited, status=127)

Nov 25 14:15:34 shader systemd[1]: Starting RPC security service for NFS client and server...
Nov 25 14:15:34 shader rpc.gssd[5528]: Inconsistency detected by ld.so: dl-close.c: 811: _dl_close: Assertion `map->l_init_called' failed!
Nov 25 14:15:34 shader systemd[1]: rpc-gssd.service: Control process exited, code=exited status=127
Nov 25 14:15:34 shader systemd[1]: Failed to start RPC security service for NFS client and server.
Nov 25 14:15:34 shader systemd[1]: rpc-gssd.service: Unit entered failed state.
Nov 25 14:15:34 shader systemd[1]: rpc-gssd.service: Failed with result 'exit-code'.

Comment 34 more.andres 2016-12-06 01:19:54 UTC
Also bumped into the same issue after upgrading into F25.

rpc.gssd[1179]: Inconsistency detected by ld.so: dl-close.c: 811: _dl_close: Assertion `map->l_init_called'

Comment 35 Mats Wichmann 2016-12-07 23:33:04 UTC
I'm running into this in a very persistent way using git-over-ssh.

Comment 36 Robbie Harwood 2016-12-12 19:23:00 UTC
*** Bug 1398370 has been marked as a duplicate of this bug. ***

Comment 37 Anthony Messina 2016-12-27 13:18:32 UTC
This seems to be fixed with the updates described in #1398370

Comment 38 Florian Weimer 2016-12-27 13:26:04 UTC
(In reply to Anthony Messina from comment #37)
> This seems to be fixed with the updates described in #1398370

No, bug 1398370 and bug 1264556 are different bugs, and bug 1264556 is not yet fixed.

Comment 39 Bojan Smojver 2016-12-28 01:07:20 UTC
(In reply to Florian Weimer from comment #38)

> No, bug 1398370 and bug 1264556 are different bugs, and bug 1264556 is not
> yet fixed.

Interesting. That gss service now starts fine for me. And openssh stopped with the errors as well.

Comment 40 Fedora End Of Life 2017-11-16 19:45:40 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged  change the 'version' to a later Fedora
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 41 Robbie Harwood 2017-11-16 20:59:13 UTC
Reproducer is gone, but the issue remains, and may come back.

Comment 42 Carlos O'Donell 2017-11-17 23:59:16 UTC
I believe we have fixed the most egregious problems just recently in upstream. Therefore I think this is fixed for Rawhide. I'm going to mark this CLOSED/RAWHIDE, and it can be reopened if someone shows a testcase that fails in Rawhide.

commit e5e4d7cc056ffae51fc55b66d9dd0abd99927486
Author: Paul Pluzhnikov <ppluzhnikov>
Date:   Thu Sep 21 12:14:41 2017 -0700

    Fix BZ# 22180.
    
    POSIX requires that dlclose() and exit() be thread safe, therefore
    you can have one thread in the middle of dlclose() and another thread
    executing exit() without causing any undefined behaviour on the part
    of the implementation.
    
    The existing implementation had a flaw that exit() exit handler processing
    did not consider a concurrent dlclose() and would not mark already run
    exit handlers using the ef_free flavour. The consequence of this is that
    a concurrent exit() with dlclose() will run all the exit handlers that
    dlclose() had not yet run, but then will block on the loader lock. The
    concurrent dlclose() will continue to run all the exit handlers again
    (twice) in violation of the Itanium C++ ABI requirements for __cxa_atexit().
    
    This commit fixes this by having exit() mark all handlers with ef_free to
    ensure that concurrent dlclose() won't re-run registered exit handlers that
    have already run.

Comment 43 Jan Pokorný [poki] 2019-04-09 13:01:53 UTC
Interestingly, have just observed "Inconsistency detected by ld.so"
message when running cvlc (vlc-core-3.0.5-6.fc30.x86_64, built on
buildvm-06.virt.rpmfusion.net so the source is clear, yet still
reproducible if need be).  Sadly, I've been resizing the terminal
and got the message lost, but ISTR there was indeed "map" variable
mentioned.

Cannot reproduce that anymore now, it hapenned when cvlc was run for
the first time after boot, which happened several hours earlier).

$ rpm -q glibc
> glibc-2.29.9000-9.fc31.x86_64

$ file /usr/bin/cvlc
> /usr/bin/cvlc: a /usr/bin/sh script, ASCII text executable

$ cat /usr/bin/cvlc
> #!/usr/bin/sh
> exec /usr/bin/vlc -I "dummy" "$@"

$ ldd /usr/bin/vlc
>	linux-vdso.so.1 (0x00007ffde278a000)
>	libvlc.so.5 => /lib64/libvlc.so.5 (0x00007fa0cd846000)
>	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa0cd825000)
>	libdl.so.2 => /lib64/libdl.so.2 (0x00007fa0cd81e000)
>	libc.so.6 => /lib64/libc.so.6 (0x00007fa0cd659000)
>	libvlccore.so.9 => /lib64/libvlccore.so.9 (0x00007fa0cd566000)
>	libm.so.6 => /lib64/libm.so.6 (0x00007fa0cd420000)
>	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fa0cd404000)
>	/lib64/ld-linux-x86-64.so.2 (0x00007fa0cd8a8000)
>	libidn.so.12 => /lib64/libidn.so.12 (0x00007fa0cd3cf000)
>	libdbus-1.so.3 => /lib64/libdbus-1.so.3 (0x00007fa0cd379000)
>	libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007fa0cd2c9000)
>	librt.so.1 => /lib64/librt.so.1 (0x00007fa0cd2be000)
>	liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fa0cd293000)
>	liblz4.so.1 => /lib64/liblz4.so.1 (0x00007fa0cd274000)
>	libgcrypt.so.20 => /lib64/libgcrypt.so.20 (0x00007fa0cd156000)
>	libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x00007fa0cd133000

$ ldd /lib64/libvlc.so.5
>	linux-vdso.so.1 (0x00007ffe8d16d000)
>	libvlccore.so.9 => /lib64/libvlccore.so.9 (0x00007f9b58364000)
>	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f9b58343000)
>	libm.so.6 => /lib64/libm.so.6 (0x00007f9b581fd000)
>	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f9b581e3000)
>	libc.so.6 => /lib64/libc.so.6 (0x00007f9b5801e000)
>	libidn.so.12 => /lib64/libidn.so.12 (0x00007f9b57fe9000)
>	libdl.so.2 => /lib64/libdl.so.2 (0x00007f9b57fe0000)
>	libdbus-1.so.3 => /lib64/libdbus-1.so.3 (0x00007f9b57f8a000)
>	/lib64/ld-linux-x86-64.so.2 (0x00007f9b584b4000)
>	libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007f9b57eda000)
>	librt.so.1 => /lib64/librt.so.1 (0x00007f9b57ecf000)
>	liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f9b57ea6000)
>	liblz4.so.1 => /lib64/liblz4.so.1 (0x00007f9b57e87000)
>	libgcrypt.so.20 => /lib64/libgcrypt.so.20 (0x00007f9b57d67000)
>	libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x00007f9b57d44000)

$ ldd /lib64/libvlccore.so.9
>	linux-vdso.so.1 (0x00007ffe07fd9000)
>	libidn.so.12 => /lib64/libidn.so.12 (0x00007f2d05346000)
>	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f2d05325000)
>	libdl.so.2 => /lib64/libdl.so.2 (0x00007f2d0531e000)
>	libm.so.6 => /lib64/libm.so.6 (0x00007f2d051d8000)
>	libdbus-1.so.3 => /lib64/libdbus-1.so.3 (0x00007f2d05182000)
>	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f2d05168000)
>	libc.so.6 => /lib64/libc.so.6 (0x00007f2d04fa1000)
>	/lib64/ld-linux-x86-64.so.2 (0x00007f2d054a7000)
>	libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007f2d04ef1000)
>	librt.so.1 => /lib64/librt.so.1 (0x00007f2d04ee6000)
>	liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f2d04ebd000)
>	liblz4.so.1 => /lib64/liblz4.so.1 (0x00007f2d04e9e000)
>	libgcrypt.so.20 => /lib64/libgcrypt.so.20 (0x00007f2d04d80000)
>	libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x00007f2d04d5b000)

Comment 44 Carlos O'Donell 2019-04-09 13:47:27 UTC
(In reply to Jan Pokorný [poki] from comment #43)
> Interestingly, have just observed "Inconsistency detected by ld.so"
> message when running cvlc (vlc-core-3.0.5-6.fc30.x86_64, built on
> buildvm-06.virt.rpmfusion.net so the source is clear, yet still
> reproducible if need be).  Sadly, I've been resizing the terminal
> and got the message lost, but ISTR there was indeed "map" variable
> mentioned.
> 
> Cannot reproduce that anymore now, it hapenned when cvlc was run for
> the first time after boot, which happened several hours earlier).

We need a core from the abort. If you can get one we can try fix it :-)

Comment 45 Jan Pokorný [poki] 2019-04-09 14:33:31 UTC
Yep, would like to, but there's no trace of the coredump or even the
crash at sight.  The only idea is then that either the glibc source
of that message is not actually aborting loudly (the program has
terminated, though), or that vlc catches SIGABRT and turns that into
a graceful stop.  Either of that will make any troubleshooting tough
for sure.  Main intention is to keep trace of this or similar problem
recidivating to be possibly linked to any future observations,
hopefully with more info available.

Comment 46 Carlos O'Donell 2019-04-09 14:43:52 UTC
(In reply to Jan Pokorný [poki] from comment #45)
> Yep, would like to, but there's no trace of the coredump or even the
> crash at sight.  The only idea is then that either the glibc source
> of that message is not actually aborting loudly (the program has
> terminated, though), or that vlc catches SIGABRT and turns that into
> a graceful stop.  Either of that will make any troubleshooting tough
> for sure.  Main intention is to keep trace of this or similar problem
> recidivating to be possibly linked to any future observations,
> hopefully with more info available.

Agree, it helps to get reports of seeing this in the wild.


Note You need to log in before you can comment on or make changes to this bug.