Bug 1389881

Summary: Inconsistency detected by ld.so: dl-close.c: 811: _dl_close: Assertion `map->l_init_called' failed!
Product: [Fedora] Fedora Reporter: Baptiste Mille-Mathias <baptiste.millemathias>
Component: opensshAssignee: Jakub Jelen <jjelen>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 25CC: amahdal, andi.kleen, antonio, baptiste.millemathias, besser82, c.david86, chrisw, fherrera, fweimer, jan.vesely, jbowes, jjelen, mattias.ellert, mgrepl, plautrba, pstodulk, rkagan, ronald.j.bynoe, tmraz, tmz
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-28 09:24:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
ssh -v -v -v output against a f25 box (triggers assertion)
none
ssh -v -v -v output against a debian (no assertion) none

Description Baptiste Mille-Mathias 2016-10-29 09:12:19 UTC
Description of problem:
git reports the error "Inconsistency detected by ld.so: dl-close.c: 811: _dl_close: Assertion `map->l_init_called' failed!" while cloning repository.

it seems to affect the integrity of the checkout as I have random errors following message like impossible to checkout the repository. 

I reinstalled the git package but it didn't fixed the error

Version-Release number of selected component (if applicable):
2.9.3

How reproducible:
for me, it appears each time,

Comment 1 Petr Stodulka 2016-10-31 14:45:01 UTC
Could you provide reproducer? I can't reproduce it on other systems, neither colleague next to me on F25. Do you have updated system?

Comment 2 Petr Stodulka 2016-10-31 15:27:33 UTC
Just to be sure, you use just packages (including libraries) from F25, right? There are not some old libraries - like really old glibc and so - and you are not using some own version of F25 libraries, right?

Comment 3 Baptiste Mille-Mathias 2016-10-31 17:11:58 UTC
Hi, 

(In reply to pstodulk from comment #1)
> Could you provide reproducer? I can't reproduce it on other systems, neither
> colleague next to me on F25. Do you have updated system?

I just have one system so can't reproduce on other system. I do a a daily "dnf distro-sync".

(In reply to pstodulk from comment #2)
> Just to be sure, you use just packages (including libraries) from F25,
> right? There are not some old libraries - like really old glibc and so - and
> you are not using some own version of F25 libraries, right?

Well, I found some fc24 / fc23 packages, they seems to have not being rebuilded for FC25 (mass rebuild)
- rpm -qa | grep fc25 | wc -l
1439
- rpm -qa | grep fc24 | wc -l
492

rpm -qa --queryformat '%{size} %{name}-%{version}-%{release}\n'|sort -n | grep -v fc25 | tail -n 1
58576522 adobe-source-han-sans-cn-fonts-1.004-2.fc24

http://koji.fedoraproject.org/koji/packageinfo?buildOrder=-completion_time&packageID=18858&tagOrder=name&tagStart=0#buildlist

specifically for glibc I'm up to date
glibc-langpack-en-2.24-3.fc25.x86_64
glibc-langpack-fr-2.24-3.fc25.x86_64
glibc-devel-2.24-3.fc25.x86_64
glibc-common-2.24-3.fc25.x86_64
glibc-utils-2.24-3.fc25.x86_64
glibc-minimal-langpack-2.24-3.fc25.x86_64
glibc-headers-2.24-3.fc25.x86_64
glibc-2.24-3.fc25.x86_64

Comment 4 Baptiste Mille-Mathias 2016-10-31 17:30:06 UTC
The problem lies actually in ssh, as if I ssh directly to the git repository I experience problem with I hit the samel error.

18:28 $ ssh git.gnome.org
SSH authentication succeeded. Interactive login is not allowed.
Connection to git.gnome.org closed.
Inconsistency detected by ld.so: dl-close.c: 811: _dl_close: Assertion `map->l_init_called' failed!

could be related to https://bugzilla.redhat.com/show_bug.cgi?id=1264556

Comment 5 Jakub Jelen 2016-11-01 08:13:37 UTC
Can you provide the verbose log from ssh? Can you provide the backtrace from where the failure comes (to make sure it is the same problem as linked)? I don't think it will be something we can fix in openssh.

Comment 6 Baptiste Mille-Mathias 2016-11-01 11:01:57 UTC
hi Jakub,

Could you give me hints to get that specific backtrace, I know how to get one from a crash not from a running program.

Thanks

Comment 7 Fernando Herrera 2016-11-23 13:53:21 UTC
from ssh under gdb when the assertion is reached:

Breakpoint 1, _dl_close (_map=0x555555834a70) at dl-close.c:811
811	      assert (map->l_init_called);
(gdb) bt
#0  _dl_close (_map=0x555555834a70) at dl-close.c:811
#1  0x00007ffff7de7874 in _dl_catch_error (objname=0x55555582e1a0, errstring=0x55555582e1a8, mallocedp=0x55555582e198, operate=0x7ffff7349fd0 <dlclose_doit>, args=0x555555834a70) at dl-error.c:187
#2  0x00007ffff734a591 in _dlerror_run (operate=operate@entry=0x7ffff7349fd0 <dlclose_doit>, args=0x555555834a70) at dlerror.c:163
#3  0x00007ffff7349fff in __dlclose (handle=<optimized out>) at dlclose.c:46
#4  0x00007ffff3e84315 in krb5int_close_plugin () from /lib64/libkrb5support.so.0
#5  0x00007ffff644709a in releaseMechInfo () from /lib64/libgssapi_krb5.so.2
#6  0x00007ffff643dc44 in gssint_mechglue_fini () from /lib64/libgssapi_krb5.so.2
#7  0x00007ffff7de811a in _dl_fini () at dl-fini.c:235
#8  0x00007ffff598b420 in __run_exit_handlers (status=0, listp=0x7ffff5d115b8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:83
#9  0x00007ffff598b47a in __GI_exit (status=<optimized out>) at exit.c:105
#10 0x00007ffff5971408 in __libc_start_main (main=0x55555555dcc0 <main>, argc=5, argv=0x7fffffffe058, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe048) at ../csu/libc-start.c:323
#11 0x000055555556152a in _start ()
(gdb) p *map
$1 = {l_addr = 140737157447680, l_name = 0x555555832f30 "/usr/lib64/gssntlmssp/gssntlmssp.so", l_ld = 0x7fffec685ac0, l_next = 0x555555834f30, l_prev = 0x555555834180, l_real = 0x555555834a70, l_ns = 0, l_libname = 0x555555834ee8, 
  l_info = {0x0, 0x7fffec685b60, 0x0, 0x7fffec685c30, 0x0, 0x7fffec685bf0, 0x7fffec685c00, 0x7fffec685c40, 0x7fffec685c50, 0x7fffec685c60, 0x7fffec685c10, 0x7fffec685c20, 0x7fffec685b80, 0x7fffec685b90, 0x7fffec685b70, 0x0, 0x0, 0x0, 
    0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7fffec685c80, 0x7fffec685ba0, 0x7fffec685bc0, 0x7fffec685bb0, 0x7fffec685bd0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7fffec685ca0, 0x7fffec685c90, 0x0, 0x0, 0x7fffec685c80, 0x0, 0x7fffec685cc0, 0x0, 0x0, 0x0, 
    0x0, 0x0, 0x0, 0x0, 0x0, 0x7fffec685cb0, 0x0 <repeats 25 times>, 0x7fffec685be0}, l_phdr = 0x7fffec46c040, l_entry = 140737157457440, l_phnum = 7, l_ldnum = 38, l_searchlist = {r_list = 0x555555832e30, r_nlist = 21}, 
  l_symbolic_searchlist = {r_list = 0x555555834ee0, r_nlist = 0}, l_loader = 0x0, l_versions = 0x555555833d70, l_nversions = 13, l_nbuckets = 17, l_gnu_bitmask_idxbits = 3, l_gnu_shift = 8, l_gnu_bitmask = 0x7fffec46c200, {
    l_gnu_buckets = 0x7fffec46c220, l_chain = 0x7fffec46c220}, {l_gnu_chain_zero = 0x7fffec46c104, l_buckets = 0x7fffec46c104}, l_direct_opencount = 1, l_type = lt_loaded, l_relocated = 1, l_init_called = 0, l_global = 0, 
  l_reserved = 0, l_phdr_allocated = 0, l_soname_added = 0, l_faked = 0, l_need_tls_init = 0, l_auditing = 0, l_audit_any_plt = 0, l_removed = 0, l_contiguous = 1, l_symbolic_in_local_scope = 0, l_free_initfini = 1, l_rpath_dirs = {
    dirs = 0xffffffffffffffff, malloced = 0}, l_reloc_result = 0x0, l_versyms = 0x7fffec46d5fe, l_origin = 0x5555558314e0 "/usr/lib64/gssntlmssp", l_map_start = 140737157447680, l_map_end = 140737159651416, l_text_end = 140737157554176, 
  l_scope_mem = {0x7ffff7ffe3e0, 0x555555834d28, 0x0, 0x0}, l_scope_max = 4, l_scope = 0x555555834dc8, l_local_scope = {0x555555834d28, 0x0}, l_file_id = {dev = 2067, ino = 270762}, l_runpath_dirs = {dirs = 0xffffffffffffffff, 
    malloced = 0}, l_initfini = 0x555555832d80, l_reldeps = 0x0, l_reldepsmax = 0, l_used = 1, l_feature_1 = 0, l_flags_1 = 9, l_flags = 0, l_idx = 36, l_mach = {plt = 0, gotplt = 0, tlsdesc_table = 0x0}, l_lookup_cache = {
    sym = 0x7fffec46cb08, type_class = 4, value = 0x5555558353e0, ret = 0x7fffebfebae8}, l_tls_initimage = 0x0, l_tls_initimage_size = 0, l_tls_blocksize = 0, l_tls_align = 0, l_tls_firstbyte_offset = 0, l_tls_offset = 0, 
  l_tls_modid = 0, l_tls_dtor_count = 0, l_relro_addr = 2201992, l_relro_size = 1656, l_serial = 36, l_audit = 0x555555834ee0}

Comment 8 Fernando Herrera 2016-11-23 13:54:36 UTC
Removing package gssntlmssp-0.7.0-1.fc25.x86_64 stops triggering the assertion.

Comment 9 Jakub Jelen 2016-11-23 14:32:34 UTC
Still not luck to reproduce this behavior. But it looks like related to the kerberos somehow. Are you authenticating to the git.gnome.org using kerberos? How does the output look like if you add -vvv to get verbose log?

Comment 10 Fernando Herrera 2016-11-23 14:41:59 UTC
(In reply to Jakub Jelen from comment #9)
> Still not luck to reproduce this behavior. But it looks like related to the
> kerberos somehow. Are you authenticating to the git.gnome.org using
> kerberos? How does the output look like if you add -vvv to get verbose log?

No, this happens when exiting ssh to another f25 box using password or pubkey authentication. When I do ssh to my rpi with debian I don't get the assertion on exit. I am attaching both logs.

Comment 11 Fernando Herrera 2016-11-23 14:42:53 UTC
Created attachment 1223280 [details]
ssh -v -v -v output against a f25 box (triggers assertion)

Comment 12 Fernando Herrera 2016-11-23 14:43:24 UTC
Created attachment 1223281 [details]
ssh -v -v -v output against a debian (no assertion)

Comment 13 Jakub Jelen 2016-11-23 15:00:27 UTC
Yes, the difference is that against Fedora box, the GSSAPI authentication is attempted and therefore kerberos libraries loaded. You can workaround it by disabling GSSAPIKeyExchange and GSSAPIAuthentication, but it is probably the only thing we can do from OpenSSH side. The rest is up to the kerberos guys to fix, probably.

    ssh -o GSSAPIKeyExchange=no -o GSSAPIAuthentication=no fer.name

or in ~/.ssh/config.

Comment 14 Fernando Herrera 2016-11-23 15:07:04 UTC
When running gdb the sequence is:

#0  call_init (l=0x5555558350a0, argc=argc@entry=5, argv=argv@entry=0x7fffffffe058, env=env@entry=0x555555813130) at dl-init.c:36
#1  0x00007ffff7de7adb in call_init (env=0x555555813130, argv=0x7fffffffe058, argc=5, l=<optimized out>) at dl-init.c:30
#2  _dl_init (main_map=main_map@entry=0x5555558350a0, argc=5, argv=0x7fffffffe058, env=0x555555813130) at dl-init.c:120
#3  0x00007ffff7deca36 in dl_open_worker (a=a@entry=0x7fffffffa900) at dl-open.c:564
#4  0x00007ffff7de7874 in _dl_catch_error (objname=objname@entry=0x7fffffffa8f0, errstring=errstring@entry=0x7fffffffa8f8, mallocedp=mallocedp@entry=0x7fffffffa8ef, operate=operate@entry=0x7ffff7dec6d0 <dl_open_worker>, 
    args=args@entry=0x7fffffffa900) at dl-error.c:187
#5  0x00007ffff7dec009 in _dl_open (file=0x555555830800 "/usr/lib64/gssntlmssp/gssntlmssp.so", mode=-2147479550, caller_dlopen=0x7ffff3e8416d <krb5int_open_plugin+237>, nsid=-2, argc=<optimized out>, argv=<optimized out>, 
    env=0x555555813130) at dl-open.c:649
#6  0x00007ffff7349f09 in dlopen_doit (a=a@entry=0x7fffffffab30) at dlopen.c:66
#7  0x00007ffff7de7874 in _dl_catch_error (objname=0x55555582e800, errstring=0x55555582e808, mallocedp=0x55555582e7f8, operate=0x7ffff7349eb0 <dlopen_doit>, args=0x7fffffffab30) at dl-error.c:187
#8  0x00007ffff734a591 in _dlerror_run (operate=operate@entry=0x7ffff7349eb0 <dlopen_doit>, args=args@entry=0x7fffffffab30) at dlerror.c:163
#9  0x00007ffff7349fa2 in __dlopen (file=file@entry=0x555555830800 "/usr/lib64/gssntlmssp/gssntlmssp.so", mode=mode@entry=4098) at dlopen.c:87
#10 0x00007ffff3e8416d in krb5int_open_plugin (filepath=0x555555830800 "/usr/lib64/gssntlmssp/gssntlmssp.so", h=h@entry=0x7fffffffac38, ep=ep@entry=0x7fffffffac40) at plugins.c:270
#11 0x00007ffff644c0e8 in gssint_get_mechanism (oid=<optimized out>) at g_initialize.c:1117
#12 0x00007ffff6446d1a in gss_init_sec_context (minor_status=0x555555834084, claimant_cred_handle=0x0, context_handle=0x555555834088, target_name=0x555555834170, req_mech_type=<optimized out>, req_flags=34, time_req=0, 
    input_chan_bindings=0x0, input_token=0x0, actual_mech_type=0x0, output_token=0x7fffffffadb0, ret_flags=0x0, time_rec=0x0) at g_init_sec_context.c:151
#13 0x00005555555a454f in ssh_gssapi_init_ctx ()
#14 0x00005555555a4825 in ssh_gssapi_check_mechanism ()
#15 0x0000555555571a75 in userauth_gssapi ()
#16 0x00005555555749d8 in userauth ()
#17 0x0000555555574b2f in input_userauth_failure ()
#18 0x000055555559d7d9 in ssh_dispatch_run ()
#19 0x000055555557468f in ssh_userauth2 ()
#20 0x000055555557088c in ssh_login ()
#21 0x000055555555f90b in main ()


that sets map->l_init_called to 1 for gssntlmssp.so

Then when closing the connection _dl_fini is called:
#0  _dl_fini () at dl-fini.c:213
#1  0x00007ffff598b420 in __run_exit_handlers (status=0, listp=0x7ffff5d115b8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:83
#2  0x00007ffff598b47a in __GI_exit (status=<optimized out>) at exit.c:105
#3  0x00007ffff5971408 in __libc_start_main (main=0x55555555dcc0 <main>, argc=5, argv=0x7fffffffe058, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe048) at ../csu/libc-start.c:323
#4  0x000055555556152a in _start ()


that sets map->l_init_called to 0 for gssntlmssp.so

but that function triggers an error that ends up in dl-close triggering the assertion:

#0  _dl_close (_map=0x5555558350a0) at dl-close.c:811
#1  0x00007ffff7de7874 in _dl_catch_error (objname=0x55555582e800, errstring=0x55555582e808, mallocedp=0x55555582e7f8, operate=0x7ffff7349fd0 <dlclose_doit>, args=0x5555558350a0) at dl-error.c:187
#2  0x00007ffff734a591 in _dlerror_run (operate=operate@entry=0x7ffff7349fd0 <dlclose_doit>, args=0x5555558350a0) at dlerror.c:163
#3  0x00007ffff7349fff in __dlclose (handle=<optimized out>) at dlclose.c:46
#4  0x00007ffff3e84315 in krb5int_close_plugin (h=0x555555830a30) at plugins.c:416
#5  0x00007ffff644709a in releaseMechInfo (pCf=pCf@entry=0x7fffffffdd00) at g_initialize.c:521
#6  0x00007ffff643dc44 in freeMechList () at g_initialize.c:935
#7  gssint_mechglue_fini () at g_initialize.c:148
#8  0x00007ffff7de811a in _dl_fini () at dl-fini.c:235
#9  0x00007ffff598b420 in __run_exit_handlers (status=0, listp=0x7ffff5d115b8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:83
#10 0x00007ffff598b47a in __GI_exit (status=<optimized out>) at exit.c:105
#11 0x00007ffff5971408 in __libc_start_main (main=0x55555555dcc0 <main>, argc=5, argv=0x7fffffffe058, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe048) at ../csu/libc-start.c:323
#12 0x000055555556152a in _start ()

Comment 15 Roman Kagan 2016-11-28 07:38:33 UTC
I'm also affected by this.

Seems like gssntlmssp doesn't match the libgssapi_krb5 API expectations:

# LD_DEBUG=all ssh some_host true
...
     13221:	calling init: /usr/lib64/gssntlmssp/gssntlmssp.so
     13221:	
     13221:	opening file=/usr/lib64/gssntlmssp/gssntlmssp.so [0]; direct_opencount=1
     13221:	
     13221:	symbol=gss_mech_initialize;  lookup in file=/usr/lib64/gssntlmssp/gssntlmssp.so [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libkrb5.so.3 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libk5crypto.so.3 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libcom_err.so.2 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libssl.so.10 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libcrypto.so.10 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libwbclient.so.0 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libz.so.1 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libunistring.so.2 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libgssapi_krb5.so.2 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libpthread.so.0 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libc.so.6 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libkrb5support.so.0 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libkeyutils.so.1 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libresolv.so.2 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libdl.so.2 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/usr/lib64/samba/libwinbind-client-samba4.so [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/usr/lib64/samba/libreplace-samba4.so [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libselinux.so.1 [0]
     13221:	symbol=gss_mech_initialize;  lookup in file=/lib64/libpcre.so.1 [0]
     13221:	/usr/lib64/gssntlmssp/gssntlmssp.so: error: symbol lookup error: undefined symbol: gss_mech_initialize (fatal)
...

and so on, in total five symbol lookup failures:

     13221:     /usr/lib64/gssntlmssp/gssntlmssp.so: error: symbol lookup error: undefined symbol: gss_mech_initialize (fatal)
     13221:     /usr/lib64/gssntlmssp/gssntlmssp.so: error: symbol lookup error: undefined symbol: gss_internal_release_oid (fatal)
     13221:     /usr/lib64/gssntlmssp/gssntlmssp.so: error: symbol lookup error: undefined symbol: gssspi_authorize_localname (fatal)
     13221:     /usr/lib64/gssntlmssp/gssntlmssp.so: error: symbol lookup error: undefined symbol: gssspi_import_sec_context_by_mech (fatal)
     13221:     /usr/lib64/gssntlmssp/gssntlmssp.so: error: symbol lookup error: undefined symbol: gssspi_import_cred_by_mech (fatal)


I.e. the module load apparently fails.  Then I guess error handling in libgssapi_krb5 causes double dlclose().

So the are probably two bugs: one gssntlmssp not providing the expected symbols, and another in libgssapi_krb5 not removing the failed module from the list and attempting to unload it again.

Comment 16 Roman Kagan 2016-11-28 08:21:08 UTC
OOPS, I took a look at krb5 code and it seems that not implementing part of the module API is OK.  So my analysis is incorrect, sorry for the noise.

Looks like this is a dup of bug 1264556.

Comment 17 Tomas Mraz 2016-11-28 09:24:45 UTC

*** This bug has been marked as a duplicate of bug 1398370 ***