Bug 1493965 - 389-ds-base crashed as part of ipa-server-intall
389-ds-base crashed as part of ipa-server-intall
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: 389-ds-base (Show other bugs)
26
Unspecified Unspecified
unspecified Severity unspecified
: ---
: ---
Assigned To: Rich Megginson
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2017-09-21 04:43 EDT by Lukas Slebodnik
Modified: 2017-11-15 15:17 EST (History)
10 users (show)

See Also:
Fixed In Version: 389-ds-base-1.3.6.11-1.fc26
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1496226 (view as bug list)
Environment:
Last Closed: 2017-11-15 15:17:16 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
valgrind log (31.48 KB, text/x-vhdl)
2017-09-21 12:12 EDT, Lukas Slebodnik
no flags Details
Tentative fix (2.54 KB, patch)
2017-09-25 10:52 EDT, thierry bordaz
no flags Details | Diff

  None (edit)
Description Lukas Slebodnik 2017-09-21 04:43:04 EDT
Description of problem:
Directory sever is part of freIPA and I noticed a crash as part of installation of ipa server.

Version-Release number of selected component (if applicable):
sh# rpm -q 389-ds-base
389-ds-base-1.3.7.3-1.fc27.x86_64

How reproducible:
Deterministic

Additional info:
sh# coredumpctl info
           PID: 18988 (ns-slapd)
           UID: 389 (dirsrv)
           GID: 389 (dirsrv)
        Signal: 11 (SEGV)
     Timestamp: Thu 2017-09-21 04:18:34 EDT (23min ago)
  Command Line: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-TESTRELM-TEST -i /var/run/dirsrv/slapd-TESTRELM-TEST.pid
    Executable: /usr/sbin/ns-slapd
 Control Group: /system.slice/system-dirsrv.slice/dirsrv@TESTRELM-TEST.service
          Unit: dirsrv@TESTRELM-TEST.service
         Slice: system-dirsrv.slice
       Boot ID: 3d59fc90a8fe480095660b18d3888f24
    Machine ID: 41bff4ef34a252914cb2c593320f4ee3
      Hostname: hp-dl380pgen8-02-vm-11.testrelm.test
       Storage: /var/lib/systemd/coredump/core.ns-slapd.389.3d59fc90a8fe480095660b18d3888f24.18988.1505981914000000.lz4
       Message: Process 18988 (ns-slapd) of user 389 dumped core.
                
                Stack trace of thread 18988:
                #0  0x00007f2a1706f1a3 _ZN8tcmalloc11ThreadCache21ReleaseToCentralCacheEPNS0_8FreeListEji (libtcmalloc.so.4)
                #1  0x00007f2a1706f52e _ZN8tcmalloc11ThreadCache8ScavengeEv (libtcmalloc.so.4)
                #2  0x00007f2a070dd0a3 __buf_free (libnssdbm3.so)
                #3  0x00007f2a070db7e7 hdestroy (libnssdbm3.so)
                #4  0x00007f2a070db93c hash_close (libnssdbm3.so)
                #5  0x00007f2a070c8103 dbs_close (libnssdbm3.so)
                #6  0x00007f2a070d3d9c certdb_Close (libnssdbm3.so)
                #7  0x00007f2a070d7506 nsslowcert_ClosePermCertDB (libnssdbm3.so)
                #8  0x00007f2a070d15f3 lg_Close (libnssdbm3.so)
                #9  0x00007f2a07607b18 sftkdb_CloseDB (libsoftokn3.so)
                #10 0x00007f2a075edcca sftk_DBShutdown (libsoftokn3.so)
                #11 0x00007f2a075f21f2 SFTK_ShutdownSlot (libsoftokn3.so)
                #12 0x00007f2a075f2482 SFTK_DestroySlotData (libsoftokn3.so)
                #13 0x00007f2a075f2d94 nscFreeAllSlots (libsoftokn3.so)
                #14 0x00007f2a075f342c nsc_CommonFinalize (libsoftokn3.so)
                #15 0x00007f2a075f363a NSC_Finalize (libsoftokn3.so)
                #16 0x00007f2a166ace11 SECMOD_UnloadModule (libnss3.so)
                #17 0x00007f2a166c6370 SECMOD_SlotDestroyModule (libnss3.so)
                #18 0x00007f2a166c24df PK11_DestroySlot (libnss3.so)
                #19 0x00007f2a166c6433 SECMOD_DestroyModule (libnss3.so)
                #20 0x00007f2a166c679a SECMOD_DestroyModuleListElement (libnss3.so)
                #21 0x00007f2a166c6c45 SECMOD_DestroyModuleList (libnss3.so)
                #22 0x00007f2a166c6cc9 SECMOD_Shutdown (libnss3.so)
                #23 0x00007f2a16685db2 nss_Shutdown (libnss3.so)
                #24 0x00007f2a16685ead NSS_Shutdown (libnss3.so)
                #25 0x0000559056655381 main (ns-slapd)
                #26 0x00007f2a14f4703a __libc_start_main (libc.so.6)
                #27 0x0000559056656f5a _start (ns-slapd)
                
                Stack trace of thread 19013:
                #0  0x00007f2a150402e6 epoll_pwait (libc.so.6)
                #1  0x00007f2a15330c5b epoll_dispatch (libevent-2.0.so.5)
                #2  0x00007f2a1531b8de event_base_loop (libevent-2.0.so.5)
                #3  0x00007f2a1852bcbe ns_event_fw_loop (libnunc-stans.so.0)
                #4  0x00007f2a1852a939 event_loop_thread_func (libnunc-stans.so.0)
                #5  0x00007f2a157e4609 start_thread (libpthread.so.0)
                #6  0x00007f2a1504017f __clone (libc.so.6)
Comment 1 Lukas Slebodnik 2017-09-21 04:43:41 EDT
sh$ rpm -q freeipa-server selinux-policy
freeipa-server-4.6.0-3.fc27.x86_64
selinux-policy-3.13.1-283.3.fc27.noarch
Comment 3 Lukas Slebodnik 2017-09-21 04:51:26 EDT
And maybe version of nss are important as well

sh$ rpm -qf /usr/lib64/libnss3.so /usr/lib64/libsoftokn3.so /usr/lib64/libnssdbm3.so 
nss-3.32.1-1.0.fc27.x86_64
nss-softokn-3.32.0-7.fc27.x86_64
nss-softokn-3.32.0-7.fc27.x86_64
Comment 4 Lukas Slebodnik 2017-09-21 05:38:53 EDT
I found few other crashes on different machines:

           PID: 22948 (ns-slapd)
           UID: 389 (dirsrv)
           GID: 389 (dirsrv)
        Signal: 11 (SEGV)
     Timestamp: Wed 2017-09-20 20:25:05 EDT (9h ago)
  Command Line: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-TESTRELM-TEST -i /var/run/dirsrv/slapd-TESTRELM-TEST.pid
    Executable: /usr/sbin/ns-slapd
 Control Group: /system.slice/system-dirsrv.slice/dirsrv@TESTRELM-TEST.service
          Unit: dirsrv@TESTRELM-TEST.service
         Slice: system-dirsrv.slice
       Boot ID: 4e2e1bd001054fdeabccd44da5e812c8
    Machine ID: ef62f1c5934c4ad3b714fd4a132b57d4
      Hostname: hp-dl585g5-01.testrelm.test
       Storage: /var/lib/systemd/coredump/core.ns-slapd.389.4e2e1bd001054fdeabccd44da5e812c8.22948.1505953505000000.lz4
       Message: Process 22948 (ns-slapd) of user 389 dumped core.
                
                Stack trace of thread 22948:
                #0  0x00007f497baee1a3 _ZN8tcmalloc11ThreadCache21ReleaseToCentralCacheEPNS0_8FreeListEji (libtcmalloc.so.4)
                #1  0x00007f497baee52e _ZN8tcmalloc11ThreadCache8ScavengeEv (libtcmalloc.so.4)
                #2  0x00007f4979a7e28d closedir (libc.so.6)
                #3  0x00007f497cd164d8 remove_and_update (libslapd.so.0)
                #4  0x00007f497cd16638 remove_slapd_process (libslapd.so.0)
                #5  0x00007f49799e0bd8 __run_exit_handlers (libc.so.6)
                #6  0x00007f49799e0c2a exit (libc.so.6)
                #7  0x00007f49799c6041 __libc_start_main (libc.so.6)
                #8  0x000055c105b2ef5a _start (ns-slapd)
Comment 5 Lukas Slebodnik 2017-09-21 05:39:11 EDT
           PID: 21424 (ns-slapd)
           UID: 389 (dirsrv)
           GID: 389 (dirsrv)
        Signal: 11 (SEGV)
     Timestamp: Wed 2017-09-20 22:10:58 EDT (7h ago)
  Command Line: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-TESTRELM-TEST -i /var/run/dirsrv/slapd-TESTRELM-TEST.pid
    Executable: /usr/sbin/ns-slapd
 Control Group: /system.slice/system-dirsrv.slice/dirsrv@TESTRELM-TEST.service
          Unit: dirsrv@TESTRELM-TEST.service
         Slice: system-dirsrv.slice
       Boot ID: c823afc1f3e445b98c32c90464aca5b7
    Machine ID: c0d65234e1954eafb26208ccb021c843
      Hostname: hp-dl120g5-01.testrelm.test
       Storage: /var/lib/systemd/coredump/core.ns-slapd.389.c823afc1f3e445b98c32c90464aca5b7.21424.1505959858000000.lz4
       Message: Process 21424 (ns-slapd) of user 389 dumped core.
                
                Stack trace of thread 21424:
                #0  0x00007f6172e7c1a3 _ZN8tcmalloc11ThreadCache21ReleaseToCentralCacheEPNS0_8FreeListEji (libtcmalloc.so.4)
                #1  0x00007f6172e7c52e _ZN8tcmalloc11ThreadCache8ScavengeEv (libtcmalloc.so.4)
                #2  0x00007f61633ff554 SFTK_DestroySlotData (libsoftokn3.so)
                #3  0x00007f61633ffd94 nscFreeAllSlots (libsoftokn3.so)
                #4  0x00007f616340042c nsc_CommonFinalize (libsoftokn3.so)
                #5  0x00007f616340063a NSC_Finalize (libsoftokn3.so)
                #6  0x00007f61724b9e11 SECMOD_UnloadModule (libnss3.so)
                #7  0x00007f61724d3370 SECMOD_SlotDestroyModule (libnss3.so)
                #8  0x00007f61724cf4df PK11_DestroySlot (libnss3.so)
                #9  0x00007f61724d3433 SECMOD_DestroyModule (libnss3.so)
                #10 0x00007f61724d379a SECMOD_DestroyModuleListElement (libnss3.so)
                #11 0x00007f61724d3c45 SECMOD_DestroyModuleList (libnss3.so)
                #12 0x00007f61724d3cc9 SECMOD_Shutdown (libnss3.so)
                #13 0x00007f6172492db2 nss_Shutdown (libnss3.so)
                #14 0x00007f6172492ead NSS_Shutdown (libnss3.so)
                #15 0x0000556395775381 main (ns-slapd)
                #16 0x00007f6170d5403a __libc_start_main (libc.so.6)
                #17 0x0000556395776f5a _start (ns-slapd)
                
                Stack trace of thread 21449:
                #0  0x00007f6170e4d2e6 epoll_pwait (libc.so.6)
                #1  0x00007f617113dc5b epoll_dispatch (libevent-2.0.so.5)
                #2  0x00007f61711288de event_base_loop (libevent-2.0.so.5)
                #3  0x00007f6174338cbe ns_event_fw_loop (libnunc-stans.so.0)
                #4  0x00007f6174337939 event_loop_thread_func (libnunc-stans.so.0)
                #5  0x00007f61715f1609 start_thread (libpthread.so.0)
                #6  0x00007f6170e4d17f __clone (libc.so.6)
Comment 6 Lukas Slebodnik 2017-09-21 05:39:30 EDT
           PID: 19732 (ns-slapd)
           UID: 389 (dirsrv)
           GID: 389 (dirsrv)
        Signal: 11 (SEGV)
     Timestamp: Wed 2017-09-20 21:46:24 EDT (7h ago)
  Command Line: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-TESTRELM-TEST -i /var/run/dirsrv/slapd-TESTRELM-TEST.pid
    Executable: /usr/sbin/ns-slapd
 Control Group: /system.slice/system-dirsrv.slice/dirsrv@TESTRELM-TEST.service
          Unit: dirsrv@TESTRELM-TEST.service
         Slice: system-dirsrv.slice
       Boot ID: a9165bb26d374913b7bd700309a8a49b
    Machine ID: 85798e17e07f460490da106a2201860f
      Hostname: kvm-02-guest21.testrelm.test
       Storage: /var/lib/systemd/coredump/core.ns-slapd.389.a9165bb26d374913b7bd700309a8a49b.19732.1505958384000000.lz4
       Message: Process 19732 (ns-slapd) of user 389 dumped core.
                
                Stack trace of thread 19732:
                #0  0x00007f521ffd11a3 _ZN8tcmalloc11ThreadCache21ReleaseToCentralCacheEPNS0_8FreeListEji (libtcmalloc.so.4)
                #1  0x00007f521ffd152e _ZN8tcmalloc11ThreadCache8ScavengeEv (libtcmalloc.so.4)
                #2  0x00007f521003f0a3 __buf_free (libnssdbm3.so)
                #3  0x00007f521003d7e7 hdestroy (libnssdbm3.so)
                #4  0x00007f521003d93c hash_close (libnssdbm3.so)
                #5  0x00007f521002a103 dbs_close (libnssdbm3.so)
                #6  0x00007f5210035d9c certdb_Close (libnssdbm3.so)
                #7  0x00007f5210039506 nsslowcert_ClosePermCertDB (libnssdbm3.so)
                #8  0x00007f52100335f3 lg_Close (libnssdbm3.so)
                #9  0x00007f5210569b18 sftkdb_CloseDB (libsoftokn3.so)
                #10 0x00007f521054fcca sftk_DBShutdown (libsoftokn3.so)
                #11 0x00007f52105541f2 SFTK_ShutdownSlot (libsoftokn3.so)
                #12 0x00007f5210554482 SFTK_DestroySlotData (libsoftokn3.so)
                #13 0x00007f5210554d94 nscFreeAllSlots (libsoftokn3.so)
                #14 0x00007f521055542c nsc_CommonFinalize (libsoftokn3.so)
                #15 0x00007f521055563a NSC_Finalize (libsoftokn3.so)
                #16 0x00007f521f60ee11 SECMOD_UnloadModule (libnss3.so)
                #17 0x00007f521f628370 SECMOD_SlotDestroyModule (libnss3.so)
                #18 0x00007f521f6244df PK11_DestroySlot (libnss3.so)
                #19 0x00007f521f628433 SECMOD_DestroyModule (libnss3.so)
                #20 0x00007f521f62879a SECMOD_DestroyModuleListElement (libnss3.so)
                #21 0x00007f521f628c45 SECMOD_DestroyModuleList (libnss3.so)
                #22 0x00007f521f628cc9 SECMOD_Shutdown (libnss3.so)
                #23 0x00007f521f5e7db2 nss_Shutdown (libnss3.so)
                #24 0x00007f521f5e7ead NSS_Shutdown (libnss3.so)
                #25 0x0000557a8908c381 main (ns-slapd)
                #26 0x00007f521dea903a __libc_start_main (libc.so.6)
                #27 0x0000557a8908df5a _start (ns-slapd)
                
                Stack trace of thread 19757:
                #0  0x00007f521dfa22e6 epoll_pwait (libc.so.6)
                #1  0x00007f521e292c5b epoll_dispatch (libevent-2.0.so.5)
                #2  0x00007f521e27d8de event_base_loop (libevent-2.0.so.5)
                #3  0x00007f522148dcbe ns_event_fw_loop (libnunc-stans.so.0)
                #4  0x00007f522148c939 event_loop_thread_func (libnunc-stans.so.0)
                #5  0x00007f521e746609 start_thread (libpthread.so.0)
                #6  0x00007f521dfa217f __clone (libc.so.6)

           PID: 3208 (ns-slapd)
           UID: 389 (dirsrv)
           GID: 389 (dirsrv)
        Signal: 11 (SEGV)
     Timestamp: Thu 2017-09-21 05:10:02 EDT (26min ago)
  Command Line: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-KDOM-02-GUEST21-TESTRELM-TEST -i /var/run/dirsrv/slapd-KDOM-02-GUEST21-TESTRELM-TEST.pid
    Executable: /usr/sbin/ns-slapd
 Control Group: /system.slice/system-dirsrv.slice/dirsrv@KDOM-02-GUEST21-TESTRELM-TEST.service
          Unit: dirsrv@KDOM-02-GUEST21-TESTRELM-TEST.service
         Slice: system-dirsrv.slice
       Boot ID: a9165bb26d374913b7bd700309a8a49b
    Machine ID: 85798e17e07f460490da106a2201860f
      Hostname: kvm-02-guest21.testrelm.test
       Storage: /var/lib/systemd/coredump/core.ns-slapd.389.a9165bb26d374913b7bd700309a8a49b.3208.1505985002000000.lz4
       Message: Process 3208 (ns-slapd) of user 389 dumped core.
                
                Stack trace of thread 3208:
                #0  0x00007f25478791a3 _ZN8tcmalloc11ThreadCache21ReleaseToCentralCacheEPNS0_8FreeListEji (libtcmalloc.so.4)
                #1  0x00007f254787952e _ZN8tcmalloc11ThreadCache8ScavengeEv (libtcmalloc.so.4)
                #2  0x00007f2546869984 PL_HashTableRemove (libplds4.so)
                #3  0x00007f25383efd9f nscFreeAllSlots (libsoftokn3.so)
                #4  0x00007f25383f042c nsc_CommonFinalize (libsoftokn3.so)
                #5  0x00007f25383f063a NSC_Finalize (libsoftokn3.so)
                #6  0x00007f2546eb6e11 SECMOD_UnloadModule (libnss3.so)
                #7  0x00007f2546ed0370 SECMOD_SlotDestroyModule (libnss3.so)
                #8  0x00007f2546ecc4df PK11_DestroySlot (libnss3.so)
                #9  0x00007f2546ed0433 SECMOD_DestroyModule (libnss3.so)
                #10 0x00007f2546ed079a SECMOD_DestroyModuleListElement (libnss3.so)
                #11 0x00007f2546ed0c45 SECMOD_DestroyModuleList (libnss3.so)
                #12 0x00007f2546ed0cc9 SECMOD_Shutdown (libnss3.so)
                #13 0x00007f2546e8fdb2 nss_Shutdown (libnss3.so)
                #14 0x00007f2546e8fead NSS_Shutdown (libnss3.so)
                #15 0x0000560b3bb07381 main (ns-slapd)
                #16 0x00007f254575103a __libc_start_main (libc.so.6)
                #17 0x0000560b3bb08f5a _start (ns-slapd)
                
                Stack trace of thread 3225:
                #0  0x00007f254584a2e6 epoll_pwait (libc.so.6)
                #1  0x00007f2545b3ac5b epoll_dispatch (libevent-2.0.so.5)
                #2  0x00007f2545b258de event_base_loop (libevent-2.0.so.5)
                #3  0x00007f2548d35cbe ns_event_fw_loop (libnunc-stans.so.0)
                #4  0x00007f2548d34939 event_loop_thread_func (libnunc-stans.so.0)
                #5  0x00007f2545fee609 start_thread (libpthread.so.0)
                #6  0x00007f254584a17f __clone (libc.so.6)

           PID: 3283 (ns-slapd)
           UID: 389 (dirsrv)
           GID: 389 (dirsrv)
        Signal: 11 (SEGV)
     Timestamp: Thu 2017-09-21 05:10:16 EDT (26min ago)
  Command Line: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-KDOM-02-GUEST21-TESTRELM-TEST -i /var/run/dirsrv/slapd-KDOM-02-GUEST21-TESTRELM-TEST.pid
    Executable: /usr/sbin/ns-slapd
 Control Group: /system.slice/system-dirsrv.slice/dirsrv@KDOM-02-GUEST21-TESTRELM-TEST.service
          Unit: dirsrv@KDOM-02-GUEST21-TESTRELM-TEST.service
         Slice: system-dirsrv.slice
       Boot ID: a9165bb26d374913b7bd700309a8a49b
    Machine ID: 85798e17e07f460490da106a2201860f
      Hostname: kvm-02-guest21.testrelm.test
       Storage: /var/lib/systemd/coredump/core.ns-slapd.389.a9165bb26d374913b7bd700309a8a49b.3283.1505985016000000.lz4
       Message: Process 3283 (ns-slapd) of user 389 dumped core.
                
                Stack trace of thread 3283:
                #0  0x00007fe3f839d1a3 _ZN8tcmalloc11ThreadCache21ReleaseToCentralCacheEPNS0_8FreeListEji (libtcmalloc.so.4)
                #1  0x00007fe3f839d52e _ZN8tcmalloc11ThreadCache8ScavengeEv (libtcmalloc.so.4)
                #2  0x00007fe3e88e0554 SFTK_DestroySlotData (libsoftokn3.so)
                #3  0x00007fe3e88e0d94 nscFreeAllSlots (libsoftokn3.so)
                #4  0x00007fe3e88e142c nsc_CommonFinalize (libsoftokn3.so)
                #5  0x00007fe3e88e163a NSC_Finalize (libsoftokn3.so)
                #6  0x00007fe3f79dae11 SECMOD_UnloadModule (libnss3.so)
                #7  0x00007fe3f79f4370 SECMOD_SlotDestroyModule (libnss3.so)
                #8  0x00007fe3f79f04df PK11_DestroySlot (libnss3.so)
                #9  0x00007fe3f79f4433 SECMOD_DestroyModule (libnss3.so)
                #10 0x00007fe3f79f479a SECMOD_DestroyModuleListElement (libnss3.so)
                #11 0x00007fe3f79f4c45 SECMOD_DestroyModuleList (libnss3.so)
                #12 0x00007fe3f79f4cc9 SECMOD_Shutdown (libnss3.so)
                #13 0x00007fe3f79b3db2 nss_Shutdown (libnss3.so)
                #14 0x00007fe3f79b3ead NSS_Shutdown (libnss3.so)
                #15 0x0000563fc9a95381 main (ns-slapd)
                #16 0x00007fe3f627503a __libc_start_main (libc.so.6)
                #17 0x0000563fc9a96f5a _start (ns-slapd)
                
                Stack trace of thread 3304:
                #0  0x00007fe3f636e2e6 epoll_pwait (libc.so.6)
                #1  0x00007fe3f665ec5b epoll_dispatch (libevent-2.0.so.5)
                #2  0x00007fe3f66498de event_base_loop (libevent-2.0.so.5)
                #3  0x00007fe3f9859cbe ns_event_fw_loop (libnunc-stans.so.0)
                #4  0x00007fe3f9858939 event_loop_thread_func (libnunc-stans.so.0)
                #5  0x00007fe3f6b12609 start_thread (libpthread.so.0)
                #6  0x00007fe3f636e17f __clone (libc.so.6)
Comment 7 thierry bordaz 2017-09-21 05:51:08 EDT
All crashes occurred at shutdown of ns-slapd. The crash is looking like a heap corruption detected by tcmalloc. Most of the crash are detected during NSS_Shutdown BUT https://bugzilla.redhat.com/show_bug.cgi?id=1493965#c4 shows that it is call before NSS_Shutdown.

So if it is a corruption happening in nss it is NOT related to NSS_Shutdown.
IMHO, DS should have created a corruption somewhere.

It would be interesting to rerun with valgrind or memory analyser.
Comment 8 Ludwig 2017-09-21 06:07:09 EDT
(In reply to thierry bordaz from comment #7)
> All crashes occurred at shutdown of ns-slapd. The crash is looking like a
> heap corruption detected by tcmalloc. Most of the crash are detected during
> NSS_Shutdown BUT https://bugzilla.redhat.com/show_bug.cgi?id=1493965#c4
> shows that it is call before NSS_Shutdown.

but remove_slapd_process() is registered to be called atexit(), so it is after NSS_Shutdown
> 
> So if it is a corruption happening in nss it is NOT related to NSS_Shutdown.
> IMHO, DS should have created a corruption somewhere.

If it would be a general heap corruption generated by DS I would expect the crash to happen anytime, not only during shutdown

> 
> It would be interesting to rerun with valgrind or memory analyser.
Comment 9 Lukas Slebodnik 2017-09-21 12:12 EDT
Created attachment 1329068 [details]
valgrind log
Comment 10 Lukas Slebodnik 2017-09-21 15:29:30 EDT
I tested 389-ds from copr repo mreynolds/389-ds-base which use glibc instead of tcmalloc and it crashed as well.

sh# rpm -q 389-ds-base
389-ds-base-1.3.7.4-2.fc28.x86_64

           PID: 19768 (ns-slapd)
           UID: 389 (dirsrv)
           GID: 389 (dirsrv)
        Signal: 6 (ABRT)
     Timestamp: Thu 2017-09-21 15:09:19 EDT (15min ago)
  Command Line: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-TESTRELM-TEST -i /var/run/dirsrv/slapd-TESTRELM-TEST.pid
    Executable: /usr/sbin/ns-slapd
 Control Group: /system.slice/system-dirsrv.slice/dirsrv@TESTRELM-TEST.service
          Unit: dirsrv@TESTRELM-TEST.service
         Slice: system-dirsrv.slice
       Boot ID: 2c9a9cf0384443b0bca5c271ea5aee0d
    Machine ID: 41bff4ef34a252914cb2c593320f4ee3
      Hostname: hp-dl380pgen8-02-vm-11.testrelm.test
       Storage: /var/lib/systemd/coredump/core.ns-slapd.389.2c9a9cf0384443b0bca5c271ea5aee0d.19768.1506020959000000.lz4
       Message: Process 19768 (ns-slapd) of user 389 dumped core.
                
                Stack trace of thread 19768:
                #0  0x00007f573caa46bb raise (libc.so.6)
                #1  0x00007f573caa6447 abort (libc.so.6)
                #2  0x00007f573caeeb87 __libc_message (libc.so.6)
                #3  0x00007f573caf5efe malloc_printerr (libc.so.6)
                #4  0x00007f573caf79f9 _int_free (libc.so.6)
                #5  0x00007f573cb0035e __libc_free (libc.so.6)
                #6  0x00007f573f98de91 slapi_ch_free (libslapd.so.0)
                #7  0x00007f573fa1c582 vattr_map_entry_free (libslapd.so.0)
                #8  0x00007f573fa1c5ad vattr_he_cleanup_fn (libslapd.so.0)
                #9  0x00007f573dba6ac0 PL_HashTableEnumerateEntries (libplds4.so)
                #10 0x00007f573fa1be62 vattr_cleanup (libslapd.so.0)
                #11 0x000055bee8bf6643 main (ns-slapd)
                #12 0x00007f573ca8e03a __libc_start_main (libc.so.6)
                #13 0x000055bee8bf6f4a _start (ns-slapd)
                
                Stack trace of thread 19786:
                #0  0x00007f573cb872e6 epoll_pwait (libc.so.6)
                #1  0x00007f573ce77c5b epoll_dispatch (libevent-2.0.so.5)
                #2  0x00007f573ce628de event_base_loop (libevent-2.0.so.5)
                #3  0x00007f573fc7dcae ns_event_fw_loop (libnunc-stans.so.0)
                #4  0x00007f573fc7c929 event_loop_thread_func (libnunc-stans.so.0)
                #5  0x00007f573d32b609 start_thread (libpthread.so.0)
                #6  0x00007f573cb8717f __clone (libc.so.6)

           PID: 20438 (ns-slapd)
           UID: 389 (dirsrv)
           GID: 389 (dirsrv)
        Signal: 6 (ABRT)
     Timestamp: Thu 2017-09-21 15:09:31 EDT (15min ago)
  Command Line: /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-TESTRELM-TEST -i /var/run/dirsrv/slapd-TESTRELM-TEST.pid
    Executable: /usr/sbin/ns-slapd
 Control Group: /system.slice/system-dirsrv.slice/dirsrv@TESTRELM-TEST.service
          Unit: dirsrv@TESTRELM-TEST.service
         Slice: system-dirsrv.slice
       Boot ID: 2c9a9cf0384443b0bca5c271ea5aee0d
    Machine ID: 41bff4ef34a252914cb2c593320f4ee3
      Hostname: hp-dl380pgen8-02-vm-11.testrelm.test
       Storage: /var/lib/systemd/coredump/core.ns-slapd.389.2c9a9cf0384443b0bca5c271ea5aee0d.20438.1506020971000000.lz4
       Message: Process 20438 (ns-slapd) of user 389 dumped core.
                
                Stack trace of thread 20438:
                #0  0x00007f3771edd6bb raise (libc.so.6)
                #1  0x00007f3771edf447 abort (libc.so.6)
                #2  0x00007f3771f27b87 __libc_message (libc.so.6)
                #3  0x00007f3771f2eefe malloc_printerr (libc.so.6)
                #4  0x00007f3771f309f9 _int_free (libc.so.6)
                #5  0x00007f3771f3935e __libc_free (libc.so.6)
                #6  0x00007f3774dc6e91 slapi_ch_free (libslapd.so.0)
                #7  0x00007f3774e55582 vattr_map_entry_free (libslapd.so.0)
                #8  0x00007f3774e555ad vattr_he_cleanup_fn (libslapd.so.0)
                #9  0x00007f3772fdfac0 PL_HashTableEnumerateEntries (libplds4.so)
                #10 0x00007f3774e54e62 vattr_cleanup (libslapd.so.0)
                #11 0x0000558e54f00643 main (ns-slapd)
                #12 0x00007f3771ec703a __libc_start_main (libc.so.6)
                #13 0x0000558e54f00f4a _start (ns-slapd)
                
                Stack trace of thread 20464:
                #0  0x00007f3771fc02e6 epoll_pwait (libc.so.6)
                #1  0x00007f37722b0c5b epoll_dispatch (libevent-2.0.so.5)
                #2  0x00007f377229b8de event_base_loop (libevent-2.0.so.5)
                #3  0x00007f37750b6cae ns_event_fw_loop (libnunc-stans.so.0)
                #4  0x00007f37750b5929 event_loop_thread_func (libnunc-stans.so.0)
                #5  0x00007f3772764609 start_thread (libpthread.so.0)
                #6  0x00007f3771fc017f __clone (libc.so.6)
Comment 12 mreynolds 2017-09-21 15:41:58 EDT
Looks like memory corruption.  Can this same test be run under valgrind?
Comment 16 Lukas Slebodnik 2017-09-22 12:48:30 EDT
Unfortunately, I cannot see any crash with ASAN build
389-ds-base-1.3.7.4-3.fc28.x86_64. I might try with valgrind but I expected that it will be simpler reproduce with ASAN because valgrind is much slower.
Comment 18 Lukas Slebodnik 2017-09-22 17:00:13 EDT
Based on following output it might be caused by issue in ipa plugins.
But ipa plugins are not buit with asan so we can see just libipa_uuid.so

=================================================================
==15203==ERROR: AddressSanitizer: heap-use-after-free on address 0x6060004cb260 at pc 0x7f7e0941d36e bp 0x7f7dd66aa1d0 sp 0x7f7dd66a9978
READ of size 53 at 0x6060004cb260 thread T40
    #0 0x7f7e0941d36d  (/lib64/libasan.so.4+0x5136d)
    #1 0x7f7e08d50e65 in slapi_dn_issuffix (/usr/lib64/dirsrv/libslapd.so.0+0xe7e65)
    #2 0x7f7df9934f8a  (/usr/lib64/dirsrv/plugins/libipa_uuid.so+0x2f8a)
    #3 0x7f7e08e115cc  (/usr/lib64/dirsrv/libslapd.so.0+0x1a85cc)
    #4 0x7f7e08e119e0 in plugin_call_plugins (/usr/lib64/dirsrv/libslapd.so.0+0x1a89e0)
    #5 0x7f7e08d20e7e  (/usr/lib64/dirsrv/libslapd.so.0+0xb7e7e)
    #6 0x7f7e08d23b68 in do_add (/usr/lib64/dirsrv/libslapd.so.0+0xbab68)
    #7 0x5626b422bb46  (/usr/sbin/ns-slapd+0x46b46)
    #8 0x7f7e06ca407a  (/lib64/libnspr4.so+0x2907a)
    #9 0x7f7e0663f608 in start_thread (/lib64/libpthread.so.0+0x7608)
    #10 0x7f7e05e9b17e in clone (/lib64/libc.so.6+0x11a17e)

0x6060004cb260 is located 0 bytes inside of 53-byte region [0x6060004cb260,0x6060004cb295)
freed by thread T40 here:
    #0 0x7f7e094aa4b8 in __interceptor_free (/lib64/libasan.so.4+0xde4b8)
    #1 0x7f7e08d3b2b8 in slapi_ch_free (/usr/lib64/dirsrv/libslapd.so.0+0xd22b8)
    #2 0x7f7e08d51c27 in slapi_sdn_done (/usr/lib64/dirsrv/libslapd.so.0+0xe8c27)
    #3 0x7f7e08d52941 in slapi_sdn_free (/usr/lib64/dirsrv/libslapd.so.0+0xe9941)
    #4 0x7f7e08e036d3 in slapi_pblock_set (/usr/lib64/dirsrv/libslapd.so.0+0x19a6d3)
    #5 0x7f7df9935527  (/usr/lib64/dirsrv/plugins/libipa_uuid.so+0x3527)

previously allocated by thread T40 here:
    #0 0x7f7e09443238 in __interceptor_strdup (/lib64/libasan.so.4+0x77238)
    #1 0x7f7e08d3afc9 in slapi_ch_strdup (/usr/lib64/dirsrv/libslapd.so.0+0xd1fc9)
    #2 0x7f7e08d52461 in slapi_sdn_set_normdn_byval (/usr/lib64/dirsrv/libslapd.so.0+0xe9461)
    #3 0x7f7e08d5253c in slapi_sdn_new_normdn_byval (/usr/lib64/dirsrv/libslapd.so.0+0xe953c)
    #4 0x7f7e08d20e24  (/usr/lib64/dirsrv/libslapd.so.0+0xb7e24)
    #5 0x7f7e08d23b68 in do_add (/usr/lib64/dirsrv/libslapd.so.0+0xbab68)
    #6 0x5626b422bb46  (/usr/sbin/ns-slapd+0x46b46)
    #7 0x7f7e06ca407a  (/lib64/libnspr4.so+0x2907a)

Thread T40 created by T0 here:
    #0 0x7f7e09403a2f in pthread_create (/lib64/libasan.so.4+0x37a2f)
    #1 0x7f7e06ca3d59  (/lib64/libnspr4.so+0x28d59)

SUMMARY: AddressSanitizer: heap-use-after-free (/lib64/libasan.so.4+0x5136d) 
Shadow bytes around the buggy address:
  0x0c0c800915f0: 00 00 01 fa fa fa fa fa fd fd fd fd fd fd fd fa
  0x0c0c80091600: fa fa fa fa fd fd fd fd fd fd fd fd fa fa fa fa
  0x0c0c80091610: fd fd fd fd fd fd fd fd fa fa fa fa 00 00 00 00
  0x0c0c80091620: 00 00 05 fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c0c80091630: fa fa fa fa fd fd fd fd fd fd fd fd fa fa fa fa
=>0x0c0c80091640: 00 00 00 00 00 00 05 fa fa fa fa fa[fd]fd fd fd
  0x0c0c80091650: fd fd fd fa fa fa fa fa fd fd fd fd fd fd fd fa
  0x0c0c80091660: fa fa fa fa fd fd fd fd fd fd fd fd fa fa fa fa
  0x0c0c80091670: fd fd fd fd fd fd fd fd fa fa fa fa fd fd fd fd
  0x0c0c80091680: fd fd fd fa fa fa fa fa fd fd fd fd fd fd fd fa
  0x0c0c80091690: fa fa fa fa fd fd fd fd fd fd fd fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==15203==ABORTING
Comment 19 Lukas Slebodnik 2017-09-22 17:30:12 EDT
And here is related part of ipaserver-install log when dirsrv failed due to asan heap-use-after-free

2017-09-22T21:18:37Z DEBUG   [35/45]: creating default HBAC rule allow_all
2017-09-22T21:18:37Z DEBUG Starting external process
2017-09-22T21:18:37Z DEBUG args=/usr/bin/ldapmodify -v -f /tmp/tmp77o8844b -H ldapi://%2fvar%2frun%2fslapd-TESTRELM-TEST.socket -Y EXTERNAL
2017-09-22T21:18:37Z DEBUG Process finished, return code=255
2017-09-22T21:18:37Z DEBUG stdout=add objectclass:
        ipaassociation
        ipahbacrule
add cn:
        allow_all
add accessruletype:
        allow
add usercategory:
        all
add hostcategory:
        all
add servicecategory:
        all
add ipaenabledflag:
        TRUE
add description:
        Allow all users to access any host from any host
add ipauniqueid:
        autogenerate
adding new entry "ipauniqueid=autogenerate,cn=hbac,dc=testrelm,dc=test"


2017-09-22T21:18:37Z DEBUG stderr=ldap_initialize( ldapi://%2Fvar%2Frun%2Fslapd-TESTRELM-TEST.socket/??base )
SASL/EXTERNAL authentication started
SASL username: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth
SASL SSF: 0
ldap_result: Can't contact LDAP server (-1)

2017-09-22T21:18:37Z CRITICAL Failed to load default-hbac.ldif: Command '/usr/bin/ldapmodify -v -f /tmp/tmp77o8844b -H ldapi://%2fvar%2frun%2fslapd-TESTRELM-TEST.socket -Y EXTERNAL' returned non-zero exit status 255.
2017-09-22T21:18:37Z DEBUG Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ipaserver/install/service.py", line 506, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python3.6/site-packages/ipaserver/install/service.py", line 496, in run_step
    method()
  File "/usr/lib/python3.6/site-packages/ipaserver/install/dsinstance.py", line 960, in add_hbac
    self._ldap_mod("default-hbac.ldif", self.sub_dict)
  File "/usr/lib/python3.6/site-packages/ipaserver/install/service.py", line 309, in _ldap_mod
    ipautil.run(args, nolog=nologlist)
  File "/usr/lib/python3.6/site-packages/ipapython/ipautil.py", line 523, in run
    raise CalledProcessError(p.returncode, arg_string, str(output))
subprocess.CalledProcessError: Command '/usr/bin/ldapmodify -v -f /tmp/tmp77o8844b -H ldapi://%2fvar%2frun%2fslapd-TESTRELM-TEST.socket -Y EXTERNAL' returned non-zero exit status 255.
Comment 20 thierry bordaz 2017-09-25 10:51:35 EDT
I think the problem is due to ipa-uuid plugin.
I am attaching a tentative patch but has not yet verified it.
Comment 21 thierry bordaz 2017-09-25 10:52 EDT
Created attachment 1330595 [details]
Tentative fix
Comment 22 Lukas Slebodnik 2017-09-25 10:58:05 EDT
Stanislav :-),
Could you prepare copr build for 4.6.1 with patch from comment 21?
Comment 23 Ludwig 2017-09-25 11:06:10 EDT
(In reply to thierry bordaz from comment #20)
> I think the problem is due to ipa-uuid plugin.
> I am attaching a tentative patch but has not yet verified it.

I think your patch is good, but I also think that the plugin code hasn't changed for years and I couldn't find a change in DS which would trigger this now.
Maybe the problem was there for some time and now ASAN catches it and we do not get to the shutdown crash.
Comment 24 thierry bordaz 2017-09-25 11:16:18 EDT
Ludwig, I fully agree with your comment. I was reviewing the previous crashes and was getting to the same concern. The pb found by ASAN is during a ADD, so far from initial shutdown crashes. 

So two different bugs ? I do not know. I am really looking forward the test result that Stanislas and Lukas will run.
Comment 25 Lukas Slebodnik 2017-09-25 11:22:06 EDT
(In reply to thierry bordaz from comment #24)
> Ludwig, I fully agree with your comment. I was reviewing the previous
> crashes and was getting to the same concern. The pb found by ASAN is during
> a ADD, so far from initial shutdown crashes. 
>

ipa-server-install restart directory server few times as part of installation;
and if some thread corrupts heap then you never know with standard build when it will crash.
Comment 26 thierry bordaz 2017-09-25 11:40:50 EDT
@Lukas, my understanding is that ASAN would report all heap corruption (like use after free). The ASAN report you copy/paste is likely not to happen on shutdown. May be others ASAN reports were in the output. Do you still have the complete ASAN report ?

Also, would you mind to install debuginfo-install 389-ds-base and ipa-server it can help for debugging.
Comment 27 Stanislav Laznicka 2017-09-25 12:03:24 EDT
Here's a repo that should have the fix included:
https://copr.fedorainfracloud.org/coprs/stlaz/occasional_freeipa/packages/
Comment 28 Lukas Slebodnik 2017-09-25 16:46:53 EDT
(In reply to thierry bordaz from comment #20)
> I think the problem is due to ipa-uuid plugin.
> I am attaching a tentative patch but has not yet verified it.

FYI it is interesting that I cannot reproduce crash on default f25
freeipa-server-4.4.4-1.fc25.x86_64
389-ds-base-1.3.5.18-1.fc25.x86_64

But it is reproducible on f26:
freeipa-server-4.4.4-4.fc26
389-ds-base-1.3.6.8-1.fc26

So maybe some refactoring/change in directory server revealed bug in ipa plugin.
Comment 29 thierry bordaz 2017-09-26 05:09:07 EDT
Thanks Lukas for this precious results. IIUC:

The crash is not reproducible in F25 (freeipa-server-4.4.4-1.fc25.x86_64).
ipa-uuid did not change since 2014 (was ~freeipa-server-4.1 f22) so the problem detected by ASAN and addressed by tentative fix https://bugzilla.redhat.com/attachment.cgi?id=1330595. Is *NOT* responsible of the crash. The patch will just make ASAN happy.

The RC of the crash (shutdown) is likely not yet identified. It would be interesting to rerun the test in F26
freeipa-server-4.4.4-4.fc26  + patch ipa-uuid
389-ds-base-1.3.6.8-1.fc26   + with ASAN build
Comment 30 mreynolds 2017-09-26 08:39:53 EDT
This might be a case were valgrind is better than ASAN, valgrind will continue to run and detect all the issues, where ASAN appears to crash at the first fault.
Comment 31 Viktor Ashirov 2017-09-26 08:45:46 EDT
ASAN can continue to run if the binary has this env variable ASAN_OPTIONS=halt_on_error=0
Comment 32 Lukas Slebodnik 2017-09-26 13:21:25 EDT
(In reply to thierry bordaz from comment #29)
> Thanks Lukas for this precious results. IIUC:
> 
> The crash is not reproducible in F25 (freeipa-server-4.4.4-1.fc25.x86_64).
> ipa-uuid did not change since 2014 (was ~freeipa-server-4.1 f22) so the
> problem detected by ASAN and addressed by tentative fix
> https://bugzilla.redhat.com/attachment.cgi?id=1330595. Is *NOT* responsible
> of the crash. The patch will just make ASAN happy.
> 

I created clone for freeipa usa-after free BZ1496226
Comment 33 Lukas Slebodnik 2017-09-26 13:22:52 EDT
Here is a ASAN report with fixed UUID bug in ipa

=================================================================
==26521==ERROR: AddressSanitizer: attempting double-free on 0x603000db1ab0 in thread T0:
    #0 0x7f101fcb84b8 in __interceptor_free (/lib64/libasan.so.4+0xde4b8)
    #1 0x7f101f5492b8 in slapi_ch_free (/usr/lib64/dirsrv/libslapd.so.0+0xd22b8)
    #2 0x7f101f6b8ad4 in vattr_map_entry_free (/usr/lib64/dirsrv/libslapd.so.0+0x241ad4)
    #3 0x7f101f6b8b1e  (/usr/lib64/dirsrv/libslapd.so.0+0x241b1e)
    #4 0x7f101d6c8abf in PL_HashTableEnumerateEntries (/lib64/libplds4.so+0x1abf)
    #5 0x7f101f6b74c9 in vattr_cleanup (/usr/lib64/dirsrv/libslapd.so.0+0x2404c9)
    #6 0x55c257c6642f  (/usr/sbin/ns-slapd+0x2d42f)
    #7 0x7f101c5b0039 in __libc_start_main (/lib64/libc.so.6+0x21039)
    #8 0x55c257c68079  (/usr/sbin/ns-slapd+0x2f079)

0x603000db1ab0 is located 0 bytes inside of 24-byte region [0x603000db1ab0,0x603000db1ac8)
freed by thread T0 here:
    #0 0x7f101fcb84b8 in __interceptor_free (/lib64/libasan.so.4+0xde4b8)
    #1 0x7f101f5492b8 in slapi_ch_free (/usr/lib64/dirsrv/libslapd.so.0+0xd22b8)
    #2 0x7f101f6b8ad4 in vattr_map_entry_free (/usr/lib64/dirsrv/libslapd.so.0+0x241ad4)
    #3 0x7f101f6b8b1e  (/usr/lib64/dirsrv/libslapd.so.0+0x241b1e)
    #4 0x7f101d6c8abf in PL_HashTableEnumerateEntries (/lib64/libplds4.so+0x1abf)

previously allocated by thread T0 here:
    #0 0x7f101fcb8a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7f101f548ec1 in slapi_ch_calloc (/usr/lib64/dirsrv/libslapd.so.0+0xd1ec1)
    #2 0x7f101f6b84fd in slapi_vattrspi_register_internal (/usr/lib64/dirsrv/libslapd.so.0+0x2414fd)
    #3 0x7f10113c2038 in cos_cache_init (/usr/lib64/dirsrv/plugins/libcos-plugin.so+0xf038)
    #4 0x7f10113b9dc7 in cos_start (/usr/lib64/dirsrv/plugins/libcos-plugin.so+0x6dc7)
    #5 0x7f101f61f5cc  (/usr/lib64/dirsrv/libslapd.so.0+0x1a85cc)
    #6 0x7f101f6202ff in plugin_startall (/usr/lib64/dirsrv/libslapd.so.0+0x1a92ff)
    #7 0x55c257c662bd  (/usr/sbin/ns-slapd+0x2d2bd)
    #8 0x7f101c5b0039 in __libc_start_main (/lib64/libc.so.6+0x21039)

SUMMARY: AddressSanitizer: double-free (/lib64/libasan.so.4+0xde4b8) in __interceptor_free
Comment 35 Ludwig 2017-09-27 04:20:25 EDT
(In reply to Lukas Slebodnik from comment #33)
> Here is a ASAN report with fixed UUID bug in ipa
> 
Good. This points to the same crash as we have seen in comment #10, so we now are back at the shutdown crash
Comment 36 Ludwig 2017-09-27 04:33:40 EDT
and vattr_map_entry_free was introduced with the patch for ticket 49066 and is not in 1.3.5.
The patch attempted to fix memory leaks at shutdown, but probably did free too much.
Comment 37 Lukas Slebodnik 2017-09-27 05:25:40 EDT
(In reply to Ludwig from comment #36)
> and vattr_map_entry_free was introduced with the patch for ticket 49066 and
> is not in 1.3.5.
> The patch attempted to fix memory leaks at shutdown, but probably did free
> too much.

BTW if ASAN output enough for finding a bug or do you need to see valgrind output. Because it take me a little bit longer to run it with valgrind
Comment 38 Ludwig 2017-09-27 06:03:39 EDT
(In reply to Lukas Slebodnik from comment #37)

> 
> BTW if ASAN output enough for finding a bug or do you need to see valgrind
> output. Because it take me a little bit longer to run it with valgrind

this should be sufficient, thanks
Comment 39 thierry bordaz 2017-09-29 03:12:15 EDT
(In reply to Lukas Slebodnik from comment #32)
> (In reply to thierry bordaz from comment #29)
> > Thanks Lukas for this precious results. IIUC:
> > 
> > The crash is not reproducible in F25 (freeipa-server-4.4.4-1.fc25.x86_64).
> > ipa-uuid did not change since 2014 (was ~freeipa-server-4.1 f22) so the
> > problem detected by ASAN and addressed by tentative fix
> > https://bugzilla.redhat.com/attachment.cgi?id=1330595. Is *NOT* responsible
> > of the crash. The patch will just make ASAN happy.
> > 
> 
> I created clone for freeipa usa-after free BZ1496226

Thanks Lukas, I updated/attached https://bugzilla.redhat.com/attachment.cgi?id=1330595 to BZ1496226

So this BZ is now only related to shutdown double free. (Possibly related to https://pagure.io/389-ds-base/issue/49066 as mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1493965#c36)
Comment 40 Lukas Slebodnik 2017-10-02 10:56:47 EDT
(In reply to thierry bordaz from comment #39)
> So this BZ is now only related to shutdown double free. (Possibly related to
Sure, I prefer I ticket per bug 

> https://pagure.io/389-ds-base/issue/49066 as mentioned in
> https://bugzilla.redhat.com/show_bug.cgi?id=1493965#c36)

Let me know if you have some POC patch which we can test :-)
Comment 41 wibrown@redhat.com 2017-10-17 11:01:14 EDT
(In reply to Lukas Slebodnik from comment #33)
> Here is a ASAN report with fixed UUID bug in ipa
> 
> =================================================================
> ==26521==ERROR: AddressSanitizer: attempting double-free on 0x603000db1ab0
> in thread T0:
>     #0 0x7f101fcb84b8 in __interceptor_free (/lib64/libasan.so.4+0xde4b8)
>     #1 0x7f101f5492b8 in slapi_ch_free
> (/usr/lib64/dirsrv/libslapd.so.0+0xd22b8)
>     #2 0x7f101f6b8ad4 in vattr_map_entry_free
> (/usr/lib64/dirsrv/libslapd.so.0+0x241ad4)
>     #3 0x7f101f6b8b1e  (/usr/lib64/dirsrv/libslapd.so.0+0x241b1e)
>     #4 0x7f101d6c8abf in PL_HashTableEnumerateEntries
> (/lib64/libplds4.so+0x1abf)
>     #5 0x7f101f6b74c9 in vattr_cleanup
> (/usr/lib64/dirsrv/libslapd.so.0+0x2404c9)
>     #6 0x55c257c6642f  (/usr/sbin/ns-slapd+0x2d42f)
>     #7 0x7f101c5b0039 in __libc_start_main (/lib64/libc.so.6+0x21039)
>     #8 0x55c257c68079  (/usr/sbin/ns-slapd+0x2f079)
> 
> 0x603000db1ab0 is located 0 bytes inside of 24-byte region
> [0x603000db1ab0,0x603000db1ac8)
> freed by thread T0 here:
>     #0 0x7f101fcb84b8 in __interceptor_free (/lib64/libasan.so.4+0xde4b8)
>     #1 0x7f101f5492b8 in slapi_ch_free
> (/usr/lib64/dirsrv/libslapd.so.0+0xd22b8)
>     #2 0x7f101f6b8ad4 in vattr_map_entry_free
> (/usr/lib64/dirsrv/libslapd.so.0+0x241ad4)
>     #3 0x7f101f6b8b1e  (/usr/lib64/dirsrv/libslapd.so.0+0x241b1e)
>     #4 0x7f101d6c8abf in PL_HashTableEnumerateEntries
> (/lib64/libplds4.so+0x1abf)
> 
> previously allocated by thread T0 here:
>     #0 0x7f101fcb8a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
>     #1 0x7f101f548ec1 in slapi_ch_calloc
> (/usr/lib64/dirsrv/libslapd.so.0+0xd1ec1)
>     #2 0x7f101f6b84fd in slapi_vattrspi_register_internal
> (/usr/lib64/dirsrv/libslapd.so.0+0x2414fd)
>     #3 0x7f10113c2038 in cos_cache_init
> (/usr/lib64/dirsrv/plugins/libcos-plugin.so+0xf038)
>     #4 0x7f10113b9dc7 in cos_start
> (/usr/lib64/dirsrv/plugins/libcos-plugin.so+0x6dc7)
>     #5 0x7f101f61f5cc  (/usr/lib64/dirsrv/libslapd.so.0+0x1a85cc)
>     #6 0x7f101f6202ff in plugin_startall
> (/usr/lib64/dirsrv/libslapd.so.0+0x1a92ff)
>     #7 0x55c257c662bd  (/usr/sbin/ns-slapd+0x2d2bd)
>     #8 0x7f101c5b0039 in __libc_start_main (/lib64/libc.so.6+0x21039)
> 
> SUMMARY: AddressSanitizer: double-free (/lib64/libasan.so.4+0xde4b8) in
> __interceptor_free


Hey mate,

On your machine can you download this script:

https://github.com/Firstyear/ds-devel-root/blob/master/ds_utils/asan_symbolize_el7.py

And please call:

python asan_symbolize_el7.py < asanoutput.txt

Then post the result. This will give us line numbers and more info that we need to solve it,
Comment 42 Ludwig 2017-10-17 11:07:37 EDT
but vattr_map_entry_free() shouldn't exist in so many places
Comment 43 wibrown@redhat.com 2017-10-18 05:23:15 EDT
(In reply to Ludwig from comment #42)
> but vattr_map_entry_free() shouldn't exist in so many places

Without the addition where I put it, we are leaking memory on shutdown and potentially from dynamic plugins. It's likely that either the first free is leaving a reference behind and we need to null the poiner to stop the double free. That's very likely the cause but without the line numbers I can't see where the fault is,
Comment 44 Lukas Slebodnik 2017-10-18 10:04:34 EDT
I have no idea why but output is not fully symbolized

sh# rpm -qa 389-ds\*
389-ds-base-debuginfo-1.3.7.4-3.fc28.x86_64
389-ds-base-1.3.7.4-3.fc28.x86_64
389-ds-base-libs-debuginfo-1.3.7.4-3.fc28.x86_64
389-ds-base-libs-1.3.7.4-3.fc28.x86_64
389-ds-base-debugsource-1.3.7.4-3.fc28.x86_64

sh# python asan_symbolize_el7.py < /var/run/dirsrv/ns-slapd-TESTRELM-TEST.asan.23338
=================================================================
==23338==ERROR: AddressSanitizer: attempting double-free on 0x603000db12a0 in thread T0:
    #0 0x7f184ea0e4b8 in __interceptor_free (/lib64/libasan.so.4+0xde4b8)
    #1 0x7f184e29f2b8 in slapi_ch_free (/usr/lib64/dirsrv/libslapd.so.0+0xd22b8)
    #2 0x7f184e40ead4 in vattr_map_entry_free (/usr/lib64/dirsrv/libslapd.so.0+0x241ad4)
    #0 0x7f184e40eb1e in vattr_he_cleanup_fn /usr/src/debug/389-ds-base-1.3.7.4-3.fc28.x86_64/ldap/servers/slapd/vattr.c:1875
    #4 0x7f184c41dabf in PL_HashTableEnumerateEntries (/lib64/libplds4.so+0x1abf)
    #5 0x7f184e40d4c9 in vattr_cleanup (/usr/lib64/dirsrv/libslapd.so.0+0x2404c9)
    #1 0x55d11e05242f in ?? /usr/src/debug/389-ds-base-1.3.7.4-3.fc28.x86_64/ldap/servers/slapd/main.c:1191 (discriminator 1)
    #7 0x7f184b305039 in __libc_start_main (/lib64/libc.so.6+0x21039)
    #2 0x55d11e054079 in ?? ??:0

0x603000db12a0 is located 0 bytes inside of 24-byte region [0x603000db12a0,0x603000db12b8)
freed by thread T0 here:
    #0 0x7f184ea0e4b8 in __interceptor_free (/lib64/libasan.so.4+0xde4b8)
    #1 0x7f184e29f2b8 in slapi_ch_free (/usr/lib64/dirsrv/libslapd.so.0+0xd22b8)
    #2 0x7f184e40ead4 in vattr_map_entry_free (/usr/lib64/dirsrv/libslapd.so.0+0x241ad4)
    #3 0x7f184e40eb1e in vattr_he_cleanup_fn /usr/src/debug/389-ds-base-1.3.7.4-3.fc28.x86_64/ldap/servers/slapd/vattr.c:1875
    #4 0x7f184c41dabf in PL_HashTableEnumerateEntries (/lib64/libplds4.so+0x1abf)

previously allocated by thread T0 here:
    #0 0x7f184ea0ea38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7f184e29eec1 in slapi_ch_calloc (/usr/lib64/dirsrv/libslapd.so.0+0xd1ec1)
    #2 0x7f184e40e4fd in slapi_vattrspi_register_internal (/usr/lib64/dirsrv/libslapd.so.0+0x2414fd)
    #3 0x7f18400c2038 in cos_cache_init (/usr/lib64/dirsrv/plugins/libcos-plugin.so+0xf038)
    #4 0x7f18400b9dc7 in cos_start (/usr/lib64/dirsrv/plugins/libcos-plugin.so+0x6dc7)
    #4 0x7f184e3755cc in plugin_call_func /usr/src/debug/389-ds-base-1.3.7.4-3.fc28.x86_64/ldap/servers/slapd/plugin.c:2028 (discriminator 1)
    #6 0x7f184e3762ff in plugin_startall (/usr/lib64/dirsrv/libslapd.so.0+0x1a92ff)
    #5 0x55d11e0522bd in main /usr/src/debug/389-ds-base-1.3.7.4-3.fc28.x86_64/ldap/servers/slapd/main.c:1124
    #8 0x7f184b305039 in __libc_start_main (/lib64/libc.so.6+0x21039)

SUMMARY: AddressSanitizer: double-free (/lib64/libasan.so.4+0xde4b8) in __interceptor_free
==23338==ABORTING
Comment 45 Lukas Slebodnik 2017-10-18 11:56:05 EDT
So I checked coredump generated from glibc version + env variable MALLOC_PERTURB_

And problematic seems to be krbPwdPolicyReference. I haveno idea whether it will help or no

(gdb) l  ldap/servers/slapd/vattr.c:1852
1847            }
1848            return 0;
1849    }
1850
1851    void vattr_map_entry_free(vattr_map_entry *vae) {
1852        slapi_ch_free((void **)&(vae->sp_list));
1853        slapi_ch_free_string(&(vae->type_name));
1854        slapi_ch_free((void **)&vae);
1855    }
1856
(gdb) l ldap/servers/slapd/ch_malloc.c:274
269          */
270         if ( ptr==NULL ){
271             return;
272         }
273
274         free (*ptr);
275         *ptr = NULL;
276         return;
277     }
278     #endif /* !MEMPOOL_EXPERIMENTAL */
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007fcb55f57800 in __GI_abort () at abort.c:89
#2  0x00007fcb55f9bbb1 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7fcb560bb9f8 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007fcb55fa6a59 in malloc_printerr (ar_ptr=<optimized out>, ptr=<optimized out>, str=0x7fcb560bbda8 "double free or corruption (fasttop)", action=<optimized out>) at malloc.c:5077
#4  _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3873
#5  0x00007fcb55fac3be in __GI___libc_free (mem=<optimized out>) at malloc.c:2947
#6  0x00007fcb5878ed31 in slapi_ch_free (ptr=0x7fcb08012ba8) at ldap/servers/slapd/ch_malloc.c:274
#7  0x00007fcb5881bec2 in vattr_map_entry_free (vae=<optimized out>) at ldap/servers/slapd/vattr.c:1852
#8  0x00007fcb5881beed in vattr_he_cleanup_fn (he=0x7fcb08012c20, index=<optimized out>, arg=<optimized out>) at ldap/servers/slapd/vattr.c:1859
#9  0x00007fcb56dbfac0 in PL_HashTableEnumerateEntries (ht=0x55e4f749b680, f=f@entry=0x7fcb5881bee0 <vattr_he_cleanup_fn>, arg=arg@entry=0x0) at ../../nspr/lib/ds/plhash.c:374
#10 0x00007fcb5881b7a2 in vattr_map_destroy () at ldap/servers/slapd/vattr.c:1867
#11 vattr_cleanup () at ldap/servers/slapd/vattr.c:121
#12 0x000055e4f536981a in main (argc=5, argv=0x7ffcfc1e9908) at ldap/servers/slapd/main.c:1138
(gdb) p ((vattr_map_entry *)he->value)[0]
$68 = {type_name = 0x7fcb08012390 "krbPwdPolicyReference", sp_list = 0x55e4f7e6a910, objectclasses = 0x7fcb08012460}
(gdb) p ((vattr_map_entry *)he->value)[0].sp_list[0]
$69 = {sp = 0xbebebebebebebebe, next = 0xbebebebebebebebe, hint = 0x0}
Comment 46 Ludwig 2017-11-02 06:49:04 EDT
is this fixed by the fix in #49346 ?
Comment 47 Lukas Slebodnik 2017-11-06 08:30:36 EST
I think Ludwig meant https://pagure.io/389-ds-base/issue/49436

But I seems that patch was not pushed to 1.3.6 and therefore I can still
see a crash with 389-ds-base-1.3.6.10-1.fc26.x86_64
Comment 48 Lukas Slebodnik 2017-11-06 08:31:53 EST
Could you you confirm whether it is the same bug?
Comment 49 mreynolds 2017-11-06 09:18:49 EST
Correct the fix for 49436 was not committed to the 1.3.6 branch.  I just cherry-picked it, and I'll work on a new f26 build...
Comment 50 Lukas Slebodnik 2017-11-06 09:27:36 EST
BTW I can confirm that crash is gone with 1.3.7.7-1.fc27

(In reply to mreynolds from comment #49)
> Correct the fix for 49436 was not committed to the 1.3.6 branch.  I just
> cherry-picked it, and I'll work on a new f26 build...

Thank you very much
Comment 51 Fedora Update System 2017-11-07 09:54:22 EST
389-ds-base-1.3.6.11-1.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-075a0277f4
Comment 52 Fedora Update System 2017-11-07 19:07:25 EST
389-ds-base-1.3.6.11-1.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-075a0277f4
Comment 53 Fedora Update System 2017-11-15 15:17:16 EST
389-ds-base-1.3.6.11-1.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

Note You need to log in before you can comment on or make changes to this bug.