Bug 1300636 - nsupdate crashing during system join against AD
nsupdate crashing during system join against AD
Status: ASSIGNED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: bind (Show other bugs)
7.2
Unspecified Unspecified
medium Severity medium
: rc
: ---
Assigned To: Tomáš Hozza
qe-baseos-daemons
:
: 1438393 (view as bug list)
Depends On:
Blocks: 1465887 1465928 1380362
  Show dependency treegraph
 
Reported: 2016-01-21 05:27 EST by Nirupama Karandikar
Modified: 2017-11-04 21:10 EDT (History)
7 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
genrated coredump during crash (5.65 MB, application/x-gzip)
2016-01-21 05:27 EST, Nirupama Karandikar
no flags Details
sssd domain full logs (1.47 MB, text/plain)
2016-07-19 10:17 EDT, Steeve Goveas
no flags Details
Coredump 19 July 16 (303.20 KB, application/x-gzip)
2016-07-19 10:19 EDT, Steeve Goveas
no flags Details

  None (edit)
Description Nirupama Karandikar 2016-01-21 05:27:16 EST
Created attachment 1116891 [details]
genrated coredump during crash

Description of problem:
/usr/bin/nsupdate -g -d -D

Version-Release number of selected component (if applicable):
bind-utils-9.9.4-29.el7_2.1

How reproducible:
Always


Actual results:
nsupdate crashing during system configuration against AD

Expected results:
nsupdate should not crash

Additional info:
Attaching generated core file
Comment 2 Tomáš Hozza 2016-01-21 06:16:00 EST
Hello.

Please provide steps to reproduce including the overall system setup and the sequence of commands that led to the crash.

Also if you have a system on which you can reproduce the issue, it would be really helpful if you could provide me access to it for debugging purposes.

Thanks
Comment 7 Steeve Goveas 2016-07-19 10:14:05 EDT
[root@bumblebee ad_parameters]# cat /etc/sssd/sssd.conf

[sssd]
config_file_version = 2
services = nss, pam
domains = sssdad.com
user = 

[nss]
filter_groups = root
filter_users = root
default_shell = /bin/bash
override_homedir = /home/%u

[domain/sssdad.com]
debug_level = 0xFFF0
id_provider = ad


ad_domain = sssdad.com
krb5_realm = SSSDAD
realmd_tags = manages-system joined-with-adcli
cache_credentials = True
ldap_schema = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
use_fully_qualified_names = False
fallback_homedir = /home/%u@%d
access_provider = ad
ldap_id_mapping = false


* Crash happens in this test block when sssd is restarted

:: [   PASS   ] :: Command 'service sssd stop; rm -f /var/lib/sss/{mc,db}/*; service sssd start' (Expected 0, got 0)
:: [  BEGIN   ] :: Running 'date; id aa0002-8794'
Tue Jul 19 16:29:15 IST 2016
uid=10031(aa0002-8794) gid=10035(tgroup01-8794) groups=10035(tgroup01-8794),100000(domain users)
:: [   PASS   ] :: Command 'date; id aa0002-8794' (Expected 0, got 0)
:: [  BEGIN   ] :: Running 'service sssd stop; rm -f /var/lib/sss/{mc,db}/*; service sssd start'
Redirecting to /bin/systemctl stop  sssd.service
Redirecting to /bin/systemctl start  sssd.service
:: [   PASS   ] :: Command 'service sssd stop; rm -f /var/lib/sss/{mc,db}/*; service sssd start' (Expected 0, got 0)
:: [  BEGIN   ] :: Running 'date; id aa0002-8794'
Tue Jul 19 16:29:31 IST 2016
uid=10031(aa0002-8794) gid=10035(tgroup01-8794) groups=10035(tgroup01-8794),100000(domain users)

* Crash occurred at 16:29:08

[root@bumblebee sssd]# tail -f /var/log/messages

Jul 19 16:29:07 bumblebee systemd: Stopping System Security Services Daemon...
Jul 19 16:29:07 bumblebee sssd[be[sssdad.com]]: Shutting down
Jul 19 16:29:07 bumblebee sssd[nss]: Shutting down
Jul 19 16:29:07 bumblebee sssd[pam]: Shutting down
Jul 19 16:29:08 bumblebee abrt-hook-ccpp: Process 26991 (nsupdate) of user 0 killed by SIGABRT - dumping core
Jul 19 16:29:08 bumblebee systemd: Stopped System Security Services Daemon.
Jul 19 16:29:08 bumblebee abrt-server: Duplicate: core backtrace
Jul 19 16:29:08 bumblebee abrt-server: DUP_OF_DIR: /var/spool/abrt/ccpp-2016-07-19-16:19:44-25069
Jul 19 16:29:08 bumblebee abrt-server: Deleting problem directory ccpp-2016-07-19-16:29:08-26991 (dup of ccpp-2016-07-19-16:19:44-25069)
Jul 19 16:29:08 bumblebee systemd: Starting System Security Services Daemon...
Jul 19 16:29:08 bumblebee dbus[591]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
Jul 19 16:29:08 bumblebee dbus-daemon: dbus[591]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
Jul 19 16:29:08 bumblebee dbus[591]: [system] Successfully activated service 'org.freedesktop.problems'
Jul 19 16:29:08 bumblebee dbus-daemon: dbus[591]: [system] Successfully activated service 'org.freedesktop.problems'
Jul 19 16:29:09 bumblebee sssd: Starting up
Jul 19 16:29:09 bumblebee sssd[be[sssdad.com]]: Starting up
Jul 19 16:29:09 bumblebee sssd[pam]: Starting up
Jul 19 16:29:09 bumblebee sssd[nss]: Starting up
Jul 19 16:29:09 bumblebee systemd: Started System Security Services Daemon.


* sssd domain logs around the time of the crash. Full logs will be attached along with the core file

[root@bumblebee sssd]# tail -f sssd_sssdad.com.log

(Tue Jul 19 16:29:06 2016) [sssd[be[sssdad.com]]] [child_sig_handler] (0x1000): Waiting for child [26983].
(Tue Jul 19 16:29:06 2016) [sssd[be[sssdad.com]]] [child_sig_handler] (0x0100): child [26983] finished successfully.
(Tue Jul 19 16:29:06 2016) [sssd[be[sssdad.com]]] [be_nsupdate_done] (0x0200): nsupdate child status: 0
(Tue Jul 19 16:29:06 2016) [sssd[be[sssdad.com]]] [nsupdate_msg_create_common] (0x0200): Creating update message for realm [(null)].
(Tue Jul 19 16:29:06 2016) [sssd[be[sssdad.com]]] [be_nsupdate_create_ptr_msg] (0x0400):  -- Begin nsupdate message -- 

update add 73.206.65.10.in-addr.arpa. 3600 in PTR bumblebee.sssdad.com.
send
 -- End nsupdate message -- 
(Tue Jul 19 16:29:06 2016) [sssd[be[sssdad.com]]] [child_handler_setup] (0x2000): Setting up signal handler up for pid [26991]
(Tue Jul 19 16:29:06 2016) [sssd[be[sssdad.com]]] [child_handler_setup] (0x2000): Signal handler set up for pid [26991]
(Tue Jul 19 16:29:06 2016) [sssd[be[sssdad.com]]] [write_pipe_handler] (0x0400): All data has been sent!
(Tue Jul 19 16:29:06 2016) [sssd[be[sssdad.com]]] [nsupdate_child_stdin_done] (0x1000): Sending nsupdate data complete
(Tue Jul 19 16:29:06 2016) [sssd[be[sssdad.com]]] [be_nsupdate_args] (0x0200): nsupdate auth type: GSS-TSIG
setup_system()
reset_system()
user_interaction()
do_next_command()
start_update()
done_update()
reset_system()
user_interaction()
do_next_command()
evaluate_update()
update_addordelete()
do_next_command()
start_update()
recvsoa()
About to create rcvmsg
show_message()
Reply from SOA query:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id:  50028
;; flags: qr aa rd ra; QUESTION: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 2
;; QUESTION SECTION:
;73.206.65.10.in-addr.arpa.     IN      SOA

;; AUTHORITY SECTION:
10.in-addr.arpa.        3600    IN      SOA     pluto.sssdad.com. hostmaster.sssdad.com. 1557 900 600 86400 3600

;; ADDITIONAL SECTION:
pluto.sssdad.com.       3600    IN      A       10.8.63.34
pluto.sssdad.com.       3600    IN      AAAA    2620:52:0:83f:9149:4ef6:d466:9e9f

Found zone name: 10.in-addr.arpa
The master is: pluto.sssdad.com
start_gssrequest
send_gssrequest
show_message()
Out of recvsoa
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [remove_krb5_info_files] (0x0200): Could not remove [/var/lib/sss/pubconf/kpasswdinfo.SSSDAD.COM], [2][No such file or directory]
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [be_ptask_destructor] (0x0400): Terminating periodic task [AD machine account password renewal]
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [dp_terminate_active_requests] (0x0400): Terminating active data provider requests
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [dp_client_destructor] (0x0400): Removed NSS client
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [dp_client_destructor] (0x0400): Removed PAM client
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [sbus_remove_watch] (0x2000): 0x7f7708377d90/0x7f7708370a30
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [remove_socket_symlink] (0x4000): The symlink points to [/var/lib/sss/pipes/private/sbus-dp_sssdad.com.26958]
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [remove_socket_symlink] (0x4000): The path including our pid is [/var/lib/sss/pipes/private/sbus-dp_sssdad.com.26958]
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [remove_socket_symlink] (0x4000): Removed the symlink
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [be_ptask_destructor] (0x0400): Terminating periodic task [Subdomains Refresh]
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [sdap_handle_release] (0x2000): Trace: sh[0x7f77083be000], connected[1], ops[(nil)], ldap[0x7f7708398070], destructor_lock[0], release_memory[0]
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [remove_connection_callback] (0x4000): Successfully removed connection callback.
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [sdap_handle_release] (0x2000): Trace: sh[0x7f770837b560], connected[1], ops[(nil)], ldap[0x7f770837fe90], destructor_lock[0], release_memory[0]
(Tue Jul 19 16:29:07 2016) [sssd[be[sssdad.com]]] [remove_connection_callback] (0x4000): Successfully removed connection callback.
cleanup()
Detaching GSS-TSIG keyring
Shutting down task manager
shutdown_program()
Shutting down request manager
recvgss()
Shutting down request manager
Destroy DST lib
Destroying request manager
Freeing the dispatchers
Shutting down dispatch manager
Destroying event
Shutting down socket manager
Shutting down timer manager
Destroying hash context
Destroying name state
Removing log context
Destroying memory context
mem.c:1102: INSIST(ctx->stats[i].gets == 0U) failed, back trace
#0 0x7f33496d10d7 in ??
#1 0x7f33496d102a in ??
#2 0x7f33496e0aa1 in ??
#3 0x7f33496e0f1c in ??
#4 0x404c07 in ??
#5 0x7f33474f3b35 in ??
#6 0x404da5 in ??
Comment 8 Steeve Goveas 2016-07-19 10:17 EDT
Created attachment 1181649 [details]
sssd domain full logs
Comment 9 Steeve Goveas 2016-07-19 10:19 EDT
Created attachment 1181650 [details]
Coredump 19 July 16
Comment 10 Tomáš Hozza 2016-07-19 10:33:00 EDT
Hi.

Please provide the exact NVR of bind-utils RPM used and ideally also the set of commands passes to nsupdate. Thanks
Comment 11 Jakub Hrozek 2016-07-20 04:33:15 EDT
The nsupdate message is:

-- Begin nsupdate message -- 

update add 73.206.65.10.in-addr.arpa. 3600 in PTR bumblebee.sssdad.com.
send
 -- End nsupdate message -- 

We usually invoke the nsupdate with -g, with this high debug level we also add -D and -d.
Comment 12 Steeve Goveas 2016-07-20 04:37:41 EDT
We've seen this crash in rhel-7.2, 6.8 and now 7.3. The attached logs and corefile are from a rhel-7.3 system.

[root@bumblebee ~]# rpm -q bind-utils
bind-utils-9.9.4-36.el7.x86_64
Comment 19 Tomáš Hozza 2017-09-08 08:57:44 EDT
*** Bug 1438393 has been marked as a duplicate of this bug. ***
Comment 20 Tomáš Hozza 2017-09-08 09:02:00 EDT
Hello Steve.

Could you please rerun the test, in which nsupdate crashes, with "-M" option? It enables additional memory debugging in nsupdate and it should provide more information where the leak is happening.

Thanks!
Comment 26 Tomáš Hozza 2017-09-19 11:27:35 EDT
Reported to upstream:
https://bugs.isc.org/SelfService/Display.html?id=46044 (not public yet).

Tracked as:
[ISC-Bugs #46044]

Note You need to log in before you can comment on or make changes to this bug.