Bug 1519884

Summary: Winbind Core Dumps
Product: Red Hat Enterprise Linux 6 Reporter: MarkS <mark>
Component: samba Assignee: Andreas Schneider <asn>
Status: CLOSED ERRATA QA Contact: Andrej Dzilský <adzilsky>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.10 CC: adzilsky, asakure, asn, enewland, gdeschner, jarrpa, jkurik, mark, rhack
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: samba-3.6.23-48.el6 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-06-19 05:08:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1504542    
Attachments:
Description Flags
Core Dump none

Description MarkS 2017-12-01 16:01:25 UTC
Description of problem:
winbindd core dumps.

Version-Release number of selected component (if applicable):
samba-winbind-clients-3.6.23-45.el6_9.x86_64
samba4-libs-4.2.10-11.el6_9.x86_64
samba-winbind-3.6.23-45.el6_9.x86_64
samba-common-3.6.23-45.el6_9.x86_64

How reproducible:
It core dumps at regular intervals; the trigger is unknown.

Steps to Reproduce:
1. N/A

Actual results:

[2017/12/01 15:46:29.568351,  0] ../lib/util/debug.c:413(talloc_log_fn)
  Bad talloc magic value - unknown value
[2017/12/01 15:46:29.568712,  0] lib/util.c:1117(smb_panic)
  PANIC (pid 13745): Bad talloc magic value - unknown value
[2017/12/01 15:46:29.570115,  0] lib/util.c:1221(log_stack_trace)
  BACKTRACE: 20 stack frames:
   #0 winbindd(log_stack_trace+0x1a) [0x7fb8b1db872a]
   #1 winbindd(smb_panic+0x2b) [0x7fb8b1db87fb]
   #2 /usr/lib64/libtalloc.so.2(+0x26cb) [0x7fb8af98d6cb]
   #3 /usr/lib64/libtalloc.so.2(_talloc_zero+0x56) [0x7fb8af98dfe6]
   #4 winbindd(ndr_push_init_ctx+0x12) [0x7fb8b1dd1bd2]
   #5 winbindd(+0x104ecd) [0x7fb8b1d0becd]
   #6 winbindd(winbindd_dual_ndrcmd+0xb7) [0x7fb8b1d017d7]
   #7 winbindd(+0xf879b) [0x7fb8b1cff79b]
   #8 /usr/lib64/libtevent.so.0(+0x9ea6) [0x7fb8af786ea6]
   #9 /usr/lib64/libtevent.so.0(+0x82d6) [0x7fb8af7852d6]
   #10 /usr/lib64/libtevent.so.0(_tevent_loop_once+0x9d) [0x7fb8af780c3d]
   #11 winbindd(+0xf98f4) [0x7fb8b1d008f4]
   #12 winbindd(+0xfa065) [0x7fb8b1d01065]
   #13 /usr/lib64/libtevent.so.0(tevent_common_loop_immediate+0xe8) [0x7fb8af781868]
   #14 /usr/lib64/libtevent.so.0(+0x9c96) [0x7fb8af786c96]
   #15 /usr/lib64/libtevent.so.0(+0x82d6) [0x7fb8af7852d6]
   #16 /usr/lib64/libtevent.so.0(_tevent_loop_once+0x9d) [0x7fb8af780c3d]
   #17 winbindd(main+0x7b4) [0x7fb8b1cd7a04]
   #18 /lib64/libc.so.6(__libc_start_main+0xfd) [0x7fb8aedced1d]
   #19 winbindd(+0xce0d9) [0x7fb8b1cd50d9]
[2017/12/01 15:46:29.572004,  0] lib/fault.c:372(dump_core)
  dumping core in /var/log/samba/cores/winbindd
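
Several frames above appear only as module plus offset (for example winbindd(+0x104ecd)); those are the ones that lack symbol information in this trace. A minimal Python sketch, assuming the frame layout shown above, that splits each frame into its parts and lists the unresolved ones. parse_frame and unresolved_frames are hypothetical helper names, not part of Samba:

```python
import re

# Matches frames such as:
#   #3 /usr/lib64/libtalloc.so.2(_talloc_zero+0x56) [0x7fb8af98dfe6]
#   #5 winbindd(+0x104ecd) [0x7fb8b1d0becd]
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+"
    r"(?P<module>[^\s(]+)"
    r"\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\)\s+"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_frame(line):
    """Return a dict describing one backtrace frame, or None if no match."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    frame = match.groupdict()
    frame["index"] = int(frame["index"])
    # An empty symbol means the frame was printed as module(+offset) only.
    frame["symbol"] = frame["symbol"] or None
    return frame


def unresolved_frames(lines):
    """Return the parsed frames that carry no symbol name."""
    frames = (parse_frame(line) for line in lines)
    return [f for f in frames if f is not None and f["symbol"] is None]


if __name__ == "__main__":
    sample = [
        "   #4 winbindd(ndr_push_init_ctx+0x12) [0x7fb8b1dd1bd2]",
        "   #5 winbindd(+0x104ecd) [0x7fb8b1d0becd]",
    ]
    for frame in unresolved_frames(sample):
        print(frame["index"], frame["module"], frame["offset"])
```

The module and offset pairs it prints are the inputs a symbol lookup would need once matching debug symbols are installed.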

Expected results:
No core dumps

Additional info:

# cat /etc/samba/smb.conf 
[global]
   workgroup = DOMAIN
   realm = DOMAIN
   security = ads
   kerberos method = secrets and keytab
   log file = /var/log/samba/%m.log

Comment 2 Andreas Schneider 2017-12-05 15:11:35 UTC
Could you please install samba-debuginfo and get a full backtrace?

Also running winbind with valgrind WITHOUT leak checking would be interesting.

valgrind --tool=memcheck -v --num-callers=20 --track-origins=yes
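
The command above does not name the program to run and does not switch leak checking off explicitly. A hedged Python sketch of one way to assemble the full argument list: --leak-check=no is memcheck's option for disabling the leak check, while the winbindd path and its -F (stay in the foreground) flag in the usage line are assumptions about the local setup, not something stated in this report. valgrind_argv is a hypothetical helper name:

```python
import shlex


def valgrind_argv(program_argv, log_file=None):
    """Build a memcheck command line with leak checking disabled."""
    argv = [
        "valgrind",
        "--tool=memcheck",
        "-v",
        "--num-callers=20",
        "--track-origins=yes",
        "--leak-check=no",  # explicit form of "WITHOUT leak checking"
    ]
    if log_file:
        # Optional: write valgrind's report to a file instead of stderr.
        argv.append(f"--log-file={log_file}")
    return argv + list(program_argv)


if __name__ == "__main__":
    # Assumption: winbindd lives at /usr/sbin/winbindd and -F keeps it
    # in the foreground; adjust to the actual installation.
    print(shlex.join(valgrind_argv(["/usr/sbin/winbindd", "-F"])))
```

The printed line can be reviewed before it is run by hand.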

Comment 3 MarkS 2017-12-05 16:22:36 UTC
I'm not sure how much more information I can provide, as I don't know specifically what to look for that would help you diagnose the issue.

The fault seems to appear roughly every 10 minutes on our dev system. Each PANIC is accompanied by a core dump.

Dec  5 14:46:33 SERVER winbindd[30191]:   PANIC (pid 30191): Bad talloc magic value - unknown value
Dec  5 14:56:33 SERVER winbindd[30474]:   PANIC (pid 30474): Bad talloc magic value - unknown value
Dec  5 15:06:33 SERVER winbindd[31081]:   PANIC (pid 31081): Bad talloc magic value - unknown value
Dec  5 15:16:32 SERVER winbindd[31342]:   PANIC (pid 31342): Bad talloc magic value - unknown value
Dec  5 15:26:33 SERVER winbindd[31607]:   PANIC (pid 31607): Bad talloc magic value - unknown value
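
The spacing of the PANIC entries above can be checked mechanically. A small Python sketch, assuming the syslog line layout shown above and an English month abbreviation; syslog timestamps carry no year, so the year (2017, taken from the date of this comment) is passed in as an assumption. panic_times and panic_intervals are hypothetical helper names:

```python
import re
from datetime import datetime

# Captures the syslog timestamp of winbindd PANIC lines.
PANIC_RE = re.compile(
    r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+winbindd\[\d+\]:\s+PANIC"
)


def panic_times(lines, year=2017):
    """Return the timestamps of all PANIC lines as datetime objects."""
    times = []
    for line in lines:
        match = PANIC_RE.match(line)
        if match:
            stamp = f"{year} {match.group(1)}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def panic_intervals(lines, year=2017):
    """Return the gaps between consecutive PANIC lines, in seconds."""
    times = panic_times(lines, year)
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    log = [
        "Dec  5 14:46:33 SERVER winbindd[30191]:   PANIC (pid 30191): Bad talloc magic value - unknown value",
        "Dec  5 14:56:33 SERVER winbindd[30474]:   PANIC (pid 30474): Bad talloc magic value - unknown value",
        "Dec  5 15:06:33 SERVER winbindd[31081]:   PANIC (pid 31081): Bad talloc magic value - unknown value",
        "Dec  5 15:16:32 SERVER winbindd[31342]:   PANIC (pid 31342): Bad talloc magic value - unknown value",
        "Dec  5 15:26:33 SERVER winbindd[31607]:   PANIC (pid 31607): Bad talloc magic value - unknown value",
    ]
    print(panic_intervals(log))  # [600, 600, 599, 601]
```

Intervals this close to 600 seconds suggest a periodic trigger rather than random load, although the log alone does not identify it.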

This is the transaction that updated the packages:

Updated samba-common-3.6.23-30.el6_7.x86_64          @rhel-6-server-rpms
Update               3.6.23-45.el6_9.x86_64          @rhel-6-server-rpms
Updated samba-winbind-3.6.23-30.el6_7.x86_64         @rhel-6-server-rpms
Update                3.6.23-45.el6_9.x86_64         @rhel-6-server-rpms
Updated samba-winbind-clients-3.6.23-30.el6_7.x86_64 @rhel-6-server-rpms
Update                        3.6.23-45.el6_9.x86_64 @rhel-6-server-rpms
Updated samba4-libs-4.2.10-6.el6_7.x86_64            @rhel-6-server-rpms
Update              4.2.10-11.el6_9.x86_64           @rhel-6-server-rpms

I did attempt a reinstall to rule out a failed update. That made no difference.

After that I attempted a downgrade:

yum downgrade samba-common-3.6.23-30.el6_7.x86_64 samba-winbind-3.6.23-30.el6_7.x86_64 samba-winbind-clients-3.6.23-30.el6_7.x86_64 samba4-libs-4.2.10-6.el6_7.x86_64
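
The downgrade targets are the "Updated" (old version) entries of the transaction listed earlier. A short Python sketch, assuming the two-line Updated/Update layout shown in that listing, that derives the argument list from it; downgrade_command is a hypothetical helper name and the sketch only builds the command, it does not run it:

```python
def downgrade_command(history_lines):
    """Build a yum downgrade argv from 'Updated <old package>' lines."""
    old_packages = []
    for line in history_lines:
        fields = line.split()
        # "Updated" marks the previously installed package; the following
        # "Update" line (new version only) is skipped.
        if len(fields) >= 2 and fields[0] == "Updated":
            old_packages.append(fields[1])
    return ["yum", "downgrade"] + old_packages


if __name__ == "__main__":
    history = [
        "Updated samba-common-3.6.23-30.el6_7.x86_64          @rhel-6-server-rpms",
        "Update               3.6.23-45.el6_9.x86_64          @rhel-6-server-rpms",
        "Updated samba-winbind-3.6.23-30.el6_7.x86_64         @rhel-6-server-rpms",
        "Update                3.6.23-45.el6_9.x86_64         @rhel-6-server-rpms",
    ]
    print(" ".join(downgrade_command(history)))
```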

Since the downgrade, we appear to see no more core dumps or complaints.

I will run it overnight like this to see if anything appears and get back to you.

Let me know if there is any specific information you need.

Comment 4 MarkS 2017-12-06 08:53:01 UTC
I can confirm that a downgrade stopped the PANIC and winbind no longer core dumps.

Comment 5 Andreas Schneider 2017-12-06 14:54:15 UTC
I'm not able to reproduce this.

Comment 6 MarkS 2017-12-06 16:11:02 UTC
Created attachment 1363725 [details]
Core Dump

Valgrind output from core dump.

Comment 7 Andreas Schneider 2017-12-07 09:44:38 UTC
I have a test package, but you need to contact Red Hat support so that I can provide it. Could you do that?

Comment 8 MarkS 2017-12-07 10:08:36 UTC
CASE 01988844 opened.

Comment 11 Andreas Schneider 2017-12-07 12:18:35 UTC
MarkS: Please try the test package and report back. It is possible that it isn't fixed; in that case valgrind logs are of interest! Thanks.

Comment 13 Andrej Dzilský 2017-12-07 13:28:45 UTC
QA_ack+. Andreas, if you can't find a reproducer as you proposed, I will just put this into the sanity_only state.

Comment 14 MarkS 2017-12-08 08:56:25 UTC
I can confirm that the test package does resolve the core dump issue.

Comment 15 Andreas Schneider 2017-12-08 10:37:50 UTC
Awesome, thanks!

Comment 19 errata-xmlrpc 2018-06-19 05:08:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1860