Bug 1513877

Summary: Regression: winbind leaks memory after upgrade
Product: Red Hat Enterprise Linux 6 Reporter: amitkuma
Component: samba Assignee: Andreas Schneider <asn>
Status: CLOSED ERRATA QA Contact: Robin Hack <rhack>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 6.5 CC: adzilsky, amitkuma, asn, enewland, gdeschner, gparente, hchatter, jarrpa, jkurik, jvilicic, kbittner, kludhwan, minyu, pdhamdhe, rhack, tscherf, utnoor, vmishra
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: samba-3.6.23-51.el6 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-06-19 05:08:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1504542    
Attachments:
Description Flags
valgrind output none

Description amitkuma 2017-11-16 08:09:26 UTC
Created attachment 1353272 [details]
valgrind output

Description of problem:
After the winbind package was upgraded from samba-winbind-3.6.23-43.el6_9 to samba-winbind-3.6.23-45.el6_9, the customer noticed that winbindd consumes a huge amount of memory until the oom-killer is invoked and kills the process.

$ dmesg|grep -i killed
Killed process pid, UID 0, (winbindd) total-vm:206912kB, anon-rss:572kB, file-rss:124kB
Killed process pid3, UID 0, (winbindd) total-vm:204672kB, anon-rss:296kB, file-rss:788kB
Killed process pid4, UID 0, (winbindd) total-vm:204836kB, anon-rss:292kB, file-rss:812kB
Killed process pid7, UID 0, (winbindd) total-vm:204836kB, anon-rss:292kB, file-rss:704kB

If the customer downgrades the winbind package to:
samba-winbind-3.6.23-43.el6_9.x86_64.rpm
the issue is not seen.

Issue seen with these packages:
# rpm -qa|grep samba
samba-winbind-3.6.23-45.el6_9.x86_64
samba4-libs-4.2.10-11.el6_9.x86_64
samba-common-3.6.23-45.el6_9.x86_64
samba-client-3.6.23-45.el6_9.x86_64
samba-winbind-clients-3.6.23-45.el6_9.x86_64

No Issues with these packages:
libsmbclient-3.6.23-43.el6_9.x86_64.rpm
samba4-libs-4.2.10-10.el6_9.x86_64.rpm
samba-client-3.6.23-43.el6_9.x86_64.rpm
samba-common-3.6.23-43.el6_9.x86_64.rpm
samba-winbind-3.6.23-43.el6_9.x86_64.rpm
samba-winbind-clients-3.6.23-43.el6_9.x86_64.rpm

Other Info:
1. SELinux is disabled.
2. The issue persists even if no AD user logs in to the samba server.
3. smb.conf has winbind offline logon = false.
4. The issue reappears a while after the winbind service is restarted; no user login is needed.
5. The customer is ready to test test RPMs.

Version-Release number of selected component (if applicable):
samba-winbind-3.6.23-45.el6_9.x86_64

How reproducible:
Always, in the customer environment.

Steps to Reproduce:
1. Set up samba on RHEL 6.5, authenticating to AD using winbind.
2. The winbind package should be samba-winbind-3.6.23-45.
3. Start winbind and try connecting to one of the shares.
4. A connection from a single user is enough to show winbind consuming a lot of memory.
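To make the growth visible while reproducing, the resident memory of each winbindd process can be sampled at intervals. The following is only a sketch: the helper names status_rss_kb and watch_winbindd_rss are hypothetical, and it assumes a Linux /proc filesystem and the pgrep utility are available.

```shell
# Hypothetical helper: read /proc/<pid>/status-style text on stdin
# and print the VmRSS value in kB.
status_rss_kb() {
    awk '/^VmRSS:/ { print $2 }'
}

# Hypothetical loop: print a timestamped RSS sample for every winbindd process.
# $1 = seconds between samples (default 60), $2 = number of samples (default 10).
watch_winbindd_rss() {
    interval=${1:-60}
    count=${2:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        for pid in $(pgrep -x winbindd); do
            # The process may exit between pgrep and the read; report "gone" then.
            rss=$(cat "/proc/$pid/status" 2>/dev/null | status_rss_kb)
            printf '%s pid=%s rss_kb=%s\n' "$(date '+%F %T')" "$pid" "${rss:-gone}"
        done
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, `watch_winbindd_rss 30 120 >> /tmp/winbindd-rss.log` would record one hour of samples; a steadily rising rss_kb for the main winbindd PID would match the behaviour described here.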

Actual results:
winbind leaks memory and consumes resources

Expected results:
winbind should not leak memory and consume resources

Additional info:

Comment 2 amitkuma 2017-11-16 08:31:06 UTC
The customer provided additional information:
the issue is also present on samba-winbind-3.6.23-44.

Comment 3 amitkuma 2017-11-17 06:14:56 UTC
Hello,

Another customer (Cargill Inc) using samba-3.6.23-45.el6_9 has run into winbind memory leak issues.

# tail -f /var/log/samba/winbindd.log
[2017/11/16 12:30:45.012275,  0] winbindd/winbindd_dual.c:1392(fork_domain_child)
  Could not fork: Cannot allocate memory
[2017/11/16 12:30:45.015812,  0] winbindd/winbindd_dual.c:1392(fork_domain_child)
  Could not fork: Cannot allocate memory
[2017/11/16 12:30:45.021137,  0] winbindd/winbindd_dual.c:1392(fork_domain_child)
  Could not fork: Cannot allocate memory
[2017/11/16 12:32:01.927146,  0] param/loadparm.c:8056(lp_do_parameter)
  Global parameter unix extensions found in service section!
[2017/11/16 12:32:01.937565,  0] winbindd/winbindd_cache.c:3204(initialize_winbindd_cache)
  initialize_winbindd_cache: clearing cache and re-creating with version number 2

# tail -f /var/log/messages
Nov 16 12:30:50 XXXX kernel: [20260]     0 20260     1026       17   2       0             0 tail
Nov 16 12:30:50 XXXX kernel: [30144]     0 30144   102022      262   3       0             0 LogMonitoring
Nov 16 12:30:50 XXXX kernel: Out of memory: Kill process 15343 (winbindd) score 637 or sacrifice child
Nov 16 12:30:50 XXXX kernel: Killed process 15343, UID 0, (winbindd) total-vm:39083488kB, anon-rss:22851136kB, file-rss:36kB
Nov 16 12:32:01 XXXX winbindd[9283]: [2017/11/16 12:32:01.927146,  0] param/loadparm.c:8056(lp_do_parameter)
Nov 16 12:32:01 XXXX winbindd[9283]:   Global parameter unix extensions found in service section!
Nov 16 12:32:01 XXXX winbindd[9285]: [2017/11/16 12:32:01.937565,  0] winbindd/winbindd_cache.c:3204(initialize_winbindd_cache)
Nov 16 12:32:01 XXXX winbindd[9285]:   initialize_winbindd_cache: clearing cache and re-creating with version number 2
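When comparing several hosts, it can help to pull just the PID and anonymous RSS out of the "Killed process" lines. The following is a minimal sketch; the function name oom_winbindd_rss is hypothetical, and it assumes the kernel message format shown in the log above.

```shell
# Hypothetical helper: read kernel log lines on stdin and print
# "<pid> <anon-rss in kB>" for every winbindd process killed by the oom-killer.
oom_winbindd_rss() {
    sed -n 's/.*Killed process \([^,]*\),.*(winbindd).*anon-rss:\([0-9]*\)kB.*/\1 \2/p'
}
```

It could be used as `grep -i killed /var/log/messages | oom_winbindd_rss`, or fed from dmesg output in the same way.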

The valgrind output was taken by running:
# /usr/bin/valgrind --trace-children=yes --show-reachable=yes --track-origins=yes --read-var-info=yes --tool=memcheck --leak-check=full --num-callers=50 -v --log-file=/tmp/valgrind.out /usr/sbin/winbindd -F -S

Thanks
Amit

Comment 4 Andreas Schneider 2017-11-21 08:42:35 UTC
Please always install the 'samba-debuginfo' package before you run valgrind!


USE: valgrind --tool=memcheck -v --num-callers=20 --track-origins=yes --trace-children=yes


DO NOT ADD: --leak-check or --show-reachable

Comment 34 Andreas Schneider 2018-01-23 15:12:30 UTC
From the previous information, it looks like the issue is in the main winbindd process (see 'ps afx'). Assuming this is correct, we need the following steps:

1. Upgrade the samba packages that had the issue.
2. Install samba-debuginfo libtevent-debuginfo
3. Enable samba debug logs

service winbind stop; service smb stop; service nmb stop;

     log level = 10
     debug pid = true
     max log size = 0

rm -f /var/log/samba/*
service smb start; service nmb start

4. Run valgrind --tool=memcheck -v --num-callers=20 --track-origins=yes --leak-check=full /usr/sbin/winbindd and wait for the issue to happen (the memory of winbindd keeps growing).

5. Provide the samba and valgrind logs, including the output of 'wbinfo --trusted-domains --verbose'.
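The service and valgrind commands from steps 3 and 4 could be wrapped in a small script so they are run in the same order each time. This is only a sketch: the run helper, the DRY_RUN variable, and the function names are hypothetical, and it assumes the three logging options from step 3 have already been added to smb.conf by hand. By default it only prints the commands.

```shell
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Steps 3 and 4: stop the services, clear old logs, restart smb and nmb,
# then start winbindd under valgrind with the options given in step 4.
collect_winbindd_leak_data() {
    run service winbind stop
    run service smb stop
    run service nmb stop
    # Assumption: smb.conf already contains log level = 10,
    # debug pid = true and max log size = 0.
    run sh -c 'rm -f /var/log/samba/*'
    run service smb start
    run service nmb start
    run valgrind --tool=memcheck -v --num-callers=20 --track-origins=yes \
        --leak-check=full /usr/sbin/winbindd
}

# Step 5: trusted-domain listing, to be run once the memory growth is visible.
collect_trusted_domains() {
    run wbinfo --trusted-domains --verbose
}
```

Calling `collect_winbindd_leak_data` with DRY_RUN left at 1 lists the commands for review; setting DRY_RUN=0 as root would actually execute them, including deleting the existing logs under /var/log/samba.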

Comment 36 Andreas Schneider 2018-02-08 11:20:29 UTC
Where can I find the logs?

Comment 43 Andreas Schneider 2018-02-20 12:48:09 UTC
Could I get the coredump please?

Comment 59 Andreas Schneider 2018-03-21 16:36:10 UTC
*** Bug 1558933 has been marked as a duplicate of this bug. ***

Comment 71 errata-xmlrpc 2018-06-19 05:08:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1860