Bug 1274633 - SSSD is not closing sockets properly
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: sssd
Version: 6.0
Hardware: Unspecified Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Lukas Slebodnik
QA Contact: Namita Soman
Docs Contact:
Depends On:
Blocks: 1172231 1272422
Reported: 2015-10-23 04:27 EDT by Jakub Hrozek
Modified: 2016-05-10 16:24 EDT

See Also:
Fixed In Version: sssd-1.13.2-1.el6
Doc Type: Bug Fix
Last Closed: 2016-05-10 16:24:48 EDT


Attachments
lsof output from cu rhel 7.2 system (65.76 KB, text/plain)
2016-03-30 10:43 EDT, kludhwan

Description Jakub Hrozek 2015-10-23 04:27:55 EDT
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/sssd/ticket/2792

After a certain period of time, the sssd_be process tends to run out of system resources, hitting the maximum number of open files. It turns out that it leaves behind many sockets that were not properly closed:
{{{
[root@login05 ondrejv]# ls -l /proc/`pgrep sssd_be`/fd/
total 0
lrwx------ 1 root root 64 Sep 16 09:50 0 -> /dev/null
lrwx------ 1 root root 64 Sep 16 09:50 1 -> /dev/null
lr-x------ 1 root root 64 Sep 16 09:50 10 -> inotify
lrwx------ 1 root root 64 Sep 16 09:50 100 -> socket:[158904170]
lrwx------ 1 root root 64 Sep 16 09:50 101 -> socket:[174418463]
lrwx------ 1 root root 64 Sep 16 09:50 102 -> socket:[158905068]
lrwx------ 1 root root 64 Sep 16 09:50 103 -> socket:[158906757]
lrwx------ 1 root root 64 Sep 16 09:50 104 -> socket:[166590564]
lrwx------ 1 root root 64 Sep 16 09:50 105 -> socket:[166990886]
lrwx------ 1 root root 64 Sep 16 09:50 106 -> socket:[174408591]
lrwx------ 1 root root 64 Sep 16 09:50 107 -> socket:[166991626]
lrwx------ 1 root root 64 Sep 16 09:50 108 -> socket:[166992492]
lrwx------ 1 root root 64 Sep 16 09:50 109 -> socket:[166993970]
l-wx------ 1 root root 64 Sep 16 09:50 11 -> /var/log/sssd/sssd_default.log
lrwx------ 1 root root 64 Sep 16 09:50 110 -> socket:[168650278]
lrwx------ 1 root root 64 Sep 16 09:50 111 -> socket:[172443487]
lrwx------ 1 root root 64 Sep 16 09:50 112 -> socket:[187704428]
lrwx------ 1 root root 64 Sep 16 09:50 113 -> socket:[182234435]
lrwx------ 1 root root 64 Sep 16 09:50 114 -> socket:[179190714]
lrwx------ 1 root root 64 Sep 16 09:50 115 -> socket:[184308683]
lrwx------ 1 root root 64 Sep 16 09:50 116 -> socket:[189328893]
lrwx------ 1 root root 64 Sep 16 09:50 117 -> socket:[189353567]
lrwx------ 1 root root 64 Sep 16 09:50 118 -> socket:[189343260]
lrwx------ 1 root root 64 Sep 16 09:50 119 -> socket:[189374973]
lrwx------ 1 root root 64 Sep 16 09:50 12 -> [eventpoll]
lrwx------ 1 root root 64 Sep 16 09:50 120 -> socket:[189367203]
lrwx------ 1 root root 64 Sep 16 09:50 121 -> socket:[189407282]
lrwx------ 1 root root 64 Sep 16 09:50 122 -> socket:[189389422]
lrwx------ 1 root root 64 Sep 16 09:50 123 -> socket:[189398015]
lrwx------ 1 root root 64 Sep 16 09:50 124 -> socket:[189321826]
lrwx------ 1 root root 64 Sep 16 09:50 125 -> socket:[191857375]
lrwx------ 1 root root 64 Sep 16 09:50 126 -> socket:[191669942]
lrwx------ 1 root root 64 Sep 16 09:50 127 -> socket:[189869722]
lrwx------ 1 root root 64 Sep 16 09:50 128 -> socket:[191848106]
lrwx------ 1 root root 64 Sep 16 09:50 129 -> socket:[189870357]
lrwx------ 1 root root 64 Sep 16 09:50 13 -> /var/lib/sss/db/cache_default.ldb
lrwx------ 1 root root 64 Sep 16 09:50 130 -> socket:[195279172]
lrwx------ 1 root root 64 Sep 16 09:50 131 -> socket:[209696954]
lrwx------ 1 root root 64 Sep 16 09:50 132 -> socket:[198582238]
lrwx------ 1 root root 64 Sep 16 09:50 133 -> socket:[202870686]
lrwx------ 1 root root 64 Sep 16 09:50 134 -> socket:[202881663]
lrwx------ 1 root root 64 Sep 16 09:50 135 -> socket:[218798231]
lrwx------ 1 root root 64 Sep 16 09:50 136 -> socket:[215428278]
lrwx------ 1 root root 64 Sep 16 09:50 137 -> socket:[220534921]
lrwx------ 1 root root 64 Sep 16 09:50 138 -> socket:[218807007]
lrwx------ 1 root root 64 Sep 16 09:50 139 -> socket:[218817269]
lrwx------ 1 root root 64 Sep 16 09:50 14 -> socket:[22419983]
lrwx------ 1 root root 64 Sep 16 09:50 140 -> socket:[220525178]
lrwx------ 1 root root 64 Sep 16 09:50 141 -> socket:[220549570]
lrwx------ 1 root root 64 Sep 16 09:50 142 -> socket:[222279094]
lrwx------ 1 root root 64 Sep 16 09:50 143 -> socket:[230783349]
lrwx------ 1 root root 64 Sep 16 09:50 144 -> socket:[225518746]
lrwx------ 1 root root 64 Sep 16 09:50 145 -> socket:[230774051]
lrwx------ 1 root root 64 Sep 16 09:50 146 -> socket:[237821496]
lrwx------ 1 root root 64 Sep 16 09:50 147 -> socket:[237813267]
lrwx------ 1 root root 64 Sep 16 09:50 148 -> socket:[251038692]
lrwx------ 1 root root 64 Sep 16 09:50 149 -> socket:[239610812]
lrwx------ 1 root root 64 Sep 16 09:50 15 -> socket:[22419986]
lrwx------ 1 root root 64 Sep 16 09:50 150 -> socket:[239547005]
lrwx------ 1 root root 64 Sep 16 09:50 151 -> socket:[239621136]
lrwx------ 1 root root 64 Sep 16 09:50 152 -> socket:[242866383]
lrwx------ 1 root root 64 Sep 16 09:50 153 -> socket:[247910227]
lrwx------ 1 root root 64 Sep 16 09:50 154 -> socket:[248831180]
lrwx------ 1 root root 64 Sep 16 09:50 155 -> socket:[248050545]
lrwx------ 1 root root 64 Sep 16 09:50 156 -> socket:[248975173]
lrwx------ 1 root root 64 Sep 16 09:50 157 -> socket:[253786932]
lrwx------ 1 root root 64 Sep 16 09:50 158 -> socket:[253778043]
lrwx------ 1 root root 64 Sep 16 09:50 159 -> socket:[253633523]
lrwx------ 1 root root 64 Sep 16 09:50 16 -> socket:[22419990]
lrwx------ 1 root root 64 Sep 16 09:50 160 -> socket:[253782320]
lrwx------ 1 root root 64 Sep 16 09:50 161 -> socket:[264670302]
lrwx------ 1 root root 64 Sep 16 09:50 162 -> socket:[253946603]
lrwx------ 1 root root 64 Sep 16 09:50 163 -> socket:[258063353]
lrwx------ 1 root root 64 Sep 16 09:50 164 -> socket:[253946837]
lrwx------ 1 root root 64 Sep 16 09:50 165 -> socket:[253947206]
lrwx------ 1 root root 64 Sep 16 09:50 166 -> socket:[253947641]
lrwx------ 1 root root 64 Sep 16 09:50 167 -> socket:[260200790]
lrwx------ 1 root root 64 Sep 16 09:50 168 -> socket:[258786272]
lrwx------ 1 root root 64 Sep 16 09:50 169 -> socket:[259133490]
l-wx------ 1 root root 64 Sep 16 09:50 17 -> /var/log/sssd/ldap_child.log
lrwx------ 1 root root 64 Sep 16 09:50 170 -> socket:[266886828]
lrwx------ 1 root root 64 Sep 16 09:50 171 -> socket:[266890082]
lrwx------ 1 root root 64 Sep 16 09:50 172 -> socket:[270535370]
lrwx------ 1 root root 64 Sep 16 09:50 173 -> socket:[274378300]
lrwx------ 1 root root 64 Sep 16 09:50 174 -> socket:[270631237]
lrwx------ 1 root root 64 Sep 16 09:50 175 -> socket:[270819551]
....
}}}
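
The listing above can be condensed into a single summary line. The following is a minimal sketch, not part of the original report: the function name `fd_usage` and its output format are assumptions made for illustration. It counts the symlinks under a process's `fd` directory, counts how many of them point at sockets, and reads the soft "Max open files" limit from the `limits` file in the same directory.

```shell
# Summarise descriptor usage for one process directory such as /proc/1234.
# Prints: sockets=<n> total=<n> soft_limit=<n>
# (fd_usage is a hypothetical helper name, not an sssd tool.)
fd_usage() {
    proc_dir=$1
    sockets=0
    total=0
    for link in "$proc_dir"/fd/*; do
        # Skip the unexpanded glob when the directory is empty or unreadable.
        [ -L "$link" ] || continue
        total=$((total + 1))
        # Socket descriptors are symlinks whose target looks like socket:[inode].
        case $(readlink "$link") in
            socket:*) sockets=$((sockets + 1)) ;;
        esac
    done
    # The soft limit is the fourth field of the "Max open files" row.
    limit=$(awk '/^Max open files/ { print $4 }' "$proc_dir/limits")
    echo "sockets=$sockets total=$total soft_limit=$limit"
}
```

Run as root, it could be called as `fd_usage "/proc/$(pgrep -o sssd_be)"`, where `-o` selects only the oldest matching process. A socket count that keeps growing toward the soft limit would match the symptom described here.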
Comment 1 Jakub Hrozek 2015-10-23 04:29:29 EDT
Dev acking, fix available upstream.
Comment 2 Jakub Hrozek 2015-10-30 02:53:45 EDT
Please add steps to reproduce so that we can qa_ack.
Comment 3 Lukas Slebodnik 2015-10-30 07:36:05 EDT
My reproducer:
* Two Active Directory servers with sites, so that sssd sometimes connects to server A
  and sometimes to server B.
* Block the connection to one server:

[root@host sssd]# iptables -n -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
DROP       tcp  --  0.0.0.0/0            10.12.0.158          tcp dpt:389

* Force sssd to go offline and back online by sending signals to the sssd process: -USR1 (offline) and -USR2 (online).

You might be able to reproduce it even with plain LDAP, so it should not be necessary to test with AD. Moreover, it should be simpler to reproduce with LDAP, because sssd will not connect to a different AD site and will try to connect to the blocked LDAP port every time.

It might help to set the option values so that
  dns_resolver_timeout < ldap_network_timeout
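
The offline/online cycling step can be sketched as a small shell function. This is only an illustration: the function name `toggle_offline_online`, the default cycle count, and the default delay are assumptions and not part of the original reproducer. The signals are the ones named above (SIGUSR1 for offline, SIGUSR2 for online).

```shell
# Repeatedly push a process offline (SIGUSR1) and back online (SIGUSR2).
# Usage: toggle_offline_online <pid> [cycles] [delay_seconds]
# The defaults of 5 cycles and 10 seconds are arbitrary illustrative values.
toggle_offline_online() {
    pid=$1
    cycles=${2:-5}
    delay=${3:-10}
    i=0
    while [ "$i" -lt "$cycles" ]; do
        # Stop early if the process has gone away or the signal is refused.
        kill -USR1 "$pid" || return 1
        sleep "$delay"
        kill -USR2 "$pid" || return 1
        sleep "$delay"
        i=$((i + 1))
    done
}
```

For example, `toggle_offline_online "$(pgrep -o -x sssd)" 10 15` would cycle the sssd monitor process ten times with 15 seconds between signals. The DROP rule shown in the listing could presumably be created with `iptables -A OUTPUT -p tcp -d 10.12.0.158 --dport 389 -j DROP`.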
Comment 4 Lukas Slebodnik 2015-11-05 04:39:48 EST
master:
* a10f67d4c64f3b1243de5d86a996475361adf0ac 

sssd-1-13:
* db2fdba6f3cecd0612439988e61be60d5d8576bf 

sssd-1-12:
* 2136f71c94660bcdde83f80feb83734389d57674
Comment 8 Dan Lavu 2016-03-22 15:17:00 EDT
Verified against sssd-client-1.13.3-19.el6.x86_64; sockets are closing correctly.

[root@sssdqe5 ~]# nslookup ad2.domain.com
Server:		192.168.51.4
Address:	192.168.51.4#53

Name:	ad2.domain.com
Address: 192.168.51.5

[root@sssdqe5 ~]# iptables -A INPUT -s 192.168.51.5 -j DROP

Made the process go offline/online several times, and there are no additional sockets.


[root@sssdqe5 ~]# ls -l /proc/`pgrep sssd_be`/fd/
ls: cannot access 5515: No such file or directory
ls: cannot access 5524: No such file or directory
ls: cannot access 5537/fd/: No such file or directory
/proc/5420:
total 0
dr-xr-xr-x. 2 root root 0 Mar 22 15:11 attr
-rw-r--r--. 1 root root 0 Mar 22 15:11 autogroup
-r--------. 1 root root 0 Mar 22 15:11 auxv
-r--r--r--. 1 root root 0 Mar 22 15:11 cgroup
--w-------. 1 root root 0 Mar 22 15:11 clear_refs
-r--r--r--. 1 root root 0 Mar 22 15:05 cmdline
-rw-r--r--. 1 root root 0 Mar 22 15:11 comm
-rw-r--r--. 1 root root 0 Mar 22 15:11 coredump_filter
-r--r--r--. 1 root root 0 Mar 22 15:11 cpuset
lrwxrwxrwx. 1 root root 0 Mar 22 15:11 cwd -> /
-r--------. 1 root root 0 Mar 22 15:11 environ
lrwxrwxrwx. 1 root root 0 Mar 22 15:11 exe -> /usr/libexec/sssd/sssd_be
dr-x------. 2 root root 0 Mar 22 15:11 fd
dr-x------. 2 root root 0 Mar 22 15:11 fdinfo
-r--------. 1 root root 0 Mar 22 15:11 io
-rw-------. 1 root root 0 Mar 22 15:11 limits
-rw-r--r--. 1 root root 0 Mar 22 15:11 loginuid
-r--r--r--. 1 root root 0 Mar 22 15:11 maps
-rw-------. 1 root root 0 Mar 22 15:11 mem
-r--r--r--. 1 root root 0 Mar 22 15:11 mountinfo
-r--r--r--. 1 root root 0 Mar 22 15:11 mounts
-r--------. 1 root root 0 Mar 22 15:11 mountstats
dr-xr-xr-x. 5 root root 0 Mar 22 15:11 net
dr-x--x--x. 2 root root 0 Mar 22 15:11 ns
-r--r--r--. 1 root root 0 Mar 22 15:11 numa_maps
-rw-r--r--. 1 root root 0 Mar 22 15:11 oom_adj
-r--r--r--. 1 root root 0 Mar 22 15:11 oom_score
-rw-r--r--. 1 root root 0 Mar 22 15:11 oom_score_adj
-r--r--r--. 1 root root 0 Mar 22 15:11 pagemap
-r--r--r--. 1 root root 0 Mar 22 15:11 personality
lrwxrwxrwx. 1 root root 0 Mar 22 15:11 root -> /
-rw-r--r--. 1 root root 0 Mar 22 15:11 sched
-r--r--r--. 1 root root 0 Mar 22 15:11 schedstat
-r--r--r--. 1 root root 0 Mar 22 15:11 sessionid
-r--r--r--. 1 root root 0 Mar 22 15:11 smaps
-r--r--r--. 1 root root 0 Mar 22 15:11 stack
-r--r--r--. 1 root root 0 Mar 22 15:05 stat
-r--r--r--. 1 root root 0 Mar 22 15:11 statm
-r--r--r--. 1 root root 0 Mar 22 15:03 status
-r--r--r--. 1 root root 0 Mar 22 15:11 syscall
dr-xr-xr-x. 3 root root 0 Mar 22 15:11 task
-r--r--r--. 1 root root 0 Mar 22 15:11 wchan
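
Judging by the `ls` errors above, `pgrep sssd_be` appears to have returned several PIDs (5420, 5515, 5524, 5537), so the path expanded to `/proc/5420 5515 5524 5537/fd/` and `ls` listed `/proc/5420` itself rather than any `fd` directory. A minimal sketch that handles each PID separately follows; the function name `socket_counts` and its output format are assumptions made for illustration.

```shell
# Print "<pid> <socket count>" for every PID given,
# reading the descriptors under <proc_root>/<pid>/fd.
# Usage: socket_counts <proc_root> <pid>...
socket_counts() {
    proc_root=$1
    shift
    for pid in "$@"; do
        count=0
        for link in "$proc_root/$pid"/fd/*; do
            # Skip the unexpanded glob when there is nothing to read.
            [ -L "$link" ] || continue
            case $(readlink "$link") in
                socket:*) count=$((count + 1)) ;;
            esac
        done
        echo "$pid $count"
    done
}
```

Run as root, `socket_counts /proc $(pgrep sssd_be)` prints one line per backend process. Comparing the counts before and after several offline/online cycles would show whether sockets accumulate.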
Comment 9 kludhwan 2016-03-30 10:43 EDT
Created attachment 1141778
lsof output from cu rhel 7.2 system

Hello,  

I have a customer who seems to be facing a similar issue on his RHEL 7.2 system.

Is there a Bugzilla for RHEL 7 as well?

Thanks,
Kushal
Comment 10 Jakub Hrozek 2016-03-30 11:15:45 EDT
(In reply to kludhwan from comment #9)
> Created attachment 1141778 [details]
> lsof output from cu rhel 7.2 system
> 
> Hello,  
> 
> I have a cu that seems to facing the similar issue on his rhel 7.2 system.
> 
> Do we have bugzilla for rhel 7 as well?
> 
> Thanks,
> Kushal

https://bugzilla.redhat.com/show_bug.cgi?id=1313014
Comment 12 errata-xmlrpc 2016-05-10 16:24:48 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0782.html
