Bug 743841 - SSSD can crash due to dbus server removing a UNIX socket
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: sssd
Version: 6.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Stephen Gallagher
QA Contact: IDM QE LIST
URL:
Whiteboard:
Depends On: 746265 754121
Blocks: 748554 748898 749255
Reported: 2011-10-06 09:38 UTC by Jakub Hrozek
Modified: 2020-05-02 16:27 UTC (History)

Fixed In Version: sssd-1.5.1-57.el6
Doc Type: Bug Fix
Doc Text:
Cause: SSSD components communicate using the DBus protocol. When a DBus server is initialized, the DBus library is given a file name that represents a known interface, and DBus creates the socket on server startup. When the server shuts down, it calls a DBus cleanup function which removes the socket.
Consequence: If one of the components is restarted, a race condition can cause the socket to be removed by the old component instance after the new instance is already running and connected to it.
Fix: Path names that contain the server process's PID are passed to DBus, and a symlink with a well-known path name points to the PID-decorated path name. Clients connect to the well-known symlink path.
Result: When the DBus server exits, it removes only its own PID-decorated path name. Clients keep connecting to the same well-known path regardless of which server instance the symlink points to.
Clone Of:
: 748898 (view as bug list)
Environment:
Last Closed: 2011-12-06 16:40:56 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
FedoraHosted SSSD | 1034 | 0 | None | None | None | Never
Github SSSD sssd issues | 2076 | 0 | None | closed | SSSD can crash due to dbus server removing a UNIX socket | 2020-11-03 16:06:45 UTC
Github SSSD sssd issues | 2077 | 0 | None | closed | SSSD can crash due to dbus server removing a UNIX socket | 2020-11-03 16:06:44 UTC
Red Hat Product Errata | RHBA-2011:1529 | 0 | normal | SHIPPED_LIVE | sssd bug fix and enhancement update | 2011-12-06 00:50:20 UTC

Description Jakub Hrozek 2011-10-06 09:38:07 UTC
Description of problem:
SSSD components communicate using a low-level DBus protocol. When the server is initialized, dbus is given a filename that represents a known interface and creates the socket. When the server shuts down, it calls a dbus cleanup function which removes the socket.

Normally this is OK. The problem arises when:
1) an sssd component (a back end, for example) is stopped
2) the monitor detects that it does not respond, kills it and spawns a new one
3) the old back end calls its shutdown function /after/ the new one has started
4) dbus removes the named socket, which by then belongs to the new back end
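
The sequence above can be reproduced in miniature without SSSD. The following is only a sketch under stated assumptions: it uses plain UNIX stream sockets rather than DBus, the helper names and the socket name sbus-dp_example are hypothetical, and it assumes the replacement server unlinks any stale file before binding. It illustrates why a late cleanup by the old instance breaks clients of the new one; it is not how SSSD or libdbus is implemented.

```python
import os
import socket
import tempfile


def start_server(path):
    """Bind a listening UNIX socket at `path`, replacing any stale file."""
    if os.path.lexists(path):
        os.unlink(path)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    return srv


def shutdown_server(srv, path):
    """Mimic a cleanup routine that removes the socket file by name."""
    srv.close()
    if os.path.lexists(path):
        os.unlink(path)


def can_connect(path):
    """Return True if a client can connect to the socket at `path`."""
    cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        cli.connect(path)
        return True
    except OSError:
        return False
    finally:
        cli.close()


def simulate_race():
    """Replay steps 1-4 and report client reachability before and after."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sbus-dp_example")  # hypothetical name
        old = start_server(path)      # original back end
        new = start_server(path)      # replacement spawned by the monitor
        before = can_connect(path)    # clients reach the new back end
        shutdown_server(old, path)    # old back end shuts down late
        after = can_connect(path)     # the shared path has been removed
        new.close()
        return before, after
```

Calling simulate_race() returns a pair of booleans: whether a client could connect before the old instance cleaned up, and whether it could connect afterwards.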


Version-Release number of selected component (if applicable):
sssd-1.5.1-53

How reproducible:
always

Steps to Reproduce:
1. start sssd
2. killall -STOP sssd_be
3. wait until monitor sends TERM to sssd_be and spawns a new one
4. killall -CONT sssd_be
5. getent passwd foobar
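
Steps 2 to 5 could be scripted roughly as below. This is a sketch only: the function names are hypothetical, the script must run as root on a host where sssd is already started, and the time the monitor needs to notice the stopped back end is not stated in this report, so the wait is left as a required argument.

```python
import subprocess
import time


def reproduction_commands(user="foobar"):
    """Return the commands from steps 2, 4 and 5 as argument lists."""
    return [
        ["killall", "-STOP", "sssd_be"],
        ["killall", "-CONT", "sssd_be"],
        ["getent", "passwd", user],
    ]


def reproduce(wait_seconds, user="foobar"):
    """Run the steps; `wait_seconds` covers step 3 and is an assumption."""
    stop, cont, lookup = reproduction_commands(user)
    subprocess.run(stop, check=True)    # step 2: freeze the back end
    time.sleep(wait_seconds)            # step 3: let the monitor respawn it
    subprocess.run(cont, check=True)    # step 4: resume the old back end
    # step 5: trigger a lookup that goes through sssd_nss
    return subprocess.run(lookup).returncode
```

After reproduce() returns, the presence of the socket and of the sssd_nss process can be checked by hand as described under the actual results.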
  
Actual results:
the socket /var/lib/sss/pipes/private/sbus-dp_$domain is removed
sssd_nss crashes

Expected results:
sssd only reconnects to the new back end, the old back end shuts down gracefully

Additional info:
The crash was found by Kaushik Banerjee

Comment 2 Stephen Gallagher 2011-10-06 11:46:54 UTC
Upstream ticket:
https://fedorahosted.org/sssd/ticket/1035

Comment 3 Stephen Gallagher 2011-10-06 11:47:58 UTC
Disregard the previous comment; the correct ticket is
https://fedorahosted.org/sssd/ticket/1034

Comment 6 Kaushik Banerjee 2011-10-17 12:20:40 UTC
No crashes seen anymore on going through the above steps.

Verified in build:
# rpm -qi sssd | head
Name        : sssd                         Relocations: (not relocatable)
Version     : 1.5.1                             Vendor: Red Hat, Inc.
Release     : 58.el6                        Build Date: Sat 15 Oct 2011 12:14:26 AM IST
Install Date: Mon 17 Oct 2011 12:00:06 PM IST      Build Host: x86-001.build.bos.redhat.com
Group       : Applications/System           Source RPM: sssd-1.5.1-58.el6.src.rpm
Size        : 3599114                          License: GPLv3+
Signature   : (none)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
URL         : http://fedorahosted.org/sssd/
Summary     : System Security Services Daemon

Comment 7 Jan Zeleny 2011-10-27 11:47:12 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: SSSD components communicate using DBus protocol. On initializing the server, DBus is given a file name that represents a known interface and creates the socket. When server shuts down, it calls a DBus cleanup function which removes the socket.
Consequence: In case one of components is restarted, a race condition can cause that the socket is removed by the old component instance after the new instance is already running and connected to it.
Fix: Symlinks are used for components to connect to the socket. When re-spawning, both old and new instance have their respective symlinks.
Result: The race condition doesn't occur any more and SSSD doesn't crash.

Comment 8 Jakub Hrozek 2011-10-27 12:11:46 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1,4 @@
-Cause: SSSD components communicate using DBus protocol. On initializing the server, DBus is given a file name that represents a known interface and creates the socket. When server shuts down, it calls a DBus cleanup function which removes the socket.
+Cause: SSSD components communicate using the DBus protocol. On initializing the DBus server, the DBus library is given a file name that represents a known interface. Dbus creates the socket on server startup. When server shuts down, it calls a DBus cleanup function which removes the socket.
 Consequence: In case one of components is restarted, a race condition can cause that the socket is removed by the old component instance after the new instance is already running and connected to it.
-Fix: Symlinks are used for components to connect to the socket. When re-spawning, both old and new instance have their respective symlinks.
+Fix: Path names that contain the server process's PID are passed to DBus and a symlink with a known and defined path name is pointed to the path name with PID. Clients connect to the well-known symlink paths.
-Result: The race condition doesn't occur any more and SSSD doesn't crash.
+Result: When the DBus server exits, it only removes the PID-decorated path name. Clients are still connected to the same path no matter what server is the symlink pointed to.
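
The scheme in the updated technical note (a PID-decorated socket path plus a well-known symlink) can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual SSSD patch: it uses plain UNIX sockets instead of DBus, the "<path>.<pid>" naming, the helper names and the PIDs 1001 and 1002 are hypothetical, and the atomic symlink swap via rename is one possible way to repoint the link.

```python
import os
import socket
import tempfile


def start_server(known_path, pid):
    """Listen on a PID-decorated path and point the known symlink at it."""
    private_path = f"{known_path}.{pid}"  # assumed naming scheme
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(private_path)
    srv.listen(1)
    # Create the new link under a temporary name, then rename it over the
    # well-known name so clients never observe a missing path.
    tmp_link = f"{known_path}.tmp.{pid}"
    os.symlink(private_path, tmp_link)
    os.replace(tmp_link, known_path)
    return srv, private_path


def shutdown_server(srv, private_path):
    """Remove only this instance's own PID-decorated socket."""
    srv.close()
    if os.path.lexists(private_path):
        os.unlink(private_path)


def can_connect(path):
    """Return True if a client can connect via `path` (symlinks followed)."""
    cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        cli.connect(path)
        return True
    except OSError:
        return False
    finally:
        cli.close()


def simulate_fix():
    """Restart a server and let the old instance clean up late."""
    with tempfile.TemporaryDirectory() as tmp:
        known = os.path.join(tmp, "sbus-dp_example")  # hypothetical name
        old, old_path = start_server(known, 1001)     # hypothetical PIDs
        new, new_path = start_server(known, 1002)
        shutdown_server(old, old_path)                # late cleanup
        reachable = can_connect(known)
        target = os.readlink(known)
        shutdown_server(new, new_path)
        return reachable, target
```

In this sketch the old instance's late cleanup deletes only its own "<path>.1001" file, so the symlink still resolves to the new instance's socket and clients that use the well-known path are unaffected.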

Comment 9 errata-xmlrpc 2011-12-06 16:40:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1529.html

