Description of problem:
SSSD components communicate using a low-level DBus protocol. On initializing the server, DBus is given a file name that represents a known interface and creates the socket. When the server shuts down, it calls a DBus cleanup function which removes the socket. Normally this is fine; the problem arises when:
1) an SSSD component (a back end, for example) is stopped
2) the monitor detects that it does not respond, kills it and spawns a new one
3) the old back end calls its shutdown function /after/ the new one has started
4) dbus removes the named socket, which the new back end is now using

Version-Release number of selected component (if applicable):
sssd-1.5.1-53

How reproducible:
always

Steps to Reproduce:
1. start sssd
2. killall -STOP sssd_be
3. wait until the monitor sends TERM to sssd_be and spawns a new one
4. killall -CONT sssd_be
5. getent passwd foobar

Actual results:
the socket /var/lib/sss/pipes/private/sbus-dp_$domain is removed
sssd_nss crashes

Expected results:
sssd only reconnects to the new back end, the old back end shuts down gracefully

Additional info:
The crash was found by Kaushik Banerjee
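The sequence in steps 1) to 4) above can be reproduced in miniature outside of SSSD. The following is only a rough sketch, not SSSD or DBus code: it assumes the replacement server clears the stale path before binding, and the helper names (start_server, shutdown_server, demo_race) and the socket name sbus-dp_example are made up for illustration.

```python
import os
import socket
import tempfile


def start_server(path):
    """Bind a listening Unix socket at `path`, replacing any stale entry.

    Assumption: the replacement process clears the old path before binding.
    """
    if os.path.lexists(path):
        os.unlink(path)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    return srv


def shutdown_server(srv, path):
    """Mimic a cleanup routine that always removes the well-known path."""
    srv.close()
    if os.path.lexists(path):
        os.unlink(path)


def demo_race():
    """Return whether the socket path survives the late shutdown."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "sbus-dp_example")  # hypothetical name

    old = start_server(path)    # original back end
    new = start_server(path)    # replacement spawned by the monitor
    shutdown_server(old, path)  # old back end shuts down late

    still_there = os.path.lexists(path)
    new.close()
    os.rmdir(workdir)
    return still_there


if __name__ == "__main__":
    # The new server is still listening, but its path is gone.
    print("socket path still present:", demo_race())
```

Because both instances share one path, the old instance's cleanup cannot tell that the file it removes now belongs to its replacement.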
Upstream ticket: https://fedorahosted.org/sssd/ticket/1035
Disregard the previous comment, the correct bug is https://fedorahosted.org/sssd/ticket/1034
No crashes seen anymore on going through the above steps.

Verified in build:
# rpm -qi sssd | head
Name        : sssd                         Relocations: (not relocatable)
Version     : 1.5.1                             Vendor: Red Hat, Inc.
Release     : 58.el6                        Build Date: Sat 15 Oct 2011 12:14:26 AM IST
Install Date: Mon 17 Oct 2011 12:00:06 PM IST      Build Host: x86-001.build.bos.redhat.com
Group       : Applications/System           Source RPM: sssd-1.5.1-58.el6.src.rpm
Size        : 3599114                          License: GPLv3+
Signature   : (none)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
URL         : http://fedorahosted.org/sssd/
Summary     : System Security Services Daemon
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: SSSD components communicate using DBus protocol. On initializing the server, DBus is given a file name that represents a known interface and creates the socket. When server shuts down, it calls a DBus cleanup function which removes the socket.
Consequence: In case one of components is restarted, a race condition can cause that the socket is removed by the old component instance after the new instance is already running and connected to it.
Fix: Symlinks are used for components to connect to the socket. When re-spawning, both old and new instance have their respective symlinks.
Result: The race condition doesn't occur any more and SSSD doesn't crash.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,4 +1,4 @@
-Cause: SSSD components communicate using DBus protocol. On initializing the server, DBus is given a file name that represents a known interface and creates the socket. When server shuts down, it calls a DBus cleanup function which removes the socket.
+Cause: SSSD components communicate using the DBus protocol. On initializing the DBus server, the DBus library is given a file name that represents a known interface. Dbus creates the socket on server startup. When server shuts down, it calls a DBus cleanup function which removes the socket.
 Consequence: In case one of components is restarted, a race condition can cause that the socket is removed by the old component instance after the new instance is already running and connected to it.
-Fix: Symlinks are used for components to connect to the socket. When re-spawning, both old and new instance have their respective symlinks.
+Fix: Path names that contain the server process's PID are passed to DBus and a symlink with a known and defined path name is pointed to the path name with PID. Clients connect to the well-known symlink paths.
-Result: The race condition doesn't occur any more and SSSD doesn't crash.
+Result: When the DBus server exits, it only removes the PID-decorated path name. Clients are still connected to the same path no matter what server is the symlink pointed to.
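The fix described in the updated note (a PID-decorated socket path plus a well-known symlink) could look roughly like the sketch below. This is not the SSSD patch: the helper names, the ".<pid>" suffix format, the fake PIDs 1001 and 1002, and the use of a temporary link followed by a rename to repoint the symlink are all assumptions made for illustration.

```python
import os
import socket
import tempfile


def start_server(link_path, pid):
    """Bind at a PID-decorated path and point the well-known symlink at it."""
    real_path = f"{link_path}.{pid}"  # hypothetical naming scheme
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(real_path)
    srv.listen(1)
    # Build the new symlink under a temporary name, then rename it over
    # the well-known name so clients never see a missing link.
    tmp_link = f"{link_path}.tmp.{pid}"
    os.symlink(real_path, tmp_link)
    os.replace(tmp_link, link_path)
    return srv, real_path


def shutdown_server(srv, real_path):
    """Remove only this instance's PID-decorated path, never the symlink."""
    srv.close()
    if os.path.lexists(real_path):
        os.unlink(real_path)


def demo_fix():
    """Return (client_connected, symlink_points_to_new_server)."""
    workdir = tempfile.mkdtemp()
    link = os.path.join(workdir, "sbus-dp_example")  # hypothetical name

    # Fake PIDs stand in for two separate processes.
    old, old_path = start_server(link, 1001)
    new, new_path = start_server(link, 1002)
    shutdown_server(old, old_path)  # late shutdown of the old instance

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(link)  # resolves through the symlink
        connected = True
    except OSError:
        connected = False
    finally:
        client.close()

    points_to_new = os.readlink(link) == new_path

    shutdown_server(new, new_path)
    os.unlink(link)
    os.rmdir(workdir)
    return connected, points_to_new


if __name__ == "__main__":
    print(demo_fix())
```

Since each instance deletes only the path that carries its own PID, a late shutdown of the old instance leaves both the symlink and the new instance's socket in place.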
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2011-1529.html