Red Hat Bugzilla – Bug 1110369
sssd is started before messagebus, making sssd-ifp fail
Last modified: 2014-10-14 00:48:45 EDT
Description of problem:
The /etc/init.d/sssd script defines
# chkconfig: - 12 88
and /etc/init.d/messagebus defines
# chkconfig: 345 22 85
so dbus starts after sssd. If the ifp is enabled, it will be stopped immediately during boot or init 3 because dbus is not yet running.

Version-Release number of selected component (if applicable):
# rpm -qf /etc/init.d/sssd /etc/init.d/messagebus
sssd-common-1.11.6-1.el6.x86_64
dbus-1.2.24-7.el6_3.x86_64

How reproducible:
Deterministic.

Steps to Reproduce:
1. Have an IPA-enrolled machine or otherwise get sssd configured and enabled.
2. Check that chkconfig --list | grep -E 'messagebus|sssd' lists both as on.
3. Add ifp to the services list in the [sssd] section of /etc/sssd/sssd.conf.
4. Run tail -f /var/log/sssd/sssd.log & to see what is going on.
5. Run service messagebus stop ; service sssd stop ; init 3
6. Check if sssd_ifp is running: ps axuw | grep ifp

Actual results:
Stopping system message bus: [ OK ]
Stopping sssd: [ OK ]
(Tue Jun 17 09:50:00 2014) [sssd] [mt_svc_exit_handler] (0x0010): Process [ifp], definitely stopped!
No ifp process.

Expected results:
No error message, sssd_ifp still running:
root 16070 0.0 0.0 199860 2916 ? S 09:50 0:00 /usr/libexec/sssd/sssd_ifp --debug-to-files

Additional info:
One possibility is to change the order, but maybe messagebus wants sssd to be running? Another possibility is for sssd_ifp not to bail out when it does not find dbus running. Yet another possibility is to somehow start sssd in stages -- first the non-dbus stuff, and then, after dbus is up, the dbus part.
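The ordering problem in the description can be checked mechanically. The following is a minimal sketch, not part of sssd or chkconfig: it parses the "# chkconfig:" header of an initscript and compares start priorities. The function names (start_priority, starts_before) are hypothetical, and the two header strings are the ones quoted in this report.

```python
import re

# Matches an initscript header such as "# chkconfig: 345 22 85":
# runlevels, start priority, stop priority.
CHKCONFIG_RE = re.compile(
    r"^#\s*chkconfig:\s*(\S+)\s+(\d+)\s+(\d+)", re.MULTILINE
)


def start_priority(script_text):
    """Return the start priority declared in an initscript, or None."""
    match = CHKCONFIG_RE.search(script_text)
    return int(match.group(2)) if match else None


def starts_before(first_text, second_text):
    """True if the first script's start priority is lower than the second's."""
    first = start_priority(first_text)
    second = start_priority(second_text)
    if first is None or second is None:
        raise ValueError("missing '# chkconfig:' header")
    return first < second


# Header lines as quoted in this report.
SSSD_HEADER = "# chkconfig: - 12 88\n"
MESSAGEBUS_HEADER = "# chkconfig: 345 22 85\n"

if __name__ == "__main__":
    # sssd has priority 12 and messagebus 22, so sssd is started first.
    print(starts_before(SSSD_HEADER, MESSAGEBUS_HEADER))
```

On a real system the same functions could be fed the contents of /etc/init.d/sssd and /etc/init.d/messagebus instead of the quoted headers.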
The issue can likely also be observed by rebooting the machine.
We're debating the right solution with the DBus developers. So far, one option that came up was making SSSD poll for the system bus: https://fedorahosted.org/sssd/ticket/2360
The problem with polling is that, on a fast machine, httpd could be started before the next poll and attempt to serve a request, and the dbus call would fail. Polling is good for recovering from some errors, but for a consistent boot sequence it might be good to also have an explicit chkconfig entry after messagebus but before things like httpd to poke sssd to retry right away. HUP or USR1, maybe?
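To make the combination of polling and an explicit "retry now" poke concrete, here is a rough sketch. It is not how sssd implements anything; wait_for_bus, bus_is_up and wake are hypothetical names. The loop polls at a fixed interval, but a set event cuts the wait short so the next check happens right away. Wiring the event to an actual HUP or USR1 delivery is deliberately left out.

```python
import threading


def wait_for_bus(bus_is_up, wake, interval=5.0, max_polls=None):
    """Poll bus_is_up() until it returns True.

    Between polls, wait up to `interval` seconds, but stop waiting as
    soon as `wake` (a threading.Event) is set by whatever delivers the
    "retry now" notification. Returns the number of polls made, or
    None if max_polls was exhausted without success.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        if bus_is_up():
            return polls
        wake.wait(timeout=interval)  # returns early if wake is set
        wake.clear()                 # re-arm for the next notification
    return None
```

With only the timer, a service started between two polls can still see the dbus call fail; the explicit poke is what closes that window.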
Stef, do you know of any DBus service that is in the same area as SSSD, that is, both an identity provider and a DBus service? I wish we could simply put a watch on the DBus system bus socket and only start the IFP service then, but inotify can't watch for nonexistent files (obviously).
Upstream ticket: https://fedorahosted.org/sssd/ticket/2360
I talked to Marius a bit today on IRC and he had an interesting suggestion a bit along the lines of what Jan suggested in comment #5. The proposal was to let DBus (not an initscript entry) start a simple binary that would SIGHUP the sssd and let it know it's time to spawn the dbus service. I'll try to experiment with this and start a thread on sssd-devel. btw the reason I don't like simply starting messagebus before sssd is that sssd users might be included in places like the interface policy XML. Clearly dbus consumes identities at that point, so the identities should be resolvable.
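As an illustration of the helper idea only (the actual patches may look quite different), a tiny notifier could read the sssd PID from a PID file and send SIGHUP. This sketch assumes the monitor writes its PID to a file; the path used in the usage line, /var/run/sssd.pid, is an assumption, as are the names read_pid and nudge_sssd.

```python
import os
import signal


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if missing or malformed."""
    try:
        with open(pid_file) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def nudge_sssd(pid_file, send=os.kill):
    """Send SIGHUP to the process named in pid_file; True on success.

    `send` defaults to os.kill and is a parameter only so the function
    can be exercised without signalling a real process.
    """
    pid = read_pid(pid_file)
    if pid is None or pid <= 0:
        return False
    try:
        send(pid, signal.SIGHUP)
    except OSError:
        # Process is gone or we lack permission to signal it.
        return False
    return True


if __name__ == "__main__":
    # Assumed PID file location; adjust to the local installation.
    raise SystemExit(0 if nudge_sssd("/var/run/sssd.pid") else 1)
```

Whether SIGHUP is the right signal, or whether one of the sssd unix sockets is a better channel, is exactly what is being discussed in this bug.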
(In reply to Jakub Hrozek from comment #8)
> I talked to Marius a bit today on IRC and he had an interesting suggestion a
> bit along the lines of what Jan suggested in comment #5.
>
> The proposal was to let DBus (not an initscript entry) start a simple binary
> that would SIGHUP the sssd and let it know it's time to spawn the dbus
> service. I'll try to experiment with this and start a thread on sssd-devel.

Yes, not a bad idea. Although you could use any other means of communication, such as the usual sssd unix sockets as well.
Hi Jan, given that sssd-dbus is mostly used by Satellite in RHEL6, would the Sat QE be available to help with testing?
Patches are available for review on the upstream sssd-devel list.
Just for reference, here are the commits that fixed the bug:

* master:
  1a59af8245f183f22d87d067a90197d8e2ea958d
  1746e8b8399da2a7a8da4aace186f66055ccfec1
  149f40dc2d4ead57811c70b5028648ac83f6a1a7
  b76419cf8830440b46c20a15585562343c7b1924
  0c1d65998907930678da2d091789446f2c344d5d
  1f2507e1fd089f2bf3458cfb4faeaa9669d72f98
  4df1a6a977df74420867d9b1daddcca0eea4b2e1
* sssd-1-11:
  80af7e9daed52b283af037864bcdd86d96695618
  42b0c3442815c0374735377c7f5ced4fe1a00e97
  87d3c7d23885b0b6dca3d7cf0494c7b93225429c
  fbc3f000ca0672bb18797201768bd13e5611eaad
  3e57c78c8163f6ee395bdf34b1e2c550cd8467f1
  727f4bf4829f2405c978a4c9b960bef3ad86b002
  906177a2666bf360a3d85fec55fc942cf9b33163
Verified with version 1.11.6-29.el6

# chkconfig --list | grep -E 'messagebus|sssd'
messagebus 0:off 1:off 2:on 3:on 4:on 5:on 6:off
sssd 0:off 1:off 2:off 3:on 4:on 5:on 6:off

# service messagebus stop ; service sssd stop ; init 3
# ps axuw | grep ifp
root 13135 0.0 0.3 199972 3368 ? S 05:29 0:00 /usr/libexec/sssd/sssd_ifp --debug-to-files
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-1375.html