Description of problem:
The /etc/init.d/sssd defines
# chkconfig: - 12 88
and /etc/init.d/messagebus defines
# chkconfig: 345 22 85
Because sssd has the lower start priority (12 versus 22), dbus starts after sssd. If the ifp responder is enabled, it is stopped immediately during boot or on init 3 because dbus is not yet running.
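The ordering can be checked directly from the init script headers. Below is a minimal sketch, assuming the usual header layout of runlevels, start priority, stop priority; the helper names start_prio and starts_before are made up for illustration and are not part of chkconfig.

```shell
# Print the start priority from an init script's "# chkconfig:" header.
# Assumed header layout: "# chkconfig: <runlevels> <start> <stop>".
# start_prio is a hypothetical helper name, not part of chkconfig.
start_prio() {
    awk '/^# chkconfig:/ { print $4; exit }' "$1"
}

# Succeeds when the first script has a lower start priority than the
# second, i.e. it is started earlier at boot.
starts_before() {
    [ "$(start_prio "$1")" -lt "$(start_prio "$2")" ]
}
```

With the headers quoted above, starts_before /etc/init.d/sssd /etc/init.d/messagebus succeeds, since 12 is lower than 22.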
Version-Release number of selected component (if applicable):
# rpm -qf /etc/init.d/sssd /etc/init.d/messagebus
Steps to Reproduce:
1. Have IPA-enrolled machine or otherwise get sssd configured and enabled.
2. Check that chkconfig --list | grep -E 'messagebus|sssd' lists both as on.
3. Add ifp to services list in [sssd] section in /etc/sssd/sssd.conf.
4. Run tail -f /var/log/sssd/sssd.log & to see what is going on.
5. Run service messagebus stop ; service sssd stop ; init 3
6. Check if sssd_ifp is running: ps axuw | grep ifp
Actual results:
Stopping system message bus: [ OK ]
Stopping sssd: [ OK ]
(Tue Jun 17 09:50:00 2014) [sssd] [mt_svc_exit_handler] (0x0010): Process [ifp], definitely stopped!
No ifp process.
Expected results:
No error message, sssd_ifp still running:
root 16070 0.0 0.0 199860 2916 ? S 09:50 0:00 /usr/libexec/sssd/sssd_ifp --debug-to-files
Additional info:
One possibility is to change the order, but maybe messagebus wants sssd to be running?
Another possibility is for sssd_ifp not to bail out when it does not find dbus running.
Yet another possibility is to somehow start sssd in stages -- first the non-dbus stuff, and then, after dbus is up, the dbus part.
The issue can likely also be observed by rebooting the machine.
We're debating the right solution with the DBus developers. So far, one option that came up was making SSSD poll for the system bus:
The problem with polling is that on a fast machine, httpd could be started before the next poll and attempt to serve a request, and the dbus call would fail.
Polling is good for recovering from some errors, but for a consistent boot sequence it might be good to also have an explicit chkconfig entry after messagebus but before things like httpd to poke sssd to retry right away. HUP or USR1, maybe?
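To make the polling trade-off concrete, here is a minimal sketch of a poll loop that waits for a path to appear. The helper name wait_for_path, the retry count and the delay are illustrative assumptions, not what SSSD implements. Anything started between two polls sees the bus without the ifp service behind it, which is the race described above.

```shell
# Poll until a path exists, or give up after a number of attempts.
# wait_for_path PATH TRIES DELAY_SECONDS  (hypothetical helper)
wait_for_path() {
    path=$1
    tries=$2
    delay=$3
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -e is used so the sketch also works on plain files;
        # a stricter check for a socket would use -S instead.
        [ -e "$path" ] && return 0
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

A call might look like wait_for_path /var/run/dbus/system_bus_socket 30 1, assuming the system bus socket lives at that path. A shorter delay narrows the race window but does not close it.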
Stef, do you know of any DBus service that is in the same area as SSSD, that is, both an identity provider and a DBus service?
I wish we could simply put a watch on the DBus system bus socket and only start the IFP service then, but inotify can't watch for nonexistent files (obviously).
I talked to Marius a bit today on IRC and he had an interesting suggestion a bit along the lines of what Jan suggested in comment #5.
The proposal was to let DBus (not an initscript entry) start a simple binary that would SIGHUP the sssd and let it know it's time to spawn the dbus service. I'll try to experiment with this and start a thread on sssd-devel.
btw the reason I don't like simply starting messagebus before sssd is that sssd users might be included in places like the interface policy XML. Clearly dbus consumes identities at that point, so the identities should be resolvable.
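The helper binary proposed above could be very small. Below is a sketch of the idea as a shell function, assuming sssd records its PID in a pid file (for example /var/run/sssd.pid) and that SIGHUP is the agreed signal. Both are assumptions for illustration; this shows the proposal under discussion, not necessarily what the eventual patches do, and poke_daemon is a made-up name.

```shell
# Send SIGHUP to the process whose PID is stored in a pid file.
# poke_daemon PIDFILE  (hypothetical helper, not shipped with sssd)
poke_daemon() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Refuse anything that is not a plain number.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -HUP "$pid"
}
```

Whatever runs once the bus is up would then call poke_daemon with the sssd pid file, and sssd would treat the signal as the cue to spawn the dbus service.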
(In reply to Jakub Hrozek from comment #8)
> I talked to Marius a bit today on IRC and he had an interesting suggestion a
> bit along the lines of what Jan suggested in comment #5.
> The proposal was to let DBus (not an initscript entry) start a simple binary
> that would SIGHUP the sssd and let it know it's time to spawn the dbus
> service. I'll try to experiment with this and start a thread on sssd-devel.
Yes, not a bad idea, although you could also use some other means of communication, such as the usual sssd unix sockets.
Given that sssd-dbus is mostly used by Satellite in RHEL6, would the Sat QE be available to help with testing?
Patches are available for review on the upstream sssd-devel list.
Just for reference, here are the commits that fixed the bug:
Verified with version 1.11.6-29.el6
# chkconfig --list | grep -E 'messagebus|sssd'
messagebus 0:off 1:off 2:on 3:on 4:on 5:on 6:off
sssd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
# service messagebus stop ; service sssd stop ; init 3
# ps axuw | grep ifp
root 13135 0.0 0.3 199972 3368 ? S 05:29 0:00 /usr/libexec/sssd/sssd_ifp --debug-to-files
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.