Description of problem:
autofs is configured with sssd. After the first reboot following the system build, autofs does not mount filesystems. The automounter is not finding any maps from sssd.

Version-Release number of selected component (if applicable):
sssd-1.11.2-65.el7.x86_64
sssd-ad-1.11.2-65.el7.x86_64
sssd-client-1.11.2-65.el7.x86_64
sssd-common-1.11.2-65.el7.x86_64
sssd-common-pac-1.11.2-65.el7.x86_64
sssd-ipa-1.11.2-65.el7.x86_64
sssd-krb5-1.11.2-65.el7.x86_64
sssd-krb5-common-1.11.2-65.el7.x86_64
sssd-ldap-1.11.2-65.el7.x86_64
sssd-proxy-1.11.2-65.el7.x86_64

Additional info:
It appears that the autofs and sssd services are coming up before NetworkManager finishes bringing up the interface. The messages file shows that the link doesn't come up until about 10-15 seconds after sssd and autofs start. The sssd service seems to retry, but autofs doesn't.

Changing the line in /usr/lib/systemd/system/sssd.service to:

After=syslog.target network.target NetworkManager-wait-online.service

seems to have resolved the issue.
From the logs it appears that even if SSSD is able to detect configuration has changed with its netlink integration, automounter only asks for the maps once after startup.

Startup is only part of the problem, the same would happen for a machine that would change networks, flaky connection etc.

Ian, what about a dispatcher script for NM that would SIGHUP automounter when networking conditions change?
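The dispatcher idea could look roughly like the following. This is only a sketch under assumptions, not something proposed in this bug: the action names are ones NetworkManager passes to dispatcher scripts, but the set chosen here, the pidfile path /run/autofs.pid, and the helper names (should_reload, read_pid, handle_event) are all hypothetical.

```python
import os
import signal

# NetworkManager dispatcher actions after which a map re-read is assumed to
# be worthwhile. Which actions to react to is an assumption of this sketch.
RELOAD_ACTIONS = {"up", "vpn-up", "dhcp4-change", "dhcp6-change"}


def should_reload(action):
    """Return True if this dispatcher action should trigger a map reload."""
    return action in RELOAD_ACTIONS


def read_pid(pidfile="/run/autofs.pid"):
    """Return the automount pid from a pidfile, or None if it is unusable.

    The default path is an assumption; check where your distribution
    writes the autofs pidfile.
    """
    try:
        with open(pidfile) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def handle_event(action, pid, send_signal=os.kill):
    """Send SIGHUP to automount when the action warrants it.

    Returns True if a signal was sent, False otherwise.
    """
    if pid is None or not should_reload(action):
        return False
    send_signal(pid, signal.SIGHUP)
    return True
```

A dispatcher script built on this would receive the interface name and the action as its first and second arguments and call handle_event(action, read_pid()). It would not help if automount has already given up on the master map before the first "up" event arrives, which is the startup case discussed in this bug.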
(In reply to Jakub Hrozek from comment #1)
> From the logs it appears that even if SSSD is able to detect configuration
> has changed with its netlink integration, automounter only asks for the maps
> once after startup.
>
> Startup is only part of the problem, the same would happen for a machine
> that would change networks, flaky connection etc.
>
> Ian, what about a dispatcher script for NM that would SIGHUP automounter
> when networking conditions change?

Maybe but ...

At one point I was working on some patches to make autofs wait until the master map was available at start up. Map re-loads are supposed to ignore fails and use the existing map.

I shelved that because I had what I thought worked but the user claimed it didn't and I was unable to get sufficient info to continue.

What are your thoughts on me continuing with that?

Ian
(In reply to Ian Kent from comment #2)
> (In reply to Jakub Hrozek from comment #1)
> > From the logs it appears that even if SSSD is able to detect configuration
> > has changed with its netlink integration, automounter only asks for the maps
> > once after startup.
> >
> > Startup is only part of the problem, the same would happen for a machine
> > that would change networks, flaky connection etc.
> >
> > Ian, what about a dispatcher script for NM that would SIGHUP automounter
> > when networking conditions change?
>
> Maybe but ...
>
> At one point I was working on some patches to make autofs wait
> until the master map was available at start up. Map re-loads
> are supposed to ignore fails and use the existing map.
>
> I shelved that because I had what I thought worked but the user
> claimed it didn't and I was unable to get sufficient info to
> continue.
>
> What are your thoughts on me continuing with that?
>
> Ian

Interesting idea!

Would autofs periodically poll until it gets some form of authoritative response?

How would you deal with the situation where SSSD is configured to serve automounter maps but no maps are actually present on the LDAP side? Could we simply differentiate between 'search completed but 0 results' and 'could not complete search'?

Please note I will be away until Jun 15th, so my answers will be delayed.
(In reply to Jakub Hrozek from comment #3)
> (In reply to Ian Kent from comment #2)
> > (In reply to Jakub Hrozek from comment #1)
> > > From the logs it appears that even if SSSD is able to detect configuration
> > > has changed with its netlink integration, automounter only asks for the maps
> > > once after startup.
> > >
> > > Startup is only part of the problem, the same would happen for a machine
> > > that would change networks, flaky connection etc.
> > >
> > > Ian, what about a dispatcher script for NM that would SIGHUP automounter
> > > when networking conditions change?
> >
> > Maybe but ...
> >
> > At one point I was working on some patches to make autofs wait
> > until the master map was available at start up. Map re-loads
> > are supposed to ignore fails and use the existing map.
> >
> > I shelved that because I had what I thought worked but the user
> > claimed it didn't and I was unable to get sufficient info to
> > continue.
> >
> > What are your thoughts on me continuing with that?
> >
> > Ian
>
> Interesting idea!
>
> Would autofs periodically poll until it gets some form of authoritative
> response?

At startup the master map is a must-have, so waiting until it gets one is sensible enough.

>
> How would you deal with the situation where SSSD is configured to serve
> automounter maps but no maps are actually present on the LDAP side? Could
> we simply differentiate between 'search completed but 0 results' and 'could
> not complete search'?

Again, it's only at startup, and it continues to wait if the connection fails. That is different from 'search completed but 0 results', so yes, that's what I was trying to do.

I hope I still have the patches ....
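To make the distinction above concrete, here is a minimal sketch of a startup wait that treats 'search completed but 0 results' as an authoritative answer and keeps polling only on 'could not complete search'. It is not the autofs patch itself; the Lookup enum, wait_for_master_map, and its parameters are hypothetical names chosen for illustration.

```python
import enum
import time


class Lookup(enum.Enum):
    FOUND = "found"              # search completed, entries returned
    EMPTY = "empty"              # search completed but 0 results (valid state)
    UNAVAILABLE = "unavailable"  # could not complete search


def wait_for_master_map(read_map, retry_delay=1.0, max_attempts=None,
                        sleep=time.sleep):
    """Poll read_map() until it gives an authoritative answer.

    read_map is a hypothetical callable returning (Lookup, entries).
    FOUND and EMPTY both end the wait; only UNAVAILABLE is retried.
    With max_attempts=None the wait is unbounded.
    """
    attempt = 0
    while True:
        attempt += 1
        status, entries = read_map()
        if status is not Lookup.UNAVAILABLE:
            return status, entries
        if max_attempts is not None and attempt >= max_attempts:
            return status, entries
        sleep(retry_delay)
```

The point of the three-way status is that an empty master map stops the wait immediately, while a source that cannot yet answer does not get mistaken for an empty one.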
Sorry for the late response, I was on vacation for the last 10 days. Is it OK to reassign this bugzilla to autofs, then?
Based on comment #4, I'm moving the component to autofs as it doesn't seem any changes from SSSD are needed. Please correct me if I'm wrong.
(In reply to Jakub Hrozek from comment #6)
> Based on comment #4, I'm moving the component to autofs as it doesn't seem
> any changes from SSSD are needed. Please correct me if I'm wrong.

Yeah, I've been able to locate a reasonably up to date version of the patches I spoke about.

It was difficult due to me doing an upgrade to F20 and my hard disk failing not too long afterward. So I've been a bit distracted.

I've had a look at them and given it some thought.

The patches assume that a map source will return a connect failure for the case where the map can't be read at start up. That might not be the best approach but an empty master map is a valid state so we probably can't use that to identify this case.

Ideally sss would return a connection failure until it has successfully connected and read the maps at startup. Clearly, once the maps have been read at start up they should continue to be used when a server becomes unavailable.

Perhaps we need a second bug that this bug depends on?
(In reply to Ian Kent from comment #7)
> (In reply to Jakub Hrozek from comment #6)
> > Based on comment #4, I'm moving the component to autofs as it doesn't seem
> > any changes from SSSD are needed. Please correct me if I'm wrong.
>
> Yeah, I've been able to locate a reasonably up to date
> version of the patches I spoke about.
>
> It was difficult due to me doing an upgrade to F20 and my
> hard disk failing not too long afterward. So I've been a
> bit distracted.
>

No problem, thank you very much for digging them up!

> I've had a look at them and given it some thought.
>
> The patches assume that a map source will return a connect
> failure for the case where the map can't be read at start
> up. That might not be the best approach but an empty
> master map is a valid state so we probably can't use that
> to identify this case.
>
> Ideally sss would return a connection failure until it has
> successfully connected and read the maps at startup. Clearly,
> once the maps have been read at start up they should continue
> to be used when a server becomes unavailable.
>
> Perhaps we need a second bug that this bug depends on?

Yes, that sounds reasonable, I will file one. Do you have any particular return code in mind (maybe something that is used by other modules in your patches)?
(In reply to Ian Kent from comment #7)
> (In reply to Jakub Hrozek from comment #6)
> The patches assume that a map source will return a connect
> failure for the case where the map can't be read at start
> up. That might not be the best approach but an empty
> master map is a valid state so we probably can't use that
> to identify this case.

By the way, I think this connection error should only be returned in case the sssd cache is empty. If there are any maps, we should fall back to the cached maps.
The new bugzilla is: https://bugzilla.redhat.com/show_bug.cgi?id=1113639
(In reply to Jakub Hrozek from comment #9)
> (In reply to Ian Kent from comment #7)
> > (In reply to Jakub Hrozek from comment #6)
> > The patches assume that a map source will return a connect
> > failure for the case where the map can't be read at start
> > up. That might not be the best approach but an empty
> > master map is a valid state so we probably can't use that
> > to identify this case.
>
> By the way, I think this connection error should only be returned in case
> the sssd cache is empty. If there are any maps, we should fall back to the
> cached maps.

I'm pretty much at the mercy of other subsystems with that.

I'll have a look around and see what would be best. A connection failure or refusal is what I'm after, the same as what we'd see if the network or server was down.
(In reply to Jakub Hrozek from comment #9)
> (In reply to Ian Kent from comment #7)
> > (In reply to Jakub Hrozek from comment #6)
> > The patches assume that a map source will return a connect
> > failure for the case where the map can't be read at start
> > up. That might not be the best approach but an empty
> > master map is a valid state so we probably can't use that
> > to identify this case.
>
> By the way, I think this connection error should only be returned in case
> the sssd cache is empty. If there are any maps, we should fall back to the
> cached maps.

That's not quite what I was saying.

An empty map is valid. The case we need a connection failure for is not knowing, at start up, whether the map is empty or not.

Once sss has been able to read the map, it has been cached and should continue to be used even if the server becomes unreachable.
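The semantics described here can be sketched as a small model, assuming a hypothetical fetch callable that raises when the server cannot be reached. CachedMapSource, ServerUnreachable, and MapNotYetKnown are illustrative names only and do not correspond to sssd or autofs internals. The key detail is that "never read" (None) is kept separate from "read and empty" ([]).

```python
class ServerUnreachable(Exception):
    """Raised by the hypothetical fetch callable when the server is down."""


class MapNotYetKnown(Exception):
    """Stands in for the connection failure reported before the first read."""


class CachedMapSource:
    def __init__(self, fetch):
        self._fetch = fetch
        # None means the map has never been read; [] means it was read and
        # found to be empty, which is a valid state.
        self._cached = None

    def get_map(self):
        try:
            self._cached = list(self._fetch())
        except ServerUnreachable:
            if self._cached is None:
                # Nothing is known yet, so report a connection failure
                # rather than pretending the map is empty.
                raise MapNotYetKnown("map has not been read yet") from None
            # Otherwise keep serving the last successfully read map.
        return self._cached
```

A caller that gets MapNotYetKnown at startup can keep waiting, while a caller that gets [] knows the map really is empty.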
Users of other distributions are running into this problem. I'm marking the BZ as public. We've marked sensitive comments as private anyway.
(In reply to Jakub Hrozek from comment #10)
> The new bugzilla is:
> https://bugzilla.redhat.com/show_bug.cgi?id=1113639

I see that bug is flagged rhel-7.3?, and I still believe that I'll need it to make autofs function properly.

So I'll need to defer this bug until 7.3 too.
Created attachment 1264704 [details] Patch - work around sss startup delay
Created attachment 1264705 [details] Patch - add sss master map wait config option
(In reply to Ian Kent from comment #32)
> Created attachment 1264704 [details]
> Patch - work around sss startup delay

FYI, the bug was in sssd:
https://pagure.io/SSSD/sssd/issue/3140
and, a little bit related:
https://pagure.io/SSSD/sssd/issue/3080

Both are already fixed in sssd-1.15.0+ (rhel7.4)
(In reply to Lukas Slebodnik from comment #34)
> (In reply to Ian Kent from comment #32)
> > Created attachment 1264704 [details]
> > Patch - work around sss startup delay
>
> FYI, the bug was in sssd
> https://pagure.io/SSSD/sssd/issue/3140
> and a little bit related https://pagure.io/SSSD/sssd/issue/3080
> Both are already fixed in sssd-1.15.0+ (rhel7.4)

I know, and that's why the default setting is a timeout that disables it.

I included the change because it's 1 part of 2 changes from RHEL-6 and I'd like to keep RHEL-6 and 7 in sync.

The change is pretty straightforward, and if some other unexpected problem comes up where this can help then it'll be useful.
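As an illustration of "a default timeout that disables it", here is a minimal sketch of a bounded wait where a value of 0 means a single attempt with no waiting. This is not the attached patch; read_master_map, master_wait, and poll_interval are hypothetical names, and the real option name and units should be taken from the autofs documentation.

```python
import time


def read_master_map(read_map, master_wait=0, poll_interval=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Read the master map, waiting up to master_wait seconds for an answer.

    read_map is a hypothetical callable returning a list of entries, or
    None while the source cannot answer yet. An empty list counts as an
    answer. With the default master_wait=0 there is exactly one attempt,
    so the wait is effectively disabled.
    """
    deadline = clock() + master_wait
    while True:
        entries = read_map()
        if entries is not None:
            return entries
        if clock() >= deadline:
            return None
        sleep(poll_interval)
```

Keeping the default at 0 preserves the existing behaviour on systems where sssd already answers correctly at startup, and the wait only takes effect when an administrator sets a non-zero value.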
(In reply to Ian Kent from comment #35)
> (In reply to Lukas Slebodnik from comment #34)
> > (In reply to Ian Kent from comment #32)
> > > Created attachment 1264704 [details]
> > > Patch - work around sss startup delay
> >
> > FYI, the bug was in sssd
> > https://pagure.io/SSSD/sssd/issue/3140
> > and a little bit related https://pagure.io/SSSD/sssd/issue/3080
> > Both are already fixed in sssd-1.15.0+ (rhel7.4)
>
> I know and that's why the default setting is a timeout that
> disables it.
>
> I included the change because it's 1 part of 2 changes from
> RHEL-6 and I'd like to keep RHEL-6 and 7 in sync.
>

Sure

> The change is pretty straight forward and if there is some
> other unexpected problem that comes up where this can help
> then it'll be useful.

Agree

It was just a FYI :-)
Created attachment 1267441 [details] Patch - fix work around sss startup delay
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2213