Created attachment 556891
Output of systemctl --full

Description of problem:
The system fails to shut down or reboot if ipa.service is started and running. The console just shows some output from systemd related to other services and hangs indefinitely. If I hold down ctrl-alt-del then it eventually kills everything and reboots. Shutting down ipa.service manually works with no indication of any errors, including when using ipactl stop.

Version-Release number of selected component (if applicable):
freeipa-admintools-2.1.4-2.fc16.x86_64
freeipa-client-2.1.4-2.fc16.x86_64
freeipa-python-2.1.4-2.fc16.x86_64
freeipa-server-2.1.4-2.fc16.x86_64
freeipa-server-selinux-2.1.4-2.fc16.x86_64
systemd-37-3.fc16.x86_64

How reproducible:
Ensure ipa.service is running and issue a shutdown or reboot.

Steps to Reproduce:
1. systemctl start ipa.service
2. reboot

Actual results:
The system does not shut down correctly, but hangs indefinitely.

Expected results:
The system should shut down correctly.

Additional info:
Please let me know what logs etc. you would like and I'm happy to oblige.
I've been told this could be related, I'm not so sure but just in case. https://bugzilla.redhat.com/show_bug.cgi?id=783943
Ooops, pasted wrong link. https://www.redhat.com/archives/freeipa-devel/2012-January/msg00223.html
The workaround for 389-ds you are mentioning in comment #2 does not really help. I'm experimenting with FreeIPA reboots and still seeing lockups. However, the lockup happens well after ipa.service and its dependencies are stopped.

Do you have a DNS service managed by FreeIPA as well? Does your /etc/resolv.conf include this DNS server (itself) as the only resolver?
Upstream ticket: https://fedorahosted.org/freeipa/ticket/2302
Yes, I do have a DNS service (named) running on the same server, but it's not directly managed by FreeIPA. DNS resolution for local hosts works, both forward and reverse lookups, as well as DNS recursion. The resolv.conf contains only itself; it looks like this:

nameserver 127.0.0.1
search homenet.lan

I also have the server's fully qualified name in /etc/hosts, just in case. The server also provides NFS exports with Kerberos security.

Curiously now <sigh>, it no longer shuts down whether ipa.service is running or not, although the shutdown behaviour is different if ipa.service is running.

I've seen this kind of thing sprinkled through the logs:

Failed to read PID file /run/sendmail.pid after start. The service might be broken.
Failed to read PID file /var/run/httpd/httpd.pid after start. The service might be broken.
Failed to read PID file /var/run/ipa_kpasswd.pid after start. The service might be broken.
Failed to read PID file /run/sm-client.pid after start. The service might be broken.

but the PID files are frequently still there, even though it claims it can't read them.
There seems to be an issue with systemd waiting/locking. When ipa.service is shut down and not running and I then issue a reboot, systemd locks up, with the last statement printed on the console being a message from auditd about /lib/systemd/systemd-update-utmp. Then, after ~10 seconds and switching around consoles, I get logind, ntpd, sssd, and auditd shutdowns. After that everything stops again. After pushing ctrl-alt-del, the process proceeds and the reboot happens.

Unfortunately, I can't get into the system during the lockup as logind/sshd are not available anymore.

This is a Fedora 16 VM with everything from updates-testing, including systemd 37-10. I'd suggest moving this bug to systemd for investigation.
(In reply to comment #4)
> Upstream ticket:
> https://fedorahosted.org/freeipa/ticket/2302

The latest comments there say:
>> Without any changes from FreeIPA side the problem disappeared on Fedora 17
>> with at least systemd 42.
>>
>> So we just need to set a min requires on systemd?
>>
>> For F17 -- yes. With F16 I still need to find out how to reorder PKI shutdown.

I don't know what "PKI shutdown" is. Could you explain to me your hypothesis for why the shutdown fails? Is there anything to fix in systemd?
Just as an FYI - this still occurs with systemd-37-15.fc16.x86_64
PKI is pki-cad@.service, instances of certificate authorities, for which pki-cad.target is a target to launch them all. In a FreeIPA install there is a single instance, pki-cad@pki-ca.
FreeIPA starts and stops services on its own, as their activity is synchronized across multiple nodes. This is done via systemctl as well and is generally performed in the following way:

1. ipa.service is marked as enabled.
2. ipa.service on startup makes sure the Directory Server (LDAP) service instance (dirsrv) is up and running.
3. ipa.service consults LDAP for the list of services to maintain and launches them via 'systemctl start'.
4. On shutdown of ipa.service, it performs (3) in the reversed order and then shuts down all dirsrv@.services via dirsrv.target.

What happens is that all services are shut down except the PKI one. PKI uses the dirsrv LDAP instance to store its own certificate details. When dirsrv.target's shutdown is run, apparently PKI is still running.

So PKI is stuck trying to contact the LDAP instance and never properly finishes its shutdown process, therefore systemd never finishes shutting down the system. If you force a reboot (ctrl-alt-del, for example), systemd will forcibly kill PKI and the box will go to reboot.

One of the issues is that we can't maintain service dependencies via systemd due to the multi-node setup. Every FreeIPA replica has to keep a list of services it should run, and there are different replication topologies that could be used by customers. With LDAP replication we are able to transfer information about reordering of services "natively". With systemd we have no such mechanism, and we consider it an insecure approach to set up systemd dependencies after each replication change to FreeIPA services, as LDAP instances are run under unprivileged uids.
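The start and stop sequence in steps 2 to 4 can be sketched roughly as below. This is only an illustration of the ordering, not the actual ipactl code: the function names, the `run` callback, and the way the service list is passed in are all assumptions made for the example.

```python
from typing import Callable, List

# Hypothetical helper type: something that executes a command line.
Runner = Callable[[List[str]], None]


def start_all(ldap_services: List[str], run: Runner) -> None:
    """Start dirsrv first, then every service listed in LDAP, in order."""
    run(["systemctl", "start", "dirsrv.target"])
    for svc in ldap_services:
        run(["systemctl", "start", svc])


def stop_all(ldap_services: List[str], run: Runner) -> None:
    """Stop the LDAP-listed services in reverse order, then dirsrv last."""
    for svc in reversed(ldap_services):
        run(["systemctl", "stop", svc])
    run(["systemctl", "stop", "dirsrv.target"])
```

Passing `subprocess.check_call` as `run` would execute the commands for real; passing `list.append` of some list records them instead, which is convenient for checking the order. The sketch also makes the failure mode visible: any service that depends on LDAP but is missing from the list is still running when dirsrv.target is stopped.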
After some effort to make ipa.service start at all, I managed to reproduce the hang on shutdown in a VM. I directed systemd debugging output to a virtual serial console. At some point during shutting down some process enqueues the job dirsrv.target/start, which pulls basic.target back in via dependencies, so the shutdown does not continue anymore. I made /bin/systemctl into a wrapper script to log its invocation. Apparently a process called "ipactl", with the cmdline of "/usr/bin/python /usr/sbin/ipactl stop" and running in the cgroup of ipa.service, calls "systemctl start dirsrv.target". Is it expected that the target is _started_ during ipactl stop?
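A logging wrapper of the kind described above could look roughly like the following. This is not the script that was actually used (that one is not shown in this report); the path of the renamed real binary, the log location, and all function names are assumptions for illustration.

```python
import os
import sys
import time

# Assumptions: the real binary was moved aside to this path, and the
# log file lives somewhere that is still writable late in shutdown.
REAL_SYSTEMCTL = "/bin/systemctl.real"
LOG_PATH = "/run/systemctl-wrapper.log"


def parent_cmdline(ppid):
    """Return the caller's command line from /proc, or "?" if unreadable."""
    try:
        with open("/proc/%d/cmdline" % ppid, "rb") as f:
            raw = f.read()
    except OSError:
        return "?"
    return raw.replace(b"\0", b" ").decode(errors="replace").strip()


def format_entry(argv, ppid, cmdline, now=None):
    """Build one log line: timestamp, caller PID and cmdline, arguments."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return "%s ppid=%d parent=[%s] args=%s\n" % (
        stamp, ppid, cmdline, " ".join(argv[1:]))


def main(argv):
    ppid = os.getppid()
    with open(LOG_PATH, "a") as log:
        log.write(format_entry(argv, ppid, parent_cmdline(ppid)))
    # Replace this process with the real systemctl, keeping the arguments.
    os.execv(REAL_SYSTEMCTL, [REAL_SYSTEMCTL] + argv[1:])


# Only act as the wrapper when actually installed under the name systemctl.
if __name__ == "__main__" and os.path.basename(sys.argv[0]) == "systemctl":
    main(sys.argv)
```

With something like this in place, a call such as "systemctl start dirsrv.target" made from "/usr/bin/python /usr/sbin/ipactl stop" would show up in the log with both the arguments and the calling command line.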
dirsrv needs to be started in order to obtain the list of services to stop.
yes, it has to be started, please read comment 10 for detailed explanation.
(In reply to comment #12)
> dirsrv needs to be started in order to obtain the list of services to stop.

Rob, Alexander,
I have been thinking that we should change the way we stop ipa with systemd.

With init scripts we had no easy way to track what was going on, so we had to check LDAP to find the list of services to shut down. But this has always been suboptimal, because changing the list could end up not shutting down a service, or trying to shut down a service not yet started.

With systemd, though, I think we can change the dependencies. If we can change them when we run ipactl start, then we should be able to simply cause all the dependent services to shut down automatically when we systemctl stop dirsrv.service, I hope.

This would be preferable because we would always turn off all the services we started at ipactl start without having to look at LDAP, which may have changed in the meantime.

Of course we need to look at LDAP to perform the start; that does not change.

If we could arrange shutdown through systemd dependencies, we would solve the problem of starting dirsrv.target at shutdown.
Changing systemd services when running systemctl would require 'systemctl daemon-reload' be issued from within 'systemctl start ipa.service' action. Michal, what are consequences of this action?
(In reply to comment #15)
> Changing systemd services when running systemctl would require 'systemctl
> daemon-reload' be issued from within 'systemctl start ipa.service' action.
>
> Michal, what are consequences of this action?

It should work fine. If it breaks, it will be a systemd bug.
(In reply to comment #10)
> FreeIPA starts and stops services on its own as their activity is synchronized
> across multiple nodes. This is done via systemctl as well and generally
> performed in following way:
>
> 1. ipa.service is marked as enabled
> 2. ipa.service on startup makes sure Directory Server (LDAP) service instance
> (dirsrv) is up and running.
> 3. ipa.service consults to LDAP for list of services to maintain and launches
> them via 'systemctl start'.
> 4. On shutdown of ipa.service, it performs (3) in the reversed order and then
> shuts down all dirsrv@.services via dirsrv.target.
>
> What happens is that all services are shutdown except PKI one. PKI uses
> dirsrv LDAP instance to store its own certificate details. When
> dirsrv.target's shutdown is run, apparently PKI is still running.
>
> So PKI is stuck trying to contact LDAP instance and never properly ends
> shutdown process, therefore systemd never ends shutdown of the system. If you
> force reboot (ctrl-alt-del, for example), systemd will forcibly kill PKI and
> the box will go to reboot.
>
> One of issues is that we can't maintain service dependencies via systemd due to
> multi-node setup. Every FreeIPA replica has to keep list of services it should
> run and there are different replication topologies that could be used by
> customers. With LDAP replication we are able to transfer information about
> reordering of services "natively". With systemd we have no such mechanism and
> consider it to be an insecure approach to set up systemd dependencies after
> each replication change to FreeIPA services as LDAP instances are run under
> unprivileged uids.

Hmm, just a thought...

Could this issue not be solved via target(s), so that the lookup to LDAP would only be performed when the services are created and/or changed, and ipa simply tells systemd to start and stop the target(s)?
(In reply to comment #17)
> hmm just some thought..
>
> Could this issue not be solved via target(s) so the lookup to ldap would only
> be performed when the service are created and or changed and ipa simply tell
> systemd to start and stop the target(s)?

How do you know the service list changed w/o looking it up in LDAP first?
(In reply to comment #18)
> (In reply to comment #17)
>
> > hmm just some thought..
> >
> > Could this issue not be solved via target(s) so the lookup to ldap would only
> > be performed when the service are created and or changed and ipa simply tell
> > systemd to start and stop the target(s)?
>
> How do you know the service list changed w/o looking it up in LDAP first ?

I would think that would be irrelevant, since ipa.service (or systemd directly) could always shut down the target and all the services beneath it. As long as the services that get created from LDAP (and then later modified, if applicable) have something like ipa.target.wants in their [Install] section, ipa could just shut down "ipa.target", which should shut down all the services that got created.

Basically you would not have to worry about what's modified or otherwise exists in the "ipa.target.wants" directory...
Fixed upstream.

master:
7f272a39b6b46fac1d548b759f671b75592af7a0
1ef651e7f9f5f940051dc470385aa08eefcd60af
09dbc1f36bbc14c1224e45778cbc59586dfeea75

ipa-3-0:
2eb29f42679632f7eed813638cdf33e60c13a249
f5805379277d0d9a2685aba69db49c95a36a6d1f
5d2a8e8b225766d6e6ee53d4c161c4c1b44ab74a

The solution is to keep a list of the services we start in a file and use that list to stop all services via ipactl/ipa.service. It will fall back to the LDAP mechanism if the file is not available.
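The idea behind the fix can be sketched as follows. This is not the code from the commits above; the function names, the one-name-per-line file format, and stopping in reverse start order (taken from the stop procedure described in comment 10) are assumptions made for this example.

```python
import os
from typing import Callable, List


def record_started(path: str, services: List[str]) -> None:
    """Write the started services, one per line, replacing the file atomically."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        for svc in services:
            f.write(svc + "\n")
    os.replace(tmp, path)


def services_to_stop(path: str, ldap_lookup: Callable[[], List[str]]) -> List[str]:
    """Return services in stop order, preferring the recorded list.

    The LDAP lookup (hypothetical callback) is used only when the file
    cannot be read, so the normal stop path does not need dirsrv running.
    """
    try:
        with open(path) as f:
            started = [line.strip() for line in f if line.strip()]
    except OSError:
        started = ldap_lookup()
    return list(reversed(started))
```

Because the stop path reads the file first, it no longer has to issue "systemctl start dirsrv.target" during shutdown, which is the call that pulled basic.target back in and stalled the shutdown.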