Hide Forgot
Description of problem: I have created a service called krb5cc_0.service, that is responsible for fetching kerberos credentials with the following settings: [Unit] Description=Provides the credential cache for naming services Before=nslcd.service After=syslog.target network.target time-sync.target NetworkManager-wait-online.service Requires=time-sync.target network.target but the service is still started before nm has changed resolv.conf, which leads to a service fail. Sep 20 08:27:50 mymachine NetworkManager[858]: <info> nameserver '192.168.0.1' Sep 20 08:27:50 mymachine NetworkManager[858]: <info> domain name 'domain.example' Sep 20 08:27:50 mymachine NetworkManager[858]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) scheduled... Sep 20 08:27:50 mymachine NetworkManager[858]: <info> Activation (eth0) Stage 4 of 5 (IP4 Configure Get) complete. Sep 20 08:27:50 mymachine NetworkManager[858]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) started... Sep 20 08:27:50 mymachine avahi-daemon[844]: Joining mDNS multicast group on interface eth0.IPv4 with address 192.168.0.19. Sep 20 08:27:50 mymachine avahi-daemon[844]: New relevant interface eth0.IPv4 for mDNS. Sep 20 08:27:50 mymachine avahi-daemon[844]: Registering new address record for 192.168.0.19 on eth0.IPv4. Sep 20 08:27:58 mymachine modem-manager[936]: <info> (ttyS0) closing serial port... Sep 20 08:27:58 mymachine modem-manager[936]: <info> (ttyS0) serial port closed Sep 20 08:27:58 mymachine modem-manager[936]: <info> (ttyS0) opening serial port... Sep 20 08:27:59 mymachine ntpdate[1193]: name server cannot be used: Temporary failure in name resolution (-3) Sep 20 08:28:04 mymachine modem-manager[936]: <info> (ttyS0) closing serial port... Sep 20 08:28:04 mymachine modem-manager[936]: <info> (ttyS0) serial port closed Sep 20 08:28:19 mymachine ntpdate[1197]: name server cannot be used: Temporary failure in name resolution (-3) Sep 20 08:28:19 mymachine systemd[1]: ntpdate.service: main process exited, code=exited, status=1 Sep 20 08:28:19 mymachine systemd[1]: Unit ntpdate.service entered failed state. Sep 20 08:28:20 mymachine ntpd[1202]: ntpd 4.2.6p3 Fri May 6 16:26:49 UTC 2011 (1) Sep 20 08:28:20 mymachine ntpd[1202]: proto: precision = 1.187 usec Sep 20 08:28:20 mymachine ntpd[1202]: 0.0.0.0 c01d 0d kern kernel time sync enabled Sep 20 08:28:20 mymachine ntpd[1202]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123 Sep 20 08:28:20 mymachine ntpd[1202]: Listen normally on 1 lo 127.0.0.1 UDP 123 Sep 20 08:28:20 mymachine ntpd[1202]: Listen normally on 2 eth0 192.168.0.19 UDP 123 Sep 20 08:28:20 mymachine ntpd[1202]: peers refreshed Sep 20 08:28:20 mymachine ntpd[1202]: Listening on routing socket on fd #19 for interface updates Sep 20 08:28:20 mymachine ntpd[1202]: restrict: error in address '::' on line 7. Ignoring... Sep 20 08:28:20 mymachine ntpd[1202]: restrict: error in address '::1' on line 13. Ignoring... Sep 20 08:28:20 mymachine ntpd[1202]: Deferring DNS for time.ethz.ch 1 Sep 20 08:28:20 mymachine systemd[1]: krb5cc_0.service: main process exited, code=exited, status=1 Sep 20 08:28:20 mymachine ntpd[1202]: Deferring DNS for swisstime.ethz.ch 1 Sep 20 08:28:20 mymachine ntpd[1202]: 0.0.0.0 c016 06 restart Sep 20 08:28:20 mymachine ntpd[1202]: 0.0.0.0 c012 02 freq_set kernel -6.492 PPM Sep 20 08:28:20 mymachine systemd[1]: Unit krb5cc_0.service entered failed state. Sep 20 08:28:20 mymachine nslcd[1213]: version 0.7.13 starting Sep 20 08:28:20 mymachine nslcd[1213]: accepting connections Sep 20 08:28:20 mymachine rpcbind: cannot create socket for udp6 Sep 20 08:28:20 mymachine rpcbind: cannot create socket for tcp6 Sep 20 08:28:20 mymachine dbus: [system] Activating service name='org.freedesktop.ColorManager' (using servicehelper) Sep 20 08:28:20 mymachine dbus: [system] Successfully activated service 'org.freedesktop.ColorManager' Sep 20 08:28:20 mymachine nscd: 1349 Access Vector Cache (AVC) started Sep 20 08:28:20 mymachine NetworkManager[858]: <info> (eth0): device state change: ip-config -> activated (reason 'none') [70 100 0] Sep 20 08:28:20 mymachine nscd: Can't send to audit system: USER_AVC avc: netlink poll: error 4#012: exe="?" sauid=28 hostname=? addr=? terminal=? Sep 20 08:28:20 mymachine nscd: Can't send to audit system: USER_AVC avc: netlink recvfrom: error 1#012: exe="?" sauid=28 hostname=? addr=? terminal=? Sep 20 08:28:20 mymachine nscd: Can't send to audit system: USER_AVC avc: netlink thread: errors encountered, terminating#012: exe="?" sauid=28 hostname=? addr=? terminal=? Sep 20 08:28:20 mymachine NetworkManager[858]: <info> Policy set 'System eth0' (eth0) as default for IPv4 routing and DNS. Sep 20 08:28:20 mymachine NetworkManager[858]: <info> Activation (eth0) successful, device activated. Sep 20 08:28:20 mymachine NetworkManager[858]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete. As seen above, this also applies to ntpdate, which is not able to query time servers, because DNS resolution is not available, yet. Not sure if it's related but I got the following interface configuration setup: IPV6INIT="no" BOOTPROTO="dhcp" DEVICE="eth0" ONBOOT="yes" PEERDNS="no" DNS1="192.168.0.1" DNS2="192.168.0.2" DOMAIN="domain.example"
Created attachment 524043 [details] dmesg.out dmesg output with log_buf_len=1M systemd.log_target=kmsg systemd.log_level=debug set
Created attachment 524045 [details] /var/log/messages during service startup
It seems to be related to nscd.service Stopping LSB: Starts the Name Switch Cache Daemon... has been logged during system startup. Disabling nscd (which is not a good idea in general) let's the other services start correctly.
The wait-online service appears to have timed out. It has a default timeout of 30 seconds.
I have already tried to set it to 60sec, but it only takes longer for the service to fail.
Besides that, everything is fine, if I disable nscd, so I guess it has to be related to that one.
Ok, it seems to work fine again, using these systemd service definitions: # cat nscd.service [Unit] Description=Name Switch Cache Daemon After=syslog.target network.target [Service] Type=forking PIDFile=/run/nscd/nscd.pid EnvironmentFile=-/etc/sysconfig/nscd ExecStart=/usr/sbin/nscd $NSCD_OPTIONS ExecReload=/usr/sbin/nscd -i passwd ExecReload=/usr/sbin/nscd -i group ExecReload=/usr/sbin/nscd -i hosts ExecReload=/usr/sbin/nscd -i services ExecStop=/usr/sbin/nscd -K [Install] WantedBy=multi-user.target # cat /etc/tmpfiles.d/nscd.conf d /run/nscd 0755 root root -
(In reply to comment #7) > Ok, it seems to work fine again, using these systemd service definitions: Nice to hear that. However, that's not the default service file, right? nscd is still using an initscript /etc/init.d/nscd. That should be fixed, I guess. (In reply to comment #6) > Besides that, everything is fine, if I disable nscd, so I guess it has to be > related to that one. There's a bug 700507 (RHEL 6) on nscd due to these lines: Sep 20 15:35:02 mymachine nscd: Can't send to audit system: USER_AVC avc: netlink poll: error 4#012: exe="?" sauid=28 hostname=? addr=? terminal=? Sep 20 15:35:02 mymachine nscd: Can't send to audit system: USER_AVC avc: netlink recvfrom: error 1#012: exe="?" sauid=28 hostname=? addr=? terminal=? Sep 20 15:35:02 mymachine nscd: Can't send to audit system: USER_AVC avc: netlink thread: errors encountered, terminating#012: exe="?" sauid=28 hostname=? addr=? terminal=? It's fixed in RHEL 6, however the fix is not delivered to Fedora 15 yet. Looking into bodhi, the fix is included in glibc-2.14.90-9 (F16) that is in 'testing' state, no valid update for F15. I'm not sure if the bug is connected with the issue here. From NM point of view, there is a big delay (29sec!) between these two lines: Sep 20 15:35:06 mymachine NetworkManager[889]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) started... Sep 20 15:36:35 mymachine NetworkManager[889]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete. even if they are logged in the same function: src/nm-device.c:nm_device_activate_stage5_ip_config_commit() nscd somehow badly interferes with NM, probably. It would be useful to find out where NM is stuck. E.g. with: sudo pstack `pidof NetworkManager` Also appending '--log-level=DEBUG' to ExecStart in /lib/systemd/system/NetworkManager.service will help increasing logging level of NM.
Works in more recent Fedora versions