Bug 1409012
Summary: | nfs-server runs before network is ready; "failed to resolve" for all hosts; nothing is exported | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Chris Schanzle <bugzilla> |
Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
Status: | CLOSED ERRATA | QA Contact: | Yongcheng Yang <yoyang> |
Severity: | low | Docs Contact: | |
Priority: | low | ||
Version: | 7.3 | CC: | bugzilla, chunwang, eguan, jiyin, steved, yoyang |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Fixed In Version: | nfs-utils-1.3.0-0.41.el7 | Doc Type: | If docs needed, set a value |
Last Closed: | 2017-08-01 19:48:51 UTC | Type: | Bug |
Description
Chris Schanzle
2016-12-28 22:47:33 UTC
What I don't understand is why only your site is seeing this problem... Exporting to DNS hostnames is very common... Plus Requires= network.target in the nfs-server.service should bring DNS up.

> Plus Requires= network.target in the nfs-server.service should bring DNS up.

Wrong, per the fine manual! network.target means little, per the documentation referenced in the unit file: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

If you really want DNS resolution to work, you need to wait for an interface to be up and online, not just have the stack initialized. Hence my suggestion for wait-online, but per those docs, perhaps I should have used:

    After=network-online.target
    Wants=network-online.target

Not that I understand this, but experimentally speaking, with /etc/systemd/system/nscd.service.d/network-online.conf containing:

    [Unit]
    After=network-online.target
    Wants=network-online.target

and rebooting, host resolution was not working again. By looking at the journal:

    journalctl -b -u nscd -u NetworkManager -u network-online.target -u NetworkManager-wait-online.target

it was clear systemd started nscd before NetworkManager, well before the network was even configured. Here are some snippets:

    Jan 05 12:17:18 localhost.localdomain systemd[1]: Starting Name Service Cache Daemon...
    Jan 05 12:17:20 localhost.localdomain systemd[1]: Starting Network Manager...
    Jan 05 12:17:20 localhost.localdomain systemd[1]: Started Name Service Cache Daemon.
    Jan 05 12:17:22 localhost.localdomain NetworkManager[1360]: <info> [1483636642.0293] NetworkManager (version 1.4.2-2.fc25) is starting...
    Jan 05 12:17:33 localhost.localdomain dhclient[1540]: bound to 129.6.88.146 -- renewal in 158521 seconds.
    Jan 05 12:17:33 grad.cam.nist.gov NetworkManager[1360]: <info> [1483636653.4606] device (eno1): Activation: successful, device activated.
    Jan 05 12:17:33 grad.cam.nist.gov NetworkManager[1360]: <info> [1483636653.4627] connectivity: check for uri 'http://fedoraproject.org/static/hotspot.txt' failed with 'Error resolving 'fedoraproject.org': Name or service not known'

About 20 minutes passed, then I ran "nscd -i hosts" to get resolution working again:

    Jan 05 12:37:34 grad.cam.nist.gov NetworkManager[1360]: <info> [1483637854.9200] manager: NetworkManager state is now CONNECTED_GLOBAL

So it seems After+Wants=network-online doesn't do what is stated in the docs, at least in Fedora 25. Using After+Wants=NetworkManager-wait-online.service seems to do what is desired/expected. But still, it does not explain the issues with nscd.

Perhaps worth noting: in one of these reboots, it appeared IPv6 hostname resolution worked, but not IPv4. While our network supports IPv6, the system does not have an IPv6 address. E.g.,

    [root@grad ~]# getent hosts fedoraproject.org
    2607:f188::dead:beef:cafe:fed1 fedoraproject.org
    2605:bc80:3010:600:dead:beef:cafe:feda fedoraproject.org
    2604:1580:fe00:0:dead:beef:cafe:fed1 fedoraproject.org
    2605:bc80:3010:600:dead:beef:cafe:fed9 fedoraproject.org
    2610:28:3090:3001:dead:beef:cafe:fed3 fedoraproject.org
    [root@grad ~]# telnet www.google.com
    telnet: www.google.com: Name or service not known
    [root@grad ~]# getent hosts www.google.com
    2607:f8b0:400c:c0b::69 www.google.com
    [root@grad ~]# getent hosts www.microsoft.com
    [root@grad ~]# getent hosts www.amazon.com

My apologies, the above comment was meant for bug #1367565, but there is some overlap here as well. [humble shrug]

This is also needed:

    commit 09e5c6c2a3f8eac91d5353e6d4ff6aee7757ab08
    Author: Steve Dickson <steved>
    Date: Mon Apr 24 11:25:39 2017 -0400

        systemd: Afters are also needed for the Wants=network-online.target

Seems Chunyu has been tracking this issue upstream. If possible, please help to generate/update a testcase to cover it.
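A testcase along these lines could start from a check like the one below. It is only a sketch: the function names has_directive_value and has_wants_and_after and the sample unit content are hypothetical and are not part of nfs-utils or its test suite. It relies on the fact that Wants= alone pulls a unit in without ordering it, so the target must also appear in After=. It reads a single unit file and ignores drop-ins, line continuations, and settings reset by an empty assignment.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directive $2 in unit file $1 lists unit $3.
# Values are compared as whole words, so "network.target" does not match
# "network-online.target".
has_directive_value() {
    awk -v key="$2" -v want="$3" '
        index($0, key "=") == 1 {
            # Everything after "key=" is a whitespace-separated unit list.
            n = split(substr($0, length(key) + 2), words, /[ \t]+/)
            for (i = 1; i <= n; i++)
                if (words[i] == want) found = 1
        }
        END { exit !found }
    ' "$1"
}

# Hypothetical helper: succeed only if the unit both wants the target
# and is ordered after it.
has_wants_and_after() {
    has_directive_value "$1" Wants "$2" &&
        has_directive_value "$1" After "$2"
}

# Illustrative unit content (made up for this example): the target is
# wanted but there is no matching After= entry.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[Unit]
Wants=network-online.target
After=local-fs.target
EOF

if has_wants_and_after "$sample" network-online.target; then
    echo "ordered after network-online.target"
else
    echo "Wants= without matching After="
fi
rm -f "$sample"
```

On a real system the unit text could be saved with something like "systemctl cat nfs-server.service > /tmp/unit" and passed to the same function; that step is left out here so the sketch runs without systemd.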
As we have never seen this problem within our test environment, I just verified that the systemd scripts have changed to use "network-online". Moving to VERIFIED and will update the testcase to cover it.

-----------------------------------------------------------------
    [root@ ~]# rpm -q nfs-utils
    nfs-utils-1.3.0-0.44.el7.x86_64
    [root@ ~]# systemctl list-dependencies nfs | grep network-online
    ● ├─network-online.target
    [root@ ~]#
    [root@ ~]# systemctl cat nfs | grep network-online
    Wants=rpcbind.socket network-online.target
    After= network-online.target local-fs.target
    [root@ ~]#
    [root@ ~]# systemctl cat nfs-mountd.service | grep network-online
    Wants=network-online.target
    After=network-online.target local-fs.target
    [root@ ~]#
    [root@ ~]# systemctl cat rpc-statd-notify.service | grep network-online
    Wants=network-online.target
    After=local-fs.target network-online.target nss-lookup.target
    [root@ ~]#
    [root@ ~]# systemctl cat rpc-statd.service | grep network-online
    Wants=network-online.target
    After=network-online.target nss-lookup.target rpcbind.socket
    [root@ ~]#

However, "network.target" still exists in the unit files, and I don't know whether we should replace all of them with "network-online.target". Will read the doc and test some more to check it.

-----------------------------------------------------------------
    [root@ ~]# rpm -ql nfs-utils | grep .*service$ | xargs grep "network.target"
    /usr/lib/systemd/system/nfs-server.service:Requires= network.target proc-fs-nfsd.mount
    /usr/lib/systemd/system/nfs.service:Requires= network.target proc-fs-nfsd.mount
    [root@ ~]# systemctl cat nfs | grep 'network.target'
    Requires= network.target proc-fs-nfsd.mount
    [root@ ~]#

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2233