Bug 1422675
| Summary: | nfs v4 with kerberos fails to mount | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | chris vogan <chrisvogan> |
| Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
| Status: | CLOSED NOTABUG | QA Contact: | Yongcheng Yang <yoyang> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 7.3 | CC: | adam.winberg, agaikwad, bfields, brian, dwysocha, extras-qa, jiyin, jj.sarton, jlayton, mark.crossland, R.Eggermont, rkothiya, roysjosh, steved, tomek, Waheed.barghouthi, xzhou, yoyang |
| Target Milestone: | rc | Keywords: | Reproducer |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1192806 | Environment: | |
| Last Closed: | 2020-01-07 18:49:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

Description
chris vogan
2017-02-15 20:20:19 UTC
I have been hit by this issue on the more current RHEL 7.3. After a system reboot, Kerberos mounts no longer work until I restart rpc.gssd.

I am using nfs-utils-1.3.0-0.33.el7.x86_64
Kernel: 3.10.0-514.2.2.el7.x86_64

I was able to resolve the issue by having rpc-gssd.service wait for network-online.target.

I have the same issue on some of my systems (why it happens on those while the others work is a mystery to me). The logged message is a bit different, however:

Feb 23 22:39:07 gklab-20-082 rpc.gssd[1537]: ERROR: unable to resolve 172.28.168.84 to hostname: Name or service not known
Feb 23 22:39:07 gklab-20-082 rpc.gssd[1537]: ERROR: failed to parse nfs/clnt0/info
Feb 23 22:39:07 gklab-20-082 rpc.gssd[1537]: ERROR: can't openat nfs/clnt0: No such file or directory
Feb 23 22:39:07 gklab-20-082 rpc.gssd[1537]: ERROR: unable to resolve 172.28.168.84 to hostname: Name or service not known
Feb 23 22:39:07 gklab-20-082 rpc.gssd[1537]: ERROR: failed to parse nfs/clnt1/info
Feb 23 22:39:07 gklab-20-082 rpc.gssd[1537]: ERROR: unable to resolve 172.28.168.84 to hostname: Name or service not known
Feb 23 22:39:07 gklab-20-082 rpc.gssd[1537]: ERROR: failed to parse nfs/clnt1/info

Restarting rpc.gssd helps here as well. /etc/hosts only holds the 127.0.0.1 and ::1 addresses. dig, host and nslookup correctly resolve the above-mentioned address to the NFS server name, and forward resolution points back to the same address. The client uses DHCP for interface configuration, and I wouldn't be surprised if the problem was a contention between the DHCP client's address assignment and configuration updates (/etc/resolv.conf) on one side and rpc.gssd on the other, with rpc.gssd starting so early that it doesn't get the DNS server addresses from a not-yet-existent (or old) /etc/resolv.conf.

Ok, the easiest way to reproduce:
1. Edit /etc/resolv.conf so it is empty, or move it somewhere else.
2. Restart rpc.gssd (I did systemctl restart nfs-secure).
3. Restore /etc/resolv.conf to working condition so that resolution works.
4. Attempt to mount an NFS share with Kerberos security.
5. The mount fails and the log file contains the above-mentioned entries.

Apparently rpc.gssd only reads /etc/resolv.conf when it starts and caches it somewhere. I am not sure this is how it is supposed to work, and it is apparently error-prone.

I can reproduce this on RHEL 7.4 clients that use DHCP from a fresh boot. I can confirm that restarting the rpc-gssd service then fixes the issue until the box is rebooted again. I can also reproduce using the steps outlined by Tomasz Kepczynski.

JFYI, we recently updated some systemd scripts of nfs-utils to resolve a similar issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1409012#c2
My 2 cents is that maybe rpc-gssd.service needs that update also.

P.S. I just checked that the current upstream code is the same as RHEL 7.4. Possibly we need some experts to submit an upstream patch first.

Increasing the bug priority, as the bug has been open for quite some time and the customer needs a fix.

(In reply to Rinku from comment #6)
> Increasing the bug priority, as the bug has been open for quite some time
> and the customer needs a fix.

I see this error when rpc.gssd is started without an /etc/resolv.conf:

ERROR: unable to resolve 172.31.1.60 to hostname: Name or service not known
ERROR: failed to parse nfs/clntb/info

but when I restore /etc/resolv.conf and do a krb5 mount... it works!

I see this all the time for IPv6 SLAAC-configured hosts. These typically will not have hostnames that resolve from their IP addresses.

What exactly is rpc.gssd trying to resolve IP addresses into names for?

What triggers this resolution to need to happen?

To cut the story short: Kerberos HEAVILY depends on name resolution.

So TL;DR: rpc.gssd, and therefore NFSv4 etc., are going to be quite incompatible with SLAAC unless there is some mechanism in SLAAC to register reverse address records for SLAAC-obtained IP addresses.

Please could you expand on why this is "NOTABUG"?
It fails to work from a fresh boot unless you do a manual restart of the rpc.gssd service, which means that mounting NFS shares using Kerberos also does not work without manual intervention. This means the functionality fails to work in a DHCP environment (ours connects to Active Directory) where /etc/resolv.conf is generated by NetworkManager.

Agree with Mark Crossland. My most recent question(s) were not even answered.

(In reply to Mark Crossland from comment #14)
> Please could you expand on why this is "NOTABUG"? It fails to work from a
> fresh boot unless you do a manual restart of the rpc.gssd service, which
> means that mounting NFS shares using Kerberos also does not work without
> manual intervention. This means the functionality fails to work in a DHCP
> environment (ours connects to Active Directory) where /etc/resolv.conf is
> generated by NetworkManager.

Can you still reproduce this problem with:
- the latest RHEL7 release (7.7)
- a working DNS environment

This bug was reported 3 years ago, on a very early version of RHEL7, and cloned from a Fedora bug. It had no activity for over a year. Normally critical bugs seen by many users have more activity than this.

Can you define:
- a working DNS environment

Does that just mean there is a resolver available, or does it mean that it has to have (valid and accurate) reverse mappings for any host that tries to use NFS? The latter is likely not/never (but I never say never) going to exist with things like IPv6 SLAAC.

I think it would be great to get an answer to the questions below, asked by Brian J. Murrell.

What exactly is rpc.gssd trying to resolve IP addresses into names for?

What triggers this resolution to need to happen?

(In reply to Waheed Barghouthi from comment #18)
> I think it would be great to get an answer to the questions below, asked by
> Brian J. Murrell.
>
> What exactly is rpc.gssd trying to resolve IP addresses into names for?
>
> What triggers this resolution to need to happen?

I'm also curious, can anyone kindly answer these questions :)
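
For reference, the workaround mentioned in the description (having rpc-gssd.service wait for network-online.target) might be expressed as a systemd drop-in along the lines below. This is only a sketch, not a confirmed fix from this bug: the drop-in file name is hypothetical, and it assumes a wait-online service such as NetworkManager-wait-online.service is enabled, since otherwise network-online.target does not actually delay startup until the network (and with it /etc/resolv.conf) is configured.

```ini
# /etc/systemd/system/rpc-gssd.service.d/wait-online.conf
# (hypothetical file name; any *.conf file in this drop-in directory is read)
[Unit]
# Pull network-online.target into the boot transaction ...
Wants=network-online.target
# ... and order rpc-gssd.service after it.
After=network-online.target
```

After creating the file, `systemctl daemon-reload` is needed for systemd to pick it up. The ordering only affects when rpc.gssd starts; it does not change how rpc.gssd resolves names afterwards.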