Bug 569944
Summary: | lockd calls statd with source address of first interface instead of localhost | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Ron Wail <ron> |
Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | medium | Docs Contact: | |
Priority: | low | ||
Version: | 12 | CC: | jlayton, steved |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2010-12-03 22:07:41 UTC | Type: | --- |
Description
Ron Wail
2010-03-02 18:57:31 UTC
> rpc.statd[10970]: Call to statd from non-local host <server interface IP>
> rpc.statd[10970]: STAT_FAIL to <server hostname> for SM_MON of <client IP>
This is by design... statd will reject any monitor calls that do
not come from 127.0.0.1
(In reply to comment #1)
> > rpc.statd[10970]: Call to statd from non-local host <server interface IP>
> > rpc.statd[10970]: STAT_FAIL to <server hostname> for SM_MON of <client IP>
>
> This is by design... statd will reject any monitor calls that do
> not come from 127.0.0.1

If you read the original report, rpcbind or lockd is not sending the call to statd via the loopback interface. It is sending it via one of its other interfaces, so the source address is not 127.0.0.1. There is no way to make lockd use localhost/loopback as its source address.

So this is a bug in rpcbind/lockd for not sending from the loopback address, or it's a bug in statd for not accepting monitor calls from one of localhost's interfaces.

I see... thank you for pointing this out. It must be lockd, since rpcbind does not send SM_MON messages... hmm... What kernel version are you using?

    # uname -a
    Linux <hostname> 2.6.31.12-174.2.22.fc12.i686.PAE #1 SMP Fri Feb 19 19:10:04 UTC 2010 i686 i686 i386 GNU/Linux

    # cat /etc/exports
    /srv/nas *(rw,sync,no_root_squash,insecure_locks)

The NFS mount is from VMs running on the host over the virbr0/vnet interfaces. Unfortunately, it's a production box and I don't have a test machine to look at upgrades etc. at the moment, but may have soon.

One complexity is that this box has the following network interfaces:

    lo
    eth1 -> Internet
    eth0 -> 2 x vlan: vlan1 -> 4 multi-home IPs, vlan2 -> single IP
    tun1: OpenVPN
    virbr0: Bridge interface for 8 x VMs
    vnet[0-8]: host visible vnet interfaces of the VMs

It sounds like something is wrong with the routing here.
From nsm_create():

    struct sockaddr_in sin = {
        .sin_family = AF_INET,
        .sin_addr.s_addr = htonl(INADDR_LOOPBACK),
    };
    struct rpc_create_args args = {
        .protocol = XPRT_TRANSPORT_UDP,
        .address = (struct sockaddr *)&sin,
        .addrsize = sizeof(sin),
        .servername = "rpc.statd",
        .program = &nsm_program,
        .version = NSM_VERSION,
        .authflavor = RPC_AUTH_NULL,
        .flags = RPC_CLNT_CREATE_NOPING,
    };

...the destination address for NSM calls is hardcoded to INADDR_LOOPBACK. If the source address of those packets is something other than INADDR_LOOPBACK then the routing table is screwy indeed. I suppose it's also possible that statd has a bug and is rejecting calls from INADDR_LOOPBACK when it shouldn't. It might be worthwhile to sniff traffic on 'lo' and see if you can see these requests and what the source address actually is.

How often does this happen? Would it be possible to get a network trace of lo without the trace growing to an unmanageable size? The trace command could look something like:

    yum install wireshark
    tshark -i lo -w /tmp/data.pcap -R rpc
    bzip2 /tmp/data.pcap

This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

The source of the fault turned out to be an interaction between iptables NAT and lockd. In our NAT tables, after a number of SNAT and DNAT entries, we had:

    -A POSTROUTING -j MASQUERADE

Changing that entry to

    -A POSTROUTING ! -o lo -j MASQUERADE

corrected the problem.

Sorry for not replying until now, but thanks for the assistance along the way.