Bug 1640199

Summary: rpcbproc_getaddr_com() returns correct port value but incorrect IP address
Product: Red Hat Enterprise Linux 7 Reporter: Jean-Louis Rochette <jean.louis.rochette314159>
Component: rpcbindAssignee: Steve Dickson <steved>
Status: CLOSED WONTFIX QA Contact: Yongcheng Yang <yoyang>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.5CC: xzhou, yoyang
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-15 07:43:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Source file part of rpcbind, with proposed fix. none

Description Jean-Louis Rochette 2018-10-17 14:16:59 UTC
Created attachment 1494872 [details]
Source file part of rpcbind, with proposed fix.

Description of problem:
We use Fedora release 28 on the NFS client (kernel version 4.17.14-202.fc28).  This release makes the 'showmount -e' command use the version 4 of the rpcbind protocol.

While with the portmap protocol the query was "Which port value for service XYZ?", with rpcbind it is now "Which universal address for service XYZ?".
In its request the client gives an 'hint' for the desired universal address; In our case this is the IP address of the targeted NFS server.

We use CentOS Linux release 7.3.1611 as the base OS to host a container that runs our NFS servers.
I have checked the rpcbind code which comes with the latest CentOS 7.5.1804 release and it is the same.
  http://vault.centos.org/7.5.1804/os/Source/SPackage/rpcbind-0.2.0-44.el7.src.rpm (2018-04-25)
Especially the util.c / addrmerge() function is the same.
So, though our product currently uses CentOS 7.3.1611, it is clear that the same issue exists in CentOS 7.5.1804.
After I filed a bug entry at:
  https://bugs.centos.org/view.php?id=15379
the CentOS administrator asked me to also report this issue here.

The problem is that the rpcbind program returns the first IP address that matches the hint IP, i.e.: The first interface found that is UP and on the same network as the hint IP address.
rpcbind scans the interfaces in the same order they are displayed by the 'ip addr' command.

For instance (IP addresses are dummy):
# ip addr
1: lo: <UNKNOWN>
    inet 127.0.0.1/8
    inet6 ::1/128 scope host
2: ens160: <UP>
    inet 11.22.34.161/22  <<< the first interface that matches the hint
    inet 11.22.34.164/22
    inet 11.22.34.163/22
    inet 11.22.34.166/22  <<< NFS server "server_5"
    inet 11.22.34.167/22  <<< NFS server "server_6"
[snipped]

[client] showmount -e 11.22.34.166
=> time out

The network trace shows that the client asks for program MOUNTv3 on TCP on server address 11.22.34.166, but rpcbind returns 11.22.34.161 as the best IP where to do the MOUNT requests!

Frame 8728: Internet Protocol Version 4, Src: 11.22.34.187, Dst: 11.22.34.166
Portmap
    [Program Version: 4]
    [V4 Procedure: GETADDR (3)]
    RPCB
        Program: MOUNT (100005)
        Version: 3
        Network Id: tcp
        Universal Address: 11.22.34.166.0.111  <<<<< request IP 11.22.34.166
        Owner of this Service: libtirpc

Frame 8730: Internet Protocol Version 4, Src: 11.22.34.166, Dst: 11.22.34.187
Portmap
    [Program Version: 4]
    [V4 Procedure: GETADDR (3)]
    Universal Address: 11.22.34.161.4.210 <<<< get IP 11.22.34.161

// Since there is no MOUNT service on 11.22.34.161, the MOUNT call times out.
Frame 8738: Internet Protocol Version 4, Src: 11.22.34.187, Dst: 11.22.34.161
Mount Service
    [Program Version: 3]
    [V3 Procedure: EXPORT (5)]
[time out]

We can see the exact same behavior with IPv6 addresses.



Version-Release number of selected component (if applicable):
rpcbind-0.2.0-44.el7.src.rpm (2018-04-25)


How reproducible:
Always

Steps to Reproduce:
1. Run the 'ip addr' command. You must have at least 2 interfaces on the same network.
2. From a Fedora 28 client, run 'showmout -e' on the second interface.
3. A network trace will show that rpcbind returns the first interface instead.


Additional info:
Our current Software-Defined NAS solution implements several NFS servers in the same network namespace.
It is critical that rpcbind doesn't alter the requested IP address when this interface exists and is valid, because 'showmount -e' would report the NFS exports of another NFS server than the one requested by the client, or even an IP address where no MOUNT service is running, causing 'showmount -e' to time out.

The use of Round Robin DNS to do load balancing between NFS servers could be another use case where we wouldn't want rpcbind to change the requested IP address.

Description of the proposed fix:
addrmerge() latches the first matching interface as before (bestif), but now it goes on looking for an exact match (exactif), and returns this exact match if found.
This doesn't require a rpcbind command option, since if the IP requested by the client isn't found, we still return the best match as before.

Following is the proposed code change, which I have tested for IPv4 and IPv6 interfaces, and it perfectly fits our needs.
The modified util.c file is also uploaded.

--- util.c.orig 2018-10-12 15:51:38.303419838 +0200
+++ util.c      2018-10-12 15:56:44.769379833 +0200
@@ -103,7 +103,7 @@
 addrmerge(struct netbuf *caller, char *serv_uaddr, char *clnt_uaddr,
          char *netid)
 {
-       struct ifaddrs *ifap, *ifp = NULL, *bestif;
+       struct ifaddrs *ifap, *ifp = NULL, *bestif, *exactif;
        struct netbuf *serv_nbp = NULL, *hint_nbp = NULL, tbuf;
        struct sockaddr *caller_sa, *hint_sa, *ifsa, *ifmasksa, *serv_sa;
        struct sockaddr_storage ss;
@@ -157,7 +157,12 @@
         * network portion of its address is equal to that of the client.
         * If so, we have found the interface that we want to use.
         */
-       bestif = NULL;
+       bestif = NULL;  /* first interface UP with same network & family */
+       exactif = NULL; /* the interface requested by the client */
+       u_int8_t maskAllBits[16] = {  /* 16 bytes for IPv6 */
+               0xff, 0xff, 0xff, 0xff,  0xff, 0xff, 0xff, 0xff,
+               0xff, 0xff, 0xff, 0xff,  0xff, 0xff, 0xff, 0xff };
+
        for (ifap = ifp; ifap != NULL; ifap = ifap->ifa_next) {
                ifsa = ifap->ifa_addr;
                ifmasksa = ifap->ifa_netmask;
@@ -175,8 +180,16 @@
                        if (!bitmaskcmp(&SA2SINADDR(ifsa),
                            &SA2SINADDR(hint_sa), &SA2SINADDR(ifmasksa),
                            sizeof(struct in_addr))) {
-                               bestif = ifap;
-                               goto found;
+                               if(!bestif) /* for compatibility with previous code */
+                                   bestif = ifap;
+                               /* Is this an exact match? */
+                               if (!bitmaskcmp(&SA2SINADDR(ifsa),
+                                   &SA2SINADDR(hint_sa), maskAllBits,
+                                   sizeof(struct in_addr))) {
+                                   exactif = ifap;
+                                   goto found;
+                               }
+                               /* else go-on looking for an exact match */
                        }
                        break;
 #ifdef INET6
@@ -197,8 +210,16 @@
                        } else if (!bitmaskcmp(&SA2SIN6ADDR(ifsa),
                            &SA2SIN6ADDR(hint_sa), &SA2SIN6ADDR(ifmasksa),
                            sizeof(struct in6_addr))) {
-                               bestif = ifap;
-                               goto found;
+                               if(!bestif) /* for compatibility with previous code */
+                                   bestif = ifap;
+                               /* Is this an exact match? */
+                               if (!bitmaskcmp(&SA2SIN6ADDR(ifsa),
+                                   &SA2SIN6ADDR(hint_sa), maskAllBits,
+                                   sizeof(struct in6_addr))) {
+                                   exactif = ifap;
+                                   goto found;
+                               }
+                               /* else go-on looking for an exact match */
                        }
                        break;
 #endif
@@ -215,10 +236,13 @@
                    (bestif->ifa_flags & (IFF_LOOPBACK | IFF_POINTOPOINT))))
                        bestif = ifap;
        }
+
        if (bestif == NULL)
                goto freeit;

 found:
+       if(exactif)
+           bestif = exactif;
        /*
         * Construct the new address using the the address from
         * `bestif', and the port number from `serv_uaddr'.

Alternate fix:
Returning an IP address in the reply raises the risk of a man-in-the-middle attack that would change the universal address returned by rpcbind to direct the client to a malware server.
A way to avoid this risk and to fix the wrong IP address issue would be to register only version 2 of the portmapper program.  This would force the clients to use PORTMAPv2, where just the port value is returned.

Complementary optional fix:
Today all services seem to be registered in the port mapper for ** any ** IP address:
$ rpcinfo 11.22.34.166
   program version netid     address                service    owner
    100000    4    tcp6      ::.0.111               portmapper superuser
    100000    3    tcp6      ::.0.111               portmapper superuser
[snipped]
    100005    1    udp       0.0.0.0.4.210          mountd     superuser
    100005    1    udp6      ::.4.210               mountd     superuser
    100005    2    udp       0.0.0.0.4.210          mountd     superuser
    100005    2    udp6      ::.4.210               mountd     superuser
    100005    3    udp       0.0.0.0.4.210          mountd     superuser
    100005    3    udp6      ::.4.210               mountd     superuser
    100005    1    tcp       0.0.0.0.4.210          mountd     superuser
    100005    1    tcp6      ::.4.210               mountd     superuser
    100005    2    tcp       0.0.0.0.4.210          mountd     superuser
    100005    2    tcp6      ::.4.210               mountd     superuser
    100005    3    tcp       0.0.0.0.4.210          mountd     superuser
    100005    3    tcp6      ::.4.210               mountd     superuser
[snipped]
The case of a service that would be registered for a specific IP address, as permitted by RPCBINFv3 RPCBPROC_SET(rpcb) doesn't seem to be handled by rpcbind.
  find_service(rpcprog_t prog, rpcvers_t vers, char *netid [,char *saddr])
should check if the service is registered for an IP that fits the IP requested by the client (saddr).
INADDR_ANY fits any.

Thanks!

Comment 2 Yongcheng Yang 2018-10-18 08:26:09 UTC
Hello Jean-Louis, you might need to send the patch to linux-nfs.org and get more eyes on it.

And we can backport it after it merging to upstream first.

Comment 3 Jean-Louis Rochette 2018-10-22 15:40:40 UTC
Hi Yongcheng,
I have just posted an email to this mailing list.
Thanks!
Jean-Louis.

Comment 4 Steve Dickson 2019-09-26 23:25:46 UTC
(In reply to Jean-Louis Rochette from comment #3)
> Hi Yongcheng,
> I have just posted an email to this mailing list.
> Thanks!
> Jean-Louis.

Which mailing list? I don't see it...

Comment 7 RHEL Program Management 2021-02-15 07:43:34 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 8 Red Hat Bugzilla 2023-09-15 00:12:58 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days