Bug 1940341

Summary: CVE-2021-46828 libtirpc: rpcbind sockets remain ESTABLISHED indefinitely after port scan. [rhel-8.6.0]
Product: Red Hat Enterprise Linux 8 Reporter: Ravindra Patil <ravpatil>
Component: libtirpc Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA QA Contact: Zhi Li <yieli>
Severity: medium Docs Contact:
Priority: medium    
Version: 8.3 CC: calum.mackay, dai.ngo, dchong, peter.vreman, saroy, steved, xzhou, yoyang
Target Milestone: rc Keywords: Patch, Reproducer, Security, SecurityTracking, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libtirpc-1.1.4-6.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2038662 (view as bug list) Environment:
Last Closed: 2022-05-10 15:24:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2038662, 2109352    
Attachments:
Description          Flags
patch for rpcbind    none
patch for libtirpc   none

Description Ravindra Patil 2021-03-18 08:57:10 UTC
Description of problem:

rpcbind sockets remain ESTABLISHED indefinitely after port vulnerability scan.

The customer's scan might be keeping the connection open on the client side indefinitely or long enough for a silent firewall to drop the connection and also not send RST to the server.  

In either case, the RHEL7 and RHEL8 tests are set up the same way, yet RHEL7 terminates the connection while RHEL8 does not.  So there is a difference in rpcbind's behavior, and it seems likely to be in the area referenced by the sourceforge link:

https://sourceforge.net/p/libtirpc/mailman/message/34866401/
- That change removed the __svc_clean_idle() call from the my_svc_run() case where poll() returns 0 (timed out); current rpcbind only does "continue" in my_svc_run() when poll() returns 0.

ISSUE:  rpcbind sockets remain ESTABLISHED indefinitely after port scan.
WHERE:  RHEL8  --- didn't happen on RHEL7.

SUMMARY:  It's up to the application, rpcbind in this case, to enable SO_KEEPALIVE or implement its own socket timeouts.
sunrpc uses SO_KEEPALIVE in RHEL7 within xs_tcp_finish_connecting() and RHEL8 does so in xs_tcp_set_socket_timeouts() called from xs_tcp_finish_connecting(), but the kernel sunrpc isn't in control of the rpcbind socket.  

I checked the rpcbind source and found no references to KEEPALIVE in either RHEL7 or RHEL8.
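For illustration only, here is a minimal Python sketch of what "setting SO_KEEPALIVE" on a connected socket looks like. It is not rpcbind code; the helper name enable_keepalive and the timer values are assumptions chosen for the example, and the TCP_KEEP* tuning options are applied only where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive for one socket (hypothetical helper, illustrative values)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning knobs are platform-specific; apply only those that exist.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # seconds of silence before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between unanswered probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before the peer is declared dead
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
```

With keepalive enabled, the kernel would eventually notice a peer that vanished behind a silent firewall and tear the connection down without the application having to track idle time itself.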

Hypothesis:  sunrpc does its own timeouts to cause shutdown of ESTABLISHED connections in RHEL7 but not in RHEL8, or else there is some configuration variable that needs to be adjusted.

LAB test:

RHEL7:
  [NFS Server]
    kernel 3.10.0-1160.11.1.el7.x86_64
    rpcbind-0.2.0-49.el7.x86_64
    # strace -f -o /tmp/rpcbind.strace -p `pidof rpcbind`
    # iptables -I INPUT 1 --ipv4 -p tcp -s 192.168.122.1 --sport 40052 --dport 111 -j DROP    
  [client] 
    nc -4 192.168.122.18 111
    ... ephemeral port chosen was 40052 ...
    killall nc

  Finding:  rpcbind did not set SO_KEEPALIVE* on the socket, and netstat confirmed that keepalive was off. strace showed accept followed by two 30s timed-out polls, after which rpcbind closed the socket. This caused the server to send FIN and initiate closure; the connection went to FIN_WAIT1, then timed out and cleared.

   
RHEL8:
  [NFS server]
    kernel 4.18.0-240.1.1.el8_3.x86_64
    rpcbind-1.2.5-7.el8.x86_64
    # strace -f -o /tmp/rpcbind.strace -p `pidof rpcbind`
  [client]
    nc -4 192.168.122.1 111
    .... ephemeral port chosen 34112 ...
    iptables -I OUTPUT 1 --ipv4 -p tcp -d 192.168.122.1 --sport 34112 --dport 111 -j DROP 
    killall nc

   Finding: rpcbind continues to poll and time out every 30s but never closes the connection.  The connection remained ESTABLISHED indefinitely.
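The difference between the two findings comes down to whether idle connections are cleaned up when poll() times out. The following Python sketch shows that idea under stated assumptions: the class IdleReaper, its method names, and the 30s threshold are hypothetical and are not taken from rpcbind or libtirpc.

```python
import time


class IdleReaper:
    """Track last activity per connection and close the ones idle too long (sketch)."""

    def __init__(self, idle_timeout=30.0, clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self.clock = clock          # injectable so the logic can be tested without sleeping
        self.last_active = {}       # connection object -> time of last activity

    def touch(self, conn):
        """Record activity: call on accept and whenever a request arrives."""
        self.last_active[conn] = self.clock()

    def clean_idle(self):
        """Close every connection idle for at least idle_timeout; return how many."""
        now = self.clock()
        stale = [c for c, t in self.last_active.items() if now - t >= self.idle_timeout]
        for conn in stale:
            del self.last_active[conn]
            conn.close()
        return len(stale)
```

A service loop would call clean_idle() in the branch where poll() returns no events, which is the branch that currently only continues on RHEL8.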

I have checked with the package maintainer: there has been no such change in rpcbind, but there might be changes in the implementation of the systemd sockets, since systemd creates the sockets for rpcbind.

Version-Release number of selected component (if applicable):
rpcbind-1.2.5-7.el8.x86_64

How reproducible:
rpcbind connections are established while scanning the system with vulnerability scanner

Steps to Reproduce:
1. Scan system with vulnerability scanner
2. The scanner opens rpcbind connections to the system
3. Check if rpcbind connections are closed
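Step 3 can be checked with netstat or ss. As an alternative, the sketch below counts ESTABLISHED connections on the rpcbind port by parsing text in the Linux /proc/net/tcp format. The function name count_established is an assumption for this example; it relies only on the documented column layout, where the local address is hex "ADDR:PORT" and state 01 means ESTABLISHED.

```python
def count_established(proc_net_tcp_text, port=111):
    """Count ESTABLISHED sockets whose local port matches, given /proc/net/tcp text."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_port = int(fields[1].rsplit(":", 1)[1], 16)  # the port is hex-encoded
        if local_port == port and fields[3] == "01":       # 01 = TCP_ESTABLISHED
            count += 1
    return count
```

On the server, count_established(open("/proc/net/tcp").read()) gives the IPv4 count; /proc/net/tcp6 uses the same layout for IPv6. A count that keeps growing after the scan has finished matches the behavior reported here.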

Actual results:

rpcbind continues to poll and time out every 30s but never closes the connection.  The connection remained ESTABLISHED indefinitely.

Expected results:

rpcbind established connections should be closed 

Additional info:

Comment 3 Dai Ngo 2021-08-05 18:45:27 UTC
Created attachment 1811286 [details]
patch for rpcbind

Comment 4 Dai Ngo 2021-08-05 18:47:16 UTC
This bug is more serious than what was described above.

When the number of idle ESTABLISHED connections reaches the limit of open file descriptors (ulimit -n) then accept(2) fails with EMFILE.
Currently svc_run (libtirpc) and my_svc_run (rpcbind) do not handle the EMFILE error returned from accept(2); they just ignore the error and
continue. This causes svc_run/my_svc_run to get into a tight loop calling accept(2). Once it gets into this state, rpcbind cannot service
any requests, which effectively takes the RPC service down.  Note that mountd and statd suffer from the same problem. This is a DoS vulnerability
in rpcbind, mountd, statd and any other consumer of svc_run in libtirpc. These RPC services are essential for NFSv3 operations.

The problem in libtirpc was introduced by commit:
b2c9430f46c4 Use poll() instead of select() in svc_run()

The problem in rpcbind was introduced by commit:
44bf15b8 rpcbind: don't use obsolete svc_fdset interface of libtirpc

These commits removed the handling of EMFILE returned by accept(2) and the handling of poll timeout in svc_run/my_svc_run.
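To make the missing EMFILE handling concrete, here is a small Python sketch of one way an accept path can react when the descriptor limit is hit: reclaim an idle connection instead of retrying immediately. This is not the attached patch or the libtirpc code; accept_or_reclaim and the idle_conns list (assumed to be ordered oldest first) are hypothetical names for this example.

```python
import errno


def accept_or_reclaim(listener, idle_conns):
    """Accept one connection; on EMFILE, close the oldest idle one and return None."""
    try:
        conn, _addr = listener.accept()
        return conn
    except OSError as exc:
        if exc.errno != errno.EMFILE:
            raise  # other errors are not handled by this sketch
        if idle_conns:
            # Free one descriptor so a later accept() can succeed instead of
            # failing again immediately and spinning the service loop.
            idle_conns.pop(0).close()
        return None
```

Without a step like this, the listening socket stays readable, poll() returns at once, accept() fails again, and the loop spins as described above.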

The problem can be reproduced using open source tool 'nc' (ncat). One can run this script to take the RPC service down:

#!/bin/sh
  
# Usage: td.sh server dst_port conn_cnt

if [ $# -ne 3 ]; then
        # Print usage to stderr and exit with a failure status
        echo "$0: server dst_port conn_cnt" >&2
        exit 1
fi
server=$1
dport=$2
conn_cnt=$3

echo "dport[$dport] server[$server] conn_cnt[$conn_cnt]"

pcnt=0
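while [ $pcnt -lt $conn_cnt ]
do
        # The source port is ephemeral and chosen by the kernel, so report the counter instead
        echo "opening connection $pcnt to $server:$dport"
        nc -v --recv-only $server $dport &
        pcnt=`expr $pcnt + 1`
done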

# ./td.sh server 111 1024   /* take down rpcbind service */

Comment 5 Dai Ngo 2021-08-05 18:47:54 UTC
Created attachment 1811287 [details]
patch for libtirpc

Comment 7 Steve Dickson 2021-10-13 14:11:30 UTC
(In reply to Dai Ngo from comment #5)
> Created attachment 1811287 [details]
> patch for libtirpc

I'm a bit confused... was this problem taken care of by
commit 86529758570cef4c73fb9b9c4104fdc510f701ed 
Author: Dai Ngo <dai.ngo>
Date:   Sat Aug 21 13:16:23 2021 -0400

    Fix DoS vulnerability in libtirpc

in both libtirpc and rpcbind?

Comment 8 Dai Ngo 2021-10-18 22:29:48 UTC
Yes, this problem was taken care of by the fix in libtirpc with commit 86529758570cef4c73fb9b9c4104fdc510f701ed.
There is no need to do anything with rpcbind.

Comment 9 Yongcheng Yang 2021-10-25 03:10:26 UTC
(In reply to Dai Ngo from comment #8)
> Yes, this problem was taken care of by the fix in libtirpc with commit
> 86529758570cef4c73fb9b9c4104fdc510f701ed.
> There is no need to do anything with rpcbind.

Based on that, I'm re-setting this bug's component to libtirpc, which may clear some flags.

Comment 15 Zhi Li 2021-12-05 12:11:02 UTC
Moving to VERIFIED according to comment#14.

Comment 17 errata-xmlrpc 2022-05-10 15:24:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (libtirpc bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2065

Comment 18 Steve Dickson 2022-08-02 14:48:36 UTC
*** Bug 2109404 has been marked as a duplicate of this bug. ***

Comment 19 Steve Dickson 2022-08-02 14:51:48 UTC
*** Bug 2109403 has been marked as a duplicate of this bug. ***