Description of problem:

The dns-default and dns-operator pods' /metrics endpoints are accessible via both an anonymous HTTP port and an HTTPS port that uses RBAC. I believe these are supposed to be accessible only through the kube-rbac-proxy port.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
Using the openshift-dns/dns-default daemonset as an example:

1. Get the pod IP of a DNS pod:

dns-default-q87jh   3/3   Running   0   5d4h   172.30.247.96   10.171.188.47   <none>   <none>

2. Get the plain HTTP port from the controller deployment or daemonset's kube-rbac-proxy container:

--upstream=http://127.0.0.1:9153/

3. From another pod, curl the metrics endpoint:

curl http://172.30.247.96:9153/metrics

Actual results:

The DNS pod's :9153 is accessible from other pods:

/ # curl -sk http://172.30.247.113:9153/metrics
# HELP coredns_build_info A metric with a constant '1' value labeled by version, revision, and goversion from which CoreDNS was built.
# TYPE coredns_build_info gauge
coredns_build_info{goversion="go1.13.4",revision="",version="1.6.6"} 1

Similarly, the dns-operator's :60000 is accessible from other pods:

dns-operator-576bc86f5-gwg8f   2/2   Running   0   3d4h   172.30.247.83   10.171.188.47   <none>   <none>

/ # curl -sk http://172.30.247.83:60000/metrics
# HELP controller_runtime_reconcile_errors_total Total number of reconciliation errors per controller
# TYPE controller_runtime_reconcile_errors_total counter
controller_runtime_reconcile_errors_total{controller="dns_controller"} 4

Expected results:

The plain HTTP port should not be accessible from another pod; only the port exposed by the kube-rbac-proxy should be.

Additional info:

This was done on an IBM Cloud OpenShift 4.5 cluster.

Using crictl and 'ss -lntp' in the pod netns, I see port 9153 bound to all interfaces:

LISTEN 0 128 [::]:9153 [::]:* users:(("coredns",pid=21019,fd=6))

I believe the coredns prometheus plugin can be configured with:

prometheus localhost:9153

The dns-operator HTTP port is also bound to all interfaces:

LISTEN 0 128 [::]:60000 [::]:* users:(("dns-operator",pid=17538,fd=5))
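A quick way to confirm which ports are involved (step 2 above) is to dump the kube-rbac-proxy sidecar's arguments. This is just a sketch; the exact flag values, in particular the secure port, vary by release, so treat it as illustrative rather than authoritative:

# Show the kube-rbac-proxy args on the DNS daemonset; --upstream is the loopback
# metrics port and --secure-listen-address is the RBAC-protected port that should
# be the only one reachable from other pods.
oc -n openshift-dns get daemonset/dns-default \
  -o jsonpath='{.spec.template.spec.containers[?(@.name=="kube-rbac-proxy")].args}'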
Not to make this bug open-ended, but I have these directions for a pen test that scans an entire cluster for pod IPs with a /metrics endpoint that can be called anonymously. This might be useful for identifying other components that have the same issue.

1. Create a sample deployment:

kubectl create deployment --image nginx my-nginx

2. Exec into the pod (a sketch follows these steps).

3. Execute the following inside the pod to start an nmap scan inside of a tmux session and wait until it completes. The scan takes about 25 minutes:

apt update && apt install git curl nmap python3-pip net-tools wget tmux -y && git clone https://github.com/maaaaz/nmaptocsv.git && wget https://github.com/projectdiscovery/httpx/releases/download/v1.0.2/httpx_1.0.2_linux_amd64.tar.gz && tar -xvf httpx_1.0.2_linux_amd64.tar.gz && mv httpx /usr/local/bin/httpx && tmux new-session -s podscan -d "nmap -p- --min-rate 1000 --min-hostgroup 64 -T4 -v --open -T4 172.30.0.0/16 -oA portscanresults"

4. Wait for the scan to complete. After the scan completes the tmux session will exit. You can monitor with 'tmux ls' until the session exits.

5. Run the following command from inside the same pod:

python3 nmaptocsv/nmaptocsv.py -i portscanresults.gnmap -d ":" -f ip-port | tr -d '"' | sort -u | httpx -path /metrics -content-length -status-code

The output will show all HTTP/HTTPS services any pod can reach from within the cluster, along with the /metrics response code and content length for each.
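For step 2, the exec would look something like this (the pod name is a placeholder; app=my-nginx is the label that 'kubectl create deployment' applies by default):

# Find the sample pod and open a shell in it.
kubectl get pods -l app=my-nginx
kubectl exec -it <my-nginx-pod-name> -- /bin/bash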
The ingress operator has the same problem:

% oc -n openshift-ingress-operator rsh -c ingress-operator deploy/ingress-operator ss -ltp
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       0       *:9393              *:*
LISTEN  0       0       *:60000             *:*      users:(("ingress-operato",pid=1,fd=11))

60000 is the operator, and 9393 is kube-rbac-proxy.

https://github.com/openshift/cluster-ingress-operator/pull/490 changes the listen address from *:60000 to 127.0.0.1:60000.
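For completeness, this is roughly how the endpoint is meant to be consumed once only the kube-rbac-proxy port is exposed. A sketch, assuming the prometheus-k8s service account is bound to the RBAC needed to scrape metrics and that 9393 is the proxy's secure port as shown in the ss output above (the pod IP is a placeholder; on newer clusters 'oc create token' replaces 'oc sa get-token'):

# Token for an identity that is allowed to hit the metrics endpoint
# (prometheus-k8s is an assumption; any account with the right RBAC works).
TOKEN=$(oc -n openshift-monitoring sa get-token prometheus-k8s)

# Authenticated request through the kube-rbac-proxy port should succeed ...
curl -sk -H "Authorization: Bearer $TOKEN" https://<ingress-operator-pod-ip>:9393/metrics | head

# ... while an anonymous request to the same port should be rejected (401/403).
curl -sk -o /dev/null -w '%{http_code}\n' https://<ingress-operator-pod-ip>:9393/metrics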
PRs are posted and approved, just hitting some CI failures, which appear to be flakes. We'll try to finish this up in the upcoming sprint.
Should we open separate Bugzilla reports for each of the problems identified by the scanner?
Tested with 4.7.0-0.nightly-2020-11-25-114114 and passed.

### dns pod
sh-4.4# ss -lntp
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       128     127.0.0.1:9153      0.0.0.0:*

### dns operator pod
sh-4.4$ ss -lntp
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       0       127.0.0.1:60000     0.0.0.0:*    users:(("dns-operator",pid=1,fd=7))

### ingress operator pod
sh-4.4$ ss -lntp
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       0       127.0.0.1:60000     0.0.0.0:*    users:(("ingress-operato",pid=1,fd=11))
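As a complementary check (a sketch; pod IPs are placeholders), the plain HTTP ports should now be unreachable from other pods, since the listeners are bound to loopback inside each pod:

# Run from any other pod in the cluster; both requests should now fail to connect.
curl -s --max-time 5 http://<dns-pod-ip>:9153/metrics || echo "dns :9153 unreachable (expected)"
curl -s --max-time 5 http://<dns-operator-pod-ip>:60000/metrics || echo "operator :60000 unreachable (expected)"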
(In reply to Richard Theis from comment #4)
> Should we open separate Bugzilla reports for each of the problems identified
> by the scanner?

Sorry for not responding to this earlier. For this BZ, I've fixed the problem for the ingress and DNS operators. If the problem is identified in other components, please file separate Bugzilla reports.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633