Description of problem:

The dns-default and dns-operator pods' /metrics endpoints are accessible via both an anonymous HTTP port and an HTTPS port that uses RBAC. I believe these are supposed to be accessible only through the kube-rbac-proxy port.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
Using the openshift-dns/dns-default daemonset as an example:

1. Get the pod IP of a DNS pod:

dns-default-q87jh   3/3   Running   0   5d4h   172.30.247.96   10.171.188.47   <none>   <none>

2. Get the plain HTTP port from the controller deployment or daemonset's kube-rbac-proxy container:

--upstream=http://127.0.0.1:9153/

3. From another pod, curl the metrics endpoint:

curl http://172.30.247.96:9153/metrics

Actual results:

The DNS pod's :9153 is accessible from other pods:

/ # curl -sk http://172.30.247.113:9153/metrics
# HELP coredns_build_info A metric with a constant '1' value labeled by version, revision, and goversion from which CoreDNS was built.
# TYPE coredns_build_info gauge
coredns_build_info{goversion="go1.13.4",revision="",version="1.6.6"} 1

Similarly, the dns-operator's :60000 is accessible from other pods:

dns-operator-576bc86f5-gwg8f   2/2   Running   0   3d4h   172.30.247.83   10.171.188.47   <none>   <none>

/ # curl -sk http://172.30.247.83:60000/metrics
# HELP controller_runtime_reconcile_errors_total Total number of reconciliation errors per controller
# TYPE controller_runtime_reconcile_errors_total counter
controller_runtime_reconcile_errors_total{controller="dns_controller"} 4

Expected results:

The plain HTTP port should not be accessible from another pod; only the port exposed by the kube-rbac-proxy should be.

Additional info:

This was done on an IBM Cloud OpenShift 4.5 cluster.

Using crictl and 'ss -lntp' in the pod netns, I see port 9153 bound to all interfaces:

LISTEN 0 128 [::]:9153 [::]:* users:(("coredns",pid=21019,fd=6))

I believe the coredns prometheus plugin can be configured with:

prometheus localhost:9153

The dns-operator HTTP port is also bound to all interfaces:

LISTEN 0 128 [::]:60000 [::]:* users:(("dns-operator",pid=17538,fd=5))
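A quick way to confirm which ports are involved (step 2 above) is to dump the kube-rbac-proxy sidecar's arguments. This is just a sketch; the exact flag values, in particular the secure port, vary by release, so treat it as illustrative rather than authoritative:

# Show the kube-rbac-proxy args on the DNS daemonset; --upstream is the loopback
# metrics port and --secure-listen-address is the RBAC-protected port that should
# be the only one reachable from other pods.
oc -n openshift-dns get daemonset/dns-default \
  -o jsonpath='{.spec.template.spec.containers[?(@.name=="kube-rbac-proxy")].args}'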
Not to make this bug open-ended, but I have these directions for a pen test that scans an entire cluster for pod IPs with a /metrics endpoint that can be called anonymously. This might be useful for identifying other components that have the same issue.

1. Create a sample deployment:

kubectl create deployment --image nginx my-nginx

2. Exec into the pod (a sketch follows these steps).

3. Execute the following inside the pod to start an nmap scan inside of a tmux session and wait until it completes. The scan takes about 25 minutes:

apt update && apt install git curl nmap python3-pip net-tools wget tmux -y && git clone https://github.com/maaaaz/nmaptocsv.git && wget https://github.com/projectdiscovery/httpx/releases/download/v1.0.2/httpx_1.0.2_linux_amd64.tar.gz && tar -xvf httpx_1.0.2_linux_amd64.tar.gz && mv httpx /usr/local/bin/httpx && tmux new-session -s podscan -d "nmap -p- --min-rate 1000 --min-hostgroup 64 -T4 -v --open -T4 172.30.0.0/16 -oA portscanresults"

4. Wait for the scan to complete. After the scan completes the tmux session will exit. You can monitor with 'tmux ls' until the session exits.

5. Run the following command from inside the same pod:

python3 nmaptocsv/nmaptocsv.py -i portscanresults.gnmap -d ":" -f ip-port | tr -d '"' | sort -u | httpx -path /metrics -content-length -status-code

The output will show all HTTP/HTTPS services any pod can reach from within the cluster, along with the /metrics response code and content length for each.
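For step 2, the exec would look something like this (the pod name is a placeholder; app=my-nginx is the label that 'kubectl create deployment' applies by default):

# Find the sample pod and open a shell in it.
kubectl get pods -l app=my-nginx
kubectl exec -it <my-nginx-pod-name> -- /bin/bash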
The ingress operator has the same problem:

% oc -n openshift-ingress-operator rsh -c ingress-operator deploy/ingress-operator ss -ltp
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       0       *:9393              *:*
LISTEN  0       0       *:60000             *:*      users:(("ingress-operato",pid=1,fd=11))

60000 is the operator, and 9393 is kube-rbac-proxy.

https://github.com/openshift/cluster-ingress-operator/pull/490 changes the listen address from *:60000 to 127.0.0.1:60000.
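For completeness, this is roughly how the endpoint is meant to be consumed once only the kube-rbac-proxy port is exposed. A sketch, assuming the prometheus-k8s service account is bound to the RBAC needed to scrape metrics and that 9393 is the proxy's secure port as shown in the ss output above (the pod IP is a placeholder; on newer clusters 'oc create token' replaces 'oc sa get-token'):

# Token for an identity that is allowed to hit the metrics endpoint
# (prometheus-k8s is an assumption; any account with the right RBAC works).
TOKEN=$(oc -n openshift-monitoring sa get-token prometheus-k8s)

# Authenticated request through the kube-rbac-proxy port should succeed ...
curl -sk -H "Authorization: Bearer $TOKEN" https://<ingress-operator-pod-ip>:9393/metrics | head

# ... while an anonymous request to the same port should be rejected (401/403).
curl -sk -o /dev/null -w '%{http_code}\n' https://<ingress-operator-pod-ip>:9393/metrics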
PRs are posted and approved, just hitting some CI failures, which appear to be flakes. We'll try to finish this up in the upcoming sprint.
Should we open separate Bugzilla reports for each of the problems identified by the scanner?
Tested with 4.7.0-0.nightly-2020-11-25-114114 and passed.

### dns pod
sh-4.4# ss -lntp
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       128     127.0.0.1:9153      0.0.0.0:*

### dns operator pod
sh-4.4$ ss -lntp
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       0       127.0.0.1:60000     0.0.0.0:*    users:(("dns-operator",pid=1,fd=7))

### ingress operator pod
sh-4.4$ ss -lntp
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       0       127.0.0.1:60000     0.0.0.0:*    users:(("ingress-operato",pid=1,fd=11))
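As a complementary check (a sketch; pod IPs are placeholders), the plain HTTP ports should now be unreachable from other pods, since the listeners are bound to loopback inside each pod:

# Run from any other pod in the cluster; both requests should now fail to connect.
curl -s --max-time 5 http://<dns-pod-ip>:9153/metrics || echo "dns :9153 unreachable (expected)"
curl -s --max-time 5 http://<dns-operator-pod-ip>:60000/metrics || echo "operator :60000 unreachable (expected)"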
(In reply to Richard Theis from comment #4)
> Should we open separate Bugzilla reports for each of the problems identified
> by the scanner?

Sorry for not responding to this earlier. For this BZ, I've fixed the problem for the ingress and DNS operators. If the problem is identified in other components, please file separate Bugzilla reports.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633