Bug 2100465 - cluster-baremetal-validating-webhook-configuration is pointing to a service/path that does not exist work
Summary: cluster-baremetal-validating-webhook-configuration is pointing to a service/p...
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.10
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Honza Pokorny
QA Contact: Amit Ugol
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-23 12:46 UTC by Simon Reber
Modified: 2023-03-09 01:22 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-03-09 01:22:08 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 6964520 | 0 | None | None | None | 2022-06-23 12:47:08 UTC

Description Simon Reber 2022-06-23 12:46:40 UTC
Description of problem:

The `cluster-baremetal-validating-webhook-configuration` `ValidatingWebhookConfiguration` points to port 443 (9443 on the pod) and path `/validate-metal3-io-v1alpha1-provisioning`, which do not appear to exist.

Even though port 9443 is configured in the `cluster-baremetal-operator` pod, nothing is listening on it, and `/validate-metal3-io-v1alpha1-provisioning` does not appear to be available either.

 > $ oc get infrastructure  cluster -o json
 > {
 >     "apiVersion": "config.openshift.io/v1",
 >     "kind": "Infrastructure",
 >     "metadata": {
 >         "creationTimestamp": "2022-06-01T11:36:25Z",
 >         "generation": 1,
 >         "name": "cluster",
 >         "resourceVersion": "588",
 >         "uid": "a53781d1-1017-4c57-a96a-99151cbcf6b3"
 >     },
 >     "spec": {
 >         "cloudConfig": {
 >             "name": ""
 >         },
 >         "platformSpec": {
 >             "type": "None"
 >         }
 >     },
 >     "status": {
 >         "apiServerInternalURI": "https://api-int.example.lab.example.com:6443",
 >         "apiServerURL": "https://api.example.lab.example.com:6443",
 >         "controlPlaneTopology": "HighlyAvailable",
 >         "etcdDiscoveryDomain": "",
 >         "infrastructureName": "example-f9t2l",
 >         "infrastructureTopology": "HighlyAvailable",
 >         "platform": "None",
 >         "platformStatus": {
 >             "type": "None"
 >         }
 >     }
 > }

 > $ oc exec -c cluster-baremetal-operator cluster-baremetal-operator-847d7bddbc-dkd2m -- ss -tulpen
 > Netid State  Recv-Q Send-Q Local Address:Port Peer Address:PortProcess                                                          
 > tcp   LISTEN 0      0                  *:8080            *:*    users:(("cluster-baremet",pid=1,fd=7)) uid:65534 ino:194944 sk:0
 > tcp   LISTEN 0      0                  *:8443            *:*    uid:65534 ino:190901 sk:0                                       
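
The check above can also be scripted. The following is a minimal sketch, assuming the `ss -tulpen` output has been captured as plain text; the function name `listening_ports` is hypothetical and not part of any OpenShift tooling. It extracts the TCP ports in `LISTEN` state so that the absence of 9443 can be asserted directly.

```python
def listening_ports(ss_output: str) -> set:
    """Return the TCP ports in LISTEN state found in `ss -tulpen` text output."""
    ports = set()
    for line in ss_output.splitlines():
        fields = line.split()
        # Expected columns: Netid State Recv-Q Send-Q Local-Address:Port Peer-Address:Port ...
        if len(fields) < 5 or fields[0] != "tcp" or fields[1] != "LISTEN":
            continue  # skips the header row and any non-listening sockets
        port = fields[4].rsplit(":", 1)[-1]
        if port.isdigit():
            ports.add(int(port))
    return ports


# Output captured from the cluster-baremetal-operator container in this report.
SS_OUTPUT = """\
Netid State  Recv-Q Send-Q Local Address:Port Peer Address:PortProcess
tcp   LISTEN 0      0                  *:8080            *:*    users:(("cluster-baremet",pid=1,fd=7)) uid:65534 ino:194944 sk:0
tcp   LISTEN 0      0                  *:8443            *:*    uid:65534 ino:190901 sk:0
"""

ports = listening_ports(SS_OUTPUT)
print(sorted(ports))   # [8080, 8443]
print(9443 in ports)   # False
```

With the output shown in this report, only 8080 and 8443 are reported, which matches the observation that nothing listens on 9443.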

Specifically, with https://github.com/openshift/cluster-kube-apiserver-operator/pull/1265 and https://github.com/openshift/cluster-kube-apiserver-operator/pull/1313 in place, this causes unnecessary errors in the `kube-apiserver-operator` logs and thus confusion.

Generally, admission plugins are expected to work and be available at all times, because admission plugins that do not work properly can affect API performance and stability.
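
To audit which endpoints a webhook configuration actually depends on, the service references can be pulled out of its JSON form. This is a rough sketch, assuming the object has been exported (for example with `-o json`) and loaded as a Python dict. The function name `webhook_targets` and the webhook name in the sample are hypothetical; the service name, namespace, port and path in the sample are the ones quoted in this report, not a dump of the real object.

```python
def webhook_targets(config: dict) -> list:
    """List (webhook name, namespace, service, port, path) for service-backed webhooks."""
    targets = []
    for hook in config.get("webhooks", []):
        svc = hook.get("clientConfig", {}).get("service")
        if svc is None:
            continue  # webhook is addressed by URL rather than by a Service
        targets.append((
            hook.get("name"),
            svc["namespace"],
            svc["name"],
            svc.get("port", 443),  # the API defaults the service port to 443 when omitted
            svc.get("path", ""),
        ))
    return targets


# Illustrative sample only; "example-webhook.hypothetical" is a placeholder name.
EXAMPLE = {
    "kind": "ValidatingWebhookConfiguration",
    "metadata": {"name": "cluster-baremetal-validating-webhook-configuration"},
    "webhooks": [
        {
            "name": "example-webhook.hypothetical",
            "clientConfig": {
                "service": {
                    "namespace": "openshift-machine-api",
                    "name": "cluster-baremetal-webhook-service",
                    "path": "/validate-metal3-io-v1alpha1-provisioning",
                }
            },
        }
    ],
}

for name, namespace, service, port, path in webhook_targets(EXAMPLE):
    # Builds the in-cluster URL the API server would call for this webhook.
    print(f"https://{service}.{namespace}.svc:{port}{path}")
```

For the sample, the printed URL is the same one used in the `curl` reproduction steps, so each extracted target can be fed to a reachability check.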

Version-Release number of selected component (if applicable):

 - OpenShift Container Platform 4.10

How reproducible:

 - Always

Steps to Reproduce:
1. Install OpenShift Container Platform 4.10
2. oc exec kube-apiserver-operator-XXXXXXXXX-XXXXX -- curl -vv -k https://cluster-baremetal-webhook-service.openshift-machine-api.svc:443/validate-metal3-io-v1alpha1-provisioning
3. The call will fail as nothing is listening on port 9443 in `cluster-baremetal-operator`

Actual results:

`oc exec kube-apiserver-operator-XXXXXXXXX-XXXXX -- curl -vv -k https://cluster-baremetal-webhook-service.openshift-machine-api.svc:443/validate-metal3-io-v1alpha1-provisioning` fails with:

 > *   Trying 172.30.225.197...
 > * TCP_NODELAY set
 > * connect to 172.30.225.197 port 443 failed: Connection refused
 > * Failed to connect to cluster-baremetal-webhook-service.openshift-machine-api.svc port 443: Connection refused
 > * Closing connection 0
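
The "Connection refused" result above can be distinguished from other failures (timeouts, DNS errors) with a plain TCP probe. The following is a minimal sketch using only the Python standard library; the function name `probe` and its return strings are hypothetical, and it assumes it is run from somewhere that can resolve and reach the service address, such as a pod inside the cluster.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The address is reachable but nothing accepts connections on the port.
        return "refused"
    except OSError as exc:
        # Covers timeouts, name-resolution failures and unreachable networks.
        return f"error: {exc}"


# Example call (from inside the cluster):
# probe("cluster-baremetal-webhook-service.openshift-machine-api.svc", 443)
```

A result of `"refused"` corresponds to the `curl` output in this report. The probe only checks that the port accepts connections; it does not verify that the webhook path itself is served.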

Expected results:

The call should succeed when made correctly (the `curl` shown above). If it is not expected to work, the `ValidatingWebhookConfiguration` should not be configured at all, to avoid any impact on API performance and availability.

Additional info:

Comment 1 Shiftzilla 2023-03-09 01:22:08 UTC
OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira.

https://issues.redhat.com/browse/OCPBUGS-9334

