Bug 2100465

Summary: cluster-baremetal-validating-webhook-configuration is pointing to a service/path that does not exist or work
Product: OpenShift Container Platform Reporter: Simon Reber <sreber>
Component: Bare Metal Hardware Provisioning Assignee: Honza Pokorny <hpokorny>
Bare Metal Hardware Provisioning sub component: cluster-baremetal-operator QA Contact: Amit Ugol <augol>
Status: CLOSED DEFERRED Docs Contact:
Severity: medium    
Priority: medium CC: hpokorny, stbenjam
Version: 4.10 Keywords: Triaged
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-03-09 01:22:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Simon Reber 2022-06-23 12:46:40 UTC
Description of problem:

The `cluster-baremetal-validating-webhook-configuration` `ValidatingWebhookConfiguration` points to port 443 (9443 on the pod) and path `/validate-metal3-io-v1alpha1-provisioning`, neither of which appears to exist.

Even though port 9443 is configured in the `cluster-baremetal-operator` pod, nothing is listening there, and `/validate-metal3-io-v1alpha1-provisioning` does not appear to be available either.

 > $ oc get infrastructure  cluster -o json
 > {
 >     "apiVersion": "config.openshift.io/v1",
 >     "kind": "Infrastructure",
 >     "metadata": {
 >         "creationTimestamp": "2022-06-01T11:36:25Z",
 >         "generation": 1,
 >         "name": "cluster",
 >         "resourceVersion": "588",
 >         "uid": "a53781d1-1017-4c57-a96a-99151cbcf6b3"
 >     },
 >     "spec": {
 >         "cloudConfig": {
 >             "name": ""
 >         },
 >         "platformSpec": {
 >             "type": "None"
 >         }
 >     },
 >     "status": {
 >         "apiServerInternalURI": "https://api-int.example.lab.example.com:6443",
 >         "apiServerURL": "https://api.example.lab.example.com:6443",
 >         "controlPlaneTopology": "HighlyAvailable",
 >         "etcdDiscoveryDomain": "",
 >         "infrastructureName": "example-f9t2l",
 >         "infrastructureTopology": "HighlyAvailable",
 >         "platform": "None",
 >         "platformStatus": {
 >             "type": "None"
 >         }
 >     }
 > }

 > $ oc exec -c cluster-baremetal-operator cluster-baremetal-operator-847d7bddbc-dkd2m -- ss -tulpen
 > Netid State  Recv-Q Send-Q Local Address:Port Peer Address:PortProcess                                                          
 > tcp   LISTEN 0      0                  *:8080            *:*    users:(("cluster-baremet",pid=1,fd=7)) uid:65534 ino:194944 sk:0
 > tcp   LISTEN 0      0                  *:8443            *:*    uid:65534 ino:190901 sk:0                                       
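
The check above can be scripted. The following is a minimal sketch, assuming `ss` output in the column layout shown above is piped in on stdin; the function names `listening_tcp_ports` and `port_is_listening` are hypothetical helpers and not part of `oc` or `ss`.

```shell
# Print the port of every listening TCP socket found in `ss`-style output on stdin.
# Assumes the column layout shown above:
#   Netid State Recv-Q Send-Q Local-Address:Port Peer-Address:Port ...
listening_tcp_ports() {
  awk '$1 == "tcp" && $2 == "LISTEN" {
    n = split($5, parts, ":")   # the port is the last ":"-separated piece
    print parts[n]
  }'
}

# Exit 0 only if the given port (first argument) is among the listening ports.
port_is_listening() {
  listening_tcp_ports | grep -qx "$1"
}

# Example usage against the pod from this report:
#   oc exec -c cluster-baremetal-operator cluster-baremetal-operator-847d7bddbc-dkd2m -- ss -tulpen \
#     | port_is_listening 9443 || echo "nothing is listening on 9443"
```

With the output quoted above, the helper would report ports 8080 and 8443 only, so the check for 9443 fails.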

Specifically with https://github.com/openshift/cluster-kube-apiserver-operator/pull/1265 and https://github.com/openshift/cluster-kube-apiserver-operator/pull/1313 in place, this is causing unnecessary errors in `kube-apiserver-operator` logs and thus confusion.

Generally, admission plugins are expected to work and be available at all times, because admission plugins that are not working properly can impact API performance and stability.

Version-Release number of selected component (if applicable):

 - OpenShift Container Platform 4.10

How reproducible:

 - Always

Steps to Reproduce:
1. Install OpenShift Container Platform 4.10
2. oc exec kube-apiserver-operator-XXXXXXXXX-XXXXX -- curl -vv -k https://cluster-baremetal-webhook-service.openshift-machine-api.svc:443/validate-metal3-io-v1alpha1-provisioning
3. The call will fail as nothing is listening on port 9443 in `cluster-baremetal-operator`
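
The steps above can be sketched as two small helpers. This is only an illustration: `webhook_url` and `probe_webhook` are hypothetical names, and the sketch assumes the webhook service is reachable under the usual `<service>.<namespace>.svc` in-cluster DNS name, as in the URL from step 2.

```shell
# Build the in-cluster URL for a webhook from its service reference.
# Arguments: service name, namespace, port, path (path must start with "/").
webhook_url() {
  printf 'https://%s.%s.svc:%s%s\n' "$1" "$2" "$3" "$4"
}

# Probe a URL from inside a pod. Arguments: pod name, URL.
# curl exits with code 7 when it cannot connect (for example "Connection refused").
probe_webhook() {
  oc exec "$1" -- curl -sk -o /dev/null --max-time 5 "$2"
}

# Example usage (pod name left as a placeholder, as in the steps above):
#   url="$(webhook_url cluster-baremetal-webhook-service openshift-machine-api 443 /validate-metal3-io-v1alpha1-provisioning)"
#   probe_webhook kube-apiserver-operator-XXXXXXXXX-XXXXX "$url" || echo "probe failed with exit code $?"
```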

Actual results:

`oc exec kube-apiserver-operator-XXXXXXXXX-XXXXX -- curl -vv -k https://cluster-baremetal-webhook-service.openshift-machine-api.svc:443/validate-metal3-io-v1alpha1-provisioning` fails with:

 > *   Trying 172.30.225.197...
 > * TCP_NODELAY set
 > * connect to 172.30.225.197 port 443 failed: Connection refused
 > * Failed to connect to cluster-baremetal-webhook-service.openshift-machine-api.svc port 443: Connection refused
 > * Closing connection 0

Expected results:

The call (the `curl` shown above) is expected to succeed. If it is not expected to work, the `ValidatingWebhookConfiguration` should not be configured at all, to prevent impact on API performance and availability.

Additional info:

Comment 1 Shiftzilla 2023-03-09 01:22:08 UTC
OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira.

https://issues.redhat.com/browse/OCPBUGS-9334