Bug 2075621

Summary: Cluster upgrade.[sig-mco] Machine config pools complete upgrade
Product: OpenShift Container Platform Reporter: Ken Zhang <kenzhang>
Component: kube-controller-manager Assignee: Maciej Szulik <maszulik>
Status: CLOSED ERRATA QA Contact: zhou ying <yinzhou>
Severity: high Docs Contact:
Priority: high    
Version: 4.10 CC: alchan, aos-bugs, lmohanty, maszulik, mfojtik, pfruth, sippy, wking
Target Milestone: --- Keywords: Upgrades
Target Release: 4.11.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The new mechanism responsible for tracking a job's pods uses finalizers. Consequence: In rare cases, some pods are not removed because their finalizers are never removed. Fix: Disable the beta JobTrackingWithFinalizers feature, which was enabled by default. Result: No pods should be left behind.
Story Points: ---
Clone Of:
: 2075831 (view as bug list) Environment:
Last Closed: 2022-08-10 11:07:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2075831    

Description Ken Zhang 2022-04-14 17:40:44 UTC
Cluster upgrade.[sig-mco] Machine config pools complete upgrade

is failing frequently in CI, see:
https://sippy.ci.openshift.org/sippy-ng/tests/4.11/analysis?test=Cluster%20upgrade.%5Bsig-mco%5D%20Machine%20config%20pools%20complete%20upgrade


Specific case: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release[…]e-from-stable-4.10-e2e-aws-ovn-upgrade/1514548265960345600

The pod collect-profiles-27498960-h5qfg in namespace openshift-operator-lifecycle-manager does not get deleted. It seems that the job-tracking finalizer is holding the pod.

upstream kube is pulling the feature due to bug: https://github.com/kubernetes/kubernetes/pull/109487

upstream bug: https://github.com/kubernetes/kubernetes/issues/109485

Looks like this is a regression from 4.9 to 4.10

See slack discussion for more details: https://coreos.slack.com/archives/C01CQA76KMX/p1649954805399049

Comment 1 W. Trevor King 2022-04-14 17:44:29 UTC
Machine pools not completing updates can keep the overall OCP-core update from completing, so adding UpgradeBlocker to get this into the assessment pipeline [1].  We're asking the following questions to evaluate whether or not this bug warrants blocking an upgrade edge from either the previous X.Y or X.Y.Z. The ultimate goal is to avoid delivering an update which introduces new risk or reduces cluster functionality in any way. Sample answers are provided to give more context and the ImpactStatementRequested label has been added to this bug. When responding, please remove ImpactStatementRequested and set the ImpactStatementProposed label. The expectation is that the assignee answers these questions.

Who is impacted? If we have to block upgrade edges based on this issue, which edges would need blocking?
* example: Customers upgrading from 4.y.Z to 4.y+1.z running on GCP with thousands of namespaces, approximately 5% of the subscribed fleet
* example: All customers upgrading from 4.y.z to 4.y+1.z fail approximately 10% of the time

What is the impact? Is it serious enough to warrant blocking edges?
* example: Up to 2 minute disruption in edge routing
* example: Up to 90 seconds of API downtime
* example: etcd loses quorum and you have to restore from backup

How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
* example: Issue resolves itself after five minutes
* example: Admin uses oc to fix things
* example: Admin must SSH to hosts, restore from backups, or other non standard admin activities

Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?
* example: No, it has always been like this we just never noticed
* example: Yes, from 4.y.z to 4.y+1.z Or 4.y.z to 4.y.z+1

[1]: https://github.com/openshift/enhancements/blob/master/enhancements/update/update-blocker-lifecycle/README.md

Comment 2 Maciej Szulik 2022-04-21 11:56:51 UTC
(In reply to W. Trevor King from comment #1)
> Machine pools not completing updates can keep the overall OCP-core update
> from completing, so adding UpgradeBlocker to get this into the assessment
> pipeline [1].  We're asking the following questions to evaluate whether or
> not this bug warrants blocking an upgrade edge from either the previous X.Y
> or X.Y.Z. The ultimate goal is to avoid delivering an update which
> introduces new risk or reduces cluster functionality in any way. Sample
> answers are provided to give more context and the ImpactStatementRequested
> label has been added to this bug. When responding, please remove
> ImpactStatementRequested and set the ImpactStatementProposed label. The
> expectation is that the assignee answers these questions.
> 
> Who is impacted? If we have to block upgrade edges based on this issue,
> which edges would need blocking?

Customers running all versions of 4.9 and 4.10. Although serious I don't think
it should require blocking edges. 


> What is the impact? Is it serious enough to warrant blocking edges?
> * example: Up to 2 minute disruption in edge routing
> * example: Up to 90 seconds of API downtime
> * example: etcd loses quorum and you have to restore from backup

In rare cases, job's pods might be left behind.


> How involved is remediation (even moderately serious impacts might be
> acceptable if they are easy to mitigate)?

Remove the beta functionality responsible for tracking jobs with finalizers.


> Is this a regression (if all previous versions were also vulnerable,
> updating to the new, vulnerable version does not increase exposure)?

Yes, it's a regression present already in 4.9, thus we're also backporting the fix there.

Comment 3 Maciej Szulik 2022-04-21 13:45:37 UTC
(In reply to Maciej Szulik from comment #2)

> Customers running all versions of 4.9 and 4.10. Although serious I don't
> think
> it should require blocking edges. 

This is wrong, this only applies to 4.10.

Comment 5 W. Trevor King 2022-04-21 21:55:29 UTC
Moving Version back to 4.10, so the fact that this impacts 4.10 (comment 3) is reflected in the bug's metadata.

Comment 6 W. Trevor King 2022-04-21 22:16:55 UTC
(In reply to Maciej Szulik from comment #2)
> (In reply to W. Trevor King from comment #1)
> > What is the impact? Is it serious enough to warrant blocking edges?
> > ...
> 
> In rare cases, job's pods might be left behind.

What's the connection to update completion?  Is it:

1. Race trips, pod left behind with a finalizer on it.
2. Machine-config tries to drain a node during update.
3. Drain sticks, waiting on the finalizer, but with the Job controller having forgotten about the job, nobody ever removes the finalizer.
4. Machine-config operator fails to update, because the node is not draining.
5. Cluster-version operator fails to update, because the machine-config operator is not updating.

If that's right, is there a recommended detection/recovery procedure?  Removing JobTrackingWithFinalizers helps folks who haven't been bit yet.  If you have a leaked pod, and you ask to update to 4.10.fixed, will the incoming job controller notice the leaked pod and remove it?  Or do admins have to take manual steps to remove finalizers?  Particular finalizers to look for?  Steps to remove them?
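
On the detection question, a minimal sketch (not an official procedure) of what a check could look like. It assumes the input is the JSON that `oc get pods --all-namespaces -o json` would return, and it assumes the finalizer to look for is `batch.kubernetes.io/job-tracking`, the name that shows up on the stuck pod dumped later in this bug. The function name and the sample data are hypothetical.

```python
import json

# Finalizer name as it appears on the stuck pod dumped later in this bug.
JOB_TRACKING_FINALIZER = "batch.kubernetes.io/job-tracking"


def find_stuck_job_pods(pod_list):
    """Return (namespace, name) for pods that are marked for deletion
    but still carry the job-tracking finalizer."""
    stuck = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        finalizers = meta.get("finalizers") or []
        if meta.get("deletionTimestamp") and JOB_TRACKING_FINALIZER in finalizers:
            stuck.append((meta.get("namespace"), meta.get("name")))
    return stuck


# Hypothetical, trimmed stand-in for the output of
# `oc get pods --all-namespaces -o json`.
SAMPLE = json.loads("""
{"items": [
  {"metadata": {"namespace": "ns-a", "name": "leaked-pod",
                "deletionTimestamp": "2022-04-20T16:52:26Z",
                "finalizers": ["batch.kubernetes.io/job-tracking"]}},
  {"metadata": {"namespace": "ns-a", "name": "running-job-pod",
                "finalizers": ["batch.kubernetes.io/job-tracking"]}},
  {"metadata": {"namespace": "ns-b", "name": "plain-pod"}}
]}
""")

for namespace, name in find_stuck_job_pods(SAMPLE):
    print(f"{namespace}/{name}")
```

A pod that is only briefly terminating would match as well, so a hit is a candidate to inspect rather than proof of a leak.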

Comment 8 Maciej Szulik 2022-04-22 13:13:48 UTC
(In reply to W. Trevor King from comment #6)
> 
> What's the connection to update completion?  Is it:
> 
> 1. Race trips, pod left behind with a finalizer on it.
> 2. Machine-config tries to drain a node during update.
> 3. Drain sticks, waiting on the finalizer, but with the Job controller
> having forgotten about the job, nobody ever removes the finalizer.
> 4. Machine-config operator fails to update, because the node is not draining.
> 5. Cluster-version operator fails to update, because the machine-config
> operator is not updating.
> 
> If that's right, is there a recommended detection/recovery procedure? 
> Removing JobTrackingWithFinalizers helps folks who haven't been bit yet.  If
> you have a leaked pod, and you ask to update to 4.10.fixed, will the
> incoming job controller notice the leaked pod and remove it?  Or do admins
> have to take manual steps to remove finalizers?  Particular finalizers to
> look for?  Steps to remove them?

The current solution (disabling the feature) won't solve the problem, so the current approach is 
to manually remove both the pod and the job in question. Yes, it has to be performed
by cluster admin. The right approach is being discussed in https://github.com/kubernetes/kubernetes/pull/109486

Comment 12 Lalatendu Mohanty 2022-04-25 14:54:14 UTC
Removing the upgrade blocker: we are not planning to remove the 4.9 to 4.10 edges because of this bug, as we believe this race condition is not impacting many clusters.

Comment 13 pfruth 2022-05-11 23:51:13 UTC
Hello,
I have a client who has a cluster that is being directly impacted by this bug.
As mentioned in comment #6

1. Race trips, pod left behind with a finalizer on it.
2. Machine-config tries to drain a node during update.
3. Drain sticks, waiting on the finalizer, but with the Job controller having forgotten about the job, nobody ever removes the finalizer.
4. Machine-config operator fails to update, because the node is not draining.
5. Cluster-version operator fails to update, because the machine-config operator is not updating.

Now the cluster is stuck in a condition where an upgrade is stalled because of this pod that cannot be deleted (due to the finalizer).
We are unable to delete the pod.
oc delete pod <pod>
Results in an indefinite hang.

We are unable to patch the pod to remove the metadata.finalizers.
oc -n ibm-common-services patch pod <pod> -p '{"metadata":{"finalizers":null}}'
Results in messages;
"Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations) or `spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 22 identical fields\n  \tPriority:         \u00260,\n  \tPreemptionPolicy: \u0026\"PreemptLowerPriority\",\n- \tDNSConfig:        \u0026core.PodDNSConfig{Options: []core.PodDNSConfigOption{{Name: \"single-request-reopen\"}}},\n+ \tDNSConfig:        nil,\n  \tReadinessGates:   nil,\n  \tRuntimeClassName: nil,\n  \t... // 4 identical fields\n  }\n",


How can we proceed with cleanup of this pod, so that cluster upgrade may proceed?

Comment 14 W. Trevor King 2022-05-12 00:08:23 UTC
> We are unable to patch the pod to remove the metadata.finalizers...

[1] has:

  kubectl patch configmap/mymap \
    --type json \
    --patch='[ { "op": "remove", "path": "/metadata/finalizers" } ]'

For your example, that would be:

  $ oc -n ibm-common-services patch pod <pod> --type json -p '[{"op": "remove", "path": "/metadata/finalizers"}]'

Does that fail the same way for pods?  If so, hopefully the workloads folks have some ideas.  Otherwise, controllers that manage these finalizers must have figured out an incantation that works for pods...

[1]: https://kubernetes.io/blog/2021/05/14/using-finalizers-to-control-deletion/#understanding-finalizers
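
To make the intent of that JSON Patch concrete, here is a rough sketch of what a single "remove" operation does to an object. It is a simplified illustration that only walks nested dicts (no array indexes), not the API server's implementation; `apply_remove` and the sample pod are made up for the example.

```python
import copy


def apply_remove(document, path):
    """Apply one JSON Patch 'remove' operation to a copy of document.

    Only nested dicts are handled; array indexes are out of scope here.
    """
    # Split the JSON Pointer and undo its escaping (~1 is "/", ~0 is "~").
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    result = copy.deepcopy(document)
    parent = result
    for token in tokens[:-1]:
        parent = parent[token]
    # A missing target raises KeyError; JSON Patch also treats that as an error.
    del parent[tokens[-1]]
    return result


# Hypothetical, heavily trimmed pod object.
pod = {
    "metadata": {
        "name": "example-pod",
        "finalizers": ["batch.kubernetes.io/job-tracking"],
    },
    "spec": {"dnsPolicy": "ClusterFirst"},
}

patched = apply_remove(pod, "/metadata/finalizers")
print(patched["metadata"])
print(patched["spec"] == pod["spec"])
```

The point of the sketch is that the operation only names `/metadata/finalizers`, so on its own it leaves `spec` untouched.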

Comment 15 Maciej Szulik 2022-05-12 12:07:26 UTC
Looks like your patch operation is also trying to modify something else, can I see the output of the patch operation with -v=9?

Ideally it should contain a line like this:

I0512 14:04:43.872199   36505 request.go:1060] Request Body: {"metadata":{"finalizers":null}}

which only clears the finalizers and thus should not trigger spec validation. In your case the error clearly shows spec being validated, so something in the request must have touched spec and triggered the validation that is failing.
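
If the -v=9 output is long, a small script can pull out the request bodies and check them. This is only a sketch: it assumes each body sits on one line after the text "Request Body: " (as in the sample line above) and is complete JSON, and the helper names are made up.

```python
import json
import re

REQUEST_BODY = re.compile(r"Request Body: (.*)$")


def request_bodies(log_text):
    """Yield the decoded JSON of every 'Request Body:' line in the log text."""
    for line in log_text.splitlines():
        match = REQUEST_BODY.search(line)
        if match:
            yield json.loads(match.group(1))


def touches_only_finalizers(body):
    """True if a merge-patch dict or a JSON Patch list changes nothing
    except metadata.finalizers."""
    if isinstance(body, list):  # JSON Patch: a list of operations
        return all(
            op.get("path") == "/metadata/finalizers"
            or op.get("path", "").startswith("/metadata/finalizers/")
            for op in body
        )
    return set(body) == {"metadata"} and set(body["metadata"]) == {"finalizers"}


# The sample log line quoted in this comment.
LOG = 'I0512 14:04:43.872199   36505 request.go:1060] Request Body: {"metadata":{"finalizers":null}}'

for body in request_bodies(LOG):
    print(body, touches_only_finalizers(body))
```

A body that passes this check and still gets the spec validation error would suggest the change to spec is introduced somewhere after the client sends the request.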

Comment 16 pfruth 2022-05-12 18:49:52 UTC
Thank you very much for your responses and guidance.
The link to the kubernetes.io docs (understanding finalizers) was very informative.

Rather than using oc cli, I've taken the approach of talking to the API server directly using curl (which is what oc cli ultimately does under the covers anyway).

-------------------------------------
#!/bin/bash
API_ENDPOINT=$(oc whoami --show-server)
API_TOKEN=$(oc whoami --show-token)
NAMESPACE="ibm-common-services"
POD="setup-job-q6l59"

curl -k -v -XPATCH ${API_ENDPOINT}/api/v1/namespaces/${NAMESPACE}/pods/${POD} \
     -H "Authorization: Bearer ${API_TOKEN}" \
     -H "Content-Type: application/json-patch+json" \
     -d '[{"op": "remove", "path": "/metadata/finalizers"}]'
-------------------------------------
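
For anyone who prefers a script over curl, the same request can be put together with the Python standard library. This is a sketch that only builds the request and does not send it; the endpoint and token values are placeholders, and `build_finalizer_patch` is a made-up helper name.

```python
import json
import urllib.request


def build_finalizer_patch(api_endpoint, token, namespace, pod):
    """Build (but do not send) the PATCH request that the curl command issues."""
    url = f"{api_endpoint}/api/v1/namespaces/{namespace}/pods/{pod}"
    body = json.dumps([{"op": "remove", "path": "/metadata/finalizers"}]).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json-patch+json",
        },
    )


# Placeholder endpoint and token; the real values come from
# `oc whoami --show-server` and `oc whoami --show-token`.
req = build_finalizer_patch(
    "https://api.example.invalid:6443",
    "PLACEHOLDER_TOKEN",
    "ibm-common-services",
    "setup-job-q6l59",
)
print(req.get_method(), req.full_url)
print(req.data.decode())
```

Sending it would go through `urllib.request.urlopen(req, context=...)` with an `ssl` context that trusts the cluster CA (the curl call sidesteps that with -k).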

Following is the resulting output returned from the curl;

*   Trying 134.187.101.9...
* TCP_NODELAY set
* Connected to <redacted> (<ip address redacted>) port 6443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, [no content] (0):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, [no content] (0):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=<redacted>
*  start date: Apr 30 03:21:53 2022 GMT
*  expire date: May 30 03:21:54 2022 GMT
*  issuer: OU=openshift; CN=kube-apiserver-lb-signer
*  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* Using Stream ID: 1 (easy handle 0x2aa765cc410)
* TLSv1.3 (OUT), TLS app data, [no content] (0):
> PATCH /api/v1/namespaces/ibm-common-services/pods/setup-job-q6l59 HTTP/2
> Host: <redacted>:6443
> User-Agent: curl/7.61.1
> Accept: */*
> Authorization: Bearer <redacted>
> Content-Type: application/json-patch+json
> Content-Length: 50
>
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* We are completely uploaded and fine
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* Connection state changed (MAX_CONCURRENT_STREAMS == 2000)!
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
< HTTP/2 422
< audit-id: 14ecf10f-bd5b-4a48-921d-0b948a3f14f6
< cache-control: no-cache, private
< content-type: application/json
< x-kubernetes-pf-flowschema-uid: e2b4c47e-7890-4446-84ca-67c0d19e992f
< x-kubernetes-pf-prioritylevel-uid: e2f9ed23-7e42-479b-a739-051d9a920ee2
< content-length: 1770
< date: Thu, 12 May 2022 16:46:30 GMT
<
* TLSv1.3 (IN), TLS app data, [no content] (0):
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "Pod \"setup-job-q6l59\" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations) or `spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 22 identical fields\n  \tPriority:         \u00260,\n  \tPreemptionPolicy: \u0026\"PreemptLowerPriority\",\n- \tDNSConfig:        \u0026core.PodDNSConfig{Options: []core.PodDNSConfigOption{{Name: \"single-request-reopen\"}}},\n+ \tDNSConfig:        nil,\n  \tReadinessGates:   nil,\n  \tRuntimeClassName: nil,\n  \t... // 4 identical fields\n  }\n",
  "reason": "Invalid",
  "details": {
    "name": "setup-job-q6l59",
    "kind": "Pod",
    "causes": [
      {
        "reason": "FieldValueForbidden",
        "message": "Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations) or `spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 22 identical fields\n  \tPriority:         \u00260,\n  \tPreemptionPolicy: \u0026\"PreemptLowerPriority\",\n- \tDNSConfig:        \u0026core.PodDNSConfig{Options: []core.PodDNSConfigOption{{Name: \"single-request-reopen\"}}},\n+ \tDNSConfig:        nil,\n  \tReadinessGates:   nil,\n  \tRuntimeClassName: nil,\n  \t... // 4 identical fields\n  }\n",
        "field": "spec"
      }
    ]
  },
  "code": 422
* Connection #0 to host <redacted> left intact

-------------------------------------
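
The interesting part of that 422 response is buried in the escaped message. A small sketch for pulling out just the "-"/"+" diff lines from a decoded Status object follows; the sample below is an abbreviated, hand-trimmed copy of the response above (the "..." elisions are mine), and `diff_lines` is a made-up helper name.

```python
def diff_lines(status):
    """Return the '-' / '+' diff lines found in each cause message,
    with runs of whitespace collapsed."""
    lines = []
    for cause in status.get("details", {}).get("causes", []):
        for line in cause.get("message", "").splitlines():
            if line.startswith(("- ", "+ ")):
                lines.append(" ".join(line.split()))
    return lines


# Abbreviated copy of the Status body returned by the API server above.
SAMPLE_STATUS = {
    "kind": "Status",
    "code": 422,
    "details": {
        "causes": [
            {
                "reason": "FieldValueForbidden",
                "field": "spec",
                "message": (
                    "Forbidden: pod updates may not change fields other than ...\n"
                    "  core.PodSpec{\n"
                    "  \t... // 22 identical fields\n"
                    "- \tDNSConfig:        &core.PodDNSConfig{Options: ...},\n"
                    "+ \tDNSConfig:        nil,\n"
                    "  }\n"
                ),
            }
        ]
    },
}

for line in diff_lines(SAMPLE_STATUS):
    print(line)
```

Read this way, the only difference the API server reports is in DNSConfig, even though the patch itself only names /metadata/finalizers.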

Below is the output of 'oc get pod setup-job-q6l59 -o yaml'

apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "openshift-sdn",
          "interface": "eth0",
          "ips": [
              "10.129.2.81"
          ],
          "default": true,
          "dns": {}
      }]
    k8s.v1.cni.cncf.io/networks-status: |-
      [{
          "name": "openshift-sdn",
          "interface": "eth0",
          "ips": [
              "10.129.2.81"
          ],
          "default": true,
          "dns": {}
      }]
    openshift.io/scc: restricted
    productID: 068a62892a1e4db39641342e592daa25
    productMetric: FREE
    productName: IBM Cloud Platform Common Services
    productVersion: 4.0.0
  creationTimestamp: "2022-04-20T16:36:28Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2022-04-20T16:52:26Z"
  finalizers:
  - batch.kubernetes.io/job-tracking
  generateName: setup-job-
  labels:
    app.kubernetes.io/instance: ibm-zen-setup-job
    app.kubernetes.io/managed-by: ibm-zen-operator
    app.kubernetes.io/name: ibm-zen-setup-job
    controller-uid: 75fe3f41-a2fd-4124-9971-242668e3ebad
    job-name: setup-job
  name: setup-job-q6l59
  namespace: ibm-common-services
  ownerReferences:
  - apiVersion: batch/v1
    blockOwnerDeletion: true
    controller: true
    kind: Job
    name: setup-job
    uid: 75fe3f41-a2fd-4124-9971-242668e3ebad
  resourceVersion: "9267326"
  uid: d5929b91-dbe7-423b-9b28-21bcc9378b8e
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: beta.kubernetes.io/arch
            operator: In
            values:
            - amd64
            - s390x
            - ppc64le
        weight: 3
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: beta.kubernetes.io/arch
            operator: In
            values:
            - amd64
            - s390x
            - ppc64le
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app.kubernetes.io/instance
              operator: In
              values:
              - setup-job
          topologyKey: kubernetes.io/hostname
        weight: 100
  containers:
  - command:
    - /bin/bash
    - -c
    - |
      /coreapi-server generate-cert meta-tls
      /coreapi-server generate-cert meta-jwt
      /coreapi-server create-secret -g token meta-api-broker-secret
    env:
    - name: ENABLE_JWT_CHECK
      value: "false"
    - name: ICPD_CONTROLPLANE_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    image: quay.io/opencloudio/ibm-zen-operator@sha256:ad20054931ceb58f5e01285e8ea0e7038813cf9f25e69a9f22a76742237c3950
    imagePullPolicy: IfNotPresent
    name: create-secrets-job
    resources:
      limits:
        cpu: 500m
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 64Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
        - KILL
        - MKNOD
        - SETGID
        - SETUID
      privileged: false
      readOnlyRootFilesystem: false
      runAsNonRoot: true
      runAsUser: 1000680000
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-b4h59
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: ibm-zen-operator-serviceaccount-dockercfg-vgkpc
  nodeName: <redacted>
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: OnFailure
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 1000680000
    runAsNonRoot: true
    seLinuxOptions:
      level: s0:c26,c15
  serviceAccount: ibm-zen-operator-serviceaccount
  serviceAccountName: ibm-zen-operator-serviceaccount
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  volumes:
  - name: kube-api-access-b4h59
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
      - configMap:
          items:
          - key: service-ca.crt
            path: service-ca.crt
          name: openshift-service-ca.crt
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-04-20T16:36:28Z"
    reason: PodCompleted
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-04-20T16:36:36Z"
    reason: PodCompleted
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-04-20T16:36:36Z"
    reason: PodCompleted
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-04-20T16:36:28Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://e47c22a1b07378db0978014d1258f1ca2698b923da94137d1f5a0341bd142b64
    image: quay.io/opencloudio/ibm-zen-operator@sha256:ad20054931ceb58f5e01285e8ea0e7038813cf9f25e69a9f22a76742237c3950
    imageID: quay.io/opencloudio/ibm-zen-operator@sha256:4f15bcda5f209dbcefbababb2af19fefd549c96d61b410bb88d655c55c3c3a8f
    lastState: {}
    name: create-secrets-job
    ready: false
    restartCount: 0
    started: false
    state:
      terminated:
        containerID: cri-o://e47c22a1b07378db0978014d1258f1ca2698b923da94137d1f5a0341bd142b64
        exitCode: 0
        finishedAt: "2022-04-20T16:36:35Z"
        reason: Completed
        startedAt: "2022-04-20T16:36:33Z"
  hostIP: 134.187.101.13
  phase: Succeeded
  podIP: 10.129.2.81
  podIPs:
  - ip: 10.129.2.81
  qosClass: Burstable
  startTime: "2022-04-20T16:36:28Z"

-------------------------------------

Other key points worth noting, related to the Pod
- As you can see the pod resource's metadata contains a deletionTimestamp.  This is because of a prior attempt to do 'oc delete pod setup-job-q6l59 --force'
- As you can see, the pod resource's metadata contains an ownerReferences entry.  The owning resource is a Job (name=setup-job).  The setup-job Job no longer exists.  Not sure who/what deleted the Job.  But 'oc get jobs -n ibm-common-services' returns no jobs.
- We've also tried to reboot the node that this pod was once running on (to no effect).
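
The second point (a pod whose owning Job is gone) can be checked mechanically. A rough sketch, assuming the two inputs are the JSON lists that `oc get pods -o json` and `oc get jobs -o json` would return for the namespace; `orphaned_job_pods` is a made-up name and the sample is trimmed from the pod above.

```python
def orphaned_job_pods(pods, jobs):
    """Return (namespace, pod name, job name) for pods whose controlling
    ownerReference is a Job whose UID is not among the existing jobs."""
    live_uids = {job["metadata"]["uid"] for job in jobs.get("items", [])}
    orphans = []
    for pod in pods.get("items", []):
        meta = pod["metadata"]
        for ref in meta.get("ownerReferences", []):
            if (
                ref.get("kind") == "Job"
                and ref.get("controller")
                and ref["uid"] not in live_uids
            ):
                orphans.append((meta["namespace"], meta["name"], ref["name"]))
    return orphans


# Trimmed from the pod shown above; the job list is empty, matching
# what 'oc get jobs -n ibm-common-services' reported.
PODS = {
    "items": [
        {
            "metadata": {
                "namespace": "ibm-common-services",
                "name": "setup-job-q6l59",
                "ownerReferences": [
                    {
                        "apiVersion": "batch/v1",
                        "kind": "Job",
                        "controller": True,
                        "name": "setup-job",
                        "uid": "75fe3f41-a2fd-4124-9971-242668e3ebad",
                    }
                ],
            }
        }
    ]
}
JOBS = {"items": []}

for namespace, pod_name, job_name in orphaned_job_pods(PODS, JOBS):
    print(f"{namespace}/{pod_name} is owned by missing Job {job_name}")
```

Matching on UID rather than name avoids treating a recreated Job with the same name as the original owner.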


So, we seem to be in a catch-22 here.  We can't delete the pod, due to the finalizer.  A previous attempt to delete the pod has put the pod resource in a read-only mode, so to speak.  We can't remove the finalizer for some reason.  And we can't proceed with the cluster upgrade, due to this pod stalling the machine-config operator.

How do we get out of this conundrum?
Is there some [other] way to remove this pod resource?

Thank you for your help.

Comment 17 pfruth 2022-05-12 21:14:46 UTC
In addition to the information I posted above, in comment #16, I asked the client to use the oc cli to patch the Pod resource.

This command;
oc -v9 -n ibm-common-services patch pod setup-job-q6l59 --type json -p '[{"op": "remove", "path": "/metadata/finalizers"}]'


The entire output was more than could be captured in the scroll buffer.
But here are the contents that we were able to capture.

-------------------------------------
I0512 13:03:52.472563  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"certmanager.k8s.io/v1alpha1","resources":[{"name":"challenges","singularName":"challenge","namespaced":true,"kind":"Challenge","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"bWZ1nMKvEBk="},{"name":"certificaterequests","singularName":"certificaterequest","namespaced":true,"kind":"CertificateRequest","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"VVJL9VxvfUo="},{"name":"certificates","singularName":"certificate","namespaced":true,"kind":"Certificate","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["cert","certs"],"storageVersionHash":"x3gJkHOGQto="},{"name":"issuers","singularName":"issuer","namespaced":true,"kind":"Issuer","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"yQaBqq73YD0="},{"name":"clusterissuers","singularName":"clusterissuer","namespaced":false,"kind":"ClusterIssuer","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"reakstCytMM="},{"name":"orders","singularName":"order","namespaced":true,"kind":"Order","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"iAVaDBTPDXA="}]}
I0512 13:03:52.473415  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"objectbucket.io/v1alpha1","resources":[{"name":"objectbuckets","singularName":"objectbucket","namespaced":false,"kind":"ObjectBucket","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["ob","obs"],"storageVersionHash":"NaQ8ZctOcJc="},{"name":"objectbuckets/status","singularName":"","namespaced":false,"kind":"ObjectBucket","verbs":["get","patch","update"]},{"name":"objectbucketclaims","singularName":"objectbucketclaim","namespaced":true,"kind":"ObjectBucketClaim","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["obc","obcs"],"storageVersionHash":"dYJKtqGT4jk="},{"name":"objectbucketclaims/status","singularName":"","namespaced":true,"kind":"ObjectBucketClaim","verbs":["get","patch","update"]}]}
I0512 13:03:52.474458  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"noobaa.io/v1alpha1","resources":[{"name":"backingstores","singularName":"backingstore","namespaced":true,"kind":"BackingStore","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"wkRzwTPgcTk="},{"name":"backingstores/status","singularName":"","namespaced":true,"kind":"BackingStore","verbs":["get","patch","update"]},{"name":"noobaas","singularName":"noobaa","namespaced":true,"kind":"NooBaa","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["nb"],"storageVersionHash":"GgyxOn0bqB8="},{"name":"noobaas/status","singularName":"","namespaced":true,"kind":"NooBaa","verbs":["get","patch","update"]},{"name":"noobaaaccounts","singularName":"noobaaaccount","namespaced":true,"kind":"NooBaaAccount","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"OQCeR5qcmgM="},{"name":"noobaaaccounts/status","singularName":"","namespaced":true,"kind":"NooBaaAccount","verbs":["get","patch","update"]},{"name":"namespacestores","singularName":"namespacestore","namespaced":true,"kind":"NamespaceStore","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"pgjlX29VN2E="},{"name":"namespacestores/status","singularName":"","namespaced":true,"kind":"NamespaceStore","verbs":["get","patch","update"]},{"name":"bucketclasses","singularName":"bucketclass","namespaced":true,"kind":"BucketClass","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"TiW/391OMWo="},{"name":"bucketclasses/status","singularName":"","namespaced":true,"kind":"BucketClass","verbs":["get","patch","update"]}]}
I0512 13:03:52.475230  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"apps.openshift.io/v1","resources":[{"name":"deploymentconfigs","singularName":"","namespaced":true,"kind":"DeploymentConfig","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["dc"],"categories":["all"],"storageVersionHash":"6xyoXGsxfiA="},{"name":"deploymentconfigs/instantiate","singularName":"","namespaced":true,"kind":"DeploymentRequest","verbs":["create"]},{"name":"deploymentconfigs/log","singularName":"","namespaced":true,"kind":"DeploymentLog","verbs":["get"]},{"name":"deploymentconfigs/rollback","singularName":"","namespaced":true,"kind":"DeploymentConfigRollback","verbs":["create"]},{"name":"deploymentconfigs/scale","singularName":"","namespaced":true,"group":"extensions","version":"v1beta1","kind":"Scale","verbs":["get","patch","update"]},{"name":"deploymentconfigs/status","singularName":"","namespaced":true,"kind":"DeploymentConfig","verbs":["get","patch","update"]}]}
I0512 13:03:52.475923  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"oauth.openshift.io/v1","resources":[{"name":"oauthaccesstokens","singularName":"","namespaced":false,"kind":"OAuthAccessToken","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"Hss+eIbWaq4="},{"name":"oauthauthorizetokens","singularName":"","namespaced":false,"kind":"OAuthAuthorizeToken","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"boIUgj6jo8I="},{"name":"oauthclientauthorizations","singularName":"","namespaced":false,"kind":"OAuthClientAuthorization","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"Q0jsEj2AFkk="},{"name":"oauthclients","singularName":"","namespaced":false,"kind":"OAuthClient","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"8NZGiGeCmFo="},{"name":"tokenreviews","singularName":"","namespaced":false,"group":"authentication.k8s.io","version":"v1","kind":"TokenReview","verbs":["create"]},{"name":"useroauthaccesstokens","singularName":"","namespaced":false,"kind":"UserOAuthAccessToken","verbs":["delete","get","list","watch"]}]}
I0512 13:03:52.479052  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"core.automation.ibm.com/v1beta1","resources":[{"name":"cartridges","singularName":"cartridge","namespaced":true,"kind":"Cartridge","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"arl8QZRuF2A="},{"name":"cartridges/status","singularName":"","namespaced":true,"kind":"Cartridge","verbs":["get","patch","update"]},{"name":"automationuiconfigs","singularName":"automationuiconfig","namespaced":true,"kind":"AutomationUIConfig","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"GLBR8jRpzSU="},{"name":"automationuiconfigs/status","singularName":"","namespaced":true,"kind":"AutomationUIConfig","verbs":["get","patch","update"]}]}
I0512 13:03:52.479814  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"appconnect.ibm.com/v1beta1","resources":[{"name":"traces","singularName":"trace","namespaced":true,"kind":"Trace","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"categories":["all","ace","cp4i","integration"],"storageVersionHash":"NbsQ04SfuVU="},{"name":"traces/status","singularName":"","namespaced":true,"kind":"Trace","verbs":["get","patch","update"]},{"name":"integrationservers","singularName":"integrationserver","namespaced":true,"kind":"IntegrationServer","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"categories":["all","ace","cp4i","integration"],"storageVersionHash":"UaMtSxHwIaw="},{"name":"integrationservers/status","singularName":"","namespaced":true,"kind":"IntegrationServer","verbs":["get","patch","update"]},{"name":"integrationservers/scale","singularName":"","namespaced":true,"group":"autoscaling","version":"v1","kind":"Scale","verbs":["get","patch","update"]},{"name":"switchservers","singularName":"switchserver","namespaced":true,"kind":"SwitchServer","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"categories":["all","ace","cp4i","integration"],"storageVersionHash":"G0E1AEp9b3Y="},{"name":"switchservers/status","singularName":"","namespaced":true,"kind":"SwitchServer","verbs":["get","patch","update"]},{"name":"configurations","singularName":"configuration","namespaced":true,"kind":"Configuration","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["isconfig","isconfigs","aceconfig","aceconfigs"],"categories":["all","ace","cp4i"],"storageVersionHash":"CtDlk4IRSm8="},{"name":"configurations/status","singularName":"","namespaced":true,"kind":"Configuration","verbs":["get","patch","update"]},{"name":"dashboards","singularName":"dashboard","namespaced":true,"kind":"Dashboard","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"categories":["all","ace","cp4i","integration"],"storageVersionHash":"ny8z/EoLRMs="},{"name":"dashboards/status","singularName":"","namespaced":true,"kind":"Dashboard","verbs":["get","patch","update"]},{"name":"dashboards/scale","singularName":"","namespaced":true,"group":"autoscaling","version":"v1","kind":"Scale","verbs":["get","patch","update"]},{"name":"designerauthorings","singularName":"designerauthoring","namespaced":true,"kind":"DesignerAuthoring","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"categories":["all","ace","cp4i","integration"],"storageVersionHash":"XsNk5+9/3yg="},{"name":"designerauthorings/status","singularName":"","namespaced":true,"kind":"DesignerAuthoring","verbs":["get","patch","update"]}]}
I0512 13:03:52.480720  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"route.openshift.io/v1","resources":[{"name":"routes","singularName":"","namespaced":true,"kind":"Route","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"categories":["all"],"storageVersionHash":"jo1F7GfnXH4="},{"name":"routes/status","singularName":"","namespaced":true,"kind":"Route","verbs":["get","patch","update"]}]}
I0512 13:03:52.481527  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"snapshot.storage.k8s.io/v1beta1","resources":[{"name":"volumesnapshots","singularName":"volumesnapshot","namespaced":true,"kind":"VolumeSnapshot","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["vs"],"storageVersionHash":"CBkCoNN1OpA="},{"name":"volumesnapshots/status","singularName":"","namespaced":true,"kind":"VolumeSnapshot","verbs":["get","patch","update"]},{"name":"volumesnapshotcontents","singularName":"volumesnapshotcontent","namespaced":false,"kind":"VolumeSnapshotContent","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["vsc","vscs"],"storageVersionHash":"ksBm2JR3lFg="},{"name":"volumesnapshotcontents/status","singularName":"","namespaced":false,"kind":"VolumeSnapshotContent","verbs":["get","patch","update"]},{"name":"volumesnapshotclasses","singularName":"volumesnapshotclass","namespaced":false,"kind":"VolumeSnapshotClass","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["vsclass","vsclasses"],"storageVersionHash":"agjrTDFWe0I="}]}
I0512 13:03:52.482566  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"whereabouts.cni.cncf.io/v1alpha1","resources":[{"name":"ippools","singularName":"ippool","namespaced":true,"kind":"IPPool","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"XbcZnDfxtSo="},{"name":"overlappingrangeipreservations","singularName":"overlappingrangeipreservation","namespaced":true,"kind":"OverlappingRangeIPReservation","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"QAFn0NjLYV0="}]}
I0512 13:03:52.483390  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"helm.openshift.io/v1beta1","resources":[{"name":"projecthelmchartrepositories","singularName":"projecthelmchartrepository","namespaced":true,"kind":"ProjectHelmChartRepository","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"u99qLV2ifAE="},{"name":"projecthelmchartrepositories/status","singularName":"","namespaced":true,"kind":"ProjectHelmChartRepository","verbs":["get","patch","update"]},{"name":"helmchartrepositories","singularName":"helmchartrepository","namespaced":false,"kind":"HelmChartRepository","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"ufyXG8CiCus="},{"name":"helmchartrepositories/status","singularName":"","namespaced":false,"kind":"HelmChartRepository","verbs":["get","patch","update"]}]}
I0512 13:03:52.491041  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"template.openshift.io/v1","resources":[{"name":"brokertemplateinstances","singularName":"","namespaced":false,"kind":"BrokerTemplateInstance","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"Jxj8HlN0pXU="},{"name":"processedtemplates","singularName":"","namespaced":true,"kind":"Template","verbs":["create"]},{"name":"templateinstances","singularName":"","namespaced":true,"kind":"TemplateInstance","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"Q8UwfyPqly4="},{"name":"templateinstances/status","singularName":"","namespaced":true,"kind":"TemplateInstance","verbs":["get","patch","update"]},{"name":"templates","singularName":"","namespaced":true,"kind":"Template","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"utuWisMumJk="}]}
I0512 13:03:52.491864  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"build.openshift.io/v1","resources":[{"name":"buildconfigs","singularName":"","namespaced":true,"kind":"BuildConfig","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["bc"],"categories":["all"],"storageVersionHash":"2ooe95Eo6Y4="},{"name":"buildconfigs/instantiate","singularName":"","namespaced":true,"kind":"BuildRequest","verbs":["create"]},{"name":"buildconfigs/instantiatebinary","singularName":"","namespaced":true,"kind":"BinaryBuildRequestOptions","verbs":["create"]},{"name":"buildconfigs/webhooks","singularName":"","namespaced":true,"kind":"Build","verbs":["create"]},{"name":"builds","singularName":"","namespaced":true,"kind":"Build","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"categories":["all"],"storageVersionHash":"Iv+mFnvyzFc="},{"name":"builds/clone","singularName":"","namespaced":true,"kind":"BuildRequest","verbs":["create"]},{"name":"builds/details","singularName":"","namespaced":true,"kind":"Build","verbs":["update"]},{"name":"builds/log","singularName":"","namespaced":true,"kind":"BuildLog","verbs":["get"]}]}
I0512 13:03:52.492704  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"security.openshift.io/v1","resources":[{"name":"podsecuritypolicyreviews","singularName":"","namespaced":true,"kind":"PodSecurityPolicyReview","verbs":["create"]},{"name":"podsecuritypolicyselfsubjectreviews","singularName":"","namespaced":true,"kind":"PodSecurityPolicySelfSubjectReview","verbs":["create"]},{"name":"podsecuritypolicysubjectreviews","singularName":"","namespaced":true,"kind":"PodSecurityPolicySubjectReview","verbs":["create"]},{"name":"rangeallocations","singularName":"","namespaced":false,"kind":"RangeAllocation","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"8/BJx/M8Ga0="},{"name":"securitycontextconstraints","singularName":"","namespaced":false,"kind":"SecurityContextConstraints","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["scc"]}]}
I0512 13:03:52.495123  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"packages.operators.coreos.com/v1","resources":[{"name":"packagemanifests","singularName":"","namespaced":true,"kind":"PackageManifest","verbs":["get","list"]},{"name":"packagemanifests/icon","singularName":"","namespaced":true,"kind":"PackageManifest","verbs":["get"]}]}
I0512 13:03:52.497106  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"integration.ibm.com/v1beta1","resources":[{"name":"platformnavigators","singularName":"platformnavigator","namespaced":true,"kind":"PlatformNavigator","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["pn","pns","navigator","navigators"],"categories":["all","cp4i","integration"],"storageVersionHash":"pn3HZKA9wOE="},{"name":"platformnavigators/status","singularName":"","namespaced":true,"kind":"PlatformNavigator","verbs":["get","patch","update"]}]}
I0512 13:03:52.498144  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"machine.openshift.io/v1beta1","resources":[{"name":"machinesets","singularName":"machineset","namespaced":true,"kind":"MachineSet","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"p7ZEIilFKZ0="},{"name":"machinesets/status","singularName":"","namespaced":true,"kind":"MachineSet","verbs":["get","patch","update"]},{"name":"machinesets/scale","singularName":"","namespaced":true,"group":"autoscaling","version":"v1","kind":"Scale","verbs":["get","patch","update"]},{"name":"machines","singularName":"machine","namespaced":true,"kind":"Machine","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"storageVersionHash":"GLEf6/W/4xY="},{"name":"machines/status","singularName":"","namespaced":true,"kind":"Machine","verbs":["get","patch","update"]},{"name":"machinehealthchecks","singularName":"machinehealthcheck","namespaced":true,"kind":"MachineHealthCheck","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["mhc","mhcs"],"storageVersionHash":"qvQyPZn5SE8="},{"name":"machinehealthchecks/status","singularName":"","namespaced":true,"kind":"MachineHealthCheck","verbs":["get","patch","update"]}]}
I0512 13:03:52.499497  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"mq.ibm.com/v1beta1","resources":[{"name":"queuemanagers","singularName":"queuemanager","namespaced":true,"kind":"QueueManager","verbs":["delete","deletecollection","get","list","patch","create","update","watch"],"shortNames":["qmgr"],"categories":["all","integration","cp4i"],"storageVersionHash":"jssIpoBPi44="},{"name":"queuemanagers/status","singularName":"","namespaced":true,"kind":"QueueManager","verbs":["get","patch","update"]}]}
I0512 13:03:52.501251  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"authorization.openshift.io/v1","resources":[{"name":"clusterrolebindings","singularName":"","namespaced":false,"kind":"ClusterRoleBinding","verbs":["create","delete","get","list","patch","update"]},{"name":"clusterroles","singularName":"","namespaced":false,"kind":"ClusterRole","verbs":["create","delete","get","list","patch","update"]},{"name":"localresourceaccessreviews","singularName":"","namespaced":true,"kind":"LocalResourceAccessReview","verbs":["create"]},{"name":"localsubjectaccessreviews","singularName":"","namespaced":true,"kind":"LocalSubjectAccessReview","verbs":["create"]},{"name":"resourceaccessreviews","singularName":"","namespaced":false,"kind":"ResourceAccessReview","verbs":["create"]},{"name":"rolebindingrestrictions","singularName":"","namespaced":true,"kind":"RoleBindingRestriction","verbs":["create","delete","deletecollection","get","list","patch","update","watch"]},{"name":"rolebindings","singularName":"","namespaced":true,"kind":"RoleBinding","verbs":["create","delete","get","list","patch","update"]},{"name":"roles","singularName":"","namespaced":true,"kind":"Role","verbs":["create","delete","get","list","patch","update"]},{"name":"selfsubjectrulesreviews","singularName":"","namespaced":true,"kind":"SelfSubjectRulesReview","verbs":["create"]},{"name":"subjectaccessreviews","singularName":"","namespaced":false,"kind":"SubjectAccessReview","verbs":["create"]},{"name":"subjectrulesreviews","singularName":"","namespaced":true,"kind":"SubjectRulesReview","verbs":["create"]}]}
I0512 13:03:52.505799  258184 request.go:1181] Response Body: {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"user.openshift.io/v1","resources":[{"name":"groups","singularName":"","namespaced":false,"kind":"Group","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"YhnYEx+IpBw="},{"name":"identities","singularName":"","namespaced":false,"kind":"Identity","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"/S8uaPXlz9c="},{"name":"useridentitymappings","singularName":"","namespaced":false,"kind":"UserIdentityMapping","verbs":["create","delete","get","patch","update"]},{"name":"users","singularName":"","namespaced":false,"kind":"User","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"storageVersionHash":"Hbwv6GYPwoQ="}]}
I0512 13:03:52.597848  258184 round_trippers.go:466] curl -v -XGET  -H "Accept: application/json" -H "User-Agent: oc/4.10.0 (linux/s390x) kubernetes/6de42bd" -H "Authorization: Bearer <masked>" 'https://<redacted>:6443/api/v1/namespaces/ibm-common-services/pods/setup-job-q6l59'
I0512 13:03:52.609416  258184 round_trippers.go:570] HTTP Statistics: GetConnection 0 ms ServerProcessing 11 ms Duration 11 ms
I0512 13:03:52.609428  258184 round_trippers.go:577] Response Headers:
I0512 13:03:52.609435  258184 round_trippers.go:580]     Audit-Id: 608e8a06-2e53-4711-bfd0-1ec6c90b5204
I0512 13:03:52.609444  258184 round_trippers.go:580]     Cache-Control: no-cache, private
I0512 13:03:52.609456  258184 round_trippers.go:580]     Content-Type: application/json
I0512 13:03:52.609465  258184 round_trippers.go:580]     X-Kubernetes-Pf-Flowschema-Uid: e2b4c47e-7890-4446-84ca-67c0d19e992f
I0512 13:03:52.609473  258184 round_trippers.go:580]     X-Kubernetes-Pf-Prioritylevel-Uid: e2f9ed23-7e42-479b-a739-051d9a920ee2
I0512 13:03:52.609486  258184 round_trippers.go:580]     Date: Thu, 12 May 2022 20:03:52 GMT
I0512 13:03:52.609584  258184 request.go:1181] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"setup-job-q6l59","generateName":"setup-job-","namespace":"ibm-common-services","uid":"d5929b91-dbe7-423b-9b28-21bcc9378b8e","resourceVersion":"9267326","creationTimestamp":"2022-04-20T16:36:28Z","deletionTimestamp":"2022-04-20T16:52:26Z","deletionGracePeriodSeconds":0,"labels":{"app.kubernetes.io/instance":"ibm-zen-setup-job","app.kubernetes.io/managed-by":"ibm-zen-operator","app.kubernetes.io/name":"ibm-zen-setup-job","controller-uid":"75fe3f41-a2fd-4124-9971-242668e3ebad","job-name":"setup-job"},"annotations":{"k8s.v1.cni.cncf.io/network-status":"[{\n    \"name\": \"openshift-sdn\",\n    \"interface\": \"eth0\",\n    \"ips\": [\n        \"10.129.2.81\"\n    ],\n    \"default\": true,\n    \"dns\": {}\n}]","k8s.v1.cni.cncf.io/networks-status":"[{\n    \"name\": \"openshift-sdn\",\n    \"interface\": \"eth0\",\n    \"ips\": [\n        \"10.129.2.81\"\n    ],\n    \"default\": true,\n    \"dns\": {}\n}]","openshift.io/scc":"restricted","productID":"068a62892a1e4db39641342e592daa25","productMetric":"FREE","productName":"IBM Cloud Platform Common Services","productVersion":"4.0.0"},"ownerReferences":[{"apiVersion":"batch/v1","kind":"Job","name":"setup-job","uid":"75fe3f41-a2fd-4124-9971-242668e3ebad","controller":true,"blockOwnerDeletion":true}],"finalizers":["batch.kubernetes.io/job-tracking"],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2022-04-20T16:36:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:productID":{},"f:productMetric":{},"f:productName":{},"f:productVersion":{}},"f:finalizers":{".":{},"v:\"batch.kubernetes.io/job-tracking\"":{}},"f:generateName":{},"f:labels":{".":{},"f:app.kubernetes.io/instance":{},"f:app.kubernetes.io/managed-by":{},"f:app.kubernetes.io/name":{},"f:controller-uid":{},"f:job-name":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"75fe3f41-a2fd-4124-9971-242668e3ebad\"}":{}}},"f:spec":{"f:affinity":{".":{},"f:nodeAffinity":{".":{},"f:preferredDuringSchedulingIgnoredDuringExecution":{},"f:requiredDuringSchedulingIgnoredDuringExecution":{}},"f:podAntiAffinity":{".":{},"f:preferredDuringSchedulingIgnoredDuringExecution":{}}},"f:containers":{"k:{\"name\":\"create-secrets-job\"}":{".":{},"f:command":{},"f:env":{".":{},"k:{\"name\":\"ENABLE_JWT_CHECK\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"ICPD_CONTROLPLANE_NAMESPACE\"}":{".":{},"f:name":{},"f:valueFrom":{".":{},"f:fieldRef":{}}}},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{".":{},"f:limits":{".":{},"f:cpu":{},"f:memory":{}},"f:requests":{".":{},"f:cpu":{},"f:memory":{}}},"f:securityContext":{".":{},"f:allowPrivilegeEscalation":{},"f:capabilities":{".":{},"f:drop":{}},"f:privileged":{},"f:readOnlyRootFilesystem":{},"f:runAsNonRoot":{}},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{".":{},"f:runAsNonRoot":{}},"f:serviceAccount":{},"f:serviceAccountName":{},"f:terminationGracePeriodSeconds":{}}}},{"manager":"multus","operation":"Update","apiVersion":"v1","time":"2022-04-20T16:36:33Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{"f:k8s.v1.cni.cncf.io/network-status":{},"f:k8s.v1.cni.cncf.io/networks-status":{}}}},"subresource":"status"},{"manager":"Go-http-client","operation":"Update","apiVersion":"v1","time":"2022-04-20T16:36:36Z","fieldsType":"FieldsV1","fieldsV1":{"f:status":{"f:conditions":{"k:{\"type\":\"ContainersReady\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:reason":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Initialized\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:reason":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Ready\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:reason":{},"f:status":{},"f:type":{}}},"f:containerStatuses":{},"f:hostIP":{},"f:phase":{},"f:podIP":{},"f:podIPs":{".":{},"k:{\"ip\":\"10.129.2.81\"}":{".":{},"f:ip":{}}},"f:startTime":{}}},"subresource":"status"}]},"spec":{"volumes":[{"name":"kube-api-access-b4h59","projected":{"sources":[{"serviceAccountToken":{"expirationSeconds":3607,"path":"token"}},{"configMap":{"name":"kube-root-ca.crt","items":[{"key":"ca.crt","path":"ca.crt"}]}},{"downwardAPI":{"items":[{"path":"namespace","fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}]}},{"configMap":{"name":"openshift-service-ca.crt","items":[{"key":"service-ca.crt","path":"service-ca.crt"}]}}],"defaultMode":420}}],"containers":[{"name":"create-secrets-job","image":quay.io/opencloudio/ibm-zen-operator@sha256:ad20054931ceb58f5e01285e8ea0e7038813cf9f25e69a9f22a76742237c3950,"command":["/bin/bash","-c","/coreapi-server generate-cert meta-tls\n/coreapi-server generate-cert meta-jwt\n/coreapi-server create-secret -g token meta-api-broker-secret\n"],"env":[{"name":"ENABLE_JWT_CHECK","value":"false"},{"name":"ICPD_CONTROLPLANE_NAMESPACE","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}}],"resources":{"limits":{"cpu":"500m","memory":"128Mi"},"requests":{"cpu":"100m","memory":"64Mi"}},"volumeMounts":[{"name":"kube-api-access-b4h59","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent","securityContext":{"capabilities":{"drop":["ALL","KILL","MKNOD","SETGID","SETUID"]},"privileged":false,"runAsUser":1000680000,"runAsNonRoot":true,"readOnlyRootFilesystem":false,"allowPrivilegeEscalation":false}}],"restartPolicy":"OnFailure","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"ibm-zen-operator-serviceaccount","serviceAccount":"ibm-zen-operator-serviceaccount","nodeName":"<redacted>","securityContext":{"seLinuxOptions":{"level":"s0:c26,c15"},"runAsNonRoot":true,"fsGroup":1000680000},"imagePullSecrets":[{"name":"ibm-zen-operator-serviceaccount-dockercfg-vgkpc"}],"affinity":{"nodeAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"beta.kubernetes.io/arch","operator":"In","values":["amd64","s390x","ppc64le"]}]}]},"preferredDuringSchedulingIgnoredDuringExecution":[{"weight":3,"preference":{"matchExpressions":[{"key":"beta.kubernetes.io/arch","operator":"In","values":["amd64","s390x","ppc64le"]}]}}]},"podAntiAffinity":{"preferredDuringSchedulingIgnoredDuringExecution":[{"weight":100,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"app.kubernetes.io/instance","operator":"In","values":["setup-job"]}]},"topologyKey":"kubernetes.io/hostname"}}]}},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/memory-pressure","operator":"Exists","effect":"NoSchedule"}],"priority":0,"enableServiceLinks":true,"preemptionPolicy":"PreemptLowerPriority"},"status":{"phase":"Succeeded","conditions":[{"type":"Initialized","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-04-20T16:36:28Z","reason":"PodCompleted"},{"type":"Ready","status":"False","lastProbeTime":null,"lastTransitionTime":"2022-04-20T16:36:36Z","reason":"PodCompleted"},{"type":"ContainersReady","status":"False","lastProbeTime":null,"lastTransitionTime":"2022-04-20T16:36:36Z","reason":"PodCompleted"},{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2022-04-20T16:36:28Z"}],"hostIP":"<redacted>","podIP":"10.129.2.81","podIPs":[{"ip":"10.129.2.81"}],"startTime":"2022-04-20T16:36:28Z","containerStatuses":[{"name":"create-secrets-job","state":{"terminated":{"exitCode":0,"reason":"Completed","startedAt":"2022-04-20T16:36:33Z","finishedAt":"2022-04-20T16:36:35Z","containerID":"cri-o://e47c22a1b07378db0978014d1258f1ca2698b923da94137d1f5a0341bd142b64"}},"lastState":{},"ready":false,"restartCount":0,"image":quay.io/opencloudio/ibm-zen-operator@sha256:ad20054931ceb58f5e01285e8ea0e7038813cf9f25e69a9f22a76742237c3950,"imageID":quay.io/opencloudio/ibm-zen-operator@sha256:4f15bcda5f209dbcefbababb2af19fefd549c96d61b410bb88d655c55c3c3a8f,"containerID":"cri-o://e47c22a1b07378db0978014d1258f1ca2698b923da94137d1f5a0341bd142b64","started":false}],"qosClass":"Burstable"}}
I0512 13:03:52.610239  258184 request.go:1181] Request Body: [{"op":"remove","path":"/metadata/finalizers"}]
I0512 13:03:52.610298  258184 round_trippers.go:466] curl -v -XPATCH  -H "Accept: application/json" -H "User-Agent: oc/4.10.0 (linux/s390x) kubernetes/6de42bd" -H "Content-Type: application/json-patch+json" -H "Authorization: Bearer <masked>" 'https://<redacted>:6443/api/v1/namespaces/ibm-common-services/pods/setup-job-q6l59?fieldManager=kubectl-patch'
I0512 13:03:52.643480  258184 round_trippers.go:570] HTTP Statistics: GetConnection 0 ms ServerProcessing 33 ms Duration 33 ms
I0512 13:03:52.643491  258184 round_trippers.go:577] Response Headers:
I0512 13:03:52.643499  258184 round_trippers.go:580]     Date: Thu, 12 May 2022 20:03:52 GMT
I0512 13:03:52.643509  258184 round_trippers.go:580]     Audit-Id: 77aed7bd-c3b5-4e4d-bceb-e28cb626bf71
I0512 13:03:52.643531  258184 round_trippers.go:580]     Cache-Control: no-cache, private
I0512 13:03:52.643540  258184 round_trippers.go:580]     Content-Type: application/json
I0512 13:03:52.643552  258184 round_trippers.go:580]     X-Kubernetes-Pf-Flowschema-Uid: e2b4c47e-7890-4446-84ca-67c0d19e992f
I0512 13:03:52.643561  258184 round_trippers.go:580]     X-Kubernetes-Pf-Prioritylevel-Uid: e2f9ed23-7e42-479b-a739-051d9a920ee2
I0512 13:03:52.643572  258184 round_trippers.go:580]     Content-Length: 1668
I0512 13:03:52.643599  258184 request.go:1181] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod \"setup-job-q6l59\" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations) or `spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 22 identical fields\n  \tPriority:         \u00260,\n  \tPreemptionPolicy: \u0026\"PreemptLowerPriority\",\n- \tDNSConfig:        \u0026core.PodDNSConfig{Options: []core.PodDNSConfigOption{{Name: \"single-request-reopen\"}}},\n+ \tDNSConfig:        nil,\n  \tReadinessGates:   nil,\n  \tRuntimeClassName: nil,\n  \t... // 4 identical fields\n  }\n","reason":"Invalid","details":{"name":"setup-job-q6l59","kind":"Pod","causes":[{"reason":"FieldValueForbidden","message":"Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations) or `spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)\n  core.PodSpec{\n  \t... // 22 identical fields\n  \tPriority:         \u00260,\n  \tPreemptionPolicy: \u0026\"PreemptLowerPriority\",\n- \tDNSConfig:        \u0026core.PodDNSConfig{Options: []core.PodDNSConfigOption{{Name: \"single-request-reopen\"}}},\n+ \tDNSConfig:        nil,\n  \tReadinessGates:   nil,\n  \tRuntimeClassName: nil,\n  \t... // 4 identical fields\n  }\n","field":"spec"}]},"code":422}
F0512 13:03:52.643843  258184 helpers.go:118] The Pod "setup-job-q6l59" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations) or `spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)
  core.PodSpec{
        ... // 22 identical fields
        Priority:         &0,
        PreemptionPolicy: &"PreemptLowerPriority",
-       DNSConfig:        &core.PodDNSConfig{Options: []core.PodDNSConfigOption{{Name: "single-request-reopen"}}},
+       DNSConfig:        nil,
        ReadinessGates:   nil,
        RuntimeClassName: nil,
        ... // 4 identical fields
  }

goroutine 1 [running]:
k8s.io/klog/v2.stacks(0x1)
        /go/src/github.com/openshift/oc/vendor/k8s.io/klog/v2/klog.go:1140 +0xbe
k8s.io/klog/v2.(*loggingT).output(0x85628160, 0x3, 0x0, 0xc000522850, 0x2, {0x845fae41, 0xa}, 0x76, 0x0)
        /go/src/github.com/openshift/oc/vendor/k8s.io/klog/v2/klog.go:1088 +0x6da
k8s.io/klog/v2.(*loggingT).printDepth(0x85628160, 0x3, 0x0, {0x0, 0x0}, 0x2, {0xc0006ce890, 0x1, 0x1})
        /go/src/github.com/openshift/oc/vendor/k8s.io/klog/v2/klog.go:735 +0x200
k8s.io/klog/v2.FatalDepth(...)
        /go/src/github.com/openshift/oc/vendor/k8s.io/klog/v2/klog.go:1628
k8s.io/kubectl/pkg/cmd/util.fatal({0xc000b90b00, 0x2bf}, 0x1)
        /go/src/github.com/openshift/oc/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:96 +0xfa
k8s.io/kubectl/pkg/cmd/util.checkErr({0x83e07340, 0xc000c4a8c0}, 0x83afe0a8)
        /go/src/github.com/openshift/oc/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:160 +0x362
k8s.io/kubectl/pkg/cmd/util.CheckErr(...)
        /go/src/github.com/openshift/oc/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:118
k8s.io/kubectl/pkg/cmd/patch.NewCmdPatch.func1(0xc000bbf400, {0xc000810360, 0x2, 0x9})
        /go/src/github.com/openshift/oc/vendor/k8s.io/kubectl/pkg/cmd/patch/patch.go:122 +0xd8
github.com/spf13/cobra.(*Command).execute(0xc000bbf400, {0xc0008102d0, 0x9, 0x9})
        /go/src/github.com/openshift/oc/vendor/github.com/spf13/cobra/command.go:860 +0x762
github.com/spf13/cobra.(*Command).ExecuteC(0xc00077af00)
        /go/src/github.com/openshift/oc/vendor/github.com/spf13/cobra/command.go:974 +0x4c8
github.com/spf13/cobra.(*Command).Execute(...)
        /go/src/github.com/openshift/oc/vendor/github.com/spf13/cobra/command.go:902
k8s.io/component-base/cli.run(0xc00077af00)
        /go/src/github.com/openshift/oc/vendor/k8s.io/component-base/cli/run.go:146 +0x390
k8s.io/component-base/cli.RunNoErrOutput(...)
        /go/src/github.com/openshift/oc/vendor/k8s.io/component-base/cli/run.go:84
main.main()
        /go/src/github.com/openshift/oc/cmd/oc/oc.go:79 +0x504

goroutine 6 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0x85628160)
        /go/src/github.com/openshift/oc/vendor/k8s.io/klog/v2/klog.go:1283 +0x90
created by k8s.io/klog/v2.init.0
        /go/src/github.com/openshift/oc/vendor/k8s.io/klog/v2/klog.go:420 +0x120

goroutine 25 [select]:
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x83afdf48, {0x83e07640, 0xc000c25590}, 0x1, 0xc0000a6360)
        /go/src/github.com/openshift/oc/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:167 +0x174
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x83afdf48, 0x12a05f200, 0x0, 0x1, 0xc0000a6360)
        /go/src/github.com/openshift/oc/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x92
k8s.io/apimachinery/pkg/util/wait.Until(...)
        /go/src/github.com/openshift/oc/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90
k8s.io/apimachinery/pkg/util/wait.Forever(0x83afdf48, 0x12a05f200)
        /go/src/github.com/openshift/oc/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:81 +0x48
created by k8s.io/component-base/logs.InitLogs
        /go/src/github.com/openshift/oc/vendor/k8s.io/component-base/logs/logs.go:179 +0x80

goroutine 24 [select]:
io.(*pipe).Read(0xc000aacc60, {0xc000c5e000, 0x1000, 0x1000})
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/io/pipe.go:57 +0xb0
io.(*PipeReader).Read(0xc000362618, {0xc000c5e000, 0x1000, 0x1000})
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/io/pipe.go:134 +0x42
bufio.(*Scanner).Scan(0xc000b58080)
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/bufio/scan.go:215 +0x9fc
github.com/openshift/oc/pkg/cli/admin/mustgather.newPrefixWriter.func1(0xc000b58080, {0x83e08ec0, 0xc000010018}, {0x837a596e, 0x17})
        /go/src/github.com/openshift/oc/pkg/cli/admin/mustgather/mustgather.go:527 +0x108
created by github.com/openshift/oc/pkg/cli/admin/mustgather.newPrefixWriter
        /go/src/github.com/openshift/oc/pkg/cli/admin/mustgather/mustgather.go:526 +0x23c

goroutine 33 [IO wait]:
internal/poll.runtime_pollWait(0x3ff976a2498, 0x72)
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/runtime/netpoll.go:234 +0xb4
internal/poll.(*pollDesc).wait(0xc0004a7e18, 0x72, 0x0)
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/internal/poll/fd_poll_runtime.go:84 +0x3c
internal/poll.(*pollDesc).waitRead(...)
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc0004a7e00, {0xc000a48000, 0x785a, 0x785a})
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/internal/poll/fd_unix.go:167 +0x23a
net.(*netFD).Read(0xc0004a7e00, {0xc000a48000, 0x785a, 0x785a})
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/net/fd_posix.go:56 +0x42
net.(*conn).Read(0xc000362010, {0xc000a48000, 0x785a, 0x785a})
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/net/net.go:183 +0x52
crypto/tls.(*atLeastReader).Read(0xc00088c8b8, {0xc000a48000, 0x785a, 0x785a})
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/crypto/tls/conn.go:777 +0x5e
bytes.(*Buffer).ReadFrom(0xc000be5af8, {0x83e03100, 0xc00088c8b8})
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/bytes/buffer.go:204 +0xb8
crypto/tls.(*Conn).readFromUntil(0xc000be5880, {0x83e089e0, 0xc000362010}, 0x5)
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/crypto/tls/conn.go:799 +0x100
crypto/tls.(*Conn).readRecordOrCCS(0xc000be5880, 0x0)
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/crypto/tls/conn.go:606 +0xf4
crypto/tls.(*Conn).readRecord(...)
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/crypto/tls/conn.go:574
crypto/tls.(*Conn).Read(0xc000be5880, {0xc00001d000, 0x1000, 0x1000})
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/crypto/tls/conn.go:1277 +0x166
bufio.(*Reader).Read(0xc000433a40, {0xc00013aac0, 0x9, 0x9})
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/bufio/bufio.go:227 +0x26a
io.ReadAtLeast({0x83e02e60, 0xc000433a40}, {0xc00013aac0, 0x9, 0x9}, 0x9)
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/io/io.go:328 +0xa6
io.ReadFull(...)
        /opt/rh/go-toolset-1.17/root/usr/lib/go-toolset-1.17-golang/src/io/io.go:347
golang.org/x/net/http2.readFrameHeader({0xc00013aac0, 0x9, 0x9}, {0x83e02e60, 0xc000433a40})
        /go/src/github.com/openshift/oc/vendor/golang.org/x/net/http2/frame.go:237 +0x5e
golang.org/x/net/http2.(*Framer).ReadFrame(0xc00013aa80)
        /go/src/github.com/openshift/oc/vendor/golang.org/x/net/http2/frame.go:498 +0xac
golang.org/x/net/http2.(*clientConnReadLoop).run(0xc000997fb8)
        /go/src/github.com/openshift/oc/vendor/golang.org/x/net/http2/transport.go:2101 +0x156
golang.org/x/net/http2.(*ClientConn).readLoop(0xc000051080)
        /go/src/github.com/openshift/oc/vendor/golang.org/x/net/http2/transport.go:1997 +0x5a
created by golang.org/x/net/http2.(*Transport).newClientConn
        /go/src/github.com/openshift/oc/vendor/golang.org/x/net/http2/transport.go:725 +0xbdc

Comment 18 Maciej Szulik 2022-05-13 14:15:25 UTC
This piece you've pointed out:

-       DNSConfig:        &core.PodDNSConfig{Options: []core.PodDNSConfigOption{{Name: "single-request-reopen"}}},
+       DNSConfig:        nil,

might be the root cause of the problem. Does the customer have any admission webhooks enabled that modify the PodSpec,
specifically the DNSConfig shown above? That seems like something they must have enabled recently, and the
problematic pod is failing on it.
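
To illustrate why a spec-mutating webhook could leave the pod stuck, here is a minimal sketch in Go. It is a toy model, not the real API server or Job controller code: the `Pod`, `PodSpec`, `Mutator`, `update`, `withoutFinalizer`, and `injectDNSOption` names are all hypothetical, and the validation is reduced to "the spec must not change on update". The assumption being modelled is that the request that removes the job-tracking finalizer is itself a pod update, so a webhook that rewrites DNSConfig on that request would cause it to be rejected.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// PodSpec and Pod are deliberately tiny stand-ins for the real API types.
type PodSpec struct {
	DNSOptions []string // stands in for spec.dnsConfig.options
}

type Pod struct {
	Finalizers []string
	Spec       PodSpec
}

// Mutator models a mutating admission webhook that runs on every update.
type Mutator func(p *Pod)

var errSpecChanged = errors.New("pod updates may not change spec fields")

// update applies the mutators to the incoming object, then rejects the
// request if the resulting spec differs from the stored one (a simplified
// version of pod update validation).
func update(stored, incoming Pod, mutators ...Mutator) (Pod, error) {
	for _, m := range mutators {
		m(&incoming)
	}
	if !reflect.DeepEqual(stored.Spec, incoming.Spec) {
		return stored, errSpecChanged
	}
	return incoming, nil
}

// withoutFinalizer returns a copy of p with the named finalizer removed.
func withoutFinalizer(p Pod, name string) Pod {
	out := Pod{Spec: p.Spec}
	for _, f := range p.Finalizers {
		if f != name {
			out.Finalizers = append(out.Finalizers, f)
		}
	}
	return out
}

// injectDNSOption is a hypothetical webhook that forces a DNS option.
func injectDNSOption(p *Pod) {
	p.Spec.DNSOptions = []string{"single-request-reopen"}
}

func main() {
	const finalizer = "batch.kubernetes.io/job-tracking"

	// The stored pod predates the webhook, so it has no DNS options set.
	stored := Pod{Finalizers: []string{finalizer}}
	want := withoutFinalizer(stored, finalizer)

	_, err := update(stored, want, injectDNSOption)
	fmt.Println("with webhook:", err)

	got, err := update(stored, want)
	fmt.Println("without webhook:", err, "finalizers left:", len(got.Finalizers))
}
```

In this model the finalizer can only be removed once the webhook is out of the request path, because otherwise every update carries a spec change that validation refuses.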

Comment 19 pfruth 2022-05-16 17:21:30 UTC
> Does the customer have some admission webhooks enabled

Yes, there is a MutatingWebhookConfiguration defined, called ibm-common-service-webhook-configuration.
I had the client make a backup of that webhook and then delete it.
Good news!! That has finally unblocked the log jam. The pod that was in Terminating status is now terminated/gone, and the cluster upgrade has finally completed.

Thank you for all your help.

Going forward, it appears that the fix for the race condition that ultimately got us to this point (i.e., the race introduced by the switch to using finalizers to track job completion status in the kube Job controller code) is incorporated in OCP v4.10.13.

Can you please confirm that is the case?

If true, we will wait for v4.10.13 to become available in the stable-4.10 channel and then upgrade.

Comment 20 W. Trevor King 2022-05-16 19:31:28 UTC
[1] shows the fix for this series shipping in 4.10.13.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2075831#c8

Comment 23 errata-xmlrpc 2022-08-10 11:07:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069