Bug 2071941 - cronjob collect-profiles failed leads node reach to OutOfpods status
Summary: cronjob collect-profiles failed leads node reach to OutOfpods status
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.10
Hardware: All
OS: Linux
Target Milestone: ---
Target Release: 4.10.z
Assignee: Per da Silva
QA Contact: Jian Zhang
Depends On: 2055861
Blocks: 2079082
Reported: 2022-04-05 09:33 UTC by Per da Silva
Modified: 2023-09-15 01:23 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2055861
Last Closed: 2022-04-25 19:51:43 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift operator-framework-olm pull 277 | 0 | None | open | [release-4.10] Bug 2071941: Replace collect-profile jobs that haven't completed | 2022-04-05 14:32:51 UTC
Red Hat Product Errata | RHBA-2022:1431 | 0 | None | None | None | 2022-04-25 19:52:01 UTC

Comment 5 Jian Zhang 2022-04-14 04:22:41 UTC
1. Create an OCP 4.10 cluster that contains the fix PR.
mac:~ jianzhang$ oc adm release info registry.ci.openshift.org/ocp/release:4.10.0-0.nightly-2022-04-13-214142 -a .dockerconfigjson --commits|grep olm
  operator-lifecycle-manager                     https://github.com/openshift/operator-framework-olm                         1cb0c9a578ffcc6d471b483ab34b627430677f09
  operator-registry                              https://github.com/openshift/operator-framework-olm                         1cb0c9a578ffcc6d471b483ab34b627430677f09

mac:~ jianzhang$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-04-13-214142   True        False         96s     Cluster version is 4.10.0-0.nightly-2022-04-13-214142

2. Cordon the worker nodes that the collect-profiles job pods are running on.
mac:~ jianzhang$ oc adm cordon ci-ln-tpy5pqb-72292-8ww7d-worker-a-vp62w  ci-ln-tpy5pqb-72292-8ww7d-worker-b-gdjg9  ci-ln-tpy5pqb-72292-8ww7d-worker-c-fjd7l 
node/ci-ln-tpy5pqb-72292-8ww7d-worker-a-vp62w cordoned
node/ci-ln-tpy5pqb-72292-8ww7d-worker-b-gdjg9 cordoned
node/ci-ln-tpy5pqb-72292-8ww7d-worker-c-fjd7l cordoned
mac:~ jianzhang$ oc get nodes
NAME                                       STATUS                     ROLES    AGE   VERSION
ci-ln-tpy5pqb-72292-8ww7d-master-0         Ready                      master   21m   v1.23.5+9ce5071
ci-ln-tpy5pqb-72292-8ww7d-master-1         Ready                      master   21m   v1.23.5+9ce5071
ci-ln-tpy5pqb-72292-8ww7d-master-2         Ready                      master   21m   v1.23.5+9ce5071
ci-ln-tpy5pqb-72292-8ww7d-worker-a-vp62w   Ready,SchedulingDisabled   worker   11m   v1.23.5+9ce5071
ci-ln-tpy5pqb-72292-8ww7d-worker-b-gdjg9   Ready,SchedulingDisabled   worker   11m   v1.23.5+9ce5071
ci-ln-tpy5pqb-72292-8ww7d-worker-c-fjd7l   Ready,SchedulingDisabled   worker   11m   v1.23.5+9ce5071

3. Check whether additional collect-profiles pods are generated.
mac:~ jianzhang$ oc get pods -n openshift-operator-lifecycle-manager
NAME                                      READY   STATUS      RESTARTS      AGE
catalog-operator-68558fff4b-rf79c         1/1     Running     0             61m
collect-profiles-27498450-wvj57           0/1     Completed   0             51m
collect-profiles-27498495-9qq9n           0/1     Pending     0             6m32s
olm-operator-5c6f5df9f6-dzw4p             1/1     Running     0             61m

As shown above, only one pod is pending after the job has run twice. LGTM, marking as verified.
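
The check in step 3 can also be done by counting instead of reading the pod list by eye. The following is a minimal sketch, not part of the original verification: count_pending_collect_profiles is a hypothetical helper name, and it assumes the default column layout of `oc get pods` (NAME, READY, STATUS, ...) shown above.

```shell
# Hypothetical helper (not part of oc or OLM): reads `oc get pods` output on
# stdin and prints how many collect-profiles pods are in the Pending state.
# Assumes the default columns, where NAME is field 1 and STATUS is field 3.
count_pending_collect_profiles() {
  awk '$1 ~ /^collect-profiles-/ && $3 == "Pending" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (assumes a logged-in oc session):
#   oc get pods -n openshift-operator-lifecycle-manager --no-headers \
#     | count_pending_collect_profiles
```

With the fix in place, the count would be expected to stay at 1 while the worker nodes remain cordoned, rather than growing with each scheduled run.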

Comment 9 errata-xmlrpc 2022-04-25 19:51:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.11 bug fix update), and where to find the updated
files, see advisory RHBA-2022:1431.

If the solution does not work for you, open a new bug report.


Comment 10 Red Hat Bugzilla 2023-09-15 01:23:04 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.
