2071941 – cronjob collect-profiles failed leads node reach to OutOfpods status

Summary: cronjob collect-profiles failed leads node reach to OutOfpods status

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	OLM
Sub Component:
Version:	4.10
Hardware:	All
OS:	Linux
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Target Release:	4.10.z
Assignee:	Per da Silva
QA Contact:	Jian Zhang
Docs Contact:
URL:
Whiteboard:
Depends On:	2055861
Blocks:	2079082

Reported:	2022-04-05 09:33 UTC by Per da Silva
Modified:	2023-09-15 01:23 UTC
CC List:	9 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	2055861
Environment:
Last Closed:	2022-04-25 19:51:43 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift operator-framework-olm pull 277	0	None	open	[release-4.10] Bug 2071941: Replace collect-profile jobs that haven't completed	2022-04-05 14:32:51 UTC
Red Hat Product Errata	RHBA-2022:1431	0	None	None	None	2022-04-25 19:52:01 UTC

Comment 1 Per da Silva 2022-04-05 10:10:41 UTC

PR: https://github.com/openshift/operator-framework-olm/pull/276

Comment 5 Jian Zhang 2022-04-14 04:22:41 UTC

1, Create an OCP 4.10 cluster that contains the fix PR.
mac:~ jianzhang$ oc adm release info registry.ci.openshift.org/ocp/release:4.10.0-0.nightly-2022-04-13-214142 -a .dockerconfigjson --commits|grep olm
  operator-lifecycle-manager                     https://github.com/openshift/operator-framework-olm                         1cb0c9a578ffcc6d471b483ab34b627430677f09
  operator-registry                              https://github.com/openshift/operator-framework-olm                         1cb0c9a578ffcc6d471b483ab34b627430677f09

mac:~ jianzhang$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-04-13-214142   True        False         96s     Cluster version is 4.10.0-0.nightly-2022-04-13-214142

2, Cordon the worker nodes that the collect-profiles job pods are running on.
mac:~ jianzhang$ oc adm cordon ci-ln-tpy5pqb-72292-8ww7d-worker-a-vp62w  ci-ln-tpy5pqb-72292-8ww7d-worker-b-gdjg9  ci-ln-tpy5pqb-72292-8ww7d-worker-c-fjd7l 
node/ci-ln-tpy5pqb-72292-8ww7d-worker-a-vp62w cordoned
node/ci-ln-tpy5pqb-72292-8ww7d-worker-b-gdjg9 cordoned
node/ci-ln-tpy5pqb-72292-8ww7d-worker-c-fjd7l cordoned
mac:~ jianzhang$ oc get nodes
NAME                                       STATUS                     ROLES    AGE   VERSION
ci-ln-tpy5pqb-72292-8ww7d-master-0         Ready                      master   21m   v1.23.5+9ce5071
ci-ln-tpy5pqb-72292-8ww7d-master-1         Ready                      master   21m   v1.23.5+9ce5071
ci-ln-tpy5pqb-72292-8ww7d-master-2         Ready                      master   21m   v1.23.5+9ce5071
ci-ln-tpy5pqb-72292-8ww7d-worker-a-vp62w   Ready,SchedulingDisabled   worker   11m   v1.23.5+9ce5071
ci-ln-tpy5pqb-72292-8ww7d-worker-b-gdjg9   Ready,SchedulingDisabled   worker   11m   v1.23.5+9ce5071
ci-ln-tpy5pqb-72292-8ww7d-worker-c-fjd7l   Ready,SchedulingDisabled   worker   11m   v1.23.5+9ce5071

3, Check if more collect-profiles pods are generated.
mac:~ jianzhang$ oc get pods -n openshift-operator-lifecycle-manager
NAME                                      READY   STATUS      RESTARTS      AGE
catalog-operator-68558fff4b-rf79c         1/1     Running     0             61m
collect-profiles-27498450-wvj57           0/1     Completed   0             51m
collect-profiles-27498495-9qq9n           0/1     Pending     0             6m32s
olm-operator-5c6f5df9f6-dzw4p             1/1     Running     0             61m
..

As shown above, only one pod is pending after the job has run twice. LGTM, marking it as verified.
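
The check in step 3 can be scripted rather than read by eye. The following is a minimal sketch, not part of the original verification: the helper name `count_pending_collect_profiles` is hypothetical, and it assumes the default table output of `oc get pods` (NAME in column 1, STATUS in column 3). It only parses text from stdin, so it can be tried on saved output without a cluster.

```shell
# Hypothetical helper (an assumption, not from the bug report): reads the
# default table output of `oc get pods` on stdin and prints how many
# collect-profiles pods are in Pending status.
count_pending_collect_profiles() {
  awk '$1 ~ /^collect-profiles-/ && $3 == "Pending" { n++ } END { print n + 0 }'
}

# Example against a live cluster (assumes `oc` is already logged in):
#   oc get pods -n openshift-operator-lifecycle-manager | count_pending_collect_profiles
```

With the fix in place, the count should stay at one or lower while the worker nodes remain cordoned, instead of growing with each scheduled run.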

Comment 9 errata-xmlrpc 2022-04-25 19:51:43 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.11 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1431

Comment 10 Red Hat Bugzilla 2023-09-15 01:23:04 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days
