Bug 1838352 - OperatorExited, Pending marketplace-operator-... pod for several weeks [NEEDINFO]
Summary: OperatorExited, Pending marketplace-operator-... pod for several weeks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Alexander Greene
QA Contact: Tom Buskey
URL:
Whiteboard:
Duplicates: 1888383
Depends On:
Blocks: 1892382
Reported: 2020-05-21 01:17 UTC by W. Trevor King
Modified: 2021-02-24 15:13 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The Marketplace operator was written to report that the services it offered were degraded whenever the pod exited gracefully. This would happen during routine cluster upgrades.
Consequence: The marketplace pod reported a degraded status during normal upgrades. This information was ultimately surfaced in Telemetry and caused confusion for both cluster admins and customer experience teams.
Fix: The marketplace operator no longer reports that it is degraded when it exits gracefully.
Result: The marketplace operator is no longer flagged by Telemeter as degraded, reducing confusion for customers and customer experience teams.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:12:13 UTC
Target Upstream Version:
ggore: needinfo? (agreene)


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | operator-framework operator-marketplace pull 354 | 0 | None | closed | Bug 1838352: Don't report OperatorExited to ClusterOperator | 2021-02-18 21:51:23 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:13:33 UTC

Description W. Trevor King 2020-05-21 01:17:54 UTC
Seen in the marketplace ClusterOperator early in a 4.3.18 -> 4.3.19 update:

version: operator 4.3.18
conditions:
  2020-05-05T23:31:20Z Progressing=False OperatorExited: The operator has exited
  2020-05-19T06:39:39Z Available=False OperatorExited: The operator has exited
  2020-05-05T23:31:00Z Upgradeable=True OperatorExited: Marketplace is upgradeable
  2019-12-05T18:28:36Z Degraded=False OperatorExited: The operator has exited

Checking out the operator pod:

$ jq -r '.status | .startTime + " " + .phase + "\n" + ([.conditions[] | .lastTransitionTime + " " + (.lastProbeTime // "-") + " " + .type + "=" + .status + " " + (.reason // "-") + ": " + (.message // "-")] | join("\n"))' config/pod/openshift-marketplace/marketplace-operator-794975cff-h7m5f
2020-05-05T23:30:12Z Pending
2020-05-05T23:30:12Z - Initialized=True -: -
2020-05-19T06:39:42Z - Ready=False ContainersNotReady: containers with unready status: [marketplace-operator]
2020-05-19T06:39:42Z - ContainersReady=False ContainersNotReady: containers with unready status: [marketplace-operator]
2020-05-05T23:30:12Z - PodScheduled=True -: -

so it has been Pending with an unready container (and no restarts) for two weeks.
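
For anyone without jq handy, the same summary can be sketched in Python. This is only an illustration of what the filter above prints: the function name summarize_pod_status is made up for this example, and it assumes the pod file has already been loaded as JSON (for example with json.load) into a dict with the usual .status.startTime, .status.phase and .status.conditions fields.

```python
def summarize_pod_status(pod: dict) -> str:
    """Return one 'startTime phase' line, then one line per pod condition.

    Hypothetical helper mirroring the jq filter in the description.
    Missing or null lastProbeTime, reason and message are shown as "-".
    (Unlike jq's `//`, `or` also replaces an empty string with "-".)
    """
    status = pod["status"]
    lines = ["{} {}".format(status["startTime"], status["phase"])]
    for cond in status["conditions"]:
        lines.append(
            "{} {} {}={} {}: {}".format(
                cond["lastTransitionTime"],
                cond.get("lastProbeTime") or "-",
                cond["type"],
                cond["status"],
                cond.get("reason") or "-",
                cond.get("message") or "-",
            )
        )
    return "\n".join(lines)
```

Printing the result for the pod file used above should give the same kind of listing: a Pending phase next to a start time two weeks older than the Ready=False transition is the symptom to look for.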

Comment 11 Evan Cordell 2020-10-26 13:49:12 UTC
*** Bug 1888383 has been marked as a duplicate of this bug. ***

Comment 19 W. Trevor King 2020-12-11 20:19:57 UTC
The PR closing this bug [1] just teaches the marketplace operator to not set OperatorExited on graceful shutdowns.  That's good; it's the cluster-version operator's job to complain if the marketplace operator fails to come back up.  But it does not address why the marketplace operator was unable to come back up (i.e. the "stuck" portion of this bug).  If anyone has a cluster where the marketplace operator is not coming back up or is sticking an update, regardless of OperatorExited conditions, please file a new bug and point us at a must-gather, and we'll dig in.

[1]: https://github.com/operator-framework/operator-marketplace/pull/354
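
To make the behavior change concrete, here is a toy Python model of it. It is not the code from the PR (the operator itself is not written in Python); the function names and the dict-of-tuples representation are invented for illustration, and the "before" behavior is modelled only on the condition dump in the description, where Available, Progressing and Degraded all carry the OperatorExited reason.

```python
EXIT_REASON = "OperatorExited"
EXIT_MESSAGE = "The operator has exited"


def report_exit_before_fix(conditions: dict) -> dict:
    """Assumed pre-fix behavior: stamp OperatorExited on graceful shutdown.

    `conditions` maps a condition type to a (status, reason, message) tuple.
    """
    updated = dict(conditions)
    for cond_type in ("Available", "Progressing", "Degraded"):
        updated[cond_type] = ("False", EXIT_REASON, EXIT_MESSAGE)
    return updated


def report_exit_after_fix(conditions: dict) -> dict:
    """Assumed post-fix behavior: leave the last reported conditions alone."""
    return dict(conditions)
```

With the second variant, a graceful exit during an update leaves Available as it was, and noticing an operator that never comes back is left to the cluster-version operator.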

Comment 27 errata-xmlrpc 2021-02-24 15:12:13 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633
