Bug 1686067 - Revision prune controller fails with "revision-status-0 not found"
Summary: Revision prune controller fails with "revision-status-0 not found"
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Master
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.1.0
Assignee: Mike Dame
QA Contact: zhou ying
Depends On:
Reported: 2019-03-06 16:14 UTC by Mike Dame
Modified: 2019-06-04 10:45 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2019-06-04 10:45:04 UTC
Target Upstream Version:


System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 None None None 2019-06-04 10:45:10 UTC

Description Mike Dame 2019-03-06 16:14:06 UTC
Description of problem:
The operator logs show pruning errors such as "E0306 16:03:15.892155       1 prune_controller.go:303] key failed with : unable to set pruner pod ownerrefs: configmap "revision-status-0" not found"

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Check the logs of any static pod operator (e.g., kube-apiserver-operator)

Actual results:
Error message exists

Expected results:
Error does not exist

Additional info:
May be related to revisions failing to prune: a previous investigation suggested that this failure could get stuck in backoff/retry logic, which would prevent future pruning attempts from running. Regardless, a "revision-status-0" configmap never exists, and the controller should not attempt to find one, since there is no explicit revision "0".
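
As an illustration of the expected behavior described above, here is a minimal Go sketch of a guard that never produces a lookup for revision 0. The helper name statusConfigMapName is hypothetical and is not taken from library-go; only the "revision-status-N" naming comes from the log messages in this report.

```go
package main

import "fmt"

// statusConfigMapName is a hypothetical helper (not library-go code).
// It returns the name of the status configmap for a revision, and false
// when no such configmap is expected to exist (revision 0 or below).
func statusConfigMapName(revision int32) (string, bool) {
	if revision <= 0 {
		// There is no explicit revision 0, so there is nothing to look up.
		return "", false
	}
	return fmt.Sprintf("revision-status-%d", revision), true
}

func main() {
	for _, rev := range []int32{0, 1, 19} {
		if name, ok := statusConfigMapName(rev); ok {
			fmt.Printf("revision %d: look up configmap %q\n", rev, name)
		} else {
			fmt.Printf("revision %d: skip, no status configmap expected\n", rev)
		}
	}
}
```

A caller setting owner references would check the boolean first and skip the configmap lookup entirely when it is false, rather than surfacing a "not found" error.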

Comment 1 Mike Dame 2019-03-06 17:50:20 UTC
Previous attempt to fix this for reference: https://github.com/openshift/library-go/pull/291

Comment 2 Mike Dame 2019-03-06 21:39:44 UTC
Opened https://github.com/openshift/library-go/pull/303 to try to address this again. On closer inspection, though, the errors mentioned above should not be coming from line 303 of prune_controller.go: the earlier fix (https://github.com/openshift/library-go/pull/291) would have moved that error down 2 lines, and in the most recent code it is still on line 305. This leads me to believe that the payload image has not been updated. The vendored library-go code in kube-apiserver-operator on master is up to date with the fix.

@David could you try to verify again if you can get this message from where it currently is in the code? I don't know any other way to verify the running version of the operator itself, but this is definitely a signal
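
The line-number check described in this comment can be scripted. The following is a rough Go sketch that extracts the file:line field from a klog-style header and compares it with the line where the error lives in current code (305, per the comment above). The names klogHeader and sourceLocation are made up for this example, and the regular expression assumes the header layout seen in the log lines in this report.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// klogHeader matches a header such as
// "E0306 16:03:15.892155       1 prune_controller.go:303]"
// and captures the source file name and line number.
var klogHeader = regexp.MustCompile(`^[IWEF]\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+ ([^\s:]+):(\d+)\]`)

// sourceLocation returns the file and line that emitted a log line.
// ok is false when the line does not start with a klog-style header.
func sourceLocation(logLine string) (file string, line int, ok bool) {
	m := klogHeader.FindStringSubmatch(logLine)
	if m == nil {
		return "", 0, false
	}
	n, err := strconv.Atoi(m[2])
	if err != nil {
		return "", 0, false
	}
	return m[1], n, true
}

func main() {
	logLine := `E0306 16:03:15.892155       1 prune_controller.go:303] key failed with : unable to set pruner pod ownerrefs: configmap "revision-status-0" not found`
	const expectedLine = 305 // where the error sits in current code, per this comment

	file, line, ok := sourceLocation(logLine)
	if !ok {
		fmt.Println("not a klog-formatted line")
		return
	}
	if line != expectedLine {
		fmt.Printf("%s:%d differs from expected line %d; the running image may predate the fix\n", file, line, expectedLine)
	}
}
```

A mismatch is only a hint that the image is stale, since any unrelated edit to the file also shifts line numbers.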

Comment 4 Mike Dame 2019-03-13 19:53:16 UTC
Release image building was fixed last week, and I confirmed that this is still an issue. Testing https://github.com/openshift/cluster-kube-apiserver-operator/pull/335 (with the library-go fix above) cleared up the problem, so once that merges this can hopefully move to QA.

Comment 5 Mike Dame 2019-03-15 15:56:07 UTC
PR has merged and this should be addressed now

Comment 6 zhou ying 2019-03-20 02:25:00 UTC
Checked with OCP; errors like the following still appear:
[root@192 ~]# oc version --short
Client Version: v4.0.22
Server Version: v1.12.4+befe71b

Payload: 4.0.0-0.nightly-2019-03-19-004004

[root@192 ~]# oc logs -f po/kube-apiserver-operator-6764545587-qhbft  -n openshift-kube-apiserver-operator |grep ownerrefs
I0319 09:21:21.254250       1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-apiserver-operator", Name:"kube-apiserver-operator", UID:"b75e0411-49f4-11e9-a3be-02b842adcc06", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'InstallerPodFailed' Failed to create installer pod for revision 19 on node "ip-172-31-131-4.ap-northeast-2.compute.internal": unable to set installer pod ownerrefs: configmap "revision-status-19" not found
E0319 09:21:21.315681       1 installer_controller.go:762] key failed with : unable to set installer pod ownerrefs: configmap "revision-status-19" not found
E0319 09:21:21.317350       1 installer_controller.go:762] key failed with : unable to set installer pod ownerrefs: configmap "revision-status-19" not found

Comment 7 Mike Dame 2019-03-20 13:02:32 UTC
Thanks for verifying. I believe the issue you're seeing is actually related to a separate bug (see https://bugzilla.redhat.com/show_bug.cgi?id=1686070).

We saw these errors with "revision-status-0" frequently and regularly, and the issue here was that revision-status-0 should not exist. The issue you're seeing, with different revision numbers, is more likely that configmaps which *should* exist are being deleted. Your errors also come from the installer pod, whereas the problem in this bug was in the pruner pod.

Could you please verify that the "revision-status-0" issue is resolved, and if necessary add this new information to the bug I mentioned above? Thanks!
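
To make the distinction in this comment concrete, here is a small Go triage sketch that sorts "ownerrefs" log lines into the two cases. The function name classify and the labels it returns are invented for this example, and the pattern is based only on the error messages quoted in this report.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// ownerRefErr matches the error text quoted in this report and captures
// the pod type (pruner or installer) and the revision number.
var ownerRefErr = regexp.MustCompile(`unable to set (pruner|installer) pod ownerrefs: configmap "revision-status-(\d+)" not found`)

// classify is a hypothetical triage helper. It returns:
//   "pruner-revision-0" for the symptom tracked in this bug,
//   "missing-configmap" for any other revision or pod type, which more
//                       likely points at the separate bug mentioned above,
//   "unrelated"         when the line does not contain the error at all.
func classify(logLine string) string {
	m := ownerRefErr.FindStringSubmatch(logLine)
	if m == nil {
		return "unrelated"
	}
	rev, err := strconv.Atoi(m[2])
	if err != nil {
		return "unrelated"
	}
	if m[1] == "pruner" && rev == 0 {
		return "pruner-revision-0"
	}
	return "missing-configmap"
}

func main() {
	lines := []string{
		`E0306 16:03:15.892155       1 prune_controller.go:303] key failed with : unable to set pruner pod ownerrefs: configmap "revision-status-0" not found`,
		`E0319 09:21:21.315681       1 installer_controller.go:762] key failed with : unable to set installer pod ownerrefs: configmap "revision-status-19" not found`,
	}
	for _, l := range lines {
		fmt.Println(classify(l))
	}
}
```

This is a coarse filter for reading logs, not a diagnosis; it only separates the revision-0 pruner error from every other missing-configmap error.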

Comment 8 zhou ying 2019-03-21 01:53:46 UTC
Since I can no longer see the revision-status-0 error, I will mark this as verified.

Comment 10 errata-xmlrpc 2019-06-04 10:45:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

