Bug 1686067
| Summary: | Revision prune controller fails with "revision-status-0 not found" | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mike Dame <mdame> |
| Component: | Master | Assignee: | Mike Dame <mdame> |
| Status: | CLOSED ERRATA | QA Contact: | zhou ying <yinzhou> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.1.0 | CC: | aos-bugs, deads, jokerman, mfojtik, mmccomas |
| Target Milestone: | --- | | |
| Target Release: | 4.1.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-06-04 10:45:04 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Mike Dame
2019-03-06 16:14:06 UTC
Previous attempt to fix this, for reference: https://github.com/openshift/library-go/pull/291

Opened https://github.com/openshift/library-go/pull/303 to try to address this again, but on closer inspection the errors mentioned above should not be coming from line 303 in prune_controller.go: after the above-mentioned fix (https://github.com/openshift/library-go/pull/291), that error would have been bumped down two lines, and in the most recent code it is still on line 305. This leads me to believe that the payload image is perhaps not updated. The vendored library-go code in kube-apiserver-operator on master is up to date with the fix. @David, could you try to verify again whether you can get this message from where it currently is in the code? I don't know any other way to verify the running version of the operator itself, but this is definitely a signal.

Release image building was fixed last week, and I confirmed that this is still an issue. Testing https://github.com/openshift/cluster-kube-apiserver-operator/pull/335 (with the library-go fix above) cleared up this problem, so hopefully once that merges this can move to QA.

The PR has merged, and this should be addressed now.

Confirmed with OCP; I could still see errors like:
```
[root@192 ~]# oc version --short
Client Version: v4.0.22
Server Version: v1.12.4+befe71b
Payload: 4.0.0-0.nightly-2019-03-19-004004
[root@192 ~]# oc logs -f po/kube-apiserver-operator-6764545587-qhbft -n openshift-kube-apiserver-operator |grep ownerrefs
I0319 09:21:21.254250 1 event.go:221] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-kube-apiserver-operator", Name:"kube-apiserver-operator", UID:"b75e0411-49f4-11e9-a3be-02b842adcc06", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'InstallerPodFailed' Failed to create installer pod for revision 19 on node "ip-172-31-131-4.ap-northeast-2.compute.internal": unable to set installer pod ownerrefs: configmap "revision-status-19" not found
E0319 09:21:21.315681 1 installer_controller.go:762] key failed with : unable to set installer pod ownerrefs: configmap "revision-status-19" not found
E0319 09:21:21.317350 1 installer_controller.go:762] key failed with : unable to set installer pod ownerrefs: configmap "revision-status-19" not found
```
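For reference, the "unable to set installer pod ownerrefs" message comes from the installer controller looking up the revision's revision-status ConfigMap to record it as the installer pod's owner, presumably so that pruning the ConfigMap also garbage-collects the pod. A minimal sketch of that pattern, with a hypothetical helper name and signature rather than the actual library-go code:

```go
// A minimal sketch, not the actual library-go code: setInstallerPodOwnerRef
// is a hypothetical helper. It ties an installer pod to its revision-status
// ConfigMap via an owner reference. If the ConfigMap has already been
// deleted, the Get fails and the controller surfaces the "unable to set
// installer pod ownerrefs" error seen in the logs above.
package installer

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func setInstallerPodOwnerRef(ctx context.Context, client kubernetes.Interface, ns string, revision int32, pod *corev1.Pod) error {
	name := fmt.Sprintf("revision-status-%d", revision)
	cm, err := client.CoreV1().ConfigMaps(ns).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		// This is the failure mode reported above: the ConfigMap for the
		// revision is gone, so the owner reference cannot be set.
		return fmt.Errorf("unable to set installer pod ownerrefs: %v", err)
	}
	pod.OwnerReferences = append(pod.OwnerReferences, metav1.OwnerReference{
		APIVersion: "v1",
		Kind:       "ConfigMap",
		Name:       cm.Name,
		UID:        cm.UID,
	})
	return nil
}
```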
Thanks for verifying. I believe the issue you're seeing is actually related to a separate bug (see https://bugzilla.redhat.com/show_bug.cgi?id=1686070). We saw these errors with "revision-status-0" frequently and regularly, and the issue here was that revision-status-0 should not exist. The issue you're seeing is more likely that configmaps which *should* exist are being deleted, with different revision numbers. That is also related to the installer pod, whereas the problem in this bug was related to the pruner pod. Could you please verify that the "revision-status-0" issue is resolved, and if necessary add this new information to the bug I mentioned above? Thanks!

Since I could not see the error about revision-status-0, I will verify this.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
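For completeness, a hypothetical sketch (not the actual library-go fix) of the kind of pruning guard the comments above imply, assuming revisions are counted from 1 so that revision-status-0 legitimately never exists:

```go
// A hypothetical sketch, not the actual library-go fix: it illustrates two
// guards consistent with the discussion above. Revisions are counted from 1,
// so the loop never constructs "revision-status-0", and a ConfigMap that is
// already gone is treated as success rather than failing the sync.
package prune

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// pruneRevisionStatuses deletes revision-status ConfigMaps older than the
// most recent `keep` revisions.
func pruneRevisionStatuses(ctx context.Context, client kubernetes.Interface, ns string, latest, keep int32) error {
	for r := int32(1); r <= latest-keep; r++ {
		name := fmt.Sprintf("revision-status-%d", r)
		err := client.CoreV1().ConfigMaps(ns).Delete(ctx, name, metav1.DeleteOptions{})
		if err != nil && !apierrors.IsNotFound(err) {
			// Any error other than "already gone" still fails the prune.
			return fmt.Errorf("failed to prune %s: %v", name, err)
		}
	}
	return nil
}
```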