Created attachment 1538395 [details]
network_operator_log

Description of problem:
The newly added network clusteroperator.config.openshift.io can monitor the network operator and report its status in real time. However, it can take a long time to report that the operator is available again after a problem in the config has been fixed.

Version-Release number of selected component (if applicable):
v4.0.0-0.177.0

How reproducible:
Always

Steps to Reproduce:
1. Set up an OCP cluster.
2. Check the network clusteroperator.config.openshift.io:
   NAME      VERSION   AVAILABLE   PROGRESSING   FAILING   SINCE
   network             True        False         False     7s
3. Introduce a problem in the network.config.openshift.io.
4. Check that the clusteroperator.config.openshift.io is reporting FAILING:
   NAME      VERSION   AVAILABLE   PROGRESSING   FAILING   SINCE
   network             False       False         True      9s
5. Fix the problem in the network.config.openshift.io.
6. Watch the clusteroperator.

Actual results:
It can take about 5 minutes to report the cluster as available again.
   NAME      VERSION   AVAILABLE   PROGRESSING   FAILING   SINCE
   network             False       False         False     4m

Expected results:
The operator status should be reported in real time.

Additional info:
Full log of the network operator attached.
The fix for this also changes the behavior of the status conditions a bit. In particular, now when you break the configuration, the operator Failing status will become True, but the Available status will not change; the operator will report that it is both Failing and Available. Then when you fix the config again, it should report Failing False, but you won't need to wait for Available to change, because it never changed in the first place.
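To make the new behavior concrete, here is a minimal sketch of the condition handling described above. It is only an illustration, not the operator's actual code: the `OperatorStatus` type and the `on_config_change` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class OperatorStatus:
    """Hypothetical model of the three reported status conditions."""
    available: bool
    progressing: bool
    failing: bool


def on_config_change(status: OperatorStatus, config_valid: bool) -> OperatorStatus:
    # Assumption for illustration: only Failing tracks config validity.
    # Available is deliberately left untouched, so nothing has to
    # "recover" once the config is fixed.
    return replace(status, failing=not config_valid)


healthy = OperatorStatus(available=True, progressing=False, failing=False)
broken = on_config_change(healthy, config_valid=False)  # Available and Failing
fixed = on_config_change(broken, config_valid=True)     # same as the initial state
```

With this model, breaking the config yields Available=True together with Failing=True, and fixing it only has to flip Failing back to False.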
(In reply to Dan Winship from comment #1)
> The fix for this also changes the behavior of the status conditions a bit.
> In particular, now when you break the configuration, the operator Failing
> status will become True, but the Available status will not change; the
> operator will report that it is both Failing and Available. Then when you
> fix the config again, it should report Failing False, but you won't need to
> wait for Available to change, because it never changed in the first place.

Yep, as noticed on 4.0.0-0.nightly-2019-03-13-233958, it now shows True,False,True after a bad config:

# oc get clusteroperators.config.openshift.io | grep "network\|NAME"
NAME      VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
network   4.0.0-0.nightly-2019-03-13-233958   True        False         True      125m

and about 20 seconds after the config is corrected, it shows True,False,False:

# oc get clusteroperators.config.openshift.io | grep "network\|NAME"
NAME      VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
network   4.0.0-0.nightly-2019-03-13-233958   True        False         False     129m
Checked on OCP 4.0.0-0.nightly-2019-03-19-004004.

The status of the network operator at the beginning:

# oc get clusteroperator network -o wide
NAME      VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
network   4.0.0-0.nightly-2019-03-19-004004   True        False         False     84m

After introducing a problem in network.config.openshift.io/cluster:

# oc get clusteroperator network
NAME      VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
network   4.0.0-0.nightly-2019-03-19-004004   True        False         True      86m

After fixing the problem above, the status refreshes within a few seconds:

# oc get clusteroperator network
NAME      VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
network   4.0.0-0.nightly-2019-03-19-004004   True        False         False     86m

Marking the bug as verified.
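For anyone repeating this verification, the check can be scripted by parsing the text output shown above. The helper below is a rough sketch under the assumption that the output keeps the six-column layout seen in this report (NAME, VERSION, AVAILABLE, PROGRESSING, FAILING, SINCE); the function name `parse_clusteroperator_rows` is made up for this example, and the VERSION column is allowed to be empty, as in the original description.

```python
def parse_clusteroperator_rows(output: str) -> dict:
    """Parse pasted `oc get clusteroperator` text into a dict keyed by name.

    Assumes the column layout seen in this report; the last four columns
    are read from the right so that an empty VERSION column still works.
    """
    rows = {}
    for line in output.strip().splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "NAME":
            continue  # skip the header and anything too short to be a row
        available, progressing, failing, since = fields[-4:]
        conditions = (available, progressing, failing)
        if any(c not in ("True", "False", "Unknown") for c in conditions):
            continue  # skip shell prompts or other non-row lines
        rows[fields[0]] = {
            "version": fields[1] if len(fields) == 6 else "",
            "available": available == "True",
            "progressing": progressing == "True",
            "failing": failing == "True",
            "since": since,
        }
    return rows


sample = """\
# oc get clusteroperator network
NAME      VERSION                             AVAILABLE   PROGRESSING   FAILING   SINCE
network   4.0.0-0.nightly-2019-03-19-004004   True        False         True      86m
"""
status = parse_clusteroperator_rows(sample)["network"]
```

A verification loop could then poll until `status["failing"]` is False and record how long that took; the polling itself is left out here because it depends on how `oc` is invoked in a given environment.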
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0758