From an Insights tarball from a 4.3.18 -> 4.3.19 update:

$ tar -xOz config/clusteroperator/monitoring <20200519062637-32ad8cfe89fd45ddb28f2eda2c34936d | jq -r '.status.conditions[] | " " + .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + " " +.message'
  2020-05-19T02:34:40Z Available=True RollOutDone Successfully rolled out the stack.
  2020-05-19T02:34:40Z Progressing=False
  2020-05-19T02:34:40Z Degraded=False
  2020-05-19T04:26:32Z Upgradeable=True RollOutInProgress Rollout of the monitoring stack is in progress. Please wait until it finishes.

Upgradeable=True with RollOutInProgress really sounds like it's progressing, and yet Progressing=False. Also, Upgradeable=True plus a "Please wait" message is pretty odd. If you wanted folks to wait, I'd expect Upgradeable=False. Possibly the reason and message are just not getting reset to some "all is well" placeholders when the transition completes?

Also, the timestamps on the conditions are all well before the 4.3.18 -> 4.3.19 update itself. From a later must-gather:

$ yaml2json <cluster-scoped-resources/config.openshift.io/clusterversions/version.yaml | jq -r '.status.history[] | .startedTime + " " + .completionTime + " " + .version + " " + .state + " " + (.verified | tostring)' | head -n2
2020-05-19T06:32:38Z null 4.3.19 Partial true
2020-05-05T22:08:18Z 2020-05-05T23:35:40Z 4.3.18 Completed true

So it's not clear to me why the monitoring operator would be poking around with conditions at 04:26:32Z. Possibly in response to an autoscaler or other node activity.
As per: https://coreos.slack.com/archives/C0VMT03S5/p1589961556398800?thread_ts=1589952447.394700&cid=C0VMT03S5 > lili I understood we should not be setting Upgradeable=False? Can you advise, Trevor? Until clarified, setting low severity.
Sounds like monitoring doesn't have anything that would call for Upgradeable=False and "you can't bump minor version 4.y -> 4.(y+1) because $THIS would break". So the fix is probably to pick a reason ("AsExpected" or similar) and a message ("This is fine" or similar) and always set those instead of the current "RollOutInProgress" and "Rollout of the monitoring stack is in progress. Please wait until it finishes". No functional impact, so low priority is appropriate, but it seems like a straightforward fix, and folks like me with my admin hat on would be less confused once the reason/message makes sense with the Upgradeable=True type/status.
We want to modify this in 4.6 onwards, so I created a task so we don't forget: https://issues.redhat.com/browse/MON-1126. Closing as agreed on Slack.