Description of problem:
The first master was marked unschedulable, which left the apiserver pod unschedulable and caused the upgrade to fail.

1. Unable to apply 4.4.0-rc.6: the cluster operator openshift-apiserver is degraded. If we mark the master schedulable, the change gets reverted, and later the operator below fails.

2. Unable to apply 4.4.0-rc.6: the cluster operator machine-config has not yet successfully rolled out.
Message: Unable to apply 4.4.0-rc.6: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: controller version mismatch for rendered-master-f90dab41073f1445d44ec27c32c353b1 expected a7b13759061f645a76f03c04d385d275bbbd0c02 has ab4d62a3bf3774b77b6f9b04a2028faec1568aca, retrying

3. Master-0 is marked unschedulable, which prevents etcd-quorum from scheduling.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Fails to upgrade to 4.4

Expected results:
Should succeed

Additional info:
3fc43ae7-1201-4c5c-ab50-51583a209081
Must-gather times out.
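For triage, the machine-config message in item 2 can be split into the rendered config name and the two controller hashes, so they can be compared against what is deployed. This is only a rough sketch: the regular expression is modeled on the single message quoted in this report, and `parse_mismatch` is a hypothetical helper name, not part of any OpenShift tooling. Other MCO versions may word the message differently.

```python
import re

# Pattern modeled on the one message quoted in this report (an assumption:
# other Machine Config Operator versions may phrase it differently).
MISMATCH_RE = re.compile(
    r"controller version mismatch for (?P<config>\S+) "
    r"expected (?P<expected>[0-9a-f]+) "
    r"has (?P<actual>[0-9a-f]+)"
)


def parse_mismatch(message):
    """Hypothetical helper: return config name and both hashes, or None."""
    match = MISMATCH_RE.search(message)
    return match.groupdict() if match else None


# The message text below is copied from the report above.
message = (
    "pool master has not progressed to latest configuration: "
    "controller version mismatch for "
    "rendered-master-f90dab41073f1445d44ec27c32c353b1 "
    "expected a7b13759061f645a76f03c04d385d275bbbd0c02 "
    "has ab4d62a3bf3774b77b6f9b04a2028faec1568aca, retrying"
)

info = parse_mismatch(message)
if info is not None:
    print("rendered config:", info["config"])
    print("expected controller:", info["expected"])
    print("actual controller:", info["actual"])
```

The "expected" hash is what the message says the pool should have and the "has" hash is what it reports for the rendered config; what each hash corresponds to in the cluster still has to be checked against the cluster itself.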
Hi jkaur,

Since there is no must-gather, we can't find the root cause. We need the following information:
- What is the infrastructure: AWS, GCP?
- Did it happen more than once?
- Is the cluster usable at all after this happens? Is it possible for you to give us access to the cluster once it happens?
- Can you give us the release image URLs for the upgrade (from -> to)?

We want to provision a cluster, kick off an upgrade, and try to reproduce the issue.
Assigning it to "Machine Config Operator" as it looks like it could be an issue with the machine config. If you find otherwise, please feel free to assign it back to apiserver.
@Jaspreet is this an error that you've encountered more than once? What version were you upgrading from?
*** This bug has been marked as a duplicate of bug 1817455 ***
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days