Description of problem:
Masters failing on starter-us-east-2. Filing a bz after #libra-ops discussion.

Version-Release number of selected component (if applicable):
OpenShift Master: v3.7.23 (online version 3.7.2.1)

The masters on starter-us-east-2 are failing to come up or are in a bad state. The atomic-openshift-master-controllers logs show the following error:

Apr 05 11:52:30 ip-172-31-67-105.us-east-2.compute.internal atomic-openshift-master-controllers[25092]: F0405 11:52:30.145529 25092 plugins.go:150] Invalid configuration: Predicate type not found for CheckVolumeBinding
Apr 05 11:52:30 ip-172-31-67-105.us-east-2.compute.internal systemd[1]: atomic-openshift-master-controllers.service: main process exited, code=exited, status=255/n/a
Apr 05 11:52:30 ip-172-31-67-105.us-east-2.compute.internal systemd[1]: Unit atomic-openshift-master-controllers.service entered failed state.

The CheckVolumeBinding predicate seems to be referenced only in the 3.9 documentation, which suggests that a 3.9 config was applied to this cluster. Maintenance was performed yesterday evening, before this issue appeared.
I don't know the root cause yet, but my suggestion is to remove CheckVolumeBinding from the scheduler's policy file. Removing it won't affect anything, because CheckVolumeBinding is a default predicate and is registered by default anyway, so there is no need to list it again in the policy file where it is failing. Also, the VolumeScheduling feature that CheckVolumeBinding is used with is alpha and disabled in 3.9, so the predicate does not do anything anyway.
Meanwhile, I will keep looking into why this is happening, but I do know that it does not always happen.
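
For reference, a minimal sketch of the removal suggested above, assuming the scheduler policy is the usual JSON Policy object with a "predicates" list of {"name": ...} entries. The helper name remove_predicate and the sample policy contents are hypothetical and for illustration only; they are not taken from the affected cluster.

```python
import json


def remove_predicate(policy: dict, name: str) -> dict:
    """Return a shallow copy of the policy without the named predicate.

    If the policy has no "predicates" key, it is left absent rather than
    replaced with an empty list.
    """
    cleaned = dict(policy)
    if "predicates" in policy:
        cleaned["predicates"] = [
            p for p in policy["predicates"] if p.get("name") != name
        ]
    return cleaned


# Hypothetical sample policy; the real file on the masters will differ.
sample_policy = {
    "kind": "Policy",
    "apiVersion": "v1",
    "predicates": [
        {"name": "NoDiskConflict"},
        {"name": "CheckVolumeBinding"},
        {"name": "GeneralPredicates"},
    ],
    "priorities": [{"name": "LeastRequestedPriority", "weight": 1}],
}

cleaned_policy = remove_predicate(sample_policy, "CheckVolumeBinding")
print(json.dumps(cleaned_policy, indent=2))
```

In practice the policy would be loaded from and written back to the scheduler policy file on each master (back it up first), and the master controllers service restarted afterwards. The file location is not stated in this bug, so it is deliberately left out here.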
I did not realize that the version was 3.7; I thought it was 3.9. The CheckVolumeBinding predicate was added in 3.9, so it should not affect 3.7. Please check why the scheduler policy file has CheckVolumeBinding in 3.7.
This should now be resolved.
Can we close this bz if it's resolved?