It seems the expected clusterrolebindings are missing for both the sriov and ptp operators.

For sriov, the clusterrolebinding for the operator is missing; only the one for sriov-network-config-daemon exists:

[kni@ran-vcl01-installer ~]$ oc get pods -n openshift-sriov-network-operator
No resources found in openshift-sriov-network-operator namespace.
[kni@ran-vcl01-installer ~]$ oc get clusterrolebindings -A | grep sriov
sriov-network-operator.4.8.0-202110121407-65d74b9b84   ClusterRole/sriov-network-operator.4.8.0-202110121407-65d74b9b84   5h55m
[kni@ran-vcl01-installer ~]$ oc describe clusterrolebindings sriov-network-operator.4.8.0-202110121407-65d74b9b84 | grep Service
olm.owner.kind=ClusterServiceVersion
ServiceAccount  sriov-network-config-daemon  openshift-sriov-network-operator

For ptp, the clusterrolebinding for the ptp config daemon is missing, while the operator clusterrolebinding exists:

[kni@ran-vcl01-installer ~]$ oc get clusterrolebindings -A | grep ptp
ptp-operator.4.8.0-202110121407-6d46bbc9f5   ClusterRole/ptp-operator.4.8.0-202110121407-6d46bbc9f5   6h7m
[kni@ran-vcl01-installer ~]$ oc describe clusterrolebindings ptp-operator.4.8.0-202110121407-6d46bbc9f5 | grep Service
olm.owner.kind=ClusterServiceVersion
ServiceAccount  ptp-operator  openshift-ptp

For comparison, the following is from a working cluster. A working cluster should have two clusterrolebindings (one for the operator and one for the daemon) for each of ptp and sriov, in order for the appropriate service accounts to be created.
Healthy SRIOV:

[yliu1@yliu1 ~]$ oc get clusterrolebindings -A | grep sriov
sriov-network-operator.4.9.0-202110121402-77656d84dd   ClusterRole/sriov-network-operator.4.9.0-202110121402-77656d84dd   5d4h
sriov-network-operator.4.9.0-202110121402-7b777bd698   ClusterRole/sriov-network-operator.4.9.0-202110121402-7b777bd698   5d4h
[yliu1@yliu1 ~]$ oc describe clusterrolebindings sriov-network-operator.4.9.0-202110121402-77656d84dd | grep ServiceAcc
ServiceAccount  sriov-network-operator  openshift-sriov-network-operator
[yliu1@yliu1 ~]$ oc describe clusterrolebindings sriov-network-operator.4.9.0-202110121402-7b777bd698 | grep ServiceAcc
ServiceAccount  sriov-network-config-daemon  openshift-sriov-network-operator

Healthy PTP:

[yliu1@yliu1 ~]$ oc get clusterrolebindings -A | grep ptp
ptp-operator.4.8.0-202110011559-8bb895549   ClusterRole/ptp-operator.4.8.0-202110011559-8bb895549   5d7h
ptp-operator.4.8.0-202110011559-b875d6955   ClusterRole/ptp-operator.4.8.0-202110011559-b875d6955   5d7h
[yliu1@yliu1 ~]$ oc describe clusterrolebindings ptp-operator.4.8.0-202110011559-8bb895549 | grep ServiceA
ServiceAccount  linuxptp-daemon  openshift-ptp
[yliu1@yliu1 ~]$ oc describe clusterrolebindings ptp-operator.4.8.0-202110011559-b875d6955 | grep ServiceA
ServiceAccount  ptp-operator  openshift-ptp

The operator version used on the affected cluster is 4.8.0-202110121407, along with OCP 4.8.15:

[kni@ran-vcl01-installer ~]$ oc get csv -A | grep -E "sriov|ptp"
openshift-ptp                      performance-addon-operator.v4.8.2           Performance Addon Operator   4.8.2                Succeeded
openshift-ptp                      ptp-operator.4.8.0-202110121407             PTP Operator                 4.8.0-202110121407   Pending
openshift-sriov-network-operator   performance-addon-operator.v4.8.2           Performance Addon Operator   4.8.2                Succeeded
openshift-sriov-network-operator   sriov-network-operator.4.8.0-202110121407   SR-IOV Network Operator      4.8.0-202110121407   Pending
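
The manual comparison above can be sketched as a small shell helper. This is only a rough illustration, not part of the original report: the function name missing_bound_sas is hypothetical, and it assumes the subject rows printed by `oc describe clusterrolebindings` have the three-column form seen in the transcripts (ServiceAccount, name, namespace).

```shell
# Hypothetical helper (not from the original report).
# Reads `oc describe clusterrolebindings` text on stdin and prints every
# expected service account that has no matching ServiceAccount subject row.
# Assumed usage on a live cluster:
#   oc describe clusterrolebindings | missing_bound_sas openshift-ptp ptp-operator linuxptp-daemon
missing_bound_sas() {
  ns="$1"
  shift
  described="$(cat)"
  for sa in "$@"; do
    # Subject rows look like: "ServiceAccount  <name>  <namespace>"
    if ! printf '%s\n' "$described" |
      awk -v sa="$sa" -v ns="$ns" \
        '$1 == "ServiceAccount" && $2 == sa && $3 == ns { found = 1 } END { exit !found }'
    then
      printf '%s\n' "$sa"
    fi
  done
}
```

On the affected cluster, the expectation would be that this prints linuxptp-daemon for the openshift-ptp namespace and sriov-network-operator for the openshift-sriov-network-operator namespace, matching the missing bindings described above.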
Workaround: Delete the affected Pending CSVs, subscriptions, and installplans, so that the proper clusterrolebindings get created by the new installplan.
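
As a rough sketch of the first step of that workaround, the helper below picks out the Pending sriov/ptp CSVs from `oc get csv -A --no-headers` output. The function name pending_operator_csvs is hypothetical, and the sketch assumes PHASE is the last column, as in the output shown earlier. The delete commands appear only as comments, since the exact subscription and installplan names are not given in this report.

```shell
# Hypothetical helper (not from the original report).
# Reads `oc get csv -A --no-headers` text on stdin and prints
# "<namespace> <csv-name>" for each sriov/ptp CSV whose PHASE is Pending.
pending_operator_csvs() {
  awk '$NF == "Pending" && $2 ~ /^(sriov-network-operator|ptp-operator)\./ { print $1, $2 }'
}

# Assumed usage on a live cluster (review the list before deleting anything):
#   oc get csv -A --no-headers | pending_operator_csvs |
#     while read -r ns csv; do oc delete csv -n "$ns" "$csv"; done
# The affected subscriptions and installplans in the same namespaces would
# then be listed and deleted by name, for example:
#   oc get subscriptions.operators.coreos.com,installplans -n openshift-ptp
```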
Jeff,

What is the frequency of this occurring?

/KenY
(In reply to Ken Young from comment #5)
> Jeff,
>
> What is the frequency of this occurring?
>
> /KenY

When I was actively trying to run tests on the system in question, it happened on the order of half the time. Yang, does that sound about right? It was frequent enough to be a real roadblock. It was also observed on at least one other system.
Yes, we deployed cnfocto2 with 4.8.13 and 4.8.15 a few times (~4-5) and encountered this at least twice, as far as I can remember. I also encountered this issue on a QE cluster on 4.9, but it is much rarer there than on cnfocto2. A couple of differences between cnfocto2 and the QE cluster (although I am not sure whether they contribute to the frequency of this issue): 1) cnfocto2 is co-located in the same lab as the hub cluster, 2) cnfocto2 has proper sriov and ptp configs.
Ian, I am assuming this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2021151. Correct? /KenY
Yes. This BZ has the same root cause as BZ 2021151 but presents with different symptoms (the operators fail to install) and has its own workaround for those symptoms.
*** This bug has been marked as a duplicate of bug 2021151 ***