Bug 2099499

Summary: CCM failed to downgrade from 4.11 to 4.10 on Alicloud and IBMCloud
Product: OpenShift Container Platform
Reporter: sunzhaohua <zhsun>
Component: Cloud Compute
Sub Component: Cloud Controller Manager
Assignee: dmoiseev
QA Contact: sunzhaohua <zhsun>
Status: CLOSED ERRATA
Severity: low
Priority: low
Version: 4.11
Target Release: 4.10.z
Hardware: Unspecified
OS: Unspecified
Last Closed: 2022-07-07 11:40:21 UTC
Type: Bug
Bug Depends On: 2051457

Description sunzhaohua 2022-06-21 06:28:24 UTC
Description of problem:
When downgrading from 4.11.0-0.nightly-2022-06-15-222801 to 4.10.0-0.nightly-2022-06-08-150219 on Alicloud and IBMCloud, the cloud-controller-manager cluster operator becomes degraded.

Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-06-15-222801

How reproducible:
Always

Steps to Reproduce:
1. Set up a cluster on Alicloud or IBMCloud
2. Downgrade the cluster from 4.11.0-0.nightly-2022-06-15-222801 to 4.10.0-0.nightly-2022-06-08-150219:
$ oc adm upgrade --allow-explicit-upgrade --to-image registry.ci.openshift.org/ocp/release:4.10.0-0.nightly-2022-06-08-150219 --force

Actual results:
The downgrade fails on cloud-controller-manager.
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-15-222801   True        True          28m     Working towards 4.10.0-0.nightly-2022-06-08-150219: 143 of 771 done (18% complete), waiting up to 40 minutes on cloud-controller-manager

Alicloud
$ oc get co                                                                           
NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.11.0-0.nightly-2022-06-15-222801   True        False         False      46m
baremetal                                  4.11.0-0.nightly-2022-06-15-222801   True        False         False      64m
cloud-controller-manager                   4.10.0-0.nightly-2022-06-08-150219   True        False         True       67m     Failed to resync for operator: 4.10.0-0.nightly-2022-06-08-150219 because &{{{%!e(string=) %!e(string=)} {%!e(string=) %!e(string=) %!e(string=) %!e(*int64=<nil>)} %!e(string=Failure) %!e(string=Deployment.apps "alibaba-cloud-controller-manager" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"alibaba-cloud-controller-manager"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable) %!e(v1.StatusReason=Invalid) %!e(*v1.StatusDetails=&{alibaba-cloud-controller-manager apps Deployment  [{FieldValueInvalid Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"alibaba-cloud-controller-manager"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable spec.selector}] 0}) %!e(int32=422)}}

$ oc logs -f cluster-cloud-controller-manager-operator-675dcd89f-xcmfj -n openshift-cloud-controller-manager-operator -c cluster-cloud-controller-manager
E0621 03:28:14.702771       1 clusteroperator_controller.go:114] Unable to sync operands: Deployment.apps "alibaba-cloud-controller-manager" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"alibaba-cloud-controller-manager"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
E0621 03:28:14.714956       1 controller.go:317] CCMOperator/controller/clusteroperator "msg"="Reconciler error" "error"="Deployment.apps \"alibaba-cloud-controller-manager\" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{\"app\":\"alibaba-cloud-controller-manager\"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable" "name"="cloud-controller-manager" "namespace"="" "reconciler group"="config.openshift.io" "reconciler kind"="ClusterOperator"
E0621 03:28:14.717576       1 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"cloud-controller-manager.16fa84723a937e51", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"ClusterOperator", Namespace:"", Name:"cloud-controller-manager", UID:"c82b76f6-9eef-45c0-bdab-36fbadd9428e", APIVersion:"config.openshift.io/v1", ResourceVersion:"46315", FieldPath:""}, Reason:"Status degraded", Message:"Deployment.apps \"alibaba-cloud-controller-manager\" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{\"app\":\"alibaba-cloud-controller-manager\"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable", Source:v1.EventSource{Component:"cloud-controller-manager-operator", Host:""}, FirstTimestamp:time.Date(2022, time.June, 21, 3, 17, 19, 155961425, time.Local), LastTimestamp:time.Date(2022, time.June, 21, 3, 28, 14, 703861030, time.Local), Count:18, Type:"Warning", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "cloud-controller-manager.16fa84723a937e51" is forbidden: User "system:serviceaccount:openshift-cloud-controller-manager-operator:cluster-cloud-controller-manager" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)

IBMCloud
$ oc get co
NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.11.0-0.nightly-2022-06-15-222801   True        False         False      48m
baremetal                                  4.11.0-0.nightly-2022-06-15-222801   True        False         False      77m
cloud-controller-manager                   4.10.0-0.nightly-2022-06-08-150219   True        False         True       89m     Failed to resync for operator: 4.10.0-0.nightly-2022-06-08-150219 because &{{{%!e(string=) %!e(string=)} {%!e(string=) %!e(string=) %!e(string=) %!e(*int64=<nil>)} %!e(string=Failure) %!e(string=Deployment.apps "ibm-cloud-controller-manager" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"k8s-app":"ibm-cloud-controller-manager"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable) %!e(v1.StatusReason=Invalid) %!e(*v1.StatusDetails=&{ibm-cloud-controller-manager apps Deployment  [{FieldValueInvalid Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"k8s-app":"ibm-cloud-controller-manager"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable spec.selector}] 0}) %!e(int32=422)}}

E0621 03:49:04.081143       1 clusteroperator_controller.go:114] Unable to sync operands: Deployment.apps "ibm-cloud-controller-manager" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"k8s-app":"ibm-cloud-controller-manager"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
E0621 03:49:04.093988       1 controller.go:317] CCMOperator/controller/clusteroperator "msg"="Reconciler error" "error"="Deployment.apps \"ibm-cloud-controller-manager\" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{\"k8s-app\":\"ibm-cloud-controller-manager\"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable" "name"="cloud-controller-manager" "namespace"="" "reconciler group"="config.openshift.io" "reconciler kind"="ClusterOperator"
E0621 03:49:04.103720       1 event.go:267] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"cloud-controller-manager.16fa84fc800ad74a", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"ClusterOperator", Namespace:"", Name:"cloud-controller-manager", UID:"4bfc15ac-9ad5-4bf7-bfe5-994ea2824387", APIVersion:"config.openshift.io/v1", ResourceVersion:"43672", FieldPath:""}, Reason:"Status degraded", Message:"Deployment.apps \"ibm-cloud-controller-manager\" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{\"k8s-app\":\"ibm-cloud-controller-manager\"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable", Source:v1.EventSource{Component:"cloud-controller-manager-operator", Host:""}, FirstTimestamp:time.Date(2022, time.June, 21, 3, 27, 13, 26897738, time.Local), LastTimestamp:time.Date(2022, time.June, 21, 3, 49, 4, 81220572, time.Local), Count:22, Type:"Warning", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "cloud-controller-manager.16fa84fc800ad74a" is forbidden: User "system:serviceaccount:openshift-cloud-controller-manager-operator:cluster-cloud-controller-manager" cannot patch resource "events" in API group "" in the namespace "default"' (will not retry!)

Expected results:
The downgrade completes successfully and cloud-controller-manager does not become degraded.

Additional info:
Alicloud must-gather: https://drive.google.com/file/d/1EWxVnCSqCXguHjDmbcVL2SARGnhkHI0S/view?usp=sharing
IBMCloud must-gather: https://drive.google.com/file/d/1MSOdFMKhf4TS270iI87ycOzAB7-qVddv/view?usp=sharing

Comment 1 Joel Speed 2022-06-21 08:33:18 UTC
We could fix this by backporting https://github.com/openshift/cluster-cloud-controller-manager-operator/pull/174/commits/1bf44f1801d1f33338e7f165b7029726fdfdefdc

I'm not sure whether downgrades are normally supported, though, especially as both Alibaba and IBM Cloud are considered tech preview platforms in both 4.10 and 4.11. Let me check with PM; if they're tech preview, I'd vote to close this as WONTFIX.
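
For context on the error in the description: the API server rejects any update that changes spec.selector on an existing apps/v1 Deployment, so a sync loop that simply applies the 4.10 manifest over the object created by 4.11 keeps failing with "field is immutable". The sketch below is not the linked commit and does not claim to reproduce it. It only illustrates, under stated assumptions, one general way an operator can plan around an immutable selector: compare the selectors first, and delete and recreate the object when they differ. The Deployment type, the Action values, planSync, and the "hypothetical-4.11-label" key are all invented for illustration; only the 4.10 selector value is taken from the log above.

```go
package main

import "fmt"

// Deployment is a hypothetical, stripped-down stand-in for an apps/v1
// Deployment. Selector models spec.selector.matchLabels only.
type Deployment struct {
	Name     string
	Selector map[string]string
}

// Action names what the sync loop should do with the desired object.
type Action string

const (
	Create   Action = "create"
	Update   Action = "update"
	Recreate Action = "delete-and-recreate"
)

// sameLabels reports whether two label maps hold identical key/value pairs.
// A nil map and an empty map are treated as equal.
func sameLabels(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if bv, ok := b[k]; !ok || bv != v {
			return false
		}
	}
	return true
}

// planSync decides how to reconcile desired against existing.
// existing is nil when the object is absent from the cluster.
func planSync(existing *Deployment, desired Deployment) Action {
	if existing == nil {
		return Create
	}
	// An in-place update cannot change the selector, so a differing
	// selector means the old object has to be replaced.
	if !sameLabels(existing.Selector, desired.Selector) {
		return Recreate
	}
	return Update
}

func main() {
	// Selector from the 4.10 manifest, as quoted in the error message.
	desired := Deployment{
		Name:     "alibaba-cloud-controller-manager",
		Selector: map[string]string{"app": "alibaba-cloud-controller-manager"},
	}
	// Assumption: the selector left behind by 4.11 is not shown in this
	// bug, so a placeholder label stands in for it here.
	existing := &Deployment{
		Name:     "alibaba-cloud-controller-manager",
		Selector: map[string]string{"hypothetical-4.11-label": "value"},
	}
	fmt.Println(planSync(existing, desired))
}
```

Deleting and recreating a Deployment briefly removes its pods, so a real operator would have to weigh that against leaving the cluster operator degraded.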

Comment 2 Joel Speed 2022-06-21 10:38:09 UTC
*** Bug 2099496 has been marked as a duplicate of this bug. ***

Comment 7 sunzhaohua 2022-06-30 03:07:30 UTC
Verified
clusterversion: 4.10.21

After downgrading from 4.11.0-0.nightly-2022-06-28-160049 to 4.10.21 on Alicloud and IBMCloud, cloud-controller-manager is no longer degraded.
Alicloud
$ oc get co
NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.21                              True        False         False      69m
baremetal                                  4.10.21                              True        False         False      91m
cloud-controller-manager                   4.10.21                              True        False         False      93m

IBMcloud
$ oc get co
NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.21                              True        False         False      54m
baremetal                                  4.10.21                              True        False         False      77m
cloud-controller-manager                   4.10.21                              True        False         False      80m
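
The check above was done by reading the DEGRADED column by eye. As a rough sketch of how the same check could be scripted, the helper below scans captured `oc get co` text and returns the operators whose DEGRADED column is True. degradedOperators is a hypothetical name, and the parsing assumes the column order shown in this bug (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE, MESSAGE) with a non-empty VERSION on every row; a row with a blank VERSION would shift the columns and be misread.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// degradedOperators returns the names of cluster operators whose DEGRADED
// column reads "True" in whitespace-separated `oc get co` output.
func degradedOperators(table string) []string {
	var names []string
	sc := bufio.NewScanner(strings.NewReader(table))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Skip blank lines, short lines, and the header row.
		if len(fields) < 5 || fields[0] == "NAME" {
			continue
		}
		// fields[4] is the DEGRADED column under the assumed layout.
		if fields[4] == "True" {
			names = append(names, fields[0])
		}
	}
	return names
}

func main() {
	// Sample rows adapted from the outputs in this bug; the MESSAGE
	// text is shortened.
	sample := `NAME                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication             4.11.0-0.nightly-2022-06-15-222801   True        False         False      46m
baremetal                  4.11.0-0.nightly-2022-06-15-222801   True        False         False      64m
cloud-controller-manager   4.10.0-0.nightly-2022-06-08-150219   True        False         True       67m     Failed to resync for operator
`
	fmt.Println(degradedOperators(sample))
}
```

With the sample rows above, only cloud-controller-manager is reported; against the 4.10.21 outputs in this comment the result would be empty.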

Comment 9 errata-xmlrpc 2022-07-07 11:40:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.21 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5428