Bug 1779348
| Field | Value |
|---|---|
| Summary | MCO should be updated based on the backport of reserved-cpus feature |
| Product | OpenShift Container Platform |
| Reporter | Vladik Romanovsky <vromanso> |
| Component | Machine Config Operator |
| Assignee | Sinny Kumari <skumari> |
| Status | CLOSED ERRATA |
| QA Contact | Michael Nguyen <mnguyen> |
| Severity | unspecified |
| Docs Contact | |
| Priority | unspecified |
| Version | 4.3.0 |
| CC | amurdaca, augol, dblack, dshchedr, eparis, fsimonce, ksinny, msivak, mvirgil, schoudha, skumari, smilner |
| Target Milestone | --- |
| Target Release | 4.3.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | If docs needed, set a value |
| Doc Text | |
| Story Points | --- |
| Clone Of | |
| | 1782893 (view as bug list) |
| Environment | |
| Last Closed | 2020-01-23 11:17:46 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | 1775826, 1782893 |
| Bug Blocks | 1771572 |
Description (Vladik Romanovsky, 2019-12-03 19:42:34 UTC)
It appears that MCO is following openshift/kubernetes/origin-4.3-kubernetes-1.16.0. I've opened a backport[1] of the reserved-cpus[2] patch there as well. Once this is merged, I'll be able to post a PR to MCO to update its vendor dir.

[1] https://github.com/openshift/kubernetes/pull/101
[2] https://github.com/kubernetes/kubernetes/pull/83592

After several discussions, I've been told that the correct path is to backport the reserved-cpus PR[1] to the origin/master branch. This is the only way to update the openshift/kubernetes/origin-4.3-kubernetes-1.16.2 branch. I've opened a PR to address this[2]. Once [2] is merged, I will need to push a PR to MCO to retarget the openshift/kubernetes branch from 1.16.0 to 1.16.2.

[1] https://github.com/kubernetes/kubernetes/pull/83592
[2] https://github.com/openshift/origin/pull/24257

As far as I can see, the 4.3 branch of the Machine Config Operator vendors v1.16.0-beta.0.0.20190913145653 from the github.com/openshift/kubernetes repo (https://github.com/openshift/machine-config-operator/blob/release-4.3/go.mod#L107). How exactly do changes to the openshift/origin master branch flow into the openshift/kubernetes/origin-4.3-kubernetes-1.16.2 branch? Is there any documentation which explains this and can be used as a reference?

Also, we may need to update the MCO vendor directory in the master branch first to make sure the changes work fine. The MCO master branch vendors kubernetes v1.16.0-beta.0.0.20190913145653 as well.

(In reply to Sinny Kumari from comment #3)

In BZ1775826 we backported a reserved-cpus PR to origin/release-4.3[1]. However, this code wasn't in origin/master and wasn't copied to the openshift/kubernetes branch. Therefore, I've opened a backport[2] to the origin/master branch; once it is merged, [1] will be copied to openshift/kubernetes/origin-4.3-kubernetes-1.16.2. Once this happens, we will need to make the 4.3 branch of the Machine Config Operator vendor the 1.16.2 branch from github.com/openshift/kubernetes.

I'm not aware of any documentation on the subject. The above steps have been taken following a conversation with @rphillips @sjenning @eparis.

Vladik

[1] https://github.com/openshift/origin/pull/24224
[2] https://github.com/openshift/origin/pull/24257

Being relatively new to the OpenShift process, it is a bit confusing to me why we need to update openshift/origin first instead of directly updating the required openshift/kubernetes branch. Thanks Vladik for the explanation.
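For readers unfamiliar with the vendoring step being discussed: the "retarget from 1.16.0 to 1.16.2" change amounts to bumping the github.com/openshift/kubernetes pin in MCO's go.mod and regenerating the committed vendor/ tree. A minimal sketch of that workflow using standard Go module commands; the branch name comes from the comments above, but the exact go.mod layout (require vs. replace directive) and the resolved pseudo-version are assumptions, not taken from this bug:

    # From a machine-config-operator checkout on the target release branch.
    # Resolve the newer openshift/kubernetes branch to a pseudo-version and record it in go.mod/go.sum:
    go get github.com/openshift/kubernetes@origin-4.3-kubernetes-1.16.2

    # If the dependency is instead wired in through a replace directive, point that
    # directive at the branch-resolved pseudo-version, e.g.:
    #   go mod edit -replace k8s.io/kubernetes=github.com/openshift/kubernetes@<resolved-pseudo-version>

    # Regenerate the vendor directory that MCO commits to the repository:
    go mod tidy
    go mod vendor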
Verified on nightly build of 4.3

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2019-12-18-145749   True        False         19m     Cluster version is 4.3.0-0.nightly-2019-12-18-145749

$ cat ../kubeletconfig.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cpumanager-enabled
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: enabled
  kubeletConfig:
    reservedSystemCPUs: 0,2
    cpuManagerPolicy: static
    cpuManagerReconcilePeriod: 5s

$ oc label mcp master custom-kubelet=enabled
machineconfigpool.machineconfiguration.openshift.io/master labeled

$ oc get mcp/master --show-labels
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   LABELS
master   rendered-master-38cf6c0a1cec739719f5bf7b7bebc7ad   True      False      False      3              3                   3                     0                      custom-kubelet=enabled,machineconfiguration.openshift.io/mco-built-in=,operator.machineconfiguration.openshift.io/required-for-upgrade=

$ oc apply -f ../kubeletconfig.yaml
kubeletconfig.machineconfiguration.openshift.io/cpumanager-enabled created

$ oc get kubeletconfig
NAME                 AGE
cpumanager-enabled   5s

$ oc get nodes
NAME                           STATUS                     ROLES    AGE   VERSION
ip-10-0-130-157.ec2.internal   Ready                      worker   28m   v1.16.2
ip-10-0-141-196.ec2.internal   Ready,SchedulingDisabled   master   39m   v1.16.2
ip-10-0-147-46.ec2.internal    Ready                      worker   28m   v1.16.2
ip-10-0-155-125.ec2.internal   Ready                      master   39m   v1.16.2
ip-10-0-162-50.ec2.internal    Ready                      master   38m   v1.16.2
ip-10-0-169-122.ec2.internal   Ready                      worker   28m   v1.16.2

[mnguyen@pet30 4.3]$ oc get nodes
NAME                           STATUS                     ROLES    AGE   VERSION
ip-10-0-130-157.ec2.internal   Ready                      worker   30m   v1.16.2
ip-10-0-141-196.ec2.internal   Ready,SchedulingDisabled   master   41m   v1.16.2
ip-10-0-147-46.ec2.internal    Ready                      worker   30m   v1.16.2
ip-10-0-155-125.ec2.internal   Ready                      master   40m   v1.16.2
ip-10-0-162-50.ec2.internal    Ready                      master   40m   v1.16.2
ip-10-0-169-122.ec2.internal   Ready                      worker   30m   v1.16.2

[mnguyen@pet30 4.3]$ oc get nodes
NAME                           STATUS                     ROLES    AGE   VERSION
ip-10-0-130-157.ec2.internal   Ready                      worker   33m   v1.16.2
ip-10-0-141-196.ec2.internal   Ready                      master   45m   v1.16.2
ip-10-0-147-46.ec2.internal    Ready                      worker   33m   v1.16.2
ip-10-0-155-125.ec2.internal   Ready,SchedulingDisabled   master   44m   v1.16.2
ip-10-0-162-50.ec2.internal    Ready                      master   44m   v1.16.2
ip-10-0-169-122.ec2.internal   Ready                      worker   33m   v1.16.2
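At this point the master MachineConfigPool rolls the new kubelet configuration out one node at a time, which is why a different master shows SchedulingDisabled in each of the snapshots above. A few optional checks that can be run while waiting for the rollout; these are ordinary oc commands and were not part of the original verification transcript:

    # Watch the master pool until UPDATED returns to True after the rolling reboot:
    oc get mcp master -w

    # The KubeletConfig controller renders an additional MachineConfig for the pool;
    # the generated name may vary, so just grep for it:
    oc get machineconfigs | grep kubelet

    # Inspect the KubeletConfig status conditions to confirm it was applied successfully:
    oc get kubeletconfig cpumanager-enabled -o yaml

Once the pool has settled, a node can be inspected directly, as shown below.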
$ oc debug node/ip-10-0-141-196.ec2.internal
Starting pod/ip-10-0-141-196ec2internal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
chroot /host
sh-4.4# cat /etc/kubernetes/kubelet.conf
{"kind":"KubeletConfiguration","apiVersion":"kubelet.config.k8s.io/v1beta1","staticPodPath":"/etc/kubernetes/manifests","syncFrequency":"0s","fileCheckFrequency":"0s","httpCheckFrequency":"0s","rotateCertificates":true,"serverTLSBootstrap":true,"authentication":{"x509":{"clientCAFile":"/etc/kubernetes/kubelet-ca.crt"},"webhook":{"cacheTTL":"0s"},"anonymous":{"enabled":false}},"authorization":{"webhook":{"cacheAuthorizedTTL":"0s","cacheUnauthorizedTTL":"0s"}},"clusterDomain":"cluster.local","clusterDNS":["172.30.0.10"],"streamingConnectionIdleTimeout":"0s","nodeStatusUpdateFrequency":"0s","nodeStatusReportFrequency":"0s","imageMinimumGCAge":"0s","volumeStatsAggPeriod":"0s","cgroupDriver":"systemd","cpuManagerPolicy":"static","cpuManagerReconcilePeriod":"5s","runtimeRequestTimeout":"0s","maxPods":250,"kubeAPIQPS":50,"kubeAPIBurst":100,"serializeImagePulls":false,"evictionPressureTransitionPeriod":"0s","featureGates":{"LegacyNodeRoleBehavior":false,"NodeDisruptionExclusion":true,"RotateKubeletServerCertificate":true,"ServiceNodeExclusion":true,"SupportPodPidsLimit":true},"containerLogMaxSize":"50Mi","systemReserved":{"cpu":"500m","memory":"500Mi"},"reservedSystemCPUs":"0,2"}
sh-4.4# journalctl -t hyperkube | grep reserv
Dec 18 18:24:14 ip-10-0-141-196 hyperkube[2047]: I1218 18:24:14.381830 2047 flags.go:33] FLAG: --kube-reserved=""
Dec 18 18:24:14 ip-10-0-141-196 hyperkube[2047]: I1218 18:24:14.381839 2047 flags.go:33] FLAG: --kube-reserved-cgroup=""
Dec 18 18:24:14 ip-10-0-141-196 hyperkube[2047]: I1218 18:24:14.382209 2047 flags.go:33] FLAG: --qos-reserved=""
Dec 18 18:24:14 ip-10-0-141-196 hyperkube[2047]: I1218 18:24:14.382300 2047 flags.go:33] FLAG: --reserved-cpus=""
Dec 18 18:24:14 ip-10-0-141-196 hyperkube[2047]: I1218 18:24:14.382530 2047 flags.go:33] FLAG: --system-reserved=""
Dec 18 18:24:14 ip-10-0-141-196 hyperkube[2047]: I1218 18:24:14.382538 2047 flags.go:33] FLAG: --system-reserved-cgroup=""
Dec 18 19:08:19 ip-10-0-141-196 hyperkube[2302]: I1218 19:08:19.147085 2302 flags.go:33] FLAG: --kube-reserved=""
Dec 18 19:08:19 ip-10-0-141-196 hyperkube[2302]: I1218 19:08:19.147091 2302 flags.go:33] FLAG: --kube-reserved-cgroup=""
Dec 18 19:08:19 ip-10-0-141-196 hyperkube[2302]: I1218 19:08:19.147311 2302 flags.go:33] FLAG: --qos-reserved=""
Dec 18 19:08:19 ip-10-0-141-196 hyperkube[2302]: I1218 19:08:19.147363 2302 flags.go:33] FLAG: --reserved-cpus=""
Dec 18 19:08:19 ip-10-0-141-196 hyperkube[2302]: I1218 19:08:19.147494 2302 flags.go:33] FLAG: --system-reserved=""
Dec 18 19:08:19 ip-10-0-141-196 hyperkube[2302]: I1218 19:08:19.147500 2302 flags.go:33] FLAG: --system-reserved-cgroup=""
Dec 18 19:08:19 ip-10-0-141-196 hyperkube[2302]: I1218 19:08:19.452529 2302 server.go:679] Option --reserved-cpus is specified, it will overwrite the cpu setting in KubeReserved="map[]", SystemReserved="map[cpu:500m memory:500Mi]".
Dec 18 19:08:19 ip-10-0-141-196 hyperkube[2302]: I1218 19:08:19.462598 2302 policy_static.go:110] [cpumanager] reserved 2 CPUs ("0,2") not available for exclusive assignment
sh-4.4# exit
exit
sh-4.2# exit
exit
Removing debug pod ...

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062