Bug 1987083 - excludeMastersFromLB in Azure Cloud Config prevents service controller from adding masters
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.9.0
Assignee: Patrick Dillon
QA Contact: Shu Wang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-07-28 20:42 UTC by Patrick Dillon
Modified: 2021-10-18 17:43 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:43:14 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift installer pull 5111 | 0 | None | open | Bug 1987083: Azure: cloud provider config excludeMastersFromStandardLB -> false | 2021-07-28 20:45:31 UTC
Red Hat Product Errata | RHSA-2021:3759 | 0 | None | None | None | 2021-10-18 17:43:15 UTC

Description Patrick Dillon 2021-07-28 20:42:02 UTC
With the move to out-of-tree cloud providers in Azure (4.10) and Azure Stack Hub (4.9), the excludeMasterFromStandardLB: true value in the cloud provider config causes a problem: if a master node restarts, the service controller will not add it back to the load balancer.

This value should be set to false.
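
A minimal sketch of how the setting could be checked on a node, assuming the cloud provider config is the JSON file at /etc/kubernetes/cloud.conf and that the key is spelled excludeMasterFromStandardLB, as it appears in the rendered config later in this report. The function names exclude_master_value and masters_allowed_in_lb are hypothetical, and the pattern match is a convenience, not a full JSON parser.

```shell
#!/bin/sh
# Hypothetical helper: print the value of excludeMasterFromStandardLB
# from a cloud provider config file (JSON). Pattern-based, not a JSON parser.
exclude_master_value() {
    conf="${1:-/etc/kubernetes/cloud.conf}"
    # grep -o prints only the matching '"key": value' fragment;
    # sed then strips everything up to and including the colon.
    grep -o '"excludeMasterFromStandardLB"[[:space:]]*:[[:space:]]*[a-z]*' "$conf" |
        sed 's/.*:[[:space:]]*//'
}

# Hypothetical helper: succeed only when the value is explicitly false.
# A missing key or unreadable file is treated as a failure by this sketch.
masters_allowed_in_lb() {
    [ "$(exclude_master_value "$1")" = "false" ]
}
```

Run on a master node (for example inside `chroot /host`), `masters_allowed_in_lb && echo ok` would print `ok` only when the file explicitly sets the value to false.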

Comment 2 Shu Wang 2021-08-02 07:40:29 UTC
Verified fixed.
Verified with 4.9 nightly build 4.9.0-0.nightly-2021-08-01-223336: after restarting the master, the service controller added it back to the load balancer.
Created a related test case: 
https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-4317

Comment 3 Shu Wang 2021-08-02 08:59:32 UTC
Updated the test case link:
https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-43176

Comment 4 Johnny Liu 2021-08-27 07:41:27 UTC
Adding some more verification steps (using build 4.9.0-0.nightly-2021-08-26-040328), based on comment 2.

[root@preserve-jialiu-ansible ~]# oc debug node/qeci-26032-h5ngk-master-0
Starting pod/qeci-26032-h5ngk-master-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.0.7
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# ps -ef|grep kubelet
root        1988       1 19 04:20 ?        00:37:05 kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig --kubeconfig=/var/lib/kubelet/kubeconfig --container-runtime=remote --container-runtime-endpoint=/var/run/crio/crio.sock --runtime-cgroups=/system.slice/crio.service --node-labels=node-role.kubernetes.io/master,node.openshift.io/os_id=rhcos --node-ip= --minimum-container-ttl-duration=6m0s --cloud-provider=azure --volume-plugin-dir=/etc/kubernetes/kubelet-plugins/volume/exec --cloud-config=/etc/kubernetes/cloud.conf --hostname-override= --register-with-taints=node-role.kubernetes.io/master=:NoSchedule --pod-infra-container-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:347702e4f91395e1f3d4cbae92248fd164e58f577da8c453a9d0b225f867426b --system-reserved=cpu=500m,memory=1Gi --v=2
sh-4.4# cat /etc/kubernetes/cloud.conf
{
	"cloud": "AzurePublicCloud",
	"tenantId": "6047c7e9-b2ad-488d-a54e-dc3f6be6a7ee",
	"aadClientId": "",
	"aadClientSecret": "",
	"aadClientCertPath": "",
	"aadClientCertPassword": "",
	"useManagedIdentityExtension": true,
	"userAssignedIdentityID": "",
	"subscriptionId": "53b8f551-f0fc-4bea-8cba-6d1fefd54c8a",
	"resourceGroup": "qeci-26032-h5ngk-rg",
	"location": "centralus",
	"vnetName": "qeci-26032-h5ngk-vnet",
	"vnetResourceGroup": "qeci-26032-h5ngk-rg",
	"subnetName": "qeci-26032-h5ngk-worker-subnet",
	"securityGroupName": "qeci-26032-h5ngk-nsg",
	"routeTableName": "qeci-26032-h5ngk-node-routetable",
	"primaryAvailabilitySetName": "",
	"vmType": "",
	"primaryScaleSetName": "",
	"cloudProviderBackoff": true,
	"cloudProviderBackoffRetries": 0,
	"cloudProviderBackoffExponent": 0,
	"cloudProviderBackoffDuration": 6,
	"cloudProviderBackoffJitter": 0,
	"cloudProviderRateLimit": false,
	"cloudProviderRateLimitQPS": 0,
	"cloudProviderRateLimitBucket": 0,
	"cloudProviderRateLimitQPSWrite": 0,
	"cloudProviderRateLimitBucketWrite": 0,
	"useInstanceMetadata": true,
	"loadBalancerSku": "standard",
	"excludeMasterFromStandardLB": false,
	"disableOutboundSNAT": null,
	"maximumLoadBalancerRuleCount": 0
}sh-4.4# cat /etc/kubernetes/cloud.conf|grep excludeMasterFromStandardLB
	"excludeMasterFromStandardLB": false,
sh-4.4# exit
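
The manual check above could be repeated across every master. The following is a sketch under assumptions, not part of the original verification: it assumes a logged-in `oc` client with permission to start debug pods, the node-role.kubernetes.io/master label shown in the kubelet arguments above, and the non-interactive form `oc debug node/<name> -- chroot /host <command>`. The function name check_all_masters is hypothetical, and stderr is discarded on the assumption that the debug pod's informational messages are written there.

```shell
#!/bin/sh
# Hypothetical helper: show the excludeMasterFromStandardLB line from
# cloud.conf on every master node. Requires a logged-in `oc` client.
check_all_masters() {
    status=0
    for node in $(oc get nodes -l node-role.kubernetes.io/master -o name); do
        # `oc get -o name` yields entries such as node/<name>,
        # which `oc debug` accepts directly.
        line=$(oc debug "$node" -- chroot /host \
            grep excludeMasterFromStandardLB /etc/kubernetes/cloud.conf 2>/dev/null)
        echo "$node: $line"
        case "$line" in
            *false*) ;;        # expected value after the fix
            *) status=1 ;;     # true, missing, or the debug pod failed
        esac
    done
    return "$status"
}
```

The function returns non-zero if any master does not report false, so it could be used as a simple gate in a verification script.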

Comment 7 errata-xmlrpc 2021-10-18 17:43:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

