Description of problem:

The wrong interface is selected to connect to the worker.

The openstack servers created are:

$ openstack server list
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------------+--------------------+--------+
| ID                                   | Name                        | Status | Networks                                                     | Image              | Flavor |
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------------+--------------------+--------+
| c9eb09a2-adf5-4906-b57b-d0cdd0835ee9 | ostest-xp2wj-worker-0-vtnvj | ACTIVE | StorageNFS=172.17.5.213; ostest-xp2wj-openshift=10.196.1.42  | ostest-xp2wj-rhcos |        |
| 50dee8e8-9bd4-46d2-89f4-1939108e9a48 | ostest-xp2wj-worker-0-8kp42 | ACTIVE | StorageNFS=172.17.5.181; ostest-xp2wj-openshift=10.196.1.247 | ostest-xp2wj-rhcos |        |
| b6c16079-8117-48db-a777-2e10545587e9 | ostest-xp2wj-worker-0-5nbxp | ACTIVE | StorageNFS=172.17.5.199; ostest-xp2wj-openshift=10.196.1.151 | ostest-xp2wj-rhcos |        |
| 7c43bc0a-bcca-429c-bbd3-fabe9901dd35 | ostest-xp2wj-master-2       | ACTIVE | ostest-xp2wj-openshift=10.196.3.145                          | ostest-xp2wj-rhcos |        |
| 3cbb090c-96c5-4f0b-98a8-75707504d3d7 | ostest-xp2wj-master-1       | ACTIVE | ostest-xp2wj-openshift=10.196.0.41                           | ostest-xp2wj-rhcos |        |
| a6bc10d9-866f-4864-b9c6-e54b5853d0ed | ostest-xp2wj-master-0       | ACTIVE | ostest-xp2wj-openshift=10.196.2.254                          | ostest-xp2wj-rhcos |        |
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------------+--------------------+--------+

The workers have two IPs: one on StorageNFS (for Manila) and the other on the regular machine subnet. However, one of the workers is using the StorageNFS address when creating a debug pod on it (and failing):

$ oc debug node/ostest-xp2wj-worker-0-vtnvj
Starting pod/ostest-xp2wj-worker-0-vtnvj-debug ...
To use host binaries, run `chroot /host`
Pod IP: 172.17.5.213
If you don't see a command prompt, try pressing enter.

Removing debug pod ...
Error from server: error dialing backend: dial tcp 172.17.5.213:10250: i/o timeout

Moreover, the pods running on that problematic worker cannot be accessed:

$ oc get pods -n demo -o wide
NAME                    READY   STATUS    RESTARTS   AGE    IP            NODE                          NOMINATED NODE   READINESS GATES
demo-7897db69cc-4zlvj   1/1     Running   0          3h3m   10.131.0.26   ostest-xp2wj-worker-0-8kp42   <none>           <none>
demo-7897db69cc-d2g2n   1/1     Running   0          3h3m   10.129.2.46   ostest-xp2wj-worker-0-vtnvj   <none>           <none>
demo-7897db69cc-zdngv   1/1     Running   0          3h3m   10.128.2.13   ostest-xp2wj-worker-0-5nbxp   <none>           <none>

(shiftstack) [stack@undercloud-0 ~]$ oc rsh -n demo demo-7897db69cc-d2g2n
Error from server: error dialing backend: dial tcp 172.17.5.213:10250: i/o timeout
(shiftstack) [stack@undercloud-0 ~]$

The other two workers work fine:

$ oc debug node/ostest-xp2wj-worker-0-5nbxp
Starting pod/ostest-xp2wj-worker-0-5nbxp-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.196.1.151
If you don't see a command prompt, try pressing enter.
sh-4.4#

I observed that they have their IPs defined in a different order in their status section:

- Problematic worker:

(shiftstack) [stack@undercloud-0 ~]$ oc get node/ostest-xp2wj-worker-0-vtnvj -o json | jq .status.addresses
[
  {
    "address": "172.17.5.213",
    "type": "InternalIP"
  },
  {
    "address": "10.196.1.42",
    "type": "InternalIP"
  },
  {
    "address": "ostest-xp2wj-worker-0-vtnvj",
    "type": "Hostname"
  }
]

- The other one:

$ oc get node/ostest-xp2wj-worker-0-5nbxp -o json | jq .status.addresses
[
  {
    "address": "10.196.1.151",
    "type": "InternalIP"
  },
  {
    "address": "172.17.5.199",
    "type": "InternalIP"
  },
  {
    "address": "ostest-xp2wj-worker-0-5nbxp",
    "type": "Hostname"
  }
]

It is also observed that the node has only one InternalIP in its status when externalCCM is not enabled, even though the server has two interfaces:

$ openstack server list
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------------+--------------------+--------+
| ID                                   | Name                        | Status | Networks                                                     | Image              | Flavor |
+--------------------------------------+-----------------------------+--------+--------------------------------------------------------------+--------------------+--------+
| 6192993a-cc9d-4e65-b0e3-ddf4828e2e24 | ostest-ngz6v-worker-0-f7xzq | ACTIVE | StorageNFS=172.17.5.158; ostest-ngz6v-openshift=10.196.1.154 | ostest-ngz6v-rhcos |        |

and:

$ oc get node/ostest-ngz6v-worker-0-f7xzq -o json | jq .status.addresses
[
  {
    "address": "10.196.1.154",
    "type": "InternalIP"
  },
  {
    "address": "ostest-ngz6v-worker-0-f7xzq",
    "type": "Hostname"
  }
]


Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2021-12-06-123512

How reproducible:
Random

Steps to Reproduce:
Install OCP 4.10 with externalCCM enabled.

Actual results:
The worker node cannot be reached.

Expected results:
The worker node and the pods running on it can be accessed.

Additional info:
must-gather: http://file.rdu.redhat.com/rlobillo/must-gather-install.tar.gz
This bug feels like a can of worms.

Firstly, I don't actually know how oc debug works, but to make my assumptions explicit, I am guessing it:

* Starts a pod on a node with host networking
* Uses the PodIP assigned to that pod by kubelet
* Connects to the PodIP internally from... presumably an apiserver.

The docs [1] describe InternalIP as "Typically the IP address of the node that is routable only within the cluster", but also "The usage of these fields varies depending on your cloud provider or bare metal configuration.". This seems a bit vague, but my interpretation is that it lists endpoint addresses for this node for internal communication, and a storage network exposed on a subset of nodes probably doesn't meet that criterion. The comment on GetNodeHostIPs() in kubernetes/kubernetes [2] suggests it's the immediate source of oc debug's errant IP.

We should separate the 2 different lists of Addresses here:

The Machine object has a list of Addresses. These are not directly used by kubernetes, but define a list of IP addresses which will be approved if requested in a CSR generated by the kubelet running on the host. This list is generated by CAPO.

The Node object has a list of Addresses. These are the actual addresses used by kubernetes. These must be a (non-strict) subset of the Addresses defined on the Machine object, or kubelet will fail to come up when its CSR is not approved. These addresses are written by either kubelet (legacy cloud provider) or the CCM (external cloud provider).

This bug concerns the list of addresses on the Node object, and is therefore a cloud provider issue, not a CAPO issue.

For investigation:

* What's the implementation difference between legacy cloud provider (OpenStack) and CCM (OpenStack)?
* What metadata is available to CCM to distinguish 'cluster' network(s) from infra networks?

[1] https://kubernetes.io/docs/concepts/architecture/nodes/#addresses
[2] https://github.com/kubernetes/kubernetes/blob/cc6f12583f2b611e9469a6b2e0247f028aae246b/pkg/util/node/node.go#L89-L93
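To make the failure mode concrete, here is a minimal sketch of my reading of GetNodeHostIPs() [2]. It is not the real implementation (dual-stack handling and error cases are simplified), but it shows why the ordering of node.status.addresses decides which IP gets dialled:

package main

import (
    "fmt"
    "net"

    v1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// firstHostIP is a simplified stand-in for GetNodeHostIPs() [2]: it returns
// the first InternalIP listed in node.status.addresses, falling back to the
// first ExternalIP. Address order therefore decides which IP gets dialled.
func firstHostIP(node *v1.Node) (net.IP, error) {
    for _, addrType := range []v1.NodeAddressType{v1.NodeInternalIP, v1.NodeExternalIP} {
        for _, addr := range node.Status.Addresses {
            if addr.Type == addrType {
                if ip := net.ParseIP(addr.Address); ip != nil {
                    return ip, nil
                }
            }
        }
    }
    return nil, fmt.Errorf("node %s has no usable InternalIP or ExternalIP", node.Name)
}

func main() {
    // Addresses exactly as reported for the problematic worker above.
    node := &v1.Node{
        ObjectMeta: metav1.ObjectMeta{Name: "ostest-xp2wj-worker-0-vtnvj"},
        Status: v1.NodeStatus{Addresses: []v1.NodeAddress{
            {Type: v1.NodeInternalIP, Address: "172.17.5.213"},
            {Type: v1.NodeInternalIP, Address: "10.196.1.42"},
            {Type: v1.NodeHostName, Address: "ostest-xp2wj-worker-0-vtnvj"},
        }},
    }
    ip, _ := firstHostIP(node)
    fmt.Println(ip) // 172.17.5.213 -- the StorageNFS address, not the machine subnet one
}

So the real question is why the CCM writes the StorageNFS address first for some nodes and second for others.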
Setting the priority to "medium". This must be properly investigated (and possibly resolved) before GA, which is not imminent.
NodeAddresses are generated quite differently on the legacy vs the external cloud provider.

Legacy cloud provider:
https://github.com/kubernetes/legacy-cloud-providers/blob/1a100831c5a0718b3ef6ae88bf506d383d387b45/openstack/openstack.go#L565-L626

External cloud provider:
https://github.com/kubernetes/cloud-provider-openstack/blob/d46aa87534042ad1e26b812d1ef1aa140317a25e/pkg/openstack/instances.go#L458-L565

where interfaces is provided by:
https://github.com/kubernetes/cloud-provider-openstack/blob/d46aa87534042ad1e26b812d1ef1aa140317a25e/pkg/openstack/instances.go#L611-L629
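My rough paraphrase of the external provider code above, heavily simplified (the port type and helper below are made up for illustration, and options such as public-network-name and address sorting are ignored): unless internal-network-name is set in cloud.conf, every fixed IP on every attached port is reported as an InternalIP, in whatever order the ports come back, which is presumably how the StorageNFS address can land first on some nodes.

package main

import (
    "fmt"

    v1 "k8s.io/api/core/v1"
)

// port is a stand-in for the gophercloud interface/port data the real code
// iterates over; it is not a real cloud-provider-openstack type.
type port struct {
    NetworkName string
    FixedIPs    []string
}

// nodeAddressesSketch approximates the external provider's behaviour: with no
// internal-network-name filter, every fixed IP on every port becomes an
// InternalIP, so the resulting order depends entirely on port ordering.
func nodeAddressesSketch(ports []port, internalNetworks map[string]bool) []v1.NodeAddress {
    var addrs []v1.NodeAddress
    for _, p := range ports {
        if len(internalNetworks) > 0 && !internalNetworks[p.NetworkName] {
            continue // filtered out once internal-network-name is configured
        }
        for _, ip := range p.FixedIPs {
            addrs = append(addrs, v1.NodeAddress{Type: v1.NodeInternalIP, Address: ip})
        }
    }
    return addrs
}

func main() {
    ports := []port{
        {NetworkName: "StorageNFS", FixedIPs: []string{"172.17.5.213"}},
        {NetworkName: "ostest-xp2wj-openshift", FixedIPs: []string{"10.196.1.42"}},
    }
    fmt.Println(nodeAddressesSketch(ports, nil))                                             // StorageNFS IP listed first
    fmt.Println(nodeAddressesSketch(ports, map[string]bool{"ostest-xp2wj-openshift": true})) // machine-subnet IP only
}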
From git spelunking, this seems relevant: https://github.com/kubernetes/cloud-provider-openstack/issues/407
Ok, we need to fix this by setting internal-network-name in cloud.conf. We currently don't have a mechanism to customise cloud.conf, so it's not yet possible to fix. However, we need to allow this anyway as a matter of urgency before we can GA this feature. Once we have the ability to customise cloud.conf this should be a relatively simple fix.
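For reference, cloud-provider-openstack reads this from the [Networking] section of cloud.conf, along these lines (the network name here is taken from the reproducer above and is purely illustrative):

[Networking]
internal-network-name = ostest-xp2wj-openshift

which is exactly the kind of customisation we have no mechanism to express today.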
Removing the Triaged keyword because:
* the QE automation assessment (flag qe_test_coverage) is missing
*** Bug 2045493 has been marked as a duplicate of this bug. ***
*** Bug 2043659 has been marked as a duplicate of this bug. ***
Verified on 4.11.0-0.nightly-2022-04-08-205307 on top of RHOS-16.2-RHEL-8-20220311.n.1 by installing an OCP cluster with eCCM enabled via the featureGate:

$ oc get featureGate/cluster -o yaml
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2022-04-11T13:29:12Z"
  generation: 1
  name: cluster
  resourceVersion: "1379"
  uid: 5766ece9-3dcc-4982-b355-f4c37d739ab9
spec:
  customNoUpgrade:
    enabled:
    - ExternalCloudProvider
  featureSet: CustomNoUpgrade

$ oc get pods -n openshift-cloud-controller-manager
NAME                                                  READY   STATUS    RESTARTS   AGE
openstack-cloud-controller-manager-7f7f67c5f8-lb4wx   1/1     Running   0          65m
openstack-cloud-controller-manager-7f7f67c5f8-sh9h9   1/1     Running   0          65m

---------------------------

Once the cluster is deployed through IPI, all the machines include the primarySubnet:

$ oc get machine -n openshift-machine-api -o json | jq .items[].spec.providerSpec.value.primarySubnet
"f94ecb70-604a-447f-896b-6fc40b045e4c"
"f94ecb70-604a-447f-896b-6fc40b045e4c"
"f94ecb70-604a-447f-896b-6fc40b045e4c"
"f94ecb70-604a-447f-896b-6fc40b045e4c"
"f94ecb70-604a-447f-896b-6fc40b045e4c"
"f94ecb70-604a-447f-896b-6fc40b045e4c"

And the workers have two networks defined:

$ oc get machines -n openshift-machine-api -l machine.openshift.io/cluster-api-machine-role=worker -o json | jq .items[].spec.providerSpec.value.networks
[
  {
    "filter": {},
    "subnets": [
      {
        "filter": {},
        "uuid": "f94ecb70-604a-447f-896b-6fc40b045e4c"
      }
    ]
  },
  {
    "filter": {},
    "noAllowedAddressPairs": true,
    "uuid": "b37bbd3d-e5f9-45ce-a9f9-6934f8f7d0ac"
  }
]
[
  {
    "filter": {},
    "subnets": [
      {
        "filter": {},
        "uuid": "f94ecb70-604a-447f-896b-6fc40b045e4c"
      }
    ]
  },
  {
    "filter": {},
    "noAllowedAddressPairs": true,
    "uuid": "b37bbd3d-e5f9-45ce-a9f9-6934f8f7d0ac"
  }
]
[
  {
    "filter": {},
    "subnets": [
      {
        "filter": {},
        "uuid": "f94ecb70-604a-447f-896b-6fc40b045e4c"
      }
    ]
  },
  {
    "filter": {},
    "noAllowedAddressPairs": true,
    "uuid": "b37bbd3d-e5f9-45ce-a9f9-6934f8f7d0ac"
  }
]

where the primarySubnet is:

$ openstack subnet list | grep f94ecb70-604a-447f-896b-6fc40b045e4c
| f94ecb70-604a-447f-896b-6fc40b045e4c | restricted_subnet | 059e58b8-fd1c-41d1-b44c-d7fced04d078 | 172.16.0.0/24 |

and the secondary network is the one used for integrating with Manila:

$ openstack subnet list | grep b37bbd3d-e5f9-45ce-a9f9-6934f8f7d0ac
| 5dbe57da-73ea-457f-b044-8f05459d9368 | StorageNFSSubnet  | b37bbd3d-e5f9-45ce-a9f9-6934f8f7d0ac | 172.17.5.0/24 |

As expected, the nodes only include the IP defined as primary:

$ oc get nodes -o json | jq '.items[].status.addresses[]'
{
  "address": "172.16.0.67",
  "type": "InternalIP"
}
{
  "address": "ostest-f97g7-master-0",
  "type": "Hostname"
}
{
  "address": "172.16.0.50",
  "type": "InternalIP"
}
{
  "address": "ostest-f97g7-master-1",
  "type": "Hostname"
}
{
  "address": "172.16.0.87",
  "type": "InternalIP"
}
{
  "address": "ostest-f97g7-master-2",
  "type": "Hostname"
}
{
  "address": "172.16.0.59",
  "type": "InternalIP"
}
{
  "address": "ostest-f97g7-worker-0-5m8x2",
  "type": "Hostname"
}
{
  "address": "172.16.0.210",
  "type": "InternalIP"
}
{
  "address": "ostest-f97g7-worker-0-7tzcq",
  "type": "Hostname"
}
{
  "address": "172.16.0.203",
  "type": "InternalIP"
}
{
  "address": "ostest-f97g7-worker-0-rh8x8",
  "type": "Hostname"
}

and the openstack instances have the two interfaces attached:

$ openstack server list -c Name -c Networks
+-----------------------------+-------------------------------------------------------------------------------------+
| Name                        | Networks                                                                            |
+-----------------------------+-------------------------------------------------------------------------------------+
| ostest-f97g7-worker-0-7tzcq | StorageNFS=172.17.5.220; restricted_network=172.16.0.210                            |
| ostest-f97g7-worker-0-5m8x2 | StorageNFS=172.17.5.175; restricted_network=172.16.0.59                             |
| ostest-f97g7-worker-0-rh8x8 | StorageNFS=172.17.5.162; restricted_network=172.16.0.203                            |
| ostest-f97g7-master-2       | restricted_network=172.16.0.87                                                      |
| ostest-f97g7-master-1       | restricted_network=172.16.0.50                                                      |
| ostest-f97g7-master-0       | restricted_network=172.16.0.67                                                      |
| installer_host              | installer_host-network=172.16.40.208, 10.46.44.182; restricted_network=172.16.0.3   |
+-----------------------------+-------------------------------------------------------------------------------------+

The cluster is fully operational, as observed by running the tests. The same works with all the networkTypes: OpenShiftSDN, Kuryr and OVNKubernetes.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069