Description of problem:

With remote worker nodes on infrastructure relying on openshift-kni-infra, the "render-config-coredns" init container of the coredns static pod fails: it is unable to render the CoreDNS configuration needed to resolve the cluster's api-int address. This prevents the remote node from joining the cluster.

Version-Release number of selected component (if applicable):

Likely all versions are affected; we discovered the issue on 4.6.16.

How reproducible:

Always

Steps to Reproduce:
1. Deploy a cluster using openshift-kni-infra with the API VIP on a different subnet than the remote worker node.
2. Add the remote worker node.
3. Inspect the render-config-coredns init container on the new node.

Actual results:

# oc logs -n openshift-kni-infra coredns-f19-h07-000-r640 -c render-config-coredns
Error: No interface nor address found for the given VIPs
Usage:
  runtimecfg render [path to kubeconfig] [paths to render]...
          If there is one single path and it is a directory, it renders the .tmpl files in it [flags]

Flags:
      --api-port uint16          Port where the OpenShift API listens at (default 6443)
      --api-vip ip               Virtual IP Address to reach the OpenShift API
  -c, --cluster-config string    Path to cluster-config ConfigMap to retrieve ControlPlane info
      --dns-vip ip               Virtual IP Address to reach an OpenShift node resolving DNS server
  -h, --help                     help for render
      --ingress-vip ip           Virtual IP Address to reach the OpenShift Ingress Routers
      --lb-port uint16           Port where the API HAProxy LB will listen at (default 9445)
  -o, --out-dir string           Directory where the templates will be rendered
  -r, --resolvconf-path string   Optional path to a resolv.conf file to use to get upstream DNS servers (default "/etc/resolv.conf")
      --stat-port uint16         Port where the HAProxy stats API will listen at (default 50000)
      --verbose                  Display extra information about the rendering

time="2021-03-09T20:53:42Z" level=fatal msg="Error executing runtimecfg: No interface nor address found for the given VIPs"

Expected results:

The init container renders the CoreDNS configuration and the remote worker node joins the cluster.

Additional info:

As a workaround, you can overwrite the existing static pod configuration with a new coredns static pod that omits the rendering script and carries a specific configuration to resolve your api-int address.
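The report does not include the workaround's configuration itself. As a minimal illustrative sketch only (the file path, cluster name, and API IP below are hypothetical assumptions, not taken from this report), the replacement pod's CoreDNS config could pin api-int to a reachable address with the hosts plugin and forward everything else upstream:

```
# Corefile for the replacement coredns static pod (illustrative)
. {
    errors
    hosts {
        # Hypothetical API address and cluster domain -- substitute your own.
        192.0.2.10 api-int.example-cluster.example.com
        fallthrough
    }
    # Everything else goes to the node's normal upstream resolvers.
    forward . /etc/resolv.conf
    cache 30
}
```

The static pod manifest in /etc/kubernetes/manifests would then run the coredns image pointed at this Corefile, with the render-config-coredns init container removed.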
This issue affects a remote worker node (RWN) deployment when the VIP is in a different subnet than the RWN. It may also be made obsolete by the following changes, assuming they are introduced in 4.8: https://github.com/openshift/machine-config-operator/pull/2374 https://github.com/openshift/machine-config-operator/pull/2410
This is because 4.6 did not support deployment across multiple subnets. It will be fixed by the backport in https://github.com/openshift/baremetal-runtimecfg/pull/133, although this bug is not strictly a duplicate of that one.
The PR for this has merged so it should be fixed now.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6.42 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3008