Bug 1941612

Summary: Openshift-kni-infra coredns error with remote worker nodes - Error: No interface nor address found for the given VIPs
Product: OpenShift Container Platform Reporter: Alex Krzos <akrzos>
Component: Networking Assignee: Angus Thomas <athomas>
Networking sub component: runtime-cfg QA Contact: Victor Voronkov <vvoronko>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: beth.white, bnemec, rfreiman
Version: 4.6 Keywords: Triaged
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-08-12 01:36:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Alex Krzos 2021-03-22 13:34:49 UTC
Description of problem:
On remote worker nodes, infrastructure that relies on the openshift-kni-infra coredns pod fails in the "render-config-coredns" init container: it cannot render the coredns configuration needed to resolve the cluster's api-int address. This prevents the node from joining the cluster.



Version-Release number of selected component (if applicable):
I believe all versions are affected; however, we discovered the issue with 4.6.16.

How reproducible:
Always

Steps to Reproduce:
1.
2.
3.

Actual results:
# oc logs -n openshift-kni-infra coredns-f19-h07-000-r640 -c render-config-coredns
Error: No interface nor address found for the given VIPs
Usage:
  runtimecfg render [path to kubeconfig] [paths to render]...
                        If there is one single path and it is a directory, it renders the .tmpl files in it [flags]

Flags:
      --api-port uint16          Port where the OpenShift API listens at (default 6443)
      --api-vip ip               Virtual IP Address to reach the OpenShift API
  -c, --cluster-config string    Path to cluster-config ConfigMap to retrieve ControlPlane info
      --dns-vip ip               Virtual IP Address to reach an OpenShift node resolving DNS server
  -h, --help                     help for render
      --ingress-vip ip           Virtual IP Address to reach the OpenShift Ingress Routers
      --lb-port uint16           Port where the API HAProxy LB will listen at (default 9445)
  -o, --out-dir string           Directory where the templates will be rendered
  -r, --resolvconf-path string   Optional path to a resolv.conf file to use to get upstream DNS servers (default "/etc/resolv.conf")
      --stat-port uint16         Port where the HAProxy stats API will listen at (default 50000)
      --verbose                  Display extra information about the rendering

time="2021-03-09T20:53:42Z" level=fatal msg="Error executing runtimecfg: No interface nor address found for the given VIPs"

Expected results:


Additional info:

As a workaround, you can overwrite the existing static pod configuration with a new coredns static pod that omits the rendering script and uses a specific configuration to resolve your api-int address.
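
A minimal sketch of what such a fixed configuration might look like, assuming the CoreDNS "hosts" and "forward" plugins are acceptable for the replacement pod. The helper name render_corefile, the host name, and the address below are hypothetical placeholders, not values from this bug; the real static pod manifest and the cluster's actual api-int name and VIP are not shown here.

```python
import ipaddress


def render_corefile(api_int_name: str, api_vip: str,
                    upstream: str = "/etc/resolv.conf") -> str:
    """Return a minimal Corefile that answers api_int_name with api_vip.

    Hypothetical helper for illustration only; it is not part of runtimecfg.
    """
    # Raises ValueError if api_vip is not a valid IPv4/IPv6 address.
    ipaddress.ip_address(api_vip)
    return (
        ".:53 {\n"
        "    errors\n"
        "    hosts {\n"
        f"        {api_vip} {api_int_name}\n"
        "        fallthrough\n"          # pass every other name on to forward
        "    }\n"
        f"    forward . {upstream}\n"    # upstream resolvers for everything else
        "}\n"
    )


if __name__ == "__main__":
    # Placeholder values (documentation address range), not from the bug report.
    print(render_corefile("api-int.mycluster.example.com", "192.0.2.5"))
```

Because the address is written statically, this configuration does not depend on the node having an interface in the VIP's subnet, which is the condition the init container fails on.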

Comment 1 Rom Freiman 2021-03-22 13:39:28 UTC
This issue affects remote worker node (RWN) deployments where the VIP is in a different subnet than the RWN.
Also, it might be made obsolete by the following changes, assuming they are introduced in 4.8:

https://github.com/openshift/machine-config-operator/pull/2374
https://github.com/openshift/machine-config-operator/pull/2410

Comment 2 Ben Nemec 2021-05-04 20:16:22 UTC
This is because 4.6 did not support deployment across multiple subnets. It will be fixed by the backport in https://github.com/openshift/baremetal-runtimecfg/pull/133, although it's not strictly a duplicate of that bug.
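
The failure mode described in the comments above can be illustrated with a short sketch. This is not the runtimecfg implementation; it only assumes, based on the error text and the comments, that rendering requires a local interface whose subnet contains one of the VIPs. The function name, interface names, and addresses are hypothetical.

```python
import ipaddress


def find_interface_for_vips(interfaces, vips):
    """Return (interface name, local address) for the first interface whose
    subnet contains one of the VIPs.

    interfaces: dict mapping interface name -> list of "address/prefix" strings.
    vips: iterable of VIP address strings.
    Hypothetical illustration only, not the actual runtimecfg logic.
    """
    vip_addrs = [ipaddress.ip_address(v) for v in vips]
    for name, cidrs in interfaces.items():
        for cidr in cidrs:
            iface = ipaddress.ip_interface(cidr)
            # Membership is False when the IP versions differ.
            if any(vip in iface.network for vip in vip_addrs):
                return name, str(iface.ip)
    raise LookupError("No interface nor address found for the given VIPs")


if __name__ == "__main__":
    vips = ["10.0.0.5"]  # placeholder API VIP
    # A node in the VIP's subnet finds a matching interface.
    print(find_interface_for_vips({"eth0": ["10.0.0.10/24"]}, vips))
    # A remote worker in another subnet does not.
    try:
        find_interface_for_vips({"eth0": ["10.0.1.15/24"]}, vips)
    except LookupError as err:
        print(err)
```

Under this assumption, a remote worker node whose only addresses are outside the VIP subnet can never satisfy the lookup, which matches the "Always" reproducibility reported above.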

Comment 5 Ben Nemec 2021-07-07 15:28:43 UTC
The PR for this has merged so it should be fixed now.

Comment 9 errata-xmlrpc 2021-08-12 01:36:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.42 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3008