1757777 – hostIP and podIP differs with hostNetwork: true

Summary: hostIP and podIP differs with hostNetwork: true

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.1.z
Hardware:	x86_64
OS:	All
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	4.3.0
Assignee:	Dan Winship
QA Contact:	zhaozhanqi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-10-02 12:29 UTC by Radim Vansa
Modified:	2020-05-13 21:26 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2020-05-13 21:26:26 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2020:0062	0	None	None	None	2020-05-13 21:26:27 UTC

Description Radim Vansa 2019-10-02 12:29:48 UTC

Description of problem:

OCP 4.1.14 UPI on bare-metal with multiple NICs. By setting up reverse hostname lookup in DNS properly, I can make all nodes use the correct NICs (host IPs are in my 192.168.0.x range). However, it seems that a pod with `hostNetwork: true` gets a PodIP that matches the default route (at least that's how it looks) instead of matching the HostIP. This is incorrect: the default route goes to the public internet, and the public IPs on some nodes are not reachable from other nodes. That causes issues, e.g. the openshift-monitoring/node-exporter pods cannot be scraped. The public IPs should be used exclusively for reaching the outside world (downloading images...); internally, OCP should use the interface bound to the node IP.

Example info:
```
> oc get po -o wide
NAME                                           READY   STATUS    RESTARTS   AGE   IP             NODE           NOMINATED NODE   READINESS GATES
node-exporter-87t54                            2/2     Running   0          12s   192.168.0.22   master2        <none>           <none>
node-exporter-j76g8                            2/2     Running   0          11s   10.1.184.117   benchserver4   <none>           <none>
node-exporter-mbwt8                            2/2     Running   0          13s   10.1.184.154   benchserver6   <none>           <none>
node-exporter-ml7pr                            2/2     Running   0          11s   192.168.0.23   master3        <none>           <none>
node-exporter-zbcz7                            2/2     Running   0          5s    192.168.0.21   master1        <none>           <none>
...
```

```
> oc get node -o wide
NAME           STATUS   ROLES    AGE   VERSION             INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                                                   KERNEL-VERSION               CONTAINER-RUNTIME
benchserver4   Ready    worker   9d    v1.13.4+3bd346709   192.168.0.80    <none>        OpenShift Enterprise                                       3.10.0-1062.1.1.el7.x86_64   cri-o://1.13.11-0.1.dev.rhaos4.1.git59b6bdb.el7-dev
benchserver6   Ready    worker   13d   v1.13.4+3bd346709   192.168.0.100   <none>        Red Hat Enterprise Linux CoreOS 410.8.20190830.0 (Ootpa)   4.18.0-80.7.2.el8_0.x86_64   cri-o://1.13.11-0.4.dev.rhaos4.1.git59b6bdb.el8-dev
master1        Ready    master   55d   v1.13.4+3bd346709   192.168.0.21    <none>        Red Hat Enterprise Linux CoreOS 410.8.20190830.0 (Ootpa)   4.18.0-80.7.2.el8_0.x86_64   cri-o://1.13.11-0.4.dev.rhaos4.1.git59b6bdb.el8-dev
master2        Ready    master   55d   v1.13.4+3bd346709   192.168.0.22    <none>        Red Hat Enterprise Linux CoreOS 410.8.20190830.0 (Ootpa)   4.18.0-80.7.2.el8_0.x86_64   cri-o://1.13.11-0.4.dev.rhaos4.1.git59b6bdb.el8-dev
master3        Ready    master   55d   v1.13.4+3bd346709   192.168.0.23    <none>        Red Hat Enterprise Linux CoreOS 410.8.20190830.0 (Ootpa)   4.18.0-80.7.2.el8_0.x86_64   cri-o://1.13.11-0.4.dev.rhaos4.1.git59b6bdb.el8-dev
```

Version-Release number of selected component (if applicable):

4.1.14

Actual results:

Pods with `hostNetwork: true` on benchserver4 and benchserver6 get IPs from the 10.1.184.x range.

Expected results:

Pods with `hostNetwork: true` on benchserver4 and benchserver6 should get IPs 192.168.0.80 and 192.168.0.100, respectively.
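
For anyone who wants to check their own cluster for the same mismatch, here is a minimal sketch in Python. It assumes the pod list is available as JSON in the shape of a Kubernetes `PodList` and relies only on the `spec.hostNetwork`, `status.hostIP` and `status.podIP` fields. The function name and the `SAMPLE` data are made up for illustration; the sample values are copied from the listings above, with the namespace taken to be openshift-monitoring.

```python
# Sketch: flag hostNetwork pods whose podIP differs from their hostIP.
# Assumption: pod_list has the shape of a Kubernetes PodList object.

def find_mismatched_host_network_pods(pod_list):
    """Return (namespace, name, hostIP, podIP) for hostNetwork pods whose IPs differ."""
    mismatches = []
    for pod in pod_list.get("items", []):
        spec = pod.get("spec", {})
        status = pod.get("status", {})
        # Only host-networked pods are expected to share the node's IP.
        if not spec.get("hostNetwork", False):
            continue
        host_ip = status.get("hostIP")
        pod_ip = status.get("podIP")
        # Skip pods that have not been assigned IPs yet.
        if host_ip and pod_ip and host_ip != pod_ip:
            meta = pod.get("metadata", {})
            mismatches.append((meta.get("namespace"), meta.get("name"), host_ip, pod_ip))
    return mismatches


# Illustrative sample built from two pods in the report above.
SAMPLE = {
    "items": [
        {
            "metadata": {"namespace": "openshift-monitoring", "name": "node-exporter-87t54"},
            "spec": {"hostNetwork": True},
            "status": {"hostIP": "192.168.0.22", "podIP": "192.168.0.22"},
        },
        {
            "metadata": {"namespace": "openshift-monitoring", "name": "node-exporter-j76g8"},
            "spec": {"hostNetwork": True},
            "status": {"hostIP": "192.168.0.80", "podIP": "10.1.184.117"},
        },
    ]
}

for namespace, name, host_ip, pod_ip in find_mismatched_host_network_pods(SAMPLE):
    print(f"{namespace}/{name}: hostIP={host_ip} podIP={pod_ip}")
```

On a real cluster the input could be produced with `oc get pods --all-namespaces -o json` and loaded with `json.load`; that part is left out here so the sketch runs on its own.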

Comment 1 Ben Bennett 2019-10-02 12:56:27 UTC

This is not a regression in 4.2.0 (and I suspect it's behaved like this forever).  Pushing to 4.3.0 to consider the solution, and then we can decide if it merits a backport.

Comment 2 Casey Callendrello 2019-11-13 13:00:59 UTC

Dan, I think you fixed this with the latest CRIO changes for ipv6, right?

Comment 3 Peter Hunt 2019-11-13 13:47:05 UTC

Yes, this has been fixed with the changes that recently got merged into 1.16

Comment 5 zhaozhanqi 2019-11-14 05:15:56 UTC

Verified this bug on 4.3.0-0.nightly-2019-11-13-233341 

openshift-monitoring   node-exporter-6k6dh   2/2   Running   0   161m   192.168.0.18   <none>   <none>
openshift-monitoring   node-exporter-9kmdk   2/2   Running   0   74m    192.168.0.29   <none>   <none>
openshift-monitoring   node-exporter-dgjxm   2/2   Running   0   161m   192.168.0.16   <none>   <none>
openshift-monitoring   node-exporter-h9nr4   2/2   Running   0   161m   192.168.0.13   <none>   <none>
openshift-monitoring   node-exporter-hnwtt   2/2   Running   0   161m   192.168.0.22   <none>   <none>
openshift-monitoring   node-exporter-kxc8c   2/2   Running   0   161m   192.168.0.20   <none>   <none>
openshift-monitoring   node-exporter-x6mvr   2/2   Running   0   161m   192.168.0.14   <none>   <none>
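
A quick way to confirm that every listed pod IP falls inside the node network is sketched below in Python, using the standard `ipaddress` module. The `/24` prefix is an assumption (the report only says "192.168.0.x"), and the function name is hypothetical. The IP lists are copied from the output in this bug.

```python
import ipaddress

# Assumption: the node network is 192.168.0.0/24; the report only says "192.168.0.x".
NODE_NETWORK = ipaddress.ip_network("192.168.0.0/24")


def outside_node_network(ips, network=NODE_NETWORK):
    """Return the IPs (as strings) that are not inside the given network."""
    return [ip for ip in ips if ipaddress.ip_address(ip) not in network]


# node-exporter IPs from the verification run on the 4.3.0 nightly.
verified_ips = [
    "192.168.0.18", "192.168.0.29", "192.168.0.16", "192.168.0.13",
    "192.168.0.22", "192.168.0.20", "192.168.0.14",
]

# node-exporter IPs from the original 4.1.14 report.
reported_ips = [
    "192.168.0.22", "10.1.184.117", "10.1.184.154",
    "192.168.0.23", "192.168.0.21",
]

print("verified, outside node network:", outside_node_network(verified_ips))
print("reported, outside node network:", outside_node_network(reported_ips))
```

An empty list for the verified IPs means every pod is on the node network; the two 10.1.184.x addresses from the original report are the ones flagged.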

Comment 7 errata-xmlrpc 2020-05-13 21:26:26 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062
