Bug 1723924
| Summary: | Unexpected loss of pod hostports | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Steven Walter <stwalter> |
| Component: | Networking | Assignee: | Alexander Constantinescu <aconstan> |
| Networking sub component: | openshift-sdn | QA Contact: | zhaozhanqi <zzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | aos-bugs, bbennett, cdc, dsquirre, dzhukous, huirwang, mfuruta, nbhatt, rhowe, rvokal |
| Version: | 3.7.1 | | |
| Target Milestone: | --- | | |
| Target Release: | 3.11.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1740731 1740741 | Environment: | |
| Last Closed: | 2019-10-18 01:34:36 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1740741 | | |
| Bug Blocks: | 1698629 | | |
Description
Steven Walter
2019-06-25 18:46:03 UTC
I think I reproduced it.
Two applications:
ruby-ex, with hostPort 8888
killme, with hostPort 9999
The goal was, at first, to kill the "killme" pods to see if losing their hostport would inadvertently affect the ruby-ex port. I found that this did not happen.
However, restarting the atomic-openshift-node service DOES result in the ruby-ex pod losing its hostport.
Here's my data.
Terminal 1 shows my work. In terminal 2, I have a loop checking iptables-save for rules around ports 8888 and 9999:
# while true; do iptables-save | grep -e 8888 -e 9999 ; date; sleep 10; done
=========================
//Scale down
[quicklab@master-0 ~]$ oc get pod
NAME READY STATUS RESTARTS AGE
killme-4-grgpx 1/1 Running 0 21s
ruby-ex-4-nhlf8 1/1 Running 0 6m
[quicklab@master-0 ~]$ oc scale dc killme --replicas=0
deploymentconfig "killme" scaled
[quicklab@master-0 ~]$ date
Tue Jun 25 17:18:50 EDT 2019
We see port 9999 disappear after the scaledown, as expected. Port 8888 remains.
-A KUBE-HOSTPORTS -p tcp -m comment --comment "ruby-ex-4-nhlf8_ruby hostport 8888" -m tcp --dport 8888 -j KUBE-HP-QI2R4GYTHZW5JXWS
-A KUBE-HOSTPORTS -p tcp -m comment --comment "killme-4-grgpx_ruby hostport 9999" -m tcp --dport 9999 -j KUBE-HP-3LMI74HSZVOC4LTK
-A KUBE-HP-3LMI74HSZVOC4LTK -s 10.129.0.11/32 -m comment --comment "killme-4-grgpx_ruby hostport 9999" -j KUBE-MARK-MASQ
-A KUBE-HP-3LMI74HSZVOC4LTK -p tcp -m comment --comment "killme-4-grgpx_ruby hostport 9999" -m tcp -j DNAT --to-destination 10.129.0.11:8080
-A KUBE-HP-QI2R4GYTHZW5JXWS -s 10.129.0.9/32 -m comment --comment "ruby-ex-4-nhlf8_ruby hostport 8888" -j KUBE-MARK-MASQ
-A KUBE-HP-QI2R4GYTHZW5JXWS -p tcp -m comment --comment "ruby-ex-4-nhlf8_ruby hostport 8888" -m tcp -j DNAT --to-destination 10.129.0.9:8080
Tue Jun 25 17:18:42 EDT 2019
-A KUBE-HOSTPORTS -p tcp -m comment --comment "ruby-ex-4-nhlf8_ruby hostport 8888" -m tcp --dport 8888 -j KUBE-HP-QI2R4GYTHZW5JXWS
-A KUBE-HP-QI2R4GYTHZW5JXWS -s 10.129.0.9/32 -m comment --comment "ruby-ex-4-nhlf8_ruby hostport 8888" -j KUBE-MARK-MASQ
-A KUBE-HP-QI2R4GYTHZW5JXWS -p tcp -m comment --comment "ruby-ex-4-nhlf8_ruby hostport 8888" -m tcp -j DNAT --to-destination 10.129.0.9:8080
Tue Jun 25 17:18:52 EDT 2019
=========================
Now, we'll restart the atomic-openshift-node service and check.
[quicklab@node-0 ~]$ sudo systemctl restart atomic-openshift-node
[quicklab@node-0 ~]$ date
Tue Jun 25 17:21:48 EDT 2019
[quicklab@node-0 ~]$ sudo systemctl status atomic-openshift-node
● atomic-openshift-node.service - OpenShift Node
Loaded: loaded (/etc/systemd/system/atomic-openshift-node.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/atomic-openshift-node.service.d
└─openshift-sdn-ovs.conf
Active: active (running) since Tue 2019-06-25 17:21:35 EDT; 29s ago
-A KUBE-HOSTPORTS -p tcp -m comment --comment "ruby-ex-4-nhlf8_ruby hostport 8888" -m tcp --dport 8888 -j KUBE-HP-QI2R4GYTHZW5JXWS
-A KUBE-HP-QI2R4GYTHZW5JXWS -s 10.129.0.9/32 -m comment --comment "ruby-ex-4-nhlf8_ruby hostport 8888" -j KUBE-MARK-MASQ
-A KUBE-HP-QI2R4GYTHZW5JXWS -p tcp -m comment --comment "ruby-ex-4-nhlf8_ruby hostport 8888" -m tcp -j DNAT --to-destination 10.129.0.9:8080
Tue Jun 25 17:21:32 EDT 2019
Tue Jun 25 17:21:42 EDT 2019
Tue Jun 25 17:21:52 EDT 2019
=========================
Now, I scale up killme to see if hostports will be re-added. Note that ruby-ex-4-nhlf8 is still around, even though we no longer see its iptables entry.
$ oc get pod ; date
NAME READY STATUS RESTARTS AGE
killme-1-build 0/1 Completed 0 28m
killme-4-64tk9 0/1 ContainerCreating 0 4s
ruby-ex-4-nhlf8 1/1 Running 0 12m
Tue Jun 25 17:22:25 EDT 2019
Tue Jun 25 17:22:12 EDT 2019
Tue Jun 25 17:22:22 EDT 2019
-A KUBE-HOSTPORTS -p tcp -m comment --comment "killme-4-64tk9_ruby hostport 9999" -m tcp --dport 9999 -j KUBE-HP-SG4IGRHXJSQS2BXY
-A KUBE-HP-SG4IGRHXJSQS2BXY -s 10.129.0.13/32 -m comment --comment "killme-4-64tk9_ruby hostport 9999" -j KUBE-MARK-MASQ
-A KUBE-HP-SG4IGRHXJSQS2BXY -p tcp -m comment --comment "killme-4-64tk9_ruby hostport 9999" -m tcp -j DNAT --to-destination 10.129.0.13:8080
Tue Jun 25 17:22:32 EDT 2019
-A KUBE-HOSTPORTS -p tcp -m comment --comment "killme-4-64tk9_ruby hostport 9999" -m tcp --dport 9999 -j KUBE-HP-SG4IGRHXJSQS2BXY
-A KUBE-HP-SG4IGRHXJSQS2BXY -s 10.129.0.13/32 -m comment --comment "killme-4-64tk9_ruby hostport 9999" -j KUBE-MARK-MASQ
-A KUBE-HP-SG4IGRHXJSQS2BXY -p tcp -m comment --comment "killme-4-64tk9_ruby hostport 9999" -m tcp -j DNAT --to-destination 10.129.0.13:8080
Tue Jun 25 17:22:42 EDT 2019
=========================
Additional notes:
- I use a nodeSelector to make sure these pods always run on node-0 (a sketch of one way to set this follows the output below):
$ oc get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
killme-4-64tk9 1/1 Running 0 10m 10.129.0.13 node-0.datadyne.lab.pnq2.cee.redhat.com
ruby-ex-4-nhlf8 1/1 Running 0 23m 10.129.0.9 node-0.datadyne.lab.pnq2.cee.redhat.com
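For reference, a nodeSelector like this can be added to both deployment configs with something along these lines (a sketch only; the exact method used isn't shown in this report, and it assumes the node's kubernetes.io/hostname label matches the node name shown above):
$ oc patch dc/ruby-ex --type=merge -p '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"node-0.datadyne.lab.pnq2.cee.redhat.com"}}}}}'
$ oc patch dc/killme --type=merge -p '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"node-0.datadyne.lab.pnq2.cee.redhat.com"}}}}}'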
My service accounts have access to the hostnetwork SCC (one way to grant this is sketched after the output below):
$ oc describe scc hostnetwork
Name: hostnetwork
Priority: <none>
Access:
Users: system:serviceaccount:default:router,system:serviceaccount:default:registry,system:serviceaccount:ruby:default
. . .
$ oc get clusterrolebinding | grep router
. . .
router-router-role-0 /router-router-role ruby/default
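For completeness, this kind of SCC access is typically granted with add-scc-to-user (a sketch only; how it was actually granted in this cluster isn't shown here):
$ oc adm policy add-scc-to-user hostnetwork -z default -n ruby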
I think this is sufficient to show there might still be an issue. If you need help reproducing, want to re-reproduce it together, or want me to show it live, let me know -- hopefully we should be able to do this again at will. :)
I used the default ruby example app which you can get with:
$ oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-ex.git
$ oc new-app --name=killme centos/ruby-22-centos7~https://github.com/openshift/ruby-ex.git
I added a nodeSelector and set hostPort in the "ports" section of each deployment config (9999 for killme, 8888 for ruby-ex); one way to apply the hostPort change is sketched after the snippet below:
$ oc get dc -o yaml | grep hostPort -C1
- containerPort: 8080
hostPort: 9999
protocol: TCP
--
- containerPort: 8080
hostPort: 8888
protocol: TCP
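One way to apply the hostPort change is a JSON patch against each deployment config (a sketch only; it assumes the container and its port entry are the first ones, i.e. index 0, in each DC):
$ oc patch dc/killme --type=json -p '[{"op":"add","path":"/spec/template/spec/containers/0/ports/0/hostPort","value":9999}]'
$ oc patch dc/ruby-ex --type=json -p '[{"op":"add","path":"/spec/template/spec/containers/0/ports/0/hostPort","value":8888}]'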
Hi,

So I have been able to reproduce this on 3.11. The full sequence of events (as observed by me) is:

1) Y pods with hostPorts assigned are running on host X. iptables rules are fine and well-defined.
2) The SDN systemd service dies on host X. iptables rules remain correct once the SDN service is back up.
3) One pod dies and is re-spawned on host X. iptables rules are now inconsistent: all iptables rules are wiped and only the newly re-spawned pod's rules are correct.

----

Conclusion: there definitely seems to be a link to the loss of the SDN process memory. This also only seems to affect pods with hostPorts assigned; I have run the same tests without hostPorts and the iptables rules remain consistent.

I will have a look and try to provide a fix ASAP.

*** Bug 1550659 has been marked as a duplicate of this bug. ***

Any updates on this? It looks like the PR is still open, but as the customer has been affected by this for some time I'm wondering if we can push it.

Hi,

The PRs on the parent branches have been merged and verified as fixing the issue (see the linked issue on this BZ). Final review of the 3.11 PR is ongoing. That PR is a bit bigger, as the back-port to 3.11 required back-porting other things as well to get this fix working. We hope to be able to merge the PR by the end of the week.

Best regards,
Alexander

Hi,

Does this mean this will be fixed by errata https://errata.devel.redhat.com/advisory/47061, and will it form part of v3.11.152? Am I correct in guessing v3.11.152 hasn't been released yet?

Cheers,
David

Hi,

It's been over a week since this was QA'd; just following up on my previous question in #29. Is this going to make it into v3.11.152? When will v3.11.152 be released? I am trying to manage the customer's expectations.

Thanks,
David

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3139

*** Bug 1744077 has been marked as a duplicate of this bug. ***
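For anyone verifying the fix after applying the errata, a minimal re-check along the lines of the original reproduction (a sketch only; it assumes the same ruby-ex/killme deployments, hostPorts 8888/9999, and node-0 as above) would be:
# On the master: make sure both hostPort pods are running on node-0
$ oc get pod -o wide
# On node-0: confirm both hostport rules exist, then restart the node service
$ sudo iptables-save | grep -e 'hostport 8888' -e 'hostport 9999'
$ sudo systemctl restart atomic-openshift-node
# Back on the master: churn one of the pods so the hostport rules get re-synced
$ oc scale dc killme --replicas=0
$ oc scale dc killme --replicas=1
# On node-0: with the fix in place, the ruby-ex rule for hostport 8888 should still be present
$ sudo iptables-save | grep 'hostport 8888'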