Related upstream issue: https://github.com/openshift/origin/issues/17464
From a4/node_stack.txt.20180911_100543:

goroutine 356 [chan receive, 1256 minutes]:
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/proxy/config.(*EndpointsConfig).Run.func2(0xc420062120, 0xc4217f5900)
	/builddir/build/BUILD/atomic-openshift-git-0.95ef3c2/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/proxy/config/config.go:138 +0x40
created by github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/proxy/config.(*EndpointsConfig).Run
	/builddir/build/BUILD/atomic-openshift-git-0.95ef3c2/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/proxy/config/config.go:140 +0xb9

goroutine 358 [chan receive, 1256 minutes]:
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/proxy/config.(*ServiceConfig).Run.func2(0xc420062120, 0xc421053080)
	/builddir/build/BUILD/atomic-openshift-git-0.95ef3c2/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/proxy/config/config.go:242 +0x40
created by github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/proxy/config.(*ServiceConfig).Run
	/builddir/build/BUILD/atomic-openshift-git-0.95ef3c2/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/proxy/config/config.go:244 +0xb9
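If a fresh dump is needed from an affected node, here is a rough sketch of one way to capture it. The pgrep pattern and the atomic-openshift-node unit name are assumptions about the environment, so adjust them as needed.

  # Sketch only, not a verified procedure: capture a goroutine dump from the
  # running node process. The pgrep pattern and the atomic-openshift-node
  # unit name are assumptions about this environment.
  NODE_PID=$(pgrep -f 'openshift start node' | head -n1)
  # SIGQUIT makes the Go runtime print every goroutine stack to stderr and exit;
  # systemd should then restart the node service.
  kill -QUIT "$NODE_PID"
  # The dump ends up in the node service journal:
  journalctl -u atomic-openshift-node --since "5 minutes ago" > node_stack.txt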
Can we get the node changed to loglevel 4? That should give us more information on what the node is doing.
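For reference, a minimal sketch of one way to raise the node loglevel on a 3.x node, assuming the usual sysconfig layout; check the file first, since older 3.x releases carry the level in OPTIONS=--loglevel=N while newer 3.x releases use DEBUG_LOGLEVEL=N.

  # Sketch, assuming the node options live in /etc/sysconfig/atomic-openshift-node.
  # Check which variable the file actually uses before editing it.
  grep -E 'loglevel|LOGLEVEL' /etc/sysconfig/atomic-openshift-node
  sed -i 's/--loglevel=[0-9]*/--loglevel=4/' /etc/sysconfig/atomic-openshift-node
  systemctl restart atomic-openshift-node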
I failed to reproduce this. I deployed a busybox daemonset with a hostport and ran a watch on its iptables rule:

Every 2.0s: sudo iptables -n -v -t nat -L KUBE-HOSTPORTS -w        Tue Nov 20 17:42:30 2018

Chain KUBE-HOSTPORTS (2 references)
 pkts bytes target                    prot opt in  out  source       destination
     0     0 KUBE-HP-Q3TH7YJFUJLCPTEB  tcp  --  *   *    0.0.0.0/0    0.0.0.0/0    /* hello-daemonset-826c2_testing hostport 14236 */ tcp dpt:14236

I also made sure there were NO other pods running on the node except for a fluentd pod from a fluentd daemonset:

$ oc get pod -o wide -n logging
NAME                    READY  STATUS   RESTARTS  AGE  IP          NODE
. . .
logging-fluentd-wr981   1/1    Running  0         2m   10.130.0.9  node-0.datadyne.lab.example.com
. . .

When I removed the fluentd label on that node (thus removing the pod), nothing changed in the hostport rule. Same when I re-added it and when I *deleted* the fluentd daemonset.

The only thing I can think of that might be different is that busybox restarts itself regularly (the container, not the pod).

$ openshift version
openshift v3.6.173.0.130
kubernetes v1.6.1+5115d708d7
etcd 3.2.1
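For anyone repeating this, a rough sketch of the kind of daemonset used above; the name, namespace, image, and command are illustrative, only the hostPort value matches the rule shown.

  # Illustrative sketch of the reproduction setup, not the exact objects used
  # in the test above; only hostPort 14236 matches the rule shown.
  # Note: the namespace's service account needs an SCC that allows host ports.
  oc apply -n testing -f - <<'EOF'
  apiVersion: extensions/v1beta1
  kind: DaemonSet
  metadata:
    name: hello-daemonset
  spec:
    template:
      metadata:
        labels:
          app: hello-daemonset
      spec:
        containers:
        - name: busybox
          image: busybox
          command: ["sh", "-c", "sleep 3600"]
          ports:
          - containerPort: 14236
            hostPort: 14236
  EOF

  # Watch the corresponding hostport rule on the node:
  watch -n 2 sudo iptables -n -v -t nat -L KUBE-HOSTPORTS -w

A sleep-based command like this would also account for the periodic container restarts mentioned above, since the container exits when the sleep finishes and the kubelet restarts it in place.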
Followed the same testing steps used for v3.6 and ran them on v3.11.72; could not reproduce the lost hostport rules problem.
Customer upgraded to 3.7.72. They saw the issue again when the node services restarted. Uploading details.
Requesting that the customer reproduce with a higher-verbosity log level.
Ack, closing in favor of https://bugzilla.redhat.com/show_bug.cgi?id=1723924