Created attachment 1769568 [details]
sdn-log

Description of problem:

The customer has a service, deployed through a DaemonSet with hostPort, that listens on some TCP and UDP ports. Their external service connects to these pods on those same TCP and UDP ports, using the same source and destination port (sport == dport). TCP works as expected, but for the UDP ports, every time these pods are restarted/deleted the UDP connections get stuck and no new entries are created in conntrack. This is only resolved when the external service making the connection is restarted as well.

With some research I found these upstream Kubernetes issues that appear to have fixed this problem, but I was unable to find the same code change on our side:

https://github.com/kubernetes/kubernetes/issues/58336
https://github.com/kubernetes/kubernetes/pull/59286
https://github.com/kubernetes/kubernetes/issues/59033

Version-Release number of selected component (if applicable):

Tested on OCP 3.11.394 as well as 4.5.35 and 4.6.21.

How reproducible:

Every time, even on the latest OCP 4.5 and 4.6 releases.

Steps to Reproduce:

1.
Create a daemonset for a service to listen on some UDP hostPort:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "1"
  creationTimestamp: "2021-04-06T12:16:56Z"
  generation: 1
  labels:
    app: udp-server
  name: test-udp-conntrack
  namespace: test-network
  resourceVersion: "774848"
  selfLink: /apis/apps/v1/namespaces/test-network/daemonsets/test-udp-conntrack
  uid: fadbd0c5-96d1-11eb-84c5-525400ea502a
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: test-udp-conntrack
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: udp-server
        name: test-udp-conntrack
    spec:
      containers:
      - args:
        - python /etc/test/server.py $POD_IP 8888
        command:
        - /bin/bash
        - -c
        env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP
        - name: POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        image: registry.redhat.io/rhscl/python-36-rhel7:latest
        imagePullPolicy: IfNotPresent
        name: udp-server
        ports:
        - containerPort: 8888
          hostPort: 8888
          name: udpcon
          protocol: UDP
        - containerPort: 8888
          hostPort: 8888
          name: tcpcon
          protocol: TCP
        resources:
          limits:
            cpu: 100m
            memory: 256Mi
          requests:
            cpu: 20m
            memory: 128Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/test
          name: server-script
      dnsPolicy: ClusterFirst
      nodeSelector:
        hostport: "true"
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 10
      volumes:
      - configMap:
          defaultMode: 420
          name: server-script

test-udp-conntrack-7gn7m   1/1   Running   0   16m   10.128.6.96    ocp-infra-node2.openshift3.redhatrules.local   <none>
test-udp-conntrack-99f44   1/1   Running   0   16m   10.128.3.94    ocp-infra-node1.openshift3.redhatrules.local   <none>
test-udp-conntrack-lz8n6   1/1   Running   0   16m   10.128.5.113   ocp-infra-node4.openshift3.redhatrules.local   <none>
test-udp-conntrack-md8dq   1/1   Running   0   16m   10.128.4.156   ocp-infra-node3.openshift3.redhatrules.local   <none>

2.
On the nodes I see the HOSTPORTS nat rules are created:

# iptables -t nat -nvL | grep 8888
    0     0 KUBE-HP-LGATJ4G47OMMBLH3  tcp  --  *  *  0.0.0.0/0    0.0.0.0/0  /* test-udp-conntrack-7gn7m_test-network hostport 8888 */ tcp dpt:8888
    0     0 KUBE-HP-2XVMR6Y6WGJIEYMN  udp  --  *  *  0.0.0.0/0    0.0.0.0/0  /* test-udp-conntrack-7gn7m_test-network hostport 8888 */ udp dpt:8888
    0     0 KUBE-MARK-MASQ            all  --  *  *  10.128.6.96  0.0.0.0/0  /* test-udp-conntrack-7gn7m_test-network hostport 8888 */
    0     0 DNAT                      udp  --  *  *  0.0.0.0/0    0.0.0.0/0  /* test-udp-conntrack-7gn7m_test-network hostport 8888 */ udp to:10.128.6.96:8888
    0     0 KUBE-MARK-MASQ            all  --  *  *  10.128.6.96  0.0.0.0/0  /* test-udp-conntrack-7gn7m_test-network hostport 8888 */
    0     0 DNAT                      tcp  --  *  *  0.0.0.0/0    0.0.0.0/0  /* test-udp-conntrack-7gn7m_test-network hostport 8888 */ tcp to:10.128.6.96:8888

3. Start a UDP connection to port 8888 on some node IP and monitor the conntrack entries:

$ python3 udp_test-client.py 172.23.190.40 8888

# watch -n2 conntrack -L -p udp --dport=8888
udp      17 179 src=172.23.188.1 dst=172.23.190.40 sport=8888 dport=8888 src=10.128.6.96 dst=172.23.188.1 sport=8888 dport=8888 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
conntrack v1.4.4 (conntrack-tools): 1 flow entries have been shown.

4. Delete the pod on that node and notice that when the timeout countdown reaches 0, the entry disappears and no new one is created.
Also checking iptables we see that it was updated as expected:

$ oc delete pod test-udp-conntrack-7gn7m
$ oc get pods -o wide
test-udp-conntrack-99f44   1/1   Running   0   44m   10.128.3.94    ocp-infra-node1.openshift3.redhatrules.local   <none>
test-udp-conntrack-j5sd4   1/1   Running   0   9s    10.128.6.97    ocp-infra-node2.openshift3.redhatrules.local   <none>
test-udp-conntrack-lz8n6   1/1   Running   0   44m   10.128.5.113   ocp-infra-node4.openshift3.redhatrules.local   <none>
test-udp-conntrack-md8dq   1/1   Running   0   44m   10.128.4.156   ocp-infra-node3.openshift3.redhatrules.local   <none>

# iptables -t nat -nvL | grep 8888
    0     0 KUBE-HP-5RALKNMEBO6Q4WVO  udp  --  *  *  0.0.0.0/0    0.0.0.0/0  /* test-udp-conntrack-j5sd4_test-network hostport 8888 */ udp dpt:8888
    0     0 KUBE-HP-MJ2HYBIHG6LVBSHG  tcp  --  *  *  0.0.0.0/0    0.0.0.0/0  /* test-udp-conntrack-j5sd4_test-network hostport 8888 */ tcp dpt:8888
    0     0 KUBE-MARK-MASQ            all  --  *  *  10.128.6.97  0.0.0.0/0  /* test-udp-c

# watch -n2 conntrack -L -p udp --dport=8888
conntrack v1.4.4 (conntrack-tools): 0 flow entries have been shown.

5. When the client is restarted, the connection is established again:

$ python3 udp_test-client.py 172.23.190.40 8888
CTRL+C
$ python3 udp_test-client.py 172.23.190.40 8888

# watch -n2 conntrack -L -p udp --dport=8888
conntrack v1.4.4 (conntrack-tools): 1 flow entries have been shown.
udp      17 179 src=172.23.188.1 dst=172.23.190.40 sport=8888 dport=8888 src=10.128.6.97 dst=172.23.188.1 sport=8888 dport=8888 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1

Actual results:

UDP conntrack entries are not recreated when the pod is deleted and a new pod IP is assigned.

Expected results:

The node should clean up the stale entries so that new entries are created, without needing to restart the external services.

Additional info:

The Python scripts were kindly provided by the customer; they simply create a server that listens on a UDP socket on the podIP, and a client that establishes the connection to the UDP port on the nodeIP.
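For reference, a minimal sketch of what the customer's client script likely does (the actual udp_test-client.py was not attached, so the function name, payload, and arguments here are my own). The key detail is that it pins the source port, which is why conntrack shows sport=8888 dport=8888 and why the stale entry keeps matching the old DNAT mapping:

```python
import socket
import sys
import time

def run_client(server_ip, port, count=None, src_port=None, interval=1.0):
    """Send UDP datagrams to server_ip:port, optionally from a fixed source port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    if src_port is not None:
        # Pin the local port so the flow is sport=src_port -> dport=port.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", src_port))
    sent = 0
    try:
        while count is None or sent < count:
            sock.sendto(b"keepalive", (server_ip, port))
            sent += 1
            if count is None or sent < count:
                time.sleep(interval)
    finally:
        sock.close()
    return sent

if __name__ == "__main__":
    # e.g. python3 udp_test-client.py 172.23.190.40 8888
    run_client(sys.argv[1], int(sys.argv[2]), src_port=int(sys.argv[2]))
```

Because the client keeps reusing the same 5-tuple, the kernel keeps matching the (now stale) conntrack entry instead of re-evaluating the DNAT rules, until the entry is removed or the client restarts.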
$ oc exec test-udp-conntrack-j5sd4 -- ps auxwww
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
1000090+       1  0.0  0.0  32072  6664 ?        Ss   13:01   0:00 python /etc/test/server.py 10.128.6.97 8888

This reproducer mimics what their application does. They are implementing HashiCorp Consul: there are 3 Consul servers external to OCP, and on OCP a daemonset deploys the Consul agents. These services connect to each other through a specific port over UDP and TCP.

Another piece of information that might be interesting: the same behaviour occurs even if we set hostNetwork: true on the daemonset and open the ports in the OS_FIREWALL_ALLOW chain on the nodes. The issue reproduces even though the pods then always get the host IP and no NAT is involved.
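The ps output above shows the server side of the reproducer. A minimal sketch of what such a server.py could look like (the customer's actual script was not attached; the echo behaviour and max_packets parameter are assumptions of mine for illustration):

```python
import socket
import sys

def serve(ip, port, max_packets=None):
    """Bind a UDP socket on ip:port and echo each datagram back to the sender."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((ip, port))
    handled = 0
    while max_packets is None or handled < max_packets:
        data, addr = sock.recvfrom(4096)
        # Echo back so the external client sees a reply flow and the
        # conntrack entry becomes [ASSURED].
        sock.sendto(data, addr)
        handled += 1
    sock.close()

if __name__ == "__main__":
    # Matches the pod command line: python /etc/test/server.py $POD_IP 8888
    serve(sys.argv[1], int(sys.argv[2]))
```

In the pod it binds to the podIP (passed in via the downward API), so the DNAT rule on the node is what makes the hostPort reachable from outside.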
This bug was reported on 3.11.z and reproduces all the way up to 4.6.z. Setting source release to 3.11 and target to 4.8.0 to get a fix into our development branch and then clone/backport as far as needed/requested.
Hi everyone. Because neither RHCOS nor toolbox includes conntrack-tools, I was able to do that with this: https://access.redhat.com/articles/5929341
and did it solve the issue?
I continue to be unable to reproduce this, but I still believe the bug exists. @andcosta, would you be able to use `rpm-ostree install` to install `conntrack-tools`, restart the node, and see if the issue persists? If adding conntrack fixes the issue, I'll work with the CoreOS team to get it into RHCOS.
It turns out that the openshift/dockershim hostport implementation doesn't perform the conntrack deletion; CRI-O does. Is this something we can test with the steps in https://bugzilla.redhat.com/show_bug.cgi?id=1946593#c37 ?

https://github.com/openshift/origin/pull/26206
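To illustrate what that conntrack deletion amounts to: when a UDP hostPort mapping is torn down, the stale flow entries for that port have to be flushed so that new datagrams hit the DNAT rules again instead of the old entry. A rough sketch of the equivalent manual cleanup (not the actual kubelet/CRI-O code; the function name and dry_run flag are mine):

```python
import subprocess

def clear_udp_conntrack(dport, dry_run=False):
    """Delete conntrack entries for a UDP destination port.

    Equivalent to running: conntrack -D -p udp --dport <dport>
    """
    cmd = ["conntrack", "-D", "-p", "udp", "--dport", str(dport)]
    if not dry_run:
        # conntrack exits non-zero when no entries match; treat that as benign.
        subprocess.run(cmd, check=False)
    return cmd
```

Running `conntrack -D -p udp --dport=8888` by hand on the affected node should have the same effect as restarting the external client: the stale entry is gone, so the next datagram creates a fresh entry pointing at the new pod IP.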
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 3.11.487 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:2928