+++ This bug was initially created as a clone of Bug #1897641 +++

Description of problem:

Baremetal IPI with IPv6 control plane: nodes respond with duplicate packets to ICMP6 echo requests:

ping -c 3 openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com
PING openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com(openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20)) 56 data bytes
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=1 ttl=64 time=0.448 ms
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=1 ttl=254 time=0.669 ms (DUP!)
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=2 ttl=64 time=0.493 ms
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=2 ttl=254 time=0.520 ms (DUP!)
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=3 ttl=64 time=0.441 ms

--- openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com ping statistics ---
3 packets transmitted, 3 received, +2 duplicates, 0% packet loss, time 14ms
rtt min/avg/max/mdev = 0.441/0.514/0.669/0.083 ms

Pinging the API address results in a loop:

[kni@ocp-edge12 ~]$ ping -c3 api.ocp-edge1.lab.eng.tlv2.redhat.com
PING api.ocp-edge1.lab.eng.tlv2.redhat.com(api.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::10)) 56 data bytes
64 bytes from api.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::10): icmp_seq=1 ttl=64 time=0.894 ms
From registry.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::1): icmp_seq=1 Time exceeded: Hop limit
From registry.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::1): icmp_seq=2 Time exceeded: Hop limit

--- api.ocp-edge1.lab.eng.tlv2.redhat.com ping statistics ---
2 packets transmitted, 1 received, +2 errors, 50% packet loss, time 2ms
rtt min/avg/max/mdev = 0.894/0.894/0.894/0.000 ms

If we run tcpdump on the interface bridged to br-ex, we can see many duplicate packets:

sudo ./tcpdump -i eno6 icmp6 -nnnn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eno6, link-type EN10MB (Ethernet), capture size 262144 bytes
16:53:29.417180 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.417880 IP6 2620:52:0:2e39::10 > 2620:52:0:2e39::1: ICMP6, echo reply, seq 1, length 64
16:53:29.417897 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.417955 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.418232 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.418253 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.418453 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.418484 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.418689 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.418729 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.418949 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.418998 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.419189 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.419211 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.419373 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.419397 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.419540 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.419571 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.419781 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.419815 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.419964 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.419990 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64
16:53:29.420180 IP6 2620:52:0:2e39::1 > 2620:52:0:2e39::10: ICMP6, echo request, seq 1, length 64

Version-Release number of selected component (if applicable):
4.6.3

How reproducible:
100%

Steps to Reproduce:
1. Deploy a baremetal IPI environment with an IPv6 control plane
2. Ping the node hostnames or the API hostname

Actual results:
Replies include duplicate packets, which result in packet loss.

Expected results:
No duplicate packets.

Additional info:
When the interface is removed from the br-ex bridge, there are no duplicate packets.
This issue can be reproduced on both bare metal and VM environments.

--- Additional comment from Antonio Ojea on 2020-11-17 07:58:18 UTC ---

It turns out Tim was already working on the same code path, so I will assign this to him and help with the review to avoid duplicating effort.

https://github.com/openshift/ovn-kubernetes/pull/346

--- Additional comment from Antonio Ojea on 2020-11-17 08:00:04 UTC ---

Interestingly, the duplicates seem to have ttl=254, while the legitimate replies have ttl=64:

64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=1 ttl=64 time=0.448 ms
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=1 ttl=254 time=0.669 ms (DUP!)
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=2 ttl=64 time=0.493 ms
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=2 ttl=254 time=0.520 ms (DUP!)
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=3 ttl=64 time=0.441 ms
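
For anyone re-checking this on another cluster, the TTL observation above can be confirmed mechanically. The following is only a minimal sketch, not part of the fix: it assumes the iputils ping output format shown in this report, and the names group_replies, duplicate_ttls, and SAMPLE are hypothetical helpers introduced here for illustration. It groups echo replies by icmp_seq and reports which TTL values appear on replies that ping marked as (DUP!).

```python
import re
from collections import defaultdict

# Matches reply lines in the format shown in this report, e.g.
#   64 bytes from host (addr): icmp_seq=1 ttl=254 time=0.669 ms (DUP!)
# "Time exceeded" error lines carry no ttl= field and are skipped.
REPLY_RE = re.compile(r"icmp_seq=(\d+) ttl=(\d+) time=[\d.]+ ms( \(DUP!\))?")


def group_replies(ping_output):
    """Map each icmp_seq to a list of (ttl, is_duplicate) tuples."""
    replies = defaultdict(list)
    for line in ping_output.splitlines():
        match = REPLY_RE.search(line)
        if match:
            seq, ttl, dup = match.groups()
            replies[int(seq)].append((int(ttl), dup is not None))
    return dict(replies)


def duplicate_ttls(replies):
    """Return the sorted set of TTL values seen on replies flagged as DUP."""
    return sorted(
        {ttl for entries in replies.values() for ttl, is_dup in entries if is_dup}
    )


# Reply lines copied from the description of this bug.
SAMPLE = """\
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=1 ttl=64 time=0.448 ms
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=1 ttl=254 time=0.669 ms (DUP!)
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=2 ttl=64 time=0.493 ms
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=2 ttl=254 time=0.520 ms (DUP!)
64 bytes from openshift-master-0.ocp-edge1.lab.eng.tlv2.redhat.com (2620:52:0:2e39::20): icmp_seq=3 ttl=64 time=0.441 ms
"""

if __name__ == "__main__":
    grouped = group_replies(SAMPLE)
    for seq, entries in sorted(grouped.items()):
        print(seq, entries)
    print("TTLs on duplicate replies:", duplicate_ttls(grouped))
```

On a fixed build, piping fresh ping output through the same helpers should yield an empty list of duplicate TTLs. The script only reads ping output and makes no claim about where the ttl=254 replies originate.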
Verified on 4.6.0-0.nightly-2020-12-20-032710
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.6.12 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:0037