Bug 1830592

Summary: [3.11] OVS flows don't seem to be getting updated with the info in etcd
Product: OpenShift Container Platform Reporter: Paul Gozart <pgozart>
Component: Networking Assignee: Juan Luis de Sousa-Valadas <jdesousa>
Networking sub component: ovn-kubernetes QA Contact: zhaozhanqi <zzhao>
Status: CLOSED INSUFFICIENT_DATA Docs Contact:
Severity: urgent    
Priority: urgent CC: aconstan, mfojtik
Version: 3.11.0   
Target Milestone: ---   
Target Release: 3.11.z   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-05-27 22:57:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Paul Gozart 2020-05-02 21:30:16 UTC
OVS flows don't seem to be getting updated with the information reflected in etcd. Below are outputs that provide the necessary information.

[xgk9kosa@p01apl881 ~]$ oc get hostsubnets | grep 172.26.230.123
p01osl0300   p01osl0300   172.26.230.24   10.55.4.0/23    [172.26.230.0/23]   [172.26.230.158, 172.26.230.165, 172.26.230.101, 172.26.230.123]


[root@p01osl0303 ~]# ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:92:a6:cf brd ff:ff:ff:ff:ff:ff
    inet 172.26.230.27/23 brd 172.26.231.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet 172.26.230.148/23 brd 172.26.231.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet 172.26.231.233/23 brd 172.26.231.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet 172.26.230.180/23 brd 172.26.231.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet 172.26.230.123/23 brd 172.26.231.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe92:a6cf/64 scope link
       valid_lft forever preferred_lft forever
[root@p01osl0303 ~]#

[root@p01osl0300 ~]# ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:92:b9:e8 brd ff:ff:ff:ff:ff:ff
    inet 172.26.230.24/23 brd 172.26.231.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet 172.26.230.97/23 brd 172.26.231.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet 172.26.230.158/23 brd 172.26.231.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet 172.26.231.230/23 brd 172.26.231.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe92:b9e8/64 scope link
       valid_lft forever preferred_lft forever
[root@p01osl0300 ~]#


[xgk9kosa@p01apl881 frestdta]$ oc exec -n openshift-sdn ovs-f542b -- ovs-ofctl -O OpenFlow13 dump-flows br0 table=100 | grep -i 0xE71A34
 cookie=0x0, duration=31130.268s, table=100, n_packets=163329, n_bytes=25918040, priority=100,ip,reg0=0xe71a34 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:172.26.230.27->tun_dst,output:1
[xgk9kosa@p01apl881 frestdta]$ oc exec -n openshift-sdn ovs-c9crb -- ovs-ofctl -O OpenFlow13 dump-flows br0 table=100 | grep -i 0xE71A34
 cookie=0x0, duration=27515.680s, table=100, n_packets=20702, n_bytes=1532308, priority=100,ip,reg0=0xe71a34 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:172.26.230.24->tun_dst,output:1
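
For reference, the reg0 value matched in these table=100 flows is the VNID of the pod's project, so the hex value to grep for can be derived from the project's NetNamespace. A minimal sketch, using a hypothetical project name "myproject":

$ oc get netnamespace myproject -o jsonpath='{.netid}'   # prints the project's VNID in decimal
$ printf '0x%x\n' 15145524                               # 15145524 decimal == 0xe71a34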


  From the provided packet traces, we were able to conclude that the node from which we could successfully reach egress IP "172.26.230.123" was sending the traffic to node "172.26.230.27", while the node (pod) that was failing to connect was sending the traffic to node "172.26.230.24".


// TO node - 172.26.230.27 :

The good packet trace shows that when we try to reach the egress IP (172.26.230.123), the source node "172.26.230.37" sends the traffic to node "172.26.230.27" and, surprisingly, we get a response.

$ tshark -r good-pod-host.cap -Y "icmp and ip.addr==172.26.230.123" -T fields -e frame.time -e ip.src -e ip.dst -e _ws.col.Info

May  2, 2020 12:38:36.946902000 IST	172.26.230.37,10.52.9.151	172.26.230.27,172.26.230.123	Echo (ping) request  id=0x8610, seq=0/0, ttl=64
May  2, 2020 12:38:36.947693000 IST	172.26.230.27,172.26.230.123	172.26.230.37,10.52.9.151	Echo (ping) reply    id=0x8610, seq=0/0, ttl=64 (request in 1767)


// TO node - 172.26.230.24 :

The bad packet trace shows that when we try to reach the egress IP (172.26.230.123), the source node "172.26.230.34" sends the traffic to the egress node recorded in etcd, "172.26.230.24", but there is no response, resulting in the issue.


$ tshark -r bad-pod-host.cap -Y "icmp and ip.addr==172.26.230.123" -T fields -e frame.time -e ip.src -e ip.dst -e _ws.col.Info

May  2, 2020 12:30:15.932049000 IST	172.26.230.34,10.52.2.121	172.26.230.24,172.26.230.123	Echo (ping) request  id=0x3601, seq=0/0, ttl=64
May  2, 2020 12:30:16.931987000 IST	172.26.230.34,10.52.2.121	172.26.230.24,172.26.230.123	Echo (ping) request  id=0x3601, seq=1/256, ttl=64
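
A sketch of how such a capture can be taken on the pod's node (hypothetical capture file name; assuming tcpdump is available on the host):

# tcpdump -i eth0 -w pod-host.cap udp port 4789    # capture the VXLAN-encapsulated overlay traffic for offline analysis with tshark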



   From this we concluded two things:

[1]. The egress IP address "172.26.230.123" is physically present on node "172.26.230.27", since this node is responding to the requests.
[2]. The OVS flows on some of the nodes mark "172.26.230.24" as the egress node, while others mark "172.26.230.27". This is likely OVS flow corruption (see the sketch below).
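
    A way to survey which nodes still program the stale tun_dst for this VNID is to dump table 100 from every OVS pod. A sketch, assuming the OVS daemonset pods carry the label app=ovs:

$ for p in $(oc -n openshift-sdn get pods -l app=ovs -o jsonpath='{.items[*].metadata.name}'); do \
      echo "== $p"; \
      oc -n openshift-sdn exec "$p" -- ovs-ofctl -O OpenFlow13 dump-flows br0 table=100 | grep -i 0xe71a34; \
  done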



    We then logged in to nodes "172.26.230.24" and "172.26.230.27" and checked for the presence of egress IP "172.26.230.123". This confirmed point [1]: the IP was attached to node "172.26.230.27".
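
    A quick way to check this on each candidate node (run as root on the node itself):

# ip -o addr show dev eth0 | grep -w 172.26.230.123    # no output means the address is not plumbed on this node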



$ cat sosreport-p01osl0300-02644530-2020-05-01-zzgxyim/sos_commands/networking/ip_-o_addr 
1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
1: lo    inet6 ::1/128 scope host \       valid_lft forever preferred_lft forever
2: eth0    inet 172.26.230.24/23 brd 172.26.231.255 scope global noprefixroute eth0\       valid_lft forever preferred_lft forever
2: eth0    inet 172.26.230.97/23 brd 172.26.231.255 scope global secondary eth0\       valid_lft forever preferred_lft forever
2: eth0    inet 172.26.230.158/23 brd 172.26.231.255 scope global secondary eth0\       valid_lft forever preferred_lft forever
2: eth0    inet 172.26.231.230/23 brd 172.26.231.255 scope global secondary eth0\       valid_lft forever preferred_lft forever
2: eth0    inet6 fe80::250:56ff:fe92:b9e8/64 scope link \       valid_lft forever preferred_lft forever
3: docker0    inet 172.17.0.1/16 scope global docker0\       valid_lft forever preferred_lft forever



    However, the etcd database still assigned this egress IP to node "172.26.230.24".

NAME         HOST         HOST IP         SUBNET          EGRESS CIDRS        EGRESS IPS

p01osl0300   p01osl0300   172.26.230.24   10.55.4.0/23    [172.26.230.0/23]   [172.26.230.158, 172.26.230.165, 172.26.230.101, 172.26.230.123]
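
    The same assignment can also be read straight from the HostSubnet object, e.g.:

$ oc get hostsubnet p01osl0300 -o jsonpath='{.hostIP}{"  "}{.egressIPs}{"\n"}'    # host IP and the egress IPs recorded in etcd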


    On every node where the OVS and SDN pods were restarted, we observed that the new rules were generated according to the database information above, marking node "172.26.230.24" as the egress node. We therefore restarted all of the OVS and SDN pods, which populated the correct rules.
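
    A minimal sketch of forcing a single node to reprogram its flows is to delete that node's SDN and OVS pods and let the daemonsets recreate them (pod names below are hypothetical):

$ oc -n openshift-sdn get pods -o wide | grep p01osl0300    # find the sdn-* and ovs-* pods running on that node
$ oc -n openshift-sdn delete pod sdn-xxxxx ovs-xxxxx        # the daemonsets recreate them and the new pods re-sync the flows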


    We then observed that the egress IP "172.26.230.123" was still present on node "172.26.230.27", so we manually removed it from there:

# ip addr del 172.26.230.123/23 dev eth0
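
    To confirm that only the intended egress node answers for the address afterwards, something like the following can be run from another host on the same subnet (assuming iputils arping is installed):

# arping -c 3 -I eth0 172.26.230.123    # only the node that actually owns the egress IP should reply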



Much more data is attached to case 02644530 if needed.

Comment 4 Juan Luis de Sousa-Valadas 2020-05-26 23:29:04 UTC
Hi Paul,
Can you please get a core dump of the SDN pod of the node where the flows aren't getting updated while this is happening?
In order to get it you have to ssh into a node and do:

# docker cp $(docker ps --filter label=io.kubernetes.container.name=sdn -q):/usr/bin/openshift /usr/bin/openshift
# gcore -o sdn.core $(pgrep -f 'openshift start network')
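
The resulting core of the Go binary can then be inspected, for example with Delve if it is available on the node (a sketch, not required for the data collection itself):

# dlv core /usr/bin/openshift sdn.core    # e.g. 'goroutines' and 'bt' to see where the SDN process is stuck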

This may be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1824203,
which is likely to be fixed in the next release.

Comment 5 Paul Gozart 2020-05-27 21:00:49 UTC
(In reply to Juan Luis de Sousa-Valadas from comment #4)
> Hi Paul,
> Can you please get a core dump of the SDN pod of the node where the flows
> aren't getting updated while this is happening?
> In order to get it you have to ssh into a node and do:
> 
> # docker cp $(docker ps --filter label=io.kubernetes.container.name=sdn
> -q):/usr/bin/openshift /usr/bin/openshift
> # gcore -o sdn.core $(pgrep -f 'openshift start network')
> 
> This may be a duplicate of
> https://bugzilla.redhat.com/show_bug.cgi?id=1824203,
> which is likely to be fixed in the next release.

Hi Juan,
I talked to the customer today and he said the issue seems to be resolved after upgrading to the latest 3.11.z.  This bug can be closed.
Thanks,
Paul

Comment 6 Juan Luis de Sousa-Valadas 2020-05-27 22:57:01 UTC
Hi Paul,
Thanks for the update.
As they use egress IPs, someone should track BZ#1824243 and make sure they update as soon as there is an erratum for it. It's really hard to reproduce and only happens under very specific conditions, but it's important that they get this update because when it does happen it's fairly severe, and they'll see the symptoms you just reported again.

My expectation is that it will be released either in the next z stream or the following one.