Description of problem:
* Setup is OVN-DVR HA, with OpenShift 3.11 installed on top as a tenant.
* The OpenShift instances have FIPs attached to them.
* On the same tenant, an instance was created to act as the DNS server for the OpenShift instances.
* The DNS instance is on a different subnet from the OpenShift nodes.
* Each network is connected to an OpenStack router, and the external gateway network is the same on both routers.
* Security group rules allow DNS traffic on both instances.

In the end, DNS queries do not reach the DNS server. The same configuration works on OVS and on OVN-HA (from a Jenkins job); we only see the failure on OVN-DVR HA.

Note: ICMP and SSH are allowed in the SG and work OK.

Version-Release number of selected component (if applicable):
python-networking-ovn-4.0.3-7.el7ost

How reproducible:
100%

Steps to Reproduce:
1. Deploy OVN-DVR HA.
2. Create a tenant and 2 networks + subnets.
3. Create FIPs.
4. Boot 2 instances, one on each network, and attach the FIPs to them.
5. Allow ICMP, SSH and DNS in the security group assigned to the instances.
6. Set one of the VMs as the DNS server of the other and try to resolve a hostname (www.google.com).

Actual results:
DNS traffic leaves one of the VMs but is not seen on the other.

Expected results:
All allowed traffic should be seen.

Additional info:
sosreports attached
I got access to the environment and did some troubleshooting. Pinging the DNS VM's FIP from the other VM works. I traced the DNS packet through the OpenFlow rules; the output is as follows:

[root@compute-0 ~]# ovs-appctl ofproto/trace br-int in_port=4 fa163e979e54fa163ed18f3608004500003868d6400040118d05c0a863160a2e16eda42400350024fd9005290100000100000000000006676f6f676c6503636f6d0000010001
Flow: udp,in_port=4,vlan_tci=0x0000,dl_src=fa:16:3e:d1:8f:36,dl_dst=fa:16:3e:97:9e:54,nw_src=192.168.99.22,nw_dst=10.46.22.237,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=42020,tp_dst=53

bridge("br-int")
----------------
 0. in_port=4, priority 100
    set_field:0x1->reg13
    set_field:0xc->reg11
    set_field:0x11->reg12
    set_field:0x8->metadata
    set_field:0xd->reg14
    resubmit(,8)
 8. reg14=0xd,metadata=0x8,dl_src=fa:16:3e:d1:8f:36, priority 50, cookie 0x67403e67
    resubmit(,9)
 9. ip,reg14=0xd,metadata=0x8,dl_src=fa:16:3e:d1:8f:36,nw_src=192.168.99.22, priority 90, cookie 0x6a21fbd3
    resubmit(,10)
10. metadata=0x8, priority 0, cookie 0x2cfde4c8
    resubmit(,11)
11. ip,metadata=0x8, priority 100, cookie 0x6e86a67b
    load:0x1->NXM_NX_XXREG0[96]
    resubmit(,12)
12. metadata=0x8, priority 0, cookie 0xd097c17b
    resubmit(,13)
13. ip,reg0=0x1/0x1,metadata=0x8, priority 100, cookie 0x3d5e8cda
    ct(table=14,zone=NXM_NX_REG13[0..15])
    drop
     -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 14.

Final flow: udp,reg0=0x1,reg11=0xc,reg12=0x11,reg13=0x1,reg14=0xd,metadata=0x8,in_port=4,vlan_tci=0x0000,dl_src=fa:16:3e:d1:8f:36,dl_dst=fa:16:3e:97:9e:54,nw_src=192.168.99.22,nw_dst=10.46.22.237,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=42020,tp_dst=53
Megaflow: recirc_id=0,eth,udp,in_port=4,vlan_tci=0x0000/0x1000,dl_src=fa:16:3e:d1:8f:36,nw_src=192.168.99.22,nw_dst=10.46.22.237,nw_frag=no
Datapath actions: ct(zone=1),recirc(0x14fa)

===============================================================================
recirc(0x14fa) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
===============================================================================

Flow: recirc_id=0x14fa,ct_state=new|trk,ct_zone=1,eth,udp,reg0=0x1,reg11=0xc,reg12=0x11,reg13=0x1,reg14=0xd,metadata=0x8,in_port=4,vlan_tci=0x0000,dl_src=fa:16:3e:d1:8f:36,dl_dst=fa:16:3e:97:9e:54,nw_src=192.168.99.22,nw_dst=10.46.22.237,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=42020,tp_dst=53

bridge("br-int")
----------------
    thaw
        Resuming from table 14
14. ct_state=+new-est+trk,ip,reg14=0xd,metadata=0x8, priority 2002, cookie 0xe66b6d51
    load:0x1->NXM_NX_XXREG0[97]
    resubmit(,15)
15. metadata=0x8, priority 0, cookie 0x5d4f6ff6
    resubmit(,16)
16. metadata=0x8, priority 0, cookie 0xce883650
    resubmit(,17)
17. metadata=0x8, priority 0, cookie 0x1636b72d
    resubmit(,18)
18. ip,reg0=0x2/0x2,metadata=0x8, priority 100, cookie 0xfaf2c006
    ct(commit,zone=NXM_NX_REG13[0..15],exec(load:0->NXM_NX_CT_LABEL[0]))
    load:0->NXM_NX_CT_LABEL[0]
    resubmit(,19)
19. metadata=0x8, priority 0, cookie 0x779117cc
    resubmit(,20)
20. metadata=0x8, priority 0, cookie 0xa9f4938c
    resubmit(,21)
21. metadata=0x8, priority 0, cookie 0xc08d5434
    resubmit(,22)
22. udp,metadata=0x8,tp_dst=53, priority 100, cookie 0xf1edfbc1
    controller(userdata=00.00.00.06.00.00.00.00.00.01.de.10.00.00.00.64,pause)

Final flow: recirc_id=0x14fa,eth,udp,reg0=0x3,reg11=0xc,reg12=0x11,reg13=0x1,reg14=0xd,metadata=0x8,in_port=4,vlan_tci=0x0000,dl_src=fa:16:3e:d1:8f:36,dl_dst=fa:16:3e:97:9e:54,nw_src=192.168.99.22,nw_dst=10.46.22.237,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=42020,tp_dst=53
Megaflow: recirc_id=0x14fa,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,udp,in_port=4,dl_src=fa:16:3e:d1:8f:36,nw_dst=0.0.0.0/1,nw_frag=no,tp_dst=53
Datapath actions: ct(commit,zone=1,label=0/0x1),userspace(pid=4294929842,controller(reason=1,dont_send=0,continuation=1,recirc_id=5371,rule_cookie=0xe66b6d51,controller_id=0,max_len=65535))
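For readers who want to check what the hex string given to ofproto/trace actually contains, here is a minimal Python sketch that decodes it with the standard library only. The frame bytes are the ones from the command above, split per protocol layer for readability; the function names are made up for this illustration, and the parser assumes an untagged Ethernet frame carrying IPv4/UDP with a single DNS question.

```python
import socket
import struct

# Frame from the ofproto/trace command line, split per layer for readability.
FRAME_HEX = (
    "fa163e979e54" "fa163ed18f36" "0800"          # Ethernet: dst, src, type
    "4500003868d6400040118d05c0a863160a2e16ed"    # IPv4 header (20 bytes)
    "a42400350024fd90"                            # UDP header
    "05290100000100000000000006676f6f676c6503636f6d0000010001"  # DNS query
)


def mac(raw):
    """Format 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def ipv4_checksum_ok(header):
    """One's-complement sum over the header must fold to 0xFFFF."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


def decode_dns_query_frame(frame_hex):
    """Decode Ethernet/IPv4/UDP/DNS fields (assumes no VLAN tag, one question)."""
    raw = bytes.fromhex(frame_hex)
    eth_dst, eth_src, eth_type = struct.unpack("!6s6sH", raw[:14])
    ip = raw[14:]
    ihl = (ip[0] & 0x0F) * 4                      # IPv4 header length in bytes
    sport, dport, udp_len, _csum = struct.unpack("!HHHH", ip[ihl:ihl + 8])
    dns = ip[ihl + 8:]
    txid, flags, qdcount = struct.unpack("!HHH", dns[:6])
    # Walk the length-prefixed labels of the question name.
    labels, pos = [], 12
    while dns[pos] != 0:
        length = dns[pos]
        labels.append(dns[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    qtype, qclass = struct.unpack("!HH", dns[pos + 1:pos + 5])
    return {
        "eth_dst": mac(eth_dst), "eth_src": mac(eth_src), "eth_type": eth_type,
        "ip_src": socket.inet_ntoa(ip[12:16]),
        "ip_dst": socket.inet_ntoa(ip[16:20]),
        "ttl": ip[8], "proto": ip[9],
        "ip_checksum_ok": ipv4_checksum_ok(ip[:ihl]),
        "sport": sport, "dport": dport, "udp_len": udp_len,
        "qname": ".".join(labels), "qtype": qtype, "qclass": qclass,
    }


if __name__ == "__main__":
    for key, value in decode_dns_query_frame(FRAME_HEX).items():
        print(f"{key}: {value}")
```

The decoded addresses and ports should agree with the "Flow:" line that ofproto/trace prints, which is a quick way to confirm the trace was run with the intended packet.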
Dumping some more info for troubleshooting.

How the packet travels:
VM1 is the origin VM, with MAC fa:16:3e:d1:8f:36 and fixed IP 192.168.99.22.
VM2 is the destination VM, with MAC fa:16:3e:ca:51:a1, fixed IP 192.168.23.12 and FIP 10.46.22.237.

This is the ovn-trace output for a UDP packet going to port 53, i.e. something like a DNS A query:

# udp,reg14=0xd,vlan_tci=0x0000,dl_src=fa:16:3e:d1:8f:36,dl_dst=fa:16:3e:97:9e:54,nw_src=192.168.99.22,nw_dst=10.46.22.237,nw_tos=0,nw_ecn=0,nw_ttl=32,tp_src=0,tp_dst=53

ingress(dp="openshift-ansible-openshift.example.com-net", inport="openshift.example.com-infra_nodes-uemxzyd64o7k-2-x3fovb4iwk32-port-h7g4unkcw4af")
---------------------------------------------------------------------------------------------------------------------------------------------------
 0. ls_in_port_sec_l2 (ovn-northd.c:3869): inport == "openshift.example.com-infra_nodes-uemxzyd64o7k-2-x3fovb4iwk32-port-h7g4unkcw4af" && eth.src == {fa:16:3e:d1:8f:36}, priority 50, uuid 67403e67
    next;
 1. ls_in_port_sec_ip (ovn-northd.c:2851): inport == "openshift.example.com-infra_nodes-uemxzyd64o7k-2-x3fovb4iwk32-port-h7g4unkcw4af" && eth.src == fa:16:3e:d1:8f:36 && ip4.src == {192.168.99.22}, priority 90, uuid 6a21fbd3
    next;
 3. ls_in_pre_acl (ovn-northd.c:3152): ip, priority 100, uuid 6e86a67b
    reg0[0] = 1;
    next;
 5. ls_in_pre_stateful (ovn-northd.c:3289): reg0[0] == 1, priority 100, uuid 3d5e8cda
    ct_next;

ct_next(ct_state=est|trk /* default (use --ct to customize) */)
---------------------------------------------------------------
 6. ls_in_acl (ovn-northd.c:3497): !ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 && (inport == "openshift.example.com-infra_nodes-uemxzyd64o7k-2-x3fovb4iwk32-port-h7g4unkcw4af" && ip4), priority 2002, uuid c4545831
    next;
14. ls_in_dns_lookup (ovn-northd.c:4129): udp.dst == 53, priority 100, uuid f1edfbc1
    reg0[4] = dns_lookup();
    *** dns_lookup action not implemented
    next;
16. ls_in_l2_lkup (ovn-northd.c:4263): eth.dst == fa:16:3e:97:9e:54, priority 50, uuid 53f27953
    outport = "d652c4";
    output;

egress(dp="openshift-ansible-openshift.example.com-net", inport="openshift.example.com-infra_nodes-uemxzyd64o7k-2-x3fovb4iwk32-port-h7g4unkcw4af", outport="d652c4")
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 1. ls_out_pre_acl (ovn-northd.c:3111): ip && outport == "d652c4", priority 110, uuid 58adc831
    next;
 9. ls_out_port_sec_l2 (ovn-northd.c:4346): outport == "d652c4", priority 50, uuid 59f4dc46
    output;
    /* output to "d652c4", type "patch" */

ingress(dp="openshift-ansible-openshift.example.com-router", inport="lrp-d652c4")
---------------------------------------------------------------------------------
 0. lr_in_admission (ovn-northd.c:4892): eth.dst == fa:16:3e:97:9e:54 && inport == "lrp-d652c4", priority 50, uuid 51faa532
    next;
 7. lr_in_ip_routing (ovn-northd.c:4474): ip4.dst == 10.46.22.192/26, priority 53, uuid eb1bc3b6
    ip.ttl--;
    reg0 = ip4.dst;
    reg1 = 10.46.22.227;
    eth.src = fa:16:3e:48:ab:52;
    outport = "lrp-ac16ea";
    flags.loopback = 1;
    next;
 8. lr_in_arp_resolve (ovn-northd.c:6198): ip4, priority 0, uuid 7c3bb779
    get_arp(outport, reg0);
    /* MAC binding to fa:16:3e:1c:7d:58. */
    next;
 9. lr_in_gw_redirect (ovn-northd.c:5655): ip4.src == 192.168.99.22 && outport == "lrp-ac16ea", priority 100, uuid 455088b9
    next;
10. lr_in_arp_request (ovn-northd.c:6305): 1, priority 0, uuid 99698a3c
    output;

egress(dp="openshift-ansible-openshift.example.com-router", inport="lrp-d652c4", outport="lrp-ac16ea")
------------------------------------------------------------------------------------------------------
 0. lr_out_undnat (ovn-northd.c:5577): ip && ip4.src == 192.168.99.22 && outport == "lrp-ac16ea", priority 100, uuid 4803e23a
    eth.src = fa:16:3e:42:7d:33;
    ct_dnat;

ct_dnat /* assuming no un-dnat entry, so no change */
-----------------------------------------------------
 1. lr_out_snat (ovn-northd.c:5624): ip && ip4.src == 192.168.99.22 && outport == "lrp-ac16ea", priority 33, uuid 51ff932f
    eth.src = fa:16:3e:42:7d:33;
    ct_snat(10.46.22.246);

ct_snat(ip4.src=10.46.22.246)
-----------------------------
 3. lr_out_delivery (ovn-northd.c:6333): outport == "lrp-ac16ea", priority 100, uuid 66203690
    output;
    /* output to "lrp-ac16ea", type "patch" */

ingress(dp="nova", inport="ac16ea")
-----------------------------------
 0. ls_in_port_sec_l2 (ovn-northd.c:3869): inport == "ac16ea", priority 50, uuid 187c129c
    next;
16. ls_in_l2_lkup (ovn-northd.c:4287): eth.dst == fa:16:3e:1c:7d:58 && is_chassis_resident("8e829c"), priority 50, uuid b641caf6
    outport = "34960d";
    output;

egress(dp="nova", inport="ac16ea", outport="34960d")
----------------------------------------------------
 9. ls_out_port_sec_l2 (ovn-northd.c:4346): outport == "34960d", priority 50, uuid 5903326c
    output;
    /* output to "34960d", type "patch" */

ingress(dp="openshift_dns", inport="lrp-34960d")
------------------------------------------------
 0. lr_in_admission (ovn-northd.c:5642): eth.dst == fa:16:3e:1c:7d:58 && inport == "lrp-34960d" && is_chassis_resident("8e829c"), priority 50, uuid b3c9fb5d
    next;
 3. lr_in_unsnat (ovn-northd.c:5477): ip && ip4.dst == 10.46.22.237 && inport == "lrp-34960d", priority 100, uuid 041c713b
    ct_snat;

ct_snat /* assuming no un-snat entry, so no change */
-----------------------------------------------------
 4. lr_in_dnat (ovn-northd.c:5535): ip && ip4.dst == 10.46.22.237 && inport == "lrp-34960d", priority 100, uuid c04edae5
    ct_dnat(192.168.23.12);

ct_dnat(ip4.dst=192.168.23.12)
------------------------------
 7. lr_in_ip_routing (ovn-northd.c:4474): ip4.dst == 192.168.23.0/24, priority 49, uuid 26cf3eea
    ip.ttl--;
    reg0 = ip4.dst;
    reg1 = 192.168.23.1;
    eth.src = fa:16:3e:20:18:e4;
    outport = "lrp-1f7585";
    flags.loopback = 1;
    next;
 8. lr_in_arp_resolve (ovn-northd.c:6091): outport == "lrp-1f7585" && reg0 == 192.168.23.12, priority 100, uuid 881813c2
    eth.dst = fa:16:3e:ca:51:a1;
    next;
10. lr_in_arp_request (ovn-northd.c:6305): 1, priority 0, uuid cdff6673
    output;

egress(dp="openshift_dns", inport="lrp-34960d", outport="lrp-1f7585")
---------------------------------------------------------------------
 3. lr_out_delivery (ovn-northd.c:6333): outport == "lrp-1f7585", priority 100, uuid fd6cd974
    output;
    /* output to "lrp-1f7585", type "patch" */

ingress(dp="openshift_dns", inport="1f7585")
--------------------------------------------
 0. ls_in_port_sec_l2 (ovn-northd.c:3869): inport == "1f7585", priority 50, uuid 8c591e40
    next;
 3. ls_in_pre_acl (ovn-northd.c:3109): ip && inport == "1f7585", priority 110, uuid 326c783d
    next;
14. ls_in_dns_lookup (ovn-northd.c:4129): udp.dst == 53, priority 100, uuid 56b5a6e2
    reg0[4] = dns_lookup();
    *** dns_lookup action not implemented
    next;
16. ls_in_l2_lkup (ovn-northd.c:4202): eth.dst == fa:16:3e:ca:51:a1, priority 50, uuid 7e4e7f82
    outport = "8e829c";
    output;

egress(dp="openshift_dns", inport="1f7585", outport="8e829c")
-------------------------------------------------------------
 1. ls_out_pre_acl (ovn-northd.c:3154): ip, priority 100, uuid 2b6362c4
    reg0[0] = 1;
    next;
 2. ls_out_pre_stateful (ovn-northd.c:3291): reg0[0] == 1, priority 100, uuid 65e4c7a6
    ct_next;

ct_next(ct_state=est|trk /* default (use --ct to customize) */)
---------------------------------------------------------------
 4. ls_out_acl (ovn-northd.c:3497): !ct.new && ct.est && !ct.rpl && ct_label.blocked == 0 && (outport == "8e829c" && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 53), priority 2002, uuid f1798f66
    next;
 8. ls_out_port_sec_ip (ovn-northd.c:2851): outport == "8e829c" && eth.dst == fa:16:3e:ca:51:a1 && ip4.dst == {255.255.255.255, 224.0.0.0/4, 192.168.23.12}, priority 90, uuid ad465931
    next;
 9. ls_out_port_sec_l2 (ovn-northd.c:4346): outport == "8e829c" && eth.dst == {fa:16:3e:ca:51:a1}, priority 50, uuid d6404cee
    output;
    /* output to "8e829c", type "" */

Datapath flows on the compute node after the DNS query is sent:

recirc_id(0),in_port(6),eth(src=8e:8b:75:56:3a:4e,dst=f6:d7:83:a7:eb:9e),eth_type(0x8100),vlan(vid=130,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:77400, bytes:5717223, used:0.510s, flags:SFPR., actions:pop_vlan,7
recirc_id(0),in_port(7),eth(src=f6:d7:83:a7:eb:9e,dst=5a:93:92:20:30:3c),eth_type(0x0800),ipv4(frag=no), packets:68952, bytes:27545264, used:0.070s, flags:SFP., actions:push_vlan(vid=130,pcp=0),6
recirc_id(0x4985),in_port(10),ct_state(-new+est-rel+rpl-inv+trk),ct_label(0/0x1),eth(src=fa:16:3e:d1:8f:36,dst=fa:16:3e:97:9e:54),eth_type(0x0800),ipv4(src=192.168.99.22,dst=10.46.22.194,proto=6,ttl=64,frag=no), packets:7, bytes:874, used:2.130s, flags:P., actions:ct_clear,set(eth(src=fa:16:3e:42:7d:33,dst=52:54:00:52:cc:e3)),set(ipv4(src=192.168.99.22,dst=10.46.22.194,ttl=63)),ct(zone=8,nat),recirc(0x4986)
recirc_id(0x49a3),in_port(10),eth_type(0x0800),ipv4(dst=10.46.22.237,frag=no), packets:0, bytes:0, used:never, actions:ct(commit,zone=2,nat(dst=192.168.23.12)),recirc(0x49a4)
recirc_id(0x49a4),in_port(10),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=fa:16:3e:42:7d:33,dst=fa:16:3e:1c:7d:58),eth_type(0x0800),ipv4(dst=192.168.23.12,proto=17,ttl=63,frag=no),udp(dst=53), packets:0, bytes:0, used:never, actions:ct_clear,set(eth(src=fa:16:3e:20:18:e4,dst=fa:16:3e:ca:51:a1)),set(ipv4(dst=192.168.23.12,ttl=62)),userspace(pid=4294963040,controller(reason=1,dont_send=0,continuation=1,recirc_id=18853,rule_cookie=0,controller_id=0,max_len=65535))
recirc_id(0),in_port(9),eth(src=8e:9f:8a:f2:03:76,dst=76:8e:e8:5c:b4:27),eth_type(0x0806), packets:1, bytes:42, used:8.758s, actions:push_vlan(vid=133,pcp=0),6
recirc_id(0),in_port(6),eth(src=5a:93:92:20:30:3c,dst=f6:d7:83:a7:eb:9e),eth_type(0x8100),vlan(vid=130,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:70527, bytes:5436434, used:0.070s, flags:SPR., actions:pop_vlan,7
recirc_id(0x4982),in_port(4),eth_type(0x0800),ipv4(dst=10.46.22.246,frag=no), packets:14, bytes:1148, used:2.130s, flags:P., actions:ct(commit,zone=8,nat(dst=192.168.99.22)),recirc(0x4983)
recirc_id(0),in_port(10),eth(src=fa:16:3e:d1:8f:36),eth_type(0x0800),ipv4(src=192.168.99.22,dst=10.46.22.192/255.255.255.224,frag=no), packets:7, bytes:874, used:2.130s, flags:P., actions:ct(zone=1),recirc(0x4985)
recirc_id(0),in_port(6),eth(src=0e:e1:fd:1f:30:26,dst=8e:9f:8a:f2:03:76),eth_type(0x8100),vlan(vid=133,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:80331, bytes:9639720, used:0.203s, actions:pop_vlan,9
recirc_id(0),in_port(4),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=2c:21:31:e3:8f:00,dst=01:00:5e:00:00:0d),eth_type(0x0800),ipv4(src=10.46.22.252/255.255.255.254,dst=224.0.0.0/240.0.0.0,frag=no), packets:0, bytes:0, used:never, actions:3,ct_clear,ct_clear,ct_clear
recirc_id(0),in_port(6),eth(src=76:8e:e8:5c:b4:27,dst=8e:9f:8a:f2:03:76),eth_type(0x8100),vlan(vid=133,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:80286, bytes:9634320, used:0.890s, actions:pop_vlan,9
recirc_id(0),in_port(15),eth(src=fa:16:3e:05:4b:95),eth_type(0x0800),ipv4(src=192.168.99.7,dst=10.46.22.237,proto=17,frag=no), packets:14, bytes:1282, used:2.912s, actions:ct(zone=25),recirc(0x4969)
recirc_id(0),in_port(7),eth(src=f6:d7:83:a7:eb:9e,dst=c2:f0:53:71:f8:52),eth_type(0x0800),ipv4(frag=no), packets:164360, bytes:61925920, used:0.543s, flags:SFP., actions:push_vlan(vid=130,pcp=0),6
recirc_id(0),in_port(9),eth(src=8e:9f:8a:f2:03:76,dst=0e:e1:fd:1f:30:26),eth_type(0x0800),ipv4(frag=no), packets:80362, bytes:9321992, used:0.197s, actions:push_vlan(vid=133,pcp=0),6
recirc_id(0),in_port(4),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=00:00:5e:00:02:01,dst=33:33:00:00:00:12),eth_type(0x86dd),ipv6(src=fe80:52:0:2e16::fd,dst=ff02::12,proto=112,hlimit=255,frag=no), packets:80175, bytes:7536450, used:0.811s, actions:3,ct_clear,ct_clear,ct_clear
recirc_id(0x4984),in_port(4),ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=fa:16:3e:97:9e:54,dst=fa:16:3e:d1:8f:36),eth_type(0x0800),ipv4(src=0.0.0.0/128.0.0.0,dst=192.168.99.22,proto=6,frag=no),tcp(dst=22), packets:14, bytes:1148, used:2.130s, flags:P., actions:10
recirc_id(0),in_port(7),eth(src=f6:d7:83:a7:eb:9e,dst=8e:8b:75:56:3a:4e),eth_type(0x0800),ipv4(frag=no), packets:82147, bytes:29259772, used:0.511s, flags:SFP., actions:push_vlan(vid=130,pcp=0),6
recirc_id(0x496f),in_port(15),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=00:00:00:00:00:00/01:00:00:00:00:00,dst=fa:16:3e:1c:7d:58),eth_type(0x0800),ipv4(src=10.46.22.230/255.255.255.254,dst=10.46.22.237,proto=17,ttl=63,frag=no), packets:14, bytes:1282, used:2.911s, actions:ct_clear,ct_clear,ct(zone=7,nat),recirc(0x4971)
recirc_id(0x4972),in_port(15),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=fa:16:3e:70:24:cf,dst=fa:16:3e:1c:7d:58),eth_type(0x0800),ipv4(dst=192.168.23.12,proto=17,ttl=63,frag=no),udp(dst=53), packets:14, bytes:1282, used:2.911s, actions:ct_clear,set(eth(src=fa:16:3e:20:18:e4,dst=fa:16:3e:ca:51:a1)),set(ipv4(dst=192.168.23.12,ttl=62)),userspace(pid=4294963040,controller(reason=1,dont_send=0,continuation=1,recirc_id=18803,rule_cookie=0,controller_id=0,max_len=65535))
recirc_id(0),in_port(6),eth(src=76:8e:e8:5c:b4:27,dst=8e:9f:8a:f2:03:76),eth_type(0x8100),vlan(vid=133,pcp=0),encap(eth_type(0x0806)), packets:1, bytes:64, used:8.758s, actions:pop_vlan,9
recirc_id(0x4969),in_port(15),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=fa:16:3e:05:4b:95),eth_type(0x0800),ipv4(dst=0.0.0.0/128.0.0.0,proto=17,frag=no),udp(dst=53), packets:14, bytes:1282, used:2.912s, actions:ct(commit,zone=25,label=0/0x1),userspace(pid=4294929648,controller(reason=1,dont_send=0,continuation=1,recirc_id=18796,rule_cookie=0xa81974bb,controller_id=0,max_len=65535))
recirc_id(0),in_port(4),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=00:00:5e:00:01:01,dst=01:00:5e:00:00:12),eth_type(0x0800),ipv4(src=10.46.22.252/255.255.255.254,dst=224.0.0.0/240.0.0.0,frag=no), packets:80179, bytes:4810740, used:0.304s, actions:3,ct_clear,ct_clear,ct_clear
recirc_id(0),in_port(4),eth(src=4c:16:fc:b0:3c:02,dst=01:80:c2:00:00:00),eth_type(0/0xffff), packets:38845, bytes:2330700, used:1.258s, actions:drop
recirc_id(0x4986),in_port(10),ct_state(-new+est-rel+rpl-inv+trk),ct_label(0/0x1),eth(src=fa:16:3e:42:7d:33,dst=52:54:00:52:cc:e3),eth_type(0x0800),ipv4(src=0.0.0.0/128.0.0.0,dst=10.46.22.192/255.255.255.224,frag=no), packets:7, bytes:874, used:2.130s, flags:P., actions:ct_clear,ct_clear,ct_clear,4
recirc_id(0x4983),in_port(4),ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=52:54:00:52:cc:e3,dst=fa:16:3e:42:7d:33),eth_type(0x0800),ipv4(dst=192.168.99.22,proto=6,ttl=64,frag=no), packets:14, bytes:1148, used:2.130s, flags:P., actions:ct_clear,set(eth(src=fa:16:3e:97:9e:54,dst=fa:16:3e:d1:8f:36)),set(ipv4(dst=192.168.99.22,ttl=63)),ct(zone=1),recirc(0x4984)
recirc_id(0),tunnel(tun_id=0x0,src=172.17.2.25,dst=172.17.2.18,flags(-df+csum+key)),in_port(1),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784), packets:80280, bytes:5298480, used:0.552s, actions:userspace(pid=4294963040,slow_path(bfd))
recirc_id(0x49a2),in_port(10),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=00:00:00:00:00:00/01:00:00:00:00:00,dst=fa:16:3e:1c:7d:58),eth_type(0x0800),ipv4(src=10.46.22.240/255.255.255.248,dst=10.46.22.237,proto=17,ttl=63,frag=no), packets:0, bytes:0, used:never, actions:ct_clear,ct_clear,ct(zone=7,nat),recirc(0x49a3)
recirc_id(0x4985),in_port(10),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=fa:16:3e:d1:8f:36),eth_type(0x0800),ipv4(dst=0.0.0.0/128.0.0.0,proto=17,frag=no),udp(dst=53), packets:0, bytes:0, used:never, actions:ct(commit,zone=1,label=0/0x1),userspace(pid=4294929842,controller(reason=1,dont_send=0,continuation=1,recirc_id=18848,rule_cookie=0xe66b6d51,controller_id=0,max_len=65535))
recirc_id(0),in_port(6),eth(src=52:54:00:5e:aa:eb,dst=14:02:ec:7c:88:31),eth_type(0x0800),ipv4(frag=no), packets:107441, bytes:968156388, used:0.011s, flags:SFP., actions:8
recirc_id(0),in_port(9),eth(src=8e:9f:8a:f2:03:76,dst=32:e9:86:dd:6f:36),eth_type(0x0800),ipv4(frag=no), packets:80325, bytes:9317700, used:0.595s, actions:push_vlan(vid=133,pcp=0),6
recirc_id(0x4971),in_port(15),eth_type(0x0800),ipv4(dst=10.46.22.237,frag=no), packets:14, bytes:1282, used:2.911s, actions:ct(commit,zone=2,nat(dst=192.168.23.12)),recirc(0x4972)
recirc_id(0),in_port(9),eth(src=8e:9f:8a:f2:03:76,dst=76:8e:e8:5c:b4:27),eth_type(0x0800),ipv4(frag=no), packets:80364, bytes:9322224, used:0.685s, actions:push_vlan(vid=133,pcp=0),6
recirc_id(0),tunnel(tun_id=0x0,src=172.17.2.21,dst=172.17.2.18,flags(-df+csum+key)),in_port(1),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784), packets:80286, bytes:5298876, used:0.890s, actions:userspace(pid=4294963040,slow_path(bfd))
recirc_id(0),in_port(10),eth(src=fa:16:3e:d1:8f:36),eth_type(0x0800),ipv4(src=192.168.99.22,dst=10.46.22.237,proto=17,frag=no), packets:0, bytes:0, used:never, actions:ct(zone=1),recirc(0x4985)
recirc_id(0),in_port(8),eth(src=14:02:ec:7c:88:31,dst=52:54:00:5e:aa:eb),eth_type(0x0800),ipv4(frag=no), packets:102046, bytes:8485106, used:0.011s, flags:SFP., actions:6
recirc_id(0),in_port(6),eth(src=c2:f0:53:71:f8:52,dst=f6:d7:83:a7:eb:9e),eth_type(0x8100),vlan(vid=130,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:149344, bytes:182007153, used:0.543s, flags:SFPR., actions:pop_vlan,7
recirc_id(0),in_port(4),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=52:54:00:52:cc:e3,dst=fa:16:3e:42:7d:33),eth_type(0x0800),ipv4(src=10.46.22.192/255.255.255.224,dst=10.46.22.246,proto=6,ttl=64,frag=no), packets:14, bytes:1148, used:2.130s, flags:P., actions:ct_clear,ct(zone=15,nat),recirc(0x4982)
recirc_id(0),tunnel(tun_id=0x0,src=172.17.2.14,dst=172.17.2.18,flags(-df+csum+key)),in_port(1),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784), packets:80331, bytes:5301846, used:0.203s, actions:userspace(pid=4294963040,slow_path(bfd))
recirc_id(0),in_port(6),eth(src=32:e9:86:dd:6f:36,dst=8e:9f:8a:f2:03:76),eth_type(0x8100),vlan(vid=133,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:80280, bytes:9633600, used:0.552s, actions:pop_vlan,9
[root@compute-0 ~]# ovs-dpctl show
system@ovs-system:
  lookups: hit:1860249 missed:66563 lost:0
  flows: 29
  masks: hit:14810227 total:8 hit/pkt:7.69
  port 0: ovs-system (internal)
  port 1: genev_sys_6081 (geneve: packet_type=ptap)
  port 2: br-int (internal)
  port 3: br-ex (internal)
  port 4: ens1f0
  port 5: vlan132 (internal)
  port 6: ens1f1
  port 7: vlan130 (internal)
  port 8: br-isolated (internal)
  port 9: vlan133 (internal)
  port 10: tap5396479b-5a   <------ this is source port
  port 11: tap42a0a518-b9
  port 12: tapb74b4195-be
  port 13: tapf778d143-a0
  port 14: tap77f9b441-24
  port 15: tap82eb692b-b8
  port 16: tapb3377ee0-49
  port 17: tapb87046f7-15
  port 18: tapa5bd6940-80
  port 19: tap10c031e3-5a
  port 20: tap8e829c5c-71   <------ this is destination port
  port 21: tap91f57097-40
  port 22: tapa4de0c67-50
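The in_port(N) numbers in the datapath flows refer to the port numbers listed by ovs-dpctl show. A small Python sketch like the following can turn that listing into a lookup table, so that flows can be read with interface names instead of numbers. The sample text is an excerpt of the output above; the function name and the loose regular expression (which ignores line breaks, so it also works on output pasted as one line) are choices made for this illustration only.

```python
import re

# Excerpt of the `ovs-dpctl show` output from this comment.
DPCTL_SHOW = """\
system@ovs-system:
  port 1: genev_sys_6081 (geneve: packet_type=ptap)
  port 4: ens1f0
  port 10: tap5396479b-5a
  port 15: tap82eb692b-b8
  port 20: tap8e829c5c-71
"""

PORT_RE = re.compile(r"port (\d+): (\S+)")


def datapath_ports(show_text):
    """Map datapath port number -> interface name."""
    return {int(num): name for num, name in PORT_RE.findall(show_text)}


def name_in_port(flow, ports):
    """Replace in_port(N) in a datapath flow with in_port(N=<name>) when known."""
    def repl(match):
        num = int(match.group(1))
        return f"in_port({num}={ports[num]})" if num in ports else match.group(0)
    return re.sub(r"in_port\((\d+)\)", repl, flow)


if __name__ == "__main__":
    ports = datapath_ports(DPCTL_SHOW)
    print(name_in_port("recirc_id(0),in_port(10),eth(src=fa:16:3e:d1:8f:36)", ports))
```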
The problem seems to be in the controller() action that performs dns_lookup() for the second time, in the DNS network. Although the packet is resumed into the pipeline, it gets lost and is actually dropped in the datapath:

recirc_id(0x22e),dp_hash(0),skb_priority(0),in_port(0/0xffff0000),skb_mark(0),ct_state(+new-est-rel-rpl-inv+trk-snat-dnat),ct_zone(0x1e),ct_mark(0),ct_label(0),ct_tuple4(src=10.46.22.246,dst=192.168.23.12,proto=17,tp_src=47500,tp_dst=53),eth(src=fa:16:3e:20:18:e4,dst=fa:16:3e:ca:51:a1),eth_type(0x0800),ipv4(src=10.46.22.246,dst=192.168.23.12,proto=17,tos=0,ttl=62,frag=no),udp(src=47500,dst=53), packets:0, bytes:0, used:never, actions:drop
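To spot this kind of flow without reading the whole dump, a sketch along these lines can pick the DNS flows (UDP destination port 53) out of datapath flow output and label what happens to them. The sample lines are taken from the dumps in this bug; the regular expressions, the function names and the three labels ("drop", "punt", "forward") are assumptions made for the example, not part of any OVS tool.

```python
import re

# Three datapath flows from this bug: a DNS flow punted to ovn-controller,
# the resumed DNS flow that is dropped, and an unrelated VLAN flow.
SAMPLE_DUMP = """\
recirc_id(0x4972),in_port(15),ct_state(+new-est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=fa:16:3e:70:24:cf,dst=fa:16:3e:1c:7d:58),eth_type(0x0800),ipv4(dst=192.168.23.12,proto=17,ttl=63,frag=no),udp(dst=53), packets:14, bytes:1282, used:2.911s, actions:ct_clear,set(eth(src=fa:16:3e:20:18:e4,dst=fa:16:3e:ca:51:a1)),set(ipv4(dst=192.168.23.12,ttl=62)),userspace(pid=4294963040,controller(reason=1,dont_send=0,continuation=1,recirc_id=18803,rule_cookie=0,controller_id=0,max_len=65535))
recirc_id(0x22e),dp_hash(0),skb_priority(0),in_port(0/0xffff0000),skb_mark(0),ct_state(+new-est-rel-rpl-inv+trk-snat-dnat),ct_zone(0x1e),ct_mark(0),ct_label(0),ct_tuple4(src=10.46.22.246,dst=192.168.23.12,proto=17,tp_src=47500,tp_dst=53),eth(src=fa:16:3e:20:18:e4,dst=fa:16:3e:ca:51:a1),eth_type(0x0800),ipv4(src=10.46.22.246,dst=192.168.23.12,proto=17,tos=0,ttl=62,frag=no),udp(src=47500,dst=53), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(7),eth(src=f6:d7:83:a7:eb:9e,dst=5a:93:92:20:30:3c),eth_type(0x0800),ipv4(frag=no), packets:68952, bytes:27545264, used:0.070s, flags:SFP., actions:push_vlan(vid=130,pcp=0),6
"""

# "<match>, packets:N, bytes:N, used:T, [flags:F, ]actions:<actions>"
FLOW_RE = re.compile(
    r"^(?P<match>.*?), packets:(?P<packets>\d+), bytes:(?P<bytes>\d+), "
    r"used:(?P<used>[^,]+),(?: flags:[^,]+,)? actions:(?P<actions>.*)$"
)
DNS_RE = re.compile(r"udp\([^)]*dst=53\)")


def parse_dp_flow(line):
    """Split one datapath flow line into match, packet count and actions."""
    m = FLOW_RE.match(line.strip())
    if m is None:
        return None
    return {"match": m["match"], "packets": int(m["packets"]),
            "actions": m["actions"]}


def classify(actions):
    """Label a flow by what its actions do with the packet."""
    if actions == "drop":
        return "drop"
    if "controller(" in actions:
        return "punt"          # handed to ovn-controller in userspace
    return "forward"


def dns_flows(dump_text):
    """Return (label, packets) for every flow matching UDP dst port 53."""
    result = []
    for line in dump_text.splitlines():
        flow = parse_dp_flow(line)
        if flow and DNS_RE.search(flow["match"]):
            result.append((classify(flow["actions"]), flow["packets"]))
    return result


if __name__ == "__main__":
    for label, packets in dns_flows(SAMPLE_DUMP):
        print(f"{label:8} packets={packets}")
```

A "punt" flow followed by a "drop" flow for the same addresses is the pattern described here: the query is sent to the controller for dns_lookup() and the resumed packet is then dropped.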
I talked with Udi and he said they don't use Neutron DNS in OCP. If Neutron DNS is turned off (by setting dns_domain to openstacklocal), the issue is mitigated.

I also see this is marked as a Regression. Udi, do you have a version where this used to work? I thought the bug had been there all along.
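To check which state a deployment is in, something like the following Python sketch can read the dns_domain setting from a Neutron configuration file. It assumes the option lives in the [DEFAULT] section of neutron.conf and that an unset option means openstacklocal; both should be checked against the Neutron documentation for the release in use. The function names and sample snippets are made up for this example.

```python
import configparser

# Hypothetical neutron.conf snippets for the two cases discussed here.
CONF_DNS_OFF = "[DEFAULT]\ndns_domain = openstacklocal\n"
CONF_DNS_ON = "[DEFAULT]\ndns_domain = example.org.\n"


def configured_dns_domain(conf_text):
    """Return dns_domain from [DEFAULT], or 'openstacklocal' when unset (assumed default)."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get("DEFAULT", "dns_domain", fallback="openstacklocal")


def workaround_in_place(conf_text):
    """True when dns_domain is openstacklocal, i.e. the mitigation described above."""
    return configured_dns_domain(conf_text).rstrip(".") == "openstacklocal"


if __name__ == "__main__":
    for label, text in (("off", CONF_DNS_OFF), ("on", CONF_DNS_ON)):
        print(label, configured_dns_domain(text), workaround_in_place(text))
```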
(In reply to Jakub Libosvar from comment #10)
> I talked with Udi and he said they don't use Neutron DNS in OCP. If Neutron
> DNS is turned off (setting dns_domain to openstacklocal) the issue is
> mitigated.
>
> I also see this is marked as Regression: Udi do you have a version where
> this used to work? I thought the bug has been there since ever.

Just to emphasize: we don't use Neutron DNS in OpenShift 3.x, but OpenShift 4.x deployments do use Neutron DNS.
I marked it as a regression since Shelley asked me to add the keyword. You can drop the regression keyword, but this issue can affect customers as a blocker.
We have a workaround (not using the DNS domain), but I'm not sure it will be applicable to customer deployments.
Hi team, What z-stream release is this targeted for?
Cannot be tested on the latest OSP13 with openvswitch 2.11:
http://rhos-qe-mirror-tlv.usersys.redhat.com/rcm-guest/puddles/OpenStack/13.0-RHEL-7/2019-10-18.1/

It uses openvswitch2.11-2.11.0-21.el7fdp.x86_64 and not openvswitch2.11-2.11.0-26.el8fdp
Hello.

According to [1], we have already released the openvswitch2.11-2.11.0-26.el7fdp.x86_64.rpm package. Can we triage this bug?

[1] https://access.redhat.com/downloads/content/rhel---7/x86_64/6671/openvswitch2.11/2.11.0-26.el7fdp/x86_64/fd431d51/package

Regards,
Alex.
Verified on OSP13 puddle 2020-02-10.8
openvswitch2.11-2.11.0-35.el7fdp.x86_64

Verification procedure:
1- Create two different tenant networks and one subnet for each.
2- Create a router that connects these subnets.
3- Create a security rule (SR) for ingress DNS traffic (UDP port 53).
4- Create two servers, vm1 and vm2, on different subnets and with the previous SR.
5- Run tcpdump on both servers: tcpdump -n -i ens3 udp and port 53
6- On vm2, run a script that listens on UDP port 53 and answers: echo -n -e "wrong response!" | sudo nc -u -w1 -l 53
7- On vm1, send a DNS query towards vm2: host foo.com <vm2_ip_address>

Check that queries are received at vm2 and responses are received at vm1 (although the responses will not be a valid answer for the host command).

I have also verified (THANKS, JAKUB) that the flows whose n_packets are incremented are these on the source compute:
cookie=0xc8711a64, duration=8642.817s, table=22, n_packets=222, n_bytes=18558, idle_age=306, priority=100,udp,metadata=0x1d8,tp_dst=53 actions=controller(userdata=00.00.00.06.00.00.00.00.00.01.de.10.00.00.00.64,pause),resubmit(,23)
cookie=0x66352cc2, duration=8630.177s, table=22, n_packets=60, n_bytes=4020, idle_age=306, priority=100,udp,metadata=0x1d9,tp_dst=53 actions=controller(userdata=00.00.00.06.00.00.00.00.00.01.de.10.00.00.00.64,pause),resubmit(,23)

And this one on the destination compute:
cookie=0x8076f5ed, duration=8706.927s, table=44, n_packets=52, n_bytes=3484, idle_age=389, priority=2002,ct_state=+new-est+trk,udp,reg15=0x3,metadata=0x1d9,tp_dst=53 actions=load:0x1->NXM_NX_XXREG0[97],resubmit(,45)

They have metadata=0x1d8 and 0x1d9, which correspond to the two tenant networks previously created (tunnel_keys=472 and 473 respectively).

So we now see that, after OVN returns no answer for the dns_lookup call, the packet is forwarded to its destination address.
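Steps 6 and 7 of the procedure can also be approximated in Python when nc or host is not available on the guests. This is only a sketch of the same idea: one side answers any UDP datagram with a dummy payload, the other sends a hand-built DNS A query and waits for any reply. The function names, the transaction ID and the loopback demo on an ephemeral port are assumptions for the example; on real VMs the responder would bind to port 53 (which needs root) and the probe would target vm2's address.

```python
import socket
import struct
import threading


def build_dns_query(name, txid=0x1234):
    """Build a minimal DNS A/IN query for `name` (recursion desired)."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)


def answer_once(sock, payload=b"wrong response!"):
    """Wait for one datagram and reply with a dummy payload (like the nc step)."""
    query, peer = sock.recvfrom(512)
    sock.sendto(payload, peer)
    return query


def probe(server_addr, name="foo.com", timeout=2.0):
    """Send one DNS query and return the raw reply, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_dns_query(name), server_addr)
        try:
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            return None
        return data


def loopback_demo():
    """Run responder and probe against each other on 127.0.0.1."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as srv:
        srv.bind(("127.0.0.1", 0))      # ephemeral port instead of 53
        srv.settimeout(2.0)
        seen = []
        worker = threading.Thread(target=lambda: seen.append(answer_once(srv)))
        worker.start()
        reply = probe(srv.getsockname())
        worker.join()
    return seen[0], reply


if __name__ == "__main__":
    query, reply = loopback_demo()
    print(f"responder saw {len(query)} bytes, client got {reply!r}")
```

As with the nc-based check, a None result from probe() on the client side while tcpdump shows nothing on the server side would point at the query being lost in between.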
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0769