Bug 1685642 - Connectivity issue across VXLAN tunnels in OVS-DPDK after reboot of hypervisor - problem clears up after restarting openvswitch
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch
Version: RHEL 7.6
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Flavio Leitner
QA Contact: qding
URL:
Whiteboard:
Depends On:
Blocks: 1740845 1759262 1759334 1759707
Reported: 2019-03-05 17:18 UTC by Andreas Karis
Modified: 2023-10-06 18:13 UTC (History)

Fixed In Version: openvswitch-2.9.0-119.el7fdn
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1740845 1758824 1759262 1759334 1759707
Environment:
Last Closed: 2019-11-06 04:18:21 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-37 0 None None None 2023-10-06 18:13:19 UTC
Red Hat Product Errata RHBA-2019:3717 0 None None None 2019-11-06 04:18:40 UTC

Description Andreas Karis 2019-03-05 17:18:20 UTC
Affected OVS versions:
OVS 2.9, minor -19 to minor -98

Summary of test setup:
----------------------

2 compute nodes (hypervisors) with correct DPDK configuration (as far as I can tell) and VXLAN tunnels between hypervisors.

We run a successful test between instance 57 on compute 57 and instance 67 on compute 67. 
~~~
ping 192.168.0.12
~~~
This test is across a VXLAN tenant network. 

We run a second successful test from instance 57 across netns `vlan` in the instance, pinging instance 67-2 on compute 67.
~~~
ip netns exec vlan ping 192.168.0.4
~~~
This test is across an unrelated VLAN tenant network (both just happen to share the same subnet).

We reboot hypervisor 57 (and with it the instance). When the hypervisor comes back, we start instance 57. Now, we repeat the same 2 tests. We cannot ping across the VXLAN network to 192.168.0.12 (on subnet vxlan, target instance 67). We can ping across the VLAN network to 192.168.0.4 (on subnet vlan, target instance 67-2).

After we restart openvswitch on hypervisor 57, instance 57 can ping to instance 67 via the VXLAN tunnel again.

Comment 16 Flavio Leitner 2019-03-08 20:23:46 UTC
Hi Andreas, 

Thanks for all the tests done so far. I am trying to follow up and see if I can spot something.
However, I am curious whether this setup worked before and the issue appeared at some point, or whether this is a new deployment.
If it worked before, we can try to find what changed in between, which is faster. Otherwise this could get deeper into the HW specifics, like internal NIC register values, for example.

fbl

Comment 17 Flavio Leitner 2019-03-08 20:47:56 UTC
Andreas,

> On the vhu, we see that the ARP requests from 67 make it to the vhu on 57. We see that The instance on 57 sends out ICMP requests
> towards 67 (which don't make it to dpdkbond0). We see that ARP replies are sent from the instance on 57 to the instance on 67
> (which don't make it to dpdkbond0):

How do you know that? Because if I recall correctly, the dpdkbond0 doesn't have an OFPort Number, which means the datapath will resolve a LAG directly.

I suspect that the environment has one lag down but the traffic is sent anyway and gets lost there. After a reboot, this lag is either correctly set as down or it becomes up, or the traffic goes to the other working lag.

I am going to download the sosreports to verify that.

Comment 18 Flavio Leitner 2019-03-08 20:59:56 UTC
This is from the sosreport before reboot. The hash indicates the correct device. Maybe we could check with ovs-tcpdump if the packet is going out on dpdk0 at least.

---- dpdkbond0 ----                                                                                                                
bond_mode: balance-tcp    
bond may use recirculation: yes, Recirc-ID : 1    
bond-hash-basis: 0    
updelay: 200 ms    
downdelay: 100 ms    
next rebalance: 2452 ms    
lacp_status: negotiated    
lacp_fallback_ab: false    
active slave mac: 14:02:ec:92:3e:f0(dpdk0)    
     
slave dpdk0: enabled    
    active slave    
    may_enable: true    
    hash 133: 1 kB load    
    hash 192: 1 kB load    
     
slave dpdk1: disabled    
    may_enable: false    

Another thing is that the userspace failed to cache the remote peer:
$ grep 192.168.3.176 before-restart/sos_commands/openvswitch/ovs-appctl_tnl.arp.show 
$ grep 192.168.3.176 after-reboot/sos_commands/openvswitch/ovs-appctl_tnl.arp.show 
192.168.3.176                                 48:df:37:1c:0e:c0   br-dpdk

No remote ARP cached in OVS.
Can you please confirm that it reproduces only when the TNL ARP cache doesn't include the remote IP address and that after a restart the ARP is properly cached?

Thanks,
fbl

Comment 19 Andreas Karis 2019-03-12 17:32:37 UTC
My message to the customer:


Can we isolate this further and remove the dpdkbond0, and instead make this a single port deployment without bond?
~~~
ovs-vsctl del-port dpdkbond0
mv /etc/sysconfig/network-scripts/ifcfg-dpdkbond0 /root
cat <<'EOF'>/etc/sysconfig/network-scripts/ifcfg-dpdk0
# This file is autogenerated by os-net-config
DEVICE=dpdk0
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSDPDKPort
OVS_BRIDGE=br-dpdk
MTU=9100
OVS_EXTRA="set Interface $DEVICE options:dpdk-devargs=0000:04:00.0 -- set Interface $DEVICE mtu_request=$MTU"
EOF
chattr +i /etc/sysconfig/network-scripts/ifcfg-dpdk0
ifup dpdk0
~~~

Run `ovs-vsctl show` to verify and make sure that instances can ping each other via VLAN and VXLAN.

Then, reboot the hypervisor. I didn't test this so I hope that os-net-config doesn't create any issues with this manual override. Check that instances can ping each other via VLAN, and verify if you can reproduce the VXLAN issue or not.

Comment 20 Andreas Karis 2019-03-12 17:35:23 UTC
I also asked him to:

Another thing is that the userspace failed to cache the remote peer:
$ grep 192.168.3.176 before-restart/sos_commands/openvswitch/ovs-appctl_tnl.arp.show 
$ grep 192.168.3.176 after-reboot/sos_commands/openvswitch/ovs-appctl_tnl.arp.show 
192.168.3.176                                 48:df:37:1c:0e:c0   br-dpdk

No remote ARP cached in OVS.
Can you please confirm that it reproduces only when the TNL ARP cache doesn't include the remote IP address and that after a restart the ARP is properly cached?

http://www.openvswitch.org/support/dist-docs-2.5/README-native-tunneling.md.txt
~~~
ovs-appctl tnl/arp/show
~~~
You can run the above command and grep for 192.168.3.176 - the current assumption is that in working scenarios, the entry is there, and in non-working scenarios, the entry is not there.

Also, in non-working scenarios, please ping 192.168.3.176 and then rerun `ovs-appctl tnl/arp/show` and make sure that the ARP entry is now there and see if the instances can now ping to each other via the VXLAN tunnel.
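
As a rough sketch of the check requested above, the following C helper scans captured `ovs-appctl tnl/arp/show` output (or the matching sosreport file) for an exact IP in the first column. The function name `tnl_arp_has_ip` is hypothetical and the helper does not call `ovs-appctl` itself; it only assumes the one-entry-per-line layout shown in the grep output above (IP, MAC, bridge).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if 'output' (text captured from
 * "ovs-appctl tnl/arp/show") has a line whose first whitespace-delimited
 * field is exactly 'ip'. */
static bool
tnl_arp_has_ip(const char *output, const char *ip)
{
    size_t ip_len = strlen(ip);
    const char *line = output;

    if (!ip_len) {
        return false;
    }

    while (*line) {
        /* Length of the first field on this line. */
        size_t field_len = strcspn(line, " \t\n");

        if (field_len == ip_len && !strncmp(line, ip, ip_len)) {
            return true;
        }

        /* Advance to the start of the next line, if any. */
        line += strcspn(line, "\n");
        if (*line == '\n') {
            line++;
        }
    }
    return false;
}
```

Unlike a plain `grep 192.168.3.176`, the exact first-field comparison does not match longer addresses that merely contain the search string, which may matter when several tunnel endpoints share a prefix.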

Comment 21 Andreas Karis 2019-03-12 17:38:26 UTC
> On the vhu, we see that the ARP requests from 67 make it to the vhu on 57. We see that The instance on 57 sends out ICMP requests
> towards 67 (which don't make it to dpdkbond0). We see that ARP replies are sent from the instance on 57 to the instance on 67
> (which don't make it to dpdkbond0):

I ran ovs-tcpdump on the vhu and on dpdkbond0.

So 67 -> 57: no problem
57 -> 67: ovs-tcpdump shows packets on the vhu interface. ovs-tcpdump does not show packets on dpdkbond0.

On dpdkbond0, we can see that these packets come in from compute67:
~~~
[root@srbhoncihv57 ~]#
[root@srbhoncihv57 ~]#  ovs-tcpdump -nne -i "dpdkbond0" -l | egrep '192.168.0.12|192.168.3.176'
(...)
16:09:38.580235 48:df:37:1c:0e:c0 > 14:02:ec:92:3e:f0, ethertype 802.1Q (0x8100), length 96: vlan 666, p 0, ethertype IPv4, 192.168.3.176.47821 > 192.168.3.182.4789: VXLAN, flags [I] (0x08), vni 159
fa:16:3e:67:d6:8f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.0.6 tell 192.168.0.12, length 28
(...)
~~~
But we don't see anything leave from compute57 to compute67 across the VXLAN tunnel (according to the above packet capture on dpdkbond0).

- Andreas

Comment 113 Flavio Leitner 2019-08-10 14:05:13 UTC
Following the core:


6298         switch (a->type) {
6299         case OFPACT_OUTPUT:
6300             xlate_output_action(ctx, ofpact_get_OUTPUT(a)->port,
6301                                 ofpact_get_OUTPUT(a)->max_len, true, last,
6302                                 false);
6303             break;

(gdb) p  *(struct ofpact_output *)ctx->xin->ofpacts
$3 = {ofpact = {type = OFPACT_OUTPUT, raw = 255 '\377', len = 12}, port = 65529, max_len = 0}

(gdb) p  ((struct ofpact_output *)ctx->xin->ofpacts)->port
$4 = 65529
(gdb) p  ((struct ofpact_output *)ctx->xin->ofpacts)->max_len
$5 = 0
(gdb) 


    xlate_output_action(ctx = 0x7f54f1818af0
					    ofpact_get_OUTPUT(a)->port = 65529
                        ofpact_get_OUTPUT(a)->max_len = 0
					    true,
                        last = true
                        false);


4912     switch (port) {  port=0xfff9/OFPP_TABLE
[...]
4917     case OFPP_TABLE:
4918         xlate_table_action(ctx, ctx->xin->flow.in_port.ofp_port,
4919                            0, may_packet_in, true, false, false,
4920                            do_xlate_actions);

         xlate_table_action(ctx = 0x7f54f1818af0,
							ctx->xin->flow.in_port.ofp_port = 65534/0xfffe,
                            table_id = 0,
							may_packet_in,
							honor_table_miss=true,
						    with_ct_orig=false,
							is_last_action=false,
                            do_xlate_actions);


4128     if (ctx->was_mpls) { -> false
[...]
4132     if (xlate_resubmit_resource_check(ctx)) {


4055 static bool
4056 xlate_resubmit_resource_check(struct xlate_ctx *ctx)
4057 {
4058     if (ctx->depth >= MAX_DEPTH) {  ctx->depth = 1, false
4059         xlate_report_error(ctx, "over max translation depth %d", MAX_DEPTH);
4060         ctx->error = XLATE_RECURSION_TOO_DEEP;
4061     } else if (ctx->resubmits >= MAX_RESUBMITS) {  ctx->resubmits=6, false
4062         xlate_report_error(ctx, "over %d resubmit actions", MAX_RESUBMITS);
4063         ctx->error = XLATE_TOO_MANY_RESUBMITS;
4064     } else if (ctx->odp_actions->size > UINT16_MAX) { ctx->odp_actions->size = 0, false
4065         xlate_report_error(ctx, "resubmits yielded over 64 kB of actions");
4066         /* NOT an error, as we'll be slow-pathing the flow in this case? */
4067         ctx->exit = true; /* XXX: translation still terminated! */
4068     } else if (ctx->stack.size >= 65536) { ctx->stack.size = 0, false
4069         xlate_report_error(ctx, "resubmits yielded over 64 kB of stack");
4070         ctx->error = XLATE_STACK_TOO_DEEP;
4071     } else {
4072         return true;  <---- return point
4073     }
4074 
4075     return false;
4076 }

4133         uint8_t old_table_id = ctx->table_id;  <---- 0
4134         struct rule_dpif *rule;
4135 
4136         ctx->table_id = table_id; <---- 0
4137 
4138         /* Swap packet fields with CT 5-tuple if requested. */
4139         if (with_ct_orig) {  <--- false
[...]
4149         rule = rule_dpif_lookup_from_table(ctx->xbridge->ofproto,
4150                                            ctx->xin->tables_version,
4151                                            &ctx->xin->flow, ctx->wc,
4152                                            ctx->xin->resubmit_stats,
4153                                            &ctx->table_id, in_port,
4154                                            may_packet_in, honor_table_miss,
4155                                            ctx->xin->xcache);


4149         rule = rule_dpif_lookup_from_table(ctx->xbridge->ofproto = 0x55bb1ce743d0
4150                                            ctx->xin->tables_version = 105/0x69
4151                                            &ctx->xin->flow = 0x7f54f1819460
												ctx->wc = 0x7f54f1818050
4152                                            ctx->xin->resubmit_stats = 
4153                                            ctx->table_id = 0
												in_port = 65534/0xfffe
4154                                            may_packet_in = 
												honor_table_miss = true,
4155                                            ctx->xin->xcache = NULL
												);

4184 struct rule_dpif *
4185 rule_dpif_lookup_from_table(struct ofproto_dpif *ofproto = 0x55bb1ce743d0
4186                             ovs_version_t version= 105/0x69,
								 struct flow *flow = 0x7f54f1819460,
4187                             struct flow_wildcards *wc = 0x7f54f1818050,
4188                             const struct dpif_flow_stats *stats = ???
4189                             uint8_t *table_id = 0
								 ofp_port_t in_port 65534/0xfffe/OFPP_LOCAL
4190                             bool may_packet_in
							     bool honor_table_miss = true,
4191                             struct xlate_cache *xcache = NULL)


4201     if (flow->nw_frag & FLOW_NW_FRAG_ANY  -> flow->nw_frag=0, false
[...]
4232     flow->in_port.ofp_port = in_port; = 65534/0xfffe/OFPP_LOCAL

4240     for (next_id = *table_id;  = 0
4241          next_id < ofproto->up.n_tables;  next_id < 255
4242          next_id++, next_id += (next_id == TBL_INTERNAL/254/0xfe))
4243     {



4245         rule = rule_dpif_lookup_in_table(ofproto, version, next_id, flow, wc);
4245         rule = rule_dpif_lookup_in_table(0x55bb1ce743d0, 105/0x69, next_id, 0x7f54f1819460, 0x7f54f1818050);
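
The iteration order of the loop at lines 4240-4243 can be sketched in isolation. This is an illustration only: the constants are taken from the annotations in this trace (n_tables = 255, TBL_INTERNAL = 254), the function name `count_visited_tables` is hypothetical, and the real loop stops as soon as a rule is found instead of walking every table.

```c
#define N_TABLES     255   /* ofproto->up.n_tables, as annotated in the trace */
#define TBL_INTERNAL 254   /* as annotated in the trace */

/* Hypothetical helper: walks table ids the same way the lookup loop does
 * and returns how many ids were visited, storing the last one in '*last'. */
static unsigned int
count_visited_tables(unsigned int first, unsigned int *last)
{
    unsigned int visited = 0;
    unsigned int next_id;

    for (next_id = first;
         next_id < N_TABLES;
         next_id++, next_id += (next_id == TBL_INTERNAL)) {
        *last = next_id;
        visited++;
    }
    return visited;
}
```

Starting from table 0, the walk covers ids 0 through 253 and steps over the internal table, because the extra increment fires exactly when the counter reaches 254.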


static struct rule_dpif *
4135 rule_dpif_lookup_in_table(struct ofproto_dpif *ofproto 0x55bb1ce743d0,
								, ovs_version_t version 105/0x69,
4136                           uint8_t table_id,
							   struct flow *flow 0x7f54f1819460,,
4137                           struct flow_wildcards *wc 0x7f54f1818050)


4139     struct classifier *cls = &ofproto->up.tables[table_id].cls;
 (gdb) p ((struct ofproto_dpif *)0x55bb1ce743d0)->up.tables[0].cls
$42 = {n_rules = 9, n_flow_segments = 3 '\003', flow_segments = ":?Q", subtables_map = {impl = {
      p = 0x55bb1cf7b940}}, subtables = {impl = {p = 0x55bb1d0c7790}, temp = 0x0}, partitions = {impl = {
      p = 0x0}}, tries = {{field = 0x55bb1b7c0ba0 <mf_fields+7616>, root = {p = 0x0}}, {
      field = 0x55bb1b7c0b68 <mf_fields+7560>, root = {p = 0x0}}, {field = 0x0, root = {p = 0x0}}}, 
  n_tries = 2, publish = true}
(gdb) p &((struct ofproto_dpif *)0x55bb1ce743d0)->up.tables[0].cls
$43 = (struct classifier *) 0x55bb1cf2a0d8

4139     struct classifier *cls = &ofproto->up.tables[table_id].cls = 0x55bb1cf2a0d8;

4140     return rule_dpif_cast(rule_from_cls_rule(classifier_lookup(cls, version,
4141                                                                flow, wc)));

1158 const struct cls_rule *
1159 classifier_lookup(const struct classifier *cls, ovs_version_t version,
1160                   struct flow *flow, struct flow_wildcards *wc)
1161 {
 
1162     return classifier_lookup__(cls = 0x55bb1cf2a0d8
									version = 105/0x69
                                    flow = 0x7f54f1819460
                                    wc = 0x7f54f1818050
									allow_conjunctive_matches = true);


 955     /* Initialize trie contexts for find_match_wc(). */
 956     for (int i = 0; i < cls->n_tries (2); i++) {
         trie_ctx[0]:
 844     ctx->trie = trie = 0x55bb1cf2a100;
 845     ctx->be32ofs = trie->field->flow_be32ofs = 127;
 846     ctx->lookup_done = false;
         trie_ctx[1]:
 844     ctx->trie = trie = 0x55bb1cf2a110; 
 845     ctx->be32ofs = trie->field->flow_be32ofs = 126;
 846     ctx->lookup_done = false;
 958     }


 966         /* Skip subtables with no match, or where the match is lower-priority
 967          * than some certain match we've already found. */
 968         match = find_match_wc(subtable, version, flow, trie_ctx, cls->n_tries,
 969                               wc);

1641 find_match_wc(const struct cls_subtable *subtable ???
                   ovs_version_t version = 105/0x69
1642               const struct flow *flow = 0x7f54f1819460
                   struct trie_ctx trie_ctx[CLS_MAX_TRIES] = local, see line 846 above 
1643               unsigned int n_tries = 2
                   struct flow_wildcards *wc 0x7f54f1818050  )

1645     if (OVS_UNLIKELY(!wc)) {  wc = 0x7f54f1818050, false
[...]
1650     uint32_t basis = 0, hash;
1651     const struct cls_match *rule = NULL;
1652     struct flowmap stages_map = FLOWMAP_EMPTY_INITIALIZER;
1653     unsigned int mask_offset = 0;
1654     int i;
1655 
1656     /* Try to finish early by checking fields in segments. */
1657     for (i = 0; i < subtable->n_indices; i++) {

Assuming last subtable at 0x55bb1cf42960
(gdb) p ((struct cls_subtable *)0x55bb1cf42960)->n_indices
$51 = 0 '\000'

Loop skipped, i=0

1676     /* Trie check for the final range. */
1677     if (check_tries(trie_ctx, n_tries, subtable->trie_plen,
1678                     subtable->index_maps[i], flow, wc)) {
1679         goto no_match;
1680     }

(gdb) p &((struct cls_subtable *)0x55bb1cf42960)->trie_plen
$54 = (unsigned int (*)[3]) 0x55bb1cf429c8

(gdb) p &((struct cls_subtable *)0x55bb1cf42960)->index_maps[0]
$53 = (const struct flowmap *) 0x55bb1cf42988

1534 check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES] local, see line 846 above 
				unsigned int n_tries = 2,
1535             field_plen[CLS_MAX_TRIES] =  0x55bb1cf429c8
1536             const struct flowmap range_map 0x55bb1cf42988
				 const struct flow *flow = 0x7f54f1819460
1537             struct flow_wildcards *wc =  0x7f54f1818050)

(gdb) p ((struct cls_subtable *)0x55bb1cf42960)->trie_plen
$56 = {0, 0, 0}

1544     for (j = 0; j < n_tries; j++) {   0, 1
1545         /* Is the trie field relevant for this subtable, and
1546            is the trie field within the current range of fields? */
1547         if (field_plen[j] && -> 0, 0,0  -> false
[...]
1584     return false;
1585 }


1681     hash = flow_hash_in_minimask_range(flow, &subtable->mask,
1682                                        subtable->index_maps[i],
1683                                        &mask_offset, &basis);

(gdb) p ((struct cls_subtable *)0x55bb1cf42960)->mask
$57 = {masks = {map = {bits = {0, 0}}}}

i=0
(gdb) p &((struct cls_subtable *)0x55bb1cf42960)->index_maps[0]
$59 = (const struct flowmap *) 0x55bb1cf42988


   hash = flow_hash_in_minimask_range(flow = 0x7f54f1819460,
									  &subtable->mask = 0x55bb1cf42a00,
                                      subtable->index_maps[i], i=0 -> 0x55bb1cf42988
                                      &mask_offset, ptr to 0 in the stack 
                                      &basis, ptr to 0 in the stack)


277 {
278     const uint64_t *mask_values = miniflow_get_values(&mask->masks);

(gdb) p &((struct minimask *)0x55bb1cf42a00)->masks
$64 = (struct miniflow *) 0x55bb1cf42a00

       const uint64_t *mask_values = miniflow_get_values(0x55bb1cf42a00);
        This is actually moving to the end of struct of size 16 bytes:
        0x55bb1cf42a10
        mask_values = 0x55bb1cf42a10

279     const uint64_t *flow_u64 = (const uint64_t *)flow;
        flow_u64 = 0x7f54f1819460

        offset = 0
280     const uint64_t *p = mask_values + *offset; <--- 0x55bb1cf42a10
        

281     uint32_t hash = *basis;  <--- 0
282     map_t map;
283 
284     FLOWMAP_FOR_EACH_MAP (map, range) {
285         size_t idx;
286 
287         MAP_FOR_EACH_INDEX (idx, map) { 
288             hash = hash_add64(hash, flow_u64[idx] & *p++);
289         }
290         flow_u64 += MAP_T_BITS;
291     }
292 
293     *basis = hash; /* Allow continuation from the unfinished value. */
294     *offset = p - mask_values;
295     return hash_finish(hash, *offset * 8);
296 }


Using stub.c, the hash = 0x0

1684     rule = find_match(subtable, version, flow, hash);

         find_match(subtable = 0x55bb1cf42960
                    version = 105/0x69
                    flow = 0x7f54f1819460
                    hash = 0     );

(gdb) set $subtable = ((struct cls_subtable *)0x55bb1cf42960)
(gdb) p $subtable
$362 = (struct cls_subtable *) 0x55bb1cf42960
(gdb) p $subtable->rules
$363 = {impl = {p = 0x55bb1cfa88c0}}

1624     CMAP_FOR_EACH_WITH_HASH (head, cmap_node, hash, &subtable->rules) {
         
140 #define CMAP_FOR_EACH_WITH_HASH(NODE, MEMBER, HASH, CMAP)   \
141     CMAP_NODE_FOR_EACH(NODE, MEMBER, cmap_find(CMAP, HASH))

const struct cmap_node *
cmap_find(const struct cmap *cmap, uint32_t hash)
{
    const struct cmap_impl *impl = cmap_get_impl(cmap); -> 0x55bb1cfa88c0
(gdb) p/x ((struct cmap_impl *)0x55bb1cfa88c0)->basis
$378 = 0xc9f6b223

    uint32_t h1 = rehash(impl, hash); -> stub, 0x6ced0b68
    uint32_t h2 = other_hash(h1); -> stub, 0xb686ced

    return cmap_find__(&impl->buckets[h1 & impl->mask],
                       &impl->buckets[h2 & impl->mask],
                       hash);

    return cmap_find__(&impl->buckets[0x0] = 0x55bb1cfa8900,
                       &impl->buckets[0x1] = 0x55bb1cfa8940,
                       hash = 0x0);

    return cmap_find__(&impl->buckets[0x0],
                       &impl->buckets[0x1],
                       hash);
}

static inline const struct cmap_node *
cmap_find__(const struct cmap_bucket *b1, const struct cmap_bucket *b2,
            uint32_t hash)
{   


b1 = 
(gdb) p $impl->buckets[0]
$386 = {{{counter = 0, hashes = {0, 0, 0, 0, 0}, nodes = {{next = {p = 0x55bb1cf9ee68}}, {next = {
            p = 0x0}}, {next = {p = 0x0}}, {next = {p = 0x0}}, {next = {p = 0x0}}}}, 
    pad0 = '\000' <repeats 24 times>, "h\356\371\034\273U", '\000' <repeats 33 times>}}

b2 =
(gdb) p $impl->buckets[1]
$387 = {{{counter = 0, hashes = {0, 0, 0, 0, 0}, nodes = {{next = {p = 0x0}}, {next = {p = 0x0}}, {
          next = {p = 0x0}}, {next = {p = 0x0}}, {next = {p = 0x0}}}}, pad0 = '\000' <repeats 63 times>}}


cmap_find -> 0x55bb1cf9ee68

(gdb) p/x 0x55bb1cf9ee68 - 0x18
$395 = 0x55bb1cf9ee50
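
The subtraction of 0x18 above is the usual "container of" step: the cmap stores pointers to the embedded cmap_node, and the enclosing cls_match starts a fixed offset before it. A minimal sketch, assuming an LP64 build and using stand-in structs (`match_sketch`, `node_sketch`, and `CONTAINER_OF` are hypothetical names that only mirror the first four fields seen in the gdb dump of struct cls_match):

```c
#include <stddef.h>

/* Stand-in for struct cmap_node: a single pointer. */
struct node_sketch {
    void *next;
};

/* Stand-in mirroring the leading fields of struct cls_match as printed by
 * gdb: next, conj_set, priority, cmap_node. */
struct match_sketch {
    void *next;
    void *conj_set;
    int priority;
    struct node_sketch cmap_node;   /* offset 0x18 on LP64 */
};

/* Recover the enclosing object from a pointer to one of its members. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

static struct match_sketch *
match_from_node(struct node_sketch *node)
{
    return CONTAINER_OF(node, struct match_sketch, cmap_node);
}
```

On a 64-bit build the two pointers plus the padded int place cmap_node at byte 24 (0x18), which is the same offset applied by hand in gdb.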

(gdb) p *((struct cls_match *)0x55bb1cf9ee50)

(gdb) p/x *((struct cls_match *)0x55bb1cf9ee50)
$718 = {next = {p = 0x0}, conj_set = {p = 0x0}, priority = 0x0, cmap_node = {next = {p = 0x0}}, 
  versions = {add_version = 0x51d, remove_version = 0xffffffffffffffff}, cls_rule = 0x55bb1cf6e288, 
  flow = {map = {bits = {0x0, 0x0}}}}

(gdb) p &((struct cls_match *)0x55bb1cf9ee50)->cmap_node
$399 = (struct cmap_node *) 0x55bb1cfa8918  <--- matches, so the cls_match pointer is correct.


1625         if (OVS_LIKELY(miniflow_and_mask_matches_flow(&head->flow,
1626                                                       &subtable->mask,
1627                                                       flow))) {


        miniflow_and_mask_matches_flow(&head->flow = 0x55bb1cf9ee88
                                      &subtable->mask = 0x55bb1cf42a00
                                      flow = 0x7f54f1819460,


1596 miniflow_and_mask_matches_flow(const struct miniflow *flow = 0x55bb1cf9ee88,
1597                                const struct minimask *mask 0x55bb1cf42a00,
1598                                const struct flow *target = 0x7f54f1819460)
1599 {
1600     const uint64_t *flowp = miniflow_get_values(flow);          0x55bb1cf9ee98
1601     const uint64_t *maskp = miniflow_get_values(&mask->masks);  0x55bb1cf42a10
1602     const uint64_t *target_u64 = (const uint64_t *)target;      0x7f54f1819460
1603     map_t map; 


(gdb) set $mask = ((struct minimask *)0x55bb1cf42a00)
(gdb) p $mask->masks.map
$670 = {bits = {0, 0}}

1628             /* Return highest priority rule that is visible. */
1629             CLS_MATCH_FOR_EACH (rule, head) {
1630                 if (OVS_LIKELY(cls_match_visible_in_version(rule, version))) {
1631                     return rule;
1632                 }
1633             }

rule = $head = 0x55bb1cf9ee50

version = 105/0x69

cls_match_visible_in_version(0x55bb1cf9ee50, 105/0x69))) {

return versions_visible_in_version(&rule->versions, version);
return versions_visible_in_version(&0x55bb1cf9ee70, 105/0x69);

(gdb) p/x $head->versions
$839 = {add_version = 0x51d, remove_version = 0xffffffffffffffff}
static inline bool
versions_visible_in_version(const struct versions *versions,
                            ovs_version_t version)
{
    ovs_version_t remove_version;

    /* C11 does not want to access an atomic via a const object pointer. */
    atomic_read_relaxed(&CONST_CAST(struct versions *,
                                    versions)->remove_version,
                        &remove_version);

    return versions->add_version <= version && version < remove_version;
    return  0x51d <= 105/0x69 && 105/0x69 < 0xffffffffffffffff
    return  FALSE
}

1629             CLS_MATCH_FOR_EACH (rule, head) {
1630                 if (OVS_LIKELY(cls_match_visible_in_version(rule, version))) {

(gdb) p/x $head->next->p
$860 = 0x0

loop breaks, returning NULL
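
The decisive comparison can be reproduced outside the core with the values printed above. This is a simplified sketch: `rule_versions`, `visible_in_version`, and `core_rule_visible` are hypothetical stand-ins, the version type is assumed to be a 64-bit unsigned integer (consistent with the remove_version value in the dump), and the atomic read of remove_version is replaced by a plain load.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the add/remove version pair kept per rule. */
struct rule_versions {
    uint64_t add_version;
    uint64_t remove_version;
};

/* Same predicate as versions_visible_in_version(), minus the atomics. */
static bool
visible_in_version(const struct rule_versions *v, uint64_t version)
{
    return v->add_version <= version && version < v->remove_version;
}

/* Values from the core: rule added in version 0x51d, never removed,
 * looked up with tables_version 0x69. */
static bool
core_rule_visible(void)
{
    const struct rule_versions v = { 0x51d, UINT64_MAX };

    return visible_in_version(&v, 0x69);
}
```

With these inputs the first half of the condition fails (0x51d is 1309, 0x69 is 105), so the only candidate rule is treated as not yet added for a lookup made with version 0x69, and the lookup comes back empty. The same rule is visible to any lookup using version 0x51d or later.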

Comment 116 Flavio Leitner 2019-08-13 16:53:01 UTC
Patch proposed upstream:
https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361633.html

Comment 123 OvS team 2019-09-06 00:51:28 UTC
* Tue Sep 03 2019 Flavio Leitner <fbl> - 2.11.0-22
- tnl-neigh: Use outgoing ofproto version (#1685642)

Comment 125 OvS team 2019-09-07 02:56:39 UTC
* Thu Sep 05 2019 Flavio Leitner <fbl> - 2.9.0-119
- tnl-neigh: Use outgoing ofproto version (#1685642)

Comment 136 Flavio Leitner 2019-10-06 00:56:13 UTC
hotfix Bug 1758824
https://bugzilla.redhat.com/show_bug.cgi?id=1758824

Comment 151 errata-xmlrpc 2019-11-06 04:18:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3717

