Bug 2133457 - [16.2][OVS] [IPv6] icmpv6 is unreachable for short time after reboot overcloud [NEEDINFO]
Summary: [16.2][OVS] [IPv6] icmpv6 is unreachable for short time after reboot overcloud
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z10
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Miro Tomaska
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On: 2130394
Blocks:
Reported: 2022-10-10 14:03 UTC by Fiorella Yanac
Modified: 2023-08-14 09:06 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:
echaudro: needinfo-
mtomaska: needinfo? (mpattric)
echaudro: needinfo? (mpattric)


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OSP-19272 | 0 | None | None | None | 2022-10-10 14:09:32 UTC

Comment 2 Mike Pattrick 2022-11-14 21:37:20 UTC
@echaudro This appears to be similar to bz2130394. The kernel ends up with an incorrect flow entry, and using dpctl to delete it allows the test to pass. Also noticed that the duplicate_upcall counter increases.

Comment 3 Eelco Chaudron 2022-11-15 08:44:32 UTC
(In reply to Michael Pattrick from comment #2)
> @echaudro This appears to be similar to bz2130394. The kernel
> ends up with an incorrect flow entry, and using dpctl to delete it allows
> the test to pass. Also noticed that the duplicate_upcall counter increases.

This is odd, as this would only happen if there is already traffic when the system reboots (restarts) and that traffic hits OVS right between adding the bridge and configuring the controller. But I guess it is still possible.

Mike, can you try the patched kernel to confirm this is the same issue?

Comment 4 Mike Pattrick 2022-11-15 16:59:54 UTC
We've run the test again with the patched kernel, and the issue remains.

Some additional information: when running tcpdump on interface qvo430839ba-85, we see this packet:

> ethertype IPv6 (0x86dd), length 86: 2001:db8::f816:3eff:fead:c7ee > 2001:db8::f816:3eff:fe29:2e85: ICMP6, neighbor advertisement, tgt is 2001:db8::f816:3eff:fead:c7ee, length 32

We have the following flows installed:

> duration=485.560s, table=0, n_packets=404, n_bytes=34232, priority=10,icmp6,in_port="qvo430839ba-85",icmp_type=136 actions=resubmit(,24)
> duration=485.562s, table=24, n_packets=0, n_bytes=0, priority=2,icmp6,in_port="qvo430839ba-85",icmp_type=136,nd_target=2001:db8::f816:3eff:fead:c7ee actions=resubmit(,60)
> duration=910.457s, table=24, n_packets=287, n_bytes=24298, idle_age=0, priority=0 actions=drop

The second entry should match, but we see zero packets on that one. We have the following flow installed in the kernel:

> recirc_id(0),skb_priority(0),in_port(qvo430839ba-85),eth(),eth_type(0x86dd),ipv6(proto=58,frag=no),icmpv6(type=136), packets:385, bytes:32630, used:0.280s, actions:drop

So the kernel has a flow installed for the drop, even though the resubmit flow has a higher priority and is more specific.
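To make the mismatch concrete, here is a minimal Python sketch of a priority-based lookup over the three OpenFlow entries quoted above. It is a simplified model and not OVS's actual classifier: the names `Flow`, `lookup`, `FLOWS`, `NA_PACKET` and `datapath_hits` are hypothetical, and only the fields relevant to this report are modelled. Under these assumptions the neighbor advertisement seen in tcpdump should be resubmitted to table 60, while the datapath entry, which as printed carries no ND target, also catches it and drops it.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    table: int
    priority: int
    match: dict   # field name -> required value; missing fields are wildcarded
    action: str


def lookup(flows, table, packet):
    """Return the highest-priority flow in `table` whose match fields all equal the packet's."""
    candidates = [
        f for f in flows
        if f.table == table
        and all(packet.get(k) == v for k, v in f.match.items())
    ]
    return max(candidates, key=lambda f: f.priority, default=None)


PORT = "qvo430839ba-85"
TARGET = "2001:db8::f816:3eff:fead:c7ee"

# The three OpenFlow entries from this comment (counters omitted).
FLOWS = [
    Flow(0, 10, {"in_port": PORT, "icmp_type": 136}, "resubmit(,24)"),
    Flow(24, 2, {"in_port": PORT, "icmp_type": 136, "nd_target": TARGET},
         "resubmit(,60)"),
    Flow(24, 0, {}, "drop"),
]

# The neighbor advertisement captured with tcpdump.
NA_PACKET = {"in_port": PORT, "icmp_type": 136, "nd_target": TARGET}

# Match of the datapath entry as printed: ICMPv6 type 136 on the port,
# with no ND target field, so the target is effectively wildcarded.
DATAPATH_MATCH = {"in_port": PORT, "icmp_type": 136}


def datapath_hits(packet):
    """True if the packet would hit the cached datapath drop entry."""
    return all(packet.get(k) == v for k, v in DATAPATH_MATCH.items())


if __name__ == "__main__":
    print("table 0 :", lookup(FLOWS, 0, NA_PACKET).action)
    print("table 24:", lookup(FLOWS, 24, NA_PACKET).action)
    print("hits datapath drop entry:", datapath_hits(NA_PACKET))
```

In this model the OpenFlow tables and the datapath entry disagree for the captured packet, which matches the observation that the priority=2 entry shows zero packets.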

As noted above, when the offending flow is cleared from the kernel, this test passes immediately.
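For anyone reproducing this, the following Python sketch shows one way to pick such an entry out of saved `ovs-appctl dpctl/dump-flows` output and build the matching `ovs-appctl dpctl/del-flow` command line. The helper names are hypothetical, the parsing assumes the line layout shown in the kernel flow above, and whether the port name is accepted in the match depends on the OVS version, so treat it as a starting point rather than a tested procedure. The script only prints the command; it does not run it.

```python
import shlex


def split_datapath_flow(line):
    """Split one dump-flows line into (match, actions)."""
    head, _, actions = line.rpartition(" actions:")
    # Statistics start at ", packets:"; everything before that is the match.
    match = head.split(", packets:", 1)[0].strip()
    return match, actions.strip()


def suspicious_drops(lines, port, icmp_type=136):
    """Yield matches of datapath entries that drop the given ICMPv6 type on `port`."""
    for line in lines:
        match, actions = split_datapath_flow(line)
        if (actions == "drop"
                and f"in_port({port})" in match
                and f"icmpv6(type={icmp_type})" in match):
            yield match


def del_flow_command(match):
    """Build (but do not run) a shell command that deletes the datapath entry."""
    return "ovs-appctl dpctl/del-flow " + shlex.quote(match)


# The datapath entry quoted in this comment.
SAMPLE = [
    "recirc_id(0),skb_priority(0),in_port(qvo430839ba-85),eth(),"
    "eth_type(0x86dd),ipv6(proto=58,frag=no),icmpv6(type=136), "
    "packets:385, bytes:32630, used:0.280s, actions:drop",
]

if __name__ == "__main__":
    for match in suspicious_drops(SAMPLE, "qvo430839ba-85"):
        print(del_flow_command(match))
```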

Comment 5 Toni Freger 2023-03-07 08:31:06 UTC
(In reply to Eelco Chaudron from comment #3)
> (In reply to Michael Pattrick from comment #2)
> > @echaudro This appears to be similar to bz2130394. The kernel
> > ends up with an incorrect flow entry, and using dpctl to delete it allows
> > the test to pass. Also noticed that the duplicate_upcall counter increases.
> 
> This is odd, as this would only happen if there is already traffic when the
> system reboots (restarts) and that traffic hits OVS right between adding the
> bridge and configuring the controller. But I guess it is still possible.
> 
> Mike can you try the patched kernel to confirm this is the same issue?

@fyanac, @eolivare: can you please check whether this type of coverage is missing in tobiko? If so, please review and decide whether the neutron team should add it to the backlog for automation coverage. Thanks!

Comment 14 Eran Kuris 2023-06-20 09:22:59 UTC
Updating the flags and removing the blocks, as the failures are not persistent.

