Bug 2069668 - [PerfCI][ovn][16.2] dynamic-workload job fails randomly while pinging test vms
Summary: [PerfCI][ovn][16.2] dynamic-workload job fails randomly while pinging test vms
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 16.2 (Train)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: z3
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Arnau Verdaguer
QA Contact: Bharath M V
URL:
Whiteboard:
Depends On: 2069714 2069783
Blocks:
Reported: 2022-03-29 12:40 UTC by Yatin Karel
Modified: 2024-03-28 13:40 UTC
CC: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Cloned to: 2069711 2069714
Environment:
Last Closed: 2024-03-28 13:40:58 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-14365 0 None None None 2022-03-29 13:57:30 UTC

Description Yatin Karel 2022-03-29 12:40:07 UTC
Description of problem:
The browbeat scenario in the DFG-perfscale-PerfCI-OSP16.2-dynamic-workloads-ovn job has been failing consistently for some time.

Version-Release number of selected component (if applicable):
ovn-2021-21.12.0-11.el8fdp.x86_64

How reproducible:
Always, in https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/perfscale/view/PerfCI/job/DFG-perfscale-PerfCI-OSP16.2-dynamic-workloads-ovn/, at least since March 1st; before that there was a different failure. The last successful run of the job was on January 30th, 2022.

This job runs 5 iterations of browbeat's dynamic_workload_min [1] scenario. In one iteration a ping to a VM fails and that iteration aborts; the next iteration continues, some VMs pass, and then another ping fails.
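For illustration, a minimal Python sketch (not the actual browbeat plugin code) of the kind of FIP ping check the scenario performs; the address, retry count, and timeouts below are made-up values:

import subprocess
import time

def wait_for_ping(fip, retries=30, interval=2):
    """Return True once the floating IP answers a single ICMP echo."""
    for _ in range(retries):
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "2", fip],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode == 0:
            return True
        time.sleep(interval)
    return False

if __name__ == "__main__":
    # 203.0.113.10 is a hypothetical floating IP, not one from this bug
    if not wait_for_ping("203.0.113.10"):
        raise RuntimeError("VM did not answer pings; the iteration would abort here")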

The equivalent job passes on 16.1, so this appears to be a regression in 16.2.


Steps to Reproduce:
1. can be reproduced with https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/perfscale/view/PerfCI/job/DFG-perfscale-PerfCI-custom/65/parameters/
2.
3.

Actual results:
Ping to a VM fails randomly.


Expected results:
All scenarios should succeed.


Additional info:


[1] https://opendev.org/x/browbeat/src/branch/master/rally/rally-plugins/dynamic-workloads/dynamic_workload_min.py#L70-L73

Comment 1 Daniel Alvarez Sanchez 2022-03-29 12:52:51 UTC

*** This bug has been marked as a duplicate of bug 2066413 ***

Comment 2 Daniel Alvarez Sanchez 2022-03-29 13:03:05 UTC
What we saw is that when pinging a FIP (non-DVR) from an external destination, the ICMP reply packets coming out of the VM were dropped in the integration bridge of the compute node.
Triggering a recompute on ovn-controller fixed the issue.
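
For reference, a minimal Python sketch of the workaround described above, assuming direct access to the compute node's ovn-controller and OVS control sockets (on OSP 16.2 these services run containerized, so the exact invocation may differ):

import subprocess

def recompute_ovn_controller():
    # Ask the local ovn-controller to recompute and reinstall its flows;
    # this is the workaround that restored connectivity here.
    subprocess.run(["ovn-appctl", "-t", "ovn-controller", "recompute"], check=True)

def dump_br_int_flows():
    # Dump the OpenFlow tables of the integration bridge so the drop
    # can be located before/after the recompute.
    result = subprocess.run(
        ["ovs-ofctl", "dump-flows", "br-int"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout

if __name__ == "__main__":
    flows_before = dump_br_int_flows()
    recompute_ovn_controller()
    flows_after = dump_br_int_flows()
    print("flow lines before/after recompute: %d / %d"
          % (len(flows_before.splitlines()), len(flows_after.splitlines())))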

Yatin, could you please upload the OVN databases to the BZ so that the core OVN team can try to reproduce it?

Comment 9 Ales Musil 2022-04-21 13:19:56 UTC

*** This bug has been marked as a duplicate of bug 2069783 ***

Comment 13 Elvira 2022-09-07 13:55:01 UTC
ovn-2021-21.12.0-82.el8fdp.x86_64 is now available in puddle RHOS-16.2-RHEL-8-20220902.n.1. Moving to MODIFIED.

