Bug 2069668 - [PerfCI][ovn][16.2] dynamic-workload job fails randomly while pinging test vms
Summary: [PerfCI][ovn][16.2] dynamic-workload job fails randomly while pinging test vms
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 16.2 (Train)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: z3
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Arnau Verdaguer
QA Contact: Bharath M V
URL:
Whiteboard:
Depends On: 2069714 2069783
Blocks:
Reported: 2022-03-29 12:40 UTC by Yatin Karel
Modified: 2024-03-28 13:40 UTC
CC: 8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Cloned to: 2069711 2069714
Environment:
Last Closed: 2024-03-28 13:40:58 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-14365 0 None None None 2022-03-29 13:57:30 UTC

Description Yatin Karel 2022-03-29 12:40:07 UTC
Description of problem:
The browbeat scenario in the DFG-perfscale-PerfCI-OSP16.2-dynamic-workloads-ovn job has been failing consistently for some time.

Version-Release number of selected component (if applicable):
ovn-2021-21.12.0-11.el8fdp.x86_64

How reproducible:
Always, in https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/perfscale/view/PerfCI/job/DFG-perfscale-PerfCI-OSP16.2-dynamic-workloads-ovn/, at least since March 1st; before that there was a different failure. The last successful run of the job was on January 30th, 2022.

This job runs 5 iterations of browbeat's dynamic_workload_min [1] scenario. In one iteration a ping to a VM fails and that iteration aborts; the next iteration continues, some VMs pass, and then another ping fails.
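For illustration, a minimal Python sketch (not the actual browbeat plugin code) of the kind of FIP ping check the scenario performs; the address, retry count, and timeouts below are made-up values:

import subprocess
import time

def wait_for_ping(fip, retries=30, interval=2):
    """Return True once the floating IP answers a single ICMP echo."""
    for _ in range(retries):
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "2", fip],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode == 0:
            return True
        time.sleep(interval)
    return False

if __name__ == "__main__":
    # 203.0.113.10 is a hypothetical floating IP, not one from this bug
    if not wait_for_ping("203.0.113.10"):
        raise RuntimeError("VM did not answer pings; the iteration would abort here")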

The equivalent job passes on 16.1, so this appears to be a regression in 16.2.


Steps to Reproduce:
1. can be reproduced with https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/perfscale/view/PerfCI/job/DFG-perfscale-PerfCI-custom/65/parameters/
2.
3.

Actual results:
Ping to a VM fails randomly.


Expected results:
All scenarios should succeed.


Additional info:


[1] https://opendev.org/x/browbeat/src/branch/master/rally/rally-plugins/dynamic-workloads/dynamic_workload_min.py#L70-L73

Comment 1 Daniel Alvarez Sanchez 2022-03-29 12:52:51 UTC

*** This bug has been marked as a duplicate of bug 2066413 ***

Comment 2 Daniel Alvarez Sanchez 2022-03-29 13:03:05 UTC
What we saw is that when pinging a FIP (non-DVR) from an external destination, the ICMP reply packets coming out of the VM were dropped in the integration bridge of the compute node.
Triggering a recompute on ovn-controller fixed the issue.
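
For reference, a minimal Python sketch of the workaround described above, assuming direct access to the compute node's ovn-controller and OVS control sockets (on OSP 16.2 these services run containerized, so the exact invocation may differ):

import subprocess

def recompute_ovn_controller():
    # Ask the local ovn-controller to recompute and reinstall its flows;
    # this is the workaround that restored connectivity here.
    subprocess.run(["ovn-appctl", "-t", "ovn-controller", "recompute"], check=True)

def dump_br_int_flows():
    # Dump the OpenFlow tables of the integration bridge so the drop
    # can be located before/after the recompute.
    result = subprocess.run(
        ["ovs-ofctl", "dump-flows", "br-int"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout

if __name__ == "__main__":
    flows_before = dump_br_int_flows()
    recompute_ovn_controller()
    flows_after = dump_br_int_flows()
    print("flow lines before/after recompute: %d / %d"
          % (len(flows_before.splitlines()), len(flows_after.splitlines())))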

Yatin, could you please upload the OVN databases to the BZ so that the core OVN team can try to reproduce it?

Comment 9 Ales Musil 2022-04-21 13:19:56 UTC

*** This bug has been marked as a duplicate of bug 2069783 ***

Comment 13 Elvira 2022-09-07 13:55:01 UTC
ovn-2021-21.12.0-82.el8fdp.x86_64 is now available in puddle RHOS-16.2-RHEL-8-20220902.n.1. Moving to MODIFIED.

