Bug 1882785

Summary: Multi-Arch CI Jobs destroy libvirt network but occasionally leave it defined
Product: OpenShift Container Platform Reporter: Jeremy Poulin <jpoulin>
Component: Multi-Arch Assignee: Jeremy Poulin <jpoulin>
Status: CLOSED ERRATA QA Contact: Rafael Fonseca <rdossant>
Severity: low Docs Contact:
Priority: low    
Version: 4.6 CC: clnperez, danili, dmistry
Target Milestone: --- Keywords: TestOnly
Target Release: 4.7.0   
Hardware: s390x   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Clones: 1910158 Environment:
Last Closed: 2021-02-24 15:21:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1910158    

Description Jeremy Poulin 2020-09-25 16:53:57 UTC
Description of problem:
This doesn't happen very often, but occasionally teardown fails to fully undefine the libvirt network used for a CI job. This is bad because any subsequent job that is leased that subnet will fail to create a cluster, leaving that lease completely broken until someone manually intervenes.

This seems to be happening far more often with 4.6.
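
For anyone checking a host by hand, a minimal sketch of how the leftover state could be spotted: a network that was destroyed but not undefined should still show up in `virsh net-list --all` with state "inactive". The function names, the default connection URI, and the network names in the usage note are hypothetical and not part of the CI tooling. The parsing assumes virsh's default English table output.

```python
import subprocess


def parse_net_list(output):
    """Parse `virsh net-list --all` text into (name, state) pairs."""
    networks = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the dashed separator (a single token),
        # and the column header row.
        if len(fields) < 2 or fields[0] == "Name":
            continue
        networks.append((fields[0], fields[1]))
    return networks


def find_stale_networks(output):
    """Return networks that are still defined but no longer active."""
    return [name for name, state in parse_net_list(output) if state == "inactive"]


def list_stale_networks(uri="qemu:///system"):
    """Run virsh against a libvirt host (URI is an assumption) and report leftovers."""
    result = subprocess.run(
        ["virsh", "--connect", uri, "net-list", "--all"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_stale_networks(result.stdout)
```

Given a listing where `default` is active and a hypothetical `ci-net-3` is inactive, `find_stale_networks` returns only `ci-net-3`. Each reported name could then be removed manually with `virsh net-undefine <name>`, after confirming that no running job still owns it.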

Comment 1 Dan Li 2020-09-28 13:16:17 UTC
Hi Jeremy, which Target Release are you targeting for this bug (4.6 or 4.7)? Currently it is considered "Untriaged" and it would be great if we can provide a target release.

Comment 2 Dan Li 2020-09-28 17:20:17 UTC
Hi Jeremy, one more logistics question - will this bug be resolved before the end of this Sprint (October 3rd)? If not, can we add "UpcomingSprint"?

Comment 3 Jeremy Poulin 2020-09-28 17:30:53 UTC
Hi Dan - very unlikely to be resolved this week.

We're still in the monitoring phase to determine what causes the underlying problem, so I have added the UpcomingSprint label.

Comment 4 Dan Li 2020-10-19 20:23:46 UTC
Adding the "UpcomingSprint" tag as Jeremy is OOTO and this bug is unlikely to be resolved before the end of this Sprint (Oct 24th).

Comment 5 Dan Li 2020-11-12 00:58:03 UTC
Hi Jeremy, will this bug be resolved before the end of this sprint (Nov 14th)? If not, can we add "UpcomingSprint"?

Comment 6 Jeremy Poulin 2020-11-12 15:21:09 UTC
There isn't a clear path forward on how to fix this yet, but this will likely be targeted for post step-registry migration.

Comment 7 Dan Li 2020-12-02 18:45:24 UTC
Hi Jeremy, will this bug be resolved before the end of this sprint (Dec 5th)? If not, can we add "UpcomingSprint"?

Comment 8 Jeremy Poulin 2020-12-02 21:39:02 UTC
This is affected by the work that Deep is doing with the step registry migration. I don't think it will make it into next sprint, but I can see it becoming higher priority once we knock out some of the major stability improvements. Marking this "UpcomingSprint".

Comment 9 Dan Li 2020-12-15 18:30:00 UTC
Hi Jeremy,

I am doing this exercise one week early because most people are out next week. 

1. Do you think this bug will be resolved before the end of this sprint (December 26th)? If not, I'd like to add "UpcomingSprint"
2. Do you think this bug's Target Release is still 4.7.0? If it does not target 4.7, can we set it to blank value "---"?

Comment 10 Jeremy Poulin 2020-12-17 23:36:31 UTC
Adding UpcomingSprint and setting the Target Release to blank. I did have a conversation with the test platform team about this bug, but right now it remains lower priority than all our other work.
https://coreos.slack.com/archives/CBN38N3MW/p1608140054245400

Comment 14 errata-xmlrpc 2021-02-24 15:21:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633