Bug 2083389

Summary: SRIOV stuck in sriovnetwork state draining
Product: OpenShift Container Platform Reporter: John Coleman <jocolema>
Component: Networking Assignee: Sebastian Scheinkman <sscheink>
Networking sub component: SR-IOV QA Contact: zhaozhanqi <zzhao>
Status: CLOSED DUPLICATE Docs Contact:
Severity: high    
Priority: high CC: bnemeth, cgoncalves, eglottma, johender, keyoung, kjavier, mleonard, openshift-bugs-escalate, sscheink, sushil.suresh, zshi
Version: 4.8   
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-06-30 10:17:30 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2095210    
Bug Blocks:    

Comment 2 Balazs Nemeth 2022-05-12 11:06:18 UTC
The reason why we have the drain lock is explained here: https://bugzilla.redhat.com/show_bug.cgi?id=1960103


If you still managed to get the cluster into a bad state (e.g. by rebooting at some intermediate state), we need a reproducer that specifies exactly how to get into that state. I presume this is some kind of race condition. Only with a reproducer can we try to make the code more robust for that edge case.

Comment 8 milti leonard 2022-05-19 17:07:00 UTC
@balazs I believe John shared the steps for reproducing earlier. Can you please give an update on next steps, or your thoughts on what the issue here might be?

Comment 10 Balazs Nemeth 2022-05-19 17:42:43 UTC
@milti

a) We currently do not know _why_ we get into this state. If we could avoid it in the first place, there would be no problem at all.
b) I agree that there is a way to get into this state, and once you are in it, you are stuck. That part we should be able to patch.

My next step will be to provide a custom image that fixes b).