Bug 1721018

Summary: [V2V] Failed to create VM after deleting conversion pod
Product: Container Native Virtualization (CNV) Reporter: Maayan Hadasi <mguetta>
Component: V2V Assignee: Marek Libra <mlibra>
Status: CLOSED ERRATA QA Contact: Maayan Hadasi <mguetta>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 2.0 CC: bthurber, cnv-qe-bugs, dagur, ibragins, istein, mlibra, ncredi, sgordon, tjelinek
Target Milestone: ---   
Target Release: 2.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openshift-enterprise-console v4.2 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-01-30 16:27:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
web-UI_screenshot none

Description Maayan Hadasi 2019-06-17 07:15:56 UTC
Description of problem:

In the web UI, after a previous migration has completed successfully, the conversion pod has to be removed because only one can exist at a time. However, the first migration after deleting the conversion pod fails with:

"No API token found for service account "kubevirt-v2v-conversion-9mvxb", retry after the token is automatically created and added to the service account"


Version-Release number of selected component (if applicable):
HCO_BUNDLE_REGISTRY_TAG: v2.0.0-24


How reproducible:
Almost every time. It depends on how much time passes between deleting the pod and the next VM creation.


Steps to Reproduce:
1. Migrate VM from VMware to CNV
2. Delete conversion pod
3. Migrate VM from VMware to CNV


Actual results:
Step 3 fails


Expected results:


Additional info:

Marek's mail:

Hi,

The issue is most probably caused by a sequential dependency between the creation of the ServiceAccount and the conversion pod.
Once a ServiceAccount object is created, a controller creates API tokens (as secrets) for it.
When a pod is being created, the API checks whether the referenced ServiceAccount already has these tokens. If not, the reported error is returned.
In the web UI, the ServiceAccount and the conversion pod are created at nearly the same time, so this issue can appear non-deterministically.

I suggest opening a bug and discussing its targeting in the wider context.

Possible workarounds:
- use a "shared" service account for all conversion pod instances within a single namespace
- postpone conversion pod creation until the ServiceAccount has >0 secrets assigned (with a timeout)
- retry conversion pod creation if it fails (with a delay and a limited retry count)

I think the first option is the best one.

Please note that in the 4.2 timeframe this logic will be moved from the UI layer to a brand-new specialized operator.
This issue will be taken into consideration when implementing it.

Marek
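
The retry workaround from the list above (retry creation with a delay and a limited retry count) could look roughly like the following Python sketch. It is only an illustration, not the code used in the web UI: `create_pod` and `TokenNotReadyError` are hypothetical names standing in for the real pod-creation call and for the "No API token found for service account" rejection.

```python
import time


class TokenNotReadyError(Exception):
    """Hypothetical stand-in for the 'No API token found' rejection."""


def create_with_retry(create_pod, retries=5, delay_s=2.0, sleep=time.sleep):
    """Call create_pod(), retrying a limited number of times on token errors."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            return create_pod()
        except TokenNotReadyError:
            if attempt == retries:
                raise  # retry budget exhausted, surface the last error
            sleep(delay_s)  # give the token controller time to catch up
```

Only the token-related error is retried here, so an unrelated failure (for example a malformed pod spec) would still surface immediately instead of being masked by the retry loop.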

Comment 1 Maayan Hadasi 2019-06-17 07:17:41 UTC
Created attachment 1581324 [details]
web-UI_screenshot

Comment 5 Marek Libra 2019-08-01 08:23:47 UTC
Patch: https://github.com/kubevirt/web-ui-components/pull/533

Comment 6 Marek Libra 2019-08-01 08:25:30 UTC
As the v2v deployment is about to be moved out of the UI into a separate operator, I have fixed the issue by postponing conversion pod creation until the corresponding ServiceAccount is ready (that is, it has tokens associated).
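
For illustration, the "wait until the ServiceAccount is ready" idea can be sketched as a polling loop with a timeout. This is a minimal Python sketch under assumptions and is not taken from the linked patch: `get_service_account` is a hypothetical callable that returns the current ServiceAccount object as a dict, and the timeout and interval values are arbitrary.

```python
import time


def wait_for_tokens(get_service_account, timeout_s=30.0, interval_s=0.5,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the ServiceAccount lists at least one secret, or time out."""
    deadline = clock() + timeout_s
    while True:
        service_account = get_service_account()
        # A ServiceAccount object carries its token secrets in a `secrets` list.
        if service_account.get("secrets"):
            return service_account
        if clock() >= deadline:
            raise TimeoutError("ServiceAccount has no token secrets yet")
        sleep(interval_s)
```

A caller would create the ServiceAccount, call `wait_for_tokens(...)`, and only then submit the conversion pod, so the pod is never created before the tokens exist.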

Comment 7 Tomas Jelinek 2019-09-30 15:05:50 UTC
Since v2v has been removed from 2.1.0, retargeting bugs to 2.1.1 to make sure they are verified on the right version.

Comment 8 Brett Thurber 2019-10-21 18:45:40 UTC
Moving to ON_QA to test for CNV 2.1.1.

Comment 11 Ilanit Stein 2019-12-12 09:42:02 UTC
Moving bug to VERIFIED,
based on the fact that a couple of consecutive single-VM migrations did not show the error
mentioned in the description.
Tested on Openshift-4.3/CNV-2.2.0-10
(PSI-based environment).

Comment 13 errata-xmlrpc 2020-01-30 16:27:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0307