Bug 1721018 - [V2V] Failed to create VM after deleting conversion pod
Summary: [V2V] Failed to create VM after deleting conversion pod
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: V2V
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 2.2.0
Assignee: Marek Libra
QA Contact: Maayan Hadasi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-06-17 07:15 UTC by Maayan Hadasi
Modified: 2020-01-30 16:27 UTC

Fixed In Version: openshift-enterprise-console v4.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-30 16:27:11 UTC
Target Upstream Version:
Embargoed:


Attachments
web-UI_screenshot (45.45 KB, image/png)
2019-06-17 07:17 UTC, Maayan Hadasi


Links
System                 | ID             | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2020:0307 | 0       | None     | None   | None    | 2020-01-30 16:27:21 UTC

Description Maayan Hadasi 2019-06-17 07:15:56 UTC
Description of problem:

In the web UI, after a migration completes successfully, the conversion pod has to be removed, because only one can exist at a time. However, the first migration after deleting the conversion pod fails with:

"No API token found for service account "kubevirt-v2v-conversion-9mvxb", retry after the token is automatically created and added to the service account"


Version-Release number of selected component (if applicable):
HCO_BUNDLE_REGISTRY_TAG: v2.0.0-24


How reproducible:
Almost every time. It depends on how much time passes between deleting the pod and the next VM creation.


Steps to Reproduce:
1. Migrate VM from VMware to CNV
2. Delete conversion pod
3. Migrate VM from VMware to CNV


Actual results:
Step 3 fails


Expected results:


Additional info:

Marek's mail:

Hi,

the issue is most probably caused by a sequential dependency between the creation of the ServiceAccount and the conversion pod.
Once a ServiceAccount object is created, a controller creates API tokens (as secrets) for it.
When a pod is being created, the API checks whether the referenced ServiceAccount has these tokens. If not, the error quoted above is returned.
In the web UI, the ServiceAccount and the conversion pod are created at nearly the same time, so this issue can appear non-deterministically.

I suggest opening a bug and discussing its targeting in the wider context.

Possible workarounds:
- use a "shared" ServiceAccount for all conversion pod instances within a single namespace
- postpone conversion pod creation until the ServiceAccount has >0 secrets assigned (with a timeout)
- retry conversion pod creation if it fails (with a delay and a limited retry count)

I think the first option is the best one.

Please note that in the 4.2 timeframe this logic will be moved from the UI layer to a brand new specialized operator.
This issue will be taken into consideration when implementing it.

Marek
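
For illustration only, the third workaround from the mail above (retry with a delay and a limited retry count) could be sketched roughly as follows. This is not the code used in the web UI: `TokenNotReadyError`, `create_with_retry`, and the `create_pod` callable are hypothetical names, and the sketch assumes the caller translates the "No API token found" rejection into that exception.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TokenNotReadyError(Exception):
    """Hypothetical error standing in for the 'No API token found' rejection."""


def create_with_retry(
    create_pod: Callable[[], T],
    max_attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call create_pod(), retrying while the ServiceAccount token is missing.

    create_pod is a caller-supplied function (an assumption of this sketch)
    that raises TokenNotReadyError when the API rejects the pod because the
    ServiceAccount has no token yet.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return create_pod()
        except TokenNotReadyError:
            if attempt == max_attempts:
                raise  # retry budget exhausted, surface the original error
            sleep(delay_s)  # give the token controller time to catch up
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the delay can be replaced in tests. A fixed delay is used here for simplicity; the mail does not say which delay strategy was intended.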

Comment 1 Maayan Hadasi 2019-06-17 07:17:41 UTC
Created attachment 1581324
web-UI_screenshot

Comment 5 Marek Libra 2019-08-01 08:23:47 UTC
Patch: https://github.com/kubevirt/web-ui-components/pull/533

Comment 6 Marek Libra 2019-08-01 08:25:30 UTC
As the v2v deployment is about to be moved out of the UI into a separate operator, I have fixed the issue by postponing conversion pod creation until the corresponding ServiceAccount is ready (meaning it has tokens associated).
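
The idea of waiting until the ServiceAccount is ready can be sketched as a small polling loop. This is a minimal sketch and not the code from the linked patch: `has_tokens`, `wait_for_service_account`, and the `fetch` callable are hypothetical names, and the sketch assumes the ServiceAccount is available as a plain dict whose `secrets` list is filled in once the tokens exist.

```python
import time
from typing import Any, Callable, Mapping


def has_tokens(service_account: Mapping[str, Any]) -> bool:
    """Return True once the ServiceAccount lists at least one secret."""
    return len(service_account.get("secrets") or []) > 0


def wait_for_service_account(
    fetch: Callable[[], Mapping[str, Any]],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll fetch() until the ServiceAccount has tokens or the timeout expires.

    fetch is a caller-supplied function (an assumption of this sketch) that
    returns the current ServiceAccount object as a dict.
    Returns True when tokens are present, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if has_tokens(fetch()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The conversion pod would then be created only when this function returns True; on False the caller can report a timeout instead of the "No API token found" error.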

Comment 7 Tomas Jelinek 2019-09-30 15:05:50 UTC
Since v2v has been removed from 2.1.0, retargeting bugs to 2.1.1 to make sure they will be verified on the right version.

Comment 8 Brett Thurber 2019-10-21 18:45:40 UTC
Moving to ON_QA to test for CNV 2.1.1.

Comment 11 Ilanit Stein 2019-12-12 09:42:02 UTC
Moving bug to VERIFIED,
based on the fact that a couple of consecutive single-VM migrations did not show the error
mentioned in the description.
Tested on OpenShift-4.3/CNV-2.2.0-10
(PSI-based environment).

Comment 13 errata-xmlrpc 2020-01-30 16:27:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0307

