Bug 2152875 - Unable to do post-copy migration
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 4.12.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.12.4
Assignee: Itamar Holder
QA Contact: Denys Shchedrivyi
URL:
Whiteboard:
Duplicates: 2152242
Depends On:
Blocks: 2164836
Reported: 2022-12-13 11:29 UTC by lpivarc
Modified: 2023-06-14 12:17 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2164836
Environment:
Last Closed: 2023-06-14 12:17:08 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt hyperconverged-cluster-operator pull 2220 | 0 | None | Merged | KubeVirt: enable the seccomp feature gate and custom profile | 2023-01-30 16:10:32 UTC
Github | kubevirt kubevirt pull 8941 | 0 | None | closed | Launcher: Unconfined seccomp | 2023-03-10 12:16:29 UTC
Github | kubevirt kubevirt pull 8978 | 0 | None | Merged | [release-0.58] SCC: allow unconfined seccomp | 2023-03-11 08:38:14 UTC
Red Hat Issue Tracker | CNV-23367 | 0 | None | None | None | 2022-12-13 11:39:09 UTC

Description lpivarc 2022-12-13 11:29:51 UTC
Description of problem:

Beginning with OCP 4.12, we are unable to do post-copy migration because the SCC now applies a default seccomp profile. The default seccomp profile shipped with OCP blocks the userfaultfd syscall, which post-copy migration relies on.
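
To illustrate the failure mode: a seccomp profile in the OCI/Docker JSON format has a `defaultAction` plus a list of `syscalls` rules, and a syscall that no rule names falls back to the default action. The sketch below is a simplified lookup under that assumption. It ignores argument filters, architectures, and conditional includes/excludes, and both sample profiles are hypothetical; they are not the profiles shipped by OCP or KubeVirt.

```python
import json


def syscall_action(profile: dict, name: str) -> str:
    """Return the action a simplified OCI-style seccomp profile applies to `name`."""
    for rule in profile.get("syscalls", []):
        if name in rule.get("names", []):
            return rule["action"]
    # No rule mentions the syscall, so the profile's default action applies.
    return profile["defaultAction"]


# Hypothetical profile that does not list userfaultfd (illustration only).
restrictive = json.loads("""
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {"names": ["read", "write", "mmap"], "action": "SCMP_ACT_ALLOW"}
  ]
}
""")

# Hypothetical custom profile that additionally allows userfaultfd.
permissive = json.loads("""
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {"names": ["read", "write", "mmap", "userfaultfd"], "action": "SCMP_ACT_ALLOW"}
  ]
}
""")

print(syscall_action(restrictive, "userfaultfd"))  # default action: denied
print(syscall_action(permissive, "userfaultfd"))   # explicitly allowed
```

With a profile like the first one, a call to userfaultfd is denied, which would be consistent with QEMU later reporting that post-copy is unavailable.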


Comment 1 sgott 2022-12-14 13:04:36 UTC
Proposing this as a blocker to 4.12.0 because it is a regression of a major feature.

Comment 2 Kedar Bidarkar 2022-12-14 13:10:30 UTC
*** Bug 2152242 has been marked as a duplicate of this bug. ***

Comment 4 sgott 2022-12-19 15:44:44 UTC
Per offline discussion, post-copy is not enabled in CNV, so this should not block the release.

Comment 6 Denys Shchedrivyi 2023-03-20 18:11:13 UTC
Verified on v4.12.2-18 - VM successfully migrated in PostCopy mode.

Comment 7 Denys Shchedrivyi 2023-03-20 19:14:08 UTC
 It seems that the problem still exists (or this is a different problem).

 While running automation, we see that the migration fails to switch to post-copy mode:

> {"component":"virt-launcher","level":"info","msg":"unable to execute QEMU command {\"execute\":\"migrate-set-capabilities\",\"arguments\":{\"capabilities\":[{\"capability\":\"xbzrle\",\"state\":false},{\"capability\":\"auto-converge\",\"state\":false},{\"capability\":\"rdma-pin-all\",\"state\":false},{\"capability\":\"postcopy-ram\",\"state\":true},{\"capability\":\"compress\",\"state\":false},{\"capability\":\"pause-before-switchover\",\"state\":false},{\"capability\":\"late-block-activate\",\"state\":true},{\"capability\":\"multifd\",\"state\":false},{\"capability\":\"dirty-bitmaps\",\"state\":false},{\"capability\":\"return-path\",\"state\":true}]},\"id\":\"libvirt-402\"}: {\"id\":\"libvirt-402\",\"error\":{\"class\":\"GenericError\",\"desc\":\"Postcopy is not supported\"}}","pos":"qemuMonitorJSONCheckErrorFull:388","subcomponent":"libvirt","thread":"28","timestamp":"2023-03-20T18:34:05.643000Z"}

> {"component":"virt-launcher","level":"error","msg":"internal error: unable to execute QEMU command 'migrate-set-capabilities': Postcopy is not supported","pos":"qemuMonitorJSONCheckErrorFull:402","subcomponent":"libvirt","thread":"28","timestamp":"2023-03-20T18:34:05.643000Z"}
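
When scanning automation logs for this failure, a small filter over the virt-launcher JSON log lines can help. The following is a minimal sketch, assuming each log line is a single JSON object with `msg` and `subcomponent` fields as in the lines quoted above; the function name `postcopy_rejected` is made up for illustration.

```python
import json


def postcopy_rejected(log_line: str) -> bool:
    """Return True if a virt-launcher JSON log line reports QEMU rejecting post-copy."""
    # Drop a leading quote marker ("> ") if the line was pasted from a comment.
    line = log_line.lstrip("> ").strip()
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return False  # not a JSON log line
    if not isinstance(entry, dict):
        return False
    return (
        entry.get("subcomponent") == "libvirt"
        and "Postcopy is not supported" in entry.get("msg", "")
    )


# The error-level line quoted above, used as sample input.
sample = r'''> {"component":"virt-launcher","level":"error","msg":"internal error: unable to execute QEMU command 'migrate-set-capabilities': Postcopy is not supported","pos":"qemuMonitorJSONCheckErrorFull:402","subcomponent":"libvirt","thread":"28","timestamp":"2023-03-20T18:34:05.643000Z"}'''

print(postcopy_rejected(sample))
```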
 

 However, in manual tests it worked fine:

>    Migration State:
>    Completed:      true
>    End Timestamp:  2023-03-20T18:08:04Z
>    Migration Configuration:
>      Allow Auto Converge:                    false
>      Allow Post Copy:                        true
>      Bandwidth Per Migration:                0
>      Completion Timeout Per Gi B:            1
>      Node Drain Taint Key:                   kubevirt.io/drain
>      Parallel Migrations Per Cluster:        5
>      Parallel Outbound Migrations Per Node:  2
>      Progress Timeout:                       150
>      Unsafe Migration Override:              false
>    Migration Policy Name:                    policy1
>    Migration UID:                            6d45f8fc-9dda-44cd-b9bd-940140b3bdf5
>    Mode:                                     PostCopy
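
The same check can be done programmatically on the VMI object. This is a sketch only: it assumes the VMI has been loaded as a dictionary (for example from its JSON representation) and that the fields shown in the describe output above map to `status.migrationState.completed` and `status.migrationState.mode`. The helper name `migrated_in_postcopy` is hypothetical.

```python
def migrated_in_postcopy(vmi: dict) -> bool:
    """Return True if the VMI's last migration completed in PostCopy mode."""
    state = vmi.get("status", {}).get("migrationState") or {}
    return bool(state.get("completed")) and state.get("mode") == "PostCopy"


# Sample built from the values in the describe output above.
vmi = {
    "status": {
        "migrationState": {
            "completed": True,
            "endTimestamp": "2023-03-20T18:08:04Z",
            "migrationConfiguration": {"allowPostCopy": True},
            "migrationPolicyName": "policy1",
            "mode": "PostCopy",
        }
    }
}

print(migrated_in_postcopy(vmi))
```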

Comment 13 Kedar Bidarkar 2023-06-14 12:17:08 UTC
We have decided to close this bug because:
a) it is already fixed in 4.13.z
b) post-copy migration is currently Dev-Preview
