Bug 2091911
| Summary: | virt-launcher pod remains as NonRoot after LiveMigrating VM from NonRoot to Root | ||
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Akriti Gupta <akrgupta> |
| Component: | Virtualization | Assignee: | Jed Lejosne <jlejosne> |
| Status: | CLOSED ERRATA | QA Contact: | Akriti Gupta <akrgupta> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.11.0 | CC: | acardace, cnv-qe-bugs, jlejosne, kbidarka |
| Target Milestone: | --- | ||
| Target Release: | 4.11.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | virt-launcher-v4.11.0-97 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-09-14 19:35:07 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Akriti Gupta
2022-05-31 10:12:37 UTC
Starting With NonRoot FeatureGate set as True in HCO CR
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ oc get hyperconverged kubevirt-hyperconverged -n openshift-cnv -o json | jq '.spec.featureGates'
{
"deployTektonTaskResources": false,
"enableCommonBootImageImport": true,
"nonRoot": true,
"sriovLiveMigration": true,
"withHostPassthroughCPU": false
}
VMI is in Running State
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ oc get vmi
NAME AGE PHASE IP NODENAME READY
vm-rhel8520-ocs 112s Running xx.yy.zz.aa virt-akr-411-hptcp-worker-0-wspsz True
Check VirtLauncher Pod, It is NonRoot
---------------------------------------
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ oc get pods
NAME READY STATUS RESTARTS AGE
virt-launcher-vm-rhel8520-ocs-clt5w 1/1 Running 0 2m41s
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ oc get pod virt-launcher-vm-rhel8520-ocs-clt5w -o json | jq '.spec.securityContext'
{
"runAsGroup": 107,
"runAsNonRoot": true,
"runAsUser": 107,
"seLinuxOptions": {
"type": "virt_launcher.process"
}
}
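The check above can also be scripted. The following is a minimal sketch, not part of the original report: the helper name `launcher_mode` is hypothetical, and it assumes that only the pod-level `runAsUser` and `runAsNonRoot` fields matter for deciding whether the launcher runs as root.

```python
import json


def launcher_mode(security_context):
    """Classify a pod-level securityContext as 'root', 'nonroot', or 'unknown'.

    Hypothetical helper: it only looks at runAsUser and runAsNonRoot.
    """
    uid = security_context.get("runAsUser")
    if uid == 0:
        return "root"
    if uid is not None or security_context.get("runAsNonRoot") is True:
        return "nonroot"
    # Neither field is set: the effective user depends on the image default.
    return "unknown"


# securityContext as printed by `jq '.spec.securityContext'` in this report.
sample = json.loads("""
{
  "runAsGroup": 107,
  "runAsNonRoot": true,
  "runAsUser": 107,
  "seLinuxOptions": {"type": "virt_launcher.process"}
}
""")

print(launcher_mode(sample))  # nonroot
```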
Update NonRoot FeatureGate as False in HCO CR
----------------------------------------------
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ oc get hyperconverged kubevirt-hyperconverged -n openshift-cnv -o json | jq '.spec.featureGates'
{
"deployTektonTaskResources": false,
"enableCommonBootImageImport": true,
"nonRoot": false,
"sriovLiveMigration": true,
"withHostPassthroughCPU": false
}
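The report shows the resulting feature gates but not the command used to change them. One possible way, sketched here as an assumption, is a JSON merge patch applied with `oc patch ... --type=merge`. The helper names `nonroot_patch` and `patch_command` are hypothetical; the resource name and namespace are taken from the `oc get` command above.

```python
import json


def nonroot_patch(enabled):
    """Build a JSON merge patch body that sets spec.featureGates.nonRoot."""
    return json.dumps({"spec": {"featureGates": {"nonRoot": enabled}}})


def patch_command(enabled, name="kubevirt-hyperconverged", namespace="openshift-cnv"):
    """Return the argv for an `oc patch` call (assumed workflow, not from the report)."""
    return [
        "oc", "patch", "hyperconverged", name,
        "-n", namespace,
        "--type=merge",
        "-p", nonroot_patch(enabled),
    ]


# Print the command instead of running it, so the sketch has no side effects.
print(" ".join(patch_command(False)))
```

The list returned by `patch_command` could be passed to `subprocess.run` on a machine that is logged in to the cluster.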
LiveMigrate the VMI
---
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ oc get vmi
NAME AGE PHASE IP NODENAME READY
vm-rhel8520-ocs 5m56s Running xx.yy.zz.aa virt-akr-411-hptcp-worker-0-wspsz True
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ virtctl migrate vm-rhel8520-ocs
VM vm-rhel8520-ocs was scheduled to migrate
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ oc get vmi
NAME AGE PHASE IP NODENAME READY
vm-rhel8520-ocs 6m30s Running xx.yy.zz.bb virt-akr-411-hptcp-worker-0-88894 True
Check VirtLauncher Pod, It should be Root
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ oc get pods
NAME READY STATUS RESTARTS AGE
virt-launcher-vm-rhel8520-ocs-7b6w9 1/1 Running 0 50s
virt-launcher-vm-rhel8520-ocs-clt5w 0/1 Completed 0 7m
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ oc get pod virt-launcher-vm-rhel8520-ocs-7b6w9 -o json | jq '.spec.securityContext'
{
"runAsGroup": 107,
"runAsNonRoot": true,
"runAsUser": 107,
"seLinuxOptions": {
"type": "virt_launcher.process"
}
}
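The failure is the mismatch between the two JSON outputs: the gate says `"nonRoot": false`, yet the migration target pod still has `"runAsUser": 107`. A small sketch of that comparison follows; the function name `gate_matches_pod` is hypothetical, and treating `runAsUser == 0` as the only sign of a root launcher is an assumption.

```python
def gate_matches_pod(feature_gates, security_context):
    """Return True when the pod's user agrees with the nonRoot feature gate.

    Assumption: a launcher counts as root only when runAsUser is exactly 0.
    """
    wants_nonroot = bool(feature_gates.get("nonRoot", False))
    pod_is_root = security_context.get("runAsUser") == 0
    return wants_nonroot != pod_is_root


# Values copied from the outputs above: gate disabled, target pod still UID 107.
gates_after_update = {"nonRoot": False, "sriovLiveMigration": True}
target_pod_context = {"runAsGroup": 107, "runAsNonRoot": True, "runAsUser": 107}

if not gate_matches_pod(gates_after_update, target_pod_context):
    print("mismatch: nonRoot gate is off but the launcher is still non-root")
```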
I tested this again. While doing so, I made sure the KV CR did get updated, that is, "NonRoot" did get dropped from the FG list, before I triggered a migration.

Targeting this to 4.11 while we assess the impact. There exists a workaround, which is to reboot the affected VM.

Reproduced upstream, PRed a fix candidate: https://github.com/kubevirt/kubevirt/pull/7841
I am still working on this, hoping to have functional/unit tests written by the end of the day.

Talking about this offline with some people, I do wonder what the point of it is, from a user perspective at least. VMIs are mostly immutable, so a non-root -> root migration can't be used to add devices/features to a running VM. It does bring some coherence (enabling non-root migrates root VMs to non-root, so why not the other way round). In some very specific situations, it could allow us to ensure some issue is not caused by the non-root feature, without having to reboot a given VM... But that's about all I can think of. I'd be curious if other use-cases exist. Thoughts?

Checked with v4.11.0-521
NonRoot FeatureGate set as True in HCO CR
------------------------------------------
[cnv-qe-jenkins@virt-akr-411-hptcp-executor ~]$ oc get hyperconverged kubevirt-hyperconverged -n openshift-cnv -o json | jq '.spec.featureGates'
{
"deployTektonTaskResources": false,
"enableCommonBootImageImport": true,
"nonRoot": true,
"sriovLiveMigration": true,
"withHostPassthroughCPU": false
}
VMI is in Running State
[cnv-qe-jenkins@virt-akr-411-n45lw-executor ~]$ oc get vmi
NAME AGE PHASE IP NODENAME READY
vm-rhel84-ocs 20m Running 10.128.2.150 virt-akr-411-n45lw-worker-0-rkhhf True
Check VirtLauncher Pod, It is NonRoot
---------------------------------------
[cnv-qe-jenkins@virt-akr-411-n45lw-executor ~]$ oc get pods
NAME READY STATUS RESTARTS AGE
virt-launcher-vm-rhel84-ocs-fx7xl 1/1 Running 0 22m
[cnv-qe-jenkins@virt-akr-411-n45lw-executor ~]$ oc get pods virt-launcher-vm-rhel84-ocs-fx7xl -o json | jq '.spec.securityContext'
{
"runAsGroup": 107,
"runAsNonRoot": true,
"runAsUser": 107,
"seLinuxOptions": {
"type": "virt_launcher.process"
}
}
Update NonRoot FeatureGate as False in HCO CR
----------------------------------------------
[cnv-qe-jenkins@virt-akr-411-n45lw-executor ~]$ oc get hyperconverged kubevirt-hyperconverged -n openshift-cnv -o json | jq '.spec.featureGates'
{
"deployTektonTaskResources": false,
"enableCommonBootImageImport": true,
"nonRoot": false,
"sriovLiveMigration": true,
"withHostPassthroughCPU": false
}
LiveMigrate the VMI
---------------------
[cnv-qe-jenkins@virt-akr-411-n45lw-executor ~]$ oc get vmi
NAME AGE PHASE IP NODENAME READY
vm-rhel84-ocs 59m Running 10.128.2.150 virt-akr-411-n45lw-worker-0-rkhhf True
[cnv-qe-jenkins@virt-akr-411-n45lw-executor ~]$ virtctl migrate vm-rhel84-ocs
VM vm-rhel84-ocs was scheduled to migrate
[cnv-qe-jenkins@virt-akr-411-n45lw-executor ~]$ oc get vmi
NAME AGE PHASE IP NODENAME READY
vm-rhel84-ocs 60m Running 10.131.0.160 virt-akr-411-n45lw-worker-0-8d2sm True
Check VirtLauncher Pod, It should be Root
------------------------------------------
[cnv-qe-jenkins@virt-akr-411-n45lw-executor ~]$ oc get pods
NAME READY STATUS RESTARTS AGE
virt-launcher-vm-rhel84-ocs-b6d55 1/1 Running 0 30m
virt-launcher-vm-rhel84-ocs-fx7xl 0/1 Completed 0 90m
[cnv-qe-jenkins@virt-akr-411-n45lw-executor ~]$ oc get pods virt-launcher-vm-rhel84-ocs-b6d55 -o json | jq '.spec.securityContext'
{
"runAsUser": 0,
"seLinuxOptions": {
"type": "virt_launcher.process"
}
}
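After a migration two launcher pods exist, and only the Running one should be inspected. The sketch below picks it out of `oc get pods -o json` style output. The function name `running_launchers` is hypothetical, matching pods by the `virt-launcher-<vmi>-` name prefix is an assumption based on the names seen in this report, and the sample data is hand-built. Note that the `Completed` status in the table view corresponds to the pod phase `Succeeded` in JSON.

```python
def running_launchers(pod_list, vmi_name):
    """Return names of Running virt-launcher pods for the given VMI.

    Assumption: launcher pods are named virt-launcher-<vmi>-<suffix>.
    """
    prefix = "virt-launcher-" + vmi_name + "-"
    return [
        pod["metadata"]["name"]
        for pod in pod_list.get("items", [])
        if pod["metadata"]["name"].startswith(prefix)
        and pod.get("status", {}).get("phase") == "Running"
    ]


# Hand-built stand-in for `oc get pods -o json`, using the pod names above.
pods = {
    "items": [
        {"metadata": {"name": "virt-launcher-vm-rhel84-ocs-b6d55"},
         "status": {"phase": "Running"}},
        {"metadata": {"name": "virt-launcher-vm-rhel84-ocs-fx7xl"},
         "status": {"phase": "Succeeded"}},
    ]
}

print(running_launchers(pods, "vm-rhel84-ocs"))
```

While a migration is still in flight, both source and target pods can be Running, so a list with more than one entry is possible.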
(In reply to Akriti Gupta from comment #6)
> Checked with v4.11.0-521

Working as expected; migration from nonroot to root works fine.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Virtualization 4.11.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:6526 |