This bug was initially created as a copy of Bug #2008511.

I am copying this bug because:

Description of problem:
While verifying another bug, a cluster was upgraded from OCP/CNV 4.6/2.5.8 to OCP/CNV 4.7.32/2.6.7, and then to OCP 4.8.13. CNV_SOURCE=osbs; the 2.5.8 to 2.6.7 upgrade path was: 2.5.8 -> 2.6.4 -> 2.6.5 -> 2.6.6 -> 2.6.7. After 2.6.7, the cluster status was GOOD.

When upgrading from OCP/CNV 4.7.32/2.6.7 to OCP 4.8.13 (that is, CNV 2.6.7 running on OCP 4.8.13), we see "VMI is in LiveMigrate loop". From the pastebin below, the upgrade appears complete, but the worker nodes have not yet been updated to the 4.8 images. The worker nodes are stuck because the vm2-ocs-rhel84 VM is in a constant LiveMigrate loop. I guess that if I stop/start or restart the VM, the worker nodes might continue with the upgrade, but I feel the VMI should have live-migrated successfully on its own.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Upgrade stuck due to a constant VMI live-migration loop.

Expected results:
Upgrade is not stuck due to a VMI live-migration loop.

Additional info:
Running must-gather shows the following ClusterOperators:

clusteroperator/machine-config is not available (Cluster not available for 4.8.13) because Failed to resync 4.8.13 because: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool worker is not ready, retrying.
Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 1)

clusteroperator/operator-lifecycle-manager is not upgradeable because ClusterServiceVersions blocking cluster upgrade: openshift-local-storage/local-storage-operator.4.6.0-202109030220 is incompatible with OpenShift minor versions greater than 4.8, openshift-cnv/kubevirt-hyperconverged-operator.v2.6.7 is incompatible with OpenShift minor versions greater than 4.8

1) The local-storage operator was not updated; it remains the version installed with 4.6/2.5.8.
2) OCS was not updated; it remains the version installed with 4.6/2.5.8.
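The blocked operators above can also be spotted programmatically from `oc get clusteroperators -o json` output. A minimal sketch, assuming the standard ClusterOperator condition schema (`status.conditions` entries with `type`, `status`, `message`); the sample object below is hypothetical, abridged from the must-gather output quoted above:

```python
def upgrade_blockers(clusteroperators):
    """Return (name, condition, message) for operators that are not Available,
    are Degraded, or report Upgradeable=False."""
    bad = []
    for co in clusteroperators["items"]:
        name = co["metadata"]["name"]
        for cond in co.get("status", {}).get("conditions", []):
            ctype, status = cond.get("type"), cond.get("status")
            if (ctype == "Available" and status == "False") or \
               (ctype == "Degraded" and status == "True") or \
               (ctype == "Upgradeable" and status == "False"):
                bad.append((name, ctype, cond.get("message", "")))
    return bad

# Hypothetical sample modeled on the must-gather output in this report.
sample = {"items": [
    {"metadata": {"name": "machine-config"},
     "status": {"conditions": [
         {"type": "Available", "status": "False",
          "message": "Cluster not available for 4.8.13"}]}},
    {"metadata": {"name": "operator-lifecycle-manager"},
     "status": {"conditions": [
         {"type": "Upgradeable", "status": "False",
          "message": "ClusterServiceVersions blocking cluster upgrade"}]}},
]}
for name, cond, msg in upgrade_blockers(sample):
    print(f"{name}: {cond} -> {msg}")
```

In practice the same information comes directly from `oc get clusteroperators` and the must-gather output quoted above.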
Summary: LiveMigration was successful directly from virt-launcher version v2.6.4 (to be precise, it's v2.6.3-2) to v4.8.4-12.

VERIFIED with:
virt-operator-container-v4.8.4-12
hco-bundle-registry-container-v4.8.4-44

[kbidarka@localhost migration]$ oc get pods
NAME                                 READY   STATUS      RESTARTS   AGE
virt-launcher-vm-rhel84-ocs-8nqps    0/1     Completed   0          90m
virt-launcher-vm-rhel84-ocs-pxp5b    1/1     Running     0          7m23s
virt-launcher-vm2-rhel84-ocs-9xldq   1/1     Running     0          9m41s
virt-launcher-vm2-rhel84-ocs-f7pd7   0/1     Error       0          90m

[kbidarka@localhost migration]$ oc get vmi
NAME             AGE   PHASE     IP              NODENAME
vm-rhel84-ocs    21h   Running   10.xxx.yy.zzz   virt05m-kswp6-worker-0-k2xtk
vm2-rhel84-ocs   21h   Running   10.xxx.yy.zzz   virt05m-kswp6-worker-0-k2xtk

[kbidarka@localhost migration]$ virtctl console vm-rhel84-ocs
Successfully connected to vm-rhel84-ocs console. The escape sequence is ^]
[cloud-user@vm-rhel84-ocs ~]$ uptime
 11:22:46 up 21:15,  1 user,  load average: 0.00, 0.00, 0.00

[kbidarka@localhost migration]$ virtctl console vm2-rhel84-ocs
Successfully connected to vm2-rhel84-ocs console. The escape sequence is ^]
[cloud-user@vm2-rhel84-ocs ~]$ uptime
 11:22:57 up 21:15,  1 user,  load average: 0.00, 0.00, 0.00

[kbidarka@localhost migration]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.25    True        False         19h     Cluster version is 4.8.25

[kbidarka@localhost migration]$ oc get csv -n openshift-cnv
NAME                                      DISPLAY                    VERSION   REPLACES                                  PHASE
kubevirt-hyperconverged-operator.v4.8.4   OpenShift Virtualization   4.8.4     kubevirt-hyperconverged-operator.v2.6.4   Succeeded

Direct CNV upgrade from v2.6.4 to v4.8.4
--------------------------------------------
[kbidarka@localhost ocp-cnv-scripts]$ oc get ip -A
NAMESPACE       NAME            CSV                                       APPROVAL   APPROVED
openshift-cnv   install-fw9n5   kubevirt-hyperconverged-operator.v2.6.4   Manual     true
openshift-cnv   install-v42hc   kubevirt-hyperconverged-operator.v4.8.4   Manual     true

---
Source virt-launcher pod:
]$ ./fetch_pod_version_info_icsp.sh virt-launcher-vm-rhel84-ocs-8nqps default
Output: container-native-virtualization/virt-launcher/images/v2.6.3-2

Target virt-launcher pod:
]$ ./fetch_pod_version_info_icsp.sh virt-launcher-vm-rhel84-ocs-pxp5b default
Output: container-native-virtualization/virt-launcher/images/v4.8.4-12
---

[kbidarka@localhost migration]$ oc logs -f virt-launcher-vm2-rhel84-ocs-f7pd7 | grep '\"level\":\"error\"'
{"component":"virt-launcher","level":"error","msg":"internal error: Child process (dmidecode -q -t 0,1,2,3,4,11,17) unexpected exit status 1: /dev/mem: No such file or directory","pos":"virCommandWait:2745","subcomponent":"libvirt","thread":"42","timestamp":"2022-01-07T14:52:15.221000Z"}
{"component":"virt-launcher","level":"error","msg":"Error when closing file","pos":"os_helper.go:34","reason":"invalid argument","timestamp":"2022-01-07T14:52:16.550089Z"}
{"component":"virt-launcher","level":"error","msg":"At least one cgroup controller is required: No such device or address","pos":"virCgroupDetect:415","subcomponent":"libvirt","thread":"30","timestamp":"2022-01-07T14:52:17.836000Z"}
{"component":"virt-launcher","level":"error","msg":"failed to get domain metadata","pos":"libvirt_helper.go:137","reason":"virError(Code=80, Domain=20, Message='metadata not found: Requested metadata element is not present')","timestamp":"2022-01-07T14:52:17.939299Z"}
{"component":"virt-launcher","level":"error","msg":"Could not fetch the Domain specification.","pos":"client.go:231","reason":"virError(Code=80, Domain=20, Message='metadata not found: Requested metadata element is not present')","timestamp":"2022-01-07T14:52:17.939365Z"}
{"component":"virt-launcher","level":"error","msg":"failed to get domain metadata","pos":"libvirt_helper.go:137","reason":"virError(Code=80, Domain=20, Message='metadata not found: Requested metadata element is not present')","timestamp":"2022-01-07T14:52:18.478106Z"}
{"component":"virt-launcher","level":"error","msg":"Could not fetch the Domain specification.","pos":"client.go:231","reason":"virError(Code=80, Domain=20, Message='metadata not found: Requested metadata element is not present')","timestamp":"2022-01-07T14:52:18.478207Z"}
{"component":"virt-launcher","kind":"","level":"error","msg":"failed to get domain job info","name":"vm2-rhel84-ocs","namespace":"default","pos":"manager.go:742","reason":"virError(Code=55, Domain=20, Message='Requested operation is not valid: domain is not running')","timestamp":"2022-01-07T16:13:06.657070Z","uid":"b9fe7292-8841-4396-8e76-17f7636d9a0a"}
{"component":"virt-launcher","level":"error","msg":"timeout on stopping the cmd server, continuing anyway.","pos":"server.go:478","timestamp":"2022-01-07T16:13:07.906406Z"}
{"component":"virt-launcher","level":"error","msg":"dirty virt-launcher shutdown","pos":"virt-launcher.go:522","reason":"wait: no child processes","timestamp":"2022-01-07T16:13:07.912884Z"}

[kbidarka@localhost migration]$ oc logs -f virt-launcher-vm-rhel84-ocs-8nqps | grep '\"level\":\"error\"'
{"component":"virt-launcher","level":"error","msg":"internal error: Child process (dmidecode -q -t 0,1,2,3,4,11,17) unexpected exit status 1: /dev/mem: No such file or directory","pos":"virCommandWait:2745","subcomponent":"libvirt","thread":"42","timestamp":"2022-01-07T14:52:15.734000Z"}
{"component":"virt-launcher","level":"error","msg":"Error when closing file","pos":"os_helper.go:34","reason":"invalid argument","timestamp":"2022-01-07T14:52:17.242507Z"}
{"component":"virt-launcher","level":"error","msg":"At least one cgroup controller is required: No such device or address","pos":"virCgroupDetect:415","subcomponent":"libvirt","thread":"28","timestamp":"2022-01-07T14:52:18.312000Z"}
{"component":"virt-launcher","level":"error","msg":"failed to get domain metadata","pos":"libvirt_helper.go:137","reason":"virError(Code=80, Domain=20, Message='metadata not found: Requested metadata element is not present')","timestamp":"2022-01-07T14:52:18.410666Z"}
{"component":"virt-launcher","level":"error","msg":"Could not fetch the Domain specification.","pos":"client.go:231","reason":"virError(Code=80, Domain=20, Message='metadata not found: Requested metadata element is not present')","timestamp":"2022-01-07T14:52:18.410741Z"}
{"component":"virt-launcher","level":"error","msg":"failed to get domain metadata","pos":"libvirt_helper.go:137","reason":"virError(Code=80, Domain=20, Message='metadata not found: Requested metadata element is not present')","timestamp":"2022-01-07T14:52:18.981304Z"}
{"component":"virt-launcher","level":"error","msg":"Could not fetch the Domain specification.","pos":"client.go:231","reason":"virError(Code=80, Domain=20, Message='metadata not found: Requested metadata element is not present')","timestamp":"2022-01-07T14:52:18.981376Z"}
{"component":"virt-launcher","kind":"","level":"error","msg":"failed to get domain job info","name":"vm-rhel84-ocs","namespace":"default","pos":"manager.go:742","reason":"virError(Code=55, Domain=20, Message='Requested operation is not valid: domain is not running')","timestamp":"2022-01-07T16:15:18.317062Z","uid":"af7fc901-7852-490f-bd60-8532dc7068d1"}
{"component":"virt-launcher","level":"error","msg":"timeout on stopping the cmd server, continuing anyway.","pos":"server.go:478","timestamp":"2022-01-07T16:15:19.550850Z"}
{"component":"virt-launcher","level":"error","msg":"Failed to reap process -1","pos":"virt-launcher.go:487","reason":"no child processes","timestamp":"2022-01-07T16:15:19.555211Z"}
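The grep above matches the raw JSON text of each log line. As a small alternative sketch (the `error_entries` helper is hypothetical, not part of the report), the structured virt-launcher log lines can be parsed and filtered by level:

```python
import json

def error_entries(log_lines):
    """Parse virt-launcher JSON log lines and keep only level=error entries."""
    entries = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON output interleaved in the log
        if entry.get("level") == "error":
            entries.append(entry)
    return entries

# Sample lines: the first is taken from the report above, the second is a
# hypothetical info-level line that should be filtered out.
sample = [
    '{"component":"virt-launcher","level":"error","msg":"dirty virt-launcher shutdown",'
    '"pos":"virt-launcher.go:522","reason":"wait: no child processes",'
    '"timestamp":"2022-01-07T16:13:07.912884Z"}',
    '{"component":"virt-launcher","level":"info","msg":"Domain started"}',
]
for e in error_entries(sample):
    print(e["timestamp"], e["msg"])
```

Parsing the JSON instead of grepping is slightly more robust when field order changes or when messages themselves contain the string "error".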
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 4.8.4 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:0213