Bug 1911118
| Summary: | Windows VMI LiveMigration / shutdown fails on 'XML error: non unique alias detected: ua-') | ||
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Ruth Netser <rnetser> |
| Component: | Virtualization | Assignee: | Hao Yu <hayu> |
| Status: | CLOSED ERRATA | QA Contact: | vsibirsk |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | ||
| Version: | 2.6.0 | CC: | cnv-qe-bugs, danken, fdeutsch, gouyang, ipinto, mprivozn, rmohr, sgott, ycui |
| Target Milestone: | --- | Keywords: | Regression |
| Target Release: | 2.6.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | hco-bundle-registry-container-v2.6.0-620 virt-operator-container-v2.6.0-114 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-03-10 11:22:41 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1930339 | ||
The only device aliases listed in all domain XMLs attached that have "ua-" in them are: ua-rootdisk ua-default ua-cloudinitdisk ua-tablet Could it be that this is a domain XML parsing error? There are more aliases though. I'm looking at source_pod.log from attached bug.tar.gz and I'm looking at the last few lines just around where the error is reported. I can see base64 encoded domain XML:
{"component":"virt-launcher","kind":"","level":"info","msg":"Domain XML generated. Base64 dump ...","name":"win-19","namespace":"default","pos":"libvirt_helper.go:99","timestamp":"2020-12-27T16:55:11.552019Z","uid":"1c635b6c-d50a-4555-8539-a969529c5eba"}
And when I decode it and grep "ua-" I get the following:
<alias name="ua-default"></alias>
<alias name="ua-"></alias>
<alias name="ua-i0"></alias>
<alias name="ua-tio-serial0"></alias>
<alias name="ua-"></alias>
<alias name="ua-e.0"></alias>
<alias name="ua-.1"></alias>
<alias name="ua-.2"></alias>
<alias name="ua-.3"></alias>
<alias name="ua-.4"></alias>
<alias name="ua-.5"></alias>
<alias name="ua-.6"></alias>
<alias name="ua-rootdisk"></alias>
<alias name="ua-tablet"></alias>
<alias name="ua-ut1"></alias>
<alias name="ua-ut2"></alias>
<alias name="ua-ial0"></alias>
<alias name="ua-ial0"></alias>
and there are two "ua-" lines. Looking at the domain XML they correspond to these devices:
<controller type="usb" index="0" model="qemu-xhci">
<alias name="ua-"></alias>
<address type="pci" domain="0x0000" bus="0x02" slot="0x00" function="0x0"></address>
</controller>
<controller type="sata" index="0">
<alias name="ua-"></alias>
<address type="pci" domain="0x0000" bus="0x00" slot="0x1f" function="0x2"></address>
</controller>
Can it be, that virt-launcher generated non-unique alias?
Seen the same error on VMI shutdown:
Last Probe Time: <nil>
Last Transition Time: 2021-01-17T20:36:38Z
Message: server error. command Shutdown failed: "LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')"
Reason: Synchronizing with the Domain failed.
Status: False
Type: Synchronized
Windows VM was created and started, and then deleted.
The VM was deleted, VMI is running and virt-launcher pod is terminating:
============================================================================================
$ oc describe vmi -n ssp-general-test-networkinterfacemultiqueue-feature
Name: win2k19-1610912729-0605187
Namespace: ssp-general-test-networkinterfacemultiqueue-feature
Labels: kubevirt.io/domain=win2k19-1610912729-0605187
kubevirt.io/nodeName=virt01-d2rcr-worker-0-7qwgk
kubevirt.io/size=medium
kubevirt.io/vm=win2k19-1610912729-0605187
Annotations: kubevirt.io/latest-observed-api-version: v1alpha3
kubevirt.io/storage-observed-api-version: v1alpha3
API Version: kubevirt.io/v1alpha3
Kind: VirtualMachineInstance
Metadata:
Creation Timestamp: 2021-01-17T20:10:22Z
Deletion Grace Period Seconds: 0
Deletion Timestamp: 2021-01-17T20:35:39Z
Finalizers:
foregroundDeleteVirtualMachine
Generation: 16
Managed Fields:
API Version: kubevirt.io/v1alpha3
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:kubevirt.io/latest-observed-api-version:
f:kubevirt.io/storage-observed-api-version:
f:labels:
.:
f:kubevirt.io/domain:
f:kubevirt.io/nodeName:
f:kubevirt.io/size:
f:kubevirt.io/vm:
f:ownerReferences:
f:spec:
.:
f:domain:
.:
f:clock:
.:
f:timer:
.:
f:hpet:
.:
f:present:
f:hyperv:
f:pit:
.:
f:tickPolicy:
f:rtc:
.:
f:tickPolicy:
f:utc:
f:cpu:
.:
f:cores:
f:sockets:
f:threads:
f:devices:
.:
f:disks:
f:inputs:
f:interfaces:
f:networkInterfaceMultiqueue:
f:rng:
f:features:
.:
f:acpi:
f:apic:
f:hyperv:
.:
f:frequencies:
f:ipi:
f:reenlightenment:
f:relaxed:
f:reset:
f:runtime:
f:spinlocks:
.:
f:spinlocks:
f:synic:
f:synictimer:
f:tlbflush:
f:vapic:
f:vpindex:
f:firmware:
.:
f:uuid:
f:machine:
.:
f:type:
f:resources:
.:
f:requests:
.:
f:memory:
f:evictionStrategy:
f:networks:
f:terminationGracePeriodSeconds:
f:volumes:
f:status:
.:
f:activePods:
.:
f:83c906a0-96a7-448f-a323-d04eae693294:
f:conditions:
f:guestOSInfo:
f:nodeName:
f:qosClass:
Manager: virt-controller
Operation: Update
Time: 2021-01-17T20:36:38Z
API Version: kubevirt.io/v1alpha3
Fields Type: FieldsV1
fieldsV1:
f:status:
f:interfaces:
f:migrationMethod:
f:phase:
f:volumeStatus:
Manager: virt-handler
Operation: Update
Time: 2021-01-17T20:36:38Z
Owner References:
API Version: kubevirt.io/v1alpha3
Block Owner Deletion: true
Controller: true
Kind: VirtualMachine
Name: win2k19-1610912729-0605187
UID: 9587e9eb-d08d-4906-b507-e36bc3ab6288
Resource Version: 2815779
Self Link: /apis/kubevirt.io/v1alpha3/namespaces/ssp-general-test-networkinterfacemultiqueue-feature/virtualmachineinstances/win2k19-1610912729-0605187
UID: 7c9671fa-e428-473f-bcaa-763b8cd467ac
Spec:
Domain:
Clock:
Timer:
Hpet:
Present: false
Hyperv:
Present: true
Pit:
Present: true
Tick Policy: delay
Rtc:
Present: true
Tick Policy: catchup
Utc:
Cpu:
Cores: 1
Sockets: 1
Threads: 1
Devices:
Disks:
Disk:
Bus: sata
Name: win2k19-1610912729-0605187
Inputs:
Bus: usb
Name: tablet
Type: tablet
Interfaces:
Mac Address: 02:ae:20:00:00:1d
Masquerade:
Model: virtio
Name: default
Network Interface Multiqueue: true
Rng:
Features:
Acpi:
Enabled: true
Apic:
Enabled: true
Hyperv:
Frequencies:
Enabled: true
Ipi:
Enabled: true
Reenlightenment:
Enabled: true
Relaxed:
Enabled: true
Reset:
Enabled: true
Runtime:
Enabled: true
Spinlocks:
Enabled: true
Spinlocks: 8191
Synic:
Enabled: true
Synictimer:
Enabled: true
Tlbflush:
Enabled: true
Vapic:
Enabled: true
Vpindex:
Enabled: true
Firmware:
Uuid: 60c21962-cd81-5ae5-ae11-d23935df1b26
Machine:
Type: pc-q35-rhel8.3.0
Resources:
Requests:
Cpu: 100m
Memory: 4Gi
Eviction Strategy: LiveMigrate
Networks:
Name: default
Pod:
Termination Grace Period Seconds: 180
Volumes:
Data Volume:
Name: win2k19-1610912729-0605187
Name: win2k19-1610912729-0605187
Status:
Active Pods:
83c906a0-96a7-448f-a323-d04eae693294: virt01-d2rcr-worker-0-7qwgk
Conditions:
Last Probe Time: <nil>
Last Transition Time: <nil>
Status: True
Type: LiveMigratable
Last Probe Time: <nil>
Last Transition Time: 2021-01-17T20:36:38Z
Message: server error. command Shutdown failed: "LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')"
Reason: Synchronizing with the Domain failed.
Status: False
Type: Synchronized
Last Probe Time: 2021-01-17T20:36:39Z
Last Transition Time: 2021-01-17T20:36:39Z
Message: The Pod is terminating
Reason: PodTerminating
Status: False
Type: Ready
Guest OS Info:
Interfaces:
Ip Address: 10.129.2.48
Ip Addresses:
10.129.2.48
Mac: 02:ae:20:00:00:1d
Name: default
Migration Method: LiveMigration
Node Name: virt01-d2rcr-worker-0-7qwgk
Phase: Running
Qos Class: Burstable
Volume Status:
Name: win2k19-1610912729-0605187
Target: sda
Events: <none>
============================================================================================
$ oc describe pod -n ssp-general-test-networkinterfacemultiqueue-feature
Name: virt-launcher-win2k19-1610912729-0605187-b4rbc
Namespace: ssp-general-test-networkinterfacemultiqueue-feature
Priority: 0
Node: virt01-d2rcr-worker-0-7qwgk/192.168.3.12
Start Time: Sun, 17 Jan 2021 20:10:22 +0000
Labels: kubevirt.io=virt-launcher
kubevirt.io/created-by=7c9671fa-e428-473f-bcaa-763b8cd467ac
kubevirt.io/domain=win2k19-1610912729-0605187
kubevirt.io/size=medium
kubevirt.io/vm=win2k19-1610912729-0605187
Annotations: k8s.v1.cni.cncf.io/network-status:
[{
"name": "",
"interface": "eth0",
"ips": [
"10.129.2.48"
],
"default": true,
"dns": {}
}]
k8s.v1.cni.cncf.io/networks-status:
[{
"name": "",
"interface": "eth0",
"ips": [
"10.129.2.48"
],
"default": true,
"dns": {}
}]
kubevirt.io/domain: win2k19-1610912729-0605187
openshift.io/scc: kubevirt-controller
traffic.sidecar.istio.io/kubevirtInterfaces: k6t-eth0
Status: Terminating (lasts 13h)
Termination Grace Period: 210s
IP: 10.129.2.48
IPs:
IP: 10.129.2.48
Controlled By: VirtualMachineInstance/win2k19-1610912729-0605187
Containers:
compute:
Container ID: cri-o://23e1159a005cbfa44995ec1f5bac9ba359cbf884a80dd9551f66517f0bdf4628
Image: registry.redhat.io/container-native-virtualization/virt-launcher@sha256:7818aca8c1a5d49e2c394b4a5e74b72c22fa6642bc403f3088f5e586ec61b2b4
Image ID: registry.redhat.io/container-native-virtualization/virt-launcher@sha256:6410b0b3ce5ffede48db04c225b5fe7ebdbaebd20f3c15bc456e2b2ab53aaf9b
Port: <none>
Host Port: <none>
Command:
/usr/bin/virt-launcher
--qemu-timeout
5m
--name
win2k19-1610912729-0605187
--uid
7c9671fa-e428-473f-bcaa-763b8cd467ac
--namespace
ssp-general-test-networkinterfacemultiqueue-feature
--kubevirt-share-dir
/var/run/kubevirt
--ephemeral-disk-dir
/var/run/kubevirt-ephemeral-disks
--container-disk-dir
/var/run/kubevirt/container-disks
--grace-period-seconds
195
--hook-sidecars
0
--less-pvc-space-toleration
10
--ovmf-path
/usr/share/OVMF
State: Running
Started: Sun, 17 Jan 2021 20:10:33 +0000
Ready: True
Restart Count: 0
Limits:
devices.kubevirt.io/kvm: 1
devices.kubevirt.io/tun: 1
devices.kubevirt.io/vhost-net: 1
Requests:
cpu: 100m
devices.kubevirt.io/kvm: 1
devices.kubevirt.io/tun: 1
devices.kubevirt.io/vhost-net: 1
memory: 4481613825
Environment: <none>
Mounts:
/var/run/kubevirt-ephemeral-disks from ephemeral-disks (rw)
/var/run/kubevirt/container-disks from container-disks (rw)
/var/run/kubevirt/hotplug-disks from hotplug-disks (rw)
/var/run/kubevirt/sockets from sockets (rw)
/var/run/libvirt from libvirt-runtime (rw)
Devices:
/dev/win2k19-1610912729-0605187 from win2k19-1610912729-0605187
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
sockets:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
win2k19-1610912729-0605187:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: win2k19-1610912729-0605187
ReadOnly: false
virt-bin-share-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
libvirt-runtime:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
ephemeral-disks:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
container-disks:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
hotplug-disks:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
QoS Class: Burstable
Node-Selectors: kubevirt.io/schedulable=true
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
============================================================================================
$ oc logs -n openshift-cnv virt-handler-ws2cq |grep 0605187|less
...
...
{"component":"virt-handler","kind":"Domain","level":"info","msg":"Domain status: Running, reason: Unknown\n","name":"win2k19-1610912729-0605187","namespace":"ssp-general-test-networkinterfacemultiqueue-feature","pos":"vm.go:1312","timestamp":"2021-01-17T20:36:38.095283Z","uid":""}
E0117 20:36:38.099616 6549 event.go:256] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"win2k19-1610912729-0605187.165b1df696d818ff", GenerateName:"", Namespace:"ssp-general-test-networkinterfacemultiqueue-feature", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"VirtualMachineInstance", Namespace:"ssp-general-test-networkinterfacemultiqueue-feature", Name:"win2k19-1610912729-0605187", UID:"7c9671fa-e428-473f-bcaa-763b8cd467ac", APIVersion:"kubevirt.io/v1alpha3", ResourceVersion:"2797268", FieldPath:""}, Reason:"Created", Message:"VirtualMachineInstance defined.", Source:v1.EventSource{Component:"virt-handler", Host:"virt01-d2rcr-worker-0-7qwgk"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63746511036, loc:(*time.Location)(0x3668860)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbff945d5844f6ac2, ext:229431546398511, loc:(*time.Location)(0x3668860)}}, Count:6, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "win2k19-1610912729-0605187.165b1df696d818ff" is forbidden: unable to create new content in namespace ssp-general-test-networkinterfacemultiqueue-feature because it is being terminated' (will not retry!)
{"component":"virt-handler","kind":"","level":"error","msg":"Synchronizing the VirtualMachineInstance failed.","name":"win2k19-1610912729-0605187","namespace":"ssp-general-test-networkinterfacemultiqueue-feature","pos":"vm.go:1444","reason":"server error. command Shutdown failed: \"LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')\"","timestamp":"2021-01-17T20:36:38.154929Z","uid":"7c9671fa-e428-473f-bcaa-763b8cd467ac"}
E0117 20:36:38.158763 6549 event.go:256] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"win2k19-1610912729-0605187.165b1f622816ac09", GenerateName:"", Namespace:"ssp-general-test-networkinterfacemultiqueue-feature", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"VirtualMachineInstance", Namespace:"ssp-general-test-networkinterfacemultiqueue-feature", Name:"win2k19-1610912729-0605187", UID:"7c9671fa-e428-473f-bcaa-763b8cd467ac", APIVersion:"kubevirt.io/v1alpha3", ResourceVersion:"2814770", FieldPath:""}, Reason:"SyncFailed", Message:"server error. command Shutdown failed: \"LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')\"", Source:v1.EventSource{Component:"virt-handler", Host:"virt01-d2rcr-worker-0-7qwgk"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbff945d5893bd009, ext:229431628999792, loc:(*time.Location)(0x3668860)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbff945d5893bd009, ext:229431628999792, loc:(*time.Location)(0x3668860)}}, Count:1, Type:"Warning", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "win2k19-1610912729-0605187.165b1f622816ac09" is forbidden: unable to create new content in namespace ssp-general-test-networkinterfacemultiqueue-feature because it is being terminated' (will not retry!)
{"component":"virt-handler","level":"info","msg":"re-enqueuing VirtualMachineInstance ssp-general-test-networkinterfacemultiqueue-feature/win2k19-1610912729-0605187","pos":"vm.go:1052","reason":"server error. command Shutdown failed: \"LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')\"","timestamp":"2021-01-17T20:36:38.266191Z"}
{"component":"virt-handler","level":"info","msg":"Processing event ssp-general-test-networkinterfacemultiqueue-feature/win2k19-1610912729-0605187","pos":"vm.go:1305","timestamp":"2021-01-17T20:36:38.268453Z"}
{"component":"virt-handler","kind":"","level":"info","msg":"VMI is in phase: Running\n","name":"win2k19-1610912729-0605187","namespace":"ssp-general-test-networkinterfacemultiqueue-feature","pos":"vm.go:1307","timestamp":"2021-01-17T20:36:38.268533Z","uid":"7c9671fa-e428-473f-bcaa-763b8cd467ac"}
{"component":"virt-handler","kind":"Domain","level":"info","msg":"Domain status: Running, reason: Unknown\n","name":"win2k19-1610912729-0605187","namespace":"ssp-general-test-networkinterfacemultiqueue-feature","pos":"vm.go:1312","timestamp":"2021-01-17T20:36:38.268559Z","uid":""}
{"component":"virt-handler","kind":"","level":"error","msg":"Synchronizing the VirtualMachineInstance failed.","name":"win2k19-1610912729-0605187","namespace":"ssp-general-test-networkinterfacemultiqueue-feature","pos":"vm.go:1444","reason":"server error. command Shutdown failed: \"LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')\"","timestamp":"2021-01-17T20:36:38.314998Z","uid":"7c9671fa-e428-473f-bcaa-763b8cd467ac"}
{"component":"virt-handler","level":"info","msg":"re-enqueuing VirtualMachineInstance ssp-general-test-networkinterfacemultiqueue-feature/win2k19-1610912729-0605187","pos":"vm.go:1052","reason":"server error. command Shutdown failed: \"LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')\"","timestamp":"2021-01-17T20:36:38.323113Z"}
{"component":"virt-handler","level":"info","msg":"Processing event ssp-general-test-networkinterfacemultiqueue-feature/win2k19-1610912729-0605187","pos":"vm.go:1305","timestamp":"2021-01-17T20:36:38.323208Z"}
{"component":"virt-handler","kind":"","level":"info","msg":"VMI is in phase: Running\n","name":"win2k19-1610912729-0605187","namespace":"ssp-general-test-networkinterfacemultiqueue-feature","pos":"vm.go:1307","timestamp":"2021-01-17T20:36:38.323256Z","uid":"7c9671fa-e428-473f-bcaa-763b8cd467ac"}
{"component":"virt-handler","kind":"Domain","level":"info","msg":"Domain status: Running, reason: Unknown\n","name":"win2k19-1610912729-0605187","namespace":"ssp-general-test-networkinterfacemultiqueue-feature","pos":"vm.go:1312","timestamp":"2021-01-17T20:36:38.323296Z","uid":""}
E0117 20:36:38.329942 6549 event.go:256] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"win2k19-1610912729-0605187.165b1f622816ac09", GenerateName:"", Namespace:"ssp-general-test-networkinterfacemultiqueue-feature", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"VirtualMachineInstance", Namespace:"ssp-general-test-networkinterfacemultiqueue-feature", Name:"win2k19-1610912729-0605187", UID:"7c9671fa-e428-473f-bcaa-763b8cd467ac", APIVersion:"kubevirt.io/v1alpha3", ResourceVersion:"2815743", FieldPath:""}, Reason:"SyncFailed", Message:"server error. command Shutdown failed: \"LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')\"", Source:v1.EventSource{Component:"virt-handler", Host:"virt01-d2rcr-worker-0-7qwgk"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbff945d5893bd009, ext:229431628999792, loc:(*time.Location)(0x3668860)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbff945d592c64db7, ext:229431789070878, loc:(*time.Location)(0x3668860)}}, Count:2, Type:"Warning", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "win2k19-1610912729-0605187.165b1f622816ac09" is forbidden: unable to create new content in namespace ssp-general-test-networkinterfacemultiqueue-feature because it is being terminated' (will not retry!)
{"component":"virt-handler","kind":"","level":"error","msg":"Synchronizing the VirtualMachineInstance failed.","name":"win2k19-1610912729-0605187","namespace":"ssp-general-test-networkinterfacemultiqueue-feature","pos":"vm.go:1444","reason":"server error. command Shutdown failed: \"LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')\"","timestamp":"2021-01-17T20:36:38.372012Z","uid":"7c9671fa-e428-473f-bcaa-763b8cd467ac"}
{"component":"virt-handler","level":"info","msg":"re-enqueuing VirtualMachineInstance ssp-general-test-networkinterfacemultiqueue-feature/win2k19-1610912729-0605187","pos":"vm.go:1052","reason":"server error. command Shutdown failed: \"LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')\"","timestamp":"2021-01-17T20:36:38.378433Z"}
{"component":"virt-handler","level":"info","msg":"Processing event ssp-general-test-networkinterfacemultiqueue-feature/win2k19-1610912729-0605187","pos":"vm.go:1305","timestamp":"2021-01-17T20:36:38.378517Z"}
{"component":"virt-handler","kind":"","level":"info","msg":"VMI is in phase: Running\n","name":"win2k19-1610912729-0605187","namespace":"ssp-general-test-networkinterfacemultiqueue-feature","pos":"vm.go:1307","timestamp":"2021-01-17T20:36:38.378548Z","uid":"7c9671fa-e428-473f-bcaa-763b8cd467ac"}
============================================================================================
$ oc logs -n openshift-cnv virt-controller-5c48789fbb-g8pl7 |grep 0605187|less
...
...
{"component":"virt-controller","level":"info","msg":"reenqueuing VirtualMachineInstance ssp-general-test-networkinterfacemultiqueue-feature/win2k19-1610912729-0605187","pos":"vmi.go:215","reason":"patching of vmi conditions and activePods failed: namespaces \"ssp-general-test-networkinterfacemultiqueue-feature\" not found, [{ \"op\": \"test\", \"path\": \"/status/conditions\", \"value\": [{\"type\":\"LiveMigratable\",\"status\":\"True\",\"lastProbeTime\":null,\"lastTransitionTime\":null},{\"type\":\"Synchronized\",\"status\":\"False\",\"lastProbeTime\":null,\"lastTransitionTime\":\"2021-01-17T20:36:38Z\",\"reason\":\"Synchronizing with the Domain failed.\",\"message\":\"server error. command Shutdown failed: \\\"LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')\\\"\"},{\"type\":\"Ready\",\"status\":\"False\",\"lastProbeTime\":\"2021-01-17T20:36:39Z\",\"lastTransitionTime\":\"2021-01-17T20:36:39Z\",\"reason\":\"PodTerminating\",\"message\":\"The Pod is terminating\"}] } { \"op\": \"replace\", \"path\": \"/status/conditions\", \"value\": [{\"type\":\"LiveMigratable\",\"status\":\"True\",\"lastProbeTime\":null,\"lastTransitionTime\":null},{\"type\":\"Synchronized\",\"status\":\"False\",\"lastProbeTime\":null,\"lastTransitionTime\":\"2021-01-17T20:36:38Z\",\"reason\":\"Synchronizing with the Domain failed.\",\"message\":\"server error. command Shutdown failed: \\\"LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')\\\"\"},{\"type\":\"Ready\",\"status\":\"False\",\"lastProbeTime\":\"2021-01-18T09:51:28Z\",\"lastTransitionTime\":\"2021-01-18T09:51:28Z\",\"reason\":\"PodTerminating\",\"message\":\"The Pod is terminating\"}] }]","service":"http","timestamp":"2021-01-18T09:51:28.576617Z"}
E0118 09:51:28.619988 1 event.go:256] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"win2k19-1610912729-0605187.165b4990a2cd7bee", GenerateName:"", Namespace:"ssp-general-test-networkinterfacemultiqueue-feature", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"VirtualMachineInstance", Namespace:"ssp-general-test-networkinterfacemultiqueue-feature", Name:"win2k19-1610912729-0605187", UID:"7c9671fa-e428-473f-bcaa-763b8cd467ac", APIVersion:"kubevirt.io/v1alpha3", ResourceVersion:"2815779", FieldPath:""}, Reason:"PodTerminating", Message:"Pod virt-launcher-win2k19-1610912729-0605187-b4rbc is terminating, marking VMI as not ready.", Source:v1.EventSource{Component:"virtualmachine-controller", Host:""}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbff973205020b1ee, ext:508357063290, loc:(*time.Location)(0x2fb68c0)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbff97468216b84ee, ext:1819647179558, loc:(*time.Location)(0x2fb68c0)}}, Count:19, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'namespaces "ssp-general-test-networkinterfacemultiqueue-feature" not found' (will not retry!)
The bug is tracked in Jira issue, https://issues.redhat.com/browse/CNV-9347 PR has been merged in master and backported to the release-0.36 branch here: https://github.com/kubevirt/kubevirt/pull/5009 Verified on: $ oc get clusterversions.config.openshift.io NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.7.0-rc.1 True False 2d1h Cluster version is 4.7.0-rc.1 $ oc get csv -n openshift-cnv NAME DISPLAY VERSION REPLACES PHASE kubevirt-hyperconverged-operator.v2.6.0 OpenShift Virtualization 2.6.0 kubevirt-hyperconverged-operator.v2.5.3 Succeeded Checked with OS - win2019 and rhel8.3 Verified with steps: 1) Create and start VM with DV from template 2) Stop/start VM several times - VMI and virt-launcher pod removed 3) Live-Migrate VM - VM migrated without issues 4) Drain node where VM is running - node drained, VM migrated 5) Delete VM - VMI and virt-launcher pod removed The device on the VMI which triggered this was the tablet input device
```
Inputs:
Bus: usb
Name: tablet
Type: tablet
```
Other devices could potentially trigger that error too, but I could not reproduce it for other devices so far.
When testing 2.6 and 2.5 - were the same VM definitions used? It could be that the 2.5 tempaltes do not have the devices Roman mentions, but 2.6 has them I could still see this problem like c#7 on CNV 2.6.0 + HCO 2.5.3, it only happens on Windows VM. $ oc get csv -n openshift-cnv NAME DISPLAY VERSION REPLACES PHASE kubevirt-hyperconverged-operator.v2.6.0 OpenShift Virtualization 2.6.0 kubevirt-hyperconverged-operator.v2.5.3 Succeeded 1. VM is deleted. $ oc get -n test-nfqrw vm No resources found in test-nfqrw namespace. 2. Virt-launcher pod is stucking in terminating for about 40m oc get -n test-nfqrw pod NAME READY STATUS RESTARTS AGE virt-launcher-auto-clone-win10-vm-pw1am-wjqvj 2/2 Terminating 0 40m 3. VMI is running $ oc get -n test-nfqrw vmi NAME AGE PHASE IP NODENAME auto-clone-win10-vm-pw1am 40m Running 10.131.0.162 uit01-kzqc6-worker-0-gszsn Gouhua, I just double checked the fixed-in version and I reported it incorrectly. The version ending with -110 does *NOT* contain the fix. I apologize for the confusion. Can you please confirm which cluster version you tested with? Confirmed that the symptoms I saw is not related to this bug, it's because VM is deleted just after the VM turns into 'Running' state, it cannot be shutdown gracefully, so it has to wait the grace period 3660s to be deleted. This grace period 3600s is adding to windows template recently that I didn't aware of. This looks like a separate bug. Sorry for the confusion. > This looks like a separate bug. https://bugzilla.redhat.com/show_bug.cgi?id=1933043 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 2.6.0 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:0799 |
Description of problem: Start Windows and Fedora VMs. Drain the node that the VMI are running on. Fedora VMI is migrated successfully; Windows VMI fails to migrate. Version-Release number of selected component (if applicable): CNV 2.6.0 (virt-operator-container-v2.6.0-95) How reproducible: 100% Steps to Reproduce: 1. Create Windows DV 2. Create Windows VM from template oc process -n openshift windows2k19-server-medium-v0.12.3 -p NAME=win-19 -p PVCNAME=win-19 3. Start VM 4. Drain the node that hosts the VMI oc adm drain <node name> --delete-local-data --ignore-daemonsets=true --force Actual results: VMI migration fails on: server error. command Migrate failed: "LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')" Expected results: VMI migration should succeed. Additional info: * Fedora VMI (BlockMigration/LiveMigration) succeeds. * Attached: - Windows: -- DV and VM yamls -- VM/VMI yamls -- dumpxml -- source and target pods -- VMI describe - Fedora (BlockMigration/LiveMigration) VMs and VMIs yamls and domxml $ oc describe vmi win-19 Name: win-19 Namespace: default Labels: kubevirt.io/domain=win-19 kubevirt.io/migrationTargetNodeName=ssp04-2zj76-worker-0-rlzj6 kubevirt.io/nodeName=ssp04-2zj76-worker-0-74wtp kubevirt.io/size=medium Annotations: kubevirt.io/latest-observed-api-version: v1alpha3 kubevirt.io/storage-observed-api-version: v1alpha3 API Version: kubevirt.io/v1alpha3 Kind: VirtualMachineInstance Metadata: Creation Timestamp: 2020-12-27T16:48:09Z Finalizers: foregroundDeleteVirtualMachine Generation: 15 Managed Fields: API Version: kubevirt.io/v1alpha3 Fields Type: FieldsV1 fieldsV1: f:status: f:evacuationNodeName: Manager: virt-api Operation: Update Time: 2020-12-27T16:49:21Z API Version: kubevirt.io/v1alpha3 Fields Type: FieldsV1 fieldsV1: f:metadata: f:annotations: .: f:kubevirt.io/latest-observed-api-version: f:kubevirt.io/storage-observed-api-version: f:labels: .: f:kubevirt.io/domain: f:kubevirt.io/migrationTargetNodeName: f:kubevirt.io/nodeName: f:kubevirt.io/size: f:ownerReferences: f:spec: .: f:domain: .: f:clock: .: f:timer: .: f:hpet: .: f:present: f:hyperv: f:pit: .: f:tickPolicy: f:rtc: .: f:tickPolicy: f:utc: f:cpu: .: f:cores: f:sockets: f:threads: f:devices: .: f:disks: f:inputs: f:interfaces: f:features: .: f:acpi: f:apic: f:hyperv: .: f:relaxed: f:spinlocks: .: f:spinlocks: f:vapic: f:firmware: .: f:uuid: f:machine: .: f:type: f:resources: .: f:requests: .: f:memory: f:evictionStrategy: f:networks: f:terminationGracePeriodSeconds: f:volumes: f:status: .: f:activePods: .: f:4cfe1e83-adee-42ec-885a-963f3f9dd302: f:9b204742-52e9-47a0-ae28-59ad73f37e1d: f:guestOSInfo: f:migrationState: .: f:migrationUid: f:sourceNode: f:targetNode: f:targetPod: f:nodeName: f:qosClass: Manager: virt-controller Operation: Update Time: 2020-12-27T16:49:40Z API Version: kubevirt.io/v1alpha3 Fields Type: FieldsV1 fieldsV1: f:status: f:conditions: f:interfaces: f:migrationMethod: f:migrationState: f:targetDirectMigrationNodePorts: .: f:34579: f:46367: f:targetNodeAddress: f:phase: f:volumeStatus: Manager: virt-handler Operation: Update Time: 2020-12-27T16:49:41Z Owner References: API Version: kubevirt.io/v1alpha3 Block Owner Deletion: true Controller: true Kind: VirtualMachine Name: win-19 UID: 3a13f855-cd6a-45c9-a6c1-7c73d5aa4199 Resource Version: 6585022 UID: 1c635b6c-d50a-4555-8539-a969529c5eba Spec: Domain: Clock: Timer: Hpet: Present: false Hyperv: Present: true Pit: Present: true Tick Policy: delay Rtc: Present: true Tick Policy: catchup Utc: Cpu: Cores: 1 Sockets: 1 Threads: 1 Devices: Disks: Disk: Bus: sata Name: rootdisk Inputs: Bus: usb Name: tablet Type: tablet Interfaces: Masquerade: Model: e1000e Name: default Features: Acpi: Enabled: true Apic: Enabled: true Hyperv: Relaxed: Enabled: true Spinlocks: Enabled: true Spinlocks: 8191 Vapic: Enabled: true Firmware: Uuid: e7d85e84-c5ca-5fe0-8a02-922c050f01be Machine: Type: pc-q35-rhel8.2.0 Resources: Requests: Cpu: 100m Memory: 4Gi Eviction Strategy: LiveMigrate Networks: Name: default Pod: Termination Grace Period Seconds: 3600 Volumes: Name: rootdisk Persistent Volume Claim: Claim Name: win-19 Status: Active Pods: 4cfe1e83-adee-42ec-885a-963f3f9dd302: ssp04-2zj76-worker-0-rlzj6 9b204742-52e9-47a0-ae28-59ad73f37e1d: ssp04-2zj76-worker-0-74wtp Conditions: Last Probe Time: <nil> Last Transition Time: <nil> Status: True Type: LiveMigratable Last Probe Time: <nil> Last Transition Time: 2020-12-27T16:48:18Z Status: True Type: Ready Last Probe Time: <nil> Last Transition Time: 2020-12-27T16:49:41Z Message: server error. command Migrate failed: "LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')" Reason: Synchronizing with the Domain failed. Status: False Type: Synchronized Evacuation Node Name: ssp04-2zj76-worker-0-74wtp Guest OS Info: Interfaces: Ip Address: 10.128.2.155 Ip Addresses: 10.128.2.155 Mac: 02:00:00:0d:1c:e6 Name: default Migration Method: LiveMigration Migration State: Migration UID: 97b769fd-38dc-4233-8487-0be1a67e3940 Source Node: ssp04-2zj76-worker-0-74wtp Target Direct Migration Node Ports: 34579: 0 46367: 49152 Target Node: ssp04-2zj76-worker-0-rlzj6 Target Node Address: 10.129.2.13 Target Pod: virt-launcher-win-19-h2dmz Node Name: ssp04-2zj76-worker-0-74wtp Phase: Running Qos Class: Burstable Volume Status: Name: rootdisk Target: sda Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal SuccessfulCreate 9m8s disruptionbudget-controller Created PodDisruptionBudget kubevirt-disruption-budget-sb8pw Normal SuccessfulCreate 9m8s virtualmachine-controller Created virtual machine pod virt-launcher-win-19-sp6wq Normal Started 8m54s virt-handler VirtualMachineInstance started. Normal SuccessfulCreate 7m55s disruptionbudget-controller Created Migration kubevirt-evacuation-g28rx Normal Created 7m37s (x8 over 8m55s) virt-handler VirtualMachineInstance defined. Normal PreparingTarget 7m37s virt-handler Migration Target is listening at 10.129.2.13, on ports: 34579,46367 Normal PreparingTarget 7m36s (x3 over 7m37s) virt-handler VirtualMachineInstance Migration Target Prepared. Warning SyncFailed 2m6s (x18 over 7m36s) virt-handler server error. command Migrate failed: "LibvirtError(Code=27, Domain=20, Message='XML error: non unique alias detected: ua-')" Warning FailedCreate 32s (x17 over 2m36s) disruptionbudget-controller Error creating a Migration: admission webhook "migration-create-validator.kubevirt.io" denied the request: in-flight migration detected. Active migration job (97b769fd-38dc-4233-8487-0be1a67e3940) is currently already in progress for VMI win-19.