Bug 2124406
| Summary: | Memory dump hp-volume pod sometimes stays in Pending | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Yan Du <yadu> |
| Component: | Storage | Assignee: | skagan |
| Status: | NEW --- | QA Contact: | Yan Du <yadu> |
| Severity: | low | Docs Contact: | |
| Priority: | low | ||
| Version: | 4.12.0 | CC: | akalenyu, alitke, jpeimer, ngavrilo, skagan |
| Target Milestone: | --- | ||
| Target Release: | 4.15.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Shelly, it would be nice if we could somehow detect and handle this situation automatically rather than requiring a manual workaround. I realize this is difficult because we don't want to leak storage details into KubeVirt. Maybe some sort of timeout, after which we discard the old PVC?

Regarding the warning, the error seems to happen in the hotplug phase of the PVC for the memory dump. Yan, did you look at the volume status in the VMI to see if anything is shown there? Dealing with such a case is more relevant to hotplug. But regardless, the memory dump command currently just triggers the memory dump process and exits. Do we want it to wait until the dump completes? And if it does not complete within the defined period of time, should we return an error and disassociate the PVC, and, if we also created the PVC in the process, delete it as well?

The VMI volume status is as below:
$ oc get vmi -o yaml
apiVersion: v1
items:
- apiVersion: kubevirt.io/v1
  kind: VirtualMachineInstance
  metadata:
    annotations:
      kubevirt.io/latest-observed-api-version: v1
      kubevirt.io/storage-observed-api-version: v1alpha3
    creationTimestamp: "2022-09-14T09:38:42Z"
    finalizers:
    - kubevirt.io/virtualMachineControllerFinalize
    - foregroundDeleteVirtualMachine
    generation: 13
    labels:
      kubevirt.io/nodeName: c01-yadu412-kjc7h-worker-0-n26rk
      kubevirt.io/vm: vm-datavolume
    name: vm-fedora-datavolume
    namespace: default
    ownerReferences:
    - apiVersion: kubevirt.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: VirtualMachine
      name: vm-fedora-datavolume
      uid: a692cd87-03a5-4cbb-b414-73877f5f9528
    resourceVersion: "240123"
    uid: 2150b759-0a27-4ec2-8bb8-a4d248e6023b
  spec:
    domain:
      cpu:
        cores: 1
        model: host-model
        sockets: 1
        threads: 1
      devices:
        disks:
        - disk:
            bus: virtio
          name: datavolumevolume
        interfaces:
        - masquerade: {}
          name: default
      features:
        acpi:
          enabled: true
      firmware:
        uuid: e69d93b8-45ca-5bd6-b02e-bf134bb338de
      machine:
        type: pc-q35-rhel8.6.0
      resources:
        requests:
          memory: 1024M
    networks:
    - name: default
      pod: {}
    terminationGracePeriodSeconds: 0
    volumes:
    - dataVolume:
        name: fedora-dv
      name: datavolumevolume
    - memoryDump:
        claimName: pvc1
        hotpluggable: true
      name: pvc1
  status:
    activePods:
      9557121e-81c7-466d-8293-c7114cb1e791: c01-yadu412-kjc7h-worker-0-n26rk
    conditions:
    - lastProbeTime: null
      lastTransitionTime: "2022-09-14T09:38:54Z"
      status: "True"
      type: Ready
    - lastProbeTime: null
      lastTransitionTime: null
      message: 'cannot migrate VMI: PVC fedora-dv is not shared, live migration requires
        that all PVCs must be shared (using ReadWriteMany access mode)'
      reason: DisksNotLiveMigratable
      status: "False"
      type: LiveMigratable
    - lastProbeTime: "2022-09-14T09:39:11Z"
      lastTransitionTime: null
      status: "True"
      type: AgentConnected
    guestOSInfo:
      id: fedora
      kernelRelease: 5.12.11-300.fc34.x86_64
      kernelVersion: '#1 SMP Wed Jun 16 15:47:58 UTC 2021'
      name: Fedora
      prettyName: Fedora 34 (Cloud Edition)
      version: "34"
      versionId: "34"
    interfaces:
    - infoSource: domain, guest-agent
      interfaceName: eth0
      ipAddress: 10.128.2.44
      ipAddresses:
      - 10.128.2.44
      mac: 52:54:00:82:3d:b6
      name: default
      queueCount: 1
    launcherContainerImageVersion: registry.redhat.io/container-native-virtualization/virt-launcher@sha256:35bdecc535e077fe19ec3fcdfc4e30d895acd806f330c9cb8435c1e1b0da7c00
    migrationMethod: BlockMigration
    migrationTransport: Unix
    nodeName: c01-yadu412-kjc7h-worker-0-n26rk
    phase: Running
    phaseTransitionTimestamps:
    - phase: Pending
      phaseTransitionTimestamp: "2022-09-14T09:38:42Z"
    - phase: Scheduling
      phaseTransitionTimestamp: "2022-09-14T09:38:43Z"
    - phase: Scheduled
      phaseTransitionTimestamp: "2022-09-14T09:38:54Z"
    - phase: Running
      phaseTransitionTimestamp: "2022-09-14T09:38:57Z"
    qosClass: Burstable
    runtimeUser: 107
    virtualMachineRevisionName: revision-start-vm-a692cd87-03a5-4cbb-b414-73877f5f9528-2
    volumeStatus:
    - name: datavolumevolume
      persistentVolumeClaimInfo:
        accessModes:
        - ReadWriteOnce
        capacity:
          storage: 10Gi
        filesystemOverhead: "0.055"
        requests:
          storage: 10Gi
        volumeMode: Filesystem
      target: vda
    - hotplugVolume:
        attachPodName: hp-volume-j69qt
      memoryDumpVolume:
        claimName: pvc1
      message: Created hotplug attachment pod hp-volume-j69qt, for volume pvc1
      name: pvc1
      persistentVolumeClaimInfo:
        accessModes:
        - ReadWriteOnce
        capacity:
          storage: 149Gi
        filesystemOverhead: "0.055"
        requests:
          storage: "1191182336"
        volumeMode: Filesystem
      phase: AttachedToNode
      reason: SuccessfulCreate
      target: ""
kind: List
metadata:
  resourceVersion: ""
$ oc describe vmi
----------8<--------------------
Volume Status:
  Name:  datavolumevolume
  Persistent Volume Claim Info:
    Access Modes:
      ReadWriteOnce
    Capacity:
      Storage:            10Gi
    Filesystem Overhead:  0.055
    Requests:
      Storage:  10Gi
    Volume Mode:  Filesystem
  Target:         vda
  Hotplug Volume:
    Attach Pod Name:  hp-volume-j69qt
  Memory Dump Volume:
    Claim Name:  pvc1
  Message:  Created hotplug attachment pod hp-volume-j69qt, for volume pvc1
  Name:     pvc1
  Persistent Volume Claim Info:
    Access Modes:
      ReadWriteOnce
    Capacity:
      Storage:            149Gi
    Filesystem Overhead:  0.055
    Requests:
      Storage:  1191182336
    Volume Mode:  Filesystem
  Phase:   AttachedToNode
  Reason:  SuccessfulCreate
  Target:
Events:
  Type    Reason            Age                    From                       Message
  ----    ------            ----                   ----                       -------
  Normal  SuccessfulCreate  9m18s                  virtualmachine-controller  Created virtual machine pod virt-launcher-vm-fedora-datavolume-wwh6r
  Normal  Created           9m3s                   virt-handler               VirtualMachineInstance defined.
  Normal  Started           9m3s                   virt-handler               VirtualMachineInstance started.
  Normal  SuccessfulCreate  8m54s                  virtualmachine-controller  Created attachment pod hp-volume-j69qt
  Normal  SuccessfulCreate  8m49s (x5 over 8m54s)  virtualmachine-controller  Created hotplug attachment pod hp-volume-j69qt, for volume pvc1
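As the status above shows, the VMI itself reports only SuccessfulCreate for the attachment pod, while the real failure is visible only in the pod's own Pending phase. A minimal sketch of that cross-check (a hypothetical helper, not KubeVirt code; the data shapes follow the `volumeStatus` pasted above):

```python
# Hypothetical detection sketch: cross-check each hotplug volume in the VMI
# volumeStatus against the actual phase of its attachment pod. The VMI status
# alone says SuccessfulCreate even while the hp-volume-* pod stays Pending,
# so the pod phase is the missing signal.

def stuck_hotplug_volumes(volume_status, pod_phases):
    """Return names of hotplug volumes whose attachment pod is Pending.

    volume_status: list of dicts as found under vmi.status.volumeStatus
    pod_phases: mapping of pod name -> pod phase (e.g. from `oc get pods`)
    """
    stuck = []
    for vs in volume_status:
        hotplug = vs.get("hotplugVolume")
        if not hotplug:
            continue  # regular disk, nothing to check
        pod = hotplug.get("attachPodName")
        if pod and pod_phases.get(pod) == "Pending":
            stuck.append(vs["name"])
    return stuck

# Shapes taken from the VMI status pasted above
status = [
    {"name": "datavolumevolume"},
    {"name": "pvc1",
     "hotplugVolume": {"attachPodName": "hp-volume-j69qt"},
     "phase": "AttachedToNode"},
]
print(stuck_hotplug_volumes(status, {"hp-volume-j69qt": "Pending"}))  # → ['pvc1']
```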
OK I see, so the hotplug status doesn't show there is any issue; we need to look into that. Regarding the memory dump behavior, waiting for @alitke's response.

Summarizing the grooming discussion: this will happen regardless of memory dump whenever the main disk is not topology constrained but some hotplugged volume is, for example Ceph for the main disk and HPP for the hotplugged disk. It might make sense to open an extra bug for this. Maybe we want to set the hotplug volume status to Failed when we detect such a situation as a short-term fix, but we should still decide whether the underlying issue here is hotplug-related, or whether we should focus on a friendlier virtctl memory-dump interaction. @alitke

This is definitely a generic hotplug issue, but in the specific case of memory dump I think we have an opportunity to improve the user experience. When a user wants to trigger a new memory dump and we are in this situation (a dump PVC that cannot be attached), we can simply remove the old PVC and create a new one. This is safe because the user already told us that they want to replace the existing memory dump with a new one. I do think we will also encounter a similar error with VM export, and we need to look into how to handle it.

I think in order to do that we need to at least make the hotplug process fail or surface some error, because otherwise I don't think we can know that the PVC cannot be attached; I don't think putting a timeout on that is right. Once such an error is identified, it should be possible to delete the current PVC and create a new one.
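The replace-the-old-PVC behavior proposed in the comments above can be sketched as a small decision function. This is a hypothetical illustration of the proposal, not the actual KubeVirt controller code; `delete_pvc` and `create_pvc` are stand-ins for the real API calls:

```python
# Sketch of the proposed behavior (hypothetical): when a new memory dump is
# requested and the previous dump PVC cannot be attached, discard it and
# create a fresh claim. This is safe because the user explicitly asked to
# replace the old dump with a new one.

def handle_memory_dump_request(existing_claim, attach_failed, delete_pvc, create_pvc):
    """Decide what to do with the previous memory-dump PVC.

    existing_claim: name of the current dump PVC, or None
    attach_failed: True if the hotplug attachment for that PVC is stuck
    delete_pvc / create_pvc: callables standing in for API calls
    """
    if existing_claim and attach_failed:
        delete_pvc(existing_claim)   # disassociate and drop the stuck claim
        return create_pvc()          # re-create so the new dump can proceed
    return existing_claim            # otherwise reuse the claim as today

actions = []
new_claim = handle_memory_dump_request(
    "pvc1", True,
    delete_pvc=lambda name: actions.append(("delete", name)),
    create_pvc=lambda: (actions.append(("create", "pvc1")) or "pvc1"),
)
print(actions, new_claim)  # → [('delete', 'pvc1'), ('create', 'pvc1')] pvc1
```

Note the precondition discussed above: this only works once the hotplug path actually reports a failure, since otherwise `attach_failed` cannot be determined reliably.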
Description of problem:
Sometimes after a VM restart the VM is scheduled to another node. When we then trigger the memory dump again, the hp-volume pod stays in Pending status, and a new memory dump cannot be taken because the previous one has not finished.

Version-Release number of selected component (if applicable):
CNV-v4.12.0-450

How reproducible:
Sometimes

Steps to Reproduce:
1. Create a VM
2. Do a memory dump:
   $ virtctl memory-dump get vm-fedora-datavolume --claim-name=memoryvolume --create-claim
3. Restart the VM - sometimes the VM is scheduled to another node
4. Do a memory dump again:
   $ virtctl memory-dump get vm-fedora-datavolume

Actual results:
$ oc get pod -n default
NAME                                       READY   STATUS    RESTARTS   AGE
hp-volume-w4nz8                            0/1     Pending   0          19h
virt-launcher-vm-fedora-datavolume-qjzxj   1/1     Running   0          19h

Events:
  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  Warning  FailedScheduling  118m (x2088 over 19h)  default-scheduler  0/6 nodes are available: 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 5 node(s) didn't match Pod's node affinity/selector, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
  Warning  FailedScheduling  14m                    default-scheduler  0/6 nodes are available: 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 5 node(s) didn't match Pod's node affinity/selector, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
  Warning  FailedScheduling  14m                    default-scheduler  0/6 nodes are available: 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 5 node(s) didn't match Pod's node affinity/selector, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
  Warning  FailedScheduling  9m52s                  default-scheduler  0/6 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 1 node(s) were unschedulable, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 5 node(s) didn't match Pod's node affinity/selector, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
  Warning  FailedScheduling  8m28s                  default-scheduler  0/6 nodes are available: 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 5 node(s) didn't match Pod's node affinity/selector, 5 node(s) had volume node affinity conflict. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.

Expected results:
Maybe we could have a friendly warning explaining why the memory dump failed in this situation?

Additional info:
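The "Expected results" above ask for a friendlier warning. One hedged sketch of what that could look like: translate the scheduler's raw FailedScheduling message into a memory-dump-specific hint. This is an illustration only (the helper name and hint wording are invented, not KubeVirt behavior):

```python
# Hypothetical sketch of a friendlier warning: when the attachment pod fails
# to schedule with a volume node affinity conflict, tell the user the dump
# PVC is pinned to the previous node and how to recover, instead of leaving
# them to decode the raw scheduler event.

def explain_failed_scheduling(message):
    """Map a scheduler FailedScheduling message to a memory-dump hint."""
    if "volume node affinity conflict" in message:
        return ("memory dump PVC is bound to storage on another node; "
                "remove the dump (e.g. virtctl memory-dump remove) or delete "
                "the PVC and retry the dump with a new claim")
    return None  # not a case this sketch recognizes

event = ("0/6 nodes are available: 5 node(s) had volume node affinity "
         "conflict. preemption: 0/6 nodes are available.")
print(explain_failed_scheduling(event))
```

The trigger string matches the events pasted above; a real implementation would more likely inspect the PersistentVolume's node affinity directly rather than parse event text.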