Bug 1903065
Summary: | [CNV][Chaos] VM doesn't report when qemu is stuck | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Ondra Machacek <omachace> |
Component: | Storage | Assignee: | Jan Safranek <jsafrane> |
Storage sub component: | Storage | QA Contact: | Wei Duan <wduan> |
Status: | CLOSED DEFERRED | Docs Contact: | |
Severity: | medium | ||
Priority: | unspecified | CC: | aos-bugs, cnv-qe-bugs, jsafrane, kbidarka, omachace, pkliczew, sgott |
Version: | 4.6 | ||
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2021-07-30 09:27:53 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1908661 |
Description
Ondra Machacek
2020-12-01 09:06:45 UTC
Duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1901335?

That is a different problem. The problem here is mainly with the NFS hard mount. If the NFS connection is lost, it never times out. Bug 1901335 is about IO errors in general.

Ondra, there is a PR, https://github.com/kubevirt/kubevirt/pull/5401, that addresses the somewhat related BZ https://bugzilla.redhat.com/show_bug.cgi?id=1901335. That PR pauses the VM on IO errors. Given this context, how do IO errors specifically related to NFS differ from the general case?

Stu, we used a couple of methods to cause IO errors. One of them was to break the connection between the NFS share and the server. In the past this issue caused headaches in vdsm to handle correctly. In general, handling any IO errors should be done in a similar way, but NFS seems to be tricky. I would just make sure that NFS is not causing any additional issues.

The BZ mentioned in Comment #4 addresses error handling at the KubeVirt level. If the error is not propagated at all, that's an issue at a lower level. Re-assigned this to the Node component. Please re-assign if this is incorrect.

I don't think Node is the right component. We cover the kubelet and CRI-O, and this bug doesn't seem to be an issue with either. What "lower level" are you looking for?

This is an issue related to NFS PVCs not reporting any status/errors. I was under the impression that CRI-O was involved in that. Sorry for the confusion.

Ah, I believe that is Storage's department; moving it there.

I am not sure I understand the issue. A VM running in a Pod uses an NFS volume and the NFS server becomes unavailable? Kubernetes (the kubelet) makes sure that the correct volume is mounted into the right Pod, which probably happened in this case. However, after that, the kubelet stays out of the data path; it's between the NFS client in the kernel and the NFS server now. From what I remember, I/O calls such as read() never return when the NFS server becomes unavailable. Is that the issue you're trying to solve? I am afraid OCP can't help much here.

Jan, please take a look at comment #2. The idea is to handle NFS issues and let the user know that there is a problem, as well as to recover when the connection / NFS server is restored.

Kubernetes currently does not monitor mounted volumes. It's on our long-term TODO list (upstream); however, it will take some time until it's fully implemented and graduates to GA, see https://kubernetes.io/docs/concepts/storage/volume-health-monitoring/. It's alpha in 1.21 and it will stay alpha in 1.22 for sure. In addition, it needs explicit support in a CSI driver. If you insist, we may turn this into an OCP RFE; I'm not sure it would speed things up.

What I recommend for applications (Pods) is a liveness probe that checks whether the application is running, including some code path that checks whether the application can access the storage it needs. For VMs, it would make sense to have a probe poke all of the VM's volumes once in a while. The Pod will then be killed when it loses the connection to the NFS server for a defined time. Alternatively, you can use soft NFS mounts plus a timeout, or do some monitoring in qemu / vdsm. Would any of the above be enough for you?

Based on your description, it indeed looks more like an RFE than a BZ. Please paste a reference link where the RFE is tracked. Thanks!

I created https://issues.redhat.com/browse/RFE-2032. A link to the customer case would help a lot in prioritizing the request!
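
To make the "soft NFS mounts + a timeout" suggestion above concrete, here is a minimal sketch of a PersistentVolume carrying NFS mount options. The volume name, server address, export path, and timeout values are all hypothetical and for illustration only. With a soft mount, a stuck I/O eventually returns an error to the guest/qemu instead of blocking forever as a hard mount does; the trade-off is a risk of data loss for in-flight writes, which is why hard is the NFS default.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-example            # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  # "soft" lets NFS I/O fail after roughly (timeo/10 seconds * retrans) of
  # retries instead of hanging indefinitely; values below are illustrative.
  mountOptions:
    - soft
    - timeo=150                   # 15s per retry (timeo is in deciseconds)
    - retrans=3
  nfs:
    server: nfs.example.com       # hypothetical NFS server
    path: /exports/vm-disks       # hypothetical export path
```

A PVC bound to this PV would then surface I/O errors to the VM, where the KubeVirt-level handling from the PR mentioned above (pausing the VM on IO errors) could take over.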
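
Similarly, a rough sketch of the liveness-probe idea from the same comment, shown here as a plain Pod for simplicity (for a KubeVirt VM the probe would go into the VMI/launcher spec instead). The image, names, mount path, and thresholds are hypothetical. The exec probe touches a file on the NFS-backed volume; when the NFS server is unreachable the command exceeds timeoutSeconds, and after failureThreshold consecutive failures the kubelet kills and restarts the container. Note that the kubelet must enforce exec probe timeouts (the default in recent Kubernetes releases) for this to work against a hard-mounted, hung NFS share.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: vm-workload-example                 # hypothetical name
spec:
  containers:
    - name: workload
      image: registry.example.com/guest-workload:latest   # hypothetical image
      volumeMounts:
        - name: vm-disk
          mountPath: /disk
      # Probe a file on the NFS-backed volume once in a while; a lost NFS
      # connection makes the probe time out and eventually fail the container.
      livenessProbe:
        exec:
          command: ["/bin/sh", "-c", "touch /disk/.probe && rm -f /disk/.probe"]
        initialDelaySeconds: 30
        periodSeconds: 30
        timeoutSeconds: 10
        failureThreshold: 3
  volumes:
    - name: vm-disk
      persistentVolumeClaim:
        claimName: nfs-pvc-example           # hypothetical PVC name
```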