Bug 1392316 - [RFE] Always provide timeout for operations blocked in the QEMU driver
Summary: [RFE] Always provide timeout for operations blocked in the QEMU driver
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: Libvirt Maintainers
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-11-07 07:49 UTC by Francesco Romani
Modified: 2016-11-07 11:38 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-07 08:27:06 UTC
Target Upstream Version:



Description Francesco Romani 2016-11-07 07:49:18 UTC
Description of problem:
The libvirt QEMU driver uses the QEMU monitor to query the hypervisor state.
If the hypervisor encounters a storage layer error, perhaps because it uses shared storage and there is a network failure, it can get stuck in I/O inside the kernel and enter the D state.
In this case the QEMU monitor can become unresponsive, and all the libvirt APIs which need to enter the QEMU monitor can block, ultimately leading to the exhaustion of the libvirt worker pool.

This in turn makes life harder for the management application (e.g. oVirt) and can lead to a chain of failures.

We would like libvirt to always time out, or signal the error to the upper layer, instead of sometimes blocking forever.

Version-Release number of selected component (if applicable):
Experienced with libvirt 1.3.3 and QEMU 2.6, but libvirt 2.0.0 should behave the same (I'm not aware of any reason it would differ).

We acknowledge that fixing this bug may require changes to QEMU (the monitor protocol?), and that this is a complex scenario. This bug is to track progress on fixing it.

How reproducible:
100% given enough time, frequent enough libvirt usage and unresponsive storage

Steps to Reproduce:
1. Set up one or more QEMU VMs on shared storage, e.g. NFS or iSCSI.
2. Make sure libvirt is used frequently, involving APIs which need to access the QEMU monitor; the bulk stats API is a good example (but not the only one) — see the sketch after these steps.
3. Wait for libvirt APIs to block forever. How long this takes depends on many factors, but it will happen.
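
A minimal sketch of such a polling loop, assuming the Python libvirt bindings (the URI, interval and stats flags are arbitrary choices for illustration):

    import time
    import libvirt  # libvirt-python bindings

    # Roughly what a management application does: every few seconds, fetch
    # bulk stats for all domains.  Each call has to enter the QEMU monitor,
    # so a QEMU process stuck in D state makes this call (and, one by one,
    # every libvirtd worker serving it) block.
    conn = libvirt.open("qemu:///system")
    while True:
        for dom, stats in conn.getAllDomainStats(
                libvirt.VIR_DOMAIN_STATS_STATE | libvirt.VIR_DOMAIN_STATS_BLOCK):
            print(dom.name(), stats)
        time.sleep(5)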

Actual results:
Sooner (minutes) or later (hours), all the libvirtd worker threads get stuck trying to access the unresponsive QEMU monitor.

Expected results:
Instead of blocking forever, all such APIs should quickly (within seconds) return a timeout or an error in every case.

Comment 1 Peter Krempa 2016-11-07 08:14:49 UTC
The "always" part is impossible on our side.

Once we send a command to the monitor there's no way to cancel it if it got stuck. This means that if the storage unblocks for any reason the command will be executed. If libvirt reported an error due to timeout the user or applications on top of that would assume that the command failed and not that it will be eventually finished later.

Most of the libvirt APIs are synchronous in this aspect.

Comment 2 Yaniv Kaul 2016-11-07 09:55:34 UTC
(In reply to Peter Krempa from comment #1)
> The "always" part is impossible on our side.
> 
> Once we send a command to the monitor there's no way to cancel it if it got
> stuck. This means that if the storage unblocks for any reason the command
> will be executed. If libvirt reported an error due to timeout the user or
> applications on top of that would assume that the command failed and not
> that it will be eventually finished later.
> 
> Most of the libvirt APIs are synchronous in this aspect.

I wonder, though, whether the 'always' part could make sense for 'read'-type commands.

Comment 3 Daniel Berrangé 2016-11-07 10:00:53 UTC
> Sooner (minutes) or later (hours) all the libvirtd worker threads will get
> stuck trying to access the unresponsive qemu monitor

NB, libvirt added a concept of "high priority" worker threads explicitly to let management apps get themselves out of this problem. Certain API calls that we know can never block are always directed to a dedicated pool of high-priority worker threads. In particular virDomainDestroy is high priority, so it is always possible to unblock the worker threads by killing the guest that is non-responsive.
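
As a rough sketch (using the Python bindings; the 30-second deadline and the "stuck-guest" domain name are made up for illustration), a management app can run the stats call with its own deadline and fall back to virDomainDestroy when it does not return:

    import concurrent.futures
    import libvirt

    conn = libvirt.open("qemu:///system")

    def fetch_stats():
        # Served by a normal-priority libvirtd worker; blocks if a QEMU
        # monitor is unresponsive.
        return conn.getAllDomainStats(libvirt.VIR_DOMAIN_STATS_STATE)

    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fetch_stats)
    try:
        stats = future.result(timeout=30)       # client-side deadline
    except concurrent.futures.TimeoutError:
        # virDomainDestroy is served by the high priority worker pool, so
        # it gets through even while the normal workers are all stuck.
        dom = conn.lookupByName("stuck-guest")  # hypothetical domain name
        dom.destroy()
    pool.shutdown(wait=False)                   # don't wait for the stuck call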

Comment 4 Michal Skrivanek 2016-11-07 10:31:44 UTC
It does still require the monitoring part of the application to deal with that when we use bulk calls to get stats for all VMs, or the management to be intelligent enough to monitor for stuck calls and shut down VMs in that case. That can be a bit too harsh, and it is not good enough anyway: such detection takes some time, and during that time monitoring does not work for any of the other VMs.

Comment 5 Jiri Denemark 2016-11-07 11:22:46 UTC
Are you suggesting that the problem is with the bulk stats API, which gets blocked because some domains are in D state? If so, please file a bug for that issue and we'll work on a reasonable solution for it.

Comment 6 Francesco Romani 2016-11-07 11:38:10 UTC
(In reply to Jiri Denemark from comment #5)
> Are you suggesting that the problem is with the bulk stats API, which gets
> blocked because some domains are in D state? If so, please file a bug for
> that issue and we'll work on a reasonable solution for it.

Yes, this is one of the few APIs we care about most. I will review the list and file more fine-grained RFEs about them.

