Bug 1497197

Summary: Calling virDomain.getMemoryStats and virDomain.diskErrors may block the entire process [rhel-7.4.z]
Product: Red Hat Enterprise Linux 7
Reporter: Oneata Mircea Teodor <toneata>
Component: libvirt-python
Assignee: Pavel Hrdina <phrdina>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Docs Contact:
Priority: high
Version: 7.4
CC: chhu, dyuan, fromani, gveitmic, gwatson, jherrman, jiyan, jsuchane, lhuang, libvirt-maint, lmen, mkalinin, mtessun, nsoffer, pkrempa, weizhan, xuzhang
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-python-3.2.0-3.el7_4.1
Doc Type: Bug Fix
Doc Text:
When the storage of a guest hosted on a Red Hat Virtualization Hypervisor system became inaccessible, the host in some cases became unresponsive. This update fixes the behavior of the virDomainMemoryStats() function, which prevents the described problem from occurring.
Story Points: ---
Clone Of: 1496517
Environment:
Last Closed: 2017-10-19 15:16:16 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1496517
Bug Blocks: 1481022
Attachments:
Description     Flags
test scripts    none
test xml        none

Description Oneata Mircea Teodor 2017-09-29 12:54:40 UTC
This bug has been copied from bug #1496517 and has been proposed to be backported to 7.4 z-stream (EUS).

Comment 5 weizhang 2017-10-10 01:41:22 UTC
Verified on libvirt-python-3.2.0-3.el7_4.1

Steps to reproduce on libvirt-python-3.2.0-3.el7:
1. Prepare an NFS share containing an ISO image and add the ISO as a CD-ROM in the guest XML
2. Restart libvirtd
3. On console 1, run
# python ./test.py
The output of the main thread and of the two other threads appears, and the VM starts.

After about 30-40 s, on console 2 run
# iptables -A OUTPUT -d [NFS_SERVER] -j DROP
to block NFS access.

After several minutes, the qemu process enters the Dl state and the print output hangs for 30 s; see the timestamps marked in the log:

*******
print time: 2017-10-09 19:44:06.384147

print time: 2017-10-09 19:44:07.385305

print time: 2017-10-09 19:44:08.386450

start memoryStats: 2017-10-09 19:44:08.395628 <------------- here the time

libvirt: QEMU Driver error : Timed out during operation: cannot acquire state change lock (held by remoteDispatchConnectGetAllDomainStats)
print time: 2017-10-09 19:44:38.395709 
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 765, in run
    self.__target(*self.__args, **self.__kwargs)
  File "./test.py", line 19, in memory_stats
    dom.memoryStats()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1450, in memoryStats
    if ret is None: raise libvirtError ('virDomainMemoryStats() failed', dom=self)
libvirtError: Timed out during operation: cannot acquire state change lock (held by remoteDispatchConnectGetAllDomainStats)

print time: 2017-10-09 19:44:39.396900  <-------------- here the time

print time: 2017-10-09 19:44:40.398037

print time: 2017-10-09 19:44:41.399191
**********

On libvirt-python-3.2.0-3.el7_4.1 the print output continues without hanging, so the fix is verified.

The test XML and scripts are provided as attachments.
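
The attached test.py is not reproduced in this report. As a rough sketch of a script of this shape, assuming the main thread prints a timestamp once per second while worker threads call dom.memoryStats() and dom.diskErrors(), something like the following could be used. The helper names max_tick_gap and probe_domain, and the uri and name arguments, are hypothetical and are not taken from the attachment.

```python
import threading
import time


def max_tick_gap(calls, duration=5.0, interval=1.0):
    """Run each callable in its own thread and return the largest gap
    (in seconds) between consecutive main-thread ticks.

    Hypothetical helper: a gap much larger than `interval` suggests the
    main thread was stalled while a worker was inside a blocking call.
    """
    stop = threading.Event()

    def worker(call):
        while not stop.is_set():
            try:
                call()
            except Exception as exc:  # e.g. libvirtError on a lock timeout
                print("call failed: %s" % exc)
            stop.wait(interval)

    for call in calls:
        t = threading.Thread(target=worker, args=(call,))
        t.daemon = True  # not joined: a worker may be stuck in a long call
        t.start()

    worst = 0.0
    last = time.time()
    deadline = last + duration
    while time.time() < deadline:
        time.sleep(interval)
        now = time.time()
        print("print time: %.6f" % now)
        worst = max(worst, now - last)
        last = now
    stop.set()
    return worst


def probe_domain(uri, name, duration=120.0):
    """Assumed usage against a running guest; requires libvirt-python."""
    import libvirt  # imported lazily so max_tick_gap works without it

    conn = libvirt.open(uri)
    dom = conn.lookupByName(name)
    return max_tick_gap([dom.memoryStats, dom.diskErrors], duration)
```

With such a script, a reported gap close to the 1 s interval would correspond to the behavior described for libvirt-python-3.2.0-3.el7_4.1, while a gap of about 30 s would correspond to the hang shown in the log above.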

Comment 6 weizhang 2017-10-10 01:44:58 UTC
Created attachment 1336584
test scripts

Comment 7 weizhang 2017-10-10 01:46:08 UTC
Created attachment 1336585
test xml

Comment 9 errata-xmlrpc 2017-10-19 15:16:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2948