Bug 1394405 - libvirtError: Timed out during operation: cannot acquire state change lock
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 5.0 (RHEL 7)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: async
Target Release: 5.0 (RHEL 7)
Assignee: Eoghan Glynn
QA Contact: Prasanth Anbalagan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-11-11 21:26 UTC by Aaron Thomas
Modified: 2020-01-17 16:10 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-05-25 17:31:14 UTC
Target Upstream Version:


Links
System     ID       Private  Priority  Status  Summary  Last Updated
Launchpad  1254872  0        None      None    None     2016-11-11 21:26:35 UTC

Description Aaron Thomas 2016-11-11 21:26:35 UTC
Description of problem:
-----------------------------------------
This appears to be similar to the upstream nova bug linked to this BZ (Launchpad 1254872): queries against a domain which is being worked on asynchronously can result in hung clients and the inability to do anything further with that domain.

Version-Release number of selected component (if applicable):
-----------------------------------------
libvirt-1.2.8-16.el7_1.3.x86_64
libvirt-client-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-config-network-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-config-nwfilter-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-interface-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-lxc-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-network-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-nodedev-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-nwfilter-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-qemu-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-secret-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-storage-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-kvm-1.2.8-16.el7_1.3.x86_64
libvirt-python-1.2.8-7.el7_1.1.x86_64
openstack-nova-common-2014.1.5-31.el7ost.noarch
openstack-nova-compute-2014.1.5-31.el7ost.noarch

How reproducible:
-----------------------------------------
This appears to occur every few days under heavy query load. Logging appears to indicate that two separate threads made libvirt API calls against the same instance. Based on https://bugs.launchpad.net/nova/+bug/1254872, one of the calls has either hung completely or is taking a very long time to respond, causing the second API call to report the error message "libvirtError: Timed out during operation: cannot acquire state change lock".
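
The suspected sequence can be illustrated with a small, self-contained Python sketch. This is only a model of the behaviour described above, not libvirt or nova code: the per-domain lock is a plain threading.Lock, and the names run_job, simulate and StateChangeLockTimeout, as well as the short timings, are hypothetical and chosen for illustration.

```python
import threading
import time


class StateChangeLockTimeout(Exception):
    """Hypothetical stand-in for the libvirtError seen in this report."""


def run_job(domain_lock, work_seconds, wait_seconds, holding=None):
    # Wait at most wait_seconds for the per-domain lock, then give up.
    if not domain_lock.acquire(timeout=wait_seconds):
        raise StateChangeLockTimeout(
            "Timed out during operation: cannot acquire state change lock")
    try:
        if holding is not None:
            holding.set()          # signal that the lock is now held
        time.sleep(work_seconds)   # simulate a slow or hung call
    finally:
        domain_lock.release()


def simulate(slow_call_seconds=0.5, wait_seconds=0.1):
    domain_lock = threading.Lock()
    holding = threading.Event()
    results = {}

    def slow_call():
        run_job(domain_lock, slow_call_seconds, wait_seconds, holding)
        results["first"] = "ok"

    def second_call():
        holding.wait()             # start only once the first call holds the lock
        try:
            run_job(domain_lock, 0, wait_seconds)
            results["second"] = "ok"
        except StateChangeLockTimeout as exc:
            results["second"] = str(exc)

    threads = [threading.Thread(target=slow_call),
               threading.Thread(target=second_call)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(simulate())
```

In this model the second call fails only because the first one holds the lock for longer than the wait limit; if the first call finishes within the limit, both succeed. That would be consistent with the error being a symptom of a slow or hung earlier call rather than a fault in the second call itself.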

Actual results:
-----------------------------------------
Queries against a domain which is being worked on asynchronously
can result in hung clients and the inability to do anything
further with that domain.

Expected results:
-----------------------------------------
Queries against a domain which is being worked on asynchronously
do not result in hung clients or the inability to do anything
further with that domain.

Additional info:
-----------------------------------------
We've requested that the customer enable verbose libvirt logging to help identify the specific API calls that were last reported as successful, since it appears this can be caused by many factors.

Comment 1 Kashyap Chamarthy 2016-11-15 14:22:39 UTC
Assuming they are about to provide debug logs, these are the log filters to use:

  1. In /etc/libvirt/libvirtd.conf, have these two config attributes:

     . . .
     log_filters="1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util"
     log_outputs="1:file:/var/log/libvirt/libvirtd.log"
     . . .

  2. Restart libvirtd:

     $ systemctl restart libvirtd

  3. Repeat the test.

  4. Capture the libvirt logs and attach them as plain text to the
     bug.
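
Before repeating the test, it may be worth confirming that both settings from step 1 are actually in effect in the file. Below is a minimal sketch, assuming a simple key="value" line format; parse_conf and missing_settings are hypothetical helpers and not part of libvirt, and the parser is deliberately simplistic (it ignores comment lines and does not handle every syntax libvirtd.conf accepts).

```python
# Values copied from step 1 above.
EXPECTED = {
    "log_filters": "1:libvirt 1:qemu 1:conf 1:security 3:event "
                   "3:json 3:file 3:object 1:util",
    "log_outputs": "1:file:/var/log/libvirt/libvirtd.log",
}


def parse_conf(text):
    """Return {key: value} for uncommented key="value" lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def missing_settings(text, expected=EXPECTED):
    """List the expected keys that are absent or set to a different value."""
    found = parse_conf(text)
    return sorted(k for k, v in expected.items() if found.get(k) != v)


if __name__ == "__main__":
    with open("/etc/libvirt/libvirtd.conf") as conf:
        print(missing_settings(conf.read()) or "both settings present")
```

An empty result means both attributes match the values above; libvirtd still has to be restarted as in step 2 for them to take effect.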

