Bug 1576464
Summary: | Hash operation not allowed during iteration | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Derek Higgins <derekh>
Component: | libvirt | Assignee: | Michal Privoznik <mprivozn>
Status: | CLOSED ERRATA | QA Contact: | yafu <yafu>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 7.4 | CC: | bfournie, chhu, dyuan, hjensas, jdenemar, jherrman, kchamart, kevin, lmen, mflusche, michele, mprivozn, mtessun, sasha, wznoinsk, xuzhang
Target Milestone: | rc | Keywords: | Upstream, ZStream
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | libvirt-4.3.0-1.el7 | Doc Type: | If docs needed, set a value
Doc Text: | Prior to this update, guest virtual machine actions that use a python library in some cases failed and "Hash operation not allowed during iteration" error messages were logged. Several redundant thread access checks have been removed, and the problem no longer occurs. | Story Points: | ---
Clone Of: | | |
: | 1579460 1581364 | Environment: |
Last Closed: | 2018-10-30 09:55:31 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1579460, 1581364 | |
Attachments: | | |

Description
Derek Higgins
2018-05-09 14:03:52 UTC
Created attachment 1433878
vbmc errors

I think this is fixed upstream by:

commit 4d7384eb9ddef2008cb0cc165eb808f74bc83d6b
Author:     Vincent Bernat <vincent>
AuthorDate: Tue Apr 10 08:27:15 2018 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Wed Apr 11 11:18:37 2018 +0200

    util: don't check for parallel iteration in hash-related functions

    This is the responsability of the caller to apply the correct lock
    before using these functions. Moreover, the use of a simple boolean
    was still racy: two threads may check the boolean and "lock" it
    simultaneously.

    Users of functions from src/util/virhash.c have to be checked for
    correctness. Lookups and iteration should hold a RO lock.
    Modifications should hold a RW lock. Most important uses seem to be
    covered. Callers have now a greater responsability, notably the
    ability to execute some operations while iterating were reliably
    forbidden before are now accepted.

    Signed-off-by: Vincent Bernat <vincent>

libvirt.git $ git describe --contains 4d7384eb9ddef2008cb0cc165eb808f74bc83d6b
v4.3.0-rc1~369

A reproducer would be useful.

That said, seems like something in this area was very recently fixed in upstream libvirt, in this commit:

    4d7384e -- "util: don't check for parallel iteration in hash-related functions"

But note that the above commit isn't in the libvirt version (3.9.0, package: 14.el7_5.2) running in your setup.

Maybe Michal will be able to tell us a bit more.

(In reply to Kashyap Chamarthy from comment #4)
> A reproducer would be useful.
>
> That said, seems like something in this area was very recently fixed in
> upstream libvirt, in this commit:
>
>     4d7384e -- "util: don't check for parallel iteration in
>     hash-related functions"
>
> But note that the above commit isn't in the libvirt version (3.9.0, package:
> 14.el7_5.2) running in your setup.
>
> Maybe Michal will be able to tell us a bit more.

Ah, Michal already pointed to the same commit in his earlier comment.
And he elaborated further on IRC:

    I think the problem is that we've switched to RW locks which allows multiple readers to work over list of domains (in fact a hash table), however the rest of the code (hash table impl) had this check preventing access from multiple threads. For instance, if one thread is listing running VMs while the other is fetching stats for all running VMs it may so happen that these threads will clash on the bogus check - hence the error message.

Reproduced with libvirt-3.9.0-14.el7_5.4.x86_64.

Test steps:
1. Start a guest:
   # virsh start test1
2. Do 'virsh list' in a loop:
   # for i in {1..1000}; do virsh list; done
3. Open another terminal, do 'virsh domstats' in a loop:
   # for i in {1..1000}; do virsh domstats; done
4. Check the libvirtd.log:
   # cat /var/log/libvirt/libvirtd.log | grep -i virhash
   2018-05-14 07:28:48.812+0000: 2177: error : virHashForEach:597 : Hash operation not allowed during iteration
   2018-05-14 07:28:50.426+0000: 2175: error : virHashForEach:597 : Hash operation not allowed during iteration
   2018-05-14 07:29:19.708+0000: 2175: error : virHashForEach:597 : Hash operation not allowed during iteration

Do you know when you will have a build available with this fix, or is it possible to get another scratch build built? The output from the scratch build from comment no longer seems to be there.

Cannot reproduce the issue with the scratch build in comment 12.

*** Bug 1571384 has been marked as a duplicate of this bug. ***

Does this bz need to be cloned for 7.5?

We would like this patch to be considered for 7.5.z. This bug has been causing many failures of OpenStack deployments that use VirtualBMC. As shown in comments 7 and 8, this patch works well and fixes these deployment failures.

Verified with libvirt-4.3.0-1.el7.x86_64.
Test steps:
1. Start a guest:
   # virsh start test1
2. Do 'virsh list' in a loop:
   # for i in {1..1000}; do virsh list; done
3. Open another terminal, do 'virsh domstats' in a loop:
   # for i in {1..1000}; do virsh domstats; done
4. Check the libvirtd.log after steps 2 and 3; the "Hash operation not allowed during iteration" error no longer appears:
   # cat /var/log/libvirt/libvirtd.log | grep -i "Hash operation not allowed during iteration"
   (no output)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3113
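
For readers who want to see why the removed check clashed with read-write locking, here is a minimal Python sketch of the situation described in the IRC excerpt above. It is not libvirt code: `GuardedTable`, `PlainTable`, `HashIterationError`, and `overlapping_readers` are hypothetical names invented for illustration. To keep the result deterministic, the second reader is started from inside the first reader's callback rather than from a real second thread; this only models two readers overlapping under a shared (read) lock.

```python
class HashIterationError(Exception):
    """Stand-in for the "Hash operation not allowed during iteration" error."""


class GuardedTable:
    """Toy table with a boolean iteration guard, modelling the removed check."""

    def __init__(self, items):
        self.items = dict(items)
        self.iterating = False  # a plain flag, not a lock

    def for_each(self, callback):
        # A second reader that arrives while the flag is set is rejected,
        # even though it only wants to read.
        if self.iterating:
            raise HashIterationError("Hash operation not allowed during iteration")
        self.iterating = True
        try:
            for key, value in list(self.items.items()):
                callback(key, value)
        finally:
            self.iterating = False


class PlainTable:
    """Same toy table without the guard; callers must hold the right lock."""

    def __init__(self, items):
        self.items = dict(items)

    def for_each(self, callback):
        for key, value in list(self.items.items()):
            callback(key, value)


def overlapping_readers(table):
    """Start a second read-only pass while the first pass is still running."""
    seen = []

    def second_reader(key, value):
        seen.append(key)

    def first_reader(key, value):
        # Assumption: this nested call stands in for another thread that
        # also holds the read lock and iterates at the same time.
        table.for_each(second_reader)

    table.for_each(first_reader)
    return seen


domains = {"test1": "running"}

try:
    overlapping_readers(GuardedTable(domains))
    guarded_result = "ok"
except HashIterationError as err:
    guarded_result = str(err)

plain_result = overlapping_readers(PlainTable(domains))

print(guarded_result)
print(plain_result)
```

With the guard in place, the overlapping read fails with the same message seen in libvirtd.log; without it, both passes complete. The sketch leaves out the read-write lock itself, which remains the caller's responsibility as the commit message states.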