Bug 892901 - Concurrency/locking causes segfault
Summary: Concurrency/locking causes segfault
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 892649
Blocks: 903238
Reported: 2013-01-08 08:03 UTC by Michal Privoznik
Modified: 2014-06-18 00:43 UTC
CC List: 8 users

Fixed In Version: libvirt-1.0.2-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of: 892649
: 903238 (view as bug list)
Environment:
Last Closed: 2014-06-13 11:29:28 UTC
Target Upstream Version:
Embargoed:



Description Michal Privoznik 2013-01-08 08:03:01 UTC
+++ This bug was initially created as a clone of Bug #892649 +++

Description of problem:

When running multiple virsh create/destroy loops, sometimes (if the timing is right) a segfault will occur, causing libvirtd to crash. 

Version-Release number of selected component (if applicable):

This problem was introduced with v0.9.12. I cannot reproduce this issue under v0.9.11.X or older. I am able to reproduce this problem as well with the latest code from master.

How reproducible:

This posting has the steps to reproduce the problem:

http://www.redhat.com/archives/libvir-list/2012-December/msg01365.html

Steps to Reproduce:
1. Go to above link, follow steps outlined.
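
For readers without access to the linked posting, a create/destroy loop of the kind described above might look roughly like the sketch below. This is not the script from the mailing list: the function names `churn_commands` and `run_churn`, the guest XML path, and the guest name are all hypothetical. Only the `virsh create` and `virsh destroy` subcommands are taken from the report.

```python
import subprocess


def churn_commands(xml_path, name, iterations):
    """Yield the virsh argv lists for `iterations` create/destroy cycles.

    xml_path and name are placeholders for a transient guest definition
    and the guest name it declares (assumptions, not from the report).
    """
    for _ in range(iterations):
        yield ["virsh", "create", xml_path]
        yield ["virsh", "destroy", name]


def run_churn(xml_path, name, iterations, run=subprocess.call):
    """Run the cycles and return how many commands exited non-zero.

    `run` is injectable so the loop can be exercised without virsh
    installed; by default it executes the commands for real.
    """
    failures = 0
    for argv in churn_commands(xml_path, name, iterations):
        if run(argv) != 0:
            failures += 1
    return failures
```

Since the report says the crash depends on timing, several such loops would presumably need to run in parallel against different guests, for example one `run_churn` call per thread or per shell.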
  
Actual results:

When the script is running and doing its operations with libvirtd, within 10 or 20 minutes libvirtd will segfault. 

Expected results:

The scripts outlined all run to completion without libvirtd crashing.

Additional info:

All additional info is on the list, including GDB output from several of the crashes I reproduced. In addition, there was a patch by Michal Privoznik (http://www.redhat.com/archives/libvir-list/2012-December/msg01372.html) that attempted to fix this problem; however, the issue still occurs after applying this patch on top of v1.0.0 or v1.0.1.

Here is Michal's response after I told him his patch wasn't working for me:

http://www.redhat.com/archives/libvir-list/2012-December/msg01378.html

Comment 1 Scott Sullivan 2013-01-22 17:33:46 UTC
As the original reporter of this bug, I can say for me at least this issue was fixed with this commit:

http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=81621f3e6e45e8681cc18ae49404736a0e772a11

Comment 2 Dave Allan 2013-01-22 17:53:17 UTC
(In reply to comment #1)
> As the original reporter of this bug, I can say for me at least this issue
> was fixed with this commit:
> 
> http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=81621f3e6e45e8681cc18ae49404736a0e772a11

Many thanks for reporting that datapoint, that's extremely helpful.

Comment 4 Michal Privoznik 2013-01-23 15:31:52 UTC
Moving to POST:

commit 81621f3e6e45e8681cc18ae49404736a0e772a11
Author:     Daniel P. Berrange <berrange>
AuthorDate: Fri Jan 18 14:33:51 2013 +0000
Commit:     Daniel P. Berrange <berrange>
CommitDate: Fri Jan 18 15:45:38 2013 +0000

    Fix race condition when destroying guests
    
    When running virDomainDestroy, we need to make sure that no other
    background thread cleans up the domain while we're doing our work.
    This can happen if we release the domain object while in the
    middle of work, because the monitor might detect EOF in this window.
    For this reason we have a 'beingDestroyed' flag to stop the monitor
    from doing its normal cleanup. Unfortunately this flag was only
    being used to protect qemuDomainBeginJob, and not qemuProcessKill
    
    This left open a race condition where either libvirtd could crash,
    or alternatively report bogus error messages about the domain already
    having been destroyed to the caller
    
    Signed-off-by: Daniel P. Berrange <berrange>

v1.0.1-349-g81621f3
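
The guard pattern the commit message describes can be illustrated with a minimal sketch. This is a toy model in Python, not libvirt's C code: the `Domain` class, `monitor_eof`, `destroy`, and `demo` are hypothetical names, and only the idea of a 'beingDestroyed' flag that stops the monitor's EOF cleanup comes from the commit message.

```python
import threading


class Domain:
    """Toy stand-in for a guest domain object (hypothetical)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.being_destroyed = False
        self.cleanups = 0  # number of times teardown has run

    def cleanup(self):
        self.cleanups += 1


def monitor_eof(dom):
    """Monitor callback: the guest's monitor connection hit EOF."""
    with dom.lock:
        if dom.being_destroyed:
            # A destroy call owns the teardown, so skip normal cleanup.
            return False
        dom.cleanup()
        return True


def destroy(dom, kill_process):
    """Destroy the guest, dropping the lock while the process is killed."""
    with dom.lock:
        dom.being_destroyed = True  # set before the lock is released
    # The lock is not held here, so the monitor may detect EOF right now.
    kill_process()
    with dom.lock:
        dom.cleanup()
        dom.being_destroyed = False


def demo():
    """Fire the monitor's EOF handler inside the unlocked window."""
    dom = Domain()
    monitor_results = []
    destroy(dom, lambda: monitor_results.append(monitor_eof(dom)))
    return dom.cleanups, monitor_results
```

In this model the teardown runs exactly once, because the EOF handler sees the flag and backs off. If the flag were not set before the lock is dropped, both paths would run the cleanup, which is the kind of double teardown the commit message associates with the crash.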

Comment 5 yanbing du 2013-02-05 07:28:10 UTC
Reproduced this bug with libvirt-1.0.1-1.el7.x86_64: repeatedly creating and destroying guests causes libvirtd to crash:

# systemctl status libvirtd.service
libvirtd.service - Virtualization daemon
	  Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
	  Active: failed (Result: core-dump) since Tue, 2013-02-05 14:32:10 CST; 50s ago
	 Process: 4049 ExecStart=/usr/sbin/libvirtd $LIBVIRTD_ARGS (code=dumped, signal=SEGV)
	  CGroup: name=systemd:/system/libvirtd.service
		  └ 2886 /sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf

Feb 05 14:32:10 localhost.localdomain libvirtd[4049]: 2013-02-05 06:32:10.265+00004050: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x7f794418be70
Feb 05 14:32:10 localhost.localdomain libvirtd[4049]: 2013-02-05 06:32:10.265+00004050: debug : virObjectUnref:137 : OBJECT_DISPOSE: obj=0x7f794418be70
Feb 05 14:32:10 localhost.localdomain libvirtd[4049]: 2013-02-05 06:32:10.266+00004050: debug : qemuDomainObjEndJob:936 : Stopping job: modify (async=none)
Feb 05 14:32:10 localhost.localdomain libvirtd[4049]: 2013-02-05 06:32:10.266+00004050: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x7f7944118650
Feb 05 14:32:10 localhost.localdomain libvirtd[4049]: 2013-02-05 06:32:10.266+00004050: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x7f7944118650
Feb 05 14:32:10 localhost.localdomain libvirtd[4049]: 2013-02-05 06:32:10.266+00004050: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x7f79440373a0
Feb 05 14:32:10 localhost.localdomain libvirtd[4049]: 2013-02-05 06:32:10.266+00004054: debug : virObjectUnref:135 : OBJECT_UNREF: obj=0x7f7944118650
Feb 05 14:32:10 localhost.localdomain libvirtd[4049]: 2013-02-05 06:32:10.266+00004054: debug : virObjectUnref:137 : OBJECT_DISPOSE: obj=0x7f7944118650
Feb 05 14:32:10 localhost.localdomain systemd[1]: libvirtd.service: main process exited, code=dumped, status=11/SEGV
Feb 05 14:32:10 localhost.localdomain systemd[1]: Unit libvirtd.service entered failed state


After updating libvirt to libvirt-1.0.2-1.el7.x86_64, this bug can no longer be reproduced, so the bug is VERIFIED.

Comment 6 yanbing du 2013-07-29 08:45:29 UTC
Moving bug to VERIFIED according to comment 5.

Comment 7 Ludek Smid 2014-06-13 11:29:28 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

