Bug 892649 - Concurrency/locking causes segfault
Summary: Concurrency/locking causes segfault
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libvirt
Version: unspecified
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Michal Privoznik
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 892901 903238
Reported: 2013-01-07 14:04 UTC by Scott Sullivan
Modified: 2013-01-23 15:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 892901
Environment:
Last Closed: 2013-01-23 15:31:08 UTC
Embargoed:


Description Scott Sullivan 2013-01-07 14:04:38 UTC
Description of problem:

When running multiple virsh create/destroy loops, libvirtd occasionally segfaults and crashes, depending on timing.

Version-Release number of selected component (if applicable):

This problem was introduced with v0.9.12. I cannot reproduce it under v0.9.11.X or older. I can also reproduce it with the latest code from master.

How reproducible:

This posting has the steps to reproduce the problem:

http://www.redhat.com/archives/libvir-list/2012-December/msg01365.html

Steps to Reproduce:
1. Follow the steps outlined at the link above.
  
Actual results:

While the script is running its operations against libvirtd, libvirtd segfaults within 10 to 20 minutes.

Expected results:

The outlined script runs to completion without libvirtd crashing.

Additional info:

All additional info is on the list, including GDB output from several of the crashes I reproduced. In addition, Michal Privoznik posted a patch (http://www.redhat.com/archives/libvir-list/2012-December/msg01372.html) that attempted to fix this problem; however, the issue still occurs after applying this patch on top of v1.0.0 or v1.0.1.

Here is Michal's response after I told him his patch wasn't working for me:

http://www.redhat.com/archives/libvir-list/2012-December/msg01378.html

Comment 1 Michal Privoznik 2013-01-23 15:31:08 UTC
The fix has been pushed:

commit 81621f3e6e45e8681cc18ae49404736a0e772a11
Author:     Daniel P. Berrange <berrange>
AuthorDate: Fri Jan 18 14:33:51 2013 +0000
Commit:     Daniel P. Berrange <berrange>
CommitDate: Fri Jan 18 15:45:38 2013 +0000

    Fix race condition when destroying guests
    
    When running virDomainDestroy, we need to make sure that no other
    background thread cleans up the domain while we're doing our work.
    This can happen if we release the domain object while in the
    middle of work, because the monitor might detect EOF in this window.
    For this reason we have a 'beingDestroyed' flag to stop the monitor
    from doing its normal cleanup. Unfortunately this flag was only
    being used to protect qemuDomainBeginJob, and not qemuProcessKill
    
    This left open a race condition where either libvirtd could crash,
    or alternatively report bogus error messages about the domain already
    having been destroyed to the caller
    
    Signed-off-by: Daniel P. Berrange <berrange>


v1.0.1-349-g81621f3
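
For readers who want to see the shape of the race described in the commit message, here is a minimal Python model of it. This is only a sketch, not libvirt code: the Domain class, the monitor_eof and destroy functions, and the guard_kill parameter are all hypothetical names invented for illustration, and the real fix lives in libvirt's C sources. The model assumes that the destroy path drops the domain lock while killing the process, and that the monitor's EOF handler can run in that window.

```python
import threading


class Domain:
    """Hypothetical stand-in for a domain object (not libvirt's real type)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.being_destroyed = False  # models the 'beingDestroyed' flag
        self.cleanups = 0             # how many times teardown has run

    def cleanup(self):
        # Caller must hold self.lock. Running this twice models the
        # double cleanup that could crash the daemon.
        self.cleanups += 1


def monitor_eof(dom):
    """Models the background handler that runs when the monitor sees EOF."""
    with dom.lock:
        if dom.being_destroyed:
            # The destroy path owns the cleanup, so do nothing here.
            return False
        dom.cleanup()
        return True


def destroy(dom, kill_process, guard_kill=True):
    """Models the destroy path.

    kill_process is called with the lock released, which is the window
    in which the monitor may notice EOF. guard_kill=False models the
    behaviour before the fix, where the flag did not cover this step.
    """
    with dom.lock:
        if guard_kill:
            dom.being_destroyed = True
    kill_process()  # lock is not held here
    with dom.lock:
        dom.cleanup()
        dom.being_destroyed = False
```

Passing the EOF handler as the kill callback forces the unlucky interleaving deterministically. With the default guard, destroy(dom, lambda: monitor_eof(dom)) leaves dom.cleanups at 1; with guard_kill=False both paths clean up and the count reaches 2.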

