Bug 737881 - After issue "event-test.py qemu:///system", opening virt-viewer will kill service libvirtd
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Daniel Veillard
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 741533 746556
Depends On:
Blocks: 667620
Reported: 2011-09-13 10:26 UTC by kjia
Modified: 2011-12-06 11:35 UTC (History)

Fixed In Version: libvirt-0.9.4-13.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 11:28:58 UTC


Attachments
The log message until libvirtd doesn't work. (64.12 KB, text/plain)
2011-09-13 10:26 UTC, kjia
Test Case (9.28 KB, text/plain)
2011-10-17 14:15 UTC, IBM Bug Proxy
Backported fix (1/2) (6.49 KB, text/plain)
2011-10-17 14:16 UTC, IBM Bug Proxy
Backported fix (2/2) (3.74 KB, text/plain)
2011-10-17 14:16 UTC, IBM Bug Proxy


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:1513 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2011-12-06 01:23:30 UTC

Description kjia 2011-09-13 10:26:32 UTC
Created attachment 522885
The log message until libvirtd doesn't work.

Description of problem:

After running "event-test.py qemu:///system", opening virt-viewer makes the libvirtd service unavailable.

Version-Release number of selected component (if applicable):

libvirt-0.9.4-11.el6.x86_64

How reproducible:

Sometimes, but not 100% of the time.

Steps to Reproduce:

1. Start the libvirtd service, and then open two terminals.

2. In the first terminal run:
# python /usr/share/doc/libvirt-python-x.x.x/events-python/event-test.py qemu:///system

3. In another terminal run:
# virsh start $name_of_domain
# virt-viewer $name_of_domain
  
Actual results:

python /usr/share/doc/libvirt-python-0.9.4/events-python/event-test.py
Using uri:qemu:///system
myDomainEventCallback1 EVENT: Domain guest(8) Started Booted
myDomainEventCallback2 EVENT: Domain guest(8) Started Booted
myDomainEventGraphicsCallback: Domain guest(8) 0 none 

# service libvirtd status
libvirtd dead but pid file exists 

Expected results:

event-test.py shows the correct message about the shutdown of the domain,
and libvirtd keeps working.

Additional info:

This is a regression, since libvirt-0.8.7-18.el6 did not have this bug.

This appears to be a problem in libvirt-python: without starting the libvirt event handler, the problem does not occur.

Comment 3 Vivian Bian 2011-09-13 11:18:38 UTC
The bug was filed with the following packages:

libvirt-0.9.4-11.el6.x86_64
libvirt-python-0.9.4-11.el6.x86_64
qemu-kvm-0.12.1.2-2.185.el6.x86_64
kernel-2.6.32-196.el6.x86_64
virt-viewer-0.4.1-4.el6.x86_64

Comment 4 Daniel Veillard 2011-09-21 07:49:43 UTC
Okay, I could reproduce this easily; it worked the first time. Here is
the stack trace I captured in gdb:

Program received signal SIGSEGV, Segmentation fault.
0x00007f98e1ac22a5 in malloc_consolidate () from /lib64/libc.so.6
(gdb) where
#0  0x00007f98e1ac22a5 in malloc_consolidate () from /lib64/libc.so.6
#1  0x00007f98e1ac4a48 in _int_free () from /lib64/libc.so.6
#2  0x00007f98e36f12b9 in virFree (ptrptr=0x7fff604fe848) at util/memory.c:310
#3  0x000000000044288f in virNetMessageFree (msg=0x15aa540)
    at rpc/virnetmessage.c:69
#4  0x000000000043e928 in virNetServerClientDispatchWrite (
    sock=<value optimized out>, events=2, opaque=0x15276a0)
    at rpc/virnetserverclient.c:902
#5  virNetServerClientDispatchEvent (sock=<value optimized out>, events=2, 
    opaque=0x15276a0) at rpc/virnetserverclient.c:956
#6  0x00007f98e36e8022 in virEventPollDispatchHandles ()
    at util/event_poll.c:470
#7  virEventPollRunOnce () at util/event_poll.c:611
#8  0x00007f98e36e6ed7 in virEventRunDefaultImpl () at util/event.c:247
#9  0x000000000043f97d in virNetServerRun (srv=0x151ee50)
    at rpc/virnetserver.c:701
#10 0x000000000041ed04 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at libvirtd.c:1591

I also managed to get valgrind errors with more details. The errors occur
only when making the virt-viewer connection:

==15543== Invalid free() / delete / delete[]
==15543==    at 0x4A0595D: free (vg_replace_malloc.c:366)
==15543==    by 0x3025A4A2B8: virFree (memory.c:310)
==15543==    by 0x3025A7EC49: virDomainEventFree (domain_event.c:489)
==15543==    by 0x3025A7EF42: virDomainEventQueueDispatch (domain_event.c:1154)
==15543==    by 0x3025A8013D: virDomainEventStateFlush (domain_event.c:1195)
==15543==    by 0x4778B1: qemuDomainEventFlush (qemu_domain.c:134)
==15543==    by 0x3025A40DA5: virEventPollRunOnce (event_poll.c:421)
==15543==    by 0x3025A3FED6: virEventRunDefaultImpl (event.c:247)
==15543==    by 0x43F97C: virNetServerRun (virnetserver.c:701)
==15543==    by 0x41ED03: main (libvirtd.c:1591)
==15543==  Address 0x4e30d90 is 0 bytes inside a block of size 10 free'd
==15543==    at 0x4A0595D: free (vg_replace_malloc.c:366)
==15543==    by 0x300D3149E7: xdr_string (in /lib64/libc-2.12.so)
==15543==    by 0x43898D: xdr_remote_nonnull_string (remote_protocol.c:30)
==15543==    by 0x438C5B: xdr_remote_domain_event_graphics_address (remote_protocol.c:3907)
==15543==    by 0x43C37B: xdr_remote_domain_event_graphics_msg (remote_protocol.c:3934)
==15543==    by 0x300D314194: xdr_free (in /lib64/libc-2.12.so)
==15543==    by 0x4344E8: remoteRelayDomainEventGraphics (remote.c:333)
==15543==    by 0x3025A7F0DA: virDomainEventDispatchDefaultFunc (domain_event.c:1064)
==15543==    by 0x477907: qemuDomainEventDispatchFunc (qemu_domain.c:125)
==15543==    by 0x3025A7EECA: virDomainEventDispatch (domain_event.c:1136)
==15543==    by 0x3025A7EF31: virDomainEventQueueDispatch (domain_event.c:1153)
==15543==    by 0x3025A8013D: virDomainEventStateFlush (domain_event.c:1195)

I think it's the same kind of problem I tried to fix just before the 0.9.5
release, i.e. remoteRelayDomainEventGraphics() doesn't strdup the strings,
xdr_free now frees them, and when the event is finally freed the strings have
already been deallocated.

This is related to the following:

https://www.redhat.com/archives/libvir-list/2011-September/msg00750.html
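
The ownership problem described above can be sketched in isolation. This is a minimal illustration only, not the actual libvirt patch: the struct names, function names, and the address values are hypothetical stand-ins, with wire_msg_free() playing the role of xdr_free() and graphics_event_free() the role of virDomainEventFree(). The point is that the relay function duplicates each string, so the message and the event each free their own copy.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are NOT libvirt types. */
struct graphics_addr {
    char *node;
    char *service;
};

/* The event owns its strings and releases them in graphics_event_free(). */
struct graphics_event {
    struct graphics_addr local;
};

/* The outgoing message; wire_msg_free() plays the role of xdr_free(). */
struct wire_msg {
    struct graphics_addr local;
};

static void wire_msg_free(struct wire_msg *msg)
{
    free(msg->local.node);
    free(msg->local.service);
    msg->local.node = NULL;
    msg->local.service = NULL;
}

static void graphics_event_free(struct graphics_event *ev)
{
    free(ev->local.node);
    free(ev->local.service);
    ev->local.node = NULL;
    ev->local.service = NULL;
}

/*
 * Fill the message from the event. The buggy pattern would alias the
 * pointers (msg->local.node = ev->local.node), so both free functions
 * would later release the same block. Duplicating gives each side its
 * own copy. Returns 0 on success, -1 on allocation failure.
 */
static int relay_graphics_event(const struct graphics_event *ev,
                                struct wire_msg *msg)
{
    msg->local.node = strdup(ev->local.node);
    msg->local.service = strdup(ev->local.service);
    if (msg->local.node == NULL || msg->local.service == NULL) {
        wire_msg_free(msg);
        return -1;
    }
    return 0;
}

/* Same order as in the valgrind report: message freed first, event last. */
static int run_demo(void)
{
    /* Illustrative values only. */
    struct graphics_event ev = { { strdup("127.0.0.1"), strdup("5900") } };
    struct wire_msg msg = { { NULL, NULL } };

    if (ev.local.node == NULL || ev.local.service == NULL ||
        relay_graphics_event(&ev, &msg) < 0) {
        graphics_event_free(&ev);
        return -1;
    }
    assert(msg.local.node != ev.local.node);  /* distinct allocations */
    wire_msg_free(&msg);        /* frees only the message's copies */
    graphics_event_free(&ev);   /* event strings are still valid here */
    return 0;
}
```

Calling run_demo() from a small main() under valgrind should report no invalid free with the duplicating version; replacing the strdup() calls with plain pointer assignment reproduces the double free pattern.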

Daniel

Comment 5 Daniel Veillard 2011-09-21 11:01:06 UTC
A patch based on commits 675464b183f006fd805644075503f2d9bd647576
and 2b0803c64f8fdbbbf0f135ef9be610579fd8fe8f fixes the issue for me;
the resulting patch was sent to rhvirt-patches.

And yes, that's a blocker!

Daniel

Comment 7 Vivian Bian 2011-09-27 03:19:54 UTC
Tested with:
libvirt-0.9.4-13.el6.x86_64
qemu-kvm-0.12.1.2-2.192.el6.x86_64
kernel-2.6.32-197.el6.x86_64
virt-viewer-0.2.1-3.el6.x86_64

Steps to Reproduce:

1. Start the libvirtd service, and then open two terminals.

2. In the first terminal run:
# python /usr/share/doc/libvirt-python-x.x.x/events-python/event-test.py qemu:///system

3. In another terminal run:
# virsh start $name_of_domain
# virt-viewer $name_of_domain

4. # service libvirtd status 
libvirtd (pid  22977) is running...

Tried about 10 times, and libvirtd never crashed, so setting the bug status to VERIFIED.

Comment 8 Peter Krempa 2011-10-03 13:55:29 UTC
*** Bug 741533 has been marked as a duplicate of this bug. ***

Comment 9 Eric Blake 2011-10-17 14:13:39 UTC
*** Bug 746556 has been marked as a duplicate of this bug. ***

Comment 10 IBM Bug Proxy 2011-10-17 14:15:58 UTC
Created attachment 528548
Test Case

Comment 11 IBM Bug Proxy 2011-10-17 14:16:05 UTC
Created attachment 528549
Backported fix (1/2)

Comment 12 IBM Bug Proxy 2011-10-17 14:16:10 UTC
Created attachment 528550
Backported fix (2/2)

Comment 13 errata-xmlrpc 2011-12-06 11:28:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html

