Created attachment 1186985
Minimal example of program crashing

Description of problem:

It seems that if I close a connection while a domain event callback is in progress, I can easily have a crash. Here is a backtrace:

#v+
#0  virFree (ptrptr=0x0) at ../../../src/util/viralloc.c:582
        save_errno = <optimized out>
#1  0x00007fc8328a4ad2 in virObjectEventCallbackListPurgeMarked (cbList=0xadfc30) at ../../../src/conf/object_event.c:282
        freecb = <optimized out>
        n = 0
#2  virObjectEventStateFlush (state=0xaf5380) at ../../../src/conf/object_event.c:819
        tempQueue = {
          count = 0,
          events = 0x0
        }
#3  virObjectEventTimer (timer=<optimized out>, opaque=0xaf5380) at ../../../src/conf/object_event.c:560
        state = 0xaf5380
#4  0x00007fc83280b7aa in virEventPollDispatchTimeouts () at ../../../src/util/vireventpoll.c:457
        cb = 0x7fc8328a48d0 <virObjectEventTimer>
        timer = 1
        opaque = 0xaf5380
        now = 1470212691501
        i = 0
        ntimeouts = 1
#5  virEventPollRunOnce () at ../../../src/util/vireventpoll.c:653
        fds = 0x7fc824000920
        ret = <optimized out>
        timeout = <optimized out>
        nfds = 1
        __func__ = "virEventPollRunOnce"
        __FUNCTION__ = "virEventPollRunOnce"
#6  0x00007fc83280a141 in virEventRunDefaultImpl () at ../../../src/util/virevent.c:314
        __func__ = "virEventRunDefaultImpl"
#7  0x0000000000400b37 in loop (arg=0x0) at crash.c:8
        __PRETTY_FUNCTION__ = "loop"
#v-

And the state of cbList:

#v+
>>> print *cbList
$2 = {
  nextID = 11419456,
  count = 1,
  callbacks = 0x0
}
#v-

I have another thread, but it is just sleeping when the crash happens.

Source code to trigger the problem is provided as an attachment. Running the program in loop triggers the bug in less than a second.

Version-Release number of selected component (if applicable):
2.0.0 (from Debian)

How reproducible:
Always

Steps to Reproduce:
1. compile the attached program with "gcc -Wall crash.c -lvirt -lpthread -o crash"
2. while ./crash; do echo -n "." ; done

Actual results:
Crashes after a few seconds. Usually after displaying "leak...".

Expected results: Should not crash, despite the "leak..." message.
Tiny bit of analysis and a patch that fixes it for me (although I can't explain it): https://www.redhat.com/archives/libvir-list/2016-October/msg00318.html
This should be fixed with the following series: https://www.redhat.com/archives/libvir-list/2016-October/msg00469.html
This should be fixed by v2.3.0-76-g1827f2ac5de3..v2.3.0-79-g44bf83e313b7, feel free to reopen this BZ if that doesn't work for you.

commit 1827f2ac5de36da06d0246e862910c3a69065752
Author: Martin Kletzander <mkletzan>
Date:   Tue Oct 11 09:48:36 2016 +0200

    Change virDomainEventState to virObjectLockable

commit 3d279e23e7fb353720e1866bd3b2160f104817b4
Author: Martin Kletzander <mkletzan>
Date:   Tue Oct 11 13:30:11 2016 +0200

    Reference state when using it as opaque

commit 6fecf9523a1c05fa71e7c4713282ae59c5e670d4
Author: Martin Kletzander <mkletzan>
Date:   Tue Oct 11 13:35:18 2016 +0200

    De-duplicate code into virObjectEventStateCleanupTimer()

commit 44bf83e313b7567be1097c3e971b3e2b6c8e3601
Author: Martin Kletzander <mkletzan>
Date:   Tue Oct 11 13:44:21 2016 +0200

    Clean timer in virObjectEventStateFlush
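
For readers who want the general idea behind "Reference state when using it as opaque", here is a rough, self-contained sketch. It is not the libvirt code: EventState, Timer, timer_add() and demo() are all hypothetical names made up for illustration, and a plain C11 atomic counter stands in for whatever libvirt's object system does internally. The only point it shows is that a timer which receives the state as its opaque pointer should hold its own reference and drop it from its free callback, so the owner (for example a connection being closed) cannot free the state while a timer callback may still run.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for an event state object (not libvirt's struct). */
typedef struct {
    atomic_int refs;   /* reference count */
    int flushed;       /* how many times the timer callback ran */
} EventState;

static int freed_count;    /* demo bookkeeping only */

static EventState *event_state_new(void)
{
    EventState *s = calloc(1, sizeof(*s));
    if (s)
        atomic_init(&s->refs, 1);   /* the creator owns one reference */
    return s;
}

static EventState *event_state_ref(EventState *s)
{
    atomic_fetch_add(&s->refs, 1);
    return s;
}

static void event_state_unref(EventState *s)
{
    /* atomic_fetch_sub returns the previous value; 1 means we were last. */
    if (atomic_fetch_sub(&s->refs, 1) == 1) {
        free(s);
        freed_count++;
    }
}

/* Hypothetical timer record: callback, free callback, opaque pointer. */
typedef struct {
    void (*cb)(void *opaque);
    void (*ff)(void *opaque);
    void *opaque;
} Timer;

static void timer_cb(void *opaque)
{
    EventState *s = opaque;
    s->flushed++;               /* would flush queued events here */
}

static void timer_free_cb(void *opaque)
{
    event_state_unref(opaque);  /* drop the reference the timer held */
}

/* Registering the timer takes a reference on behalf of the timer. */
static Timer timer_add(EventState *s)
{
    Timer t = { timer_cb, timer_free_cb, event_state_ref(s) };
    return t;
}

/* Returns 0 if the state survives until the timer releases it. */
static int demo(void)
{
    int before = freed_count;
    EventState *s = event_state_new();
    if (!s)
        return -1;

    Timer t = timer_add(s);     /* refs == 2 */
    event_state_unref(s);       /* owner goes away: refs == 1, not freed */
    int freed_early = freed_count - before;

    t.cb(t.opaque);             /* still safe: the timer holds a reference */
    int flushed = ((EventState *)t.opaque)->flushed;

    t.ff(t.opaque);             /* timer removed: refs == 0, state freed */
    int freed_late = freed_count - before;

    return (freed_early == 0 && flushed == 1 && freed_late == 1) ? 0 : 1;
}
```

Calling demo() from a main() should return 0. Without the event_state_ref() in timer_add(), the callback would run on freed memory, which is the same class of problem as the backtrace in the description.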