Created attachment 503906
Patch for first problem
Description of problem:
In confdb_object_iter, the result of object_find_create is now properly
checked. object_find_create can return -1 if the object doesn't exist.
Without this check, an invalid handle (memory garbage) was passed
directly to object_find_next.
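The missing check can be illustrated with a minimal sketch. This is not the corosync code: toy_find_create, toy_find_next, toy_object_iter and the tiny in-memory table are hypothetical stand-ins, assumed only to mirror the "return -1 and leave the output handle unset" behaviour described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the corosync objdb functions. */
typedef unsigned long long toy_handle_t;

#define TOY_ROOT        0ULL
#define TOY_FIND_HANDLE 100ULL

static const toy_handle_t toy_children[] = { 1ULL, 2ULL };
#define TOY_CHILD_COUNT (sizeof(toy_children) / sizeof(toy_children[0]))

/* Returns 0 and sets *find_handle on success, -1 if the parent is unknown.
 * On failure *find_handle is left untouched, so the caller's variable
 * still holds whatever garbage it had before the call. */
int toy_find_create(toy_handle_t parent, toy_handle_t *find_handle)
{
    if (parent != TOY_ROOT) {
        return -1;
    }
    *find_handle = TOY_FIND_HANDLE;
    return 0;
}

/* Returns 0 and the next child handle, or -1 when done or on a bad handle. */
int toy_find_next(toy_handle_t find_handle, size_t *pos, toy_handle_t *out)
{
    if (find_handle != TOY_FIND_HANDLE || *pos >= TOY_CHILD_COUNT) {
        return -1;
    }
    *out = toy_children[(*pos)++];
    return 0;
}

/* Iteration entry point: stop early when no find handle was created. */
int toy_object_iter(toy_handle_t parent, toy_handle_t *out)
{
    toy_handle_t find_handle;   /* deliberately uninitialized */
    size_t pos = 0;

    assert(out != NULL);
    if (toy_find_create(parent, &find_handle) == -1) {
        return -1;              /* never hand find_handle to toy_find_next */
    }
    return toy_find_next(find_handle, &pos, out);
}
```

Calling toy_object_iter with an unknown parent returns -1 without ever reading the uninitialized find_handle, which is the behaviour the check is meant to guarantee.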
The following situation could happen:
- process 1 creates a find handle through confdb
- process 1 calls the find iteration once
- a different process 2 deletes the object that the process 1 iterator points to
- process 1 calls the iteration again ->
object_find_instance->find_child_list is an invalid pointer
Now object_find_create creates an array of matching object handles, and
object_find_next uses that array together with a check of the name. This
prevents the situation where, between steps 2 and 3, a new object is
created with a different name but, unfortunately, with the same handle.
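The snapshot-plus-name-check idea can be sketched as follows. This is only an illustration under stated assumptions, not the patch itself: the snap_* functions, the fixed-size store and the 16-byte names are all hypothetical, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy object store; names and layout are illustrative only. */
#define SNAP_MAX 8

struct snap_object {
    int in_use;
    unsigned int handle;
    char name[16];
};

struct snap_find {
    unsigned int handles[SNAP_MAX]; /* handles captured at create time */
    size_t count;
    size_t pos;
    char name[16];                  /* name the search was started with */
};

static struct snap_object snap_store[SNAP_MAX];

static struct snap_object *snap_lookup(unsigned int handle)
{
    for (size_t i = 0; i < SNAP_MAX; i++) {
        if (snap_store[i].in_use && snap_store[i].handle == handle) {
            return &snap_store[i];
        }
    }
    return NULL;
}

int snap_add(unsigned int handle, const char *name)
{
    for (size_t i = 0; i < SNAP_MAX; i++) {
        if (!snap_store[i].in_use) {
            snap_store[i].in_use = 1;
            snap_store[i].handle = handle;
            strncpy(snap_store[i].name, name, sizeof(snap_store[i].name) - 1);
            snap_store[i].name[sizeof(snap_store[i].name) - 1] = '\0';
            return 0;
        }
    }
    return -1;
}

int snap_del(unsigned int handle)
{
    struct snap_object *o = snap_lookup(handle);
    if (o == NULL) {
        return -1;
    }
    o->in_use = 0;
    return 0;
}

/* Copy the handles of all currently matching objects into the iterator. */
void snap_find_create(const char *name, struct snap_find *f)
{
    f->count = 0;
    f->pos = 0;
    strncpy(f->name, name, sizeof(f->name) - 1);
    f->name[sizeof(f->name) - 1] = '\0';
    for (size_t i = 0; i < SNAP_MAX; i++) {
        if (snap_store[i].in_use && strcmp(snap_store[i].name, f->name) == 0) {
            f->handles[f->count++] = snap_store[i].handle;
        }
    }
}

/* Return the next handle that still exists AND still has the searched name. */
int snap_find_next(struct snap_find *f, unsigned int *out)
{
    assert(f->count <= SNAP_MAX);
    while (f->pos < f->count) {
        unsigned int h = f->handles[f->pos++];
        const struct snap_object *o = snap_lookup(h);
        if (o != NULL && strcmp(o->name, f->name) == 0) {
            *out = h;
            return 0;
        }
        /* deleted, or handle reused by a differently named object: skip */
    }
    return -1;
}
```

Because the iterator holds handles rather than pointers into the store, a deleted object cannot leave it with a dangling pointer, and the name comparison filters out a handle that was reused for an unrelated object in the meantime.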
Version-Release number of selected component (if applicable):
How reproducible:
Often, but it is a race, so it depends on the hardware, ... Problem 1 is visible in valgrind.
Steps to Reproduce:
# for i in `seq 1 5`;do (while true;do corosync-objctl -a | grep closed;done)& done
# corosync -f
Created attachment 503907
Patch for second problem
Created attachment 503909
test-confdb patch that checks for the first problem under valgrind
Corosync must be running under valgrind
Patches posted to ML
Created attachment 504088
First patch backported to the current RHEL 6 package
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
Cause: The internal confdb data storage system had incorrect mutual exclusion, which resulted in a race condition.
Consequence: Corosync would segfault under rare and contrived circumstances.
Fix: The race condition was fixed.
Result: Corosync no longer segfaults.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.