Red Hat Bugzilla – Bug 712115
corosync confdb connection can cause segfault
Last modified: 2011-12-06 06:51:01 EST
Created attachment 503906 [details]
Patch for first problem

Description of problem:

Problem 1:
In confdb_object_iter, the result of object_find_create is now properly checked. object_find_create can return -1 if the object doesn't exist. Without this check, an invalid handle (memory garbage) was passed directly to object_find_next.

Problem 2:
The following situation could happen:
- process 1 creates a find handle through confdb
- process 1 calls the find iteration once
- a different process 2 deletes the object pointed to by process 1's iterator
- process 1 calls the iteration again -> object_find_instance->find_child_list is an invalid pointer -> segfault

Now object_find_create creates an array of matching object handles, and object_find_next uses that array together with a check of the object name. This prevents the situation where, between steps 2 and 3, a new object is created with a different name but, unfortunately, with the same handle.

Version-Release number of selected component (if applicable):
Corosync master

How reproducible:
Often, but it is a race, so it depends on hardware, ... Problem 1 is visible in valgrind.

Steps to Reproduce:
One node.
# for i in `seq 1 5`;do (while true;do corosync-objctl -a | grep closed;done)& done
# corosync -f

Actual results:
segfault

Expected results:
no segfault

Additional info:
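The two fixes described above can be illustrated with a small standalone sketch. This is not the corosync objdb code or the attached patches: the object table, struct find_ctx, and the functions object_create, object_destroy, find_create and find_next are hypothetical names made up for this illustration, and returning -1 from find_create when nothing matches is a simplifying assumption. It only shows the idea of iterating over a snapshot of handles and re-checking the name, plus checking the create result before iterating.

```c
#include <assert.h>
#include <string.h>

#define MAX_OBJECTS 8
#define NAME_LEN 32

/* Hypothetical object table: a slot (handle) is reused once in_use is 0. */
struct object {
    int in_use;
    char name[NAME_LEN];
};

static struct object objects[MAX_OBJECTS];

/* Hypothetical find context: a private snapshot of matching handles. */
struct find_ctx {
    char name[NAME_LEN];
    int handles[MAX_OBJECTS];
    int count;
    int pos;
};

/* Create an object in the first free slot; return its handle, or -1. */
int object_create(const char *name)
{
    for (int h = 0; h < MAX_OBJECTS; h++) {
        if (!objects[h].in_use) {
            objects[h].in_use = 1;
            strncpy(objects[h].name, name, NAME_LEN - 1);
            objects[h].name[NAME_LEN - 1] = '\0';
            return h;
        }
    }
    return -1;
}

/* Free a slot so that a later object_create may reuse the same handle. */
void object_destroy(int handle)
{
    if (handle >= 0 && handle < MAX_OBJECTS)
        objects[handle].in_use = 0;
}

/* Snapshot all handles whose name matches. Returns 0 on success and -1
 * if nothing matches; the caller must check this before iterating. */
int find_create(struct find_ctx *ctx, const char *name)
{
    assert(ctx != NULL && name != NULL);
    memset(ctx, 0, sizeof(*ctx));
    strncpy(ctx->name, name, NAME_LEN - 1);
    for (int h = 0; h < MAX_OBJECTS; h++) {
        if (objects[h].in_use && strcmp(objects[h].name, ctx->name) == 0)
            ctx->handles[ctx->count++] = h;
    }
    return ctx->count > 0 ? 0 : -1;
}

/* Return the next snapshotted handle that still exists and still carries
 * the searched name; skip stale or recycled handles. Returns -1 when done. */
int find_next(struct find_ctx *ctx)
{
    assert(ctx != NULL);
    while (ctx->pos < ctx->count) {
        int h = ctx->handles[ctx->pos++];
        if (objects[h].in_use && strcmp(objects[h].name, ctx->name) == 0)
            return h;
    }
    return -1;
}
```

In this sketch, if two objects named "logging" are created, iteration is started and advanced once, the second object is destroyed, and a new object named "totem" then reuses its slot, the next find_next call skips the recycled handle because the name no longer matches, instead of following a dangling pointer.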
Created attachment 503907 [details] Patch for second problem
Created attachment 503909 [details]
test-confdb patch that checks for the first problem under valgrind

Corosync must be running under valgrind.
Patches posted to the mailing list.
Created attachment 504088 [details]
First patch backported to current RHEL 6 package
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: A race condition existed in the internal confdb data storage system because of incorrect mutual exclusion.
Consequence: Corosync would segfault under rare and contrived circumstances.
Fix: The race condition was fixed.
Result: Corosync no longer segfaults.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2011-1515.html