Bug 712115

Summary:

corosync confdb connection can cause segfault

Product:

Red Hat Enterprise Linux 6

Reporter:

Jan Friesse <jfriesse>

Component:

corosync

Assignee:

Jan Friesse <jfriesse>

Status:

CLOSED ERRATA

QA Contact:

Cluster QE <mspqa-list>

Severity:

high

Priority:

high

Version:

6.1

CC:

cluster-maint, djansa, jkortus, sdake

Target Release:

6.2

Hardware:

Unspecified

OS:

Unspecified

Fixed In Version:

corosync-1.4.0-1.el6

Doc Type:

Bug Fix

Doc Text:

Cause: Incorrect mutual exclusion in the internal confdb data storage system caused a race condition. Consequence: Corosync would segfault under rare and contrived circumstances. Fix: The race condition was fixed. Result: Corosync no longer segfaults.

Last Closed:

2011-12-06 11:51:01 UTC

Attachments:

Description	Flags
Patch for first problem	none
Patch for second problem	none
test-confdb patch which checks first problem in valgrind	none
First patch backported to current RHEL 6 package	none

Description Jan Friesse 2011-06-09 14:32:59 UTC

Created attachment 503906
Patch for first problem

Description of problem:
Problem 1:
In confdb_object_iter, the result of object_find_create is now properly
checked. object_find_create can return -1 if the object doesn't exist.
Without this check, an incorrect handle (memory garbage) was passed
directly to object_find_next.
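
The check described in Problem 1 can be illustrated with a minimal sketch in C. The toy_* names and the array-backed store are hypothetical stand-ins chosen for illustration; this is not the corosync objdb code or the attached patch. It only shows the pattern of refusing to use a find handle when its creation failed.

```c
#include <stddef.h>

#define TOY_MAX_OBJECTS 4

/* Hypothetical miniature object store; not corosync's objdb. */
struct toy_store {
    int exists[TOY_MAX_OBJECTS];   /* 1 if the object with this index exists */
};

/* Mimics the contract described above: returns 0 and fills *find_handle
 * on success, returns -1 and leaves *find_handle untouched on failure. */
static int toy_find_create(const struct toy_store *store, int parent,
                           int *find_handle)
{
    if (parent < 0 || parent >= TOY_MAX_OBJECTS || !store->exists[parent]) {
        return -1;
    }
    *find_handle = parent;
    return 0;
}

/* Mimics the follow-up call, which trusts the handle it is given. */
static int toy_find_next(const struct toy_store *store, int find_handle)
{
    return store->exists[find_handle] ? find_handle : -1;
}

/* Checked iteration: the find handle is used only if creation succeeded. */
static int toy_iter(const struct toy_store *store, int parent)
{
    int find_handle;   /* deliberately uninitialized, as in the bug */

    if (toy_find_create(store, parent, &find_handle) != 0) {
        return -1;     /* without this check, garbage would be passed on */
    }
    return toy_find_next(store, find_handle);
}
```

Calling toy_iter with a parent that does not exist returns -1 instead of handing an uninitialized value to toy_find_next.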

Problem 2:
The following situation could happen:
- process 1 creates a find handle through confdb
- process 1 calls find iteration once
- a different process 2 deletes the object that the process 1 iterator points to
- process 1 calls iteration again ->
  object_find_instance->find_child_list is an invalid pointer

-> segfault

Now object_find_create creates an array of matching object handles, and
object_find_next uses that array together with a check of the object name.
The name check guards against the situation where, between steps 2 and 3,
a new object is created with a different name but, unfortunately, with the
same handle.
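
The approach described for Problem 2 can be sketched as follows. This is a minimal illustration in C under stated assumptions: the toy_* names, the fixed-size table, and the use of a slot index as the handle are all hypothetical and are not taken from corosync or the attached patch. The find handle snapshots the matching handles at creation time, and each iteration step re-checks that the handle is still live and still carries the searched name.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_OBJECTS 8
#define TOY_NAME_LEN 16

/* Hypothetical miniature object table; not corosync's objdb. A slot's
 * index plays the role of the object handle, and freed slots are reused. */
struct toy_object {
    int in_use;
    char name[TOY_NAME_LEN];
};

struct toy_find {
    int handles[TOY_MAX_OBJECTS];  /* snapshot taken at creation time */
    size_t count;
    size_t pos;
    char name[TOY_NAME_LEN];       /* name the search was created for */
};

/* Copies a name with guaranteed termination (long names are truncated). */
static void toy_copy_name(char *dst, const char *src)
{
    strncpy(dst, src, TOY_NAME_LEN - 1);
    dst[TOY_NAME_LEN - 1] = '\0';
}

/* Creates an object in the first free slot; returns its handle or -1. */
static int toy_create(struct toy_object *table, const char *name)
{
    for (int h = 0; h < TOY_MAX_OBJECTS; h++) {
        if (!table[h].in_use) {
            table[h].in_use = 1;
            toy_copy_name(table[h].name, name);
            return h;
        }
    }
    return -1;
}

static void toy_destroy(struct toy_object *table, int handle)
{
    table[handle].in_use = 0;
}

/* Records the handles of all objects that match the name right now. */
static void toy_find_create(const struct toy_object *table, const char *name,
                            struct toy_find *find)
{
    find->count = 0;
    find->pos = 0;
    toy_copy_name(find->name, name);
    for (int h = 0; h < TOY_MAX_OBJECTS; h++) {
        if (table[h].in_use && strcmp(table[h].name, find->name) == 0) {
            find->handles[find->count++] = h;
        }
    }
}

/* Returns the next handle that is still live and still has the searched
 * name, or -1 when the snapshot is exhausted. */
static int toy_find_next(const struct toy_object *table, struct toy_find *find)
{
    while (find->pos < find->count) {
        int h = find->handles[find->pos++];
        if (table[h].in_use && strcmp(table[h].name, find->name) == 0) {
            return h;
        }
    }
    return -1;
}
```

In this sketch, if an object is destroyed after the snapshot and its slot is reused by an object with a different name, toy_find_next skips it rather than returning a handle that now refers to something else. The sketch does not model locking between processes.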

Version-Release number of selected component (if applicable):
Corosync master

How reproducible:
Often, but it is a race, so it depends on the hardware, ... Problem 1 is visible in valgrind.

Steps to Reproduce:
On a single node:
# for i in `seq 1 5`;do (while true;do corosync-objctl -a | grep closed;done)& done 
# corosync -f
  
Actual results:
segfault

Expected results:
no segfault

Additional info:

Comment 1 Jan Friesse 2011-06-09 14:33:43 UTC

Created attachment 503907
Patch for second problem

Comment 2 Jan Friesse 2011-06-09 14:35:39 UTC

Created attachment 503909
test-confdb patch which checks first problem in valgrind

Corosync must be running under valgrind.

Comment 3 Jan Friesse 2011-06-09 14:36:01 UTC

Patches posted to the mailing list.

Comment 5 Jan Friesse 2011-06-10 12:15:28 UTC

Created attachment 504088
First patch backported to current RHEL 6 package

Comment 10 Steven Dake 2011-10-27 18:47:37 UTC

    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.

    New Contents:
    Cause: Incorrect mutual exclusion in the internal confdb data storage system caused a race condition.
    Consequence: Corosync would segfault under rare and contrived circumstances.
    Fix: The race condition was fixed.
    Result: Corosync no longer segfaults.

Comment 11 errata-xmlrpc 2011-12-06 11:51:01 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1515.html