Bug 606335

Summary: glibc detected *** corosync: malloc(): memory corruption
Product: Red Hat Enterprise Linux 6
Reporter: Milos Malik <mmalik>
Component: corosync
Assignee: Jan Friesse <jfriesse>
Status: CLOSED CURRENTRELEASE
QA Contact: Cluster QE <mspqa-list>
Severity: urgent
Priority: urgent
Version: 6.0
CC: cluster-maint, sdake, ssaha
Target Milestone: rc
Hardware: All
OS: Linux
Fixed In Version: corosync-1.2.3-6.el6
Doc Type: Bug Fix
Last Closed: 2010-11-10 22:07:16 UTC
Bug Depends On: 271561
Attachments: Proposed patch

Description Milos Malik 2010-06-21 12:50:04 UTC
Description of problem:


Version-Release number of selected component (if applicable):
corosync-1.2.3-2.el6.i686
corosynclib-1.2.3-2.el6.i686

How reproducible:
always

Steps to Reproduce:
# mv /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
# service corosync status
corosync is stopped
# service corosync start
Starting Corosync Cluster Engine (corosync): *** glibc detected *** corosync: free(): invalid next size (fast): 0x09b34138 ***
*** glibc detected *** corosync: malloc(): memory corruption: 0x09b34148 ***

  
Actual results:


Expected results:

Comment 3 Steven Dake 2010-06-21 16:47:07 UTC
Did you set SELinux to permissive mode before reproducing this bug? There are currently issues with SELinux and "service corosync start". Can you try "corosync -f" instead?
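
For reference, a minimal way to run that check (assuming a stock RHEL 6 box) is to switch SELinux to permissive mode and start corosync in the foreground:

# setenforce 0
# corosync -f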

Comment 4 Nate Straz 2010-06-21 17:21:45 UTC
SELinux issues for the entire cluster stack are being handled in bug 271561.  There is currently one outstanding issue regarding the way corosync libraries create a communications socket with corosync.

Comment 5 Steven Dake 2010-06-21 19:27:49 UTC
The issue with corosync and SELinux at the moment is that "service corosync start" fails to start corosync and produces all sorts of AVC denials.

Milos,

Can you verify you tried with permissive mode and received this error?

Comment 6 Milos Malik 2010-06-22 06:19:33 UTC
This bug seems to be reproducible in enforcing mode only:

# setenforce 0
# service corosync start
Starting Corosync Cluster Engine (corosync): [  OK  ]
# service corosync stop
Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ]
Waiting for corosync services to unload:.[  OK  ]
# setenforce 1
# service corosync start
Starting Corosync Cluster Engine (corosync): *** glibc detected *** corosync: free(): invalid next size (fast): 0x094dc260 ***
*** glibc detected *** corosync: malloc(): memory corruption: 0x094dc270 ***

Comment 7 Jan Friesse 2010-06-22 09:13:27 UTC
Created attachment 425871 [details]
Proposed patch

The main problem was hidden in the call to pathconf() (which internally calls statfs()), which can fail. After this failure, the memory newly allocated for readdir_r() was smaller than expected and was overwritten by readdir_r().

The patch removes the call to pathconf() and instead uses the NAME_MAX constant, which is always large enough for all file systems.

Also, the return value of malloc() SHOULD be checked.
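
For illustration, a minimal sketch of the allocation pattern described above (not the actual corosync code; the helper names are made up):

/* Broken: the readdir_r() buffer is sized from pathconf(), which can
 * fail and return -1, making the allocation far too small. */
#include <dirent.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

static struct dirent *alloc_entry_broken(const char *path)
{
        long name_max = pathconf(path, _PC_NAME_MAX); /* may be -1 on failure */
        return malloc(offsetof(struct dirent, d_name) + name_max + 1);
}

/* Fixed: NAME_MAX is a compile-time constant that is large enough for
 * any file system; callers should still check the malloc() result. */
static struct dirent *alloc_entry_fixed(void)
{
        return malloc(offsetof(struct dirent, d_name) + NAME_MAX + 1);
}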

Comment 8 Jan Friesse 2010-06-23 08:45:58 UTC
Patch committed revision 2962.

Comment 10 Nate Straz 2010-07-08 20:58:35 UTC
[root@morph-01 ~]# rpm -q corosync
corosync-1.2.3-9.el6.i686
[root@morph-01 ~]# setenforce 1
[root@morph-01 ~]# sestatus
SELinux status:                 enabled
SELinuxfs mount:                /selinux
Current mode:                   enforcing
Mode from config file:          permissive
Policy version:                 24
Policy from config file:        targeted
[root@morph-01 ~]# service corosync start
Starting Corosync Cluster Engine (corosync): [  OK  ]
[root@morph-01 ~]# service corosync status
corosync (pid  1820) is running...

Comment 11 releng-rhel@redhat.com 2010-11-10 22:07:16 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.