Description of problem:
calloc() sometimes returns buffers that are not zero-filled. This happens only when the process address space is locked by a call to mlockall().

When the glibc calloc implementation knows that part of a buffer it is about to return has just been allocated by growing the heap (acquiring new, zero-filled pages from the kernel), it does not clear that part of the buffer, on the assumption that it is already zeroed. When the C heap is shrunk, memory that is no longer needed is marked with a call to madvise() with the MADV_DONTNEED flag. If that space is reused later, the allocator assumes it is zero-filled and does not clear it on calloc(). However, when the process memory is locked (by a call to mlockall()), the kernel considers madvise(MADV_DONTNEED) unsupported and returns an error, which glibc ignores.

Roughly, the chain of events that leads to the crash is:
1. The application allocates a number of buffers with malloc() etc. and uses them.
2. When it is done, it calls free() on those buffers. The C heap is shrunk, and the space those buffers occupied is released with madvise(MADV_DONTNEED).
3. The application calls calloc(). The C heap needs to grow again and reclaims some of the space released in step 2. Because the space comes from an extension of the heap, calloc() does not fill the buffer with zeros.

When the madvise() call in step 2 succeeds, all of this works fine, because pages marked with MADV_DONTNEED are zero-filled when they are accessed again. When the madvise() call in step 2 fails because the application's memory is locked, calloc() in step 3 returns a buffer filled with garbage.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Do you get this same behavior when running the standard RHEL5 kernel? Just trying to initially narrow things down to kernel vs glibc.
There is not enough information contained in this bug report for us to understand the problem and attempt to resolve. Please provide the test case and accompanying descriptive information.
I've looked at the problem. Setting aside the misleading subject, I found a problem that is likely the cause of the observed issue: we have to be less optimistic about the state of memory in arenas other than the main arena when madvise is used. The upstream glibc CVS contains a fix. Testing with the next rawhide build, once it's done, would be appreciated.
Just checked to see where specifically the package can be found; the answer is that it has not yet been built and made available externally. This bugzilla will be updated when the new package is available. In the meantime, here are some instructions for getting the latest available packages from Fedora rawhide. The easiest route is to download from koji: go to http://koji.fedoraproject.org/koji/, search for glibc (in the search field), select the most recent version, and then download the individual RPMs.
I'll try the fix either with rawhide or the CVS libc and let you know if the issue is gone.
Backported in glibc-2.5-20.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
qe_ack+ for rhel5.2. We'll need a testcase (or at least some testing hints), or verification from Sun (comment 6).
I've tried the 2.7 glibc from http://koji.fedoraproject.org/koji/buildinfo?buildID=31131. No more crashes. The bug is fixed as far as I can tell.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0083.html