Description of problem:
calloc() sometimes returns buffers that are not zero-filled. This happens only
when the process address space is locked by a call to mlockall.
When the glibc calloc implementation knows that part of a buffer it is about to
return has just been allocated by growing the heap (acquiring new pages from the
kernel that are zero filled), it does not clear that part of the buffer (because
presumably it is already filled with zeros).
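To make the optimization concrete, here is a toy model of it. This is only a sketch: `struct toy_chunk`, `fresh_from` and `toy_calloc_clear` are hypothetical names made up for illustration and are not glibc's internal data structures or code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of a heap chunk, not glibc's real layout. */
struct toy_chunk {
    unsigned char *mem;   /* start of the user buffer */
    size_t size;          /* size of the user buffer in bytes */
    size_t fresh_from;    /* bytes at offset >= fresh_from are ASSUMED to be
                             fresh, zero-filled pages from the kernel */
};

/* Clear only the part of the buffer that is not assumed to be fresh.
   Returns the number of bytes that were actually cleared. */
static size_t toy_calloc_clear(struct toy_chunk *c)
{
    size_t to_clear = c->fresh_from < c->size ? c->fresh_from : c->size;
    memset(c->mem, 0, to_clear);
    return to_clear;
}
```

If the assumption about the "fresh" part is wrong, whatever was in that part of the buffer is handed back to the caller untouched.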
When the C-heap is shrunk, memory not needed anymore is marked with a
call to madvise with the MADV_DONTNEED flag.
If this space is used again later, the allocator assumes it is filled with
zeros and does not clear it on a calloc. However, when
memory is locked (with a call to mlockall), the kernel considers madvise
with the MADV_DONTNEED flag to be unsupported and returns an error, which
libc ignores.
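As an illustration of why the ignored return value matters, the sketch below checks the result of madvise and clears the range by hand when the kernel refuses. `discard_or_clear` is a hypothetical helper written for this report; it is not what glibc does and not the upstream fix. It assumes Linux and a private anonymous mapping with a page-aligned address, which is the case where a successful MADV_DONTNEED gives zero-filled pages on the next access.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: try to hand the pages back to the kernel, and fall
   back to clearing them if the kernel rejects the request (for example
   because the range is locked).
   Returns 0 if madvise succeeded, 1 if the range was cleared by hand.
   Either way the caller may treat the range as zero-filled afterwards. */
static int discard_or_clear(void *addr, size_t len)
{
    if (madvise(addr, len, MADV_DONTNEED) == 0)
        return 0;

    /* madvise failed: the old contents are still there, so the
       "this memory is zero" assumption only holds if we clear it. */
    memset(addr, 0, len);
    return 1;
}
```

The point is only that the zero-fill assumption holds when madvise returns 0; on failure the old contents are still in place.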
Roughly, the chain of events that leads to the crash is:
1- the application allocates a bunch of buffers with malloc etc. and uses them
2- when it's done, it calls free on those buffers. The C-heap is shrunk. The
space that composed those buffers is released by calling madvise(MADV_DONTNEED).
3- the application calls calloc(). The C-heap needs to be grown again. It
reclaims some of the space that was released by the call to madvise. Because it
is the result of an extension of the heap, calloc does not fill the buffer with
zeros.
When the madvise call in 2- succeeds, all this works fine because space marked
with madvise is filled with zeros when it is accessed again. When the madvise
call in 2- fails because the memory of the application is locked, calloc() in
3- returns a buffer filled with garbage.
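The three steps above can be sketched as a small check. This is not the reporter's test case (none is attached here), just a minimal sketch: the function name `count_nonzero_after_reuse`, its parameters and the 0xAA fill pattern are assumptions made for illustration. Since the problem appears to involve arenas other than the main arena, the function would probably have to be called from a secondary thread on an affected glibc to have a chance of showing the problem.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical check: returns the number of non-zero bytes found in a
   calloc'ed buffer after a malloc/free cycle, or -1 on setup failure.
   count * size is assumed not to overflow size_t. */
static long count_nonzero_after_reuse(int lock_memory, size_t count, size_t size)
{
    /* Optionally lock the address space, as the affected application does. */
    if (lock_memory && mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;

    void **bufs = malloc(count * sizeof *bufs);
    if (bufs == NULL)
        return -1;

    /* 1- allocate a bunch of buffers and dirty them. Volatile stores keep
       the compiler from dropping writes to memory that is freed next. */
    size_t got = 0;
    for (; got < count; got++) {
        bufs[got] = malloc(size);
        if (bufs[got] == NULL)
            break;
        volatile unsigned char *v = bufs[got];
        for (size_t j = 0; j < size; j++)
            v[j] = 0xAA;
    }

    /* 2- free them, so that the heap can be shrunk. */
    for (size_t i = 0; i < got; i++)
        free(bufs[i]);
    free(bufs);
    if (got < count)
        return -1;

    /* 3- calloc again and look for garbage. Volatile reads keep the
       compiler from assuming the result of calloc is zero. */
    volatile unsigned char *p = calloc(count, size);
    if (p == NULL)
        return -1;
    long nonzero = 0;
    for (size_t i = 0; i < count * size; i++)
        if (p[i] != 0)
            nonzero++;
    free((void *)p);
    return nonzero;
}
```

A correct calloc makes this return 0 whether or not memory is locked; a positive value would mean the buffer was not zero-filled.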
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Do you get this same behavior when running the standard RHEL5 kernel? Just
trying to initially narrow things down to kernel vs glibc.
There is not enough information contained in this bug report for us to
understand the problem and attempt to resolve. Please provide the test case and
accompanying descriptive information.
I've looked at the problem. After ignoring the misleading subject I've found a
problem which is likely the cause for the issue which is observed. We have to
be less optimistic about the state of memory in arenas other than the main arena
when madvise is used. The upstream glibc cvs contains a fix. Testing with the
next rawhide build when it's done would be appreciated.
Just checked to see where specifically the package can be found. The answer was
that this package has not yet been built and provided externally. This bugzilla
will be updated when the new package is available.
Here are some instructions on accessing Fedora rawhide to get the latest
available packages. The easiest way is to download from koji. Go to
http://koji.fedoraproject.org/koji/, search for glibc (in the search field),
select the most recent version, and then download the individual RPMs.
I'll try the fix either with rawhide or the cvs libc and let you know if the
issue is gone.
Backported in glibc-2.5-20.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
release.
qe_ack+ for rhel5.2
we'll need some testcase (or at least some testing hints), or verification from
Sun (comment 6)
I've tried the 2.7 glibc from:
No more crashes. The bug is fixed as far as I can tell.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.