Bug 450094 - Patch for bug 360281 "Odd behaviour in mmap" introduces regression
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
x86_64 Linux
Priority: urgent, Severity: medium
: rc
Assigned To: Vitaly Mayatskikh
Martin Jenner
: ZStream
Depends On:
Blocks: 450759 450760
Reported: 2008-06-05 07:08 EDT by Vitaly Mayatskikh
Modified: 2010-10-22 21:42 EDT (History)
Fixed In Version: RHSA-2008-0665
Doc Type: Bug Fix
Last Closed: 2008-07-24 15:30:06 EDT

Attachments
proposed patch (2.14 KB, patch)
2008-06-07 07:47 EDT, Vitaly Mayatskikh

Description Vitaly Mayatskikh 2008-06-05 07:08:01 EDT
A process hang while waiting on a semaphore was found in the latest RHEL4U6
async kernels (the upcoming RHEL-4.7 is also affected). The hung processes
appear to be waiting for mm->mmap_sem, e.g.:

Jun  3 16:09:05 atlddm19 kernel: ps            D ffffffff8030ee5c 0 22869 22825                     (NOTLB)
Jun  3 16:09:05 atlddm19 kernel: 00000102feb8fdd8 0000000000000006 00000102feb8fd98 0006000000000002
Jun  3 16:09:05 atlddm19 kernel:        00000102feb8fda8 ffffffff801d4610 0000000100000001 0000000200000048
Jun  3 16:09:05 atlddm19 kernel:        00000103f42d0030 0000000000d41df1
Jun  3 16:09:05 atlddm19 kernel: Call
Jun  3 16:09:05 atlddm19 kernel:
Jun  3 16:09:05 atlddm19 kernel: <ffffffff801ae9bb>{proc_info_read+85} <ffffffff8017b0d0>{vfs_read+207}
Jun  3 16:09:05 atlddm19 kernel: <ffffffff8017b32c>{sys_read+69} <ffffffff8011026a>{system_call+126}

The customer reports that this was not seen on the earlier
2.6.9-67.0.7 kernel. Looking more closely at the show-CPU SysRq output
that was sent, I see the following process holding mmap_sem:


Takahiro Yasui found a difference between generic and arch-specific
implementations of arch_get_unmapped_area_topdown():

Comment 8 Linda Wang 2008-06-06 09:33:34 EDT
Vitaly, can you please post the patch for review?
Comment 9 Vitaly Mayatskikh 2008-06-06 09:45:25 EDT
Now I'm not sure whether this is a new bug. I don't know which condition causes
this loop, and I have no fix for it at the moment.
Comment 10 Vitaly Mayatskikh 2008-06-06 20:13:22 EDT
OK, it is a loop in arch_get_unmapped_area_topdown():

        do {
                /*
                 * Lookup failure means no vma is above this address,
                 * else if new region fits below vma->vm_start,
                 * return with success:
                 */
                vma = find_vma(mm, addr);
                if (!vma || addr+len <= vma->vm_start)
                        /* remember the address as a hint for next time */
                        return (mm->free_area_cache = addr);

                /* remember the largest hole we saw so far */
                if (addr + mm->cached_hole_size < vma->vm_start)
                        mm->cached_hole_size = vma->vm_start - addr;

                /* try just below the current vma->vm_start */
                addr = vma->vm_start-len;
        } while (len <= vma->vm_start);

The condition in the "while" statement is absolutely correct. However,
find_vma_prev() does not report a lookup failure!

/* Same as find_vma, but also return a pointer to the previous VMA in *pprev. */
struct vm_area_struct *
find_vma_prev(struct mm_struct *mm, unsigned long addr,
                        struct vm_area_struct **pprev)
{
        struct vm_area_struct *vma = NULL, *prev = NULL;
        struct rb_node * rb_node;
        if (!mm)
                goto out;
        /* Guard against addr being lower than the first VMA */
        vma = mm->mmap;
        /* Go through the RB tree quickly. */
        rb_node = mm->mm_rb.rb_node;
        while (rb_node) {
                struct vm_area_struct *vma_tmp;
                vma_tmp = rb_entry(rb_node, struct vm_area_struct, vm_rb);
                if (addr < vma_tmp->vm_end) {
                        rb_node = rb_node->rb_left;
                } else {
                        prev = vma_tmp;
                        if (!prev->vm_next || (addr < prev->vm_next->vm_end))
                                break;
                        rb_node = rb_node->rb_right;
                }
        }

out:
        *pprev = prev;
        return prev ? prev->vm_next : vma;
}

So, when there is no vma below the given address, find_vma_prev() just returns
the first vma. I don't understand the comment "Guard against addr being lower
than the first VMA" or what it tries to guard against, but every user of
find_vma_prev() checks the return value for NULL.
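
To illustrate the behaviour described above, here is a minimal user-space
sketch, not the kernel code. It assumes a plain sorted linked list instead of
the RB tree, and the names struct model_vma, model_find_vma_prev() and the two
demo helpers are hypothetical, introduced only for this example. It shows that
a lookup for an address below every vma still returns the first vma (with
*pprev set to NULL), and that NULL comes back only when the address lies at or
above the end of the last vma.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct vm_area_struct: only the fields used here. */
struct model_vma {
        unsigned long vm_start;
        unsigned long vm_end;
        struct model_vma *vm_next;
};

/*
 * List-based model of find_vma_prev(): return the first vma with
 * addr < vm_end and store its predecessor (or NULL) in *pprev.
 * This is an assumption-level simplification of the RB-tree walk.
 */
static struct model_vma *
model_find_vma_prev(struct model_vma *first, unsigned long addr,
                    struct model_vma **pprev)
{
        struct model_vma *prev = NULL, *vma = first;

        while (vma && addr >= vma->vm_end) {
                prev = vma;
                vma = vma->vm_next;
        }
        *pprev = prev;
        return vma;
}

/* Returns 1 if an address below every vma yields the first vma, not NULL. */
int model_lookup_below_first_is_nonnull(void)
{
        struct model_vma high = { 0x5000, 0x6000, NULL };
        struct model_vma low = { 0x1000, 0x2000, &high };
        struct model_vma *prev;
        struct model_vma *vma = model_find_vma_prev(&low, 0x0800, &prev);

        return vma == &low && prev == NULL;
}

/* Returns 1 if an address above every vma yields NULL with prev == last vma. */
int model_lookup_above_last_is_null(void)
{
        struct model_vma high = { 0x5000, 0x6000, NULL };
        struct model_vma low = { 0x1000, 0x2000, &high };
        struct model_vma *prev;
        struct model_vma *vma = model_find_vma_prev(&low, 0x7000, &prev);

        return vma == NULL && prev == &high;
}
```

In this model a caller that walks downward and waits for a NULL return as its
"nothing below" signal never receives one; it has to look at *pprev or compare
addresses instead.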
Comment 12 Vitaly Mayatskikh 2008-06-07 07:47:15 EDT
Created attachment 308606
proposed patch
Comment 19 Vivek Goyal 2008-06-12 14:32:21 EDT
Committed in 73.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Comment 22 errata-xmlrpc 2008-07-24 15:30:06 EDT
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

