Bug 450094

Summary: Patch for bug 360281 "Odd behaviour in mmap" introduces regression
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.7
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: urgent
Target Milestone: rc
Keywords: ZStream
Whiteboard: GSSApproved
Fixed In Version: RHSA-2008-0665
Doc Type: Bug Fix
Reporter: Vitaly Mayatskikh <vmayatsk>
Assignee: Vitaly Mayatskikh <vmayatsk>
QA Contact: Martin Jenner <mjenner>
CC: ahecox, jplans, qcai, sprabhu, tao
Last Closed: 2008-07-24 19:30:06 UTC
Bug Blocks: 450759, 450760
Attachments: proposed patch

Description Vitaly Mayatskikh 2008-06-05 11:08:01 UTC
A process hang waiting on a semaphore was found in the latest RHEL4U6 async kernels
(the upcoming RHEL-4.7 is also affected). The hung processes appear to be waiting
for mm->mmap_sem, e.g.:

Jun  3 16:09:05 atlddm19 kernel: ps            D ffffffff8030ee5c     0 22869 22825                     (NOTLB)
Jun  3 16:09:05 atlddm19 kernel: 00000102feb8fdd8 0000000000000006 00000102feb8fd98 0006000000000002
Jun  3 16:09:05 atlddm19 kernel:        00000102feb8fda8 ffffffff801d4610 0000000100000001 0000000200000048
Jun  3 16:09:05 atlddm19 kernel:        00000103f42d0030 0000000000d41df1
Jun  3 16:09:05 atlddm19 kernel: Call Trace:<ffffffff801d4610>{avc_has_perm+70} <ffffffff803104be>{__down_read+134}
Jun  3 16:09:05 atlddm19 kernel:        <ffffffff8013fe93>{access_process_vm+90} <ffffffff801ae429>{proc_pid_cmdline+99}
Jun  3 16:09:05 atlddm19 kernel:        <ffffffff801ae9bb>{proc_info_read+85} <ffffffff8017b0d0>{vfs_read+207}
Jun  3 16:09:05 atlddm19 kernel:        <ffffffff8017b32c>{sys_read+69} <ffffffff8011026a>{system_call+126}

The customer reports that this was not seen with the earlier
2.6.9-67.0.7 kernel. On closer look at the show-cpus sysrq output that was
sent, I see the following process holding the mmap_sem:

<ffffffff8023c8b4>{showacpu+45}
<ffffffff8011c6f2>{smp_call_function_interrupt+64}
<ffffffff80110b69>{call_function_interrupt+133}
<ffffffff8011700c>{arch_get_unmapped_area_topdown+0}
<ffffffff8016d4d4>{find_vma_prev+26}
<ffffffff8011710e>{arch_get_unmapped_area_topdown+258}
<ffffffff8016e358>{do_mmap_pgoff+333}
<ffffffff803103ce>{__down_write+52}
<ffffffff801284b9>{sys32_mmap2+252}
<ffffffff801265bb>{sysenter_do_call+27}

Takahiro Yasui found a difference between generic and arch-specific
implementations of arch_get_unmapped_area_topdown():

http://post-office.corp.redhat.com/archives/rhkernel-list/2008-March/msg01248.html

Comment 8 Linda Wang 2008-06-06 13:33:34 UTC
Vitaly, can you please post the patch for review?

Comment 9 Vitaly Mayatskikh 2008-06-06 13:45:25 UTC
Now I'm not sure whether this is a new bug. I don't know which condition causes this
loop and have no fix for it at the moment.

Comment 10 Vitaly Mayatskikh 2008-06-07 00:13:22 UTC
Ok, it is a loop in the arch_get_unmapped_area_topdown().

        do {
                /*
                 * Lookup failure means no vma is above this address,
                 * else if new region fits below vma->vm_start,
                 * return with success:
                 */
                vma = find_vma(mm, addr);
                if (!vma || addr+len <= vma->vm_start)
                        /* remember the address as a hint for next time */
                        return (mm->free_area_cache = addr);

                /* remember the largest hole we saw so far */
                if (addr + mm->cached_hole_size < vma->vm_start)
                        mm->cached_hole_size = vma->vm_start - addr;

                /* try just below the current vma->vm_start */
                addr = vma->vm_start-len;
        } while (len <= vma->vm_start);

The condition in the "while" statement is absolutely correct. However,
find_vma_prev() never produces a lookup failure!

/* Same as find_vma, but also return a pointer to the previous VMA in *pprev. */
struct vm_area_struct *
find_vma_prev(struct mm_struct *mm, unsigned long addr,
                        struct vm_area_struct **pprev)
{
        struct vm_area_struct *vma = NULL, *prev = NULL;
        struct rb_node * rb_node;
        if (!mm)
                goto out;
        
        /* Guard against addr being lower than the first VMA */
        vma = mm->mmap;
                                
        /* Go through the RB tree quickly. */
        rb_node = mm->mm_rb.rb_node;
        
        while (rb_node) {
                struct vm_area_struct *vma_tmp;
                vma_tmp = rb_entry(rb_node, struct vm_area_struct, vm_rb);
         
                if (addr < vma_tmp->vm_end) {
                        rb_node = rb_node->rb_left;
                } else {
                        prev = vma_tmp;
                        if (!prev->vm_next || (addr < prev->vm_next->vm_end))
                                break;
                        rb_node = rb_node->rb_right;
                }
        }

out:
        *pprev = prev;
        return prev ? prev->vm_next : vma;
}


So, when there is no vma below the given address, find_vma_prev() just returns
the first vma. I don't understand the comment "Guard against addr being lower
than the first VMA" or what it is trying to guard against, but every caller of
find_vma_prev() checks the return value for NULL.

Comment 12 Vitaly Mayatskikh 2008-06-07 11:47:15 UTC
Created attachment 308606 [details]
proposed patch

Comment 19 Vivek Goyal 2008-06-12 18:32:21 UTC
Committed in 73.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 22 errata-xmlrpc 2008-07-24 19:30:06 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-0665.html