Bug 552257

Summary: Process-shared futex on a huge page causes livelock
Product: [Fedora] Fedora
Reporter: r6144 <rainy6144>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 12
CC: anton, dougsland, gansalmon, itamar, kernel-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-12-04 00:52:57 UTC
Attachments: Test program (flags: none)

Description r6144 2010-01-04 14:10:25 UTC
Created attachment 381551: Test program

Description of problem:

The attached multi-threaded program, which uses a process-shared semaphore, locks up when the semaphore resides on a huge page.  The affected process cannot be interrupted or killed and uses about 250% CPU, although the rest of the system remains usable.

Strace output shows that the lockup occurs on a futex that is not FUTEX_PRIVATE and resides in a huge page (the page starting at 0x1800000 in this case):

[pid 20513] futex(0x1802220, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 20514] futex(0x1802220, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 20512] <... futex resumed> )       = 1
[pid 20512] nanosleep({0, 100000000}, 0x7fff140f0890) = 0
[pid 20512] futex(0x1802220, FUTEX_WAKE, 1^C <unfinished ...>
^C^C^C^C^C^C^C^C^C^C^Z
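
As a small illustration of how the futex address in the trace relates to the huge page address mentioned above, the following sketch rounds an address down to its huge page boundary. The function names are made up for this example, and the 2 MiB page size is the x86_64 value stated later in this report.

```c
#include <stdint.h>

/* Round an address down to the start of the huge page that contains it.
 * page_size must be a power of two (2 MiB = 0x200000 on the reporter's
 * x86_64 system).  Hypothetical helper, not part of any library. */
uintptr_t huge_page_base(uintptr_t addr, uintptr_t page_size)
{
    return addr & ~(page_size - 1);
}

/* Offset of the address within its huge page. */
uintptr_t huge_page_offset(uintptr_t addr, uintptr_t page_size)
{
    return addr & (page_size - 1);
}
```

For the futex address 0x1802220 from the trace, huge_page_base(0x1802220, 0x200000) gives 0x1800000 and huge_page_offset gives 0x2220.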

Apparently the bug does not occur without huge pages, or if only process-private futexes are used.
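
For readers without access to the attachment, here is a minimal sketch of the difference between the two cases. It is not the attached test program; sem_handoff and wait_for_work are names invented for this example. The only thing that changes between the cases is the pshared argument of sem_init(): a non-zero value asks for a process-shared semaphore, for which glibc is expected to issue futex calls without the private flag, as in the strace output above.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

/* Worker: block on the semaphore until the main thread posts it.
 * Returns its argument on success, NULL on failure. */
static void *wait_for_work(void *arg)
{
    sem_t *sem = arg;
    int rc;

    do {
        rc = sem_wait(sem);              /* may sleep in futex(FUTEX_WAIT) */
    } while (rc != 0 && errno == EINTR); /* retry if a signal interrupted us */
    return rc == 0 ? arg : NULL;
}

/* Hand one unit of work to a worker thread through a heap-allocated
 * semaphore.  pshared != 0 requests a process-shared semaphore.
 * Returns 0 on success, -1 on failure. */
int sem_handoff(int pshared)
{
    /* With the reporter's LD_PRELOAD/HUGETLB_MORECORE setup, the heap is
     * expected to be backed by huge pages (assumption taken from the report). */
    sem_t *sem = malloc(sizeof *sem);
    pthread_t worker;
    void *result = NULL;
    int posted = 0, got_work = 0;

    if (sem == NULL)
        return -1;
    if (sem_init(sem, pshared, 0) != 0) {
        free(sem);
        return -1;
    }
    if (pthread_create(&worker, NULL, wait_for_work, sem) == 0) {
        posted = (sem_post(sem) == 0);   /* wakes the worker if it sleeps */
        if (!posted)
            pthread_cancel(worker);      /* do not leave the worker blocked */
        pthread_join(worker, &result);
        got_work = (result == sem);
    }
    sem_destroy(sem);
    free(sem);
    return (posted && got_work) ? 0 : -1;
}
```

Build with -pthread. On normal pages both sem_handoff(1) and sem_handoff(0) should return 0; the hang described in this report would only be expected with pshared set to a non-zero value, the semaphore on a huge page, and an affected kernel.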

oprofile shows a large number of calls to __wake_up(), read_hpet(), get_futex_key(), etc.

A related bug regarding futexes on huge pages is described in http://osdir.com/ml/linux-kernel/2009-07/msg04679.html, but the fix is apparently already in my kernel.

(This bug affects me because my numerical simulation program, which uses process-shared semaphores in fftw, runs quite a bit faster on huge pages, and this bug makes this impossible.  On Fedora 10 there was no problem.)

Version-Release number of selected component (if applicable):
kernel-2.6.31.9-174.fc12.x86_64
libhugetlbfs-devel-2.6-3.fc12.x86_64
glib2-devel-2.22.3-2.fc12.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install the packages shown above.
2. Set up hugetlbfs by running the following as root:

# hugeadm --create-global-mounts
# hugeadm --pool-pages-min=2M:10

(This is an x86_64 system, so the huge page size is 2 MiB.  If the system has been running for a long time, the second command can fail and/or cause the system to thrash for minutes, even though I have set up a 760 MB ZONE_MOVABLE (according to /proc/zoneinfo) and have turned on the sysctl vm.hugepages_treat_as_movable.  Is this a bug as well?)

3. Compile and run the attached program with:

$ gcc -O2 -Wall -Wextra -fopenmp -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include test_sem.c -pthread -lgthread-2.0 -lrt -lglib-2.0 -lrt -o test_sem
$ LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=yes ./test_sem
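
Since step 2 can fail, it may help to confirm that the pool was actually populated before running step 3. The sketch below extracts a numeric field such as HugePages_Total or Hugepagesize from text in the /proc/meminfo format. meminfo_value is a hypothetical helper written for this example; reading the file into a buffer is left to the caller.

```c
#include <stdlib.h>
#include <string.h>

/* Return the numeric value of "field" from text in /proc/meminfo format
 * ("Name:   value [unit]" per line), or -1 if the field is not present.
 * Hypothetical helper for illustration only. */
long meminfo_value(const char *text, const char *field)
{
    size_t len = strlen(field);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* A match needs the field name followed directly by a colon. */
        if (strncmp(line, field, len) == 0 && line[len] == ':')
            return strtol(line + len + 1, NULL, 10); /* skips the padding */
        line = strchr(line, '\n');                   /* advance to next line */
        if (line != NULL)
            line++;
    }
    return -1;
}
```

After the hugeadm commands in step 2 succeed, HugePages_Total would be expected to be at least 10 and Hugepagesize to be 2048 (kB) on this system.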
  
Actual results:

Nothing is displayed, the test_sem process hangs with 250% CPU usage, and it cannot be killed or interrupted.

Expected results:

Several "Thread * got work." messages should be displayed, after which the test_sem process will hang, but it should use no CPU and be killable with Ctrl+C.

Additional info:

Comment 1 r6144 2010-01-23 13:10:16 UTC
The above test case still fails in kernel-2.6.31.12-174.2.3.fc12.x86_64.  Looking at ftrace results with the function graph tracer, it seems that the kernel is stuck in the following loop in get_futex_key():

again:
        err = get_user_pages_fast(address, 1, rw == VERIFY_WRITE, &page);
        if (err < 0)
                return err;

        page = compound_head(page);
        lock_page(page);
        if (!page->mapping) {
                unlock_page(page);
                put_page(page);
                goto again;
        }

Comment 2 r6144 2010-02-07 13:50:33 UTC
(I have not bisected the bug, and the following is just my speculation based on a limited understanding of the Linux VM.  Please correct me if I'm wrong.)

This looks like a regression introduced in commit 38d47c1b ("futex: rely on get_user_pages() for shared futexes"), which was merged in 2.6.29.  Previously, get_futex_key() deemed a page to be anonymous if it came from a private mapping, which was reliable but required mmap_sem.  With that patch, page->mapping is checked directly instead.  However, anonymous huge pages are created in hugetlb.c:hugetlb_cow(), and their page->mapping is apparently never set, so the above code goes into an infinite loop.
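
To make the speculation above concrete, here is a userspace toy model of the retry loop quoted in comment 1. It is not kernel code: struct model_page and model_get_futex_key are invented for this sketch, and an attempt limit is added so that the model terminates, whereas the quoted kernel loop has no such limit.

```c
#include <stddef.h>

/* Toy stand-in for the kernel's struct page; only the field that the
 * quoted loop inspects is modelled. */
struct model_page {
    const void *mapping;   /* NULL models a page whose mapping was never set */
};

/* Bounded model of the "again:" loop.  Returns the number of attempts
 * used when the page has a mapping, or -1 when max_attempts is exhausted
 * (which corresponds to the kernel spinning forever). */
int model_get_futex_key(const struct model_page *page, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        /* kernel: get_user_pages_fast(); compound_head(); lock_page(); */
        if (page->mapping != NULL)
            return attempt;        /* loop exits, key can be computed */
        /* kernel: unlock_page(); put_page(); goto again; */
    }
    return -1;
}
```

A page with a non-NULL mapping leaves the loop on the first attempt. A page whose mapping stays NULL never does, regardless of how many attempts are allowed, which would match the uninterruptible spinning seen in the traces.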

Comment 3 r6144 2010-04-16 09:07:31 UTC
The bug still exists in kernel 2.6.32.11-99.fc12 and libhugetlbfs-2.8-1.fc12.

Comment 4 r6144 2010-04-20 09:18:52 UTC
Reported upstream on LKML.

Comment 5 Chuck Ebbert 2010-04-21 20:48:38 UTC
Proposed fix:

https://patchwork.kernel.org/patch/93525/

Comment 6 Chuck Ebbert 2010-04-27 12:11:58 UTC
Fixed in 2.6.32.12-112

Comment 7 Fedora Update System 2010-04-28 04:38:06 UTC
kernel-2.6.32.12-114.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/kernel-2.6.32.12-114.fc12

Comment 8 Fedora Update System 2010-05-17 05:49:39 UTC
kernel-2.6.32.12-115.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/kernel-2.6.32.12-115.fc12

Comment 9 Fedora Update System 2010-05-18 21:58:33 UTC
kernel-2.6.32.12-115.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 10 Bug Zapper 2010-11-04 01:53:32 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 11 Bug Zapper 2010-12-04 00:52:57 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.