New POSIX semaphore destruction semantics
Previously, the implementation of POSIX semaphores in *glibc* did not follow the current POSIX requirement that semaphores be self-synchronizing. As a consequence, the *sem_post()* and *sem_wait()* functions could terminate unexpectedly or return the EINVAL error code because they accessed the semaphore after it had been destroyed. This update provides an implementation of the new POSIX semaphore destruction semantics that keeps track of waiters, avoiding premature destruction of the semaphore. The semaphores implemented by *glibc* are now self-synchronizing, which fixes this bug.
Description Martin Schuppert
2013-11-06 15:41:55 UTC
Description of problem:
There appears to be a race in the implementation of sem_post/sem_wait (nptl/sysdeps/unix/sysv/linux/x86_64/sem_post.S in the source code) which sometimes causes sem_post to access freed memory and to fail with EINVAL.
In a nutshell, if the thread in sem_post is descheduled right after it increments sem->value but before it reads sem->nwaiters, another thread can pass through sem_wait without blocking and then destroy and free the semaphore. When the sem_post thread resumes and reads sem->nwaiters, it is reading already-freed (and possibly unmapped) memory.
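The usage pattern that hits this race can be sketched as follows. This is not the attached sem.c, which is not reproduced in this report. It is a minimal reconstruction that reuses only the names visible in the gdb trace (poster, varsem, and the sem_post(varsem) call); the run_once helper, the error handling, and returning the errno value instead of exiting are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>
#include <stdlib.h>

/* Heap-allocated semaphore shared by both threads. The name follows the
 * gdb trace; the rest of this file is an assumed reconstruction. */
static sem_t *varsem;

/* Poster thread: posts once and hands back the errno of a failed post. */
static void *poster(void *unused)
{
    (void)unused;
    assert(varsem != NULL); /* set by run_once() before the thread starts */
    if (sem_post(varsem) != 0)
        return (void *)(intptr_t)errno;
    return NULL;
}

/* One round of the pattern (hypothetical helper).
 * Returns 0 on success or an errno value on failure. */
static int run_once(void)
{
    pthread_t tid;
    void *result = NULL;
    int rc;

    varsem = malloc(sizeof *varsem);
    if (varsem == NULL)
        return ENOMEM;
    if (sem_init(varsem, 0, 0) != 0) {
        rc = errno;
        free(varsem);
        return rc;
    }

    rc = pthread_create(&tid, NULL, poster, NULL);
    if (rc != 0) {
        sem_destroy(varsem);
        free(varsem);
        return rc;
    }

    /* Wait for the post, retrying if a signal interrupts the wait. */
    while (sem_wait(varsem) != 0 && errno == EINTR)
        ;

    /* The waiter destroys and frees the semaphore as soon as it has been
     * woken. The poster may still be inside sem_post() at this point. */
    sem_destroy(varsem);
    free(varsem);

    rc = pthread_join(tid, &result);
    if (rc != 0)
        return rc;
    return (int)(intptr_t)result;
}
```

With a self-synchronizing implementation, run_once() is expected to return 0 every time. On an affected glibc the window is only a few instructions wide, so a plain loop over run_once() may fail very rarely; the gdb steps in this report force the interleaving instead.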
Version-Release number of selected component (if applicable):
* RHEL6.4
* glibc-2.12-1.107.el6_4.5.x86_64
How reproducible:
always
Steps to Reproduce:
1. Compile the code in the attachment ( gcc -Wall -g sem.c -lpthread -o sem )
2. Stop the poster thread in sem_post right after it increments sem->value but before it looks at sem->nwaiters
Needs gdb 7.6 to reproduce (the one from devtoolset-2 can be used)
# /opt/rh/devtoolset-2/root/usr/bin/gdb ./sem
GNU gdb (GDB) Red Hat Enterprise Linux (7.6-34.el6)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/sem...done.
(gdb) set target-async 1
(gdb) set pagination off
(gdb) set non-stop on
(gdb) b poster
Breakpoint 1 at 0x400860: file sem.c, line 11.
(gdb) b sem_wait
Breakpoint 2 at 0x4006a0
(gdb) r
Starting program: /root/sem
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6_4.5.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff7fee700 (LWP 5486)]
Breakpoint 2, 0x000000380400d6f0 in sem_wait () from /lib64/libpthread.so.0
(gdb)
Breakpoint 1, poster (unused=0x0) at sem.c:11
11 if (sem_post(varsem) != 0) {
(gdb) disas sem_post
Dump of assembler code for function sem_post:
0x000000380400d990 <+0>: mov (%rdi),%eax
0x000000380400d992 <+2>: cmp $0x7fffffff,%eax
0x000000380400d997 <+7>: je 0x380400d9cc <sem_post+60>
0x000000380400d999 <+9>: lea 0x1(%rax),%esi
0x000000380400d99c <+12>: lock cmpxchg %esi,(%rdi)
0x000000380400d9a0 <+16>: jne 0x380400d992 <sem_post+2>
0x000000380400d9a2 <+18>: cmpq $0x0,0x8(%rdi)
0x000000380400d9a7 <+23>: je 0x380400d9c2 <sem_post+50>
0x000000380400d9a9 <+25>: mov $0xca,%eax
0x000000380400d9ae <+30>: mov $0x1,%esi
0x000000380400d9b3 <+35>: or 0x4(%rdi),%esi
0x000000380400d9b6 <+38>: mov $0x1,%edx
0x000000380400d9bb <+43>: syscall
0x000000380400d9bd <+45>: test %rax,%rax
0x000000380400d9c0 <+48>: js 0x380400d9c5 <sem_post+53>
0x000000380400d9c2 <+50>: xor %eax,%eax
0x000000380400d9c4 <+52>: retq
0x000000380400d9c5 <+53>: mov $0x16,%eax
0x000000380400d9ca <+58>: jmp 0x380400d9d1 <sem_post+65>
0x000000380400d9cc <+60>: mov $0x4b,%eax
0x000000380400d9d1 <+65>: mov 0x20a5b0(%rip),%rdx # 0x3804217f88
0x000000380400d9d8 <+72>: mov %eax,%fs:(%rdx)
0x000000380400d9db <+75>: or $0xffffffff,%eax
0x000000380400d9de <+78>: retq
End of assembler dump.
(gdb) b *(sem_post+18) thread 2
Breakpoint 3 at 0x380400d9a2
(gdb) t 2
[Switching to thread 2 (Thread 0x7ffff7fee700 (LWP 5486))]
#0 poster (unused=0x0) at sem.c:11
11 if (sem_post(varsem) != 0) {
(gdb) c
Continuing.
Breakpoint 3, 0x000000380400d9a2 in sem_post () from /lib64/libpthread.so.0
(gdb) b free thread 1
Breakpoint 4 at 0x3803415f40 (2 locations)
(gdb) t 1
[Switching to thread 1 (Thread 0x7ffff7ff0700 (LWP 5482))]
#0 0x000000380400d6f0 in sem_wait () from /lib64/libpthread.so.0
(gdb) c
Continuing.
Breakpoint 4, 0x0000003803c7b710 in free () from /lib64/libc.so.6
(gdb) t 2
[Switching to thread 2 (Thread 0x7ffff7fee700 (LWP 5486))]
#0 0x000000380400d9a2 in sem_post () from /lib64/libpthread.so.0
(gdb) c
Continuing.
sem_post() in poster: Invalid argument
[Thread 0x7ffff7fee700 (LWP 5486) exited]
[Inferior 1 (process 5482) exited with code 01]
(gdb)
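For readers following the disassembly in the gdb session above, the instruction sequence corresponds roughly to the C sketch below. It is a hand translation for illustration only, not the glibc source: the struct name, field names, and function name are hypothetical, the field offsets are inferred from the (%rdi), 0x4(%rdi), and 0x8(%rdi) operands, and the layout check assumes x86_64 Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical layout inferred from the operands in the disassembly. */
struct sketch_sem {
    atomic_uint value;      /* (%rdi):    semaphore count */
    int private_flag;       /* 0x4(%rdi): OR-ed into the futex operation */
    atomic_ulong nwaiters;  /* 0x8(%rdi): number of blocked waiters */
};

static_assert(offsetof(struct sketch_sem, private_flag) == 4, "layout");
static_assert(offsetof(struct sketch_sem, nwaiters) == 8, "layout");

static int sketch_sem_post(struct sketch_sem *sem)
{
    unsigned int cur = atomic_load(&sem->value);

    /* sem_post+2 .. +16: increment the count with a compare-and-swap
     * loop, failing with EOVERFLOW (0x4b) at 0x7fffffff. */
    do {
        if (cur == 0x7fffffffu) {
            errno = EOVERFLOW;
            return -1;
        }
    } while (!atomic_compare_exchange_weak(&sem->value, &cur, cur + 1));

    /* sem_post+18: breakpoint 3 in the session stops here. The count is
     * already incremented, so a thread in sem_wait can return, destroy
     * and free the semaphore before the next line reads nwaiters. */
    if (atomic_load(&sem->nwaiters) == 0)
        return 0;

    /* sem_post+25 .. +48: futex syscall (0xca) with FUTEX_WAKE (1) for
     * one waiter; any failure is reported as EINVAL (0x16). */
    if (syscall(SYS_futex, &sem->value, FUTEX_WAKE | sem->private_flag,
                1, NULL, NULL, 0) < 0) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}
```

Read this way, the EINVAL in the session comes from the last branch: after free(), the words at offsets 4 and 8 hold whatever the allocator left there, so the futex call can be issued with a garbage operation or on memory that is no longer valid.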
Actual results:
sem_post fails with errno set to EINVAL or, when the freed memory has already been unmapped (munmap) or returned to the system via sbrk, crashes with SIGSEGV
Expected results:
sem_post returns successfully
Additional info:
Tested on F18, RHEL6 and RHEL5
* glibc-2.16-34.fc18
* glibc-2.12-*
* glibc-2.5-*
Also discussed on sourceware.org: Bug 12674 - sem_post/sem_wait race causing sem_post to return EINVAL ( https://sourceware.org/bugzilla/show_bug.cgi?id=12674 )
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHSA-2016-2573.html