Bug 1401665 - Fix process shared robust mutex defects. [NEEDINFO]
Summary: Fix process shared robust mutex defects.
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: glibc
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: rc
Target Release: 7.6
Assignee: Carlos O'Donell
QA Contact: Sergey Kolosov
Docs Contact: Tomas Capek
URL:
Whiteboard:
Keywords: Patch
Duplicates: 1591087
Depends On:
Blocks: 1420851 1477664 1505884 1405095
 
Reported: 2016-12-05 19:34 UTC by VL
Modified: 2018-11-02 14:28 UTC (History)

Fixes of bugs in process-shared robust mutex

Previously, bugs in the implementation of POSIX thread robust mutexes could cause such mutexes to fail to be recoverable. Consequently, when a process terminated while holding such a mutex, another process that was trying to recover the mutex might never receive *EOWNERDEAD*. This update fixes the threading library and all known and fixable bugs in robust mutexes.
Clone Of:
Clones: 1405095
Last Closed: 2018-10-30 09:36:24 UTC
tcapek: needinfo? (codonell)


Attachments
Test program (4.62 KB, text/x-csrc), 2016-12-05 19:34 UTC, VL
Test Code v2 (5.43 KB, text/plain), 2016-12-17 04:55 UTC, Amitay Isaacs
Test Code v3 (7.13 KB, text/x-csrc), 2018-06-13 19:45 UTC, Carlos O'Donell
Robust mutex test (v4) (9.84 KB, text/x-csrc), 2018-08-28 14:10 UTC, Carlos O'Donell


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:3092 None None None 2018-10-30 09:38 UTC
Sourceware 14485 None None None 2019-04-24 18:48 UTC
Sourceware 17463 None None None 2019-04-24 18:48 UTC
Sourceware 19089 None None None 2019-04-24 18:48 UTC
Sourceware 19402 None None None 2019-04-24 18:48 UTC
Sourceware 20973 None None None 2019-04-24 18:48 UTC
Sourceware 20985 None None None 2019-04-24 18:48 UTC
Sourceware 21083 None None None 2019-04-24 18:48 UTC
Sourceware 21778 None None None 2019-04-24 18:48 UTC
Red Hat Bugzilla 1405095 None CLOSED Process shared robust mutexes miss wakeup events 2019-04-24 18:48 UTC
Red Hat Bugzilla 1591087 None CLOSED glibc: SIGABRT when qemu-kvm calls sem_timedwait() 2019-04-24 18:48 UTC

Internal Trackers: 1405095 1591087

Description VL 2016-12-05 19:34:59 UTC
Created attachment 1228146
Test program

Description of problem:

Process-shared robust mutexes can fail to wake up callers waiting in mutex_lock when a process dies under SIGKILL.

Version-Release number of selected component (if applicable):


How reproducible:

Running the test program in a shell loop (while : ; do ... ; done) for a few seconds shows the problem.

Steps to Reproduce:
1. Compile the attached program test_mutex_raw.c
2. Run it in a tight while true ; do ./test_mutex_raw; done
3. Wait a few iterations, watch it hang

Actual results:

At some point the program hangs with a few children sitting in a kernel mutex lock call with nobody holding the lock

Expected results:

Reliably run the program forever

Additional info:

https://lists.samba.org/archive/samba-technical/2016-December/117466.html, together with the follow-ups in that thread, has some background. Under FreeBSD 11 with robust mutexes it runs fine. The program was found to hang under RHEL7 and RHEL6 (unknown subversions; the test was not done by me), Debian Jessie x86_64 as a VMware guest, and also on a Raspberry Pi 2 with Debian Testing as of December 1.

Comment 1 VL 2016-12-05 19:38:45 UTC
An additional comment: I've assigned it to glibc first, because that is the point of contact for us (Samba/tdb). I know that robust mutexes are closely related to both glibc and the kernel, so it might well be that this is in fact a kernel issue.

Comment 2 Florian Weimer 2016-12-05 19:48:35 UTC
(In reply to VL from comment #0)
> 1. Compile the attached program test_mutex_raw.c
> 2. Run it in a tight while true ; do ./test_mutex_raw; done

Could you please double-check if you have attached the right reproducer?  The C program expects a few command line arguments.

Comment 3 VL 2016-12-05 19:59:09 UTC
I am so sorry to waste your time. In the additional info there was the correct line:

while true ; do ./bin/test_mutex_raw /tmp/foobar 10 0 ; done

Please tell me about your exact paths where you are compiling the program and where your account on your test machine is allowed to create files, such that I can get you better information on how to reproduce the issue.

Comment 4 Florian Weimer 2016-12-06 09:15:10 UTC
I can reproduce this.  The test case looks like it should not hang.

Comment 6 Torvald Riegel 2016-12-07 16:38:55 UTC
This may be related to statements about design issues in robust mutexes:
https://sourceware.org/ml/libc-alpha/2015-04/msg00151.html

I haven't had time to review these statements in detail yet, so I can't yet comment in detail.

Comment 7 Michael Adam 2016-12-13 23:21:26 UTC
It seems to me that this glibc bug is related if not actually the same
phenomenon:

https://sourceware.org/bugzilla/show_bug.cgi?id=19402

@Torvald, it seems that the class of problems that the post you cited mentions is more about data corruption than about hanging, but it may have related causes.

(e.g. https://sourceware.org/bugzilla/show_bug.cgi?id=14485)

Is there any libc expert who can look more deeply into these issues?
We do have two reproducers, and this is really critical for us (Samba).

Thanks!

Comment 8 Torvald Riegel 2016-12-14 11:56:00 UTC
(In reply to Michael Adam from comment #7)
> Is there any libc expert who can look more deeply into these issues?
> We do have two reproducers, and this is really critical for us (Samba).

I'm currently reviewing the synchronization in robust mutexes (and I'm glibc's concurrency maintainer).  I believe I have found the bug that the test case attached to this BZ triggers.  This one can be fixed rather easily.

However, I've found other bugs in the synchronization that need to be fixed too.  One is a problem in the kernel, there's another potential problem in glibc (although that may happen to just work right now), and the biggest one is that the mutex destruction requirements are not guaranteed currently (which will need both glibc and kernel changes to be fixed).

So, stay tuned :)

Comment 9 Michael Adam 2016-12-14 13:17:21 UTC
is this one related?

https://bugzilla.redhat.com/show_bug.cgi?id=1302389

Comment 10 Torvald Riegel 2016-12-15 22:09:20 UTC
I created an upstream bug for this bug here:
https://sourceware.org/bugzilla/show_bug.cgi?id=20973

(In reply to Michael Adam from comment #9)
> is this one related?
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1302389

No, this seems to be a separate issue.  I'm looking into that one as well.

Comment 11 Carlos O'Donell 2016-12-16 00:17:20 UTC
We understand this issue is critical for Samba and we are continuing to look at the robust shared mutex support code in glibc and the linux kernel. When we have a concrete result we'll update this issue.

Comment 13 Torvald Riegel 2016-12-16 10:51:28 UTC
swbz 19402 is a separate bug, at least regarding the reproducer posted in this bug here.  The reproducer from this bug worked flawlessly for me for an hour after applying this patch to upstream glibc:
https://sourceware.org/ml/libc-alpha/2016-12/msg00587.html

Comment 14 Florian Weimer 2016-12-16 14:51:07 UTC
The test case has its own race condition:

	while (sig_child_count < num_children) {
		if (sig_received) {
			int status;

			while (1) {
				pid = waitpid(-1, &status, WNOHANG);
				if (pid <= 0) {
					break;
				}

				sig_child_count += 1;
			}
			sig_received = false;
		}
		usleep(100*1000);
	}

The assignment to sig_received could hide the arrival of a new SIGCHLD signal and the update of the same variable in the signal handler.

Comment 15 Torvald Riegel 2016-12-16 22:10:07 UTC
(In reply to Florian Weimer from comment #14)
> The assignment to sig_received could hide the arrival of a new SIGCHLD
> signal and the update of the same variable in the signal handler.

This is true (and I've observed this to happen while testing).  I've been testing with a loop like this one:

	while (sig_child_count < num_children) {
	    int status;
	    pid = wait (&status);
	    if (pid < 0)
	      {
		if (errno == EINTR)
		  continue;
		fprintf (stderr, "wait failed\n");
		exit (1);
	      }
	    sig_child_count++;
	}

Comment 16 Amitay Isaacs 2016-12-17 04:55 UTC
Created attachment 1232820
Test Code v2

Comment 17 Amitay Isaacs 2016-12-17 04:58:05 UTC
(In reply to Torvald Riegel from comment #15)
> (In reply to Florian Weimer from comment #14)
> > The assignment to sig_received could hide the arrival of a new SIGCHLD
> > signal and the update of the same variable in the signal handler.
> 
> This is true (and I've observed this to happen while testing).  I've been
> testing with a loop like this one:

I have attached a slightly modified version of the test code which does not use a signal handler.  This one should be free from the race condition.

Comment 19 Carlos O'Donell 2016-12-19 23:58:13 UTC
We have a build of a test fix for this issue on top of what will be released as the final rhel-6.9 glibc.

For now this fix is not for use in production environments and should only be used to validate in RHEL 6 that the issue is fixed.

You can download the new RPMs by adding a new repo here:

cat >> /etc/yum.repos.d/rhbz1401665.repo <<EOF
[rhbz1401665]
name=RHEL 6.9 hotfix for bug 1401665
baseurl=http://people.redhat.com/codonell/rhel-6.9-rhbz1401665
enabled=1
gpgcheck=0
protect=1
EOF

Then you can upgrade glibc, e.g. with yum upgrade glibc, to install the new packages.

The fixed packages include Torvald's upstream fix from comment #13 and have gone through some minimal engineering sanity testing along with validation that the fixes allow the reproducer in comment #16 to run without issue.

Comment 20 Carlos O'Donell 2016-12-20 00:48:31 UTC
We have a build of a test fix for this issue on top of the rhel-7.3 glibc.

For now this fix is not for use in production environments and should only be used to validate in RHEL 7 that the issue is fixed.

You can download the new RPMs by adding a new repo here:

cat >> /etc/yum.repos.d/rhbz1401665.repo <<EOF
[rhbz1401665]
name=RHEL 7.3 hotfix for bug 1401665
baseurl=http://people.redhat.com/codonell/rhel-7.3-rhbz1401665
enabled=1
gpgcheck=0
protect=1
EOF

Then you can upgrade glibc, e.g. with yum upgrade glibc, to install the new packages.

The fixed packages include Torvald's upstream fix from comment #13 and have gone through some minimal engineering sanity testing along with validation that the fixes allow the reproducer in comment #16 to run without issue.

Comment 22 Amitay Isaacs 2017-02-27 01:02:07 UTC
I have verified with RHEL 6.9 hotfix RPMs that the bug is fixed.

What versions of RHEL6 and RHEL7 will have this fix backported?

Comment 23 Carlos O'Donell 2017-03-09 03:31:46 UTC
(In reply to Amitay Isaacs from comment #22)
> I have verified with RHEL 6.9 hotfix RPMs that the bug is fixed.
> 
> What versions of RHEL6 and RHEL7 will have this fix backported?

We will review this bug for inclusion in RHEL 7.5 and RHEL 6.10.

Comment 31 Carlos O'Donell 2018-02-27 17:21:51 UTC
This issue is included in our Red Hat Enterprise Linux 7.6 review, and the robust mutex fixes will be considered for backporting here after we evaluate the risk and depth of the backport.

Comment 34 Carlos O'Donell 2018-06-13 09:06:35 UTC
The following upstream bug fixes were backported:

Bug 20973 - robust mutexes: Lost wake-ups
https://sourceware.org/bugzilla/show_bug.cgi?id=20973
- Commit 353683a22ed8a493a6bd1d78d63e144bc3e85d2f

Bug 20985 - robust mutexes: lowlevelrobustlock assembly on x86 blocks on wrong condition
https://sourceware.org/bugzilla/show_bug.cgi?id=20985
- Commit 65810f0ef05e8c9e333f17a44e77808b163ca298, and
  35df5a77f3ad2a35761631928440d2994a9e4bc5 (typo fix).
- Yes, RHEL 7.6 is using the lll_futex_timed_wait fallback, which should be fine.

Bug 19402 - Deadlock with robust shared mutex and asynchronous termination
https://sourceware.org/bugzilla/show_bug.cgi?id=19402
- Commit 8e31cafb268938729a1314806a924d73fb1991c5

"Add compiler barriers around modifications of the robust mutex list."
- Commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e

Bug 21778 - mutex: Fix robust mutex lock acquire
https://sourceware.org/bugzilla/show_bug.cgi?id=21778
- Commit 5920a4a624b1f4db310d1c44997b640e2a4653e5
- Fixes bug in the latest fix.

For testing we add:

commit faf0e9c84119742dd9ebb79060faa22c52ae80a1
Author: Florian Weimer <fweimer@redhat.com>
Date:   Fri Jan 27 06:53:19 2017 +0100

    nptl: Add tst-robust-fork

Which tests the fork issues.

Otherwise we rely on the existing tests to verify robust mutex support.

The 'test code v2' attached to this issue will still hang, but always with a thread as owner of the robust mutex.

Comment 35 Carlos O'Donell 2018-06-13 19:45 UTC
Created attachment 1451011
Test Code v3

I rewrote the test code to use a single shared memory segment and to avoid some of the undefined behaviour, such as initializing the same mutex twice, which happened when the low and high priority processes started (both did the initialization, and that's dangerous). Now we instead start an "init" process that sets up the shared memory segment, and everything else just uses it without truncation or unlinking (same mutex).

Comment 36 Carlos O'Donell 2018-06-14 12:32:26 UTC
I am going to drop these two fixes for rhel-7.6 since they have caused some regression in the semaphore uses which rely on the existing assembly futex operations.

Bug 20985 - robust mutexes: lowlevelrobustlock assembly on x86 blocks on wrong condition
https://sourceware.org/bugzilla/show_bug.cgi?id=20985
- Commit 65810f0ef05e8c9e333f17a44e77808b163ca298, and
  35df5a77f3ad2a35761631928440d2994a9e4bc5 (typo fix).
- Yes, RHEL 7.6 is using the lll_futex_timed_wait fallback, which should be fine.

"Add compiler barriers around modifications of the robust mutex list."
- Commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e

We will file another issue to go back to the fix for bug 20985 in rhel-7.7.

At the very least in rhel-7.6 we will fix the lost wakeup and the fork synchronization issues.

Comment 37 Carlos O'Donell 2018-06-14 18:51:45 UTC
(In reply to Carlos O'Donell from comment #36)
> I am going to drop these two fixes for rhel-7.6 since they have caused some
> regression in the semaphore uses which rely on the existing assembly futex
> operations.
> 
> Bug 20985 - robust mutexes: lowlevelrobustlock assembly on x86 blocks on
> wrong condition
> https://sourceware.org/bugzilla/show_bug.cgi?id=20985
> - Commit 65810f0ef05e8c9e333f17a44e77808b163ca298, and
>   35df5a77f3ad2a35761631928440d2994a9e4bc5 (typo fix).
> - Yes, RHEL 7.6 is using the lll_futex_timed_wait fallback, which should be
> fine.
> 
> "Add compiler barriers around modifications of the robust mutex list."
> - Commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
> 
> We will file another issue to go back to the fix for bug 20985 in rhel-7.7.
> 
> At the very least in rhel-7.6 we will fix the lost wakeup and the fork
> synchronization issues.

Thanks to Florian Weimer we found the source of the bug that was blocking these issues. I've verified with the v3 reproducer that it now passes and runs for hours without any failure on boxes with more than 8 cores. Looks like we have stable robust mutexes in RHEL 7.6 for the set of problems we set out to solve this release.

Comment 38 Carlos O'Donell 2018-06-14 19:43:24 UTC
All fixed in -258 now.

Comment 39 Carlos O'Donell 2018-06-14 19:45:01 UTC
*** Bug 1591087 has been marked as a duplicate of this bug. ***

Comment 41 Carlos O'Donell 2018-08-28 14:10 UTC
Created attachment 1479264
Robust mutex test (v4)

Comment 44 errata-xmlrpc 2018-10-30 09:36:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3092

