1401665 – Fix process shared robust mutex defects.

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	glibc
Sub Component:
Version:	7.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	high
Target Milestone:	rc
Target Release:	7.6
Assignee:	Carlos O'Donell
QA Contact:	Sergey Kolosov
Docs Contact:	Tomas Capek
URL:
Whiteboard:
Duplicates (1):	1591087
Depends On:
Blocks:	1405095 1420851 1477664 1505884

Reported:	2016-12-05 19:34 UTC by VL
Modified:	2021-03-11 14:51 UTC
CC List:	23 users
Fixed In Version:	glibc-2.17-258
Doc Type:	Bug Fix
Doc Text:	Fixes for bugs in process-shared robust mutexes. Previously, bugs in the implementation of POSIX thread robust mutexes could cause such mutexes to fail to be recoverable. Consequently, users of the interface could see a process terminate without EOWNERDEAD being returned to another process that was trying to recover the mutex. This update fixes all known and fixable bugs in robust mutexes in the threading library.
Clone Of:
Clones:	1405095
Environment:
Last Closed:	2018-10-30 09:36:24 UTC
Target Upstream Version:
Embargoed:

Attachments
Test program (4.62 KB, text/x-csrc) 2016-12-05 19:34 UTC, VL
Test Code v2 (5.43 KB, text/plain) 2016-12-17 04:55 UTC, Amitay Isaacs
Test Code v3 (7.13 KB, text/x-csrc) 2018-06-13 19:45 UTC, Carlos O'Donell
Robust mutex test (v4) (9.84 KB, text/x-csrc) 2018-08-28 14:10 UTC, Carlos O'Donell

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1405095	unspecified	CLOSED	Process shared robust mutexes miss wakeup events	2021-02-22 00:41:40 UTC
Red Hat Bugzilla	1591087	unspecified	CLOSED	glibc: SIGABRT when qemu-kvm calls sem_timedwait()	2021-02-22 00:41:40 UTC
Red Hat Product Errata	RHSA-2018:3092	None	None	None	2018-10-30 09:38:20 UTC
Sourceware	14485	P2	ASSIGNED	File corruption race condition in robust mutex unlocking	2020-12-22 08:50:42 UTC
Sourceware	17463	P2	NEW	mutexes owned by a terminated thread are supposed to be owned by an imaginary thread	2020-12-22 08:50:42 UTC
Sourceware	19089	P2	NEW	Robust mutexes do not take ROBUST_LIST_LIMIT into account	2020-12-22 08:50:44 UTC
Sourceware	19402	P2	RESOLVED	Deadlock with robust shared mutex and asynchronous termination	2020-12-22 08:50:44 UTC
Sourceware	20973	P2	RESOLVED	robust mutexes: Lost wake-ups	2020-12-22 08:50:44 UTC
Sourceware	20985	P2	RESOLVED	robust mutexes: lowlevelrobustlock assembly on x86 blocks on wrong condition	2020-12-22 08:50:44 UTC
Sourceware	21083	P2	UNCONFIRMED	robust timed lock may leave other waiters no chance to wake up	2020-12-22 08:50:45 UTC
Sourceware	21778	P2	RESOLVED	Robust mutex may deadlock	2020-12-22 08:50:46 UTC

Internal Links: 1405095 1591087

Description VL 2016-12-05 19:34:59 UTC

Created attachment 1228146 [details]
Test program

Description of problem:

Process-shared robust mutexes can fail to wake up waiting mutex_lock callers when a process is killed with SIGKILL.

Version-Release number of selected component (if applicable):


How reproducible:

Running the test program in a shell while loop for a few seconds shows the problem.

Steps to Reproduce:
1. Compile the attached program test_mutex_raw.c
2. Run it in a tight while true ; do ./test_mutex_raw; done
3. Wait a few iterations, watch it hang

Actual results:

At some point the program hangs with a few children blocked in a kernel mutex lock call while nobody holds the lock.

Expected results:

The program runs reliably forever.

Additional info:

https://lists.samba.org/archive/samba-technical/2016-December/117466.html together with the thread follow-ups has some background. Under FreeBSD 11 with robust mutexes it runs fine. The program was found to hang under RHEL7 and RHEL6 (unknown subversions; the test was not done by me), Debian Jessie x86_64 as a VMware guest, and also on a Raspi2 with Debian Testing as of December 1.
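
For readers unfamiliar with the interface, here is a minimal sketch of how a process-shared robust mutex is usually set up and recovered after its owner dies. It is not the attached test_mutex_raw.c. The helper names init_robust_shared_mutex, lock_with_recovery, and demo_owner_death are hypothetical, and the use of an anonymous shared mapping is an assumption made to keep the example short.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Initialise a robust, process-shared mutex in memory that is already
   shared between processes. Returns 0 or a pthread error number. */
static int init_robust_shared_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int ret = pthread_mutexattr_init(&attr);
    if (ret != 0)
        return ret;
    ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (ret == 0)
        ret = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (ret == 0)
        ret = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return ret;
}

/* Lock the mutex. If the previous owner died while holding it, mark the
   mutex consistent again and report that through *recovered. */
static int lock_with_recovery(pthread_mutex_t *m, int *recovered)
{
    int ret = pthread_mutex_lock(m);
    *recovered = 0;
    if (ret == EOWNERDEAD) {
        *recovered = 1;
        ret = pthread_mutex_consistent(m);
    }
    return ret;
}

/* Hypothetical demo: a child takes the lock and is killed with SIGKILL;
   the parent then locks the mutex. Returns 1 if the parent saw
   EOWNERDEAD, 0 if it did not, and -1 on any other error. */
static int demo_owner_death(void)
{
    pthread_mutex_t *m = mmap(NULL, sizeof(*m), PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;
    if (init_robust_shared_mutex(m) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (pthread_mutex_lock(m) != 0)
            _exit(2);
        kill(getpid(), SIGKILL);   /* die while holding the lock */
        _exit(3);                  /* not reached */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    int recovered;
    if (lock_with_recovery(m, &recovered) != 0)
        return -1;
    pthread_mutex_unlock(m);
    pthread_mutex_destroy(m);
    munmap(m, sizeof(*m));
    return recovered;
}
```

In this sketch the parent only locks after the child has already died, so it does not exercise the lost wake-up reported here; that needs waiters that are already blocked in the lock call when the owner is killed.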

Comment 1 VL 2016-12-05 19:38:45 UTC

An additional comment: I've assigned it to glibc first, because that is our (Samba/tdb) point of contact. I know that robust mutexes are closely related to both glibc and the kernel, so it might well be that this is in fact a kernel issue.

Comment 2 Florian Weimer 2016-12-05 19:48:35 UTC

(In reply to VL from comment #0)
> 1. Compile the attached program test_mutex_raw.c
> 2. Run it in a tight while true ; do ./test_mutex_raw; done

Could you please double-check if you have attached the right reproducer?  The C program expects a few command line arguments.

Comment 3 VL 2016-12-05 19:59:09 UTC

I am so sorry to waste your time. In the additional info there was the correct line:

while true ; do ./bin/test_mutex_raw /tmp/foobar 10 0 ; done

Please tell me about your exact paths where you are compiling the program and where your account on your test machine is allowed to create files, such that I can get you better information on how to reproduce the issue.

Comment 4 Florian Weimer 2016-12-06 09:15:10 UTC

I can reproduce this.  The test case looks like it should not hang.

Comment 6 Torvald Riegel 2016-12-07 16:38:55 UTC

This may be related to statements about design issues in robust mutexes:
https://sourceware.org/ml/libc-alpha/2015-04/msg00151.html

I haven't had time to review these statements in detail yet, so I can't comment further at this point.

Comment 7 Michael Adam 2016-12-13 23:21:26 UTC

It seems to me that this glibc bug is related if not actually the same
phenomenon:

https://sourceware.org/bugzilla/show_bug.cgi?id=19402

@Torvald, it seems that the class of problems that the post you cited mentions is more about data corruption than about hanging, but it may have related causes.

(e.g. https://sourceware.org/bugzilla/show_bug.cgi?id=14485)

Is there any libc expert who can look more deeply into these issues?
We do have two reproducers, and this is really critical for us (Samba).

Thanks!

Comment 8 Torvald Riegel 2016-12-14 11:56:00 UTC

(In reply to Michael Adam from comment #7)
> Is there any libc expert who can look more deeply into these issues?
> We do have two reproducers, and this is really critical for us (Samba).

I'm currently reviewing the synchronization in robust mutexes (and I'm glibc's concurrency maintainer).  I believe I have found the bug that the test case attached to this BZ triggers.  This one can be fixed rather easily.

However, I've found other bugs in the synchronization that need to be fixed too.  One is a problem in the kernel, there's another potential problem in glibc (although that may happen to just work right now), and the biggest one is that the mutex destruction requirements are not guaranteed currently (which will need both glibc and kernel changes to be fixed).

So, stay tuned :)

Comment 9 Michael Adam 2016-12-14 13:17:21 UTC

is this one related?

https://bugzilla.redhat.com/show_bug.cgi?id=1302389

Comment 10 Torvald Riegel 2016-12-15 22:09:20 UTC

I created an upstream bug for this bug here:
https://sourceware.org/bugzilla/show_bug.cgi?id=20973

(In reply to Michael Adam from comment #9)
> is this one related?
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1302389

No, this seems to be a separate issue.  I'm looking into that one as well.

Comment 11 Carlos O'Donell 2016-12-16 00:17:20 UTC

We understand this issue is critical for Samba and we are continuing to look at the robust shared mutex support code in glibc and the linux kernel. When we have a concrete result we'll update this issue.

Comment 13 Torvald Riegel 2016-12-16 10:51:28 UTC

swbz 19402 is a separate bug, at least regarding the reproducer posted in this bug here.  The reproducer from this bug worked flawlessly for me for an hour after applying this patch to upstream glibc:
https://sourceware.org/ml/libc-alpha/2016-12/msg00587.html

Comment 14 Florian Weimer 2016-12-16 14:51:07 UTC

The test case has its own race condition:

	while (sig_child_count < num_children) {
		if (sig_received) {
			int status;

			while (1) {
				pid = waitpid(-1, &status, WNOHANG);
				if (pid <= 0) {
					break;
				}

				sig_child_count += 1;
			}
			sig_received = false;
		}
		usleep(100*1000);
	}

The assignment to sig_received could hide the arrival of a new SIGCHLD signal and the update of the same variable in the signal handler.

Comment 15 Torvald Riegel 2016-12-16 22:10:07 UTC

(In reply to Florian Weimer from comment #14)
> The assignment to sig_received could hide the arrival of a new SIGCHLD
> signal and the update of the same variable in the signal handler.

This is true (and I've observed this to happen while testing).  I've been testing with a loop like this one:

	while (sig_child_count < num_children) {
	    int status;
	    pid = wait (&status);
	    if (pid < 0)
	      {
		if (errno == EINTR)
		  continue;
		fprintf (stderr, "wait failed\n");
		exit (1);
	      }
	    sig_child_count++;
	}

Comment 16 Amitay Isaacs 2016-12-17 04:55:15 UTC

Created attachment 1232820 [details]
Test Code v2

Comment 17 Amitay Isaacs 2016-12-17 04:58:05 UTC

(In reply to Torvald Riegel from comment #15)
> (In reply to Florian Weimer from comment #14)
> > The assignment to sig_received could hide the arrival of a new SIGCHLD
> > signal and the update of the same variable in the signal handler.
> 
> This is true (and I've observed this to happen while testing).  I've been
> testing with a loop like this one:

I have attached a slightly modified version of the test code which does not use a signal handler.  This one should be free from the race condition.
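
As an illustration only (this is not the attached v2 code), a handler-free reaping loop could be sketched as below. The function name spawn_and_reap is hypothetical, and the children do nothing but exit. Because the parent blocks in wait() instead of polling a flag set from a SIGCHLD handler, there is no flag that can be cleared at the wrong moment.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork num_children trivial children, then reap every one of them with
   a blocking wait(). Returns the number of children reaped, or -1 on
   error. A fork failure returns early without reaping, which is
   acceptable only because this is a sketch. */
static int spawn_and_reap(int num_children)
{
    for (int i = 0; i < num_children; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(0);              /* child: exit immediately */
    }

    int reaped = 0;
    while (reaped < num_children) {
        int status;
        pid_t pid = wait(&status);
        if (pid < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        reaped++;
    }
    return reaped;
}
```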

Comment 19 Carlos O'Donell 2016-12-19 23:58:13 UTC

We have a build of a test fix for this issue on top of what will be released as the final rhel-6.9 glibc.

For now this fix is not for use in production environments and should only be used to validate in RHEL 6 that the issue is fixed.

You can download the new RPMs by adding a new repo here:

cat >> /etc/yum.repos.d/rhbz1401665.repo <<EOF
[rhbz1401665]
name=RHEL 6.9 hotfix for bug 1401665
baseurl=http://people.redhat.com/codonell/rhel-6.9-rhbz1401665
enabled=1
gpgcheck=0
protect=1
EOF

Then you can upgrade glibc, e.g. with yum upgrade glibc, to install the new packages.

The fixed packages include Torvald's upstream fix from comment #13 and have gone through some minimal engineering sanity testing along with validation that the fixes allow the reproducer in comment #16 to run without issue.

Comment 20 Carlos O'Donell 2016-12-20 00:48:31 UTC

We have a build of a test fix for this issue on top of the rhel-7.3 glibc.

For now this fix is not for use in production environments and should only be used to validate in RHEL 7 that the issue is fixed.

You can download the new RPMs by adding a new repo here:

cat >> /etc/yum.repos.d/rhbz1401665.repo <<EOF
[rhbz1401665]
name=RHEL 7.3 hotfix for bug 1401665
baseurl=http://people.redhat.com/codonell/rhel-7.3-rhbz1401665
enabled=1
gpgcheck=0
protect=1
EOF

Then you can upgrade glibc, e.g. with yum upgrade glibc, to install the new packages.

The fixed packages include Torvald's upstream fix from comment #13 and have gone through some minimal engineering sanity testing along with validation that the fixes allow the reproducer in comment #16 to run without issue.

Comment 22 Amitay Isaacs 2017-02-27 01:02:07 UTC

I have verified with RHEL 6.9 hotfix RPMs that the bug is fixed.

What versions of RHEL6 and RHEL7 will have this fix backported?

Comment 23 Carlos O'Donell 2017-03-09 03:31:46 UTC

(In reply to Amitay Isaacs from comment #22)
> I have verified with RHEL 6.9 hotfix RPMs that the bug is fixed.
> 
> What versions of RHEL6 and RHEL7 will have this fix backported?

We will review this bug for inclusion in RHEL 7.5 and RHEL 6.10.

Comment 31 Carlos O'Donell 2018-02-27 17:21:51 UTC

This issue is included in our Red Hat Enterprise Linux 7.6 review, and the robust mutex fixes will be considered for backporting here after we evaluate the risk and depth of the backport.

Comment 34 Carlos O'Donell 2018-06-13 09:06:35 UTC

The following upstream bug fixes were backported:

Bug 20973 - robust mutexes: Lost wake-ups
https://sourceware.org/bugzilla/show_bug.cgi?id=20973
- Commit 353683a22ed8a493a6bd1d78d63e144bc3e85d2f

Bug 20985 - robust mutexes: lowlevelrobustlock assembly on x86 blocks on wrong condition
https://sourceware.org/bugzilla/show_bug.cgi?id=20985
- Commit 65810f0ef05e8c9e333f17a44e77808b163ca298, and
  35df5a77f3ad2a35761631928440d2994a9e4bc5 (typo fix).
- Yes, RHEL 7.6 is using the lll_futex_timed_wait fallback, which should be fine.

Bug 19402 - Deadlock with robust shared mutex and asynchronous termination
https://sourceware.org/bugzilla/show_bug.cgi?id=19402
- Commit 8e31cafb268938729a1314806a924d73fb1991c5

"Add compiler barriers around modifications of the robust mutex list."
- Commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e

Bug 21778 - mutex: Fix robust mutex lock acquire
https://sourceware.org/bugzilla/show_bug.cgi?id=21778
- Commit 5920a4a624b1f4db310d1c44997b640e2a4653e5
- Fixes bug in the latest fix.

For testing we add:

commit faf0e9c84119742dd9ebb79060faa22c52ae80a1
Author: Florian Weimer <fweimer>
Date:   Fri Jan 27 06:53:19 2017 +0100

    nptl: Add tst-robust-fork

Which tests the fork issues.

Otherwise we rely on the existing tests to verify robust mutex support.

The 'test code v2' attached to this issue will still hang, but always with a thread as owner of the robust mutex.

Comment 35 Carlos O'Donell 2018-06-13 19:45:36 UTC

Created attachment 1451011 [details]
Test Code v3

I rewrote the test code to use a single shared memory segment and to avoid some of the undefined behaviour, such as initializing the same mutex twice, which happens when the low and high priority processes start (both do the initialization, and that's dangerous). Instead, we now start an "init" process to set up the shared memory segment, and then everything else just uses it without truncation or unlinking (same mutex).

Comment 36 Carlos O'Donell 2018-06-14 12:32:26 UTC

I am going to drop these two fixes for rhel-7.6 since they have caused some regression in the semaphore uses which rely on the existing assembly futex operations.

Bug 20985 - robust mutexes: lowlevelrobustlock assembly on x86 blocks on wrong condition
https://sourceware.org/bugzilla/show_bug.cgi?id=20985
- Commit 65810f0ef05e8c9e333f17a44e77808b163ca298, and
  35df5a77f3ad2a35761631928440d2994a9e4bc5 (typo fix).
- Yes, RHEL 7.6 is using the lll_futex_timed_wait fallback, which should be fine.

"Add compiler barriers around modifications of the robust mutex list."
- Commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e

We will file another issue to go back to the fix for bug 20985 in rhel-7.7.

At the very least in rhel-7.6 we will fix the lost wakeup and the fork synchronization issues.

Comment 37 Carlos O'Donell 2018-06-14 18:51:45 UTC

(In reply to Carlos O'Donell from comment #36)
> I am going to drop these two fixes for rhel-7.6 since they have caused some
> regression in the semaphore uses which rely on the existing assembly futex
> operations.
> 
> Bug 20985 - robust mutexes: lowlevelrobustlock assembly on x86 blocks on
> wrong condition
> https://sourceware.org/bugzilla/show_bug.cgi?id=20985
> - Commit 65810f0ef05e8c9e333f17a44e77808b163ca298, and
>   35df5a77f3ad2a35761631928440d2994a9e4bc5 (typo fix).
> - Yes, RHEL 7.6 is using the lll_futex_timed_wait fallback, which should be
> fine.
> 
> "Add compiler barriers around modifications of the robust mutex list."
> - Commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
> 
> We will file another issue to go back to the fix for bug 20985 in rhel-7.7.
> 
> At the very least in rhel-7.6 we will fix the lost wakeup and the fork
> synchronization issues.

Thanks to Florian Weimer we found the source of the bug that was blocking these fixes. I've verified with the v3 reproducer that it now passes and runs for hours without any failure on boxes with more than 8 cores. Looks like we have stable robust mutexes in RHEL 7.6 for the set of problems we set out to solve this release.

Comment 38 Carlos O'Donell 2018-06-14 19:43:24 UTC

All fixed in -258 now.

Comment 39 Carlos O'Donell 2018-06-14 19:45:01 UTC

*** Bug 1591087 has been marked as a duplicate of this bug. ***

Comment 41 Carlos O'Donell 2018-08-28 14:10:43 UTC

Created attachment 1479264 [details]
Robust mutex test (v4)

Comment 44 errata-xmlrpc 2018-10-30 09:36:24 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3092
