Bug 1889892 - glibc: pthread_cond_wait missed wakeup (swbz#25847)
Summary: glibc: pthread_cond_wait missed wakeup (swbz#25847)
Keywords:
Status: ASSIGNED
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: 42
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Frédéric Bérat
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1979990
Reported: 2020-10-20 20:30 UTC by Michael Bacarella
Modified: 2025-04-29 17:51 UTC

Fixed In Version: glibc-2.40.9000-30.fc42, glibc-2.40-22.fc41
Clone Of:
: 1979990 (view as bug list)
Environment:
Last Closed: 2025-03-13 01:10:56 UTC
Type: Bug
Embargoed:


Attachments
test case repro from sourceware entry (7.67 KB, text/x-csrc)
2020-10-20 20:34 UTC, Michael Bacarella
one-line patch to glibc that fixes the deadlock (863 bytes, patch)
2020-10-20 20:35 UTC, Michael Bacarella
testcase with abort() on stuck (7.84 KB, text/x-csrc)
2020-11-01 17:59 UTC, Török Edwin


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FC-1592 0 None None None 2025-04-15 13:27:07 UTC
Sourceware 25847 0 P2 UNCONFIRMED pthread_cond_signal failed to wake up pthread_cond_wait due to a bug in undoing stealing 2021-02-09 14:07:36 UTC

Description Michael Bacarella 2020-10-20 20:30:15 UTC
Description of problem:

This bug was submitted by Qin Li to glibc bugzilla earlier this year, with a one-line patch, though it hasn't been merged into glibc yet:

https://sourceware.org/bugzilla/show_bug.cgi?id=25847

Version-Release number of selected component: glibc-2.27 onwards

How reproducible: reliably; run the repro from the Sourceware URL above

Actual results: deadlocks after 30-120 minutes on a 4-core Fedora 32 box

Expected results: should never deadlock

Additional info:

This bug in pthread conditions will deadlock the OCaml runtime, as well as Python and .NET applications.
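
For readers unfamiliar with the call pattern involved, here is a minimal sketch of a condition-variable handoff between two threads. It is not the reproducer attached to this bug, and a single handoff like this is not expected to trigger the race; it only shows where a waiter would stay blocked forever if a wakeup were lost. The names `struct gate`, `waiter` and `run_handoff` are hypothetical and chosen for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state: a flag protected by a mutex, plus the
   condition variable used to announce changes to the flag. */
struct gate {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int ready;
};

/* Waiter: blocks until ready becomes nonzero. The loop re-checks the
   predicate because pthread_cond_wait may return spuriously. If the
   signal were lost inside the library, this loop would never exit. */
void *waiter(void *arg)
{
    struct gate *g = arg;
    pthread_mutex_lock(&g->lock);
    while (!g->ready)
        pthread_cond_wait(&g->cond, &g->lock);
    pthread_mutex_unlock(&g->lock);
    return NULL;
}

/* Signaler: sets the flag under the mutex, signals, then joins the
   waiter. Returns 1 on a completed handoff, -1 if setup failed. */
int run_handoff(void)
{
    struct gate g;
    pthread_t t;

    g.ready = 0;
    if (pthread_mutex_init(&g.lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&g.cond, NULL) != 0)
        return -1;
    if (pthread_create(&t, NULL, waiter, &g) != 0)
        return -1;

    pthread_mutex_lock(&g.lock);
    g.ready = 1;
    pthread_cond_signal(&g.cond);
    pthread_mutex_unlock(&g.lock);

    pthread_join(t, NULL);   /* would hang here if the wakeup were missed */
    pthread_cond_destroy(&g.cond);
    pthread_mutex_destroy(&g.lock);
    return g.ready;
}
```

Build with `-pthread`. Language runtimes that use condition variables internally follow the same wait-in-a-loop shape, which is why a missed wakeup inside glibc shows up as an application-level deadlock.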

The bug was introduced in glibc 2.27 and is still present in glibc 2.31.
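
A rough way to see whether a running system falls in the version range named above is to ask glibc for its version string at run time. The sketch below uses `gnu_get_libc_version()` from `<gnu/libc-version.h>`; the helper names are hypothetical. Note the assumption: the upstream version number alone does not say whether a distribution package carries a backported fix, so a "2.27 or later" result only means the system may be affected.

```c
#include <gnu/libc-version.h>
#include <stdio.h>

/* Parse the leading "MAJOR.MINOR" of a glibc version string such as
   "2.31". Returns 1 on success, 0 if the string is malformed. */
int parse_glibc_version(const char *s, int *major, int *minor)
{
    return sscanf(s, "%d.%d", major, minor) == 2;
}

/* Returns 1 if the version is 2.27 or later, the release this report
   names as the one that introduced the bug. Backported fixes in
   distribution builds are NOT detected by this check. */
int version_in_reported_range(const char *s)
{
    int major, minor;
    if (!parse_glibc_version(s, &major, &minor))
        return 0;
    return major > 2 || (major == 2 && minor >= 27);
}

/* Print the run-time glibc version and the result of the range check. */
void report_running_glibc(void)
{
    const char *v = gnu_get_libc_version();
    printf("glibc %s: %s\n", v,
           version_in_reported_range(v)
               ? "2.27 or later, check the package changelog for the fix"
               : "older than 2.27");
}
```

Calling `report_running_glibc()` from `main` prints one line; the package changelog or the "Fixed In Version" field of this bug remains the authoritative source for whether a given build is fixed.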

I confirm that the repro from the above report deadlocks on Fedora 32. It takes about 30-180 minutes on a 4-core server.

I further confirm that the one-line fix to glibc from the above report applies cleanly to Fedora 32's glibc source rpm, and that the patched glibc does not deadlock after running the repro for more than 30 hours.

Please kindly consider merging the one-line fix into Fedora glibc.

More background about this bug, for the sake of future internet searchers:
* https://discuss.ocaml.org/t/is-there-a-known-recent-linux-locking-bug-that-affects-the-ocaml-runtime

Comment 1 Michael Bacarella 2020-10-20 20:34:52 UTC
Created attachment 1722977 [details]
test case repro from sourceware entry

will deadlock

Comment 2 Michael Bacarella 2020-10-20 20:35:47 UTC
Created attachment 1722978 [details]
one-line patch to glibc that fixes the deadlock

Comment 3 Carlos O'Donell 2020-10-27 13:21:56 UTC
We are looking to fix this for Fedora and Red Hat Enterprise Linux 8, as it affects users on both platforms.

Comment 4 Török Edwin 2020-11-01 17:59:31 UTC
Created attachment 1725573 [details]
testcase with abort() on stuck

Small modification to upstream testcase that abort()s when the loop is stuck for several iterations.
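
The attachment itself is not reproduced here. As an illustration of the general idea only, the following sketch shows one way to detect a stuck loop and abort: a monitor samples a progress counter and calls `abort()` once the counter has stayed unchanged for a set number of consecutive checks. The names `struct stall_detector`, `stall_detector_check` and `abort_if_stalled` are hypothetical and may not match what the attached testcase does.

```c
#include <stdlib.h>

/* Hypothetical stall detector: remembers the last progress value seen
   and how many consecutive checks found it unchanged. */
struct stall_detector {
    unsigned long last_seen;   /* progress value at the previous check */
    int unchanged_checks;      /* consecutive checks with no progress  */
    int limit;                 /* checks without progress before "stuck" */
};

/* Record one sample of the progress counter. Returns 1 once the
   counter has been unchanged for `limit` consecutive checks, else 0. */
int stall_detector_check(struct stall_detector *d, unsigned long progress)
{
    if (progress != d->last_seen) {
        d->last_seen = progress;
        d->unchanged_checks = 0;
        return 0;
    }
    d->unchanged_checks++;
    return d->unchanged_checks >= d->limit;
}

/* Abort the process when the detector reports a stall, so the stuck
   threads can be examined from a core dump or an attached debugger. */
void abort_if_stalled(struct stall_detector *d, unsigned long progress)
{
    if (stall_detector_check(d, progress))
        abort();
}
```

A monitor thread would call `abort_if_stalled` at a fixed interval with the worker loop's iteration count. Aborting instead of hanging makes an unattended multi-hour run fail visibly.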

Comment 6 Fedora Program Management 2021-04-29 17:06:51 UTC
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 32 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 7 Carlos O'Donell 2021-04-29 20:14:38 UTC
Still a bug, and still in Rawhide.

Comment 8 Ben Cotton 2021-08-10 13:42:25 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 35 development cycle.
Changing version to 35.

Comment 9 Ben Cotton 2022-02-08 21:45:17 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 36 development cycle.
Changing version to 36.

Comment 10 Ben Cotton 2023-04-25 16:40:44 UTC
This message is a reminder that Fedora Linux 36 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 36 on 2023-05-16.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '36'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 36 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 11 Fedora Release Engineering 2023-08-16 08:08:28 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 39 development cycle.
Changing version to 39.

Comment 12 Carlos O'Donell 2025-02-24 13:55:03 UTC
This is fixed in Fedora Rawhide (f43), and in Fedora 42 (f42), but not yet in f41 or f40.

Comment 13 Aoife Moloney 2025-02-26 12:50:58 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 42 development cycle.
Changing version to 42.

Comment 14 Carlos O'Donell 2025-03-07 14:56:27 UTC
Fedora 40 and Fedora 41 still need this fix.

Comment 15 Carlos O'Donell 2025-03-07 22:59:43 UTC
Fedora 41 scratch build with the fix:
https://koji.fedoraproject.org/koji/taskinfo?taskID=129956568

We can take this upstream to get a backport into glibc 2.40 and glibc 2.39, but upstream has to agree it would consider the upgrade scenario acceptable.

Until that point the next step is a backport into F41, and F40.

Comment 16 Fedora Update System 2025-03-11 16:41:04 UTC
FEDORA-2025-85313b82b2 (glibc-2.40-22.fc41) has been submitted as an update to Fedora 41.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-85313b82b2

Comment 17 Fedora Update System 2025-03-12 02:41:19 UTC
FEDORA-2025-85313b82b2 has been pushed to the Fedora 41 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2025-85313b82b2`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2025-85313b82b2

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 18 Fedora Update System 2025-03-13 01:10:56 UTC
FEDORA-2025-85313b82b2 (glibc-2.40-22.fc41) has been pushed to the Fedora 41 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 19 Fedora Update System 2025-03-13 07:25:19 UTC
FEDORA-2025-0a780eb540 (glibc-2.40-23.fc41) has been submitted as an update to Fedora 41.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-0a780eb540

Comment 20 Florian Weimer 2025-03-13 19:07:07 UTC
Fedora 40 still needs to be fixed.

Comment 21 Fedora Update System 2025-03-14 04:39:01 UTC
FEDORA-2025-0a780eb540 (glibc-2.40-23.fc41) has been pushed to the Fedora 41 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 23 Carlos O'Donell 2025-04-29 17:51:42 UTC
Upstream release/2.40/master, and release/2.39/master are now fixed.

This issue can be resolved by syncing from release/2.39/master to F40.

