Bug 452693
| Summary: | POSIX timer set to fire immediately does not fire | ||
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Roland Westrelin <roland.westrelin> |
| Component: | realtime-kernel | Assignee: | Steven Rostedt <srostedt> |
| Status: | CLOSED ERRATA | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | low | ||
| Version: | 1.0 | CC: | bhu, David.Holmes, pzijlstr, srostedt, tglx |
| Target Milestone: | 1.0.1 | ||
| Target Release: | --- | ||
| Hardware: | i386 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2008-08-26 19:57:46 UTC | Type: | --- |
Description
Roland Westrelin
2008-06-24 14:52:53 UTC
Created attachment 310481 [details]
reproducer for timer_settime(3p) bug
Test program that shows the timer_settime(3p) bug
Created attachment 310594 [details]
updated reproducer for timer_settime(3p) problem
Changed to just detect signal fire rather than use pause
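The attachments themselves are not reproduced here. As a rough illustration of the flag-based approach (the handler only sets a variable, and the caller checks it after a short wait instead of blocking in pause), a minimal sketch might look like the following. The helper name timer_fires_immediately, the choice of SIGRTMIN and CLOCK_MONOTONIC, and the use of TIMER_ABSTIME with the current time to mean "fire immediately" are all assumptions, not details taken from the attached test program.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

/* Set by the signal handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t timer_fired = 0;

static void on_timer(int signo)
{
    (void)signo;
    timer_fired = 1;
}

/*
 * Hypothetical helper: arm a one-shot timer whose absolute expiry time has
 * already been reached, wait up to about 100 ms, and report the outcome.
 * Returns 1 if the signal was delivered, 0 if not, -1 on a setup error.
 */
int timer_fires_immediately(void)
{
    struct sigaction sa;
    struct sigevent sev;
    struct itimerspec its;
    struct timespec nap = { 0, 1000000 };   /* 1 ms per polling step */
    timer_t timer;
    int i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_timer;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGRTMIN, &sa, NULL) == -1)
        return -1;

    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGRTMIN;
    if (timer_create(CLOCK_MONOTONIC, &sev, &timer) == -1)
        return -1;

    /* Expire at "now"; it_interval stays zero, so the timer is one-shot. */
    memset(&its, 0, sizeof its);
    if (clock_gettime(CLOCK_MONOTONIC, &its.it_value) == -1) {
        timer_delete(timer);
        return -1;
    }
    /* An all-zero it_value would disarm the timer instead of arming it. */
    if (its.it_value.tv_sec == 0 && its.it_value.tv_nsec == 0)
        its.it_value.tv_nsec = 1;

    timer_fired = 0;
    if (timer_settime(timer, TIMER_ABSTIME, &its, NULL) == -1) {
        timer_delete(timer);
        return -1;
    }

    /* Poll the flag; the signal may also interrupt nanosleep early. */
    for (i = 0; i < 100 && !timer_fired; i++)
        nanosleep(&nap, NULL);

    timer_delete(timer);
    return timer_fired ? 1 : 0;
}
```

Calling timer_fires_immediately() in a loop from main and stopping on the first 0 would give a test in the spirit of the one described. On older glibc versions the program has to be linked with -lrt.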
When I wrote the above reproducer, I noticed that the printf from the signal handler appeared before the program entered pause. Luis changed the test case to set a variable in the signal handler, usleep briefly after arming the timer, and then check whether the handler had modified the variable. So far we have not seen a case on our -rt kernels where the signal was not delivered. Do you have a different reproducer we could try?
Created attachment 310598 [details]
shell script to run reproducer until it fails
Shell script to run the reproducer as either SCHED_OTHER or SCHED_FIFO until it
fails or until Ctrl-C is pressed
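For reference, a process can also switch itself to SCHED_FIFO from C rather than being launched under that policy by a wrapper script. The sketch below is only an illustration of that step: run_as_fifo is a hypothetical helper name, and nothing here is taken from the attached script, whose contents are not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper: switch the calling process to SCHED_FIFO at the
 * given priority. Returns 0 on success or the errno value on failure
 * (typically EPERM without sufficient privilege, EINVAL for a priority
 * outside the range allowed for SCHED_FIFO).
 */
int run_as_fifo(int priority)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof sp);
    sp.sched_priority = priority;

    /* A pid of 0 means "the calling process". */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
        return errno;
    return 0;
}
```

A test program could call run_as_fifo(2) (or any other priority within the range reported by sched_get_priority_min and sched_get_priority_max for SCHED_FIFO) before arming the timer, and stay on the default SCHED_OTHER policy by skipping the call.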
Update since I last posted: we have not seen this behavior on a SCHED_OTHER thread, but can reliably reproduce it on a SCHED_FIFO thread.
After running different tests I also noticed that the reproducer fails occasionally when it runs as SCHED_FIFO. It doesn't matter whether it runs at priority 2, 30 or 97, as long as it runs as SCHED_FIFO. I started a new set of tests to narrow this issue down. As a side note, I was unable to reproduce this behavior with the rt-vanilla kernel.
Created attachment 310963 [details]
hrtimer: prevent migration for raising CPU
From: Steven Rostedt <srostedt>
Subject: hrtimer: prevent migration for raising CPU
Due to a possible deadlock, the waking of the softirq was pushed outside
of the hrtimer base locks. Unfortunately this allows the task to migrate
after setting up the softirq and raising it. Since softirqs run a queue that
is per-cpu we may raise the softirq on the wrong CPU and this will keep
the queued softirq task from running.
To solve this issue, this patch disables preemption around the releasing
of the hrtimer lock and raising of the softirq.
I confirm this is fixed in -72.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2008-0585.html