Bug 454679 - RHEL3.9: mtrr race causes kernel hang on boot
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Hardware: x86_64 Linux
Priority: low, Severity: low
Assigned To: Don Howard
QA Contact: Martin Jenner
Reported: 2008-07-09 13:57 EDT by Eli Collins
Modified: 2012-06-20 12:09 EDT (History)
Doc Type: Bug Fix
Last Closed: 2012-06-20 12:09:20 EDT

Attachments
Boot logging (41.58 KB, text/plain)
2008-07-09 13:57 EDT, Eli Collins

Description Eli Collins 2008-07-09 13:57:20 EDT
There's a race in arch/x86_64/kernel/mtrr.c that can result in a kernel hang and
subsequent NMI lockup detection on boot on SMP systems. I've uploaded boot
output with loglevel 7 that illustrates this hang. The same race may also exist
on i386.

The race is caused by set_mtrr_smp getting re-entered before the other CPUs have
had a chance to leave ipi_handler. Here's pseudocode annotated with where the
CPUs in the uploaded serial logging are hung. I determined this from the RIPs in
the panics and by disassembling smp_call_function and ipi_handler.

The function set_mtrr_smp is executed by a single master CPU:

1. wait_barrier_mtrr_disable = TRUE
   wait_barrier_execute = TRUE 
   wait_barrier_cache_enable = TRUE

2. undone_count = 7
3. send ipis and wait for responses      # CPU 3
4. disable interrupts
5. spin while undone_count > 0

6. undone_count = 7
7. wait_barrier_mtrr_disable = FALSE
8. spin while undone_count > 0

9. undone_count = 7
10. wait_barrier_execute = FALSE
11. spin while undone_count > 0

12. wait_barrier_cache_enable = FALSE
13. enable interrupts

The function ipi_handler is executed by the slave CPUs:

1. disable interrupts
2. undone_count--
3. spin while wait_barrier_mtrr_disable  # CPUS 0,1,2,5,7

4. undone_count--
5. spin while wait_barrier_execute

6. undone_count--
7. spin while wait_barrier_cache_enable  # CPUS 4,6

8. enable interrupts

When all slave CPUs reach step 6, they unblock the master CPU at step 11. If the
master can then leave and re-enter set_mtrr_smp (via, say, back-to-back calls to
mtrr_add_page in mtrr_write) and set wait_barrier_cache_enable back to TRUE
(step 1), any slave that has not yet seen the flag go FALSE in step 7 will hang
there. The master gets to step 3, sends IPIs and waits for all other CPUs to
respond, which the hung slaves never do since they're spinning with interrupts
disabled. The system is hung at this point and NMI lockup detection will kick in.
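
The interleaving described above can be replayed deterministically in user space. What follows is only a toy model, not the kernel code: it steps one master and two slaves through the numbered pseudocode steps, and optionally holds one slave between step 6 and step 7 until the master has re-entered and executed step 1 again. The struct, the function names, the two-slave count and the round-robin scheduler are all assumptions made for illustration; interrupts are modelled only by letting a slave take an IPI when it is outside the handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NSLAVES 2 /* hypothetical; the reported machine had 7 slave CPUs */

/* Toy model of the shared state in the pseudocode above. */
struct sim {
    bool w_disable, w_execute, w_cache; /* the three wait_barrier_* flags */
    int  undone;                        /* undone_count */
    int  started;                       /* slaves that have answered the IPI */
    bool sent;                          /* master already sent IPIs in step 3 */
    bool ipi[NSLAVES];                  /* IPI pending for each slave */
    int  spc[NSLAVES];                  /* slave step; 0 = outside ipi_handler */
    int  mpc;                           /* master step; 0 = outside set_mtrr_smp */
    int  calls_left;                    /* set_mtrr_smp calls still to be made */
};

/* Advance the master by one numbered step; false means it is blocked. */
static bool master_step(struct sim *s)
{
    switch (s->mpc) {
    case 0:
        if (s->calls_left == 0) return false;
        s->calls_left--;
        break;
    case 1: s->w_disable = s->w_execute = s->w_cache = true; break;
    case 2: case 6: case 9: s->undone = NSLAVES; break;
    case 3:
        if (!s->sent) { /* first visit: send the IPIs */
            for (int i = 0; i < NSLAVES; i++) s->ipi[i] = true;
            s->started = 0;
            s->sent = true;
            return true;
        }
        if (s->started < NSLAVES) return false; /* wait for responses */
        s->sent = false;
        break;
    case 5: case 8: case 11:
        if (s->undone > 0) return false;
        break;
    case 7:  s->w_disable = false; break;
    case 10: s->w_execute = false; break;
    case 12: s->w_cache = false; break;
    default: break; /* steps 4 and 13 only toggle interrupts */
    }
    s->mpc = (s->mpc == 13) ? 0 : s->mpc + 1;
    return true;
}

/* Advance slave i by one numbered step; false means it is blocked. */
static bool slave_step(struct sim *s, int i)
{
    switch (s->spc[i]) {
    case 0: /* an IPI is only taken outside the handler */
        if (!s->ipi[i]) return false;
        s->ipi[i] = false;
        s->started++;
        break;
    case 2: case 4: case 6: s->undone--; break;
    case 3: if (s->w_disable) return false; break;
    case 5: if (s->w_execute) return false; break;
    case 7: if (s->w_cache) return false; break;
    default: break; /* steps 1 and 8 only toggle interrupts */
    }
    s->spc[i] = (s->spc[i] == 8) ? 0 : s->spc[i] + 1;
    return true;
}

/* Make two back-to-back set_mtrr_smp calls. If lagging >= 0, that slave is
 * descheduled after its step 6 until the master has redone step 1.
 * Returns true if both calls complete, false if the system hangs. */
bool run_two_calls(int lagging, struct sim *s)
{
    assert(lagging < NSLAVES);
    memset(s, 0, sizeof(*s));
    s->calls_left = 2;
    for (int round = 0; round < 1000; round++) {
        bool progress = master_step(s);
        for (int i = 0; i < NSLAVES; i++) {
            bool reentered = (s->calls_left == 0 && s->mpc >= 2);
            if (i == lagging && s->spc[i] == 7 && !reentered) {
                progress = true; /* descheduled, not blocked */
                continue;
            }
            if (slave_step(s, i))
                progress = true;
        }
        if (!progress)
            break;
    }
    for (int i = 0; i < NSLAVES; i++)
        if (s->spc[i] != 0)
            return false;
    return s->mpc == 0 && s->calls_left == 0;
}
```

With no lagging slave (run_two_calls(-1, &s)) both calls complete. With slave 1 held back (run_two_calls(1, &s)) the model stops with the master at step 3, slave 0 at ipi_handler step 3 and slave 1 at ipi_handler step 7, the same three positions annotated in the pseudocode above.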

This race is unlikely to occur, since all slave CPUs are likely to get from
step 6 through step 7 before the master CPU re-enters set_mtrr_smp. However, it
may occur more frequently in an overcommitted virtual environment (the uploaded
example occurred in a VM).

A simple fix would be to have the master CPU defer exiting set_mtrr_smp until
all slave CPUs have left ipi_handler.
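
One way to picture that fix is an extra counter that each slave decrements once it is past the final spin, with the master waiting for it to reach zero before returning. The sketch below is a user-space approximation using pthreads and C11 atomics, not a patch against mtrr.c: exit_count, ipi_seq, stop and run_back_to_back are hypothetical names, threads stand in for CPUs, a sequence counter stands in for IPI delivery, and interrupt masking is not modelled at all.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLAVES 3 /* hypothetical slave count */

static atomic_bool wait_barrier[3]; /* mtrr_disable, execute, cache_enable */
static atomic_int  undone_count;
static atomic_int  exit_count;      /* hypothetical: slaves still in ipi_handler */
static atomic_int  ipi_seq;         /* stand-in for IPI delivery */
static atomic_bool stop;

static void ipi_handler(void)
{
    for (int b = 0; b < 3; b++) {
        atomic_fetch_sub(&undone_count, 1);
        while (atomic_load(&wait_barrier[b]))
            sched_yield();
    }
    /* Past the last spin: the master may now re-arm the flags safely. */
    atomic_fetch_sub(&exit_count, 1);
}

static void *slave(void *arg)
{
    int seen = 0;
    (void)arg;
    for (;;) {
        /* Wait for the next "IPI" or for the test to finish. */
        while (atomic_load(&ipi_seq) == seen) {
            if (atomic_load(&stop))
                return NULL;
            sched_yield();
        }
        seen++;
        ipi_handler();
    }
}

static void set_mtrr_smp(void)
{
    for (int b = 0; b < 3; b++)
        atomic_store(&wait_barrier[b], true);
    atomic_store(&exit_count, NSLAVES);
    atomic_store(&undone_count, NSLAVES);
    atomic_fetch_add(&ipi_seq, 1); /* "send IPIs" */
    for (int b = 0; b < 3; b++) {
        while (atomic_load(&undone_count) > 0)
            sched_yield();
        if (b < 2)
            atomic_store(&undone_count, NSLAVES);
        atomic_store(&wait_barrier[b], false);
    }
    /* The proposed fix: do not return until every slave has left. */
    while (atomic_load(&exit_count) > 0)
        sched_yield();
}

/* Make `calls` back-to-back set_mtrr_smp calls; returns how many completed. */
int run_back_to_back(int calls)
{
    pthread_t t[NSLAVES];
    int done = 0;

    atomic_store(&stop, false);
    atomic_store(&ipi_seq, 0);
    for (int i = 0; i < NSLAVES; i++) {
        int rc = pthread_create(&t[i], NULL, slave, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (; done < calls; done++)
        set_mtrr_smp();
    atomic_store(&stop, true);
    for (int i = 0; i < NSLAVES; i++)
        pthread_join(t[i], NULL);
    return done;
}
```

Dropping the final exit_count wait from this sketch reopens the window: the next call can set wait_barrier[2] back to true while a slave is still polling it. Whether an extra counter like this is acceptable in the real boot path is a separate question that the sketch does not answer.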
Comment 1 Eli Collins 2008-07-09 13:57:20 EDT
Created attachment 311397
Boot logging
Comment 2 Jiri Pallich 2012-06-20 12:09:20 EDT
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you asked us to review is now End of Life.
Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.
