Red Hat Bugzilla – Bug 454679
RHEL3.9: mtrr race causes kernel hang on boot
Last modified: 2012-06-20 12:09:20 EDT
There's a race in arch/x86_64/kernel/mtrr.c that can result in a kernel hang and
subsequent NMI lockup detection on boot on SMP systems. I've uploaded boot
output with loglevel 7 that illustrates this hang. It may also exist in i386.
The race is caused by set_mtrr_smp getting re-entered before other cpus have a
chance to leave ipi_handler. Here's pseudo code annotated with where the CPUs in
the uploaded serial logging are hung. I determined this from the RIPs in the
panics and by disassembling smp_call_function and ipi_handler.
The function set_mtrr_smp is executed by a single master cpu:
1. wait_barrier_mtrr_disable = TRUE
   wait_barrier_execute = TRUE
   wait_barrier_cache_enable = TRUE
2. undone_count = 7
3. send ipis and wait for responses # CPU 3
4. disable interrupts
5. spin while undone_count > 0
6. undone_count = 7
7. wait_barrier_mtrr_disable = FALSE
8. spin while undone_count > 0
9. undone_count = 7
10. wait_barrier_execute = FALSE
11. spin while undone_count > 0
12. wait_barrier_cache_enable = FALSE
13. enable interrupts
The function ipi_handler is executed by slave cpus:
1. disable interrupts
2. decrement undone_count
3. spin while wait_barrier_mtrr_disable # CPUs 0,1,2,5,7
4. decrement undone_count
5. spin while wait_barrier_execute
6. decrement undone_count
7. spin while wait_barrier_cache_enable # CPUs 4,6
8. enable interrupts
When all slave cpus reach step 6 they unblock the master cpu at step 11. If the
master can leave and re-enter set_mtrr_smp (via, say, back-to-back calls to
mtrr_add_page in mtrr_write) and set wait_barrier_cache_enable to TRUE (step 1)
then any slave that has not yet gone from step 6 to step 7 will hang in step 7.
The master gets to step 3, sends IPIs and waits for all other cpus to respond,
which they won't since they're spinning. The system is hung at this point and
NMI lockup detection will kick in.
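
To make the interleaving concrete, here is a minimal single-threaded model of the tail end of the protocol. It is only a sketch, not kernel code: struct mtrr_model, the model_* helpers and the "prompt" parameter are names made up for illustration, and the three barriers are reduced to the one flag that matters for the hang. "prompt" is the number of slave cpus that get from step 6 to step 7 and out before the master re-enters set_mtrr_smp.

```c
#include <assert.h>
#include <stdbool.h>

enum { NUM_SLAVES = 7 };

/* Hypothetical model of the shared state; not the kernel's data layout. */
struct mtrr_model {
    bool wait_barrier_cache_enable;  /* flag the slaves spin on in step 7 */
    int  undone_count;               /* slaves that have not checked in yet */
    bool slave_left[NUM_SLAVES];     /* true once a slave got past step 7 */
};

/* Master steps 1-2, reduced to the one flag and the counter. */
static void model_master_enter(struct mtrr_model *m)
{
    m->wait_barrier_cache_enable = true;
    m->undone_count = NUM_SLAVES;
}

/* Slave step 6: check in with the master. */
static void model_slave_check_in(struct mtrr_model *m)
{
    m->undone_count--;
}

/* Master steps 11-12: release the barrier once every slave has checked in. */
static bool model_master_release(struct mtrr_model *m)
{
    if (m->undone_count > 0)
        return false;                /* would still be spinning in step 11 */
    m->wait_barrier_cache_enable = false;
    return true;
}

/* Slave step 7: one poll of the flag. Returns true if the slave has left. */
static bool model_slave_poll(struct mtrr_model *m, int cpu)
{
    if (!m->wait_barrier_cache_enable)
        m->slave_left[cpu] = true;
    return m->slave_left[cpu];
}

/*
 * Replay the scenario: `prompt` slaves poll the flag before the master
 * re-enters, the rest only afterwards. Returns how many slaves are left
 * spinning in step 7.
 */
static int model_stuck_slaves_after_reentry(int prompt)
{
    struct mtrr_model m = {0};
    int cpu, stuck = 0;

    assert(prompt >= 0 && prompt <= NUM_SLAVES);

    model_master_enter(&m);
    for (cpu = 0; cpu < NUM_SLAVES; cpu++)
        model_slave_check_in(&m);    /* every slave reaches step 6 */
    model_master_release(&m);        /* master passes steps 11-12 and leaves */

    for (cpu = 0; cpu < prompt; cpu++)
        model_slave_poll(&m, cpu);   /* these slaves see FALSE and leave */

    model_master_enter(&m);          /* back-to-back call: step 1 again */

    for (cpu = 0; cpu < NUM_SLAVES; cpu++)
        if (!model_slave_poll(&m, cpu))
            stuck++;                 /* flag is TRUE again: spins forever */
    return stuck;
}
```

In this model, model_stuck_slaves_after_reentry(7) gives 0 (the usual case), while model_stuck_slaves_after_reentry(5) leaves two slaves stuck in step 7, which matches the shape of the uploaded log where two cpus are annotated at step 7.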
This race is unlikely to occur since all slave cpus are likely to execute from
step 6 to step 7 before the master cpu re-enters set_mtrr_smp. However this may
occur more frequently in an overcommitted virtual environment (the uploaded
example occurred in a VM).
A simple fix would be to have the master cpu defer exiting set_mtrr_smp until
all slave cpus have left ipi_handler.
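
One way this could look, again as a single-threaded model and not as a tested kernel patch: the master keeps an extra count of slaves still inside ipi_handler and spins on it before returning. The counter slaves_in_handler, struct mtrr_fix_model and the fix_* helpers are hypothetical names, not anything in mtrr.c.

```c
#include <assert.h>
#include <stdbool.h>

enum { FIX_NUM_SLAVES = 7 };

/* Hypothetical model state for the proposed fix; not the kernel's layout. */
struct mtrr_fix_model {
    bool wait_barrier_cache_enable;  /* flag the slaves spin on in step 7 */
    int  slaves_in_handler;          /* assumed new counter for the fix */
};

/* Master entry: arm the barrier and expect every slave in the handler. */
static void fix_master_enter(struct mtrr_fix_model *m)
{
    m->wait_barrier_cache_enable = true;
    m->slaves_in_handler = FIX_NUM_SLAVES;
}

/* Master step 12: release the last barrier. */
static void fix_master_release(struct mtrr_fix_model *m)
{
    m->wait_barrier_cache_enable = false;
}

/* Slave step 7 plus the new exit notification. True if the slave left. */
static bool fix_slave_try_leave(struct mtrr_fix_model *m)
{
    if (m->wait_barrier_cache_enable)
        return false;                /* still spinning */
    m->slaves_in_handler--;          /* tell the master we are out */
    return true;
}

/* New master step: it may only return once no slave is in the handler. */
static bool fix_master_may_exit(const struct mtrr_fix_model *m)
{
    return m->slaves_in_handler == 0;
}

/*
 * `prompt` slaves leave right away, the rest lag behind. Returns how many
 * slaves are still in the handler at the moment the master re-enters.
 */
static int fix_slaves_in_handler_at_reentry(int prompt)
{
    struct mtrr_fix_model m = {0};
    bool left[FIX_NUM_SLAVES] = {false};
    int cpu, remaining;

    assert(prompt >= 0 && prompt <= FIX_NUM_SLAVES);

    fix_master_enter(&m);
    fix_master_release(&m);
    for (cpu = 0; cpu < prompt; cpu++)
        left[cpu] = fix_slave_try_leave(&m);

    /* Deferred exit: lagging slaves get to run while the master spins. */
    while (!fix_master_may_exit(&m))
        for (cpu = prompt; cpu < FIX_NUM_SLAVES; cpu++)
            if (!left[cpu])
                left[cpu] = fix_slave_try_leave(&m);

    remaining = m.slaves_in_handler; /* snapshot just before re-entry */
    fix_master_enter(&m);            /* re-entry cannot trap anyone now */
    return remaining;
}
```

With the deferred exit, fix_slaves_in_handler_at_reentry() returns 0 no matter how many slaves lag, because the flag cannot be set back to TRUE while a slave is still between step 6 and step 7.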
Created attachment 311397
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you asked us to review is now End of Life.
Please see https://access.redhat.com/support/policy/updates/errata/
If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.