Red Hat Bugzilla – Bug 305831
excessive cpu usage in recovery code
Last modified: 2009-09-03 12:51:27 EDT
Description of problem:
The dlm recovery code calls schedule() in a lot of lock loops to prevent:
1) the softlockup watchdog from going off
2) openais cluster membership messages from being delayed past
the configured timeout
We want to investigate:
- will cond_resched() work as well, and more efficiently than schedule()?
(I expect so)
- exactly what loops are taking so long (watchdog is 10 sec) and why?
are there really that many locks and/or are we doing that much work
on each one that it can take 10 sec?
- why does this seem to appear regularly on ia64 and only rarely on other
architectures?
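
To make the first question concrete, here is a rough userspace model of the difference between the two calls. It is not the dlm code and not kernel code. model_schedule(), model_cond_resched(), walk_locks() and the tick_every parameter are all made-up names for illustration. The only assumption taken from the kernel is the general behaviour: schedule() always enters the scheduler, while cond_resched() does so only when a reschedule is pending.

```c
/* Userspace model only: stubs stand in for the kernel primitives. */
#include <stdbool.h>
#include <stddef.h>

static bool need_resched_flag;  /* models "a reschedule is pending" */
static unsigned long yields;    /* times the loop entered the scheduler */

/* Model of schedule(): always enters the scheduler. */
static void model_schedule(void)
{
    yields++;
    need_resched_flag = false;
}

/* Model of cond_resched(): enters the scheduler only if one is pending.
 * Returns 1 if it yielded, 0 otherwise. */
static int model_cond_resched(void)
{
    if (!need_resched_flag)
        return 0;
    model_schedule();
    return 1;
}

/*
 * Hypothetical recovery loop over nlocks locks. Every tick_every
 * iterations (must be > 0) something else wants the CPU, which is
 * modelled by setting the pending flag. Returns the number of yields.
 */
static unsigned long walk_locks(size_t nlocks, size_t tick_every,
                                bool unconditional)
{
    yields = 0;
    need_resched_flag = false;

    for (size_t i = 1; i <= nlocks; i++) {
        /* per-lock recovery work would go here */

        if (i % tick_every == 0)
            need_resched_flag = true;

        if (unconditional)
            model_schedule();      /* current behaviour */
        else
            model_cond_resched();  /* proposed behaviour */
    }
    return yields;
}
```

In this model, walking 1000 locks with a reschedule becoming pending every 100 iterations enters the scheduler 1000 times with the unconditional call and 10 times with the conditional one. Whether the real loops see a comparable saving depends on how expensive each schedule() call is in practice, which is part of what needs measuring.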
Dean, do you still get softlockups on ia64?