Bug 145194 - dlm_recvd stuck spinning during recovery
Status: CLOSED WORKSFORME
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: dlm
Version: 4
Hardware: All Linux
Priority: medium
Severity: high
Assigned To: David Teigland
QA Contact: Cluster QE
Depends On:
Blocks: 144795
Reported: 2005-01-14 23:38 EST by David Teigland
Modified: 2009-04-16 16:30 EDT
Doc Type: Bug Fix
Last Closed: 2005-01-27 23:24:20 EST

Attachments: None

Description David Teigland 2005-01-14 23:38:16 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041111 Firefox/1.0

Description of problem:
Daniel McNeil originally reported this.
https://www.redhat.com/archives/linux-cluster/2005-January/msg00040.html

I reproduced it after 10 iterations of revolver on 7 bench nodes
using 3 file systems:
revolver -f /etc/cluster/cluster.conf -l /root/sistina-test -r /root/sistina-test -b no -t 2 -x 2

On bench-27, dlm_recvd has entered an infinite loop, consuming
an entire CPU during recovery:

root      3985  0.0  0.0     0    0 ?        SW<  11:04   0:00 [dlm_recoverd]
root      3986  0.0  0.0     0    0 ?        SW<  11:04   0:00 [dlm_recoverd]
root      3987  0.0  0.0     0    0 ?        SW<  11:04   0:00 [dlm_recoverd]
root      3988  0.0  0.0     0    0 ?        SW<  11:04   0:00 [dlm_recoverd]
root      3686  0.0  0.0     0    0 ?        SW<  10:59   0:00 [dlm_astd]
root      3687 99.0  0.0     0    0 ?        RW<  10:59 655:15 [dlm_recvd]
root      3688  0.0  0.0     0    0 ?        SW<  10:59   0:00 [dlm_sendd]

top shows:

3687 root      15 -10     0    0    0 R 98.5  0.0 657:30.61 dlm_recvd
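
For anyone checking whether another node is in the same state, the following is a minimal sketch (not part of the original report) that scans `ps aux`-style output for kernel threads that are runnable and above a CPU threshold. The function name `find_spinning_threads` and the 90% threshold are assumptions chosen for illustration; the sample lines are copied from the listing above.

```python
def find_spinning_threads(ps_output, cpu_threshold=90.0):
    """Return (pid, cpu, name) for bracketed kernel threads that are
    in state R and at or above cpu_threshold percent CPU."""
    hits = []
    for line in ps_output.splitlines():
        fields = line.split()
        # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        if len(fields) < 11:
            continue
        pid, cpu, stat, command = fields[1], fields[2], fields[7], fields[10]
        # Kernel threads are shown in square brackets, e.g. [dlm_recvd].
        if not (command.startswith("[") and command.endswith("]")):
            continue
        try:
            cpu_value = float(cpu)
        except ValueError:
            continue  # not a data row
        if stat.startswith("R") and cpu_value >= cpu_threshold:
            hits.append((int(pid), cpu_value, command.strip("[]")))
    return hits


# Sample rows taken from the ps listing in this report.
SAMPLE = """\
root      3686  0.0  0.0     0    0 ?        SW<  10:59   0:00 [dlm_astd]
root      3687 99.0  0.0     0    0 ?        RW<  10:59 655:15 [dlm_recvd]
root      3688  0.0  0.0     0    0 ?        SW<  10:59   0:00 [dlm_sendd]
"""

print(find_spinning_threads(SAMPLE))  # [(3687, 99.0, 'dlm_recvd')]
```

A single high reading only shows that the thread is busy at that moment; the accumulated TIME column (655:15 here) is what indicates it has been spinning for hours.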

Stack traceback for pid 3687
0xdb7e1ab0     3687        3  1    0   R  0xdb7e1cb0 *dlm_recvd
EBP        EIP        Function (args)
0xd8641e9c 0xc010f485 sched_clock+0x45 (0x2, 0xe04248ae, 0xc0142575, 0xd8641eb8, 0xc01058c1)
0xd8641ef0 0xc039fb60 schedule+0x50 (0x0, 0xd8641fd8, 0x0, 0xfffffffc, 0xe0427150)
0xd8641fec 0xc0138e5a kthread+0xaa
           0xc01012c5 kernel_thread_helper+0x5


Unfortunately, kdb was somehow misconfigured so information from
our modules is missing.


Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. See the description of the test above.

Actual Results: dlm recovery comes to a standstill.

Additional info:
Comment 2 David Teigland 2005-01-27 23:24:20 EST
I haven't been able to reproduce this for a couple of weeks.
