Bug 145194 - dlm_recvd stuck spinning during recovery
Summary: dlm_recvd stuck spinning during recovery
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: dlm   
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: David Teigland
QA Contact: Cluster QE
Blocks: 144795
Reported: 2005-01-15 04:38 UTC by David Teigland
Modified: 2009-04-16 20:30 UTC (History)

Doc Type: Bug Fix
Last Closed: 2005-01-28 04:24:20 UTC


Description David Teigland 2005-01-15 04:38:16 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041111 Firefox/1.0

Description of problem:
Daniel McNeil originally reported this.
https://www.redhat.com/archives/linux-cluster/2005-January/msg00040.html

I reproduced it after 10 iterations of revolver on 7 bench nodes
using 3 file systems:
revolver -f /etc/cluster/cluster.conf -l /root/sistina-test -r /root/sistina-test -b no -t 2 -x 2

On bench-27, dlm_recvd has entered an infinite loop, consuming
the entire CPU during recovery:

root      3985  0.0  0.0     0    0 ?        SW<  11:04   0:00 [dlm_recoverd]
root      3986  0.0  0.0     0    0 ?        SW<  11:04   0:00 [dlm_recoverd]
root      3987  0.0  0.0     0    0 ?        SW<  11:04   0:00 [dlm_recoverd]
root      3988  0.0  0.0     0    0 ?        SW<  11:04   0:00 [dlm_recoverd]
root      3686  0.0  0.0     0    0 ?        SW<  10:59   0:00 [dlm_astd]
root      3687 99.0  0.0     0    0 ?        RW<  10:59 655:15 [dlm_recvd]
root      3688  0.0  0.0     0    0 ?        SW<  10:59   0:00 [dlm_sendd]

top shows:

3687 root      15 -10     0    0    0 R 98.5  0.0 657:30.61 dlm_recvd
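
For anyone trying to catch this state on their own nodes, here is a minimal sketch of how the spinning thread could be picked out of `ps aux`-style output automatically. It is not part of the DLM or the revolver test; the function name `find_spinning` and the 90% threshold are assumptions made for illustration. Note that the %CPU column in ps is averaged over the lifetime of the process, so a thread that has only just started spinning may not cross the threshold yet.

```python
# Sketch only: flag processes whose %CPU column in `ps aux`-style output
# is at or above a threshold. Function name and threshold are hypothetical.

def find_spinning(ps_lines, cpu_threshold=90.0):
    """Return (pid, cpu, command) tuples for lines at or above cpu_threshold."""
    hits = []
    for line in ps_lines:
        # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # not a complete ps line
        try:
            pid = int(fields[1])
            cpu = float(fields[2])
        except ValueError:
            continue  # header line or otherwise unparsable
        if cpu >= cpu_threshold:
            hits.append((pid, cpu, fields[10].strip()))
    return hits


# Sample lines copied from the ps listing in this report.
sample = [
    "root      3686  0.0  0.0     0    0 ?        SW<  10:59   0:00 [dlm_astd]",
    "root      3687 99.0  0.0     0    0 ?        RW<  10:59 655:15 [dlm_recvd]",
    "root      3688  0.0  0.0     0    0 ?        SW<  10:59   0:00 [dlm_sendd]",
]

print(find_spinning(sample))  # [(3687, 99.0, '[dlm_recvd]')]
```

On a live node the same function could be fed the lines of real `ps aux` output instead of the hard-coded sample.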

Stack traceback for pid 3687
0xdb7e1ab0     3687        3  1    0   R  0xdb7e1cb0 *dlm_recvd
EBP        EIP        Function (args)
0xd8641e9c 0xc010f485 sched_clock+0x45 (0x2, 0xe04248ae, 0xc0142575, 0xd8641eb8, 0xc01058c1)
0xd8641ef0 0xc039fb60 schedule+0x50 (0x0, 0xd8641fd8, 0x0, 0xfffffffc, 0xe0427150)
0xd8641fec 0xc0138e5a kthread+0xaa
           0xc01012c5 kernel_thread_helper+0x5


Unfortunately, kdb was somehow misconfigured so information from
our modules is missing.


Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. See the description of the test above.

Actual Results: dlm recovery comes to a standstill

Additional info:

Comment 2 David Teigland 2005-01-28 04:24:20 UTC
Haven't been able to reproduce this for a couple weeks.

