214574 – cluster name mismatch makes gfs_controld use 100% CPU

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	cman
Sub Component:
Version:	6
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	David Teigland
QA Contact:
Docs Contact:
URL:
Whiteboard:	bzcl34nup
Depends On:
Blocks:

Reported:	2006-11-08 13:33 UTC by Bertho
Modified:	2008-05-06 16:44 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2008-05-06 16:44:57 UTC
Type:	---
Embargoed:
Dependent Products:

Description Bertho 2006-11-08 13:33:30 UTC

Description of problem:
Mounting a GFS2 filesystem that uses lock_dlm locking fails at mount time when
the cluster name in its lock table does not match the running cluster. The
failure is expected, but gfs_controld then uses 100% CPU and never seems to
sleep again. Restarting cman/gfs_controld resolves the problem.

Version-Release number of selected component (if applicable):
- FC6 x86_64 updated
- cman-2.0.18-2.fc6

How reproducible:
Every time you try to mount a GFS2 filesystem that has a locking table name that
is not the cluster's name.

Steps to Reproduce:
1. mkfs.gfs2 -p lock_dlm -t mtprodbox0:xenhome -j 8 /dev/vgxentest/lvxenhome
2. in /etc/cluster/cluster.conf: <cluster alias="BigBrother" config_version="5" name="BigMamma">...etc...
3. /etc/init.d/cman restart
4. mount -tgfs2 -onoatime /dev/vgxentest/lvxenhome /mnt

(name mismatch: mtprodbox0 vs BigMamma)
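
The mismatch in the steps above can be caught before mounting by comparing the cluster part of the lock table name (the text before the colon in the -t argument) with the name attribute of the <cluster> element. The following is only a minimal sketch of that comparison. The function names and the EXAMPLE_CONF string are hypothetical, and the example configuration is a stripped-down stand-in with an empty body, not the reporter's actual cluster.conf.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal stand-in for /etc/cluster/cluster.conf.
# Only the attributes quoted in the report are reproduced; the body is empty.
EXAMPLE_CONF = (
    '<cluster alias="BigBrother" config_version="5" name="BigMamma"></cluster>'
)


def cluster_name_from_conf(conf_xml: str) -> str:
    """Return the name attribute of the root <cluster> element."""
    root = ET.fromstring(conf_xml)
    if root.tag != "cluster":
        raise ValueError("root element is not <cluster>")
    return root.attrib["name"]


def lock_table_cluster(lock_table: str) -> str:
    """Return the cluster part of a 'clustername:fsname' lock table name."""
    cluster, sep, fsname = lock_table.partition(":")
    if not sep or not cluster or not fsname:
        raise ValueError("lock table must look like 'clustername:fsname'")
    return cluster


def lock_table_matches(conf_xml: str, lock_table: str) -> bool:
    """True if the lock table was created for the configured cluster."""
    return cluster_name_from_conf(conf_xml) == lock_table_cluster(lock_table)


if __name__ == "__main__":
    # The lock table from step 1 against the cluster name from step 2.
    print(lock_table_matches(EXAMPLE_CONF, "mtprodbox0:xenhome"))
```

With the values from this report the check is false, which corresponds to the "fs requires cluster=... current=..." message that gfs_controld prints in debug mode.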

  
Actual results:
gfs_controld uses 100% CPU.

Expected results:
gfs_controld notes/logs failure and goes back to sleep.

Additional info:
Running the control daemon in debug mode (-D) revealed the configuration error
(the cluster name mismatch), but not why the daemon kept spinning. It would be
nice if the control daemon logged to syslog by default.
[root@xentest ~]# gfs_controld -D
1162991451 listen 3
1162991451 cpg 6
1162991451 groupd 8
1162991451 uevent 9
1162991451 plocks 12
1162991451 setup done
1162991475 client 6: join /mnt gfs2 lock_dlm mtprodbox0:xenhome rw,noatime
1162991475 mount: /mnt gfs2 lock_dlm mtprodbox0:xenhome rw,noatime
1162991475 mount: fs requires cluster="mtprodbox0" current="BigMamma"
1162991475 mount: failed -1
1162991475 client 6:
1162991475 client 6 fd 13 read error -1 104
[gfs_controld keeps spinning; top says approx 20/80% user/system time]
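
The report does not establish why the daemon spins, so the following is a hypothesis rather than a diagnosis. The last log line shows a read error with errno 104, which is ECONNRESET on Linux. One common way for a poll-based event loop to end up at 100% CPU with mostly system time is to leave a disconnected client's file descriptor in the poll set: poll() then returns immediately on every iteration. The sketch below reproduces that generic pattern with a socket pair. It is not gfs_controld code, and count_wakeups and its parameters are names invented for this illustration.

```python
import select
import socket


def count_wakeups(unregister_on_eof: bool, max_iterations: int = 50) -> int:
    """Count how many times poll() reports the dead client's descriptor.

    A well-behaved loop sees the disconnect once and drops the descriptor.
    A loop that keeps the descriptor registered is woken on every iteration.
    """
    daemon_end, client_end = socket.socketpair()
    poller = select.poll()
    poller.register(daemon_end.fileno(), select.POLLIN)

    # The client goes away, as it would after a failed mount request.
    client_end.close()

    wakeups = 0
    for _ in range(max_iterations):
        events = poller.poll(10)  # timeout in milliseconds
        if not events:
            break  # nothing pending: the loop can sleep
        wakeups += 1
        for fd, _mask in events:
            try:
                data = daemon_end.recv(64)
            except OSError:
                data = b""  # treat a read error like end-of-file
            if not data and unregister_on_eof:
                # Dropping the dead descriptor is what stops the busy loop.
                poller.unregister(fd)

    daemon_end.close()
    return wakeups


if __name__ == "__main__":
    print("with cleanup:   ", count_wakeups(unregister_on_eof=True))
    print("without cleanup:", count_wakeups(unregister_on_eof=False))
```

If this pattern applies, the expected behaviour described above would amount to removing the client's descriptor from the poll set when the read fails. Confirming it would need strace or a debugger attached to the spinning process.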

Comment 1 Bug Zapper 2008-04-04 04:31:44 UTC

Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers

Comment 2 Bug Zapper 2008-05-06 16:44:56 UTC

This bug is open for a Fedora version that is no longer maintained and
will not be fixed by Fedora. Therefore we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
