Bug 172894 - Misc. cluster locking performance + bugfixes
Summary: Misc. cluster locking performance + bugfixes
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: clumanager
Version: 3
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Keywords:
Depends On:
Blocks: 172895
Reported: 2005-11-10 21:46 UTC by Lon Hohberger
Modified: 2009-04-16 20:18 UTC

Fixed In Version: RHBA-2006-0196
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-03-27 18:07:25 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Fixes all three conditions (15.42 KB, patch)
2005-11-10 21:46 UTC, Lon Hohberger


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2006:0196 normal SHIPPED_LIVE clumanager bug fix update 2006-03-27 05:00:00 UTC

Description Lon Hohberger 2005-11-10 21:46:35 UTC
Description of problem:

(a) Cluster locking leaves hundreds of sockets open in the TIME_WAIT state on
clusters with many services.  Switching to UNIX domain sockets would not only
improve performance, but also significantly reduce this number.

(b) Cluster lock is not correctly released from cludb in all cases.

(c) Clulockd panics during unlock if the connection times out in this case.
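
For illustration of (a): TIME_WAIT is a TCP state, so a request/reply exchange
over a UNIX domain socket leaves no such entry behind once it is closed.  The
following is a minimal Python sketch of that kind of exchange, not the
clumanager code; the function names, the socket path, and the "ACK:" reply
format are all hypothetical.

```python
import os
import socket
import tempfile
import threading


def serve_once(listener):
    """Accept one connection, read a request, and send a reply."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(64)
        # Hypothetical wire format: acknowledge whatever was requested.
        conn.sendall(b"ACK:" + request)


def lock_request_over_unix_socket(payload):
    """Send one request over a UNIX domain socket and return the reply."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "lockd.sock")  # hypothetical socket path
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        listener.bind(path)
        listener.listen(1)
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.connect(path)
            client.sendall(payload)
            reply = client.recv(64)
        server.join()
        return reply
    finally:
        listener.close()
        # Remove the socket file and its temporary directory.
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(workdir)
```

Calling lock_request_over_unix_socket(b"LOCK") performs one local round trip
without opening any TCP connection.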


Version-Release number of selected component (if applicable): 1.2.28


How reproducible:


Steps to Reproduce:
(a) Create 10+ services and watch the number of sockets in TIME_WAIT as
reported by netstat -an.

(b) Indeterminate / difficult to reproduce.  This is a timing issue.

(c) Difficult, but possible to reproduce.  Edit the clulockd code to sleep(15)
or so after replying to a LOCK_LOCK request (with a LOCK_ACK...).  The
connecting node will fail to release the lock and raise:

  <alert> Unhandled Exception at clulock.c:408 in _clu_process_unlock
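
For step (a), counting the TIME_WAIT sockets by hand gets tedious.  A small
Python sketch that counts them in text captured from netstat -an could look
like the following; it assumes the usual Linux column layout, where TCP rows
end with the connection state, and the addresses and ports in SAMPLE are made
up for illustration.

```python
def count_time_wait(netstat_output):
    """Count TCP rows whose final column is TIME_WAIT."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only TCP rows (tcp, tcp6) carry a connection state in the last column.
        if fields and fields[0].startswith("tcp") and fields[-1] == "TIME_WAIT":
            count += 1
    return count


# Made-up sample in the shape of `netstat -an` output.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.1:34001          10.0.0.2:4000           TIME_WAIT
tcp        0      0 10.0.0.1:34002          10.0.0.2:4000           TIME_WAIT
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""
```

Running the function on output captured before and after adding services
makes the growth in TIME_WAIT sockets easy to compare.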
  
Actual results:  Stated.

Comment 1 Lon Hohberger 2005-11-10 21:46:36 UTC
Created attachment 120909
Fixes all three conditions

Comment 2 Lon Hohberger 2005-11-10 21:47:37 UTC
Setting to MODIFIED

Comment 3 Lon Hohberger 2005-11-11 20:58:47 UTC
A new *TEST* package build with several bugfixes (fixes bugzillas 171637,
172735, 172893, and 172894) is available.  Gulm-bridge support has been
disabled in this release so that it can be installed without the "--nodeps"
option:

http://people.redhat.com/lhh/clumanager-1.2.28.6-0.1nogfs.i386.rpm
http://people.redhat.com/lhh/clumanager-1.2.28.6-0.1nogfs.x86_64.rpm
http://people.redhat.com/lhh/clumanager-1.2.28.6-0.1nogfs.src.rpm

Let us know if this works for you.

Comment 7 Red Hat Bugzilla 2006-03-27 18:07:25 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0196.html


