Bug 156872
Summary: lt_high_locks setting
Product: [Retired] Red Hat Cluster Suite
Component: gulm
Version: 3
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Wendy Cheng <nobody+wcheng>
Assignee: michael conrad tadpol tilstra <mtilstra>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint, tao
Doc Type: Bug Fix
Last Closed: 2005-05-25 16:41:14 UTC
Description
Wendy Cheng
2005-05-04 20:27:15 UTC
Symptoms:

1) Numerous messages show up in /var/log/messages on the lock_gulm MASTER:

    Apr 28 10:51:51 icla2g lock_gulmd_LT000[11621]: Lock count is at 2127373 which is more than the max 2097152. Sending Drop all req to clients
    ..............

2) Performance is drastically reduced since the lock server keeps requesting nodes to drop their unnecessary locks.

The culprit seems to be in this routine, where val should have been cast (set) to ulong?

    unsigned long bound_to_ulong(int val, unsigned long min, unsigned long max)
    {
        if( val < min ) return min;
        if( val > max ) return max;
        return val;
    }

It's not there, actually. ccs doesn't have a function to find a long, only int or float values, so ccs is not reading the number correctly. To fix the code, I think libccs will need to add a find long function.

Another temporary thing the customer can do is decrease the rate at which the drop lock req are sent. This is lt_drop_req_rate; set it to the number of seconds between each drop req.

ccs in 6.0 can only find int, float, or string. So to get a long from ccs, either we need to have a string passed in and parse it ourselves, or we need to change the ccs libs.

Second report on GFS-6.0.2-25-i686 with the 2.4.21-27.0.2.ELhugemem kernel. The problem causes failover to occur.

Also, setting lt_high_locks to -1 will max it out (although if you dump the config with either -C or SIGUSR1, it will show -1 instead of 4294967295. One more thing to fix. wheeeee.)

Created attachment 114165
kludge around ccs's lack of find_css_long
This is a quick patch that can fix this bug. It kludges around things by
letting users specify unsigned longs as a string, i.e. lt_high_locks =
"4294967295" (which would be the maximum value).
This also changes a bunch of %d to %u in the config dump function.
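
For illustration only, a minimal sketch of the string workaround described above. This is not the code from attachment 114165: the helper name parse_ulong_setting and its signature are hypothetical. It assumes the value arrives from ccs as a string, converts it with the standard strtoul, and clamps the result to the allowed range.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not taken from the patch): parse a config value
 * given as a string into an unsigned long and clamp it to [min, max].
 * Returns 0 on success, -1 if the text is not a complete decimal number. */
int parse_ulong_setting(const char *text, unsigned long min,
                        unsigned long max, unsigned long *out)
{
    char *end = NULL;
    unsigned long val;

    assert(min <= max);
    if (text == NULL || out == NULL || *text == '\0')
        return -1;

    val = strtoul(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;              /* no digits at all, or trailing junk */

    /* If the number is too large, strtoul returns ULONG_MAX, which the
     * upper clamp below reduces to max. */
    if (val < min)
        val = min;
    if (val > max)
        val = max;

    *out = val;
    return 0;
}
```

Note that strtoul accepts a leading minus sign and negates the result in unsigned arithmetic, so a string of "-1" comes out as ULONG_MAX and is then clamped to max.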
Checked this into cvs.

Oh, you can still use numbers to set lt_high_locks, just in case that wasn't clear.

Without the patch, the following two settings effectively turn the HighWater lock drop request off:

    lt_high_locks = -1
    lt_drop_req_rate = -1

This actually sets both values to the maximum of an unsigned integer.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2005-466.html
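
As a side note on why -1 ends up as the maximum: the sketch below has the same shape as the bound_to_ulong routine quoted in the description, with the implicit conversions written out as explicit casts. It assumes the int read from ccs is passed straight into such a routine, which this report does not confirm.

```c
#include <assert.h>
#include <limits.h>

/* Same shape as the routine quoted in the description. val is a signed
 * int, but comparing it with an unsigned long converts it to unsigned
 * long first; the casts below only make that conversion visible. */
unsigned long bound_to_ulong(int val, unsigned long min, unsigned long max)
{
    assert(min <= max);

    /* (unsigned long)-1 is ULONG_MAX, so a negative val looks huge here. */
    if ((unsigned long)val < min)
        return min;
    if ((unsigned long)val > max)
        return max;
    return (unsigned long)val;
}
```

With an upper bound of ULONG_MAX, an input of -1 is returned as ULONG_MAX; with a smaller upper bound it is clamped to that bound.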