Bug 1207405
Summary: | RFE: please adjust timeouts for pcsd check (or allow to disable them) | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Jaroslav Kortus <jkortus>
Component: | pcs | Assignee: | Ivan Devat <idevat>
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 7.2 | CC: | c.handel, cluster-maint, glennduffy, jpokorny, jruemker, jss, royoung, rsteiger, sbradley, tojeline
Target Milestone: | rc | Keywords: | FutureFeature
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | pcs-0.9.151-1.el7 | Doc Type: | Enhancement
Doc Text: | Feature: Do not check pcsd status in the pcs status command unless the --full option is given; with --full, run the pcsd status checks in parallel. Reason: Make pcs status faster to run when some nodes are down. Result: The pcs status command runs faster when some nodes are down. | |
Story Points: | --- | |
Clone Of: | | Environment: |
Last Closed: | 2016-11-03 20:53:55 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | proposed fix (attachment 1128155) | |
Description Jaroslav Kortus 2015-03-30 20:55:42 UTC

I think it makes sense to lower the default to maybe 5 seconds, but also use --wait to allow for longer (or shorter) times.

Hi Chris, thanks for the quick reaction! I'd be happy with --wait=0 as a way to disable the functionality completely (and move it to --full). From a pure clustering perspective, cluster operations are not affected at all by the state pcsd is currently in. The scope of operations requiring (especially remote) pcsd is limited, correct? Ideally I would just add a check to those operations and remove it from pcs status completely (which would also bring it on par with pcs status xml). What do you think? Is it really that vital to have that information there?

# time pcs status &> /dev/null; time pcs status xml &> /dev/null

real    0m1.411s
user    0m0.222s
sys     0m0.087s

real    0m0.271s
user    0m0.210s
sys     0m0.052s

I like the 0.2s version much better :). Also, the timeout (if introduced) should apply to all checks in total (ideally run in parallel, as bug 1188659 suggests).

I'm also wondering whether completely removing the pcsd checks from the default pcs status output would make sense, and only doing them for 'pcs status --full' or something similar. Either way, we will want to default the timeout to 5 seconds (and allow changing it with --wait).

*** Bug 1188659 has been marked as a duplicate of this bug. ***

*** Bug 1214492 has been marked as a duplicate of this bug. ***

Created attachment 1128155 [details]
proposed fix
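The direction discussed above, a short per-node timeout with the checks run in parallel, can be illustrated with a minimal sketch. This is only a rough illustration, not the actual pcs patch: the node names and the check_node helper are made up for the example, and a plain TCP connect to the default pcsd port (2224) stands in for the real status request that pcs makes to pcsd.

```python
# Minimal sketch (not the actual pcs implementation) of a parallel pcsd
# status check with a per-node timeout. Node names and check_node are
# illustrative assumptions; pcsd listens on TCP port 2224 by default.
import concurrent.futures
import socket

PCSD_PORT = 2224      # default pcsd port
NODE_TIMEOUT = 5      # seconds per node, as proposed in the discussion

def check_node(node):
    """Return True if something answers on the pcsd port within the timeout."""
    try:
        with socket.create_connection((node, PCSD_PORT), timeout=NODE_TIMEOUT):
            return True
    except OSError:
        return False

def pcsd_status(nodes):
    """Query all nodes in parallel; total wall time is bounded by the
    slowest single node instead of the sum of all per-node timeouts."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        results = pool.map(check_node, nodes)
    return dict(zip(nodes, results))

if __name__ == "__main__":
    # hypothetical node names for illustration
    for node, online in pcsd_status(["node1", "node2", "node3"]).items():
        print(f"{node}: {'Online' if online else 'Offline'}")
```

With a sequential check, a cluster with several unreachable nodes pays the full timeout once per node; with the thread pool above, the total wait is roughly one timeout regardless of how many nodes are down, which is the behaviour the Doc Text describes for pcs status --full.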
Test:

[vm-rhel72-1 ~] # paralelize_pcsd_status $ pcs status | grep "PCSD Status:"
[vm-rhel72-1 ~] # paralelize_pcsd_status $ pcs status --full | grep "PCSD Status:"
PCSD Status:

This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.

Before fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.143-15.el7.x86_64
[vm-rhel72-1 ~] $ pcs status | grep "PCSD Status:"
PCSD Status:
[vm-rhel72-1 ~] $ pcs status --full | grep "PCSD Status:"
PCSD Status:

After fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.151-1.el7.x86_64
[vm-rhel72-1 ~] $ pcs status | grep "PCSD Status:"
[vm-rhel72-1 ~] $ pcs status --full | grep "PCSD Status:"
PCSD Status:

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2596.html

NOT FIXED. The pcsd web GUI is still incredibly slow after a node (or nodes) goes down. The problem still exists in EL7.7 with all updates as of 2019-10-11.