Bug 1486869
| Summary: | RFE: crm_mon deal with inactive cluster better | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | pacemaker | Assignee: | Ken Gaillot <kgaillot> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 7.6 | CC: | abeekhof, aherr, cfeist, cluster-maint, cmarthal, kwenning, mnovacek, phagara |
| Target Milestone: | rc | Keywords: | FutureFeature |
| Target Release: | 7.7 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | pacemaker-1.1.20-1.el7 | Doc Type: | No Doc Update |
| Doc Text: | Minor self-explanatory enhancement | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-08-06 12:53:38 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Corey Marthaler
2017-08-30 16:24:02 UTC
"crm_mon" runs in interactive mode, which shouldn't exit until the user tells it to, but we could improve the message -- maybe "Waiting until cluster is available on this node ...". For "crm_mon -X", I'm thinking "Error: cluster is not available on this node". pcs might have some more intelligence like checking for cluster processes, but all crm_mon knows is that it couldn't connect. The most likely reason is that the cluster is not running, but it could be something else, like our connection being blocked somehow. I think "not available" covers all the bases. Let me know if you have a better suggestion. both above suggestions sound fine. Thanks. Moving to RHEL 8 only, as this will not make 7.7, which will be the last RHEL 7 feature release Hasn't that been taken care of upstream already both on the master as the 1.0 branch?
Guess it is already in 8.0 and should get into 7.7 via rebase.
pacemaker-2.0:

> commit 3795ed48e40141fd72d869be05a03a758450dd68
> Author: Klaus Wenninger <klaus.wenninger>
> Date: Mon Aug 20 14:02:41 2018 +0200
>
> Fix: crm_mon: rhbz#1486869 - common language on connection-errors
>
> make a cluster-connection-failure CRIT instead of WARN for nagios

pacemaker-1.1:

> commit 7d9806020a17556272a1ab7ee8eb4d4228ea8667
> Author: Klaus Wenninger <klaus.wenninger>
> Date: Mon Aug 20 14:02:41 2018 +0200
>
> Fix: crm_mon: rhbz#1486869 - common language on connection-errors
>
> make a cluster-connection-failure CRIT instead of WARN for nagios
Sorry, it seems I missed putting that in here.
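The commits above change the severity of a cluster-connection failure in crm_mon's nagios mode from WARN to CRIT. As a sketch of why the exit code matters to monitoring wrappers (the `check_crm` and `fake_crm_mon` names are illustrative, not from the source; the 0/1/2/3 mapping is the standard nagios plugin convention):

```shell
# Nagios plugin convention: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
# A wrapper that runs a check command and labels its output by exit status,
# the way a nagios-style monitor interprets crm_mon's return code.
check_crm() {
    out=$("$@" 2>&1)
    rc=$?
    case $rc in
        0) echo "OK: $out" ;;
        1) echo "WARNING: $out" ;;
        2) echo "CRITICAL: $out" ;;
        *) echo "UNKNOWN: $out" ;;
    esac
}

# Stand-in for crm_mon on a node where the cluster is down (illustrative):
fake_crm_mon() { echo "cluster is not available on this node"; return 2; }

check_crm fake_crm_mon
# prints: CRITICAL: cluster is not available on this node
```

With the pre-fix behavior (exit 1), the same failure would have surfaced as a mere WARNING, which understates a complete loss of cluster connectivity.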
QA: Trivial verification, run crm_mon when no cluster is running:

% crm_mon -1
Error: cluster is not available on this node
% crm_mon -X
Error: cluster is not available on this node
% crm_mon
Waiting until cluster is available on this node ...

before (1.1.19-8.el7):
======================

> [root@virt-130 ~]# rpm -q pacemaker
> pacemaker-1.1.19-8.el7.x86_64
> [root@virt-130 ~]# pcs status
> Error: cluster is not currently running on this node
> [root@virt-130 ~]# crm_mon -1
>
> Connection to cluster failed: Transport endpoint is not connected
> [root@virt-130 ~]# crm_mon -X
>
> Connection to cluster failed: Transport endpoint is not connected
> [root@virt-130 ~]# crm_mon
> Attempting connection to the cluster...
> [stuck]

result: failure message unclear (connection error).

after (1.1.20-5.el7):
=====================

> [root@virt-054 ~]# rpm -q pacemaker
> pacemaker-1.1.20-5.el7.x86_64
> [root@virt-054 ~]# pcs status
> Error: cluster is not currently running on this node
> [root@virt-054 ~]# crm_mon -1
>
> Error: cluster is not available on this node
> [root@virt-054 ~]# crm_mon -X
>
> Error: cluster is not available on this node
> [root@virt-054 ~]# crm_mon
> Waiting until cluster is available on this node ...
> [stuck]

result: better failure message (cluster not available).

Marking as verified in 1.1.20-5.el7.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2129