Bug 1420565 - pgsql agent misuses crm_failcount
Summary: pgsql agent misuses crm_failcount
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: resource-agents
Version: 7.3
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Oyvind Albrigtsen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-02-09 00:34 UTC by Ken Gaillot
Modified: 2017-08-01 14:57 UTC (History)

Fixed In Version: resource-agents-3.9.5-88.el7
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 14:57:40 UTC
Target Upstream Version:




Links
System ID: Red Hat Product Errata RHBA-2017:1844
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: resource-agents bug fix and enhancement update
Last Updated: 2017-08-01 17:49:20 UTC

Description Ken Gaillot 2017-02-09 00:34:22 UTC
Description of problem: The ocf:heartbeat:pgsql resource agent calls crm_failcount in a certain error situation. This is not the appropriate means of indicating a node-fatal error, and could lead to unexpected behavior in a Pacemaker cluster.

Instead of using crm_failcount, the agent should return a hard error code such as OCF_ERR_ARGS or OCF_ERR_PERM.
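
As a rough illustration of that suggestion (not the actual pgsql agent code or the eventual fix), the sketch below shows an agent function reporting a node-fatal condition through its return code. The function name `start_precheck` and its lock-file argument are hypothetical. The exit code values are the standard OCF ones, defined inline here only so the snippet runs on its own; a real agent would get them by sourcing ocf-shellfuncs.

```shell
#!/bin/sh
# Standard OCF exit codes. Defined inline for this sketch only; real agents
# obtain them by sourcing ocf-shellfuncs.
OCF_SUCCESS=0
OCF_ERR_ARGS=2
OCF_ERR_PERM=4

# Hypothetical pre-start check (name and argument are assumptions):
# refuse to start on this node if a stale lock file is present.
start_precheck() {
    lock_file="$1"

    if [ -e "$lock_file" ]; then
        # Do not call crm_failcount here to force the fail count.
        # Returning a hard error code lets Pacemaker handle the failure.
        echo "ERROR: lock file '$lock_file' exists; refusing to start" >&2
        return $OCF_ERR_ARGS
    fi

    return $OCF_SUCCESS
}
```

A start action could then call `start_precheck "$some_lock_path" || return $?` so the hard error propagates to Pacemaker unchanged. Whether OCF_ERR_ARGS or OCF_ERR_PERM fits better depends on the specific failure being reported.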

The Pacemaker version shipping in RHEL 7.4 changes crm_failcount in a way that breaks this usage, so the fix needs to land in the same time frame.

This was discussed on the upstream mailing list, and a user offered to submit a fix, so a pull request may be forthcoming:

http://lists.clusterlabs.org/pipermail/users/2017-February/004958.html

Comment 4 michal novacek 2017-06-05 14:41:37 UTC
I have verified with our internal "pacemaker,resource,Postgres" test that the agent passes even when /usr/sbin/crm_failcount has permissions 000 (resource-agents-3.9.5-104.el7).

---

[root@host-133 ~]# aq.sh 'ls -l /usr/sbin/crm_failcount'
----> using /usr/tests/resource-STSRHTS10447.xml
----------. 1 root root 2309 May  9 16:25 /usr/sbin/crm_failcount
----------. 1 root root 2309 May  9 16:25 /usr/sbin/crm_failcount
----------. 1 root root 2309 May  9 16:25 /usr/sbin/crm_failcount

/usr/tests/sts-rhel7.4/vedder/bin/vedder-ng -t pacemaker,resource,Postgres
...
------------------- Summary ---------------------
Testcase                                 Result    
--------                                 ------    
generic_setup                            PASS      
setup-clvmd                              PASS      
setup-pacemaker                          PASS      
setup_initscripts                        PASS      
pacemaker-resource-Postgres              PASS      
cleanup                                  PASS      
=================================================
Total Tests Run: 6
Total PASS:      6
Total FAIL:      0
Total TIMEOUT:   0
Total KILLED:    0
Total STOPPED:   0
Test output in /tmp/vedder.CHERRY.STSRHTS10447.201706050930
Killing XMLRPC server...
DEBUG:STSXMLRPC:Killing server with PID 12953 (SIGTERM)
INFO:STSXMLRPC:Server terminated.

Comment 5 errata-xmlrpc 2017-08-01 14:57:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1844

