Bug 1292858 - pcs should timeout during network requests
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pcs
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Ondrej Mular
QA Contact: cluster-qe@redhat.com
Duplicates: 1395959
Depends On:
Blocks: 1334429 1395959
Reported: 2015-12-18 09:51 EST by Chris Feist
Modified: 2017-08-01 14:22 EDT
CC List: 7 users

See Also:
Fixed In Version: pcs-0.9.156-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-01 14:22:57 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Chris Feist 2015-12-18 09:51:59 EST
Description of problem:
If pcs connects to a remote node and that node hangs or doesn't respond, then 'pcs status' hangs as well.  We should probably have some sane timeouts (and error messages) for this case.

This was originally discovered with an MTU mismatch: the opening TCP connection to pcs succeeded, but no other packets could get through.

As a result, 'pcs status' simply hung at the point where pcsd was being queried.
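
The failure mode described here (the TCP connection opens, then the peer goes silent) can be reproduced in isolation. The following is a minimal Python sketch, not pcs code: start_silent_server and query are hypothetical helper names, and a local socket that accepts a connection and never replies stands in for the unreachable pcsd. It illustrates why a connect timeout alone would not help and a timeout on the read is needed too.

```python
import socket
import threading


def start_silent_server():
    """Listen on a free localhost port, accept one connection, never reply."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    done = threading.Event()

    def serve():
        conn, _ = srv.accept()
        done.wait()  # hold the connection open without sending anything
        conn.close()
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1], done


def query(host, port, timeout):
    """Return the reply bytes, or None if the peer stays silent past `timeout` seconds."""
    # The timeout passed to create_connection also applies to later
    # operations on the socket, so recv() below cannot block forever.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"status\n")
        try:
            return sock.recv(1024)
        except socket.timeout:
            return None


if __name__ == "__main__":
    port, done = start_silent_server()
    reply = query("127.0.0.1", port, timeout=0.5)
    print("timed out" if reply is None else reply)
    done.set()
```

Without the timeout argument, the recv() call would block indefinitely even though the connection itself was established, which matches the hang reported above.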
Comment 5 Ondrej Mular 2017-02-03 11:19:11 EST
upstream patches:
https://github.com/ClusterLabs/pcs/commit/076b8b6ea473835810596f967bef41d7cf1f
https://github.com/ClusterLabs/pcs/commit/731127b8cfffd29c8546bd4a8a461f7aade5

New parameter --request-timeout has been added to pcs.

TEST:
2 node cluster: rhel7-node1 rhel7-node2
Block port 2224 (pcsd) on rhel7-node2
[root@rhel7-node2 ~]# iptables -I OUTPUT -p tcp --dport 2224 -j DROP
[root@rhel7-node2 ~]# iptables -I INPUT -p tcp --dport 2224 -j DROP

Then try to run some commands on rhel7-node1 (they should time out instead of hanging):
[root@rhel7-node1 ~]# pcs cluster auth rhel7-node2 -uhacluster --request-timeout=3
Password: 
Error: Operation timed out
Error: Unable to communicate with rhel7-node2

[root@rhel7-node1 ~]# pcs stonith sbd status --request-timeout=3
Warning: rhel7-node2: Connection timeout (Connection timed out after 3001 milliseconds)
Warning: Unable to get status of SBD from node 'rhel7-node2'
SBD STATUS
<node name>: <installed> | <enabled> | <running>
rhel7-node1:  NO |  NO |  NO
rhel7-node2: N/A | N/A | N/A
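
A check like the one above could also be automated by running the command under a wall-clock limit, so that a regression shows up as a failed check rather than a stuck terminal. This is only a sketch under assumptions: finishes_within is a hypothetical helper, and the commands in the example are stand-ins so that it runs without a cluster.

```python
import subprocess
import sys


def finishes_within(cmd, limit):
    """Run `cmd`; return True if it exits within `limit` seconds, False if it had to be killed."""
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=limit,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising TimeoutExpired.
        return False


if __name__ == "__main__":
    # Stand-in commands (assumption). On a test cluster, cmd could be
    # ["pcs", "stonith", "sbd", "status", "--request-timeout=3"]
    # with a limit somewhat above the 3 second request timeout.
    quick = [sys.executable, "-c", "pass"]
    stuck = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(finishes_within(quick, 10))
    print(finishes_within(stuck, 1))
```

The helper only checks that the command returns in time; it does not inspect the exit code or the warning text.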
Comment 7 Ivan Devat 2017-02-20 03:19:28 EST
After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.156-1.el7.x86_64

[vm-rhel72-1 ~] $ pcs stonith sbd status --request-timeout=3
Warning: vm-rhel72-3: Connection timeout (Connection timed out after 3001 milliseconds)
Warning: Unable to get status of SBD from node 'vm-rhel72-3'
SBD STATUS
<node name>: <installed> | <enabled> | <running>
vm-rhel72-1: YES |  NO |  NO
vm-rhel72-3: N/A | N/A | N/A
Comment 9 Tomas Jelinek 2017-02-20 08:53:54 EST
*** Bug 1395959 has been marked as a duplicate of this bug. ***
Comment 12 errata-xmlrpc 2017-08-01 14:22:57 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1958
