Bug 1292858 - pcs should timeout during network requests
Summary: pcs should timeout during network requests
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pcs
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: Ondrej Mular
QA Contact: cluster-qe@redhat.com
Duplicates: 1395959
Depends On:
Blocks: 1334429 1395959
Reported: 2015-12-18 14:51 UTC by Chris Feist
Modified: 2017-08-01 18:22 UTC (History)

Fixed In Version: pcs-0.9.156-1.el7
Doc Text:
Clone Of:
Last Closed: 2017-08-01 18:22:57 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1463327 | 0 | high | CLOSED | Starting a larger cluster times out | 2021-02-22 00:41:40 UTC
Red Hat Product Errata | RHBA-2017:1958 | 0 | normal | SHIPPED_LIVE | pcs bug fix and enhancement update | 2017-08-01 18:09:47 UTC

Internal Links: 1463327

Description Chris Feist 2015-12-18 14:51:59 UTC
Description of problem:
If pcs connects to a remote node and that node hangs or doesn't respond, then 'pcs status' will hang too.  pcs should apply sane timeouts and print a clear error message when this happens.

This was originally discovered with an MTU mismatch: the initial TCP connection to pcsd succeeded, but no other packets could get through.

As a result, 'pcs status' hung at the point where pcsd was being queried.
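
To illustrate the failure mode described above, here is a minimal Python sketch. It is not taken from pcs. It starts a local listener that completes the TCP handshake but never sends a reply, then queries it with a per-request timeout. The names start_silent_server and query, and the request bytes, are made up for this illustration.

```python
import socket


def start_silent_server():
    """Listen on a free local port and never reply.

    The kernel completes the TCP handshake for queued connections even
    though accept() is never called, so a client connects successfully
    and then receives nothing, much like the MTU-mismatch case.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    srv.listen(1)
    return srv, srv.getsockname()[1]


def query(port, timeout):
    """Send one request and return 'response' or 'timed out'."""
    # The timeout given to create_connection() also applies to the
    # later recv() call on the returned socket.
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as sock:
        sock.sendall(b"GET /status HTTP/1.0\r\n\r\n")  # hypothetical request
        try:
            sock.recv(1024)
        except socket.timeout:
            return "timed out"
    return "response"


if __name__ == "__main__":
    server, port = start_silent_server()
    print(query(port, timeout=3))
    server.close()
```

With timeout=None the recv() call would block for as long as the peer stays silent, which is the hang reported here.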

Comment 5 Ondrej Mular 2017-02-03 16:19:11 UTC
upstream patches:

New parameter --request-timeout has been added to pcs.

2 node cluster: rhel7-node1 rhel7-node2
Block port 2224 (pcsd) on rhel7-node2
[root@rhel7-node2 ~]# iptables -I OUTPUT -p tcp --dport 2224 -j DROP
[root@rhel7-node2 ~]# iptables -I INPUT -p tcp --dport 2224 -j DROP

Then try to run some commands on rhel7-node1 (they should time out instead of hanging):
[root@rhel7-node1 ~]# pcs cluster auth rhel7-node2 -uhacluster --request-timeout=3
Error: Operation timed out
Error: Unable to communicate with rhel7-node2

[root@rhel7-node1 ~]# pcs stonith sbd status --request-timeout=3
Warning: rhel7-node2: Connection timeout (Connection timed out after 3001 milliseconds)
Warning: Unable to get status of SBD from node 'rhel7-node2'
<node name>: <installed> | <enabled> | <running>
rhel7-node1:  NO |  NO |  NO
rhel7-node2: N/A | N/A | N/A
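
For scripts that consume this output, the following Python sketch shows one way to find the nodes that could not be reached. It assumes the table keeps the layout shown above (one "name: value | value | value" line per node, with N/A in every column for an unreachable node). The function names parse_sbd_status and unreachable_nodes are hypothetical and not part of pcs.

```python
def parse_sbd_status(output):
    """Map node name -> (installed, enabled, running) from the status table."""
    nodes = {}
    for line in output.splitlines():
        # Skip warnings, errors, the "<node name>: ..." header, and blank lines.
        if line.startswith(("Warning:", "Error:", "<")) or ":" not in line:
            continue
        name, _, flags = line.partition(":")
        values = [value.strip() for value in flags.split("|")]
        if len(values) == 3:
            nodes[name.strip()] = tuple(values)
    return nodes


def unreachable_nodes(output):
    """Return the sorted names of nodes whose columns are all N/A."""
    status = parse_sbd_status(output)
    return sorted(
        name for name, values in status.items()
        if all(value == "N/A" for value in values)
    )


# Output copied from the session above.
SAMPLE = """\
Warning: rhel7-node2: Connection timeout (Connection timed out after 3001 milliseconds)
Warning: Unable to get status of SBD from node 'rhel7-node2'
<node name>: <installed> | <enabled> | <running>
rhel7-node1:  NO |  NO |  NO
rhel7-node2: N/A | N/A | N/A
"""

if __name__ == "__main__":
    print(unreachable_nodes(SAMPLE))
```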

Comment 7 Ivan Devat 2017-02-20 08:19:28 UTC
After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs

[vm-rhel72-1 ~] $ pcs stonith sbd status --request-timeout=3
Warning: vm-rhel72-3: Connection timeout (Connection timed out after 3001 milliseconds)
Warning: Unable to get status of SBD from node 'vm-rhel72-3'
<node name>: <installed> | <enabled> | <running>
vm-rhel72-1: YES |  NO |  NO
vm-rhel72-3: N/A | N/A | N/A

Comment 9 Tomas Jelinek 2017-02-20 13:53:54 UTC
*** Bug 1395959 has been marked as a duplicate of this bug. ***

Comment 12 errata-xmlrpc 2017-08-01 18:22:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

