Bug 1394273

Summary: [cli] connection interrupted by pcsd restart results in a traceback
Product: Red Hat Enterprise Linux 6
Reporter: Radek Steiger <rsteiger>
Component: pcs
Assignee: Ondrej Mular <omular>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: unspecified
Docs Contact:
Priority: high
Version: 6.8
CC: cfeist, cluster-maint, idevat, omular, tojeline
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: pcs-0.9.155-2.el6
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-03-21 11:04:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Radek Steiger 2016-11-11 14:34:18 UTC
> Description of problem:

Cluster setup schedules an automatic restart of the pcsd daemon. Any connection the pcs CLI makes while that restart is in progress is interrupted, and the CLI prints a traceback instead of an error message. The failure depends on timing, but running cluster setup immediately followed by cluster start should be enough to reproduce it.


> Version-Release number of selected component (if applicable):

pcs-0.9.155-1.el6.x86_64


> Additional info:

[root@virt-011 ~]# pcs cluster setup --name STSRHTS30264 virt-011 virt-012 virt-038 && pcs cluster start --all --wait
Destroying cluster on nodes: virt-011, virt-012, virt-038...
virt-011: Stopping Cluster (pacemaker)...
virt-038: Stopping Cluster (pacemaker)...
virt-012: Stopping Cluster (pacemaker)...
virt-038: Successfully destroyed cluster
virt-012: Successfully destroyed cluster
virt-011: Successfully destroyed cluster

Sending cluster config files to the nodes...
virt-011: Updated cluster.conf...
virt-012: Updated cluster.conf...
virt-038: Updated cluster.conf...

Synchronizing pcsd certificates on nodes virt-011, virt-012, virt-038...
virt-011: Success
virt-012: Success
virt-038: Success

Restarting pcsd on the nodes in order to reload the certificates...
virt-011: Success
virt-012: Success
virt-038: Success
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 1112, in worker
    returncode, output = action(node, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 261, in startCluster
    return sendHTTPRequest(node, 'remote/cluster_start', None, False, not quiet)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 410, in sendHTTPRequest
    result = opener.open(url,data)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1198, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1163, in do_open
    r = h.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 1049, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 433, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 397, in _read_status
    raise BadStatusLine(line)
BadStatusLine

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 1112, in worker
    returncode, output = action(node, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 261, in startCluster
    return sendHTTPRequest(node, 'remote/cluster_start', None, False, not quiet)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 410, in sendHTTPRequest
    result = opener.open(url,data)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1198, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1163, in do_open
    r = h.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 1049, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 433, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 397, in _read_status
    raise BadStatusLine(line)
BadStatusLine

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 1112, in worker
    returncode, output = action(node, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 261, in startCluster
    return sendHTTPRequest(node, 'remote/cluster_start', None, False, not quiet)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 410, in sendHTTPRequest
    result = opener.open(url,data)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1198, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1163, in do_open
    r = h.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 1049, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 433, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 397, in _read_status
    raise BadStatusLine(line)
BadStatusLine

Waiting for node(s) to start...
virt-038: Started
virt-012: Started
virt-011: Started
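
Note on the tracebacks above: each one ends in BadStatusLine, which the HTTP client raises when the server closes the connection without sending a valid status line. That matches pcsd going away mid-request. The exception is not a URLError, so a handler that catches only URLError lets it escape the worker thread. Below is a minimal sketch of turning it into the (returncode, output) pair the worker unpacks. It uses Python 3 module names (http.client rather than the httplib shown in the traceback), and the function name, return codes, and message text are assumptions for illustration, not the pcs code or the upstream fix.

```python
import http.client
import urllib.error


def send_request_safely(open_url, url, data=None):
    """Call open_url(url, data) and return a (returncode, output) pair.

    open_url is any callable that behaves like a urllib opener's open().
    The return codes 0/1/2 are illustrative, not values used by pcs.
    """
    try:
        response = open_url(url, data)
        return 0, response.read()
    except urllib.error.HTTPError as e:
        # The server answered, but with an HTTP error status.
        return 1, "HTTP error: %d" % e.code
    except (urllib.error.URLError, http.client.HTTPException, OSError) as e:
        # BadStatusLine is an HTTPException: the connection was dropped
        # before a status line arrived, e.g. because the daemon restarted.
        return 2, "Unable to connect (%s)" % type(e).__name__
```

With a wrapper like this, the worker would receive an error pair it can print per node rather than dying with an unhandled exception.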

Comment 3 Ondrej Mular 2016-11-16 16:35:24 UTC
upstream fix:
https://github.com/ClusterLabs/pcs/commit/151fa853c2b56e71987b3cb4d8114c3ed6be

TEST:
[root@rhel6-node1 ~]# pcs cluster setup --name rh6 rh6-{1,2,3} && pcs cluster start --all --wait
Destroying cluster on nodes: rh6-1, rh6-2, rh6-3...
rh6-3: Stopping Cluster (pacemaker)...
rh6-1: Stopping Cluster (pacemaker)...
rh6-2: Stopping Cluster (pacemaker)...
rh6-1: Successfully destroyed cluster
rh6-2: Successfully destroyed cluster
rh6-3: Successfully destroyed cluster

Sending cluster config files to the nodes...
rh6-1: Updated cluster.conf...
rh6-2: Updated cluster.conf...
rh6-3: Updated cluster.conf...

Synchronizing pcsd certificates on nodes rh6-1, rh6-2, rh6-3...
rh6-1: Success
rh6-3: Success
rh6-2: Success

Restarting pcsd on the nodes in order to reload the certificates...
rh6-1: Success
rh6-3: Success
rh6-2: Success
rh6-2: Starting Cluster...
rh6-3: Unable to connect to rh6-3 (Connection error)
rh6-1: Unable to connect to rh6-1 (Connection error)
Error: unable to start all nodes
rh6-1: Unable to connect to rh6-1 (Connection error)
rh6-3: Unable to connect to rh6-3 (Connection error)
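
In the test output above the connection error is now reported per node, but the start can still fail while pcsd is restarting. One way a caller could ride out that window is to retry the request a few times before giving up. The sketch below assumes an action that returns a (returncode, output) pair with 0 meaning success; the helper name, the attempt count, and the delay are hypothetical, and this is not a description of what pcs itself does.

```python
import time


def retry_until_connected(action, attempts=5, delay=1.0, sleep=time.sleep):
    """Call action() until it returns a zero return code or attempts run out.

    action must return a (returncode, output) pair. The last pair seen is
    returned, so the caller can still print the final error message.
    """
    returncode, output = 1, ""
    for attempt in range(attempts):
        returncode, output = action()
        if returncode == 0:
            break
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return returncode, output
```

The sleep function is a parameter so the delay can be replaced in tests; in normal use the default time.sleep applies.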

Comment 4 Ivan Devat 2016-11-25 11:04:33 UTC
Before Fix:

[vm-rhel67-1 ~] $ rpm -q pcs
pcs-0.9.155-1.el6.x86_64

Run it multiple times...

[vm-rhel67-1 ~] $ pcs cluster setup --name=devcluster6 vm-rhel67-1 vm-rhel67-2 vm-rhel67-3 && pcs cluster start --all --wait
Destroying cluster on nodes: vm-rhel67-1, vm-rhel67-2, vm-rhel67-3...
vm-rhel67-1: Stopping Cluster (pacemaker)...
vm-rhel67-3: Stopping Cluster (pacemaker)...
vm-rhel67-2: Stopping Cluster (pacemaker)...
vm-rhel67-1: Successfully destroyed cluster
vm-rhel67-3: Successfully destroyed cluster
vm-rhel67-2: Successfully destroyed cluster

Sending cluster config files to the nodes...
vm-rhel67-1: Updated cluster.conf...
vm-rhel67-2: Updated cluster.conf...
vm-rhel67-3: Updated cluster.conf...

Synchronizing pcsd certificates on nodes vm-rhel67-1, vm-rhel67-2, vm-rhel67-3...
vm-rhel67-1: Success
vm-rhel67-2: Success
vm-rhel67-3: Success

Restarting pcsd on the nodes in order to reload the certificates...
vm-rhel67-1: Success
vm-rhel67-2: Success
vm-rhel67-3: Success
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 1112, in worker
    returncode, output = action(node, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 261, in startCluster
    return sendHTTPRequest(node, 'remote/cluster_start', None, False, not quiet)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 410, in sendHTTPRequest
    result = opener.open(url,data)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1198, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1163, in do_open
    r = h.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 1012, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 404, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 368, in _read_status
    raise BadStatusLine(line)
BadStatusLine

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 1112, in worker
    returncode, output = action(node, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 261, in startCluster
    return sendHTTPRequest(node, 'remote/cluster_start', None, False, not quiet)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 410, in sendHTTPRequest
    result = opener.open(url,data)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1198, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1163, in do_open
    r = h.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 1012, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 404, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 368, in _read_status
    raise BadStatusLine(line)
BadStatusLine

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 1112, in worker
    returncode, output = action(node, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 261, in startCluster
    return sendHTTPRequest(node, 'remote/cluster_start', None, False, not quiet)
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 410, in sendHTTPRequest
    result = opener.open(url,data)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1198, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1163, in do_open
    r = h.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 1012, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 404, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 368, in _read_status
    raise BadStatusLine(line)
BadStatusLine

Waiting for node(s) to start...
vm-rhel67-1: Error connecting to vm-rhel67-1 - (HTTP error: 400)
vm-rhel67-3: Started
vm-rhel67-2: Started
Error: unable to verify all nodes have started


After Fix:

[vm-rhel67-1 ~] $ rpm -q pcs
pcs-0.9.155-2.el6.x86_64

Run it multiple times...

[vm-rhel67-1 ~] $ pcs cluster setup --name=devcluster6 vm-rhel67-1 vm-rhel67-2 vm-rhel67-3 && pcs cluster start --all --wait
Destroying cluster on nodes: vm-rhel67-1, vm-rhel67-2, vm-rhel67-3...
vm-rhel67-3: Stopping Cluster (pacemaker)...
vm-rhel67-1: Stopping Cluster (pacemaker)...
vm-rhel67-2: Stopping Cluster (pacemaker)...
vm-rhel67-3: Successfully destroyed cluster
vm-rhel67-1: Successfully destroyed cluster
vm-rhel67-2: Successfully destroyed cluster

Sending cluster config files to the nodes...
vm-rhel67-1: Updated cluster.conf...
vm-rhel67-2: Updated cluster.conf...
vm-rhel67-3: Updated cluster.conf...

Synchronizing pcsd certificates on nodes vm-rhel67-1, vm-rhel67-2, vm-rhel67-3...
vm-rhel67-1: Success
vm-rhel67-2: Success
vm-rhel67-3: Success

Restarting pcsd on the nodes in order to reload the certificates...
vm-rhel67-1: Success
vm-rhel67-2: Success
vm-rhel67-3: Success
vm-rhel67-2: Starting Cluster...
vm-rhel67-1: Starting Cluster...
vm-rhel67-3: Starting Cluster...
Waiting for node(s) to start...
vm-rhel67-2: Started
vm-rhel67-3: Started
vm-rhel67-1: Started

Comment 8 errata-xmlrpc 2017-03-21 11:04:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0707.html