Description of problem:
Running cluster setup may sometimes get stuck, with the pcsd-cli.rb subprocesses never returning. It looks like the problem is in the pipe between the Python caller and the Ruby subprocesses when calling read_tokens. This is best reproduced by running the cluster setup in a loop with a 10-second delay between attempts (timing seems to be key here). It might take several minutes before the setup gets stuck.

Version-Release number of selected component (if applicable):
[root@virt-031 ~]# rpm -q pcs
pcs-0.9.148-3.el6.x86_64

How reproducible:
~10%

Steps to Reproduce:
1. while [ $? -eq 0 ]; do sleep 10; date; pcs cluster setup --name STSRHTS24987 virt-030 virt-031 virt-033 --force --debug; done
2. Wait until the loop above hangs.

Actual results:
Debug output gets stuck here:

<...snip...>
Destroying cluster on nodes: virt-030, virt-031, virt-033...
Running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb read_tokens
--Debug Input Start--
{}
--Debug Input End--
Running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb read_tokens
--Debug Input Start--
{}
--Debug Input End--
Running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb read_tokens
--Debug Input Start--
{}
--Debug Input End--

Processes:
root 6808  4.8 1.7 455108 18080 pts/3 Sl+ 15:57 0:00 \_ /usr/bin/python /usr/sbin/pcs cluster setup --name STSRHTS24987 virt-030 virt-031 virt-033 --force --debug
root 6983 11.0 0.6  41772  6756 pts/3 R+  15:57 0:00     \_ /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb read_tokens
root 6985 11.0 0.6  41776  6756 pts/3 R+  15:57 0:00     \_ /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb read_tokens
root 6986 11.0 0.6  41768  6760 pts/3 R+  15:57 0:00     \_ /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb read_tokens

Expected results:
Cluster setup finishes in all cases.
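For illustration only, here is a minimal Python 3 sketch of one defensive way to launch several pipe-connected children from threads. It is not the upstream patch and does not claim to identify the root cause of this bug. One generally known way for such hangs to arise is a child started in one thread inheriting the write end of another child's stdin pipe, so that the other child never sees EOF. The sketch guards against that with close_fds=True and adds a timeout so a stuck child cannot block the caller forever. The names CHILD_CMD, run_child and run_in_parallel are hypothetical, and the child command is a stand-in that echoes stdin rather than the real pcsd-cli.rb call.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for the real child command
# (/usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb read_tokens).
# It simply copies its stdin to stdout.
CHILD_CMD = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read())"]


def run_child(cmd, stdin_data, timeout=30):
    """Run cmd, feed it stdin_data, return (returncode, stdout, stderr).

    returncode is None if the child had to be killed after the timeout.
    """
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        close_fds=True,           # do not leak sibling pipe ends into the child
        universal_newlines=True,  # exchange str instead of bytes
    )
    try:
        # communicate() writes stdin, closes it, and drains both output
        # pipes, so a full pipe buffer cannot stall the exchange.
        out, err = proc.communicate(stdin_data, timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        out, err = proc.communicate()  # reap the child, collect partial output
        return None, out, err
    return proc.returncode, out, err


def run_in_parallel(cmd, inputs, timeout=30):
    """Run one child per input concurrently; results keep the input order."""
    results = [None] * len(inputs)

    def worker(index, data):
        results[index] = run_child(cmd, data, timeout)

    threads = [threading.Thread(target=worker, args=(i, d))
               for i, d in enumerate(inputs)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling run_in_parallel(CHILD_CMD, ["{}", "{}", "{}"]) mirrors the three read_tokens calls in the debug output above, each receiving {} on stdin. Note that pcs 0.9 runs under Python 2, where communicate() has no timeout parameter and close_fds defaults to False, so this sketch would need adapting there.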
Patch in upstream: https://github.com/feist/pcs/commit/59dde9ba191bc079aee08ac25576b51b9a85681c
Test: see the steps to reproduce in the bug description.
The problem occurs only occasionally and is not easy to reproduce. After the fix, only a successful cluster setup was tested.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0739.html