Bug 1396462

Summary: [gui] jquery breaks if the first cluster_status request provides not enough data
Product: Red Hat Enterprise Linux 7 Reporter: Radek Steiger <rsteiger>
Component: pcs Assignee: Ondrej Mular <omular>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: unspecified Docs Contact:
Priority: urgent    
Version: 7.3 CC: cfeist, cluster-maint, idevat, mlisik, omular, tojeline
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pcs-0.9.156-1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 18:24:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Radek Steiger 2016-11-18 11:42:37 UTC
> Description of problem:

If a cluster is under such heavy load that the pcsd daemon is unable to provide useful data, the jQuery/Ember framework inside a browser instance breaks in such a way that it keeps failing to show the cluster status even if pcsd provides full data later. The only way out of this state is a full page reload.

The supposedly "guilty" JSON response looks like this:

{"cluster_name":"STSRHTS29046","error_list":[],"warning_list":[],"quorate":false,"status":"error","node_list":[{"name":"virt-131","status":"unknown","warning_list":[],"error_list":[]},{"name":"virt-123","status":"unknown","warning_list":[],"error_list":[]}],"resource_list":[],"available_features":[]}

The following JavaScript error can be observed in the browser's console:

Uncaught TypeError: Cannot read property 'length' of undefined
    at Function.each (https://virt-123.cluster-qe.lab.eng.brq.redhat.com:2224/js/jquery-1.9.1.min.js:3:5102)
    at Class.<anonymous> (https://virt-123.cluster-qe.lab.eng.brq.redhat.com:2224/js/nodes-ember.js:812:7)
    at ComputedPropertyPrototype.get (https://virt-123.cluster-qe.lab.eng.brq.redhat.com:2224/js/ember-1.4.0.js:4628:38)
    at get (https://virt-123.cluster-qe.lab.eng.brq.redhat.com:2224/js/ember-1.4.0.js:1916:17)
    at Ember._getPath (https://virt-123.cluster-qe.lab.eng.brq.redhat.com:2224/js/ember-1.4.0.js:1994:12)
    at get (https://virt-123.cluster-qe.lab.eng.brq.redhat.com:2224/js/ember-1.4.0.js:1911:12)
    at getWithGlobals (https://virt-123.cluster-qe.lab.eng.brq.redhat.com:2224/js/ember-1.4.0.js:6812:10)
    at Binding._sync (https://virt-123.cluster-qe.lab.eng.brq.redhat.com:2224/js/ember-1.4.0.js:7007:23)
    at DeferredActionQueues.flush (https://virt-123.cluster-qe.lab.eng.brq.redhat.com:2224/js/ember-1.4.0.js:5678:24)
    at Backburner.end (https://virt-123.cluster-qe.lab.eng.brq.redhat.com:2224/js/ember-1.4.0.js:5769:27)


> Version-Release number of selected component (if applicable):

pcs-0.9.152-10.el7.x86_64


> How reproducible:

Sometimes (depends on the possibility of getting an almost-empty json response)


> Steps to Reproduce:

1. In this case it was enough to follow the setup from bug 1395959
2. Halt one of the nodes
3. Open cluster details in a web interface served from the running node
4. Use browser's inspector to watch the requests


> Actual results:

The first cluster_status request returns JSON with no useful data, and a JS error is shown in the debug console. Even when a subsequent cluster_status request returns a full JSON response (over 300 kB in my case), the cluster management page still does not show any cluster details.


> Expected results:

The cluster management page is updated as soon as the browser receives any valid response.


> Additional info:

A snapshot of Chrome's network log:

main	200	document	Other	147 KB	1.51 s	
style.css	304	stylesheet	main:4	344 B	36 ms	
overpass.css	304	stylesheet	main:5	344 B	50 ms	
liberation.css	304	stylesheet	main:6	344 B	64 ms	
jquery-ui-1.10.1.custom.css	304	stylesheet	main:7	344 B	80 ms	
jquery-1.9.1.min.js	304	script	main:12	344 B	155 ms	
jquery-ui-1.10.1.custom.min.js	304	script	main:13	344 B	95 ms	
handlebars-v1.2.1.js	304	script	main:14	344 B	151 ms	
ember-1.4.0.js	304	script	main:15	344 B	108 ms	
pcsd.js	304	script	main:16	344 B	137 ms	
nodes-ember.js	304	script	main:1252	344 B	103 ms	
overpass_regular-web.woff	200	font	jquery-1.9.1.min.js:5	(from memory cache)	0 ms	
overpass_bold-web.woff	200	font	jquery-1.9.1.min.js:5	(from memory cache)	0 ms	
LiberationSans-Regular.ttf	200	font	jquery-1.9.1.min.js:5	(from memory cache)	0 ms	
LiberationSans-Bold.ttf	200	font	jquery-1.9.1.min.js:5	(from memory cache)	0 ms	
ui-bg_inset-soft_25_000000_1x100.png	200	png	jquery-1.9.1.min.js:5	(from memory cache)	0 ms	
ui-bg_gloss-wave_25_333333_500x100.png	200	png	jquery-1.9.1.min.js:5	(from memory cache)	0 ms	
pbar-ani.gif	200	gif	jquery-1.9.1.min.js:5	(from memory cache)	0 ms	
ui-icons_cccccc_256x240.png	200	png	jquery-1.9.1.min.js:5	(from memory cache)	0 ms	
ui-bg_glass_20_555555_1x400.png	200	png	jquery-1.9.1.min.js:5	(from memory cache)	0 ms	
ui-bg_flat_50_5c5c5c_40x100.png	200	png	jquery-1.9.1.min.js:5	(from memory cache)	0 ms	
ui-bg_glass_40_0078a3_1x400.png	200	png	jquery-1.9.1.min.js:3	(from memory cache)	0 ms	
ui-icons_ffffff_256x240.png	200	png	jquery-1.9.1.min.js:3	(from memory cache)	0 ms	
cluster_status	200	xhr	jquery-1.9.1.min.js:5	757 B	30.18 s	
HAM-logo.png	304	png	ember-1.4.0.js:20860	344 B	31 ms	
get_resource_agent_metadata?agent=ocf%3Aheartbeat%3Aapache	200	xhr	jquery-1.9.1.min.js:5	5.0 KB	3.49 s	
Shell_bg.png	200	png	main:-Infinity	(from memory cache)	0 ms	
action-icons.png	200	png	main:-Infinity	(from memory cache)	0 ms	
field_bg.png	200	png	main:-Infinity	(from memory cache)	0 ms	
get_resource_agent_metadata?agent=ocf%3Aheartbeat%3Aapache	200	xhr	jquery-1.9.1.min.js:5	5.0 KB	5.98 s	
get_fence_agent_metadata?agent=stonith%3Afence_apc	200	xhr	jquery-1.9.1.min.js:5	14.4 KB	4.05 s	
favicon.ico	200	vnd.microsoft.icon	Other	759 B	134 ms	
cluster_properties	200	xhr	jquery-1.9.1.min.js:5	12.8 KB	2.77 s	
cluster_status	200	xhr	jquery-1.9.1.min.js:5	315 KB	29.52 s	
get_fence_agent_metadata?agent=stonith%3Afence_xvm	200	xhr	jquery-1.9.1.min.js:5	13.3 KB	1.66 s	
get_resource_agent_metadata?agent=ocf%3Aheartbeat%3ADummy	200	xhr	jquery-1.9.1.min.js:5	1.5 KB	1.74 s	
cluster_status	200	xhr	jquery-1.9.1.min.js:5	315 KB	15.31 s

Comment 2 Tomas Jelinek 2017-01-11 13:37:23 UTC
It looks like the crash occurs on line:
$.each(self.get("group_list"), function(_, group) {
in nodes-ember.js when Pcs.resourcesContainer.group_list is not an array. This should be easy to fix. We can return an empty list from the groups_enum function if we cannot get the groups. But it would be better to fix the issue at its roots instead of where it manifests itself.

Ondrej, can you take a look at this and fix it in the place where the fix fits the best? Thanks!
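
A minimal sketch of the stop-gap mentioned above (returning an empty list when the groups are unavailable), assuming group_list can arrive as undefined. The names asList and groupIds are hypothetical and used for illustration only; this is not the upstream patch, and it uses plain Array methods instead of jQuery's $.each so that it runs standalone.

```javascript
// Hypothetical guard: coerce anything that is not an array to an empty list.
function asList(value) {
  return Array.isArray(value) ? value : [];
}

// Stand-in for the loop over group_list; collects the id of each group.
function groupIds(container) {
  const ids = [];
  asList(container.group_list).forEach(function (group) {
    ids.push(group.id);
  });
  return ids;
}

// group_list missing: the loop is skipped instead of throwing a TypeError.
console.log(groupIds({}).length);
// group_list present: entries are processed as usual.
console.log(groupIds({ group_list: [{ id: "grp1" }] }));
```

A guard like this only hides the symptom at the point of iteration; making sure group_list is always an array where it is populated addresses the cause.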

Comment 4 Ondrej Mular 2017-02-03 16:26:38 UTC
Upstream patch:
https://github.com/ClusterLabs/pcs/commit/51580b1b38745b29e97229a2e938694ac4166b8

TEST:
2-node cluster, nodes: rhel7-node1, rhel7-node2
Open the nodes page of the tested cluster from the web UI of a node which is not part of the cluster.

Block port 2224 (pcsd) on both cluster nodes:
[root@rhel7-node1 ~]# iptables -I OUTPUT -p tcp --dport 2224 -j DROP
[root@rhel7-node1 ~]# iptables -I INPUT -p tcp --dport 2224 -j DROP
[root@rhel7-node2 ~]# iptables -I OUTPUT -p tcp --dport 2224 -j DROP
[root@rhel7-node2 ~]# iptables -I INPUT -p tcp --dport 2224 -j DROP

After the next update of the web UI, the cluster nodes are marked as offline and there is no JS error in the JS console.

Comment 7 Ivan Devat 2017-02-20 08:22:45 UTC
After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.156-1.el7.x86_64

see comment 4

Comment 12 errata-xmlrpc 2017-08-01 18:24:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1958