Bug 1484217

Summary: cns-deploy fails, failing to load the topology file
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: krishnaram Karthick <kramdoss>
Component: CNS-deployment
Assignee: Raghavendra Talur <rtalur>
Status: CLOSED ERRATA
QA Contact: krishnaram Karthick <kramdoss>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: cns-3.6
CC: akhakhar, annair, hchiramm, jarrpa, madam, mliyazud, mzywusko, pprakash, rhs-bugs, rreddy, rtalur, sselvan
Target Milestone: ---
Keywords: TestBlocker
Target Release: CNS 3.6   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-10-11 07:12:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1445448    
Attachments:
  cns logs (flags: none)
  logs from build - cns-deploy-5.0.0-23.el7rhgs.x86_64 (flags: none)

Description krishnaram Karthick 2017-08-23 03:42:34 UTC
Description of problem:
cns-deploy fails while loading the topology file. The issue is seen on multiple setups, and manually adding a node fails as well.

Snippet of the log (complete set of logs attached):
===================
Determining heketi service URL ... OK
oc -n storage-project exec -it deploy-heketi-1-vtvkg -- heketi-cli -s http://localhost:8080 --user admin --secret '' topology load --json=/etc/heketi/topology.json 2>&1
Creating cluster ... ID: 327c9c54c115a7feb11476fd684f980d
Allowing file volumes on cluster.
Allowing block volumes on cluster.
Creating node dhcp47-174.lab.eng.blr.redhat.com ... Unable to create node: New Node doesn't have glusterd running
Creating node dhcp47-183.lab.eng.blr.redhat.com ... Unable to create node: New Node doesn't have glusterd running
Creating node dhcp46-133.lab.eng.blr.redhat.com ... Unable to create node: New Node doesn't have glusterd running
Error loading the cluster topology.
Please check the failed node or device and rerun this script.
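
When triaging a long cns-deploy log, the failed nodes can be pulled out mechanically. The following is a minimal sketch, not part of cns-deploy: the function name failed_nodes is hypothetical, and the pattern is taken only from the log lines quoted above, so other heketi-cli versions may word the message differently.

```python
import re

# Pattern derived from the log lines quoted in this report (assumption:
# the wording is stable for the heketi-cli version in use).
NODE_FAILURE = re.compile(
    r"^Creating node (\S+) \.\.\. Unable to create node: (.+)$"
)


def failed_nodes(log_text):
    """Return (hostname, reason) pairs for nodes that could not be created."""
    failures = []
    for line in log_text.splitlines():
        match = NODE_FAILURE.match(line.strip())
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures


if __name__ == "__main__":
    # Two lines copied from the snippet above.
    sample = (
        "Creating cluster ... ID: 327c9c54c115a7feb11476fd684f980d\n"
        "Creating node dhcp47-174.lab.eng.blr.redhat.com ... "
        "Unable to create node: New Node doesn't have glusterd running\n"
    )
    for host, reason in failed_nodes(sample):
        print(f"{host}: {reason}")
```

Here every node fails with the same reason, which suggests a problem common to all nodes rather than one bad host.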

Version-Release number of selected component (if applicable):
cns-deploy-5.0.0-20.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Run cns-deploy to set up CNS.

Actual results:
Topology load fails.

Expected results:
The topology should load and cns-deploy should succeed.

Additional info:

Comment 4 krishnaram Karthick 2017-08-23 03:47:24 UTC
Created attachment 1316932 [details]
cns logs

Comment 5 Raghavendra Talur 2017-08-23 04:04:07 UTC
Reason for the error: heketi timed out while requesting the list of pods from the API server:

[kubeexec] ERROR 2017/08/22 22:22:02 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:303: Get https://172.30.0.1:443/api/v1/namespaces/storage-project/pods?labelSelector=glusterfs-node: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
[kubeexec] ERROR 2017/08/22 22:22:02 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:304: Failed to get list of pods
[negroni] Completed 400 Bad Request in 1.730791ms


This could be because of:
1. the timeout change we made in heketi,
2. a network or firewall issue on the setup, or
3. wrong labels on some nodes.


Need to look further for more info.
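
To rule out the label theory, note that the failing request in the log selects pods by the label key glusterfs-node. The sketch below lists which pods in a JSON pod listing carry that key. It assumes input of the shape produced by `oc -n storage-project get pods -o json`; the function name and the sample pod names and label values are hypothetical, not taken from the affected setup.

```python
import json


def pods_with_label_key(pod_list, key="glusterfs-node"):
    """Return names of pods whose labels contain `key`, whatever its value."""
    names = []
    for pod in pod_list.get("items", []):
        metadata = pod.get("metadata", {})
        labels = metadata.get("labels") or {}  # labels may be absent or null
        if key in labels:
            names.append(metadata.get("name"))
    return names


if __name__ == "__main__":
    # Illustrative data only; real input would come from `oc get pods -o json`.
    sample = json.loads("""
    {"items": [
      {"metadata": {"name": "example-gluster-pod",
                    "labels": {"glusterfs-node": "example-value"}}},
      {"metadata": {"name": "example-other-pod",
                    "labels": {"app": "example"}}}
    ]}
    """)
    print(pods_with_label_key(sample))
```

If this returns the expected gluster pods, the labels are fine and the timeout or the network are the more likely causes.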

Comment 6 Raghavendra Talur 2017-08-23 05:52:51 UTC
This is caused by the patch https://github.com/heketi/heketi/pull/778

I have sent a revert for it: https://github.com/heketi/heketi/pull/840

Comment 12 krishnaram Karthick 2017-08-24 16:43:34 UTC
Created attachment 1317840 [details]
logs from build - cns-deploy-5.0.0-23.el7rhgs.x86_64

Comment 16 errata-xmlrpc 2017-10-11 07:12:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2881