Bug 1400429 - heketi server doesn't respond after rebooting openshift worker nodes
Summary: heketi server doesn't respond after rebooting openshift worker nodes
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: CNS-deployment
Version: cns-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Mohamed Ashiq
QA Contact: Anoop
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-12-01 08:41 UTC by krishnaram Karthick
Modified: 2017-01-03 11:05 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-03 11:04:47 UTC
Target Upstream Version:


Attachments
curl_hellowworld (4.78 KB, text/html)
2016-12-01 08:41 UTC, krishnaram Karthick

Description krishnaram Karthick 2016-12-01 08:41:21 UTC
Created attachment 1226671
curl_hellowworld

Description of problem:
While setting up an OpenShift environment for CNS by following the installation guide (https://access.redhat.com/documentation/en/red-hat-gluster-storage/3.1/paged/container-native-storage-for-openshift-container-platform/), deploying the heketi container appears to succeed. However, after the nodes are rebooted, the deploy-heketi server no longer responds, even though the deploy-heketi pod status still shows Running. See 'Steps to Reproduce' below for the exact steps.

The output of 'curl http://deploy-heketi-storage-project.cloudapps.mystorage.com/hello' has been attached.

Rebooting nodes should not cause heketi to go down.

[root@dhcp42-66 ~]# oc get pods
NAME                                                     READY     STATUS    RESTARTS   AGE
deploy-heketi-1-6hfv1                                    1/1       Running   0          2m
glusterfs-dc-dhcp42-190.lab.eng.blr.redhat.com-1-qwsb5   1/1       Running   0          1h
glusterfs-dc-dhcp42-34.lab.eng.blr.redhat.com-1-bdhhp    1/1       Running   1          1h
glusterfs-dc-dhcp43-168.lab.eng.blr.redhat.com-1-ieudu   1/1       Running   0          1h
storage-project-router-1-pihqk                           1/1       Running   2          5h
[root@dhcp42-66 ~]# oc get nodes
NAME                                STATUS                     AGE
dhcp42-190.lab.eng.blr.redhat.com   Ready                      5h
dhcp42-34.lab.eng.blr.redhat.com    Ready                      5h
dhcp42-66.lab.eng.blr.redhat.com    Ready,SchedulingDisabled   5h
dhcp43-168.lab.eng.blr.redhat.com   Ready                      5h


Version-Release number of selected component (if applicable):
# openshift version
openshift v3.4.0.32+d349492
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

[root@dhcp42-66 ~]# rpm -qa | grep 'heketi'
heketi-templates-3.1.0-3.el7rhgs.x86_64
heketi-client-3.1.0-3.el7rhgs.x86_64


How reproducible:
Always

Steps to Reproduce:
1. Create an OpenShift cluster.
2. Follow the steps to configure CNS - https://access.redhat.com/documentation/en/red-hat-gluster-storage/3.1/paged/container-native-storage-for-openshift-container-platform/
3. Before step 4.4, reboot the master and worker nodes serially, i.e., reboot the master node, wait for it to come up, then reboot the first worker node, wait for it to come up, and so on.
4. Run 'curl http://deploy-heketi-storage-project.cloudapps.mystorage.com/hello'.

Actual results:
See the attached output (curl_hellowworld).

Expected results:
heketi should respond

Additional info:
sosreports and other logs shall be attached shortly.
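
One possible way to separate a name-resolution problem from a heketi failure is to repeat the request from step 4 while pinning the route host name to a specific node IP with curl's --resolve option. The sketch below assumes the router listens on port 80 of that node; the function name hello_status and the NODE_IP placeholder are illustrative and do not come from this report.

```shell
#!/bin/sh
# Sketch only: NODE_IP is a placeholder, not a value from this report.

# Print the HTTP status code returned by GET /hello on the given route host,
# forcing the host name to resolve to a specific node IP (port 80 assumed).
# Usage: hello_status <route-host> <node-ip>
hello_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
        --resolve "$1:80:$2" "http://$1/hello"
}

# Example (not run here):
# hello_status deploy-heketi-storage-project.cloudapps.mystorage.com "$NODE_IP"
```

If the request succeeds when pinned to the node that currently runs the router pod but fails without --resolve, the DNS entry for the wildcard domain is a more likely culprit than heketi itself.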

Comment 6 Humble Chirammal 2017-01-03 10:09:27 UTC
Ashiq, I remember you were trying this in the setup. Can we close this bug?

Comment 7 Mohamed Ashiq 2017-01-03 11:05:39 UTC
Closing per comment #5.

Check whether the router pod has moved to a different node. If it has, get the IP of that node and update the dnsmasq configuration accordingly.

If you still face the issue after doing this, please reopen the bug.
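
The check described above could be sketched as follows. This is an illustration under assumptions, not the exact procedure used on the reporter's setup: the router pod name is taken from the 'oc get pods' output earlier in this bug, while the dnsmasq file path, the use of an address=/domain/ip wildcard entry, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: the dnsmasq file path and the wildcard entry format used on the
# affected setup are assumptions; adjust them to the actual configuration.

# Print the IP of the node that currently hosts the given pod.
# Usage: router_host_ip <pod-name>
router_host_ip() {
    oc get pod "$1" -o jsonpath='{.status.hostIP}'
}

# Replace (or append) the wildcard entry for a domain in a dnsmasq config file.
# Usage: set_wildcard_ip <conf-file> <domain> <ip>
set_wildcard_ip() {
    conf=$1
    domain=$2
    ip=$3
    if grep -q "^address=/${domain}/" "$conf"; then
        # An entry exists: point it at the new node IP (GNU sed -i assumed).
        sed -i "s|^address=/${domain}/.*|address=/${domain}/${ip}|" "$conf"
    else
        # No entry yet: append one.
        printf 'address=/%s/%s\n' "$domain" "$ip" >> "$conf"
    fi
}

# Example (not run here; file path is an assumption):
# ip=$(router_host_ip storage-project-router-1-pihqk)
# set_wildcard_ip /etc/dnsmasq.conf cloudapps.mystorage.com "$ip"
# systemctl restart dnsmasq
```

Restarting dnsmasq after the change is needed for the new entry to take effect; the curl request to /hello can then be retried.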

