Bug 1565759

Summary: environment variable 'HEKETI_MONITOR_GLUSTER_NODES' is not set by heketi
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: krishnaram Karthick <kramdoss>
Component: heketi Assignee: Michael Adam <madam>
Status: CLOSED ERRATA QA Contact: krishnaram Karthick <kramdoss>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: cns-3.9 CC: hchiramm, pprakash, rcyriac, rhs-bugs, rtalur, sselvan, storage-qa-internal
Target Milestone: --- Keywords: ZStream
Target Release: CNS 3.9 Async   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rhgs-volmanager-container-3.3.1-8.3 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-19 03:34:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
heketi_logs none

Description krishnaram Karthick 2018-04-10 17:15:55 UTC
Description of problem:
On a 5-node CNS setup, volume creation fails when glusterd is stopped on 2 nodes.

sh-4.2# heketi-cli volume create --size=2
Error: Unable to execute command on glusterfs-storage-djxwn: volume create: vol_83f171a561341a55c5ac087510ae0aa2: failed: Host 10.70.46.45 not connected


Version-Release number of selected component (if applicable):
rpm -qa | grep 'heketi'
heketi-6.0.0-7.2.el7rhgs.x86_64
python-heketi-6.0.0-7.2.el7rhgs.x86_64
heketi-client-6.0.0-7.2.el7rhgs.x86_64


How reproducible:
1/1 - This issue should be consistently reproducible

Steps to Reproduce:
1. Create a 5-node CNS cluster.
2. Stop the glusterd service on any 2 nodes: run oc rsh <gluster pod>, then systemctl stop glusterd.
3. Create a volume using heketi.
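
The steps above can be sketched as a small shell script. This is only an illustration, not part of the original report: the two pod names are supplied by the caller, the run and reproduce helpers are hypothetical names introduced here, and the script assumes oc and heketi-cli are on the PATH and already pointed at the cluster. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction steps; helper names are hypothetical.
# DRY_RUN=1 (the default) prints each command instead of running it.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: reproduce <gluster pod> <gluster pod>
reproduce() {
    if [ "$#" -ne 2 ]; then
        echo "usage: reproduce <gluster pod> <gluster pod>" >&2
        return 1
    fi
    # Step 2: stop glusterd inside each of the two chosen gluster pods.
    for pod in "$@"; do
        run oc rsh "$pod" systemctl stop glusterd
    done
    # Step 3: request a volume with the same size argument as in the report.
    run heketi-cli volume create --size=2
}
```

Calling reproduce with two gluster pod names prints the three commands; setting DRY_RUN=0 first would actually run them against the cluster.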

Actual results:
Volume creation fails.

Expected results:
Volume creation should succeed.

Additional info:
The heketi logs will be attached.

Comment 2 krishnaram Karthick 2018-04-10 17:20:06 UTC
Created attachment 1420000
heketi_logs

Comment 5 Raghavendra Talur 2018-04-12 05:59:51 UTC
The fix is to keep node monitoring always on. The latest container should therefore work without adding or changing any environment variable.

Comment 6 krishnaram Karthick 2018-04-16 10:54:49 UTC
Heketi monitoring is enabled by default with rhgs-volmanager-container-3.3.1-8.4, as the startup log shows:

[heketi] INFO 2018/04/16 09:27:29 Loaded kubernetes executor
[heketi] INFO 2018/04/16 09:27:29 Block: Auto Create Block Hosting Volume set to true
[heketi] INFO 2018/04/16 09:27:29 Block: New Block Hosting Volume size 100 GB
[heketi] INFO 2018/04/16 09:27:29 GlusterFS Application Loaded
[heketi] INFO 2018/04/16 09:27:29 Started Node Health Cache Monitor
Listening on port 8080

Moving the bug to verified.
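
The check above amounts to finding the monitor start-up line in the heketi log. A minimal sketch of that check, assuming the log is fed on standard input (for example from oc logs <heketi pod>); the function name monitor_started is hypothetical and not part of heketi:

```shell
# Succeeds (exit status 0) if the log on stdin contains the
# "Started Node Health Cache Monitor" line quoted in this comment.
monitor_started() {
    grep -q 'Started Node Health Cache Monitor'
}
```

For example, oc logs <heketi pod> | monitor_started && echo "node monitoring is on" would print the message only when the line is present.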

Comment 9 errata-xmlrpc 2018-04-19 03:34:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1178