Bug 1659021 - heketidb brick offline after gluster pod reboot
Summary: heketidb brick offline after gluster pod reboot
Keywords:
Status: CLOSED DUPLICATE of bug 1658984
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhgs-server-container
Version: ocs-3.11
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Saravanakumar
QA Contact: vinutha
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-12-13 11:27 UTC by vinutha
Modified: 2019-01-03 13:43 UTC (History)
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-03 13:43:22 UTC
Embargoed:


Attachments

Description vinutha 2018-12-13 11:27:40 UTC
Description of problem:
******** This bug was hit while verifying bug https://bugzilla.redhat.com/show_bug.cgi?id=1632896 ***************

Editing the DaemonSet (ds) to set the brick-multiplex variable to "No" and then restarting a gluster pod leaves the heketidbstorage brick on that node offline.
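Whether brick multiplexing is actually disabled cluster-wide can be cross-checked from inside any gluster pod (a quick sketch, assuming this gluster build supports querying global options with 'volume get all'):

# oc rsh glusterfs-storage-8gpz8 gluster volume get all cluster.brick-multiplex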

Version-Release number of selected component (if applicable):
# rpm -qa| grep gluster 
glusterfs-server-3.12.2-30.el7rhgs.x86_64
gluster-block-0.2.1-29.el7rhgs.x86_64
glusterfs-api-3.12.2-30.el7rhgs.x86_64
glusterfs-cli-3.12.2-30.el7rhgs.x86_64
python2-gluster-3.12.2-30.el7rhgs.x86_64
glusterfs-fuse-3.12.2-30.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-30.el7rhgs.x86_64
glusterfs-libs-3.12.2-30.el7rhgs.x86_64
glusterfs-3.12.2-30.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-30.el7rhgs.x86_64

# oc rsh heketi-storage-1-qkgnn rpm -qa | grep heketi
heketi-8.0.0-2.el7rhgs.x86_64
heketi-client-8.0.0-2.el7rhgs.x86_64

# oc version 
oc v3.11.43
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://dhcp46-113.lab.eng.blr.redhat.com:8443
openshift v3.11.43
kubernetes v1.11.0+d4cacc0

How reproducible:
2/2 on this setup

Steps to Reproduce:
1. 4-node OCP + OCS setup. Created a file PVC and a block PVC (a minimal example manifest is sketched after the PVC listing below).

# oc get pvc 
NAME      STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
bl1       Bound     pvc-082548a3-fec8-11e8-86ad-005056a53ec9   2Gi        RWO            gluster-block   9m
fl1       Bound     pvc-204393a8-fec8-11e8-86ad-005056a53ec9   1Gi        RWO            gluster-file    8m
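For reference, the block PVC above can be created with a manifest along these lines (a minimal sketch; the claim name, size, and access mode are copied from the listing above, everything else is illustrative):

# cat <<EOF | oc create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bl1
spec:
  storageClassName: gluster-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
EOF

The file PVC (fl1) is created the same way, using the gluster-file storage class.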


# heketi-cli volume list 
Id:6caca1b9814eee7bab6f812083c6f8dd    Cluster:5ca7a0269e0407efbcdbdaee8b343996    Name:vol_6caca1b9814eee7bab6f812083c6f8dd [block]
Id:9d5488a261a68eeea0d5574fa74efeab    Cluster:5ca7a0269e0407efbcdbdaee8b343996    Name:vol_glusterfs_fl1_2049e447-fec8-11e8-9fe3-005056a53ec9
Id:a1146f15b390206cd49181526881ab3f    Cluster:5ca7a0269e0407efbcdbdaee8b343996    Name:heketidbstorage

All bricks of heketidbstorage are online before gluster pod reboot 


# oc rsh glusterfs-storage-8gpz8 gluster v status
Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.166:/var/lib/heketi/mounts/v
g_ae300b4cee27f06380cf70817f175733/brick_36
e0027ce5f499adf741944f45361fa7/brick        49152     0          Y       446  
Brick 10.70.47.145:/var/lib/heketi/mounts/v
g_962a1bb9c334107f9cdc89dcd97dac05/brick_91
9104cfe6b87ac48bfd616687c5c75e/brick        49152     0          Y       412  
Brick 10.70.47.27:/var/lib/heketi/mounts/vg
_f398741caf7abeb26708dda31c04175d/brick_064
b73523e803d81cb348a11916af98e/brick         49152     0          Y       395  
Self-heal Daemon on localhost               N/A       N/A        Y       1486 
Self-heal Daemon on 10.70.47.166            N/A       N/A        Y       1662 
Self-heal Daemon on dhcp46-237.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       1596 
Self-heal Daemon on 10.70.47.27             N/A       N/A        Y       1546 
 
Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks

gluster pods before reboot 
# oc get pods
NAME                                          READY     STATUS    RESTARTS   AGE
glusterblock-storage-provisioner-dc-1-8zvnx   1/1       Running   0          2d
glusterfs-storage-8gpz8                       1/1       Running   1          7d
glusterfs-storage-bmllb                       1/1       Running   1          7d
glusterfs-storage-d6shs                       1/1       Running   1          7d
glusterfs-storage-ht77p                       1/1       Running   1          7d
heketi-storage-1-qkgnn                        1/1       Running   5          2d



2. Edited the DaemonSet (ds) with the following change; an equivalent one-liner is sketched after the snippet.
 
# oc edit ds glusterfs-storage
daemonset.extensions/glusterfs-storage edited

- name: GLUSTER_BRICKMULTIPLEX
  value: "No"
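The same change can also be applied non-interactively (a sketch of an equivalent command; it assumes the DaemonSet is in the current project):

# oc set env ds/glusterfs-storage GLUSTER_BRICKMULTIPLEX=No

Depending on the DaemonSet's updateStrategy, running pods may not pick up the new environment until they are recreated, which is why the pod is deleted in the next step.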

3. Restarted the gluster pod by deleting it. Observed that the heketidbstorage brick on the rebooted node does not come online after the gluster pod restarts.

# oc delete pod glusterfs-storage-8gpz8 
pod "glusterfs-storage-8gpz8" deleted
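Before re-checking brick status it helps to wait for the replacement pod to become Ready (a sketch; the label selector is an assumption about how this DaemonSet labels its pods):

# oc get pods -l glusterfs=storage-pod -w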


# oc rsh glusterfs-storage-blmhf gluster v status 
Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.166:/var/lib/heketi/mounts/v
g_ae300b4cee27f06380cf70817f175733/brick_36
e0027ce5f499adf741944f45361fa7/brick        49152     0          Y       417  
Brick 10.70.47.145:/var/lib/heketi/mounts/v
g_962a1bb9c334107f9cdc89dcd97dac05/brick_91
9104cfe6b87ac48bfd616687c5c75e/brick        N/A       N/A        N       N/A  
Brick 10.70.47.27:/var/lib/heketi/mounts/vg
_f398741caf7abeb26708dda31c04175d/brick_064
b73523e803d81cb348a11916af98e/brick         49152     0          Y       404  
Self-heal Daemon on localhost               N/A       N/A        Y       332  
Self-heal Daemon on 10.70.47.166            N/A       N/A        Y       3624 
Self-heal Daemon on dhcp47-145.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       673  
Self-heal Daemon on 10.70.47.27             N/A       N/A        Y       3836 


Gluster volume status from the recreated gluster pod (glusterfs-storage-p7pj2)
# oc rsh glusterfs-storage-p7pj2 gluster v status
Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.166:/var/lib/heketi/mounts/v
g_ae300b4cee27f06380cf70817f175733/brick_36
e0027ce5f499adf741944f45361fa7/brick        49152     0          Y       446  
Brick 10.70.47.145:/var/lib/heketi/mounts/v
g_962a1bb9c334107f9cdc89dcd97dac05/brick_91
9104cfe6b87ac48bfd616687c5c75e/brick        N/A       N/A        N       N/A  
Brick 10.70.47.27:/var/lib/heketi/mounts/vg
_f398741caf7abeb26708dda31c04175d/brick_064
b73523e803d81cb348a11916af98e/brick         49152     0          Y       395  
Self-heal Daemon on localhost               N/A       N/A        Y       636  
Self-heal Daemon on 10.70.46.237            N/A       N/A        Y       1596 
Self-heal Daemon on 10.70.47.166            N/A       N/A        Y       1662 
Self-heal Daemon on 10.70.47.27             N/A       N/A        Y       1546 
 
Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks
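The self-heal state of the volume can additionally be checked from any gluster pod with the heal-info command (a sketch; this output was not captured here):

# oc rsh glusterfs-storage-p7pj2 gluster volume heal heketidbstorage info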


Pods after the gluster pod was rebooted
# oc get po 
NAME                                          READY     STATUS    RESTARTS   AGE
glusterblock-storage-provisioner-dc-1-8zvnx   1/1       Running   0          2d
glusterfs-storage-bmllb                       1/1       Running   1          7d
glusterfs-storage-d6shs                       1/1       Running   1          7d
glusterfs-storage-ht77p                       1/1       Running   1          7d
glusterfs-storage-p7pj2                       1/1       Running   0          10m
heketi-storage-1-qkgnn                        1/1       Running   5          2d



Actual results:
The heketidbstorage brick on the rebooted node is not online after the gluster pod reboot

Expected results:
Heketidbstorage bricks should be online after gluster pod reboot

Additional info:
Logs will be attached.
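As a possible workaround until the restart path is fixed, an offline brick can usually be brought back by force-starting the volume from inside any gluster pod (a sketch; not verified on this setup):

# oc rsh glusterfs-storage-p7pj2 gluster volume start heketidbstorage force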

