Bug 1622447 - [Tracker-RHGS-BZ#1622452] Bricks for heketidb and some other volumes not ONLINE in gluster volume status
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: rhgs-server-container
Version: cns-3.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: CNS 3.10
Assignee: Saravanakumar
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On: 1622452
Blocks: 1568862
Reported: 2018-08-27 08:37 UTC by Neha Berry
Modified: 2019-02-12 05:54 UTC (History)

Fixed In Version: rhgs-server-rhel7:3.4.0-4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1622452 (view as bug list)
Environment:
Last Closed: 2018-09-12 10:54:03 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2688 0 None None None 2018-09-12 10:54:21 UTC

Description Neha Berry 2018-08-27 08:37:08 UTC
Bricks for heketidb and some other volumes not ONLINE in gluster volume status
============================================================


Description of problem:
++++++++++++++++++++++++


A fresh OCP 3.10 + CNS 3.10 setup (Gluster version 3.12.2-17) was created, with 2 file volumes and 2 block devices present. Initially, 2 bricks of the heketidbstorage volume were seen to be DOWN.

Started pod restart scenarios, and ultimately all 3 bricks of the heketidbstorage volume were DOWN. One brick of another volume was also not ONLINE.

Note: the glusterfsd processes are in a running state for all the affected bricks, but gluster volume status lists those bricks as NOT ONLINE.


[root@dhcp46-44 brick-issue]# oc rsh glusterfs-storage-hhtf9 
sh-4.2# gluster v status
Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_74eb681e28d6bdbfb38f19d87f30f99d/brick_3f1
9a8b21827f2130f3c8eefa0710cbd/brick         N/A       N/A        N       N/A  
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_c90663cbc57af82cb58eff9b3045c46d/brick_e9
746476ad14c0a8bb7a9c667752fcdf/brick        N/A       N/A        N       N/A  
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_6d1
4ec6ce7f3136974b85f889f4f55db/brick         N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839
 
Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: neha23
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_c90663cbc57af82cb58eff9b3045c46d/brick_9d
535743ef7658856cfcfb4c5c8181d4/brick        49154     0          Y       587  
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_cb1
368e2b416e7e869b425b5c97204aa/brick         49153     0          Y       536  
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_f5ae27a982344f4a9373883753eedc74/brick_27a
d4f75671cf62313bf7b98b7873b8c/brick         49153     0          Y       528  
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819
 
Task Status of Volume neha23
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vol_a5aa338f61db999e93d61ad3fa54b424
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_f5ae27a982344f4a9373883753eedc74/brick_bde
f74ff3c15bf676df28fe1f414c6f8/brick         49153     0          Y       528  
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_ea89a1ed513c4107fbed1a00179b0e95/brick_a5
e2373b6064bc2d27213d9933d0faca/brick        49154     0          Y       587  
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_e5d
bff677f9222a68a34a24894677a4f/brick         49153     0          Y       536  
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819
 
Task Status of Volume vol_a5aa338f61db999e93d61ad3fa54b424
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vol_b90072440c75b3cce6697c2c894c19e3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_cee
5459f87a9fc4133d29eff16863c60/brick         49153     0          Y       536  
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_74eb681e28d6bdbfb38f19d87f30f99d/brick_3aa
94e8adbe488f10ac4c019de764a5b/brick         49153     0          Y       528  
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_c90663cbc57af82cb58eff9b3045c46d/brick_55
08dbfabcf6de10ccb0ec2085f7da4d/brick        N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819
 
Task Status of Volume vol_b90072440c75b3cce6697c2c894c19e3
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vol_d51a66f32feb990d2adf6b2a96586e8d
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_b29
e7cdf0b376758430a0d50fbdb5ebd/brick         49154     0          Y       544  
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_f5ae27a982344f4a9373883753eedc74/brick_e1d
f58f067c54ca94807a0d126f5d990/brick         49154     0          Y       537  
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_ea89a1ed513c4107fbed1a00179b0e95/brick_86
36988a891dcf539eae0acfe443d321/brick        49153     0          Y       578  
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839
 
Task Status of Volume vol_d51a66f32feb990d2adf6b2a96586e8d
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: vol_dce287c3032d56bdf8cf8cc5686c649c
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_411
0405d800d9d42b6dabc2a7339622f/brick         49153     0          Y       536  
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_74eb681e28d6bdbfb38f19d87f30f99d/brick_b50
0eec3dd5dcf871ee5cbd5f0ad5b5f/brick         49153     0          Y       528  
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_c90663cbc57af82cb58eff9b3045c46d/brick_18
49bfd47b8a1b4f0af78545ad2cfdd3/brick        49154     0          Y       587  
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819
 
Task Status of Volume vol_dce287c3032d56bdf8cf8cc5686c649c
------------------------------------------------------------------------------
There are no active volume tasks
 
sh-4.2# 
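
The output above is hard to scan by eye because long brick paths wrap across lines. The following is a minimal Python sketch, not part of the original report, that parses text in this layout and lists the bricks whose Online column is N. The function names `parse_volume_status` and `offline_bricks` are hypothetical, and the parser assumes the exact column layout shown in this report (TCP Port, RDMA Port, Online, Pid at the end of the last line of each entry).

```python
import re

# Last line of an entry: name tail, then TCP port, RDMA port, Online flag, PID.
ROW_END = re.compile(
    r"^(?P<name>.*\S)\s+(?P<tcp>\d+|N/A)\s+(?P<rdma>\d+|N/A)"
    r"\s+(?P<online>[YN])\s+(?P<pid>\d+|N/A)\s*$"
)

def parse_volume_status(text):
    """Return one dict per process row, joining wrapped name lines."""
    rows, volume, in_table, pending = [], None, False, ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Status of volume:"):
            volume, in_table, pending = line.split(":", 1)[1].strip(), False, ""
        elif line.startswith("Gluster process"):
            in_table, pending = True, ""
        elif not line or line.startswith("Task Status"):
            in_table, pending = False, ""
        elif in_table and not set(line) <= {"-"}:
            match = ROW_END.match(line)
            if match:
                rows.append({
                    "volume": volume,
                    "name": pending + match["name"],
                    "online": match["online"],
                    "pid": match["pid"],
                })
                pending = ""
            else:
                # Wrapped part of a long name; keep it until the row ends.
                pending += line
    return rows

def offline_bricks(rows):
    """Keep only brick rows (not self-heal daemons) that are not online."""
    return [r for r in rows if r["name"].startswith("Brick ") and r["online"] == "N"]

# Excerpt copied from the status output in this report.
SAMPLE = """Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_74eb681e28d6bdbfb38f19d87f30f99d/brick_3f1
9a8b21827f2130f3c8eefa0710cbd/brick         N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819

Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks
"""

for row in offline_bricks(parse_volume_status(SAMPLE)):
    print(row["volume"], row["name"])
```

Feeding the full `gluster v status` text to `parse_volume_status` should work the same way, as long as the layout matches the excerpt.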





Steps performed:
====================
1. Created 2 file volumes and 2 block volumes.

# heketi-cli volume list
Id:b90072440c75b3cce6697c2c894c19e3    Cluster:225f8b3c6dcc9dcad4fec5800829d246    Name:vol_b90072440c75b3cce6697c2c894c19e3
Id:ca5c0b3b874a65e3c94ed921ea203cf0    Cluster:225f8b3c6dcc9dcad4fec5800829d246    Name:heketidbstorage
Id:d51a66f32feb990d2adf6b2a96586e8d    Cluster:225f8b3c6dcc9dcad4fec5800829d246    Name:vol_d51a66f32feb990d2adf6b2a96586e8d [block]
Id:dce287c3032d56bdf8cf8cc5686c649c    Cluster:225f8b3c6dcc9dcad4fec5800829d246    Name:vol_dce287c3032d56bdf8cf8cc5686c649c

2. Respun all the glusterfs pods one by one.
3. Checked gluster v status and ps -ef|grep glusterfsd.
4. Also observed that the 2 file volumes do not seem to use the same glusterfsd PID as that of heketidbstorage.
5. Created some more file volumes.
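
The PID observation in step 4 can be checked by grouping bricks by the glusterfsd PID they report on a given host. The following is a minimal Python sketch under stated assumptions: `group_by_process` is a hypothetical helper, and `RECORDS` is hand-copied from the `gluster v status` output in this report for host 10.70.46.53, with `None` standing in for a PID shown as N/A.

```python
from collections import defaultdict

def group_by_process(records):
    """Map (host, pid) to the list of volumes whose brick reports that PID."""
    groups = defaultdict(list)
    for volume, host, pid in records:
        groups[(host, pid)].append(volume)
    return dict(groups)

# (volume, host, pid) for the bricks on 10.70.46.53, as listed in this report.
RECORDS = [
    ("heketidbstorage", "10.70.46.53", None),
    ("neha23", "10.70.46.53", 536),
    ("vol_a5aa338f61db999e93d61ad3fa54b424", "10.70.46.53", 536),
    ("vol_b90072440c75b3cce6697c2c894c19e3", "10.70.46.53", 536),
    ("vol_d51a66f32feb990d2adf6b2a96586e8d", "10.70.46.53", 544),
    ("vol_dce287c3032d56bdf8cf8cc5686c649c", "10.70.46.53", 536),
]

for (host, pid), volumes in group_by_process(RECORDS).items():
    print(host, pid, volumes)
```

A group keyed by `None` collects the bricks for which gluster volume status reports no PID, which is where the heketidbstorage brick lands on this host.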





Version-Release number of selected component (if applicable):
++++++++++++++++++++++++
sh-4.2# rpm -qa|grep gluster
glusterfs-client-xlators-3.12.2-17.el7rhgs.x86_64
glusterfs-cli-3.12.2-17.el7rhgs.x86_64
python2-gluster-3.12.2-17.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-17.el7rhgs.x86_64
glusterfs-debuginfo-3.12.2-17.el7rhgs.x86_64
glusterfs-libs-3.12.2-17.el7rhgs.x86_64
glusterfs-3.12.2-17.el7rhgs.x86_64
glusterfs-api-3.12.2-17.el7rhgs.x86_64
glusterfs-fuse-3.12.2-17.el7rhgs.x86_64
glusterfs-server-3.12.2-17.el7rhgs.x86_64
gluster-block-0.2.1-25.el7rhgs.x86_64
sh-4.2# 


[root@dhcp46-44 brick-issue]# oc describe pod glusterfs-storage-hhtf9 |grep -i image
    Image:          brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7:3.4.0-3
    Image ID:       docker-pullable://brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7@sha256:3e1a2ed1c7f235989dd02af6386e51d64e7b4b48129960a522e362741b0b40cf
[root@dhcp46-44 brick-issue]# 
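
The image tag above (3.4.0-3) can be compared against the "Fixed In Version" field of this bug (rhgs-server-rhel7:3.4.0-4). The following is a minimal Python sketch, not part of the original report. The helper names are hypothetical, and it assumes that tags are purely numeric fields separated by dots and dashes; a digest reference (`@sha256:...`) or a non-numeric tag raises `ValueError`.

```python
import re

def image_tag(image_ref):
    """Return the tag of an image reference, or None if it has no tag."""
    # Only look at the last path component so a registry port is not
    # mistaken for a tag.
    last = image_ref.rsplit("/", 1)[-1]
    _name, sep, tag = last.partition(":")
    return tag if sep else None

def version_key(tag):
    """Turn '3.4.0-3' into (3, 4, 0, 3) for numeric comparison."""
    return tuple(int(part) for part in re.split(r"[.-]", tag))

def has_fix(image_ref, fixed_in="3.4.0-4"):
    """True if the image tag is at or above the fixed version."""
    tag = image_tag(image_ref)
    if tag is None:
        raise ValueError("image reference has no tag: " + image_ref)
    return version_key(tag) >= version_key(fixed_in)

# Image reference copied from the oc describe output in this report.
IMAGE = ("brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/"
         "rhgs3/rhgs-server-rhel7:3.4.0-3")
print(image_tag(IMAGE), has_fix(IMAGE))
```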



How reproducible:
++++++++++++++++++++++++
Seen intermittently.



Actual results:
++++++++++++++++++++++++
All bricks of heketidbstorage and 1 brick of vol_b90072440c75b3cce6697c2c894c19e3 (file volume) are NOT ONLINE.

Expected results:
++++++++++++++++++++++++
All bricks should be UP in gluster volume status, even when the pods are restarted.

Comment 4 Michael Adam 2018-08-27 12:14:00 UTC
Accepting as a blocker for CNS 3.10.
Tracking RHGS BZ#1622452

Comment 12 errata-xmlrpc 2018-09-12 10:54:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2688

