Bug 1619392 - [Tracker-RHGS-BZ#1620469] Brick process NOT ONLINE for heketidb and block-hosting volume
Summary: [Tracker-RHGS-BZ#1620469] Brick process NOT ONLINE for heketidb and block-hosting volume
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: rhgs-server-container
Version: cns-3.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: CNS 3.10
Assignee: Saravanakumar
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On: 1620469
Blocks: 1568862
 
Reported: 2018-08-20 17:29 UTC by Neha Berry
Modified: 2019-02-12 05:54 UTC (History)
7 users

Fixed In Version: rhgs3/rhgs-server-rhel7:3.4.0-3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1620469 (view as bug list)
Environment:
Last Closed: 2018-09-12 10:54:03 UTC
Target Upstream Version:


Attachments


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2688 None None None 2018-09-12 10:54:21 UTC

Description Neha Berry 2018-08-20 17:29:02 UTC
Description of problem:
++++++++++++++++++++++++
We had an OCP 3.10 + OCS 3.10 setup with gluster-bits = 3.12.2-15 and gluster-block version = gluster-block-0.2.1-24.el7rhgs.x86_64. The setup had logging pods configured, but the metrics pods could not come up.

# oc get pod -o wide|grep gluster
glusterblock-storage-provisioner-dc-1-l567j   1/1       Running             1          3d        10.128.4.11    dhcp47-86.lab.eng.blr.redhat.com
glusterfs-storage-9j9nk                       1/1       Running             1          3d        10.70.46.150   dhcp46-150.lab.eng.blr.redhat.com
glusterfs-storage-hr8ht                       1/1       Running             1          3d        10.70.46.219   dhcp46-219.lab.eng.blr.redhat.com
glusterfs-storage-q22cl                       1/1       Running             1          3d        10.70.46.231   dhcp46-231.lab.eng.blr.redhat.com

Steps Performed
==================

1. We updated the docker and gluster client packages on each OCP node, which also restarted the gluster pods.

2. Created around 50 block PVCs in two loops and then attached them to app pods.

For 50 PVCs of 2 GB each, a total of 2 block-hosting volumes were created:

1. First block-hosting volume = vol_9f93ae4c845f3910f5d1558cc5ae9f0a
2. Second block-hosting volume = vol_1fda560284e932cae1e384fe779b430f

Details :
=============
A) For PVCs bk101-bk148, space was allocated from vol_9f93ae4c845f3910f5d1558cc5ae9f0a. Each block-volume creation succeeded (as seen from the heketi logs).

B) For PVC bk149, a new block-hosting volume was created (vol_1fda560284e932cae1e384fe779b430f) and was used for PVCs bk149 and bk150 (Time = 2018/08/20 10:01:10 UTC).


The following issues were seen on gluster pod 10.70.46.150
========================================================

1. For gluster node 10.70.46.150, the brick process for heketidbstorage was DOWN (even though ps -ef | grep glusterfsd reported it as running).
2. For gluster node 10.70.46.150, the brick process for vol_9f93ae4c845f3910f5d1558cc5ae9f0a was DOWN.
3. On a node with brick-mux enabled, all block-hosting volumes should share the same brick PID. However, on 10.70.46.150, the 2 block-hosting volumes had 2 different PIDs.

Thus, on creation of the 2nd block-hosting volume (vol_1fda560284e932cae1e384fe779b430f), it should have reused PID 540 of the 1st block-hosting volume (vol_9f93ae4c845f3910f5d1558cc5ae9f0a). Instead, a new PID (12654) was used. As a result, the brick process for vol_9f93ae4c845f3910f5d1558cc5ae9f0a went into NOT ONLINE status.
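With brick multiplexing, the expectation above can be checked mechanically: every glusterfsd serving a block-hosting (vol_*) brick on a node should report the same PID. A minimal sketch that counts distinct PIDs from ps output follows; the sample lines are abridged from the output in this report, and the vol_ name filter is an assumption based on the heketi volume-naming scheme.

```shell
#!/bin/sh
# Count distinct glusterfsd PIDs serving block-hosting (vol_*) bricks.
# With brick-mux working, the expected count is 1 per node.
count_blockhost_pids() {
    # stdin: "ps -ef | grep glusterfsd" style lines; PID is field 2
    awk '/glusterfsd/ && /--volfile-id vol_/ { print $2 }' | sort -u | wc -l | tr -d ' '
}

# Abridged sample from node 10.70.46.150 in this report: the two
# block-hosting bricks run under two different PIDs (540 and 12654).
sample='root 540 1 95 06:03 ? 04:18:55 /usr/sbin/glusterfsd -s 10.70.46.150 --volfile-id vol_9f93ae4c845f3910f5d1558cc5ae9f0a.10.70.46.150.brick
root 558 1 0 06:03 ? 00:00:04 /usr/sbin/glusterfsd -s 10.70.46.150 --volfile-id heketidbstorage.10.70.46.150.brick
root 12654 1 36 10:01 ? 00:11:54 /usr/sbin/glusterfsd -s 10.70.46.150 --volfile-id vol_1fda560284e932cae1e384fe779b430f.10.70.46.150.brick'

echo "$sample" | count_blockhost_pids   # prints 2 here, i.e. a mux violation
```

Against a live pod, the same check would look like `oc exec <pod> -- ps -ef | count_blockhost_pids`, expecting 1.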



Some outputs from gluster and heketi end
==========================================

Brick PIDs from pods
----------------------------

[root@dhcp46-137 nitin]# for i in `oc get pods -o wide| grep glusterfs|cut -d " " -f1` ; do echo $i; echo +++++++++++++++++++++++; oc exec $i -- ps -ef|grep glusterfsd; echo ""; done

glusterfs-storage-9j9nk
+++++++++++++++++++++++
root       540     1 95 06:03 ?        04:18:55 /usr/sbin/glusterfsd -s 10.70.46.150 --volfile-id vol_9f93ae4c845f3910f5d1558cc5ae9f0a.10.70.46.150.var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_0cb321e2f4b4290bda1c2f9ae5085544-brick -p /var/run/gluster/vols/vol_9f93ae4c845f3910f5d1558cc5ae9f0a/10.70.46.150-var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_0cb321e2f4b4290bda1c2f9ae5085544-brick.pid -S /var/run/gluster/6a8f7b2cb8da0e6a3ae398160998af29.socket --brick-name /var/lib/heketi/mounts/vg_29d26d418f4ec01cbd8805704313e5e0/brick_0cb321e2f4b4290bda1c2f9ae5085544/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_0cb321e2f4b4290bda1c2f9ae5085544-brick.log --xlator-option *-posix.glusterd-uuid=cd776ee9-6a31-496d-a8af-072f4c23aee4 --brick-port 49153 --xlator-option vol_9f93ae4c845f3910f5d1558cc5ae9f0a-server.listen-port=49153


root       558     1  0 06:03 ?        00:00:04 /usr/sbin/glusterfsd -s 10.70.46.150 --volfile-id heketidbstorage.10.70.46.150.var-lib-heketi-mounts-vg_6064162e01514ddd000da6dafdc79216-brick_c8dd81dd3761dd8212327131c4009716-brick -p /var/run/gluster/vols/heketidbstorage/10.70.46.150-var-lib-heketi-mounts-vg_6064162e01514ddd000da6dafdc79216-brick_c8dd81dd3761dd8212327131c4009716-brick.pid -S /var/run/gluster/6fe959498daa7ffa24cb0ec026f845e3.socket --brick-name /var/lib/heketi/mounts/vg_6064162e01514ddd000da6dafdc79216/brick_c8dd81dd3761dd8212327131c4009716/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_6064162e01514ddd000da6dafdc79216-brick_c8dd81dd3761dd8212327131c4009716-brick.log --xlator-option *-posix.glusterd-uuid=cd776ee9-6a31-496d-a8af-072f4c23aee4 --brick-port 49152 --xlator-option heketidbstorage-server.listen-port=49152


root     12654     1 36 10:01 ?        00:11:54 /usr/sbin/glusterfsd -s 10.70.46.150 --volfile-id vol_1fda560284e932cae1e384fe779b430f.10.70.46.150.var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_013ad53a5ed578fd8f1275525e2c5916-brick -p /var/run/gluster/vols/vol_1fda560284e932cae1e384fe779b430f/10.70.46.150-var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_013ad53a5ed578fd8f1275525e2c5916-brick.pid -S /var/run/gluster/d16b6a7370a888823fc43060eeed1b2e.socket --brick-name /var/lib/heketi/mounts/vg_29d26d418f4ec01cbd8805704313e5e0/brick_013ad53a5ed578fd8f1275525e2c5916/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_013ad53a5ed578fd8f1275525e2c5916-brick.log --xlator-option *-posix.glusterd-uuid=cd776ee9-6a31-496d-a8af-072f4c23aee4 --brick-port 49154 --xlator-option vol_1fda560284e932cae1e384fe779b430f-server.listen-port=49154



glusterfs-storage-hr8ht
+++++++++++++++++++++++

root       527     1  0 07:49 ?        00:00:05 /usr/sbin/glusterfsd -s 10.70.46.219 --volfile-id heketidbstorage.10.70.46.219.var-lib-heketi-mounts-vg_a1297c7a138dcac578308e8afada5161-brick_8484360626b40cc47f707c683391f8b8-brick -p /var/run/gluster/vols/heketidbstorage/10.70.46.219-var-lib-heketi-mounts-vg_a1297c7a138dcac578308e8afada5161-brick_8484360626b40cc47f707c683391f8b8-brick.pid -S /var/run/gluster/49760ea523e2d92425cd374e21f7c6a6.socket --brick-name /var/lib/heketi/mounts/vg_a1297c7a138dcac578308e8afada5161/brick_8484360626b40cc47f707c683391f8b8/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_a1297c7a138dcac578308e8afada5161-brick_8484360626b40cc47f707c683391f8b8-brick.log --xlator-option *-posix.glusterd-uuid=09aed8ae-858b-4d53-b7fa-745aa9443f18 --brick-port 49152 --xlator-option heketidbstorage-server.listen-port=49152


root       535     1 99 07:49 ?        04:15:43 /usr/sbin/glusterfsd -s 10.70.46.219 --volfile-id vol_9f93ae4c845f3910f5d1558cc5ae9f0a.10.70.46.219.var-lib-heketi-mounts-vg_a32d29646c91834eeac64529870c71cd-brick_3319cdb494d7c201d2991173d18f2575-brick -p /var/run/gluster/vols/vol_9f93ae4c845f3910f5d1558cc5ae9f0a/10.70.46.219-var-lib-heketi-mounts-vg_a32d29646c91834eeac64529870c71cd-brick_3319cdb494d7c201d2991173d18f2575-brick.pid -S /var/run/gluster/c3f1138994a30170b2d5759b8bdbc313.socket --brick-name /var/lib/heketi/mounts/vg_a32d29646c91834eeac64529870c71cd/brick_3319cdb494d7c201d2991173d18f2575/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_a32d29646c91834eeac64529870c71cd-brick_3319cdb494d7c201d2991173d18f2575-brick.log --xlator-option *-posix.glusterd-uuid=09aed8ae-858b-4d53-b7fa-745aa9443f18 --brick-port 49153 --xlator-option vol_9f93ae4c845f3910f5d1558cc5ae9f0a-server.listen-port=49153



glusterfs-storage-q22cl
+++++++++++++++++++++++

root       549     1  0 07:10 ?        00:00:05 /usr/sbin/glusterfsd -s 10.70.46.231 --volfile-id heketidbstorage.10.70.46.231.var-lib-heketi-mounts-vg_357556739aad4d3b81d3e935a27339dc-brick_6ee29707729f1c338abc1604473d6059-brick -p /var/run/gluster/vols/heketidbstorage/10.70.46.231-var-lib-heketi-mounts-vg_357556739aad4d3b81d3e935a27339dc-brick_6ee29707729f1c338abc1604473d6059-brick.pid -S /var/run/gluster/26c7477db0159fa810a1574951aa87cc.socket --brick-name /var/lib/heketi/mounts/vg_357556739aad4d3b81d3e935a27339dc/brick_6ee29707729f1c338abc1604473d6059/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_357556739aad4d3b81d3e935a27339dc-brick_6ee29707729f1c338abc1604473d6059-brick.log --xlator-option *-posix.glusterd-uuid=2018bea2-c934-4cb1-b19f-df8fb79752cc --brick-port 49152 --xlator-option heketidbstorage-server.listen-port=49152


root       557     1 99 07:10 ?        04:33:11 /usr/sbin/glusterfsd -s 10.70.46.231 --volfile-id vol_9f93ae4c845f3910f5d1558cc5ae9f0a.10.70.46.231.var-lib-heketi-mounts-vg_4525875aacd97d4337fb6a9c5f13eba6-brick_8c0881f317eb11607eff74a16027663d-brick -p /var/run/gluster/vols/vol_9f93ae4c845f3910f5d1558cc5ae9f0a/10.70.46.231-var-lib-heketi-mounts-vg_4525875aacd97d4337fb6a9c5f13eba6-brick_8c0881f317eb11607eff74a16027663d-brick.pid -S /var/run/gluster/e88f8f4b09da12a41b04762db8e06ada.socket --brick-name /var/lib/heketi/mounts/vg_4525875aacd97d4337fb6a9c5f13eba6/brick_8c0881f317eb11607eff74a16027663d/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_4525875aacd97d4337fb6a9c5f13eba6-brick_8c0881f317eb11607eff74a16027663d-brick.log --xlator-option *-posix.glusterd-uuid=2018bea2-c934-4cb1-b19f-df8fb79752cc --brick-port 49153 --xlator-option vol_9f93ae4c845f3910f5d1558cc5ae9f0a-server.listen-port=49153


_________________________________________________________________________________


Gluster v status
-------------------


glusterfs-storage-9j9nk
+++++++++++++++++++++++
#gluster v status


Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.231:/var/lib/heketi/mounts/v
g_357556739aad4d3b81d3e935a27339dc/brick_6e
e29707729f1c338abc1604473d6059/brick        49152     0          Y       549  
Brick 10.70.46.219:/var/lib/heketi/mounts/v
g_a1297c7a138dcac578308e8afada5161/brick_84
84360626b40cc47f707c683391f8b8/brick        49152     0          Y       527  
Brick 10.70.46.150:/var/lib/heketi/mounts/v
g_6064162e01514ddd000da6dafdc79216/brick_c8
dd81dd3761dd8212327131c4009716/brick        N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       12681
Self-heal Daemon on 10.70.46.231            N/A       N/A        Y       10348
Self-heal Daemon on 10.70.46.219            N/A       N/A        Y       9154 
 
Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks


 
Status of volume: vol_1fda560284e932cae1e384fe779b430f
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.219:/var/lib/heketi/mounts/v
g_a32d29646c91834eeac64529870c71cd/brick_a4
2d82892274b1e498d54eb791bec7e5/brick        49153     0          Y       535  
Brick 10.70.46.150:/var/lib/heketi/mounts/v
g_29d26d418f4ec01cbd8805704313e5e0/brick_01
3ad53a5ed578fd8f1275525e2c5916/brick        49154     0          Y       12654
Brick 10.70.46.231:/var/lib/heketi/mounts/v
g_357556739aad4d3b81d3e935a27339dc/brick_e1
fa6fec4ca03e73f937ed35bcfd51a3/brick        49153     0          Y       557  
Self-heal Daemon on localhost               N/A       N/A        Y       12681
Self-heal Daemon on 10.70.46.219            N/A       N/A        Y       9154 
Self-heal Daemon on 10.70.46.231            N/A       N/A        Y       10348
 
Task Status of Volume vol_1fda560284e932cae1e384fe779b430f
------------------------------------------------------------------------------
There are no active volume tasks
 


Status of volume: vol_9f93ae4c845f3910f5d1558cc5ae9f0a
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.231:/var/lib/heketi/mounts/v
g_4525875aacd97d4337fb6a9c5f13eba6/brick_8c
0881f317eb11607eff74a16027663d/brick        49153     0          Y       557  
Brick 10.70.46.219:/var/lib/heketi/mounts/v
g_a32d29646c91834eeac64529870c71cd/brick_33
19cdb494d7c201d2991173d18f2575/brick        49153     0          Y       535  
Brick 10.70.46.150:/var/lib/heketi/mounts/v
g_29d26d418f4ec01cbd8805704313e5e0/brick_0c
b321e2f4b4290bda1c2f9ae5085544/brick        N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       12681
Self-heal Daemon on 10.70.46.219            N/A       N/A        Y       9154 
Self-heal Daemon on 10.70.46.231            N/A       N/A        Y       10348
 
Task Status of Volume vol_9f93ae4c845f3910f5d1558cc5ae9f0a
------------------------------------------------------------------------------
There are no active volume tasks



___________________________________________________________________________________________________


Gluster v heal info
-----------------------


[root@dhcp46-137 nitin]# oc rsh glusterfs-storage-9j9nk 
sh-4.2# for i in `gluster v list` ; do echo $i; echo ""; gluster v heal $i info ; done
heketidbstorage

Brick 10.70.46.231:/var/lib/heketi/mounts/vg_357556739aad4d3b81d3e935a27339dc/brick_6ee29707729f1c338abc1604473d6059/brick
Status: Connected
Number of entries: 0

Brick 10.70.46.219:/var/lib/heketi/mounts/vg_a1297c7a138dcac578308e8afada5161/brick_8484360626b40cc47f707c683391f8b8/brick
Status: Connected
Number of entries: 0

Brick 10.70.46.150:/var/lib/heketi/mounts/vg_6064162e01514ddd000da6dafdc79216/brick_c8dd81dd3761dd8212327131c4009716/brick
Status: Connected
Number of entries: 0

vol_1fda560284e932cae1e384fe779b430f

Brick 10.70.46.219:/var/lib/heketi/mounts/vg_a32d29646c91834eeac64529870c71cd/brick_a42d82892274b1e498d54eb791bec7e5/brick
Status: Connected
Number of entries: 0

Brick 10.70.46.150:/var/lib/heketi/mounts/vg_29d26d418f4ec01cbd8805704313e5e0/brick_013ad53a5ed578fd8f1275525e2c5916/brick
Status: Connected
Number of entries: 0

Brick 10.70.46.231:/var/lib/heketi/mounts/vg_357556739aad4d3b81d3e935a27339dc/brick_e1fa6fec4ca03e73f937ed35bcfd51a3/brick
Status: Connected
Number of entries: 0

vol_9f93ae4c845f3910f5d1558cc5ae9f0a

Brick 10.70.46.231:/var/lib/heketi/mounts/vg_4525875aacd97d4337fb6a9c5f13eba6/brick_8c0881f317eb11607eff74a16027663d/brick
Status: Connected
Number of entries: 0

Brick 10.70.46.219:/var/lib/heketi/mounts/vg_a32d29646c91834eeac64529870c71cd/brick_3319cdb494d7c201d2991173d18f2575/brick
Status: Connected
Number of entries: 0

Brick 10.70.46.150:/var/lib/heketi/mounts/vg_29d26d418f4ec01cbd8805704313e5e0/brick_0cb321e2f4b4290bda1c2f9ae5085544/brick
Status: Connected
Number of entries: 0


________________________________________________________________________________


heketi-cli volume info
--------------------------

[root@dhcp46-137 nitin]# heketi-cli volume list

Id:051fbebfff39a75518aebd9c542db218    Cluster:f1717bd3ca8e511987efcb75bee36753    Name:heketidbstorage
Id:1fda560284e932cae1e384fe779b430f    Cluster:f1717bd3ca8e511987efcb75bee36753    Name:vol_1fda560284e932cae1e384fe779b430f [block]
Id:9f93ae4c845f3910f5d1558cc5ae9f0a    Cluster:f1717bd3ca8e511987efcb75bee36753    Name:vol_9f93ae4c845f3910f5d1558cc5ae9f0a [block]


[root@dhcp46-137 nitin]# heketi-cli volume info 9f93ae4c845f3910f5d1558cc5ae9f0a
Name: vol_9f93ae4c845f3910f5d1558cc5ae9f0a
Size: 100
Volume Id: 9f93ae4c845f3910f5d1558cc5ae9f0a
Cluster Id: f1717bd3ca8e511987efcb75bee36753
Mount: 10.70.46.150:vol_9f93ae4c845f3910f5d1558cc5ae9f0a
Mount Options: backup-volfile-servers=10.70.46.231,10.70.46.219
Block: true
Free Size: 1
Reserved Size: 2
Block Volumes: [0a1d30a3cc52dd406d2d39d5b5e70501 1060a75bff3f7cf713d30e016d32e8d0 1bf335c542c541cb197d1c37551d69d1 1f6b22241e9ad5ff40a0611c45a4a606 1f6d7345b9ab9b8e3addafc52f1e3252 23b5ef91aa47911e0d6d9279d94abf5b 29b61ed7b9065ed7c956a41139feb95d 2eea0f1def7f98d0519bc2c2b4aec85b 323bbd58629cd3bfe5bf1752f7028b45 3422a5369285c26a7e50d5aa3155f4e8 4498fe8c61bac062063acb10d5b95edd 49cd42e8094930437dfa23d8c9238d0a 520786834088c2cbba3a03204cdb4594 52e2b21b8171683a9513f6b922fd7f39 55391004bd3fe04524383638c3e9d6e7 5d64874fce7b916e06cad604dbf79de7 6372adf57d65bd2ac0ac437481c6d6a9 6a3cddaeb400d07ea2e2ef74c0b2f0e3 6e6e33e929fe6acdccdf1235a434927d 6fae3fb6169a9ad9aee34acff87d2019 7a7fe759b41c42642f0120c762923205 7b2a832b41b1994bf99980a344ef0180 7ce28f8c4e93d59cd39477f6d5822389 7e80aed66dcc4060f835cff73f4f9602 8028bcd29d80b7014831b937472116b8 850984510e650f5843e8cfb12274a10c 857ae1d3ede1869a61553d2d4a26aff7 8ab9d447993f857dfa9c374d3e787321 8c04bc6ac677749d44d2a59e3bb3167e 9a42c342c7e7b61b92a2bca3459bbeb5 a1e1912b66a20e26da53b173a396ed1c a43bd472302bebd8f73e1c5a79de5393 bd1586d5c7c756f16f5b93b247ff6474 c5b75d3969b23060285479770250944e cddbf02ea507351d26d1f6554044acb6 d1211f12dcf20021221f7d0bcc9206e7 d2d5ddc1a4621f290258c642b6a8ba5f d357e1167d54d83781c45c2de10f6d24 d7cb8bdbf2a6a7ecd63721d68d1291fb dae0c90207b354ebe87d5c5fc5cbb899 ea44ec1816ae619cef067c79871bf5e6 ed18246b3709d93c6ab0ab64bb7484ca edb6a8d91ffa91074dddccbf54cbd4c0 efd71d9aafff8ed67005458cd7b5ba86 efe026d4f4de819e951e7eead2d825c7 f44ce54dcddf9ceb96ae896e39b5175f f46d5a1c37f3483852b5234a17d26a49 f9c301cd176a9c23203c7d1bc5d14d93 fe9435237c267106ac5c72b3beaf9eff]
Durability Type: replicate
Distributed+Replica: 3


[root@dhcp46-137 ~]# heketi-cli volume info 1fda560284e932cae1e384fe779b430f
Name: vol_1fda560284e932cae1e384fe779b430f
Size: 100
Volume Id: 1fda560284e932cae1e384fe779b430f
Cluster Id: f1717bd3ca8e511987efcb75bee36753
Mount: 10.70.46.150:vol_1fda560284e932cae1e384fe779b430f
Mount Options: backup-volfile-servers=10.70.46.231,10.70.46.219
Block: true
Free Size: 94
Reserved Size: 2
Block Volumes: [3e4524d6023d3bea6607f53da04ace3e c58f20070e1286a5460edbf624ab6c93]
Durability Type: replicate
Distributed+Replica: 3
[root@dhcp46-137 ~]# 




Some outputs from heketi logs
=============================



# cat heketi_logs |grep bk149
[kubeexec] ERROR 2018/08/20 10:00:56 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:242: Failed to run command [gluster-block create vol_9f93ae4c845f3910f5d1558cc5ae9f0a/bk_glusterfs_bk149_e4f79e2b-a45f-11e8-b773-0a580a81040b  ha 3 auth enable prealloc full 10.70.46.231,10.70.46.150,10.70.46.219 2GiB --json] on glusterfs-storage-9j9nk: Err[command terminated with exit code 255]: Stdout [{ "IQN": "iqn.2016-12.org.gluster-block:7d351894-126a-4147-9fb1-d30cb90aab32", "USERNAME": "7d351894-126a-4147-9fb1-d30cb90aab32", "PASSWORD": "4c8bc088-066c-4278-a1b5-072240b0435c", "PORTAL(S)": [ "10.70.46.231:3260", "10.70.46.150:3260", "10.70.46.219:3260" ], "ROLLBACK FAILED ON": [ "10.70.46.150", "10.70.46.219", "10.70.46.231" ], "RESULT": "FAIL" }
[kubeexec] ERROR 2018/08/20 10:01:01 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:242: Failed to run command [gluster-block delete vol_9f93ae4c845f3910f5d1558cc5ae9f0a/bk_glusterfs_bk149_e4f79e2b-a45f-11e8-b773-0a580a81040b --json] on glusterfs-storage-9j9nk: Err[command terminated with exit code 255]: Stdout [{ "RESULT": "FAIL" }
[cmdexec] ERROR 2018/08/20 10:01:01 /src/github.com/heketi/heketi/executors/cmdexec/block_volume.go:102: Unable to delete volume bk_glusterfs_bk149_e4f79e2b-a45f-11e8-b773-0a580a81040b: Unable to execute command on glusterfs-storage-9j9nk: 
[kubeexec] ERROR 2018/08/20 10:01:03 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:242: Failed to run command [gluster-block delete vol_9f93ae4c845f3910f5d1558cc5ae9f0a/bk_glusterfs_bk149_e4f79e2b-a45f-11e8-b773-0a580a81040b --json] on glusterfs-storage-9j9nk: Err[command terminated with exit code 255]: Stdout [{ "RESULT": "FAIL" }
[cmdexec] ERROR 2018/08/20 10:01:03 /src/github.com/heketi/heketi/executors/cmdexec/block_volume.go:102: Unable to delete volume bk_glusterfs_bk149_e4f79e2b-a45f-11e8-b773-0a580a81040b: Unable to execute command on glusterfs-storage-9j9nk: 



[kubeexec] DEBUG 2018/08/20 10:01:05 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:246: Host: dhcp46-219.lab.eng.blr.redhat.com Pod: glusterfs-storage-hr8ht Command: gluster --mode=script volume start vol_1fda560284e932cae1e384fe779b430f
Result: volume start: vol_1fda560284e932cae1e384fe779b430f: success
[kubeexec] DEBUG 2018/08/20 10:01:15 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:246: Host: dhcp46-150.lab.eng.blr.redhat.com Pod: glusterfs-storage-9j9nk Command: gluster-block create vol_1fda560284e932cae1e384fe779b430f/bk_glusterfs_bk149_f765daa2-a45f-11e8-b773-0a580a81040b  ha 3 auth enable prealloc full 10.70.46.219,10.70.46.231,10.70.46.150 2GiB --json



We also hit BZ 1619264 (https://bugzilla.redhat.com/show_bug.cgi?id=1619264) while the PVCs were getting mapped to app pods.


Version-Release number of selected component (if applicable):
++++++++++++++++++++++++

[root@dhcp46-137 ~]# oc version
oc v3.10.14
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://dhcp46-137.lab.eng.blr.redhat.com:8443
openshift v3.10.14
kubernetes v1.10.0+b81c8f8
[root@dhcp46-137 ~]# 


Gluster 3.4.0
==============

[root@dhcp46-137 ~]# oc rsh glusterfs-storage-q22cl rpm -qa|grep gluster
glusterfs-client-xlators-3.12.2-15.el7rhgs.x86_64
glusterfs-cli-3.12.2-15.el7rhgs.x86_64
python2-gluster-3.12.2-15.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-15.el7rhgs.x86_64
glusterfs-libs-3.12.2-15.el7rhgs.x86_64
glusterfs-3.12.2-15.el7rhgs.x86_64
glusterfs-api-3.12.2-15.el7rhgs.x86_64
glusterfs-fuse-3.12.2-15.el7rhgs.x86_64
glusterfs-server-3.12.2-15.el7rhgs.x86_64
gluster-block-0.2.1-24.el7rhgs.x86_64
[root@dhcp46-137 ~]# 


[root@dhcp46-137 ~]# oc rsh heketi-storage-1-px7jd rpm -qa|grep heketi
python-heketi-7.0.0-6.el7rhgs.x86_64
heketi-7.0.0-6.el7rhgs.x86_64
heketi-client-7.0.0-6.el7rhgs.x86_64
[root@dhcp46-137 ~]# 


gluster client version
=========================
[root@dhcp46-65 ~]# rpm -qa|grep gluster
glusterfs-libs-3.12.2-15.el7.x86_64
glusterfs-3.12.2-15.el7.x86_64
glusterfs-fuse-3.12.2-15.el7.x86_64
glusterfs-client-xlators-3.12.2-15.el7.x86_64
[root@dhcp46-65 ~]# 




How reproducible:
++++++++++++++++++++++++
1x1


Steps to Reproduce:
++++++++++++++++++++++++
1. Create an OCP + OCS 3.10 setup.
2. Upgrade the docker version to 1.13.1.74 and also update the gluster client packages. The pods will be restarted as docker is upgraded.
3. Once the setup is up, create block PVCs and then bind them to app pods.
4. Check the pod status and the gluster v status. With brick-mux enabled, all block-hosting volumes should have a single brick PID.
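For step 4, the Online column of `gluster v status` can be scanned automatically. A rough sketch that flags volumes with any brick reported offline follows; the sample is condensed from the status output in this report (brick paths shortened, wrapped brick rows joined onto one line), and the field positions are an assumption about that output format.

```shell
#!/bin/sh
# List volumes that have at least one brick reported offline (Online = N)
# in "gluster v status" output. Assumes unwrapped brick rows whose last
# four fields are: TCP-port RDMA-port Online PID. A sketch, not a parser.
offline_volumes() {
    awk '
        /^Status of volume:/ { vol = $4 }
        $(NF-1) == "N" && vol != "" { print vol; vol = "" }
    ' "$@"
}

# Condensed sample from this report (paths shortened, brick lines unwrapped):
sample='Status of volume: heketidbstorage
Brick 10.70.46.231:/var/lib/heketi/mounts/brick        49152     0          Y       549
Brick 10.70.46.150:/var/lib/heketi/mounts/brick        N/A       N/A        N       N/A
Status of volume: vol_1fda560284e932cae1e384fe779b430f
Brick 10.70.46.150:/var/lib/heketi/mounts/brick        49154     0          Y       12654'

echo "$sample" | offline_volumes   # prints: heketidbstorage
```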


Actual results:
++++++++++++++++++++++++
1. There are 2 PIDs for the block-hosting volumes instead of 1.
2. The bricks for heketidbstorage and a block-hosting volume are NOT ONLINE.

Expected results:
++++++++++++++++++++++++
Even after a pod restart or during PVC creation, the bricks of the volumes should not stay in NOT ONLINE state.

Comment 18 errata-xmlrpc 2018-09-12 10:54:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2688

