Description of problem:
++++++++++++++++++++++++
We had an OCP 3.10 + OCS 3.10 setup with gluster bits 3.12.2-15 and gluster-block version gluster-block-0.2.1-24.el7rhgs.x86_64. The setup had logging pods configured, and the metrics pods could not come up.

# oc get pod -o wide|grep gluster
glusterblock-storage-provisioner-dc-1-l567j   1/1   Running   1   3d   10.128.4.11    dhcp47-86.lab.eng.blr.redhat.com
glusterfs-storage-9j9nk                       1/1   Running   1   3d   10.70.46.150   dhcp46-150.lab.eng.blr.redhat.com
glusterfs-storage-hr8ht                       1/1   Running   1   3d   10.70.46.219   dhcp46-219.lab.eng.blr.redhat.com
glusterfs-storage-q22cl                       1/1   Running   1   3d   10.70.46.231   dhcp46-231.lab.eng.blr.redhat.com

Steps Performed
==================
1. We updated the docker and gluster client packages on each OCP node, which also resulted in gluster pod restarts.
2. Created around 50 block PVCs in two loops and then attached them to app pods.

For 50 PVCs of 2 GB each, a total of 2 block-hosting volumes were created:
1. First block-hosting volume  = vol_9f93ae4c845f3910f5d1558cc5ae9f0a
2. Second block-hosting volume = vol_1fda560284e932cae1e384fe779b430f

Details:
=============
A) For PVCs bk101-bk148, space was allocated from vol_9f93ae4c845f3910f5d1558cc5ae9f0a. Each block-volume creation succeeded (as seen from the heketi logs).
B) For PVC bk149, a new block-hosting volume was created (vol_1fda560284e932cae1e384fe779b430f) and was used for PVCs bk149 and bk150 (Time = 2018/08/20 10:01:10 UTC).

The following issues were seen on gluster pod 10.70.46.150:
========================================================
1. On gluster node 10.70.46.150, the brick process for heketidbstorage is DOWN (even though ps -ef | grep glusterfsd reports it as running).
2. On gluster node 10.70.46.150, the brick process for vol_9f93ae4c845f3910f5d1558cc5ae9f0a is DOWN.
3. With brick-mux enabled, all block-hosting volumes on a node should share the same brick PID. However, on 10.70.46.150 the 2 block-hosting volumes have 2 different PIDs. On creation of the 2nd block-hosting volume (vol_1fda560284e932cae1e384fe779b430f), it should have attached to PID 540 of the 1st block-hosting volume (vol_9f93ae4c845f3910f5d1558cc5ae9f0a); instead a new PID (12654) was used. This resulted in the brick process for vol_9f93ae4c845f3910f5d1558cc5ae9f0a going into NOT ONLINE status. A quick way to confirm the brick-mux expectation is shown below.
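(Note: a minimal sketch for verifying the brick-mux expectation described in point 3 above; run from the master node. The pod name glusterfs-storage-9j9nk and node IP 10.70.46.150 are taken from this setup and should be adjusted as needed.)

# Confirm brick multiplexing is enabled cluster-wide
oc exec glusterfs-storage-9j9nk -- gluster volume get all cluster.brick-multiplex

# With brick-mux on, the block-hosting volumes on a node are expected to be
# served by a single glusterfsd process, so their PID columns should match
# one process from ps.
oc exec glusterfs-storage-9j9nk -- ps -ef | grep glusterfsd

for vol in $(oc exec glusterfs-storage-9j9nk -- gluster volume list); do
  echo "== $vol =="
  # grep -A2 because the brick path wraps across lines in the status table
  oc exec glusterfs-storage-9j9nk -- gluster volume status "$vol" | grep -A2 "^Brick 10.70.46.150"
done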
Some outputs from gluster and heketi end
==========================================

Brick PIDS from PODS
-----------------------------
[root@dhcp46-137 nitin]# for i in `oc get pods -o wide| grep glusterfs|cut -d " " -f1` ; do echo $i; echo +++++++++++++++++++++++; oc exec $i -- ps -ef|grep glusterfsd; echo ""; done

glusterfs-storage-9j9nk
+++++++++++++++++++++++
root   540  1 95 06:03 ? 04:18:55 /usr/sbin/glusterfsd -s 10.70.46.150 --volfile-id vol_9f93ae4c845f3910f5d1558cc5ae9f0a.10.70.46.150.var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_0cb321e2f4b4290bda1c2f9ae5085544-brick -p /var/run/gluster/vols/vol_9f93ae4c845f3910f5d1558cc5ae9f0a/10.70.46.150-var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_0cb321e2f4b4290bda1c2f9ae5085544-brick.pid -S /var/run/gluster/6a8f7b2cb8da0e6a3ae398160998af29.socket --brick-name /var/lib/heketi/mounts/vg_29d26d418f4ec01cbd8805704313e5e0/brick_0cb321e2f4b4290bda1c2f9ae5085544/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_0cb321e2f4b4290bda1c2f9ae5085544-brick.log --xlator-option *-posix.glusterd-uuid=cd776ee9-6a31-496d-a8af-072f4c23aee4 --brick-port 49153 --xlator-option vol_9f93ae4c845f3910f5d1558cc5ae9f0a-server.listen-port=49153
root   558  1  0 06:03 ? 00:00:04 /usr/sbin/glusterfsd -s 10.70.46.150 --volfile-id heketidbstorage.10.70.46.150.var-lib-heketi-mounts-vg_6064162e01514ddd000da6dafdc79216-brick_c8dd81dd3761dd8212327131c4009716-brick -p /var/run/gluster/vols/heketidbstorage/10.70.46.150-var-lib-heketi-mounts-vg_6064162e01514ddd000da6dafdc79216-brick_c8dd81dd3761dd8212327131c4009716-brick.pid -S /var/run/gluster/6fe959498daa7ffa24cb0ec026f845e3.socket --brick-name /var/lib/heketi/mounts/vg_6064162e01514ddd000da6dafdc79216/brick_c8dd81dd3761dd8212327131c4009716/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_6064162e01514ddd000da6dafdc79216-brick_c8dd81dd3761dd8212327131c4009716-brick.log --xlator-option *-posix.glusterd-uuid=cd776ee9-6a31-496d-a8af-072f4c23aee4 --brick-port 49152 --xlator-option heketidbstorage-server.listen-port=49152
root 12654  1 36 10:01 ? 00:11:54 /usr/sbin/glusterfsd -s 10.70.46.150 --volfile-id vol_1fda560284e932cae1e384fe779b430f.10.70.46.150.var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_013ad53a5ed578fd8f1275525e2c5916-brick -p /var/run/gluster/vols/vol_1fda560284e932cae1e384fe779b430f/10.70.46.150-var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_013ad53a5ed578fd8f1275525e2c5916-brick.pid -S /var/run/gluster/d16b6a7370a888823fc43060eeed1b2e.socket --brick-name /var/lib/heketi/mounts/vg_29d26d418f4ec01cbd8805704313e5e0/brick_013ad53a5ed578fd8f1275525e2c5916/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_013ad53a5ed578fd8f1275525e2c5916-brick.log --xlator-option *-posix.glusterd-uuid=cd776ee9-6a31-496d-a8af-072f4c23aee4 --brick-port 49154 --xlator-option vol_1fda560284e932cae1e384fe779b430f-server.listen-port=49154
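(Note: to make the PID/volume mapping above easier to scan, a small helper like the following can be run inside a gluster pod; it is a sketch that assumes the ps -ef layout shown above. It prints "<pid> <volume>" for every glusterfsd process, taking the volume name as the part of --volfile-id before the first dot.)

ps -ef | grep '[g]lusterfsd' | awk '{
  pid=$2;
  for (i=1; i<=NF; i++)
    if ($i == "--volfile-id") { split($(i+1), v, "."); print pid, v[1]; }
}'

On 10.70.46.150 this prints 540 vol_9f93ae4c845f3910f5d1558cc5ae9f0a, 558 heketidbstorage and 12654 vol_1fda560284e932cae1e384fe779b430f, i.e. two different PIDs for the two block-hosting volumes.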
glusterfs-storage-hr8ht
+++++++++++++++++++++++
root   527  1  0 07:49 ? 00:00:05 /usr/sbin/glusterfsd -s 10.70.46.219 --volfile-id heketidbstorage.10.70.46.219.var-lib-heketi-mounts-vg_a1297c7a138dcac578308e8afada5161-brick_8484360626b40cc47f707c683391f8b8-brick -p /var/run/gluster/vols/heketidbstorage/10.70.46.219-var-lib-heketi-mounts-vg_a1297c7a138dcac578308e8afada5161-brick_8484360626b40cc47f707c683391f8b8-brick.pid -S /var/run/gluster/49760ea523e2d92425cd374e21f7c6a6.socket --brick-name /var/lib/heketi/mounts/vg_a1297c7a138dcac578308e8afada5161/brick_8484360626b40cc47f707c683391f8b8/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_a1297c7a138dcac578308e8afada5161-brick_8484360626b40cc47f707c683391f8b8-brick.log --xlator-option *-posix.glusterd-uuid=09aed8ae-858b-4d53-b7fa-745aa9443f18 --brick-port 49152 --xlator-option heketidbstorage-server.listen-port=49152
root   535  1 99 07:49 ? 04:15:43 /usr/sbin/glusterfsd -s 10.70.46.219 --volfile-id vol_9f93ae4c845f3910f5d1558cc5ae9f0a.10.70.46.219.var-lib-heketi-mounts-vg_a32d29646c91834eeac64529870c71cd-brick_3319cdb494d7c201d2991173d18f2575-brick -p /var/run/gluster/vols/vol_9f93ae4c845f3910f5d1558cc5ae9f0a/10.70.46.219-var-lib-heketi-mounts-vg_a32d29646c91834eeac64529870c71cd-brick_3319cdb494d7c201d2991173d18f2575-brick.pid -S /var/run/gluster/c3f1138994a30170b2d5759b8bdbc313.socket --brick-name /var/lib/heketi/mounts/vg_a32d29646c91834eeac64529870c71cd/brick_3319cdb494d7c201d2991173d18f2575/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_a32d29646c91834eeac64529870c71cd-brick_3319cdb494d7c201d2991173d18f2575-brick.log --xlator-option *-posix.glusterd-uuid=09aed8ae-858b-4d53-b7fa-745aa9443f18 --brick-port 49153 --xlator-option vol_9f93ae4c845f3910f5d1558cc5ae9f0a-server.listen-port=49153
glusterfs-storage-q22cl
+++++++++++++++++++++++
root   549  1  0 07:10 ? 00:00:05 /usr/sbin/glusterfsd -s 10.70.46.231 --volfile-id heketidbstorage.10.70.46.231.var-lib-heketi-mounts-vg_357556739aad4d3b81d3e935a27339dc-brick_6ee29707729f1c338abc1604473d6059-brick -p /var/run/gluster/vols/heketidbstorage/10.70.46.231-var-lib-heketi-mounts-vg_357556739aad4d3b81d3e935a27339dc-brick_6ee29707729f1c338abc1604473d6059-brick.pid -S /var/run/gluster/26c7477db0159fa810a1574951aa87cc.socket --brick-name /var/lib/heketi/mounts/vg_357556739aad4d3b81d3e935a27339dc/brick_6ee29707729f1c338abc1604473d6059/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_357556739aad4d3b81d3e935a27339dc-brick_6ee29707729f1c338abc1604473d6059-brick.log --xlator-option *-posix.glusterd-uuid=2018bea2-c934-4cb1-b19f-df8fb79752cc --brick-port 49152 --xlator-option heketidbstorage-server.listen-port=49152
root   557  1 99 07:10 ? 04:33:11 /usr/sbin/glusterfsd -s 10.70.46.231 --volfile-id vol_9f93ae4c845f3910f5d1558cc5ae9f0a.10.70.46.231.var-lib-heketi-mounts-vg_4525875aacd97d4337fb6a9c5f13eba6-brick_8c0881f317eb11607eff74a16027663d-brick -p /var/run/gluster/vols/vol_9f93ae4c845f3910f5d1558cc5ae9f0a/10.70.46.231-var-lib-heketi-mounts-vg_4525875aacd97d4337fb6a9c5f13eba6-brick_8c0881f317eb11607eff74a16027663d-brick.pid -S /var/run/gluster/e88f8f4b09da12a41b04762db8e06ada.socket --brick-name /var/lib/heketi/mounts/vg_4525875aacd97d4337fb6a9c5f13eba6/brick_8c0881f317eb11607eff74a16027663d/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_4525875aacd97d4337fb6a9c5f13eba6-brick_8c0881f317eb11607eff74a16027663d-brick.log --xlator-option *-posix.glusterd-uuid=2018bea2-c934-4cb1-b19f-df8fb79752cc --brick-port 49153 --xlator-option vol_9f93ae4c845f3910f5d1558cc5ae9f0a-server.listen-port=49153
_________________________________________________________________________________
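(Note: as a cross-check for the "running in ps but not online in status" symptom, the -p path in each glusterfsd command line above is the pid file glusterd consults; comparing its contents with the live process is a quick sanity test. A sketch, run inside the affected gluster pod; the pid-file path below is for the vol_9f93ae4c845f3910f5d1558cc5ae9f0a brick on 10.70.46.150, taken from the output above.)

# Pid file recorded by glusterd for the vol_9f93... brick on 10.70.46.150
PIDFILE=/var/run/gluster/vols/vol_9f93ae4c845f3910f5d1558cc5ae9f0a/10.70.46.150-var-lib-heketi-mounts-vg_29d26d418f4ec01cbd8805704313e5e0-brick_0cb321e2f4b4290bda1c2f9ae5085544-brick.pid

echo "pid file says: $(cat "$PIDFILE")"

# Is that PID actually alive?
kill -0 "$(cat "$PIDFILE")" 2>/dev/null && echo "process alive" || echo "process not found"

# What glusterd reports for the same brick
gluster volume status vol_9f93ae4c845f3910f5d1558cc5ae9f0a | grep -A2 "^Brick 10.70.46.150"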
Gluster v status
--------------------
glusterfs-storage-9j9nk
+++++++++++++++++++++++
#gluster v status
Status of volume: heketidbstorage
Gluster process                                                                                                TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.231:/var/lib/heketi/mounts/vg_357556739aad4d3b81d3e935a27339dc/brick_6ee29707729f1c338abc1604473d6059/brick   49152   0   Y   549
Brick 10.70.46.219:/var/lib/heketi/mounts/vg_a1297c7a138dcac578308e8afada5161/brick_8484360626b40cc47f707c683391f8b8/brick   49152   0   Y   527
Brick 10.70.46.150:/var/lib/heketi/mounts/vg_6064162e01514ddd000da6dafdc79216/brick_c8dd81dd3761dd8212327131c4009716/brick   N/A     N/A   N   N/A
Self-heal Daemon on localhost                N/A   N/A   Y   12681
Self-heal Daemon on 10.70.46.231             N/A   N/A   Y   10348
Self-heal Daemon on 10.70.46.219             N/A   N/A   Y   9154

Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_1fda560284e932cae1e384fe779b430f
Gluster process                                                                                                TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.219:/var/lib/heketi/mounts/vg_a32d29646c91834eeac64529870c71cd/brick_a42d82892274b1e498d54eb791bec7e5/brick   49153   0   Y   535
Brick 10.70.46.150:/var/lib/heketi/mounts/vg_29d26d418f4ec01cbd8805704313e5e0/brick_013ad53a5ed578fd8f1275525e2c5916/brick   49154   0   Y   12654
Brick 10.70.46.231:/var/lib/heketi/mounts/vg_357556739aad4d3b81d3e935a27339dc/brick_e1fa6fec4ca03e73f937ed35bcfd51a3/brick   49153   0   Y   557
Self-heal Daemon on localhost                N/A   N/A   Y   12681
Self-heal Daemon on 10.70.46.219             N/A   N/A   Y   9154
Self-heal Daemon on 10.70.46.231             N/A   N/A   Y   10348

Task Status of Volume vol_1fda560284e932cae1e384fe779b430f
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_9f93ae4c845f3910f5d1558cc5ae9f0a
Gluster process                                                                                                TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.231:/var/lib/heketi/mounts/vg_4525875aacd97d4337fb6a9c5f13eba6/brick_8c0881f317eb11607eff74a16027663d/brick   49153   0   Y   557
Brick 10.70.46.219:/var/lib/heketi/mounts/vg_a32d29646c91834eeac64529870c71cd/brick_3319cdb494d7c201d2991173d18f2575/brick   49153   0   Y   535
Brick 10.70.46.150:/var/lib/heketi/mounts/vg_29d26d418f4ec01cbd8805704313e5e0/brick_0cb321e2f4b4290bda1c2f9ae5085544/brick   N/A     N/A   N   N/A
Self-heal Daemon on localhost                N/A   N/A   Y   12681
Self-heal Daemon on 10.70.46.219             N/A   N/A   Y   9154
Self-heal Daemon on 10.70.46.231             N/A   N/A   Y   10348

Task Status of Volume vol_9f93ae4c845f3910f5d1558cc5ae9f0a
------------------------------------------------------------------------------
There are no active volume tasks
___________________________________________________________________________________________________

Gluster v heal status
----------------------
[root@dhcp46-137 nitin]# oc rsh glusterfs-storage-9j9nk
sh-4.2# for i in `gluster v list` ; do echo $i; echo ""; gluster v heal $i info ; done

heketidbstorage
Brick 10.70.46.231:/var/lib/heketi/mounts/vg_357556739aad4d3b81d3e935a27339dc/brick_6ee29707729f1c338abc1604473d6059/brick
Status: Connected
Number of entries: 0
Brick 10.70.46.219:/var/lib/heketi/mounts/vg_a1297c7a138dcac578308e8afada5161/brick_8484360626b40cc47f707c683391f8b8/brick
Status: Connected
Number of entries: 0
Brick 10.70.46.150:/var/lib/heketi/mounts/vg_6064162e01514ddd000da6dafdc79216/brick_c8dd81dd3761dd8212327131c4009716/brick
Status: Connected
Number of entries: 0

vol_1fda560284e932cae1e384fe779b430f
Brick 10.70.46.219:/var/lib/heketi/mounts/vg_a32d29646c91834eeac64529870c71cd/brick_a42d82892274b1e498d54eb791bec7e5/brick
Status: Connected
Number of entries: 0
Brick 10.70.46.150:/var/lib/heketi/mounts/vg_29d26d418f4ec01cbd8805704313e5e0/brick_013ad53a5ed578fd8f1275525e2c5916/brick
Status: Connected
Number of entries: 0
Brick 10.70.46.231:/var/lib/heketi/mounts/vg_357556739aad4d3b81d3e935a27339dc/brick_e1fa6fec4ca03e73f937ed35bcfd51a3/brick
Status: Connected
Number of entries: 0

vol_9f93ae4c845f3910f5d1558cc5ae9f0a
Brick 10.70.46.231:/var/lib/heketi/mounts/vg_4525875aacd97d4337fb6a9c5f13eba6/brick_8c0881f317eb11607eff74a16027663d/brick
Status: Connected
Number of entries: 0
Brick 10.70.46.219:/var/lib/heketi/mounts/vg_a32d29646c91834eeac64529870c71cd/brick_3319cdb494d7c201d2991173d18f2575/brick
Status: Connected
Number of entries: 0
Brick 10.70.46.150:/var/lib/heketi/mounts/vg_29d26d418f4ec01cbd8805704313e5e0/brick_0cb321e2f4b4290bda1c2f9ae5085544/brick
Status: Connected
Number of entries: 0
________________________________________________________________________________
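(Note: since heal info shows no pending entries, one commonly used way to respawn the offline brick processes on 10.70.46.150 without disturbing the others is a forced volume start. A sketch, run inside a gluster pod of the affected trusted storage pool; this is only a workaround attempt, not the root-cause fix tracked here.)

# Re-spawn any brick processes that glusterd reports as offline.
# "start ... force" is a no-op for bricks that are already running.
for vol in heketidbstorage vol_9f93ae4c845f3910f5d1558cc5ae9f0a; do
  gluster --mode=script volume start "$vol" force
done

# Verify the bricks came back online
gluster volume status heketidbstorage
gluster volume status vol_9f93ae4c845f3910f5d1558cc5ae9f0a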
heketi-cli volume info
------------------------------
[root@dhcp46-137 nitin]# heketi-cli volume list
Id:051fbebfff39a75518aebd9c542db218 Cluster:f1717bd3ca8e511987efcb75bee36753 Name:heketidbstorage
Id:1fda560284e932cae1e384fe779b430f Cluster:f1717bd3ca8e511987efcb75bee36753 Name:vol_1fda560284e932cae1e384fe779b430f [block]
Id:9f93ae4c845f3910f5d1558cc5ae9f0a Cluster:f1717bd3ca8e511987efcb75bee36753 Name:vol_9f93ae4c845f3910f5d1558cc5ae9f0a [block]

[root@dhcp46-137 nitin]# heketi-cli volume info 9f93ae4c845f3910f5d1558cc5ae9f0a
Name: vol_9f93ae4c845f3910f5d1558cc5ae9f0a
Size: 100
Volume Id: 9f93ae4c845f3910f5d1558cc5ae9f0a
Cluster Id: f1717bd3ca8e511987efcb75bee36753
Mount: 10.70.46.150:vol_9f93ae4c845f3910f5d1558cc5ae9f0a
Mount Options: backup-volfile-servers=10.70.46.231,10.70.46.219
Block: true
Free Size: 1
Reserved Size: 2
Block Volumes: [0a1d30a3cc52dd406d2d39d5b5e70501 1060a75bff3f7cf713d30e016d32e8d0 1bf335c542c541cb197d1c37551d69d1 1f6b22241e9ad5ff40a0611c45a4a606 1f6d7345b9ab9b8e3addafc52f1e3252 23b5ef91aa47911e0d6d9279d94abf5b 29b61ed7b9065ed7c956a41139feb95d 2eea0f1def7f98d0519bc2c2b4aec85b 323bbd58629cd3bfe5bf1752f7028b45 3422a5369285c26a7e50d5aa3155f4e8 4498fe8c61bac062063acb10d5b95edd 49cd42e8094930437dfa23d8c9238d0a 520786834088c2cbba3a03204cdb4594 52e2b21b8171683a9513f6b922fd7f39 55391004bd3fe04524383638c3e9d6e7 5d64874fce7b916e06cad604dbf79de7 6372adf57d65bd2ac0ac437481c6d6a9 6a3cddaeb400d07ea2e2ef74c0b2f0e3 6e6e33e929fe6acdccdf1235a434927d 6fae3fb6169a9ad9aee34acff87d2019 7a7fe759b41c42642f0120c762923205 7b2a832b41b1994bf99980a344ef0180 7ce28f8c4e93d59cd39477f6d5822389 7e80aed66dcc4060f835cff73f4f9602 8028bcd29d80b7014831b937472116b8 850984510e650f5843e8cfb12274a10c 857ae1d3ede1869a61553d2d4a26aff7 8ab9d447993f857dfa9c374d3e787321 8c04bc6ac677749d44d2a59e3bb3167e 9a42c342c7e7b61b92a2bca3459bbeb5 a1e1912b66a20e26da53b173a396ed1c a43bd472302bebd8f73e1c5a79de5393 bd1586d5c7c756f16f5b93b247ff6474 c5b75d3969b23060285479770250944e cddbf02ea507351d26d1f6554044acb6 d1211f12dcf20021221f7d0bcc9206e7 d2d5ddc1a4621f290258c642b6a8ba5f d357e1167d54d83781c45c2de10f6d24 d7cb8bdbf2a6a7ecd63721d68d1291fb dae0c90207b354ebe87d5c5fc5cbb899 ea44ec1816ae619cef067c79871bf5e6 ed18246b3709d93c6ab0ab64bb7484ca edb6a8d91ffa91074dddccbf54cbd4c0 efd71d9aafff8ed67005458cd7b5ba86 efe026d4f4de819e951e7eead2d825c7 f44ce54dcddf9ceb96ae896e39b5175f f46d5a1c37f3483852b5234a17d26a49 f9c301cd176a9c23203c7d1bc5d14d93 fe9435237c267106ac5c72b3beaf9eff]
Durability Type: replicate
Distributed+Replica: 3

[root@dhcp46-137 ~]# heketi-cli volume info 1fda560284e932cae1e384fe779b430f
Name: vol_1fda560284e932cae1e384fe779b430f
Size: 100
Volume Id: 1fda560284e932cae1e384fe779b430f
Cluster Id: f1717bd3ca8e511987efcb75bee36753
Mount: 10.70.46.150:vol_1fda560284e932cae1e384fe779b430f
Mount Options: backup-volfile-servers=10.70.46.231,10.70.46.219
Block: true
Free Size: 94
Reserved Size: 2
Block Volumes: [3e4524d6023d3bea6607f53da04ace3e c58f20070e1286a5460edbf624ab6c93]
Durability Type: replicate
Distributed+Replica: 3
[root@dhcp46-137 ~]#
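(Note: a sketch of how the heketi log excerpt below can be collected for the failing claim. The heketi pod name heketi-storage-1-px7jd is from this setup; in the report the full log was saved to a local heketi_logs file and grepped.)

HEKETI_POD=heketi-storage-1-px7jd

# Pull only the lines relating to the failing claim (bk149)
oc logs "$HEKETI_POD" | grep bk149

# Or, as done below, save the full log once and grep the copy
oc logs "$HEKETI_POD" > heketi_logs
cat heketi_logs | grep bk149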
Some outputs from heketi logs
================
# cat heketi_logs |grep bk149

[kubeexec] ERROR 2018/08/20 10:00:56 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:242: Failed to run command [gluster-block create vol_9f93ae4c845f3910f5d1558cc5ae9f0a/bk_glusterfs_bk149_e4f79e2b-a45f-11e8-b773-0a580a81040b ha 3 auth enable prealloc full 10.70.46.231,10.70.46.150,10.70.46.219 2GiB --json] on glusterfs-storage-9j9nk: Err[command terminated with exit code 255]: Stdout [{ "IQN": "iqn.2016-12.org.gluster-block:7d351894-126a-4147-9fb1-d30cb90aab32", "USERNAME": "7d351894-126a-4147-9fb1-d30cb90aab32", "PASSWORD": "4c8bc088-066c-4278-a1b5-072240b0435c", "PORTAL(S)": [ "10.70.46.231:3260", "10.70.46.150:3260", "10.70.46.219:3260" ], "ROLLBACK FAILED ON": [ "10.70.46.150", "10.70.46.219", "10.70.46.231" ], "RESULT": "FAIL" }

[kubeexec] ERROR 2018/08/20 10:01:01 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:242: Failed to run command [gluster-block delete vol_9f93ae4c845f3910f5d1558cc5ae9f0a/bk_glusterfs_bk149_e4f79e2b-a45f-11e8-b773-0a580a81040b --json] on glusterfs-storage-9j9nk: Err[command terminated with exit code 255]: Stdout [{ "RESULT": "FAIL" }

[cmdexec] ERROR 2018/08/20 10:01:01 /src/github.com/heketi/heketi/executors/cmdexec/block_volume.go:102: Unable to delete volume bk_glusterfs_bk149_e4f79e2b-a45f-11e8-b773-0a580a81040b: Unable to execute command on glusterfs-storage-9j9nk:

[kubeexec] ERROR 2018/08/20 10:01:03 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:242: Failed to run command [gluster-block delete vol_9f93ae4c845f3910f5d1558cc5ae9f0a/bk_glusterfs_bk149_e4f79e2b-a45f-11e8-b773-0a580a81040b --json] on glusterfs-storage-9j9nk: Err[command terminated with exit code 255]: Stdout [{ "RESULT": "FAIL" }

[cmdexec] ERROR 2018/08/20 10:01:03 /src/github.com/heketi/heketi/executors/cmdexec/block_volume.go:102: Unable to delete volume bk_glusterfs_bk149_e4f79e2b-a45f-11e8-b773-0a580a81040b: Unable to execute command on glusterfs-storage-9j9nk:

[kubeexec] DEBUG 2018/08/20 10:01:05 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:246: Host: dhcp46-219.lab.eng.blr.redhat.com Pod: glusterfs-storage-hr8ht Command: gluster --mode=script volume start vol_1fda560284e932cae1e384fe779b430f Result: volume start: vol_1fda560284e932cae1e384fe779b430f: success

[kubeexec] DEBUG 2018/08/20 10:01:15 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:246: Host: dhcp46-150.lab.eng.blr.redhat.com Pod: glusterfs-storage-9j9nk Command: gluster-block create vol_1fda560284e932cae1e384fe779b430f/bk_glusterfs_bk149_f765daa2-a45f-11e8-b773-0a580a81040b ha 3 auth enable prealloc full 10.70.46.219,10.70.46.231,10.70.46.150 2GiB --json

We also hit BZ https://bugzilla.redhat.com/show_bug.cgi?id=1619264 while the PVCs were being mapped to the app pods.

Version-Release number of selected component (if applicable):
++++++++++++++++++++++++
[root@dhcp46-137 ~]# oc version
oc v3.10.14
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://dhcp46-137.lab.eng.blr.redhat.com:8443
openshift v3.10.14
kubernetes v1.10.0+b81c8f8
[root@dhcp46-137 ~]#

Gluster 3.4.0
==============
[root@dhcp46-137 ~]# oc rsh glusterfs-storage-q22cl rpm -qa|grep gluster
glusterfs-client-xlators-3.12.2-15.el7rhgs.x86_64
glusterfs-cli-3.12.2-15.el7rhgs.x86_64
python2-gluster-3.12.2-15.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-15.el7rhgs.x86_64
glusterfs-libs-3.12.2-15.el7rhgs.x86_64
glusterfs-3.12.2-15.el7rhgs.x86_64
glusterfs-api-3.12.2-15.el7rhgs.x86_64
glusterfs-fuse-3.12.2-15.el7rhgs.x86_64
glusterfs-server-3.12.2-15.el7rhgs.x86_64
gluster-block-0.2.1-24.el7rhgs.x86_64
[root@dhcp46-137 ~]#

[root@dhcp46-137 ~]# oc rsh heketi-storage-1-px7jd rpm -qa|grep heketi
python-heketi-7.0.0-6.el7rhgs.x86_64
heketi-7.0.0-6.el7rhgs.x86_64
heketi-client-7.0.0-6.el7rhgs.x86_64
[root@dhcp46-137 ~]#

Gluster client version
=========================
[root@dhcp46-65 ~]# rpm -qa|grep gluster
glusterfs-libs-3.12.2-15.el7.x86_64
glusterfs-3.12.2-15.el7.x86_64
glusterfs-fuse-3.12.2-15.el7.x86_64
glusterfs-client-xlators-3.12.2-15.el7.x86_64
[root@dhcp46-65 ~]#

How reproducible:
++++++++++++++++++++++++
1x1

Steps to Reproduce:
++++++++++++++++++++++++
1. Create an OCP + OCS 3.10 setup.
2. Upgrade the docker version to 1.13.1.74 and also update the gluster client packages. The pods will be restarted as docker is upgraded.
3. Once the setup is up, create block PVCs and bind them to app pods.
4. Check the pod status and the gluster v status. With brick-mux enabled, all block-hosting volumes on a node should share a single brick PID (a verification sketch is included at the end of this report).

Actual results:
++++++++++++++++++++++++
1. There are 2 PIDs for the block-hosting volumes instead of 1.
2. The bricks for heketidbstorage and the block-hosting volume are NOT ONLINE.

Expected results:
++++++++++++++++++++++++
Even after a pod restart or during PVC creation, the bricks of the volumes should not stay in a NOT ONLINE state.
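(Note: for step 4 of the reproducer and the expected results, a minimal check like the following flags both symptoms; it is a sketch run against one gluster pod, and the pod name is from this setup. Repeat for each gluster pod.)

POD=glusterfs-storage-9j9nk

# 1. Bricks that glusterd reports as offline show up with "N" in the Online
#    column of "gluster volume status"; dump every volume in one pass.
for vol in $(oc exec "$POD" -- gluster volume list); do
  echo "== $vol =="
  oc exec "$POD" -- gluster volume status "$vol"
done

# 2. Count glusterfsd processes spawned for block-hosting (vol_*) volumes on
#    this node; with brick-mux working as expected this should be 1.
oc exec "$POD" -- ps -ef | grep '[g]lusterfsd' | grep -c 'volfile-id vol_'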
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2688