Bricks for heketidb and some other volumes not ONLINE in gluster volume status
============================================================

Description of problem:
++++++++++++++++++++++++
A fresh OCP 3.10 + CNS 3.10 setup (Gluster version 3.12.2-17) was created. 2 file volumes and 2 block devices were present.

Initially, 2 bricks of the heketidbstorage volume were seen to be DOWN. After starting pod restart scenarios, all 3 bricks of the heketidbstorage volume ended up DOWN. In addition, one brick of another volume is not ONLINE.

Note: the glusterfsd processes are in a running state for all the affected bricks, but gluster volume status lists those bricks as NOT ONLINE.

[root@dhcp46-44 brick-issue]# oc rsh glusterfs-storage-hhtf9
sh-4.2# gluster v status
Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_74eb681e28d6bdbfb38f19d87f30f99d/brick_3f1
9a8b21827f2130f3c8eefa0710cbd/brick         N/A       N/A        N       N/A
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_c90663cbc57af82cb58eff9b3045c46d/brick_e9
746476ad14c0a8bb7a9c667752fcdf/brick        N/A       N/A        N       N/A
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_6d1
4ec6ce7f3136974b85f889f4f55db/brick         N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839

Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: neha23
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_c90663cbc57af82cb58eff9b3045c46d/brick_9d
535743ef7658856cfcfb4c5c8181d4/brick        49154     0          Y       587
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_cb1
368e2b416e7e869b425b5c97204aa/brick         49153     0          Y       536
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_f5ae27a982344f4a9373883753eedc74/brick_27a
d4f75671cf62313bf7b98b7873b8c/brick         49153     0          Y       528
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819

Task Status of Volume neha23
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_a5aa338f61db999e93d61ad3fa54b424
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_f5ae27a982344f4a9373883753eedc74/brick_bde
f74ff3c15bf676df28fe1f414c6f8/brick         49153     0          Y       528
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_ea89a1ed513c4107fbed1a00179b0e95/brick_a5
e2373b6064bc2d27213d9933d0faca/brick        49154     0          Y       587
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_e5d
bff677f9222a68a34a24894677a4f/brick         49153     0          Y       536
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819

Task Status of Volume vol_a5aa338f61db999e93d61ad3fa54b424
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_b90072440c75b3cce6697c2c894c19e3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_cee
5459f87a9fc4133d29eff16863c60/brick         49153     0          Y       536
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_74eb681e28d6bdbfb38f19d87f30f99d/brick_3aa
94e8adbe488f10ac4c019de764a5b/brick         49153     0          Y       528
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_c90663cbc57af82cb58eff9b3045c46d/brick_55
08dbfabcf6de10ccb0ec2085f7da4d/brick        N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819

Task Status of Volume vol_b90072440c75b3cce6697c2c894c19e3
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_d51a66f32feb990d2adf6b2a96586e8d
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_b29
e7cdf0b376758430a0d50fbdb5ebd/brick         49154     0          Y       544
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_f5ae27a982344f4a9373883753eedc74/brick_e1d
f58f067c54ca94807a0d126f5d990/brick         49154     0          Y       537
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_ea89a1ed513c4107fbed1a00179b0e95/brick_86
36988a891dcf539eae0acfe443d321/brick        49153     0          Y       578
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839

Task Status of Volume vol_d51a66f32feb990d2adf6b2a96586e8d
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_dce287c3032d56bdf8cf8cc5686c649c
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.53:/var/lib/heketi/mounts/vg
_39e16b423c2dbf09b112819489a131a1/brick_411
0405d800d9d42b6dabc2a7339622f/brick         49153     0          Y       536
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_74eb681e28d6bdbfb38f19d87f30f99d/brick_b50
0eec3dd5dcf871ee5cbd5f0ad5b5f/brick         49153     0          Y       528
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_c90663cbc57af82cb58eff9b3045c46d/brick_18
49bfd47b8a1b4f0af78545ad2cfdd3/brick        49154     0          Y       587
Self-heal Daemon on localhost               N/A       N/A        Y       25743
Self-heal Daemon on 10.70.46.169            N/A       N/A        Y       22839
Self-heal Daemon on dhcp46-53.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       23819

Task Status of Volume vol_dce287c3032d56bdf8cf8cc5686c649c
------------------------------------------------------------------------------
There are no active volume tasks

sh-4.2#

Steps performed:
====================
1. Created 2 file and 2 block volumes.

# heketi-cli volume list
Id:b90072440c75b3cce6697c2c894c19e3 Cluster:225f8b3c6dcc9dcad4fec5800829d246 Name:vol_b90072440c75b3cce6697c2c894c19e3
Id:ca5c0b3b874a65e3c94ed921ea203cf0 Cluster:225f8b3c6dcc9dcad4fec5800829d246 Name:heketidbstorage
Id:d51a66f32feb990d2adf6b2a96586e8d Cluster:225f8b3c6dcc9dcad4fec5800829d246 Name:vol_d51a66f32feb990d2adf6b2a96586e8d [block]
Id:dce287c3032d56bdf8cf8cc5686c649c Cluster:225f8b3c6dcc9dcad4fec5800829d246 Name:vol_dce287c3032d56bdf8cf8cc5686c649c

2. Respun all the glusterfs pods one by one.
3. Checked gluster v status and ps -ef|grep glusterfsd.
4. Also observed that the 2 file volumes do not seem to use the same glusterfsd PID as heketidbstorage.
5. Created some more file volumes.
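The status check in step 3 could be automated along the following lines. This is only a sketch, not part of the original test run: the function name offline_bricks and the SAMPLE text are illustrative, and the parser assumes the plain-text column layout shown in this report (brick paths wrapped across lines, the last line of each entry carrying the TCP Port, RDMA Port, Online and Pid columns). If the gluster CLI in use offers XML output, that would probably be a sturdier input than this text form.

```python
import re

# The final line of a brick entry ends with four columns:
# TCP port, RDMA port, Online flag (Y/N) and PID.
LAST_LINE = re.compile(r"^(\S+)\s+(\S+)\s+(\S+)\s+([YN])\s+(\S+)$")


def offline_bricks(status_text):
    """Return {volume: [brick, ...]} for bricks whose Online column is N.

    Hypothetical helper: assumes the text layout of `gluster v status`
    as pasted in this report.
    """
    result = {}
    volume = None
    pending = None  # brick path being reassembled from wrapped lines
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("Status of volume:"):
            volume = line.split(":", 1)[1].strip()
            pending = None
            continue
        if line.startswith("Brick "):
            pending = ""
            line = line[len("Brick "):]
        elif pending is None:
            # Headers, separators, self-heal daemon rows, task status.
            continue
        match = LAST_LINE.match(line)
        if match:
            brick = pending + match.group(1)
            if match.group(4) == "N":
                result.setdefault(volume, []).append(brick)
            pending = None
        else:
            # Continuation of a wrapped brick path.
            pending += line
    return result


# Excerpt of the output above, trimmed to two bricks for illustration.
SAMPLE = """\
Status of volume: heketidbstorage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.79:/var/lib/heketi/mounts/vg
_74eb681e28d6bdbfb38f19d87f30f99d/brick_3f1
9a8b21827f2130f3c8eefa0710cbd/brick         N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       25743

Status of volume: neha23
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.46.169:/var/lib/heketi/mounts/v
g_c90663cbc57af82cb58eff9b3045c46d/brick_9d
535743ef7658856cfcfb4c5c8181d4/brick        49154     0          Y       587
"""

if __name__ == "__main__":
    for vol, bricks in offline_bricks(SAMPLE).items():
        for brick in bricks:
            print(vol, brick)
```

The bricks reported this way can then be compared by hand against the ps -ef|grep glusterfsd listing to spot the mismatch described in the note above, where a glusterfsd process is running but its brick is shown as not online.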
Version-Release number of selected component (if applicable):
++++++++++++++++++++++++
sh-4.2# rpm -qa|grep gluster
glusterfs-client-xlators-3.12.2-17.el7rhgs.x86_64
glusterfs-cli-3.12.2-17.el7rhgs.x86_64
python2-gluster-3.12.2-17.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-17.el7rhgs.x86_64
glusterfs-debuginfo-3.12.2-17.el7rhgs.x86_64
glusterfs-libs-3.12.2-17.el7rhgs.x86_64
glusterfs-3.12.2-17.el7rhgs.x86_64
glusterfs-api-3.12.2-17.el7rhgs.x86_64
glusterfs-fuse-3.12.2-17.el7rhgs.x86_64
glusterfs-server-3.12.2-17.el7rhgs.x86_64
gluster-block-0.2.1-25.el7rhgs.x86_64
sh-4.2#

[root@dhcp46-44 brick-issue]# oc describe pod glusterfs-storage-hhtf9 |grep -i image
Image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7:3.4.0-3
Image ID: docker-pullable://brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7@sha256:3e1a2ed1c7f235989dd02af6386e51d64e7b4b48129960a522e362741b0b40cf
[root@dhcp46-44 brick-issue]#

How reproducible:
++++++++++++++++++++++++
It is seen intermittently.

Actual results:
++++++++++++++++++++++++
All 3 bricks of heketidbstorage and 1 brick of vol_b90072440c75b3cce6697c2c894c19e3 (file volume) are NOT ONLINE.

Expected results:
++++++++++++++++++++++++
All bricks should be UP in gluster volume status, even when the pods are restarted.
Accepting as a blocker for CNS 3.10. Tracking RHGS BZ#1622452
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2688