Bug 1473172 - [Scale Testing] glusterd fails to start in gluster pod when rebooted with around 600 volumes
Status: ASSIGNED
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: rhgs-server-container
Version: cns-3.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assigned To: Ramakrishna Reddy Yekulla
QA Contact: krishnaram Karthick
Depends On:
Blocks: 1445448
 
Reported: 2017-07-20 03:42 EDT by krishnaram Karthick
Modified: 2017-08-17 09:47 EDT
CC List: 7 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description krishnaram Karthick 2017-07-20 03:42:00 EDT
Description of problem:
On a 3-node CNS setup, when one of the nodes hosting a gluster pod was rebooted, the gluster pod fails to come up. There are around 600 volumes in the setup. The build used for the testing has brick multiplexing enabled.

I'm not sure at what point glusterd fails to come up, but this is the observation.

A lot of brick-related messages are seen in the glusterd logs.

[2017-07-20 07:25:23.026481] I [glusterd-utils.c:5057:attach_brick] 0-management: add brick /var/lib/heketi/mounts/vg_cd31d2bc8e2d9def002ba6d2c5db7e19/brick_7c71f2212339545ca834a72ea6fee8c8/brick to existing process for /var/lib/heketi/mounts/vg_161d21b3068c4ea60ecba72c1450aedf/brick_7a11f1c67edee5557a9fbf3d4f5a0495/brick
[2017-07-20 07:25:23.026935] I [glusterd-utils.c:5057:attach_brick] 0-management: add brick /var/lib/heketi/mounts/vg_ac2b2a4bb2a522005033a9ba9eb6d8a6/brick_abce34cd3481cc4682693d32833669d9/brick to existing process for /var/lib/heketi/mounts/vg_161d21b3068c4ea60ecba72c1450aedf/brick_7a11f1c67edee5557a9fbf3d4f5a0495/brick
[2017-07-20 07:25:23.027449] I [glusterd-utils.c:5057:attach_brick] 0-management: add brick /var/lib/heketi/mounts/vg_161d21b3068c4ea60ecba72c1450aedf/brick_b2870fabb4225bcc35bfe427c92faa1a/brick to existing process for /var/lib/heketi/mounts/vg_161d21b3068c4ea60ecba72c1450aedf/brick_7a11f1c67edee5557a9fbf3d4f5a0495/brick
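
For reference, a quick way to confirm that brick multiplexing is actually in effect on this build is to compare the number of glusterfsd processes with the number of bricks (a rough sketch; with multiplexing on, many bricks share a single process, so the first count should be far lower than the second):

# number of brick (glusterfsd) processes on this node
pgrep -c glusterfsd
# number of bricks defined across all volumes
gluster volume info | grep -c '^Brick[0-9]'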

Connection-related messages are seen too.

[2017-07-20 07:25:18.690206] W [glusterd-utils.c:5114:attach_brick] 0-management: attach failed for /var/lib/heketi/mounts/vg_161d21b3068c4ea60ecba72c1450aedf/brick_7a11f1c67edee5557a9fbf3d4f5a0495/brick
[2017-07-20 07:25:18.746394] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-07-20 07:25:19.747509] I [glusterd-utils.c:5057:attach_brick] 0-management: add brick /var/lib/heketi/mounts/vg_cd31d2bc8e2d9def002ba6d2c5db7e19/brick_9312d56fb9f5a1c85f7a06725ac27347/brick to existing process for /var/lib/heketi/mounts/vg_161d21b3068c4ea60ecba72c1450aedf/brick_7a11f1c67edee5557a9fbf3d4f5a0495/brick
[2017-07-20 07:25:19.747816] I [glusterd-utils.c:4967:send_attach_req] 0-management: not connected yet
[2017-07-20 07:25:19.848777] I [glusterd-utils.c:4967:send_attach_req] 0-management: not connected yet
[2017-07-20 07:25:19.967816] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-07-20 07:25:19.970427] I [MSGID: 106005] [glusterd-handler.c:5687:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.47.49:/var/lib/heketi/mounts/vg_161d21b3068c4ea60ecba72c1450aedf/brick_7a11f1c67edee5557a9fbf3d4f5a0495/brick has disconnected from glusterd.
[2017-07-20 07:25:20.776064] I [glusterd-utils.c:4967:send_attach_req] 0-management: not connected yet
[2017-07-20 07:25:20.996851] I [glusterd-utils.c:4967:send_attach_req] 0-management: not connected yet
[2017-07-20 07:25:21.988277] I [glusterd-utils.c:4967:send_attach_req] 0-management: not connected yet
[2017-07-20 07:25:22.185533] I [glusterd-utils.c:4967:send_attach_req] 0-management: not connected yet

From the console I see this:
A start job is running for LVM2 PV scan on device xxx

I'm not able to pinpoint where to start debugging, i.e., whether the issue is with glusterd (brick multiplexing) or is LVM related.
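
As a possible starting point, the blocked start job itself can be inspected from systemd (a sketch; the exact lvm2-pvscan unit instance name depends on the device's major:minor numbers, so a glob is used here):

# list jobs that are still queued or running after boot
systemctl list-jobs
# status and boot-time logs of the pvscan unit(s) for the affected device(s)
systemctl status 'lvm2-pvscan@*' --no-pager
journalctl -b -u 'lvm2-pvscan@*' --no-pager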

Version-Release number of selected component (if applicable):
glusterfs-server-3.8.4-33.el7rhgs.x86_64
rhgs3/rhgs-server-rhel7:3.3.0-7

How reproducible:
1/1

Steps to Reproduce:
1. Create 600 volumes on a CNS setup using heketi (a rough sketch of this step is shown below).
2. Reboot one of the worker nodes hosting a gluster pod.
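
Step 1 could be scripted roughly as follows (a sketch, not the exact commands used for this run; it assumes heketi-cli is pointed at the heketi service via HEKETI_CLI_SERVER and that a 1 GiB volume size is representative):

# assumes, e.g., export HEKETI_CLI_SERVER=http://<heketi-route>:8080
for i in $(seq 1 600); do
    heketi-cli volume create --size=1
done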


Actual results:
gluster pod fails to get respun

Expected results:
gluster pod should get respun successfully

Additional info:
Logs shall be attached shortly.
Comment 3 Raghavendra Talur 2017-07-20 04:51:41 EDT
On the rebooted node, running 'pvscan --cache && pvs && vgscan --cache && vgs' results in these messages:

10.70.47.49
 14:04:27 up  1:45,  1 user,  load average: 0.12, 0.23, 0.52
  PV         VG                                  Fmt  Attr PSize   PFree   
  /dev/sda2  rhel_dhcp47-104                     lvm2 a--  <99.00g    4.00m
  /dev/sdb   vg_rhel_dhcp47--104-var             lvm2 a--  <60.00g 1020.00m
  /dev/sdc1  docker-vg                           lvm2 a--  <50.00g   30.00g
  /dev/sdd   vg_161d21b3068c4ea60ecba72c1450aedf lvm2 a--   99.87g   39.39g
  /dev/sde   vg_cd31d2bc8e2d9def002ba6d2c5db7e19 lvm2 a--   99.87g   92.00m
  /dev/sdg   vg_ac2b2a4bb2a522005033a9ba9eb6d8a6 lvm2 a--  511.87g  <59.36g
  Reading volume groups from cache.
  Found volume group "vg_ac2b2a4bb2a522005033a9ba9eb6d8a6" using metadata type lvm2
  Found volume group "docker-vg" using metadata type lvm2
  Found volume group "vg_161d21b3068c4ea60ecba72c1450aedf" using metadata type lvm2
  Found volume group "rhel_dhcp47-104" using metadata type lvm2
  Found volume group "vg_cd31d2bc8e2d9def002ba6d2c5db7e19" using metadata type lvm2
  Found volume group "vg_rhel_dhcp47--104-var" using metadata type lvm2
  VG                                  #PV #LV #SN Attr   VSize   VFree   
  docker-vg                             1   1   0 wz--n- <50.00g   30.00g
  rhel_dhcp47-104                       1   3   0 wz--n- <99.00g    4.00m
  vg_161d21b3068c4ea60ecba72c1450aedf   1 120   0 wz--n-  99.87g   39.39g
  vg_ac2b2a4bb2a522005033a9ba9eb6d8a6   1 898   0 wz--n- 511.87g  <59.36g
  vg_cd31d2bc8e2d9def002ba6d2c5db7e19   1 196   0 wz--n-  99.87g   92.00m
  vg_rhel_dhcp47--104-var               1   1   0 wz--n- <60.00g 1020.00m
  WARNING: lvmetad connection failed, cannot reconnect.
  WARNING: lvmetad cannot be used due to error: Connection reset by peer
  WARNING: To avoid corruption, restart lvmetad (or disable with use_lvmetad=0).
  WARNING: lvmetad connection failed, cannot reconnect.
  WARNING: lvmetad cannot be used due to error: Connection reset by peer
  WARNING: To avoid corruption, restart lvmetad (or disable with use_lvmetad=0).
  WARNING: lvmetad connection failed, cannot reconnect.
  WARNING: lvmetad cannot be used due to error: Connection reset by peer
  WARNING: To avoid corruption, restart lvmetad (or disable with use_lvmetad=0).
  WARNING: lvmetad connection failed, cannot reconnect.
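
The warnings above point at two possible workarounds; a sketch of both, to be run on the host rather than in the pod (if lvmetad is disabled, the use_lvmetad setting in /etc/lvm/lvm.conf has to be changed to match):

# option 1: restart lvmetad, as the warning suggests
systemctl restart lvm2-lvmetad.socket lvm2-lvmetad.service
# option 2: disable lvmetad entirely (use_lvmetad=0) and stop the daemon
sed -i 's/use_lvmetad = 1/use_lvmetad = 0/' /etc/lvm/lvm.conf
systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket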
Comment 4 Raghavendra Talur 2017-07-20 05:22:13 EDT
[root@dhcp47-49 ~]# pvscan
  PV /dev/sdg    VG vg_ac2b2a4bb2a522005033a9ba9eb6d8a6   lvm2 [511.87 GiB / <59.36 GiB free]
  PV /dev/sdc1   VG docker-vg                             lvm2 [<50.00 GiB / 30.00 GiB free]
  PV /dev/sdd    VG vg_161d21b3068c4ea60ecba72c1450aedf   lvm2 [99.87 GiB / 39.39 GiB free]
  PV /dev/sda2   VG rhel_dhcp47-104                       lvm2 [<99.00 GiB / 4.00 MiB free]
  PV /dev/sde    VG vg_cd31d2bc8e2d9def002ba6d2c5db7e19   lvm2 [99.87 GiB / 92.00 MiB free]
  PV /dev/sdb    VG vg_rhel_dhcp47--104-var               lvm2 [<60.00 GiB / 1020.00 MiB free]
  Total: 6 [920.60 GiB] / in use: 6 [920.60 GiB] / in no VG: 0 [0   ]
[root@dhcp47-49 ~]# vgscan
  Reading volume groups from cache.
  Found volume group "vg_ac2b2a4bb2a522005033a9ba9eb6d8a6" using metadata type lvm2
  Found volume group "docker-vg" using metadata type lvm2
  Found volume group "vg_161d21b3068c4ea60ecba72c1450aedf" using metadata type lvm2
  Found volume group "rhel_dhcp47-104" using metadata type lvm2
  Found volume group "vg_cd31d2bc8e2d9def002ba6d2c5db7e19" using metadata type lvm2
  Found volume group "vg_rhel_dhcp47--104-var" using metadata type lvm2
Comment 5 Raghavendra Talur 2017-07-20 05:26:24 EDT
I cannot say conclusively, but I think this is the explanation:

On the VG vg_cd31d2bc8e2d9def002ba6d2c5db7e19, there are 196 LVs: 98 thin pools and 98 bricks. This VG is only 99 GB in size and practically full. The metadata for all the LVs is stored on the same VG, and some resource contention is happening which is rendering lvmetad useless.


An easy verification is to use disks with double the required capacity for 600 volumes and try again. If we don't see this problem when the disks aren't full, it is surely an LVM bug and we will have to approach LVM experts about the limits.
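
The LV counts and remaining free space per VG can be compared before and after that retry with standard lvm2 reporting (a sketch; the thin-pool metadata usage of the suspect VG is probably the most interesting number here):

# LV count, size and free space per VG
vgs -o vg_name,lv_count,vg_size,vg_free
# data and metadata utilisation of the thin pools in the suspect VG
lvs -o lv_name,data_percent,metadata_percent vg_cd31d2bc8e2d9def002ba6d2c5db7e19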


krk, can we try the above scenario please?
Comment 8 Raghavendra Talur 2017-07-25 01:15:17 EDT
Does the problem occur on all the three machines?
10.70.47.49
10.70.47.72
10.70.46.248
Comment 9 krishnaram Karthick 2017-07-25 07:04:45 EDT
(In reply to Raghavendra Talur from comment #8)
> Does the problem occur on all the three machines?
> 10.70.47.49
> 10.70.47.72
> 10.70.46.248

Yes. This is seen on all three machines.
