Bug 1473172
Summary: | [Scale Testing] glusterd fails to start in gluster pod when rebooted with around 600 volumes | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | krishnaram Karthick <kramdoss> |
Component: | rhgs-server-container | Assignee: | Ramakrishna Reddy Yekulla <rreddy> |
Status: | CLOSED NOTABUG | QA Contact: | krishnaram Karthick <kramdoss> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | cns-3.6 | CC: | amukherj, asrivast, hchiramm, kramdoss, madam, pprakash, rcyriac, rhs-bugs, rreddy, sankarshan |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2017-09-18 07:44:52 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
krishnaram Karthick
2017-07-20 07:42:00 UTC
On the rebooted node, pvscan --cache && pvs && vgscan --cache && vgs results in the following messages:

10.70.47.49 14:04:27 up 1:45, 1 user, load average: 0.12, 0.23, 0.52

  PV         VG                                  Fmt  Attr PSize    PFree
  /dev/sda2  rhel_dhcp47-104                     lvm2 a--  <99.00g     4.00m
  /dev/sdb   vg_rhel_dhcp47--104-var             lvm2 a--  <60.00g  1020.00m
  /dev/sdc1  docker-vg                           lvm2 a--  <50.00g    30.00g
  /dev/sdd   vg_161d21b3068c4ea60ecba72c1450aedf lvm2 a--   99.87g    39.39g
  /dev/sde   vg_cd31d2bc8e2d9def002ba6d2c5db7e19 lvm2 a--   99.87g    92.00m
  /dev/sdg   vg_ac2b2a4bb2a522005033a9ba9eb6d8a6 lvm2 a--  511.87g   <59.36g

  Reading volume groups from cache.
  Found volume group "vg_ac2b2a4bb2a522005033a9ba9eb6d8a6" using metadata type lvm2
  Found volume group "docker-vg" using metadata type lvm2
  Found volume group "vg_161d21b3068c4ea60ecba72c1450aedf" using metadata type lvm2
  Found volume group "rhel_dhcp47-104" using metadata type lvm2
  Found volume group "vg_cd31d2bc8e2d9def002ba6d2c5db7e19" using metadata type lvm2
  Found volume group "vg_rhel_dhcp47--104-var" using metadata type lvm2

  VG                                  #PV #LV #SN Attr   VSize    VFree
  docker-vg                             1   1   0 wz--n-  <50.00g   30.00g
  rhel_dhcp47-104                       1   3   0 wz--n-  <99.00g    4.00m
  vg_161d21b3068c4ea60ecba72c1450aedf   1 120   0 wz--n-   99.87g   39.39g
  vg_ac2b2a4bb2a522005033a9ba9eb6d8a6   1 898   0 wz--n-  511.87g  <59.36g
  vg_cd31d2bc8e2d9def002ba6d2c5db7e19   1 196   0 wz--n-   99.87g   92.00m
  vg_rhel_dhcp47--104-var               1   1   0 wz--n-  <60.00g 1020.00m

  WARNING: lvmetad connection failed, cannot reconnect.
  WARNING: lvmetad cannot be used due to error: Connection reset by peer
  WARNING: To avoid corruption, restart lvmetad (or disable with use_lvmetad=0).
  WARNING: lvmetad connection failed, cannot reconnect.
  WARNING: lvmetad cannot be used due to error: Connection reset by peer
  WARNING: To avoid corruption, restart lvmetad (or disable with use_lvmetad=0).
  WARNING: lvmetad connection failed, cannot reconnect.
  WARNING: lvmetad cannot be used due to error: Connection reset by peer
  WARNING: To avoid corruption, restart lvmetad (or disable with use_lvmetad=0).
  WARNING: lvmetad connection failed, cannot reconnect.

[root@dhcp47-49 ~]# pvscan
  PV /dev/sdg    VG vg_ac2b2a4bb2a522005033a9ba9eb6d8a6   lvm2 [511.87 GiB / <59.36 GiB free]
  PV /dev/sdc1   VG docker-vg                             lvm2 [<50.00 GiB / 30.00 GiB free]
  PV /dev/sdd    VG vg_161d21b3068c4ea60ecba72c1450aedf   lvm2 [99.87 GiB / 39.39 GiB free]
  PV /dev/sda2   VG rhel_dhcp47-104                       lvm2 [<99.00 GiB / 4.00 MiB free]
  PV /dev/sde    VG vg_cd31d2bc8e2d9def002ba6d2c5db7e19   lvm2 [99.87 GiB / 92.00 MiB free]
  PV /dev/sdb    VG vg_rhel_dhcp47--104-var               lvm2 [<60.00 GiB / 1020.00 MiB free]
  Total: 6 [920.60 GiB] / in use: 6 [920.60 GiB] / in no VG: 0 [0 ]

[root@dhcp47-49 ~]# vgscan
  Reading volume groups from cache.
  Found volume group "vg_ac2b2a4bb2a522005033a9ba9eb6d8a6" using metadata type lvm2
  Found volume group "docker-vg" using metadata type lvm2
  Found volume group "vg_161d21b3068c4ea60ecba72c1450aedf" using metadata type lvm2
  Found volume group "rhel_dhcp47-104" using metadata type lvm2
  Found volume group "vg_cd31d2bc8e2d9def002ba6d2c5db7e19" using metadata type lvm2
  Found volume group "vg_rhel_dhcp47--104-var" using metadata type lvm2

I cannot say conclusively, but I think this is the explanation: on the VG vg_cd31d2bc8e2d9def002ba6d2c5db7e19 there are 196 LVs, that is 98 thin pools and 98 bricks. This VG is only 99 GB in size and practically full. The metadata for all the LVs is stored on the same VG, and some resource contention is happening that is rendering lvmetad useless.

An easy verification is to use disks with double the required capacity for 600 volumes and try again.
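For anyone reproducing this, the per-VG LV counts and fullness cited above can be confirmed with standard LVM commands on the affected node, and the lvmetad warnings themselves point at the remedy they expect. A minimal sketch, purely illustrative and not part of the original triage; the VG name is copied from the vgs output above:

```
# Show how many LVs each VG carries and how full it is
vgs -o vg_name,lv_count,vg_size,vg_free

# Break down the suspect VG by segment type (thin-pool vs. thin bricks);
# the VG name is taken from the output above
lvs --noheadings -o segtype vg_cd31d2bc8e2d9def002ba6d2c5db7e19 | sort | uniq -c

# The warnings themselves suggest restarting lvmetad
# (or disabling it with use_lvmetad=0 in /etc/lvm/lvm.conf)
systemctl restart lvm2-lvmetad.service
```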
If we don't see this problem when disks aren't full, it is surely an LVM bug and we will have to approach LVM experts about the limits. krk, can we try the above scenario please?

Does the problem occur on all the three machines?
10.70.47.49
10.70.47.72
10.70.46.248

(In reply to Raghavendra Talur from comment #8)
> Does the problem occur on all the three machines?
> 10.70.47.49
> 10.70.47.72
> 10.70.46.248

Yes. This is seen on all three machines.

Test runs with 625 volumes and a node reboot
===========================================
- I rebooted the VM dhcp42-97.lab.eng.blr.redhat.com via virt-manager.
- The test run with 625 volumes and a node reboot was back online in approximately 4 minutes.

Thanks a lot Ramky!! To summarize, two scenarios were performed:
1) Pod restart
2) Node restart

1) Pod restart: the Gluster pod which hosts bricks for 625 volumes came back in 6 minutes in this setup after deleting the pod via the "oc delete pod <glusterfs-*>" command.
2) Node restart: the Gluster pod which hosts bricks for 625 volumes came up in 3 minutes when we performed a node reboot.

--snip--
[root@dhcp42-112 ~]# oc get pods
NAME                                  READY  STATUS   RESTARTS  AGE
glusterblock-provisioner-dc-1-nz9mv   1/1    Running  2         20h
glusterfs-359cp                       1/1    Running  2         20h
glusterfs-3bp0t                       1/1    Running  0         55m
glusterfs-xss4t                       1/1    Running  2         20h
heketi-1-sq557                        1/1    Running  79        18h
storage-project-router-1-4kmsq        1/1    Running  2         23h

[root@dhcp42-112 ~]# oc get nodes
NAME                                STATUS                     AGE  VERSION
dhcp41-193.lab.eng.blr.redhat.com   Ready                      23h  v1.6.1+5115d708d7
dhcp42-112.lab.eng.blr.redhat.com   Ready,SchedulingDisabled   23h  v1.6.1+5115d708d7
dhcp42-97.lab.eng.blr.redhat.com    NotReady                   23h  v1.6.1+5115d708d7
dhcp43-171.lab.eng.blr.redhat.com   Ready                      23h  v1.6.1+5115d708d7

[root@dhcp42-112 ~]# oc get nodes
NAME                                STATUS                     AGE  VERSION
dhcp41-193.lab.eng.blr.redhat.com   Ready                      23h  v1.6.1+5115d708d7
dhcp42-112.lab.eng.blr.redhat.com   Ready,SchedulingDisabled   23h  v1.6.1+5115d708d7
dhcp42-97.lab.eng.blr.redhat.com    Ready                      23h  v1.6.1+5115d708d7
dhcp43-171.lab.eng.blr.redhat.com   Ready                      23h  v1.6.1+5115d708d7

[root@dhcp42-112 ~]# oc get pods
NAME                                  READY  STATUS   RESTARTS  AGE
glusterblock-provisioner-dc-1-nz9mv   1/1    Running  2         20h
glusterfs-359cp                       1/1    Running  2         20h
glusterfs-3bp0t                       1/1    Running  0         56m
glusterfs-xss4t                       0/1    Running  3         20h
heketi-1-sq557                        1/1    Running  79        18h
storage-project-router-1-4kmsq        1/1    Running  2         23h

[root@dhcp42-112 ~]# oc get pods -w
NAME                                  READY  STATUS   RESTARTS  AGE
glusterblock-provisioner-dc-1-nz9mv   1/1    Running  2         20h
glusterfs-359cp                       1/1    Running  2         20h
glusterfs-3bp0t                       1/1    Running  0         56m
glusterfs-xss4t                       0/1    Running  3         20h
heketi-1-sq557                        1/1    Running  79        18h
storage-project-router-1-4kmsq        1/1    Running  2         23h
glusterfs-xss4t                       1/1    Running  3         20h
--/snip--

Based on the test results, I am moving this back to ON_QA.
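For anyone rerunning this scale test, the two scenarios above reduce to a handful of oc commands. A minimal sketch, assuming the glusterfs pods are managed by a DaemonSet as in a typical CNS deployment; the pod name is taken from the listing above and the project name is inferred from the router pod name, so both would differ per setup:

```
# Scenario 1: pod restart. Delete a gluster pod and let its controller
# recreate it, then watch until the replacement reports READY 1/1.
oc project storage-project
oc delete pod glusterfs-3bp0t
oc get pods -w

# Scenario 2: node restart. Reboot the node hosting the pod, then watch
# the node go NotReady and return to Ready, and the pod return to 1/1.
oc get nodes -w
oc get pods -o wide
```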
I'll answer the latter part of the question first. The test mentioned was performed with 625 volumes on a single device.

[root@dhcp42-112 ~]# heketi-cli topology info | grep "/dev"
Id:1bb70c3198d3c4d7d5d7ac529a35ef9b   Name:/dev/vdc   State:online   Size (GiB):1199   Used (GiB):632   Free (GiB):566
Id:2591462d49daa7930d4eb7b770cac00b   Name:/dev/vdc   State:online   Size (GiB):1199   Used (GiB):632   Free (GiB):566
Id:bfa9ce5c8b955141709cdd6658c3c3c5   Name:/dev/vdc   State:online   Size (GiB):1199   Used (GiB):632   Free (GiB):566

===

For the maximum number of volumes per device there are quite a few correlations, chiefly the amount of memory available on the nodes and the amount of disk space. Most of the dev test environments are virtual; in all the tests above the VMs had about 16 GB of RAM. I could create an additional 300 volumes without any errors, but when a test VM had only about 4 GB of memory, volume creation used to fail at 157-185 volumes on a single device. In the present test setup, my gut feeling is that we can go up to another 400 volumes on a single disk. If you want a deterministic number, we need to leave it to the performance team, who can give a conclusive answer, as they test on bare metal and have access to repeatable/reproducible test environments.

I'm closing this bug as NOTABUG, as on a single device I have been able to perform pod restart and node restart with the following builds:

Build Numbers ::
==============================
[root@dhcp42-112 ~]# rpm -qa | grep 'cns-deploy'
cns-deploy-5.0.0-38.el7rhgs.x86_64
[root@dhcp42-112 ~]# rpm -qa | grep 'heketi-client'
heketi-client-5.0.0-11.el7rhgs.x86_64

Docker Images ::
================
[root@dhcp42-112 ~]# docker images
REPOSITORY                                                                                  TAG       IMAGE ID      CREATED      SIZE
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7                3.3.0-21  7a58514538de  4 days ago   407.7 MB
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-volmanager-rhel7            3.3.0-13  f3afaa5a05f7  6 days ago   426.7 MB
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-server-rhel7                3.3.0-20  1d748122341a  6 days ago   407.7 MB
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-gluster-block-prov-docker   3.3.0-10  e571939c4800  2 weeks ago  395.9 MB
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-s3-server-rhel7             3.3.0-7   c684dc64abff  2 weeks ago  447.3 MB

Karthick, thanks for verifying this bug!