Bug 1575529
| Summary: | [RFE] load balancing the IO connections (active paths) across HA nodes | | |
| --- | --- | --- | --- |
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Prasanna Kumar Kalever <prasanna.kalever> |
| Component: | gluster-block | Assignee: | Prasanna Kumar Kalever <prasanna.kalever> |
| Status: | CLOSED ERRATA | QA Contact: | Neha Berry <nberry> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | cns-3.10 | CC: | akrishna, asriram, bgoyal, hchiramm, kramdoss, pkarampu, pprakash, prasanna.kalever, rhs-bugs, sankarshan, vbellur, xiubli |
| Target Milestone: | --- | Keywords: | FutureFeature |
| Target Release: | CNS 3.10 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | gluster-block-0.2.1-19.el7rhgs | Doc Type: | Enhancement |
| Doc Text: | Previously, multipath priority was configured identically for all HA paths, so the I/O load might not be distributed uniformly across the available HA nodes. With this update, gluster-block introduces priority-based load balancing: the management daemon reads load-balance information from the volume metadata and, when a block device is created, sets high priority on the path whose node is least used. During login to the device, the initiator-side multipath tools pick the high-priority path and mark it active. This distributes the load across the nodes. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-09-12 09:25:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1585197 | | |
| Bug Blocks: | 1568860 | | |
Description
Prasanna Kumar Kalever, 2018-05-07 08:14:17 UTC
The changes suggested by this RFE are in place since gluster-block version gluster-block-0.2.1-19.el7rhgs.

Versions used for verification:

```
# for i in `oc get pods -o wide | grep glusterfs | cut -d " " -f1`; do echo $i; echo +++++++++++++++++++++++; oc exec $i -- rpm -qa | grep targetcli; done
glusterfs-storage-krlwr
+++++++++++++++++++++++
targetcli-2.1.fb46-6.el7_5.noarch
glusterfs-storage-pngnm
+++++++++++++++++++++++
targetcli-2.1.fb46-6.el7_5.noarch
glusterfs-storage-v8z6s
+++++++++++++++++++++++
targetcli-2.1.fb46-6.el7_5.noarch

# for i in `oc get pods -o wide | grep glusterfs | cut -d " " -f1`; do echo $i; echo +++++++++++++++++++++++; oc exec $i -- rpm -qa | grep gluster-block; done
glusterfs-storage-krlwr
+++++++++++++++++++++++
gluster-block-0.2.1-20.el7rhgs.x86_64
glusterfs-storage-pngnm
+++++++++++++++++++++++
gluster-block-0.2.1-20.el7rhgs.x86_64
glusterfs-storage-v8z6s
+++++++++++++++++++++++
gluster-block-0.2.1-20.el7rhgs.x86_64

# for i in `oc get pods -o wide | grep glusterfs | cut -d " " -f1`; do echo $i; echo +++++++++++++++++++++++; oc exec $i -- rpm -qa | grep tcmu-runner; done
glusterfs-storage-krlwr
+++++++++++++++++++++++
tcmu-runner-1.2.0-20.el7rhgs.x86_64
glusterfs-storage-pngnm
+++++++++++++++++++++++
tcmu-runner-1.2.0-20.el7rhgs.x86_64
glusterfs-storage-v8z6s
+++++++++++++++++++++++
tcmu-runner-1.2.0-20.el7rhgs.x86_64

# for i in `oc get pods -o wide | grep glusterfs | cut -d " " -f1`; do echo $i; echo +++++++++++++++++++++++; oc exec $i -- rpm -qa | grep python-configshell; done
glusterfs-storage-krlwr
+++++++++++++++++++++++
python-configshell-1.1.fb23-4.el7_5.noarch
glusterfs-storage-pngnm
+++++++++++++++++++++++
python-configshell-1.1.fb23-4.el7_5.noarch
glusterfs-storage-v8z6s
+++++++++++++++++++++++
python-configshell-1.1.fb23-4.el7_5.noarch

# for i in `oc get pods -o wide | grep glusterfs | cut -d " " -f1`; do echo $i; echo +++++++++++++++++++++++; oc exec $i -- rpm -qa | grep python-rtslib; done
glusterfs-storage-krlwr
+++++++++++++++++++++++
python-rtslib-2.1.fb63-12.el7_5.noarch
glusterfs-storage-pngnm
+++++++++++++++++++++++
python-rtslib-2.1.fb63-12.el7_5.noarch
glusterfs-storage-v8z6s
+++++++++++++++++++++++
python-rtslib-2.1.fb63-12.el7_5.noarch
```

Verified the following:

1. Created multiple block volumes mounted on app pods.

2. Confirmed in `targetcli ls` that there are now two target portal groups per storage object, glfs_tg_pt_gp_ao and glfs_tg_pt_gp_ano: active optimized (AO) with priority 50 and active non-optimized (ANO) with priority 10.

3. Load balancing works once /etc/multipath.conf is updated to use prio "alua".

Snippet from the setup for one block volume:

```
# multipath -ll
mpathd (3600140589bba72ef49445bf9501b7d9e) dm-34 LIO-ORG ,TCMU device
size=5.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 47:0:0:0 sdo 8:224 active ready running
|-+- policy='round-robin 0' prio=10 status=enabled
| `- 49:0:0:0 sdq 65:0 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 48:0:0:0 sdp 8:240 active ready running

# ll /dev/disk/by-path/ip* | grep sdo
lrwxrwxrwx. 1 root root 9 Jul 4 10:04 /dev/disk/by-path/ip-10.70.43.230:3260-iscsi-iqn.2016-12.org.gluster-block:89bba72e-f494-45bf-9501-b7d9ec328213-lun-0 -> ../../sdo
```

```
| o- iqn.2016-12.org.gluster-block:89bba72e-f494-45bf-9501-b7d9ec328213 .... [TPGs: 3]
| | o- tpg1 .... [disabled]
| | | o- acls .... [ACLs: 0]
| | | o- luns .... [LUNs: 1]
| | | | o- lun0 .... [user/test-vol_glusterfs_claim14_7d6cfdf9-7f43-11e8-a6bb-0a580a800209 (glfs_tg_pt_gp_ao)]
| | | o- portals .... [Portals: 1]
| | |   o- 10.70.43.230:3260 .... [OK]
| | o- tpg2 .... [gen-acls, tpg-auth, 1-way auth]
| | | o- acls .... [ACLs: 0]
| | | o- luns .... [LUNs: 1]
| | | | o- lun0 .... [user/test-vol_glusterfs_claim14_7d6cfdf9-7f43-11e8-a6bb-0a580a800209 (glfs_tg_pt_gp_ano)]
| | | o- portals .... [Portals: 1]
| | |   o- 10.70.43.19:3260 .... [OK]
| | o- tpg3 .... [disabled]
| |   o- acls .... [ACLs: 0]
| |   o- luns .... [LUNs: 1]
| |   | o- lun0 .... [user/test-vol_glusterfs_claim14_7d6cfdf9-7f43-11e8-a6bb-0a580a800209 (glfs_tg_pt_gp_ano)]
| |   o- portals .... [Portals: 1]
| |     o- 10.70.43.53:3260 .... [OK]
```

4. The /etc/multipath.conf file:

```
# cat /etc/multipath.conf
# LIO iSCSI
# TODO: Add env variables for tweaking
devices {
        device {
                vendor "LIO-ORG"
                user_friendly_names "yes"
                path_grouping_policy "failover"
                path_selector "round-robin 0"
                failback immediate
                path_checker "tur"
                prio "alua"
                no_path_retry 120
                rr_weight "uniform"
        }
}
defaults {
        user_friendly_names yes
        find_multipaths yes
}
blacklist {
}
```

5. Around 35 block devices were created, which confirmed that a counter is maintained to keep track of distributing the AO TPGs equally:

```
# attr -l prio.info
Attribute "selinux" has a 30 byte value for prio.info
Attribute "block.10.70.43.53" has a 1024 byte value for prio.info
Attribute "block.10.70.43.19" has a 1024 byte value for prio.info
Attribute "block.10.70.43.230" has a 1024 byte value for prio.info
[root@dhcp43-29 block-meta]# for i in `attr -l prio.info | grep "block.10" | cut -d "\"" -f2`; do attr -g $i prio.info; done
Attribute "block.10.70.43.53" had a 1024 byte value for prio.info: 4
Attribute "block.10.70.43.19" had a 1024 byte value for prio.info: 5
Attribute "block.10.70.43.230" had a 1024 byte value for prio.info: 5

# cd block-meta
[root@dhcp43-29 block-meta]# attr -l prio.info
Attribute "selinux" has a 30 byte value for prio.info
Attribute "block.10.70.43.53" has a 1024 byte value for prio.info
Attribute "block.10.70.43.19" has a 1024 byte value for prio.info
Attribute "block.10.70.43.230" has a 1024 byte value for prio.info
[root@dhcp43-29 block-meta]# for i in `attr -l prio.info | grep "block.10" | cut -d "\"" -f2`; do attr -g $i prio.info; done
Attribute "block.10.70.43.53" had a 1024 byte value for prio.info: 7
Attribute "block.10.70.43.19" had a 1024 byte value for prio.info: 7
Attribute "block.10.70.43.230" had a 1024 byte value for prio.info: 8
```

Attaching the multipath and targetcli output from the setup for further confirmation. This bug is now moved to VERIFIED.

Have updated the Doc Text field, kindly review.
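As a quick sanity check of the verification above, the active-optimized path can be pulled out of saved `multipath -ll` output by matching the `prio=50 status=active` path group. This is a hedged illustration against the sample output captured during verification, not a supported tool; the temporary file path is an assumption for the example.

```shell
# Save the sample `multipath -ll` output captured during verification
# (path /tmp/mpath.txt is just for this illustration).
cat > /tmp/mpath.txt <<'EOF'
mpathd (3600140589bba72ef49445bf9501b7d9e) dm-34 LIO-ORG ,TCMU device
size=5.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 47:0:0:0 sdo 8:224 active ready running
|-+- policy='round-robin 0' prio=10 status=enabled
| `- 49:0:0:0 sdq 65:0 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 48:0:0:0 sdp 8:240 active ready running
EOF

# Print the sd* device of the path in the prio=50 (AO) group:
# the path line follows the group header, and the device name is field 4.
awk '/prio=50 status=active/ { getline; print $4 }' /tmp/mpath.txt
# -> sdo
```

This matches the `/dev/disk/by-path` symlink shown above, which resolves the 10.70.43.230 portal (the AO node for this volume) to sdo.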
Made the required changes and updated the Doc Text. Thank you.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2691
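For reference, the doc text describes gluster-block choosing the least-used node for the AO path based on per-node counters (the `prio.info` attributes shown in the verification comment). A minimal sketch of that selection logic, assuming "node:count" pairs as input; `pick_ao_node` is a hypothetical helper written for this illustration, not part of gluster-block.

```shell
# Hypothetical sketch of least-used-node selection for the AO path.
# Each argument is a "node:count" pair mirroring the prio.info counters.
pick_ao_node() {
    # Sort numerically by the count field and take the smallest.
    printf '%s\n' "$@" | sort -t ':' -k 2 -n | head -n 1 | cut -d ':' -f 1
}

# With counts 4/5/5 (as in the first prio.info reading above),
# the least-used node wins the AO slot:
pick_ao_node "10.70.43.53:4" "10.70.43.19:5" "10.70.43.230:5"
# -> 10.70.43.53
```

This mirrors why the counters stay nearly equal (7/7/8 after ~22 volumes): each create bumps the chosen node's counter, so the next create prefers a different node.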