Description of problem:
The Rebalance panel displays NA when rebalance is run manually from the CLI.

# gluster volume rebalance <volume_name> start
volume rebalance: <volume_name>: success: Rebalance on <volume_name> has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: b7ab4d78-9e2a-4b44-a456-9a4e1e20440f

# gluster volume rebalance <volume_name> status
        Node Rebalanced-files       size    scanned   failures    skipped      status  run time in h:m:s
   ---------      -----------  ---------  ---------  ---------  ---------  ----------  -----------------
   localhost                0     0Bytes          7          0          0   completed            0:00:01
 <hostname1>                0     0Bytes          5          0          0   completed            0:00:01
...

The same NA status is displayed on the Volume Details page in the RHGSWA UI.

Version-Release number of selected component (if applicable):
tendrl-ansible-1.5.4-1.el7rhgs.noarch
tendrl-ui-1.5.4-4.el7rhgs.noarch
tendrl-grafana-plugins-1.5.4-5.el7rhgs.noarch
tendrl-selinux-1.5.3-2.el7rhgs.noarch
tendrl-commons-1.5.4-4.el7rhgs.noarch
tendrl-api-1.5.4-2.el7rhgs.noarch
tendrl-api-httpd-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-5.el7rhgs.noarch
tendrl-grafana-selinux-1.5.3-2.el7rhgs.noarch
tendrl-node-agent-1.5.4-5.el7rhgs.noarch
tendrl-notifier-1.5.4-3.el7rhgs.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create a volume (e.g. an arbiter volume)
2. Start rebalance
3. Check the Rebalance panel in Grafana, and the rebalance status on the Volume Details page

Actual results:
The Rebalance panel shows NA as the last rebalance status; the same is displayed on the Volume Details page.

Expected results:
The rebalance status should correspond to the status reported by 'gluster volume rebalance <volume_name> status'.

Additional info:
The rebalance takes almost no time as there is no data (or very little data) loaded.
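To make the mismatch concrete, here is a minimal sketch of how the per-node status could be extracted from the plain-text table printed by 'gluster volume rebalance <volume> status'. The column layout is assumed from the sample output above; the helper name is hypothetical, and the real Tendrl integration obtains this data differently (e.g. via gluster's machine-readable output).

```python
# Hypothetical helper: parse the plain-text table printed by
# `gluster volume rebalance <volume> status` into per-node entries.
# Column layout assumed from the sample output in this report.

def parse_rebalance_status(cli_output):
    """Return a list of (node, status, runtime) tuples."""
    rows = []
    for line in cli_output.splitlines():
        parts = line.split()
        # Data rows have 8 columns: node, rebalanced-files, size,
        # scanned, failures, skipped, status, run-time (h:m:s).
        if len(parts) == 8 and parts[-1].count(":") == 2:
            rows.append((parts[0], parts[-2], parts[-1]))
    return rows

sample = """\
Node Rebalanced-files size scanned failures skipped status run_time
---- ---------------- ---- ------- -------- ------- ------ --------
localhost 0 0Bytes 7 0 0 completed 0:00:01
hostname1 0 0Bytes 5 0 0 completed 0:00:01
"""

print(parse_rebalance_status(sample))
```

Every node in the reported setup shows 'completed', which is why the NA shown by the UI looks wrong.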
Tested with:
tendrl-gluster-integration-1.5.4-6.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-8.el7rhgs.noarch

It was not working for me. The status is still NA after more than 20 minutes of waiting.
Tested with:
tendrl-gluster-integration-1.5.4-8.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-11.el7rhgs.noarch

It was not working for me when the volume was an arbiter volume. Interestingly, when the volume is a disperse volume, RHGSWA shows the status as it should.
We want to have a look at your setup. Can you run 'gluster volume info' on the cluster where you are trying to test this scenario and paste the output here?

What I think is that you are trying to run rebalance on an invalid volume type. I just wanted to confirm that before we do any debugging.
(In reply to Nishanth Thomas from comment #8)
> We want to have a look at your setup.
> Can you run 'gluster volume info' on the cluster where you are trying to
> test this scenario and paste the output here.
>
> What I think is you are trying to run rebalance on an invalid volume type.
> Just wanted to confirm that before we do any debugging.

OK, makes sense. Here's what I get:

# gluster volume info

Volume Name: volume_beta_arbiter_2_plus_1x2
Type: Distributed-Replicate
Volume ID: 30fc5ce2-8c10-4d28-b7f9-8a3126ef5ff8
Status: Started
Snapshot Count: 0
Number of Bricks: 6 x (2 + 1) = 18
Transport-type: tcp
Bricks:
Brick1: <hostname1>:/mnt/brick_beta_arbiter_1/1
Brick2: <hostname2>:/mnt/brick_beta_arbiter_1/1
Brick3: <hostname3>:/mnt/brick_beta_arbiter_1/1 (arbiter)
Brick4: <hostname4>:/mnt/brick_beta_arbiter_1/1
Brick5: <hostname5>:/mnt/brick_beta_arbiter_1/1
Brick6: <hostname6>:/mnt/brick_beta_arbiter_1/1 (arbiter)
Brick7: <hostname1>:/mnt/brick_beta_arbiter_2/2
Brick8: <hostname2>:/mnt/brick_beta_arbiter_2/2
Brick9: <hostname3>:/mnt/brick_beta_arbiter_2/2 (arbiter)
Brick10: <hostname4>:/mnt/brick_beta_arbiter_2/2
Brick11: <hostname5>:/mnt/brick_beta_arbiter_2/2
Brick12: <hostname6>:/mnt/brick_beta_arbiter_2/2 (arbiter)
Brick13: <hostname1>:/mnt/brick_beta_arbiter_3/3
Brick14: <hostname2>:/mnt/brick_beta_arbiter_3/3
Brick15: <hostname3>:/mnt/brick_beta_arbiter_3/3 (arbiter)
Brick16: <hostname4>:/mnt/brick_beta_arbiter_3/3
Brick17: <hostname5>:/mnt/brick_beta_arbiter_3/3
Brick18: <hostname6>:/mnt/brick_beta_arbiter_3/3 (arbiter)
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
transport.address-family: inet
nfs.disable: on
(In reply to Lubos Trilety from comment #7)
> Tested with:
> tendrl-gluster-integration-1.5.4-8.el7rhgs.noarch
> tendrl-monitoring-integration-1.5.4-11.el7rhgs.noarch
>
> It was not working for me if the volume was arbiter. Interestingly when the
> volume is disperse RHGSWA shows the status as it should.

Lubos, I tried the same scenario on an arbiter volume.

# gluster vol info ssssssh

Volume Name: ssssssh
Type: Distributed-Replicate
Volume ID: 66c88f57-5a05-4b15-aeb6-0412b225cf8e
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: dhcp42-119.lab.eng.blr.redhat.com:/gluster/brick10/first
Brick2: dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick10/second
Brick3: dhcp42-127.lab.eng.blr.redhat.com:/gluster/brick10/third (arbiter)
Brick4: dhcp42-125.lab.eng.blr.redhat.com:/gluster/brick10/fourth
Brick5: dhcp42-129.lab.eng.blr.redhat.com:/gluster/brick3/fifth
Brick6: dhcp42-127.lab.eng.blr.redhat.com:/gluster/brick3/sixth (arbiter)
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
transport.address-family: inet
nfs.disable: on
cluster.enable-shared-storage: disable
nfs-ganesha: disable

I am able to see the rebalance information on the Grafana dashboard.

Steps I performed:
1. Created an arbiter volume with 6 bricks: 2 x (2 + 1)
2. Started the rebalance on one of the storage nodes.
3. I am able to see the rebalance information on the dashboard.

Am I missing something here? Please find the attachment helpful.
Created attachment 1363571 [details] Rebalance status as completed for the volume ssssssh(arbiter volume)
Hmm, it seems it doesn't depend on the volume type then, but on the speed of the rebalance operation. Mine was very quick as there was no data, so I never saw the 'In Progress' state at all. I could see that state when I tried the same scenario with a disperse volume.

# gluster volume rebalance volume_beta_arbiter_2_plus_1x2 status
        Node Rebalanced-files       size ...      status  run time in h:m:s
   ---------      -----------  --------- ...  ----------  -----------------
   localhost                0     0Bytes ...   completed            0:00:01
 <hostname1>                0     0Bytes ...   completed            0:00:01
 <hostname2>                0     0Bytes ...   completed            0:00:01
 <hostname4>                0     0Bytes ...   completed            0:00:01
 <hostname5>                0     0Bytes ...   completed            0:00:01
 <hostname6>                0     0Bytes ...   completed            0:00:01
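The expected behaviour described here can be sketched as a simple aggregation of the per-node statuses into one volume-level status for the panel. This is not Tendrl's actual logic, just a hypothetical illustration of the point: even a rebalance that finishes within a second ends with every node reporting 'completed', so the panel should show 'completed', never NA.

```python
# Hypothetical aggregation of per-node rebalance statuses into one
# volume-level status for a dashboard panel. Not Tendrl's real code;
# a sketch of the behaviour this bug report expects.

def aggregate_rebalance_status(node_statuses):
    if not node_statuses:
        return "not started"    # rebalance never ran on this volume
    if any(s == "failed" for s in node_statuses):
        return "failed"
    if any(s == "in progress" for s in node_statuses):
        return "in progress"
    if all(s == "completed" for s in node_statuses):
        return "completed"
    return "unknown"

# A rebalance that finished in ~1 second still reports 'completed'
# on all 6 nodes, so the aggregate should be 'completed', not NA.
print(aggregate_rebalance_status(["completed"] * 6))
```

The key point is that "finished too quickly to be observed in progress" and "never started" are distinguishable from the CLI output, so the UI should never need to fall back to NA here.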
Based on https://bugzilla.redhat.com/show_bug.cgi?id=1516876#c10, moving the bug back to ON_QA. In the development setup as well we see the results as expected. Also make sure that the setup meets the requirements specified at https://github.com/Tendrl/documentation/wiki/Tendrl-release-v1.5.4-(install-guide)#tendrl-server-system-requirements
Bala, did you create the volume before the RHGSWA install or after? I had the volume present in gluster before the RHGSWA install.

BTW I checked it and found that the rebalance status differs from the beginning: when I had a disperse volume prepared, the status was 'Not started'; when I had an arbiter volume prepared, the status was 'NA'.
Lubos, I created the volume after the RHGSWA install. I haven't tried the scenario which you mentioned, but I feel like it doesn't make any difference; correct me if I am wrong.

I am able to see information like 'Not started' on the rebalance panel.

Please find the attachment helpful.
Created attachment 1363649 [details] Rebalance status as not started for the volume arbiter which is just started (arbiter volume)
(In reply to Bala Konda Reddy M from comment #15)
> Lubos i created volume after RHGSWA install. I haven't tried the scenario
> which you mentioned and i feel like it doesn't make any different. correct
> me if i am wrong.
>
> I am able to see information like Not started on the rebalance panel.
>
> Please find the attachment helpful.

I thought so too, but when I created a new arbiter volume, its rebalance status was correct. So it does make a difference whether the volume is created before or after the RHGSWA install.
Created attachment 1363662 [details]
rebalance status

volume_beta_arbiter_2_plus_1x2 - arbiter volume created before RHGSWA is installed
volume_gamma_arbiter_2_plus_1x2 - arbiter volume created after RHGSWA is installed and the cluster imported
Created attachment 1363680 [details] On the volumes tab, I am able to see NA and Not started respectively
Please test with the latest builds.
Tested with:
tendrl-gluster-integration-1.5.4-14.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-14.el7rhgs.noarch

Working properly; the rebalance status is displayed on the Grafana dashboard.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:3478