Created attachment 857054 [details]
screenshot

Description of problem:
-------------------------
Skipped file count is not displayed in the remove-brick status dialog, even though 'gluster volume remove-brick status' shows skipped files. See below -

[root@rhs glusterfs_58]# gluster v remove-brick dis_rep_vol 10.70.37.70:/rhs/brick3/b1 10.70.37.162:/rhs/brick4/b1 10.70.37.70:/rhs/brick4/b1 10.70.37.162:/rhs/brick3/b1 status
        Node Rebalanced-files          size       scanned      failures       skipped        status   run time in secs
   ---------      -----------   -----------   -----------   -----------   -----------  ------------     --------------
   localhost                5         2.9GB            36             0            30   in progress             161.00
 10.70.37.70                0        0Bytes           650             0             0     completed               5.00

See screenshot for the status dialog on the Console.

Version-Release number of selected component (if applicable):
Red Hat Storage Console Version: 2.1.2-0.35.el6rhs
glusterfs 3.4.0.58rhs

How reproducible:
Always

Steps to Reproduce:
1. Start remove-brick on a distribute-replicate volume such that there would be skipped files (for example, due to lack of space on the destination bricks).
2. Check the gluster CLI remove-brick status output for the skipped file count.
3. Check the status dialog in the UI.

Actual results:
Skipped file count is not displayed in the UI.

Expected results:
All the data shown in the UI should match the gluster CLI output.

Additional info:
Created attachment 857055 [details] engine logs
There is a difference in the gluster cli output when the --xml flag is used: the skipped count is not returned in the xml output (it is always reported as 0), even though the plain CLI output shows 30 skipped files.

[root@rhs ~]# gluster volume remove-brick dis_rep_vol 10.70.37.162:/rhs/brick3/b1 status
        Node Rebalanced-files          size       scanned      failures       skipped        status   run time in secs
   ---------      -----------   -----------   -----------   -----------   -----------  ------------     --------------
   localhost              104        33.2GB           751             0            30     completed             772.00
 10.70.37.70                0        0Bytes           650             0             0     completed               4.00

[root@rhs ~]# gluster volume remove-brick dis_rep_vol 10.70.37.162:/rhs/brick3/b1 status --xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cliOutput>
  <opRet>0</opRet>
  <opErrno>0</opErrno>
  <opErrstr/>
  <volRemoveBrick>
    <task-id>97498624-6c35-4ab4-a878-2ce45a52a79d</task-id>
    <nodeCount>4</nodeCount>
    <node>
      <nodeName>localhost</nodeName>
      <id>706a5135-4737-48ee-9577-300d54b60ff6</id>
      <files>104</files>
      <size>35683081216</size>
      <lookups>751</lookups>
      <failures>0</failures>
      <skipped>0</skipped>
      <status>3</status>
      <statusStr>completed</statusStr>
      <runtime>772.00</runtime>
    </node>
    <node>
      <nodeName>10.70.37.70</nodeName>
      <id>bf570c26-c148-498a-9772-5d943ba81418</id>
      <files>0</files>
      <size>0</size>
      <lookups>650</lookups>
      <failures>0</failures>
      <skipped>0</skipped>
      <status>3</status>
      <statusStr>completed</statusStr>
      <runtime>4.00</runtime>
    </node>
    <aggregate>
      <files>104</files>
      <size>35683081216</size>
      <lookups>1401</lookups>
      <failures>0</failures>
      <skipped>0</skipped>
      <status>3</status>
      <statusStr>completed</statusStr>
      <runtime>772.00</runtime>
    </aggregate>
  </volRemoveBrick>
</cliOutput>
[root@rhs ~]#
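For reference, a minimal sketch (Python, illustrative only, not the actual engine/VDSM code) of how a consumer of the --xml output would read the per-node skipped count. The command line is the one from the report above; with this glusterfs build the value read is always 0, even though the plain CLI output shows 30 skipped files:

# Minimal sketch (not the actual VDSM/engine code) of reading the per-node
# skipped count from the 'remove-brick status --xml' output shown above.
import subprocess
import xml.etree.ElementTree as ET

# Command taken from the report above; volume/brick names are from this bug.
cmd = ["gluster", "volume", "remove-brick", "dis_rep_vol",
       "10.70.37.162:/rhs/brick3/b1", "status", "--xml"]
xml_out = subprocess.check_output(cmd)

root = ET.fromstring(xml_out)
for node in root.findall("./volRemoveBrick/node"):
    name = node.findtext("nodeName")
    skipped = int(node.findtext("skipped", default="0"))
    # With glusterfs 3.4.0.58rhs the <skipped> element is always 0 here,
    # even though the plain CLI output reports 30 skipped files.
    print(name, "skipped:", skipped)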
KP from gluster team is looking into this bug now.
Upstream patch sent at http://review.gluster.org/#/c/6882/

Result of the cli xml output after the change:

[root@vm1 home]# gluster v remove-brick test1 192.168.122.240:/brick1/1 status
        Node Rebalanced-files          size       scanned      failures       skipped        status   run time in secs
   ---------      -----------   -----------   -----------   -----------   -----------  ------------     --------------
   localhost                0        0Bytes             2             0             1     completed               0.00

[root@vm1 home]# gluster v remove-brick test1 192.168.122.240:/brick1/1 status --xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cliOutput>
  <opRet>0</opRet>
  <opErrno>0</opErrno>
  <opErrstr/>
  <volRemoveBrick>
    <task-id>add66841-b53e-4a85-b8ae-ceceffe35b95</task-id>
    <nodeCount>1</nodeCount>
    <node>
      <nodeName>localhost</nodeName>
      <id>539073ad-e77c-44ee-bef7-84e3ac232a29</id>
      <files>0</files>
      <size>0</size>
      <lookups>2</lookups>
      <failures>0</failures>
      <skipped>1</skipped>
      <status>3</status>
      <statusStr>completed</statusStr>
      <runtime>0.00</runtime>
    </node>
    <aggregate>
      <files>0</files>
      <size>0</size>
      <lookups>2</lookups>
      <failures>0</failures>
      <skipped>1</skipped>
      <status>3</status>
      <statusStr>completed</statusStr>
      <runtime>0.00</runtime>
    </aggregate>
  </volRemoveBrick>
</cliOutput>
[root@vm1 home]#
I see the following in the patch:

cli/cli-xml : skipped files should be treated as failures for remove-brick operation.
Fix: For remove-brick operation skipped count is included into failure count.
clixml-output : skipped count would be zero always for remove-brick status.

If that is the case, why do we need the skipped file count field at all? Can we simply hide it?
RamaKasturi, the cli (non-xml) part uses the same function for remove-brick as for rebalance status (to keep the code simple), and the intention was to keep the status fields for rebalance and remove-brick the same.
Hi Dusmant,

I see the following change with this fix:

cli/cli-xml : skipped files should be treated as failures for remove-brick operation.
Fix: For remove-brick operation skipped count is included into failure count.
clixml-output : skipped count would be zero always for remove-brick status.

Is it required to show the skipped file count field in the remove-brick status dialog if it never gets updated?
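To make the behavior in question concrete, a rough sketch (illustrative only, not the actual gluster cli/cli-xml code) of the accounting described in the quoted patch note — skipped files folded into the failure count, with the reported skipped count left at zero:

# Rough illustration of the accounting described in the patch note
# (illustrative only, not the actual gluster cli/cli-xml code).
def remove_brick_status_counts(files, failures, skipped):
    # For remove-brick, skipped files are treated as failures: the skipped
    # count is added to the failure count and the reported skipped count
    # stays at zero, which is why the UI field would never get updated.
    return {"files": files,
            "failures": failures + skipped,
            "skipped": 0}

print(remove_brick_status_counts(files=0, failures=0, skipped=1))
# -> {'files': 0, 'failures': 1, 'skipped': 0}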
As per the patch submitted and comment 10, it works fine with the following builds:

RHSC      : rhsc-2.1.2-0.36.el6rhs.noarch
glusterfs : glusterfs-server-3.4.0.59rhs-1.el6rhs.x86_64
vdsm      : vdsm-4.13.0-24.el6rhs.x86_64

Skipped file count is always displayed as zero, and for the remove-brick operation the skipped count is included in the failure count.

Raised a new bug to remove the skipped file count field from the remove-brick status dialog:
https://bugzilla.redhat.com/show_bug.cgi?id=1064712
We already discussed this and provided the info.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-0208.html