Created attachment 834735 [details]
test automation log

Description of problem:
This is an issue with the REST API: a volume cannot be stopped after starting a rebalance while migration is in progress. The test automation log is attached.

Version-Release number of selected component (if applicable):
rhsc-cb10

How reproducible:
intermittent

Steps to Reproduce:
1. create a distributed volume with data on each brick
2. start migration on a single brick
3. start rebalance on the volume => rebalance fails as expected
4. call stop migration => stop migration succeeds
5. stop volume via rest api

Actual results:
HTTP 400 received

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<action>
    <status>
        <state>failed</state>
    </status>
    <fault>
        <reason>Operation Failed</reason>
        <detail>[volume stop failed
error: staging failed on 10.14.16.158. error: rebalance session is in progress for the volume 'rebalwhilemigration'
return code: -1]</detail>
    </fault>
</action>

Running 'gluster vol status' on a gluster host before stopping the volume shows no tasks in progress:

Status of volume: rebalwhilemigration
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick latest-a:/tmp/201312100720580051942637805         49152   Y       11012
Brick latest-b:/tmp/20131210072058976850249563          49152   Y       7315
NFS Server on localhost                                 2049    Y       11024
NFS Server on latest-b                                  2049    Y       11679

Task Status of Volume rebalwhilemigration
------------------------------------------------------------------------------
There are no active volume tasks

Expected results:
stop succeeds

Additional info:
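For test automation that hits this failure, a minimal sketch of how the fault body above could be parsed to tell this specific condition apart from other stop failures. The function names `parse_fault` and `is_rebalance_in_progress` are hypothetical helpers, not part of the rhsc SDK, and the sample body is the response from this report with the wire-log markers removed.

```python
import xml.etree.ElementTree as ET

# Fault body from this report, with the wire-log markers removed.
SAMPLE_FAULT = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<action>
    <status>
        <state>failed</state>
    </status>
    <fault>
        <reason>Operation Failed</reason>
        <detail>[volume stop failed
error: staging failed on 10.14.16.158. error: rebalance session is in progress for the volume 'rebalwhilemigration'
return code: -1]</detail>
    </fault>
</action>"""


def parse_fault(body):
    """Return (state, reason, detail) from an <action> fault body.

    Missing elements come back as empty strings.
    """
    if isinstance(body, str):
        # Parse as bytes so the XML encoding declaration is honoured.
        body = body.encode("utf-8")
    root = ET.fromstring(body)
    state = root.findtext("status/state", default="")
    reason = root.findtext("fault/reason", default="")
    detail = root.findtext("fault/detail", default="")
    return state, reason, detail


def is_rebalance_in_progress(detail):
    """True if gluster refused the operation because data is still moving.

    Assumption: matching on the message text seen in this report; the
    wording may differ between gluster versions.
    """
    return "rebalance session is in progress" in detail
```

A caller could use this to decide whether to wait and retry the stop rather than fail the test outright.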
This might just be a timing issue: a file may still be in migration even though gluster says the migration has stopped. We need to investigate this. If the stop is attempted after the in-progress file migration completes, it might work fine. If Dustin confirms that, we will mark this CLOSED.
Dustin, please check if rebalance process is running on the nodes when you get this error. thanks!
Hi Sahina, running `gluster vol status` before stopping the volume shows that there are no tasks in progress.
Dusmant, I don't think this bug should be closed out even if I put a long delay between steps 4 and 5. There appear to be a few issues in this bug:
* The error message reports that a rebalance is in progress when it should say that a migration is in progress.
* Stop migration does not stop the migration immediately, even though stop migration is not an asynchronous task.
Dustin, Regarding your point 2: `gluster vol status` will not show stopped tasks. Even though a task is reported as stopped, it actually stops only once the migration of the file currently in progress completes. For now, the only way to know whether the rebalance has finished is to grep for the rebalance process on the node where it was stopped. Regarding point 1: Gluster treats data migration during rebalance and data migration during remove-brick alike as rebalancing data, and the error you see is the one reported by gluster.
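As a rough illustration of the "grep for the rebalance process" check, here is a sketch that polls the process list on the node until no rebalance process for the volume remains, after which the volume stop can be retried. The matching rule is an assumption (the command line contains both the word "rebalance" and the volume name) and should be verified against `ps` output on a real node; all function names here are hypothetical.

```python
import subprocess
import time


def rebalance_running(ps_lines, volume):
    """True if any command line looks like a rebalance process for `volume`.

    Assumption: the rebalance process command line contains both the word
    "rebalance" and the volume name. A script invoked with both strings in
    its own arguments would match itself.
    """
    return any("rebalance" in line and volume in line for line in ps_lines)


def list_command_lines():
    """Return the command line of every process on this node, via ps."""
    result = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def wait_for_rebalance_exit(volume, timeout=300.0, interval=5.0,
                            lister=list_command_lines, sleep=time.sleep):
    """Poll until no rebalance process remains; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if not rebalance_running(lister(), volume):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

This has to run on the node where the migration was stopped; `lister` and `sleep` are parameters only so the polling logic can be exercised without a live gluster node.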
Dustin, This is the expected behaviour from Gluster and, as you and Sahina discussed, it's not a bug as such. Hence moving it to the CLOSED state. We are going to document the behaviour of Gluster and its impact on RHSC through bug 1022955.
I believe that both point 2 and point 1 need to be documented. I especially think point 2 needs to be documented, because I don't believe the user will expect that the status of steps and jobs in rhsc does not reflect gluster's real status.
^ was hoping documentation would be specific to the rhsc SDK.