Description of problem:
The rebalance icon in the activities column changes to unknown (?) during a race condition.

Version-Release number of selected component (if applicable):
rhsc-2.1.2-0.36.el6rhs.noarch

How reproducible:
Intermittently

Steps to Reproduce:
1. Log in to the RHS console.
2. Create 2 distribute and 2 distribute-replicate volumes.
3. Start rebalance on the volumes.

Actual results:
After a few seconds, the rebalance icon in the activities column changes to unknown (?) with an event message saying "Could not find information for rebalance on volume <volName> of Cluster cluster_test_setup from CLI. Marking it as unknown."

Expected results:
The rebalance icon should not change to "?" and the rebalance should run successfully.

Additional info:
Attaching the sos reports for the same http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/rhsc/1065227/
Some more additional info from the logs:
2014-02-13 18:38:22,157 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand] (DefaultQuartzScheduler_Worker-89) [75a8cdae] START , GlusterTasksListVDSCommand(HostName = 10.70.37.44, HostId = 74f68cf1-8053-457b-b1ee-0aac2f6e50ab), log id: 3fa47293
2014-02-13 18:38:25,303 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler_Worker-65) START, GlusterServersListVDSCommand(HostName = 10.70.37.185, HostId = 3d8aa03b-3d0f-45ff-a588-611713ccbaf6), log id: 58a81ae0
2014-02-13 18:38:25,339 INFO [org.ovirt.engine.core.vdsbroker.gluster.StartRebalanceGlusterVolumeVDSCommand] (pool-5-thread-39) [227c84b6] FINISH, StartRebalanceGlusterVolumeVDSCommand, return: GlusterAsyncTask[673fdc6b-ef0a-4791-9dfe-d0da90adb623-null-null], log id: 74669829
2014-02-13 18:38:25,375 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-39) [227c84b6] Correlation ID: 227c84b6, Job ID: 8f655b94-9395-4970-810b-159827706e8d, Call Stack: null, Custom Event ID: -1, Message: Gluster Volume vol_dis_one rebalance started.
2014-02-13 18:38:25,408 INFO [org.ovirt.engine.core.bll.gluster.StartRebalanceGlusterVolumeCommand] (pool-5-thread-39) [1cecda6a] Running command: StartRebalanceGlusterVolumeCommand internal: false. Entities affected : ID: f215adce-1799-4c38-8543-46b8ceb4da97 Type: GlusterVolume
2014-02-13 18:38:25,417 INFO [org.ovirt.engine.core.vdsbroker.gluster.StartRebalanceGlusterVolumeVDSCommand] (pool-5-thread-39) [1cecda6a] START, StartRebalanceGlusterVolumeVDSCommand(HostName = 10.70.37.185, HostId = 3d8aa03b-3d0f-45ff-a588-611713ccbaf6), log id: 3f526e09
2014-02-13 18:38:25,808 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand] (DefaultQuartzScheduler_Worker-89) [75a8cdae] FINISH, GlusterTasksListVDSCommand, return: [GlusterAsyncTask[673fdc6b-ef0a-4791-9dfe-d0da90adb623-REBALANCE-STARTED]], log id: 3fa47293
2014-02-13 18:38:25,883 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirect
This happens when:
1. GlusterTasksList is called on host1.
2. StartRebalance creates a task ID on host2.
3. GlusterTasksList returns from host1 before host1 is aware of the new task ID.

Introducing a minimum duration of 10 minutes before any task is cleared from the engine database because the gluster CLI did not return task information for it.
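The grace-period idea can be sketched roughly as follows. This is only an illustration in Python, not the engine code: the function name, the dictionary of task start times, and the choice to measure the 10 minutes from when the engine recorded the task are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Grace period taken from the comment above; everything else here is hypothetical.
MIN_TASK_AGE = timedelta(minutes=10)


def tasks_to_mark_unknown(db_tasks, cli_task_ids, now):
    """Return IDs of tasks that may be cleared or marked unknown.

    db_tasks:     dict of task ID -> time the engine recorded the task (assumed shape)
    cli_task_ids: set of task IDs reported by the gluster task list call
    now:          current time, passed in so the check is easy to test
    """
    stale = []
    for task_id, recorded_at in db_tasks.items():
        if task_id in cli_task_ids:
            # The CLI knows about this task, nothing to do.
            continue
        if now - recorded_at < MIN_TASK_AGE:
            # Task is too new: the host that answered may simply not
            # have heard about it yet, so keep it for now.
            continue
        stale.append(task_id)
    return stale
```

With this check, a task started a few seconds before a task list call returns is kept even if that call does not report it, which is the case hit in the logs above.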
Verified and works fine with build
Created two distribute and distribute-replicate volumes. Triggered rebalance on all the volumes at once from the UI. Rebalance completed successfully on all the volumes.
Hi Sahina, Please review the edited text for technical accuracy and sign off.
+1
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1277.html