Description of problem:
Having the lock in the Command scope causes a race condition if an operation runs after the LSM tasks are done but before the engine lock has been released.

Version-Release number of selected component (if applicable):
4.2

How reproducible:
Sporadically (race condition)

Steps to Reproduce:
1. Run OST in ovirt-jenkins

Actual results:
Some of the runs fail with a conflict error on the disk lock

Expected results:
Hotunplug should not fail on a conflict with the disk lock

Additional info:
https://gerrit.ovirt.org/#/c/77504/
https://gerrit.ovirt.org/#/c/77499/h
Benny, I see the "see also" bugs are closed and verified, this issue is fixed as well as part of those fixes?
(In reply to Tal Nisan from comment #1)
> Benny, I see the "see also" bugs are closed and verified, this issue is
> fixed as well as part of those fixes?

No, this issue currently waits for this: https://bugzilla.redhat.com/show_bug.cgi?id=1460701, so a sensible solution can be merged (it's currently fixed with a time.sleep(3))
(In reply to Benny Zlotnik from comment #2)
> (In reply to Tal Nisan from comment #1)
> > Benny, I see the "see also" bugs are closed and verified, this issue is
> > fixed as well as part of those fixes?
>
> No, this issue currently waits for this:
> https://bugzilla.redhat.com/show_bug.cgi?id=1460701, so a sensible solution
> can be merged (it's currently fixed with a time.sleep(3))

So this is actually an OST bug? (i.e., OST isn't using the right mechanism to track LSM's completion)?
(In reply to Allon Mureinik from comment #3)
> (In reply to Benny Zlotnik from comment #2)
> > (In reply to Tal Nisan from comment #1)
> > > Benny, I see the "see also" bugs are closed and verified, this issue is
> > > fixed as well as part of those fixes?
> >
> > No, this issue currently waits for this:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1460701, so a sensible solution
> > can be merged (it's currently fixed with a time.sleep(3))
>
> So this is actually an OST bug? (i.e., OST isn't using the right mechanism
> to track LSM's completion)?

Yes, I confused it with something else. The lock-changing solution didn't work and caused another bug. I'll change the description.
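For illustration only, the difference between the fixed time.sleep(3) workaround and actually tracking completion can be sketched as a polling wait. This is a minimal sketch, not the OST patch: wait_until, FakeDisk, and the "locked"/"ok" status strings are hypothetical names made up for this example, and the real fix depends on the API change tracked in bug 1460701.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeDisk:
    """Hypothetical stand-in for a disk whose lock is released
    some time after the LSM tasks have finished."""

    def __init__(self, unlock_after):
        self._unlock_at = time.monotonic() + unlock_after

    @property
    def status(self):
        # "locked" until the simulated release time, then "ok".
        return "ok" if time.monotonic() >= self._unlock_at else "locked"


if __name__ == "__main__":
    disk = FakeDisk(unlock_after=0.5)
    # Instead of time.sleep(3), wait for the observable condition itself.
    released = wait_until(lambda: disk.status == "ok", timeout=5.0, interval=0.1)
    print("lock released:", released)
```

Unlike a fixed sleep, the wait ends as soon as the condition holds and reports a timeout explicitly, so a slow run fails with a clear cause rather than with a disk lock conflict on the next operation.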
Eyal - once OST code is merged, it's essentially live. Is there any point in leaving this BZ on MODIFIED? I propose the following:
1. Move to ON_QA now.
2. If the LSM test passes for the next X builds (let's say a week?), it can be marked as CLOSED CURRENTRELEASE.
Does this make sense to you?
Sounds right, though maybe we should wait a bit longer since it was a known race? Anyway, @Dafna can track it, and if the test starts failing again we can consider reopening the bug. I'm OK with moving to ON_QA.
Moving back to POST for now as the patch was reverted since the 4.2 SDK wasn't built with the updated ovirt-engine-api-model yet
The master and 4.2 suites have been separated (see commit 9a861d4 for details). Can we un-revert this patch?
(In reply to Allon Mureinik from comment #8)
> The master and 4.2 suites have been separated (see commit 9a861d4 for
> details). Can we un-revert this patch?

Done
Patch has been merged. Setting BZ to ON_QA, and if we don't see any issues for a couple of weeks, I'll move it to CLOSED CURRENTRELEASE