Bug 1077216
Summary: | Some "uid" left in tx log after crash recovery. JTS only. | ||||||
---|---|---|---|---|---|---|---|
Product: | [JBoss] JBoss Enterprise Application Platform 6 | Reporter: | Hayk Hovsepyan <hhovsepy> | ||||
Component: | Transaction Manager | Assignee: | Gytis Trikleris <gtrikler> | ||||
Status: | CLOSED NOTABUG | QA Contact: | Hayk Hovsepyan <hhovsepy> | ||||
Severity: | low | Docs Contact: | Russell Dickenson <rdickens> | ||||
Priority: | unspecified | ||||||
Version: | TBD EAP 6 | CC: | ochaloup | ||||
Target Milestone: | --- | ||||||
Target Release: | EAP 6.4.0 | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2014-08-14 15:28:18 UTC | Type: | Bug | ||||
Description
Hayk Hovsepyan
2014-03-17 13:40:44 UTC
Does this only fail on hqstore?

It fails for standard store as well.

Hi Hayk,
Sorry for the delay. I can explain what is happening. With JTS we have what is known as top-down and bottom-up recovery. When a resource calls replay completion on the coordinator, the return value tells it whether to commit or not. Simultaneously, the coordinator takes the opportunity to complete the entire transaction. Therefore there is a small race between the (threaded) coordinator and the resource's recovery manager to complete the resource.
If the coordinator completes the resource, it knows the outcome and automatically cleans up its transaction log. If the resource completes itself, the coordinator, when it tries to complete the resource, receives a warning status and so leaves the transaction in the store.
After 3 attempts to commit the transaction that each get OBJECT_NOT_EXIST, a transaction is assumed to have fully committed its resources. In the debugger it looks like, depending on timing, it is easy for this counter not to reach 3, so the entries will still be in the object store. Each time a branch completes, the counter is reset, and in total you only have 3 recovery scans, so by default it should be impossible for recovery to remove the entry when resources were completed bottom-up. It only passes when top-down recovery won the race.
Tom

Hi Tom,
Thanks for the detailed description. So what can be the solution or workaround here so that no uid is left in the log? I tried to call "recovery" 3 times, assuming that after 3 attempts the transaction would be considered fully committed and the log would be emptied, but the uid is still there.
/Hayk

Hi Hayk,
_After_ it has recovered the HQ x2 and TestXAResource, if you have three recovery calls it should be fine. I wouldn't think you need the minute wait between recovery scans if you are calling it yourself.
Tom

Did you try three recovery calls?

Yes, it calls recovery 3 times, and the problem still exists.

The problem was in the test framework. Thanks Gytis for doing research on this.
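
The counter behaviour Tom describes in the thread above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the Transaction Manager's actual code: `LogEntry`, `recovery_scan` and `run_scans` are hypothetical names, and the model assumes that a scan in which a branch completes resets the counter and does not itself count as an OBJECT_NOT_EXIST attempt.

```python
from dataclasses import dataclass

# From the thread: after 3 OBJECT_NOT_EXIST attempts the transaction is
# assumed to have fully committed its resources.
ASSUMED_COMPLETE_THRESHOLD = 3


@dataclass
class LogEntry:
    """Hypothetical model of one coordinator transaction-log entry."""
    uid: str
    not_exist_count: int = 0
    removed: bool = False


def recovery_scan(entry: LogEntry, branch_completed: bool) -> None:
    """Simulate one top-down recovery pass over a single log entry.

    branch_completed: True if a branch of the transaction completed during
    this scan. Per the thread this resets the counter; treating that scan
    as "not counted" is an assumption of this sketch.
    """
    if entry.removed:
        return
    if branch_completed:
        entry.not_exist_count = 0
        return
    # The resource already completed itself (bottom-up), so the coordinator
    # only sees OBJECT_NOT_EXIST on this attempt.
    entry.not_exist_count += 1
    if entry.not_exist_count >= ASSUMED_COMPLETE_THRESHOLD:
        entry.removed = True


def run_scans(events: list[bool]) -> LogEntry:
    """Run one scan per element of events and return the final entry state."""
    entry = LogEntry(uid="example-uid")  # placeholder uid, not a real one
    for branch_completed in events:
        recovery_scan(entry, branch_completed)
    return entry


# Three scans in total, with a branch completing during the first one:
# the counter only reaches 2, so the uid stays in the log.
three_scans = run_scans([True, False, False])

# Three further scans after the last branch has completed: the uid is removed.
four_scans = run_scans([True, False, False, False])
```

In this model, `three_scans.removed` is `False` and `four_scans.removed` is `True`, which matches Tom's advice that the three recovery calls need to happen after the resources have been recovered.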