Bug 1060202
| Summary: | Set timeout for every DTX transaction | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Pavel Moravec <pmoravec> | ||||||
| Component: | qpid-cpp | Assignee: | Pavel Moravec <pmoravec> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Leonid Zhaldybin <lzhaldyb> | ||||||
| Severity: | high | Docs Contact: | |||||||
| Priority: | high | ||||||||
| Version: | 3.0 | CC: | esammons, iboverma, jross, lzhaldyb, pmoravec, sauchter, vhubeika | ||||||
| Target Milestone: | 3.0 | Keywords: | EasyFix, Improvement, Patch | ||||||
| Target Release: | --- | ||||||||
| Hardware: | All | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | qpid-cpp-0.22-35 | Doc Type: | Enhancement | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2014-09-24 15:10:16 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | |||||||||
| Bug Blocks: | 785156 | ||||||||
| Attachments: |
|
||||||||
|
Description
Pavel Moravec
2014-01-31 12:50:12 UTC
Created attachment 857827
Patch for qpid-txtest to create orphaned DTX transactions
Necessary for reproducer.
Created attachment 857888
Proposed patch
Added the --dtx-default-timeout option, with a default value of 3600 seconds.
Tried the reproducer on a broker started with --dtx-default-timeout=60; the transaction was gone after one minute.
Committed to upstream as r1564694.

1) store_chk returning "Operation on non-existent record: operation=unlock; rid=..": that is a bug in store_chk, see bz1060114. Apply the fix https://bugzilla.redhat.com/attachment.cgi?id=858071&action=diff to /usr/lib64/python2.6/site-packages/qpidstore/janal.py so that store_chk passes.

Alternative reproducer on linearstore: after sending the DTX transaction without committing it, prepare (and commit) further transactions and observe the number of journal files:
a) qpid-txtest --queues=1 --total-messages=1 --dtx=1 --dtx-commit=no
b) while true; do ./qpid-txtest --queues=1 --total-messages=1000 --tx-count=10 --queue-base-name=MyTestTx; done
c) After a while (i.e. some time after the DTX default timeout has elapsed), check the number of files in the /var/lib/qpidd/qls/tpl/ directory.

Current behaviour: The very first journal file, created to keep the DTX record from step a), persists forever. The number of journal files keeps growing and no file is ever deleted.

Expected behaviour: There should be one file, or at most two if the while loop is still running. The very first file, created to keep the DTX record from step a), is already gone.

2) Wrong QMF statistics: That is a low-priority issue (especially compared to the original one, which prevents any (D)TX work completely) and is unrelated to the original one. I agree with filing it as a separate bug (and I am happy to have a look at it from the devel perspective).

Changing back to ON_QA.
(Sorry for the missing info above; when I filed the BZ I did not know the linear/legacy store status in 3.0 and expected bz1060114 to be fixed in parallel.)

Tested on RHEL6.5 (both i386 and x86_64) using both testing scenarios suggested by Pavel: the original one from comment 0, which uses the legacy store and the store_chk utility (copied from the stable 0.18 version), and the one for the linear store from comment 7, point 1. This issue has been fixed.
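
The file-count check in step c) of the linearstore reproducer can be scripted. What follows is only a minimal sketch, not part of the patch or of any qpid tool: the helper names `count_journal_files` and `within_expected_count` are hypothetical, the directory path is the one quoted in the reproducer, and it assumes that every non-directory entry below that path counts as a journal file.

```python
import os

# Directory named in step c) of the reproducer; adjust it if the broker
# was started with a different data directory.
TPL_DIR = "/var/lib/qpidd/qls/tpl"


def count_journal_files(directory=TPL_DIR):
    """Count all non-directory entries below `directory`, recursively.

    Assumption: every file found there is a journal file. A missing or
    unreadable directory simply yields 0, because os.walk skips errors.
    """
    total = 0
    for _root, _dirs, files in os.walk(directory):
        total += len(files)
    return total


def within_expected_count(directory=TPL_DIR, max_expected=2):
    """Return True if the count matches the expected behaviour.

    The expected behaviour in the report is one file, or at most two
    while the loop from step b) is still running.
    """
    return count_journal_files(directory) <= max_expected


if __name__ == "__main__":
    print("journal files:", count_journal_files())
    print("within expected count:", within_expected_count())
```

Running it once shortly after step a) and again some time after the DTX default timeout has elapsed should show whether the first journal file has been removed.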
The new dtx-default-timeout option provides the capability of setting a time interval after which the broker removes unfinished transactions from the store.

Packages used for testing:
python-qpid-0.22-12.el6.noarch
python-qpid-qmf-0.22-28.el6.i686
qpid-cpp-client-0.22-36.el6.i686
qpid-cpp-client-devel-0.22-36.el6.i686
qpid-cpp-client-devel-docs-0.22-36.el6.noarch
qpid-cpp-server-0.22-36.el6.i686
qpid-cpp-server-devel-0.22-36.el6.i686
qpid-cpp-server-linearstore-0.22-36.el6.i686
qpid-cpp-server-store-0.22-36.el6.i686.rpm
qpid-cpp-server-xml-0.22-36.el6.i686
qpid-java-client-0.22-6.el6.noarch
qpid-java-common-0.22-6.el6.noarch
qpid-java-example-0.22-6.el6.noarch
qpid-jca-0.22-2.el6.noarch
qpid-jca-xarecovery-0.22-2.el6.noarch
qpid-proton-c-0.6-1.el6.i686
qpid-qmf-0.22-28.el6.i686
qpid-tools-0.22-9.el6.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHEA-2014-1296.html |