Bug 1053749
| Summary: | [linearstore] Recovery of store failure with "JERR_MAP_NOTFOUND: Key not found in map." error message | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Kim van der Riet <kim.vdriet> | ||||
| Component: | qpid-cpp | Assignee: | Kim van der Riet <kim.vdriet> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Frantisek Reznicek <freznice> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | Development | CC: | esammons, freznice, iboverma, jross, kim.vdriet, zkraus | ||||
| Target Milestone: | 3.0 | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | qpid-cpp-0.22-35 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2015-01-21 12:56:04 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 709325 | ||||||
| Attachments: |
Upstream bug: https://issues.apache.org/jira/browse/QPID-5480

Fixed in r.1564877. This checkin also contains fixes for several other recovery edge cases discovered while testing this fix.

The fix for this bug changes the way files are recycled to the empty file pool. The bug occurred because files that contained only transactionally dequeued records, for which the transaction had not yet committed, were being returned to the empty file pool prematurely. This should not happen until the transaction has committed. On recovery, the enqueue records referenced by the open transactional dequeues were missing, and hence the JERR_MAP_NOTFOUND error was thrown.

QE notes:
---------
A consequence of this bug and its fix is that the example store supplied with this bug cannot be used to test the fix: the error lies in the missing journal files, not in how the store handles recovery with existing data, as is the case with many other bugs. This bug was found by soak-testing the store using a test similar to QE's qpid-txtest soak. I suggest that if soak-testing does not turn up a similar bug, then the issue is resolved.

According to Comment 4, bug 1052518 has to be retested.

An extended run of the transaction integrity tests (qpid_test_transaction_integrity, qpid_txtest_fails_bz458053) on three individual bare-metal machines (RHEL 6.5 i686 / x86_64) showed that the issue has been reliably resolved. No broker JERR_MAP_NOTFOUND error was detected in more than 1400 testing cycles.
Testing packageset:
perl-qpid-0.22-11.el6.i686
perl-qpid-debuginfo-0.22-11.el6.i686
python-qpid-0.22-12.el6.noarch
python-qpid-proton-0.6-1.el6.i686
python-qpid-qmf-0.22-28.el6.i686
qpid-cpp-client-0.22-36.el6.i686
qpid-cpp-client-devel-0.22-36.el6.i686
qpid-cpp-client-devel-docs-0.18-20.el6.noarch
qpid-cpp-client-rdma-0.22-36.el6.i686
qpid-cpp-debuginfo-0.22-36.el6.i686
qpid-cpp-server-0.22-36.el6.i686
qpid-cpp-server-devel-0.22-36.el6.i686
qpid-cpp-server-ha-0.22-36.el6.i686
qpid-cpp-server-linearstore-0.22-36.el6.i686
qpid-cpp-server-rdma-0.22-36.el6.i686
qpid-cpp-server-xml-0.22-36.el6.i686
qpid-java-client-0.22-6.el6.noarch
qpid-java-common-0.22-6.el6.noarch
qpid-java-example-0.22-6.el6.noarch
qpid-jca-0.18-8.el6.noarch
qpid-jca-xarecovery-0.18-8.el6.noarch
qpid-proton-c-0.6-1.el6.i686
qpid-proton-c-devel-0.6-1.el6.i686
qpid-proton-debuginfo-0.6-1.el6.i686
qpid-qmf-0.22-28.el6.i686
qpid-qmf-debuginfo-0.22-28.el6.i686
qpid-qmf-devel-0.22-28.el6.i686
qpid-snmpd-1.0.0-16.el6.i686
qpid-snmpd-debuginfo-1.0.0-16.el6.i686
qpid-tests-0.22-14.el6.noarch
qpid-tools-0.22-9.el6.noarch
rh-qpid-cpp-tests-0.22-36.el6.i686
ruby-qpid-0.7.946106-2.el6.i686
ruby-qpid-qmf-0.22-28.el6.i686

-> VERIFIED
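For readers following the fix comment above, the recycling rule it describes can be sketched roughly as follows. This is only an illustrative model, not the linearstore code: `EnqueueRecord`, `is_recyclable` and the `open_xids` set are hypothetical names, and it assumes "transactionally dequeued records" means enqueue records whose dequeue was issued under a transaction that is still open.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class EnqueueRecord:
    """Hypothetical model of one enqueue record held in a journal file."""
    rid: int                           # record id, as in "rid=0x3e4f0"
    dequeued: bool = False             # True once a dequeue has been issued
    dequeue_xid: Optional[str] = None  # transaction id if the dequeue was transactional


def is_recyclable(records: Iterable[EnqueueRecord], open_xids: Set[str]) -> bool:
    """Return True only if the file may go back to the empty file pool.

    Assumed rule: every enqueue in the file must be dequeued, and any
    transactional dequeue must belong to a transaction that is no longer open.
    """
    for rec in records:
        if not rec.dequeued:
            return False  # still a live message
        if rec.dequeue_xid is not None and rec.dequeue_xid in open_xids:
            return False  # dequeue is pending; recovery still needs the enqueue
    return True


# A file whose only record was dequeued under the still-open transaction "tx1"
records = [EnqueueRecord(rid=0x3E4F0, dequeued=True, dequeue_xid="tx1")]
print(is_recyclable(records, open_xids={"tx1"}))  # before commit
print(is_recyclable(records, open_xids=set()))    # after commit
```

In this model, the premature recycling described in the fix comment corresponds to skipping the second check, so that a file is released while a recovery of the open transaction would still look up its enqueue records.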
Created attachment 850597
Store files that need to be expanded

While running a qpid-txtest soak test, a store was encountered which could not be recovered. The broker stopped with the following error message:

    [Broker] critical Unexpected error: Error commitjexception 0x0b01 wmgr::dequeue() threw JERR_MAP_NOTFOUND: Key not found in map. (rid=0x3e4f0) (/home/kpvdr/RedHat/qpid/cpp/src/qpid/linearstore/TxnCtxt.cpp:55)

The store in question is attached. An additional similar error message is seen during store shutdown.

To reproduce:

1. Expand the store into a store directory. Note that the store directory is the one that contains the qls directory in the attached archive, and is supplied to the broker using the --store-dir parameter.
2. Start the broker so as to recover the store in the store directory:

    ./qpidd --load-module linearstore.so -m no --auth no --default-flow-stop-threshold 0 --default-flow-resume-threshold 0 --default-queue-limit 0 --store-dir <path-to-store-dir> --log-enable info+ --truncate no

The store does not recover, and the broker exits with the above error message.
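When repeating the reproduction steps or a soak run many times, it can help to scan the captured broker log for this error automatically. The following is a minimal sketch, assuming the error line has the same shape as the message quoted above; `find_map_notfound` is a hypothetical helper and not part of any qpid tool.

```python
import re

# Matches the error name followed, on the same line, by "(rid=0x...)"
ERR_PATTERN = re.compile(r"JERR_MAP_NOTFOUND.*?\(rid=(0x[0-9a-fA-F]+)\)")


def find_map_notfound(log_text):
    """Return the record ids (as ints) of every JERR_MAP_NOTFOUND error found."""
    return [int(match.group(1), 16) for match in ERR_PATTERN.finditer(log_text)]


sample = (
    "[Broker] critical Unexpected error: Error commitjexception 0x0b01 "
    "wmgr::dequeue() threw JERR_MAP_NOTFOUND: Key not found in map. "
    "(rid=0x3e4f0) (/home/kpvdr/RedHat/qpid/cpp/src/qpid/linearstore/TxnCtxt.cpp:55)"
)
print([hex(rid) for rid in find_map_notfound(sample)])  # ['0x3e4f0']
```

An empty result over a full soak log would be consistent with the fix holding, in line with the QE notes above.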