Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1053749

Summary: [linearstore] Recovery of store failure with "JERR_MAP_NOTFOUND: Key not found in map." error message
Product: Red Hat Enterprise MRG
Reporter: Kim van der Riet <kim.vdriet>
Component: qpid-cpp
Assignee: Kim van der Riet <kim.vdriet>
Status: CLOSED CURRENTRELEASE
QA Contact: Frantisek Reznicek <freznice>
Severity: high
Priority: high
Version: Development
CC: esammons, freznice, iboverma, jross, kim.vdriet, zkraus
Target Milestone: 3.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: qpid-cpp-0.22-35
Doc Type: Bug Fix
Last Closed: 2015-01-21 12:56:04 UTC
Type: Bug
Bug Blocks: 709325
Attachments:
Description Flags
Store files that need to be expanded none

Description Kim van der Riet 2014-01-15 17:27:47 UTC
Created attachment 850597
Store files that need to be expanded

While running a qpid-txtest soak test, a store was encountered which could not be recovered. The broker stopped with the following error message:

[Broker] critical Unexpected error: Error commitjexception 0x0b01 wmgr::dequeue() threw JERR_MAP_NOTFOUND: Key not found in map. (rid=0x3e4f0) (/home/kpvdr/RedHat/qpid/cpp/src/qpid/linearstore/TxnCtxt.cpp:55)

The store in question is attached. An additional similar error message is seen during store shutdown.

To reproduce:

1. Expand the store into a store directory. Note that the store directory is the one that contains the qls directory in the attached archive, and is supplied to the broker using the --store-dir parameter.

2. Start the broker so as to recover the store in the store directory:

./qpidd --load-module linearstore.so -m no --auth no --default-flow-stop-threshold 0 --default-flow-resume-threshold 0 --default-queue-limit 0 --store-dir <path-to-store-dir> --log-enable info+ --truncate no

The store does not recover, and the broker exits with the above error message.
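
The reproduction steps above can be scripted. The following is a minimal sketch in Python, assuming the broker binary is at ./qpidd as in step 2; the helper name build_recovery_command is hypothetical and not part of qpid. It only assembles the command line shown in step 2 and checks the condition from step 1 that the store directory contains the qls directory.

```python
import os


def build_recovery_command(store_dir, qpidd="./qpidd"):
    """Return the argv list for the recovery run from step 2 (hypothetical helper)."""
    # Step 1: the store directory is the one that contains the qls directory.
    if not os.path.isdir(os.path.join(store_dir, "qls")):
        raise ValueError(f"{store_dir!r} does not contain a 'qls' directory")
    # Options are copied unchanged from the command line in step 2.
    return [
        qpidd,
        "--load-module", "linearstore.so",
        "-m", "no",
        "--auth", "no",
        "--default-flow-stop-threshold", "0",
        "--default-flow-resume-threshold", "0",
        "--default-queue-limit", "0",
        "--store-dir", store_dir,
        "--log-enable", "info+",
        "--truncate", "no",
    ]
```

The returned list can be passed to subprocess.run() to start the broker; on an affected build the broker is expected to exit with the error shown above.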

Comment 1 Kim van der Riet 2014-01-15 17:28:51 UTC
Upstream bug: https://issues.apache.org/jira/browse/QPID-5480

Comment 2 Kim van der Riet 2014-02-05 19:16:09 UTC
Fixed in r.1564877.

This checkin also contains fixes for several other recovery edge cases discovered while testing this fix.

Comment 4 Kim van der Riet 2014-03-10 20:42:32 UTC
The fix for this bug changes the way files are recycled to the empty file pool. The bug was caused by files being returned to the empty file pool prematurely when they contained only transactionally dequeued records for which the transaction had not yet committed. A file should not be recycled until the transaction has committed. On recovery, the enqueue records referenced by the open transactional dequeues were missing, and hence the error JERR_MAP_NOTFOUND was thrown.
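
The recycling rule described above can be illustrated with a toy model. This is a sketch only, not the linearstore code: the class JournalFile and the functions can_recycle_buggy and can_recycle_fixed are hypothetical names, and the model tracks record ids in plain Python sets. It shows why a file holding only transactionally dequeued records must stay out of the empty file pool until the transaction commits.

```python
from dataclasses import dataclass, field


@dataclass
class JournalFile:
    """Toy model of one journal file (hypothetical, not the linearstore class)."""
    file_id: int
    # Record ids of enqueue records in this file that have not been dequeued.
    live_enqueues: set = field(default_factory=set)
    # Record ids dequeued inside a transaction that has not yet committed.
    pending_txn_dequeues: set = field(default_factory=set)

    def enqueue(self, rid):
        self.live_enqueues.add(rid)

    def txn_dequeue(self, rid):
        # The enqueue record is still needed on disk until the commit.
        self.live_enqueues.remove(rid)
        self.pending_txn_dequeues.add(rid)

    def commit(self, rids):
        # After the commit the dequeued records are no longer needed.
        self.pending_txn_dequeues -= set(rids)


def can_recycle_buggy(f):
    # Pre-fix behaviour as described above: open transactional dequeues are ignored.
    return not f.live_enqueues


def can_recycle_fixed(f):
    # Post-fix behaviour: also wait for open transactions to commit.
    return not f.live_enqueues and not f.pending_txn_dequeues
```

With a single record that has been enqueued and then dequeued in an open transaction, can_recycle_buggy returns True while can_recycle_fixed returns False until commit is called for that record id.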

QE notes:
---------

An implication of this bug and its fix is that the example store supplied with this bug cannot be used to test the fix: the error lies in the missing journal files, not in how the store handles recovery of existing data, as is the case with many other bugs.

This bug was found by soak-testing the store using a test similar to QE's qpid-txtest soak. I suggest that if soak-testing does not turn up a similar bug, then the issue is resolved.

Comment 6 Zdenek Kraus 2014-03-11 05:28:29 UTC
According to Comment 4, bug 1052518 has to be retested.

Comment 11 Frantisek Reznicek 2014-03-19 12:59:51 UTC
An extended run of transaction integrity tests (qpid_test_transaction_integrity, qpid_txtest_fails_bz458053) on three individual bare-metal machines (RHEL 6.5 i686 / x86_64) proved that the issue has been reliably resolved.
No broker JERR_MAP_NOTFOUND issue was detected in more than 1400 test cycles.

Testing packageset:
perl-qpid-0.22-11.el6.i686
perl-qpid-debuginfo-0.22-11.el6.i686
python-qpid-0.22-12.el6.noarch
python-qpid-proton-0.6-1.el6.i686
python-qpid-qmf-0.22-28.el6.i686
qpid-cpp-client-0.22-36.el6.i686
qpid-cpp-client-devel-0.22-36.el6.i686
qpid-cpp-client-devel-docs-0.18-20.el6.noarch
qpid-cpp-client-rdma-0.22-36.el6.i686
qpid-cpp-debuginfo-0.22-36.el6.i686
qpid-cpp-server-0.22-36.el6.i686
qpid-cpp-server-devel-0.22-36.el6.i686
qpid-cpp-server-ha-0.22-36.el6.i686
qpid-cpp-server-linearstore-0.22-36.el6.i686
qpid-cpp-server-rdma-0.22-36.el6.i686
qpid-cpp-server-xml-0.22-36.el6.i686
qpid-java-client-0.22-6.el6.noarch
qpid-java-common-0.22-6.el6.noarch
qpid-java-example-0.22-6.el6.noarch
qpid-jca-0.18-8.el6.noarch
qpid-jca-xarecovery-0.18-8.el6.noarch
qpid-proton-c-0.6-1.el6.i686
qpid-proton-c-devel-0.6-1.el6.i686
qpid-proton-debuginfo-0.6-1.el6.i686
qpid-qmf-0.22-28.el6.i686
qpid-qmf-debuginfo-0.22-28.el6.i686
qpid-qmf-devel-0.22-28.el6.i686
qpid-snmpd-1.0.0-16.el6.i686
qpid-snmpd-debuginfo-1.0.0-16.el6.i686
qpid-tests-0.22-14.el6.noarch
qpid-tools-0.22-9.el6.noarch
rh-qpid-cpp-tests-0.22-36.el6.i686
ruby-qpid-0.7.946106-2.el6.i686
ruby-qpid-qmf-0.22-28.el6.i686

-> VERIFIED