Bug 1224300

Summary: [MRG 3.2] linearstore raising JERR_LFCR_SEQNUMNOTFOUND after sending many DTX transactions
Product: Red Hat Enterprise MRG Reporter: Mike Cressman <mcressma>
Component: qpid-cpp Assignee: Irina Boverman <iboverma>
Status: CLOSED ERRATA QA Contact: Eric Sammons <esammons>
Severity: high Docs Contact:
Priority: high    
Version: 3.0 CC: esammons, iboverma, jross, kim.vdriet, messaging-bugs, messaging-qe-bugs, pematous, pmoravec, smumford
Target Milestone: 3.2   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: qpid-cpp-0.32-3 Doc Type: Bug Fix
Doc Text:
In previous releases of the product, customers could encounter a `JERR_LFCR_SEQNUMNOTFOUND` error. After enough journal files had been used, the journal file sequence number overflowed to 0, which is a reserved, non-existent file number. This was an artifact of the older legacystore, where file numbers were limited to 64 files in a circular buffer. In this release, all uses of the journal file sequence number have been updated to 64-bit unsigned integers, which prevents the error in any practical usage situation.
Story Points: ---
Clone Of: 1223789 Environment:
Last Closed: 2015-10-08 13:10:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1223789    
Bug Blocks: 1223795    

Description Mike Cressman 2015-05-22 13:34:33 UTC
+++ This bug was initially created as a clone of Bug #1223789 +++

Description of problem:
Please backport QPID-6551 to 3.0.*.


Version-Release number of selected component (if applicable):
qpid-cpp-server-linearstore-0.22-51


How reproducible:
100%


Steps to Reproduce:
qpidd --efp-file-size=32 --log-to-file=/tmp/qpidd.log &
# --efp-file-size just for faster reproducer
qpid-txtest --dtx=yes --check=no --init=yes --tx-count=10 --total-messages=1000 --size=1
nohup qpid-txtest --dtx=yes --check=no --init=no --tx-count=200000 --size=1 &

Now wait 1-2 hours until the nohup command finishes.


Actual results:
- The last command fails
- qpidd logs exception:
jexception 0x0500 LinearFileController::find() threw JERR_LFCR_SEQNUMNOTFOUND: File sequence number not found (fileSeqNumber=0)


followed by various other journal exceptions.


Expected results:
- The last command succeeds
- qpid-stat -q shows 2M enqueues and 2M dequeues in both queues tx-test-1 and tx-test-2 (just to confirm all transactions were executed)


Additional info:
Backport https://svn.apache.org/r1680861.
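
Illustration of the failure mode (a sketch, not the linearstore code): the Doc Text attributes the error to the journal file sequence number overflowing to 0, which is reserved, and the exception above reports fileSeqNumber=0. This report does not state how wide the counter was before the fix, so the 16-bit type used below is purely an assumption for illustration, and the names incrementsUntilWrap, maxSeqNum and kReservedSeqNum are hypothetical.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Illustration only: 0 is treated as the reserved "no such file" value,
// as the Doc Text describes. All names here are hypothetical.
constexpr std::uint64_t kReservedSeqNum = 0;

// Counts how many increments an unsigned counter of type SeqT takes,
// starting from 0, before it wraps back to the reserved value 0.
// Only call this with narrow types; a 64-bit counter would not finish
// in any practical amount of time.
template <typename SeqT>
std::uint64_t incrementsUntilWrap() {
    static_assert(std::is_unsigned<SeqT>::value, "SeqT must be unsigned");
    SeqT seq = 0;
    std::uint64_t steps = 0;
    do {
        ++seq;    // unsigned arithmetic wraps modulo 2^N
        ++steps;
    } while (seq != kReservedSeqNum);
    return steps;
}

// Largest sequence number a counter of type SeqT can hold before wrapping.
template <typename SeqT>
constexpr std::uint64_t maxSeqNum() {
    return std::numeric_limits<SeqT>::max();
}
```

Under that assumption, incrementsUntilWrap<std::uint16_t>() returns 65536: the 65536th new file would be assigned the reserved number 0. maxSeqNum<std::uint64_t>() is 18446744073709551615, which is why a 64-bit counter is not expected to wrap in practice.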

Comment 1 Mike Cressman 2015-05-22 13:40:22 UTC
Definitely need to fix this in 3.2, and hotfix or fix in 3.0/3.1.

Comment 4 Jitka Kocnova 2015-08-19 13:45:51 UTC
This issue is fixed.

Verified on RHEL 6 (x86_64, i386) with packages:

qpid-cpp-client-0.34-1.el6
qpid-cpp-server-rdma-0.34-1.el6
qpid-cpp-server-ha-0.34-1.el6
qpid-proton-c-0.9-4.el6
qpid-cpp-server-0.34-1.el6
qpid-cpp-client-devel-0.34-1.el6
qpid-cpp-server-linearstore-0.34-1.el6
qpid-cpp-server-devel-0.34-1.el6
qpid-qmf-0.34-1.el6.x86_64
qpid-tools-0.34-1.el6.noarch
qpid-cpp-client-rdma-0.34-1.el6
qpid-cpp-server-xml-0.34-1.el6
qpid-cpp-debuginfo-0.34-1.el6

and

qpid-cpp-server-0.32-3.el6
qpid-cpp-server-ha-0.32-3.el6
qpid-cpp-server-devel-0.32-3.el6
qpid-cpp-client-0.32-3.el6
qpid-cpp-client-rdma-0.32-3.el6
qpid-cpp-server-rdma-0.32-3.el6
qpid-cpp-server-linearstore-0.32-3.el6
qpid-cpp-debuginfo-0.32-3.el6
qpid-proton-c-0.9-4.el6
qpid-cpp-client-devel-0.32-3.el6
qpid-cpp-server-xml-0.32-3.el6


-> VERIFIED

Comment 10 errata-xmlrpc 2015-10-08 13:10:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-1879.html