
Bug 1051924

Summary: [linearstore] Recovery of journal in which last logical file contains truncated record causes crash
Product: Red Hat Enterprise MRG
Component: qpid-cpp
Version: Development
Reporter: Kim van der Riet <kim.vdriet>
Assignee: Kim van der Riet <kim.vdriet>
QA Contact: Zdenek Kraus <zkraus>
CC: esammons, iboverma, jross, zkraus
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Target Milestone: 3.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: qpid-cpp-0.22-33
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-01-21 12:57:21 UTC
Bug Blocks: 709325

Description Kim van der Riet 2014-01-12 22:34:47 UTC
When the store recovery process encounters a journal in which the last logical file is full and the last record in that file is incomplete (i.e., the record would ordinarily span this journal file and the next, if the next file were present), the recovery process crashes with a segfault.
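
To illustrate the failure condition, here is a minimal sketch of a recovery loop that checks bounds before reading each record. It assumes a hypothetical record layout (a 4-byte little-endian length followed by the payload); this is not the actual linearstore journal format, and `encode_record` and `recover` are illustrative names only.

```python
import struct
from typing import List, Optional, Tuple

# Hypothetical record header: 4-byte little-endian payload length.
# The real linearstore record layout is different; this is an assumption.
HEADER = struct.Struct("<I")


def encode_record(payload: bytes) -> bytes:
    """Serialize one record as header + payload."""
    return HEADER.pack(len(payload)) + payload


def recover(files: List[bytes]) -> Tuple[List[bytes], Optional[int]]:
    """Scan logical files in order.

    Returns the complete payloads and the stream offset of a truncated
    trailing record, or None if the journal ends on a record boundary.
    """
    stream = b"".join(files)  # records may span file boundaries
    records = []
    offset = 0
    while offset < len(stream):
        # The header itself may be cut off at the end of the last file.
        if len(stream) - offset < HEADER.size:
            return records, offset
        (length,) = HEADER.unpack_from(stream, offset)
        end = offset + HEADER.size + length
        # The payload would continue into a file that is not present.
        if end > len(stream):
            return records, offset
        records.append(stream[offset + HEADER.size:end])
        offset = end
    return records, None
```

In this sketch a truncated trailing record is reported to the caller instead of being read past the end of the available data, which is the kind of check a recovery path needs in the situation described above.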

There is no specific reproducer, but two things increase the probability of hitting this error: using large message sizes in a test, and killing the broker suddenly (i.e., with SIGKILL).

Although untested, this error could also be synthesized by deleting the last file(s) in a valid journal, provided that the last record in the remaining journal files is still undequeued and spans into a deleted file. That is: enqueue enough large records to fill 2 or 3 journal files. Recover the journal with info-level logging and look at the table of recovered files and their logical ordering. Starting at the second-last file, use hexdump or a journal analysis tool to find a journal file whose last record spans that file and the next. Then delete the following journal file(s).
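
The synthesis idea above can be sketched on a toy journal. This is only an illustration under stated assumptions: the fixed file size, the `NNNN.jrnl` file naming, and the length-prefixed record layout are all hypothetical and do not reflect the real linearstore on-disk format or tooling.

```python
import os
import struct
import tempfile

FILE_SIZE = 64                 # hypothetical fixed journal file size (bytes)
HEADER = struct.Struct("<I")   # hypothetical 4-byte record length header


def write_journal(directory, payloads):
    """Write records back to back, split into fixed-size files."""
    stream = b"".join(HEADER.pack(len(p)) + p for p in payloads)
    paths = []
    for index in range(0, len(stream), FILE_SIZE):
        # File naming here is illustrative only.
        path = os.path.join(directory, "%04d.jrnl" % (index // FILE_SIZE))
        with open(path, "wb") as handle:
            handle.write(stream[index:index + FILE_SIZE])
        paths.append(path)
    return paths


def last_record_is_truncated(paths):
    """Return True if the final record runs past the end of the files."""
    stream = b""
    for path in paths:
        with open(path, "rb") as handle:
            stream += handle.read()
    offset = 0
    while offset < len(stream):
        if len(stream) - offset < HEADER.size:
            return True  # header itself is cut off
        (length,) = HEADER.unpack_from(stream, offset)
        offset += HEADER.size + length
    return offset > len(stream)


def synthesize_truncated_journal(payloads):
    """Build a journal, delete its last file, and report the state
    before and after as (truncated_before, truncated_after)."""
    with tempfile.TemporaryDirectory() as directory:
        paths = write_journal(directory, payloads)
        before = last_record_is_truncated(paths)
        os.remove(paths[-1])  # drop the file holding the record's tail
        after = last_record_is_truncated(paths[:-1])
        return before, after
```

With three 40-byte payloads the toy journal occupies three files, and the last record starts in the second file and ends in the third. Deleting the third file leaves a full last file ending in an incomplete record, which mirrors the condition described in this report.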

Comment 1 Kim van der Riet 2014-01-12 22:35:37 UTC
Tracked in upstream bug https://issues.apache.org/jira/browse/QPID-5473.

Comment 2 Kim van der Riet 2014-01-12 22:43:36 UTC
Fixed in r.1557620

Comment 5 Zdenek Kraus 2014-03-19 13:52:02 UTC
This issue was tested on RHEL 6.5 (i686 and x86_64) with the following packages:

perl-qpid-0.22-11.el6
python-qpid-0.22-12.el6
python-qpid-qmf-0.22-28.el6
qpid-cpp-client-0.22-36.el6
qpid-cpp-client-devel-0.22-36.el6
qpid-cpp-client-devel-docs-0.22-36.el6
qpid-cpp-debuginfo-0.22-36.el6
qpid-cpp-server-0.22-36.el6
qpid-cpp-server-devel-0.22-36.el6
qpid-cpp-server-ha-0.22-36.el6
qpid-cpp-server-linearstore-0.22-36.el6
qpid-cpp-server-xml-0.22-36.el6
qpid-java-client-0.22-6.el6
qpid-java-common-0.22-6.el6
qpid-java-example-0.22-6.el6
qpid-jca-0.22-2.el6
qpid-jca-xarecovery-0.22-2.el6
qpid-proton-c-0.6-1.el6
qpid-proton-c-devel-0.6-1.el6
qpid-proton-debuginfo-0.6-1.el6
qpid-qmf-0.22-28.el6
qpid-qmf-debuginfo-0.22-28.el6
qpid-snmpd-1.0.0-16.el6
qpid-snmpd-debuginfo-1.0.0-16.el6
qpid-tools-0.22-9.el6
ruby-qpid-qmf-0.22-28.el6


-> VERIFIED