Bug 622889
Summary: | Store resize operation causes Qpid broker recovery to fail with JERR_FCNTL_RDOFFSOVFL | ||
---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | Kim van der Riet <kim.vdriet> |
Component: | qpid-cpp | Assignee: | Kim van der Riet <kim.vdriet> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | MRG Quality Engineering <mrgqe-bugs> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | Development | CC: | gsim, ppecka |
Target Milestone: | 1.3 | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | 620676 | Environment: | |
Last Closed: | 2012-12-07 17:41:55 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 620676 | ||
Bug Blocks: | |||
Description
Kim van der Riet
2010-08-10 17:28:28 UTC
This bug is to deal with the broker recovery failure (JERR_FCNTL_RDOFFSOVFL) resulting from attempting to restore the partly written journal in the scenario above.

Fixed in r.4199.

This represents a previously undiscovered corner case in the recovery of corrupted journals. In this case, a large record which spans more than one file was incompletely written in the second file. When the read pipeline is validated, it reads the file header from the first non-empty file (or the last file written if there are no records to read), and sets the read pointer at the first available record in the file, which is located at the position indicated in the fro (first record offset) field in the header. This places it after the point where the last record was corrupted. The write pointers, however, were (correctly) set at the start of the file. Since the read pointers are constrained by the write pointers, this operation caused a JERR_FCNTL_RDOFFSOVFL error.

The error was fixed by placing an additional check in the file header read logic which processes the initial read pointer locations, to ensure that the value set is constrained by the value of the write pointers.

QE: You will need a saved part-written journal from Bug #620676 to test this. Make sure you save the complete store. Also, do not attempt to recover your saved copy directly, as recovery will overwrite the store. Use a fresh copy of the part-written store each time you test this. If you cannot get a store that fails correctly, contact me; I have a "working" copy that you can use.
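The check described above can be illustrated with a minimal sketch. This is not the qpid-cpp code from r.4199; the function names, the exception class, and the offsets are all hypothetical, and the journal is reduced to two integer offsets within one file. It only shows why taking the read offset straight from the fro field fails when the write pointer sits earlier in the file, and how constraining it by the write pointer avoids the error.

```python
class ReadOffsetOverflow(Exception):
    """Hypothetical stand-in for the journal's JERR_FCNTL_RDOFFSOVFL error."""


def set_read_offset(read_offset: int, write_offset: int) -> int:
    """Model the rule that the read pointer may never pass the write pointer."""
    if read_offset > write_offset:
        raise ReadOffsetOverflow(
            f"read offset {read_offset} exceeds write offset {write_offset}"
        )
    return read_offset


def initial_read_offset(fro: int, write_offset: int) -> int:
    """Choose the starting read offset from the header's fro field,
    constrained so that it cannot lie beyond the write pointer."""
    return set_read_offset(min(fro, write_offset), write_offset)


# Assumed offsets for illustration only: the write pointer was reset to the
# start of the file, while fro points past the incompletely written record.
WRITE_OFFSET = 512
FRO = 4096

# Behaviour before the fix: using fro directly trips the overflow check.
try:
    set_read_offset(FRO, WRITE_OFFSET)
    unchecked_failed = False
except ReadOffsetOverflow:
    unchecked_failed = True

# Behaviour with the additional check: the read offset is constrained.
clamped = initial_read_offset(FRO, WRITE_OFFSET)
```

When fro already lies at or before the write pointer, `initial_read_offset` returns it unchanged, so the extra check only affects the corrupted-journal case.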
VERIFIED on RHEL 5.6 / 6.2 (i686/x86_64)

rpm -qa | grep qpid | sort -u
python-qpid-0.10-1.el6.noarch
python-qpid-qmf-0.10-10.el6.x86_64
qpid-cpp-client-0.10-6.el6.x86_64
qpid-cpp-client-devel-0.10-6.el6.x86_64
qpid-cpp-client-devel-docs-0.10-6.el6.noarch
qpid-cpp-client-rdma-0.10-6.el6.x86_64
qpid-cpp-client-ssl-0.10-6.el6.x86_64
qpid-cpp-server-0.10-6.el6.x86_64
qpid-cpp-server-cluster-0.10-6.el6.x86_64
qpid-cpp-server-devel-0.10-6.el6.x86_64
qpid-cpp-server-rdma-0.10-6.el6.x86_64
qpid-cpp-server-ssl-0.10-6.el6.x86_64
qpid-cpp-server-store-0.10-6.el6.x86_64
qpid-cpp-server-xml-0.10-6.el6.x86_64
qpid-java-client-0.10-6.el6.noarch
qpid-java-common-0.10-6.el6.noarch
qpid-java-example-0.10-6.el6.noarch
qpid-java-jca-0.10-6.el6.noarch
qpid-qmf-0.10-10.el6.x86_64
qpid-qmf-devel-0.10-10.el6.x86_64
qpid-tests-0.10-1.el6.noarch
qpid-tools-0.10-4.el6.noarch
rh-qpid-cpp-tests-0.10-6.el6.x86_64

--> VERIFIED