Bug 468824

Summary: rmgr::read() throws JERR_RMGR_UNKNOWNMAGIC on recovery
Product: Red Hat Enterprise MRG
Reporter: Gordon Sim <gsim>
Component: qpid-cpp
Assignee: Kim van der Riet <kim.vdriet>
Status: CLOSED DUPLICATE
QA Contact: Kim van der Riet <kim.vdriet>
Severity: urgent
Priority: urgent    
Version: 1.0   
Target Milestone: 1.1   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-12-05 15:41:25 UTC

Description Gordon Sim 2008-10-28 09:22:28 UTC
Mail from freznice:

I triggered this after a few days of running on dell-pe2850-04.rhts.bos.redhat.com, RHEL52_x86_64.
MRG was built from trunk on 'Oct 24 11:45' (Boston time zone).

PART A] ------------------------------------------------------------ [(A) 79/500]

Shortened exception: (qpid_test_transaction_integrity/qpidd_txtest.transcript.log line 10816)
2008-oct-24 15:21:07 info Loaded Module: /root/mrg_installed/lib/qpid/daemon/ssl.so
2008-oct-24 15:21:07 info Loaded Module: /root/mrg_installed/lib/qpid/daemon/cluster.so
2008-oct-24 15:21:07 info Loaded Module: /root/mrg_installed/lib/qpid/daemon/acl.so
2008-oct-24 15:21:07 info Loaded Module: /root/mrg_installed/lib/qpid/daemon/msgstore.so
2008-oct-24 15:21:07 info Management enabled
2008-oct-24 15:21:07 notice Journal "TplStore": Created
2008-oct-24 15:21:07 notice Store module initialized; dir=/tmp/rhts_qpidd/20081024_125525/broker.090
2008-oct-24 15:21:07 info > Default files per journal: 8
2008-oct-24 15:21:07 info > Auto-expand enabled
2008-oct-24 15:21:07 info > Max auto-expand journal files: 16
2008-oct-24 15:21:07 info > Default jrournal file size: 24 (wpgs)
2008-oct-24 15:21:07 info > Default write cache page size: 32 (Kib)
2008-oct-24 15:21:07 info > Default number of write cache pages: 32
2008-oct-24 15:21:07 info > TPL files per journal: 8
2008-oct-24 15:21:07 info > TPL jrournal file size: 24 (wpgs)
2008-oct-24 15:21:07 info > TPL write cache page size: 4 (Kib)
2008-oct-24 15:21:07 info > TPL number of write cache pages: 64
2008-oct-24 15:21:07 notice Journal "5c56eadec00cf07e74533bc3baa7a1b8af4e06c-1": Created
2008-oct-24 15:21:08 info Recovered queue "5c56eadec00cf07e74533bc3baa7a1b8af4e06c-1": 12739 messages recovered; 0 messages in-doubt.
2008-oct-24 15:21:08 notice Journal "5c56eadec00cf07e74533bc3baa7a1b8af4e06c-2": Created
2008-oct-24 15:21:09 info Recovered queue "5c56eadec00cf07e74533bc3baa7a1b8af4e06c-2": 1168 messages recovered; 0 messages in-doubt.
2008-oct-24 15:21:09 notice Journal "5c56eadec00cf07e74533bc3baa7a1b8af4e06c-3": Created
2008-oct-24 15:21:09 warning Journal "5c56eadec00cf07e74533bc3baa7a1b8af4e06c-3": Bad record alignment found at fid=0x5 offs=0x180180 (likely journal overwrite boundary); 1 filler record(s) required.
2008-oct-24 15:21:09 notice Journal "5c56eadec00cf07e74533bc3baa7a1b8af4e06c-3": Recover phase write: Wrote filler record at offs=0x180180
2008-oct-24 15:21:09 info Journal "5c56eadec00cf07e74533bc3baa7a1b8af4e06c-3": Bad record alignment fixed.
Queue 5c56eadec00cf07e74533bc3baa7a1b8af4e06c-3: recoverMessages() failed: jexception 0x0900 rmgr::read() threw JERR_RMGR_UNKNOWNMAGIC: Found record with unknown magic. (Magic=0x00000000) (MessageStoreImpl.cpp:931)
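
For triage, a transcript of this size can be scanned for jexception lines rather than read by hand. The following is a minimal sketch, not part of the test suite: the regular expression is modeled only on the messages quoted in this report, the function name find_jexceptions is hypothetical, and it assumes each log message sits on a single line (the excerpt above was wrapped mid-word by the terminal).

```python
import re

# Pattern modeled on the jexception messages quoted in this report;
# other message shapes may exist and would need their own handling.
JEXCEPTION = re.compile(
    r"jexception (?P<code>0x[0-9a-fA-F]+) "
    r"(?P<origin>\S+\(\)) threw (?P<name>JERR_\w+): "
    r"(?P<text>.*)"
)


def find_jexceptions(lines):
    """Yield (line_number, code, origin, name, text) for each jexception line."""
    for number, line in enumerate(lines, start=1):
        match = JEXCEPTION.search(line)
        if match:
            yield (
                number,
                match["code"],
                match["origin"],
                match["name"],
                match["text"],
            )


# Two lines taken from the excerpt above, with the wrapped line rejoined.
sample = [
    '2008-oct-24 15:21:09 info Journal "5c56eadec00cf07e74533bc3baa7a1b8af4e06c-3": '
    "Bad record alignment fixed.",
    "Queue 5c56eadec00cf07e74533bc3baa7a1b8af4e06c-3: recoverMessages() failed: "
    "jexception 0x0900 rmgr::read() threw JERR_RMGR_UNKNOWNMAGIC: "
    "Found record with unknown magic. (Magic=0x00000000) (MessageStoreImpl.cpp:931)",
]

for hit in find_jexceptions(sample):
    print(hit)
```

To scan the real transcript, pass an open file object instead of the sample list; the line numbers are then directly comparable to those cited in this report.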


The failing case, including pre-recovery journals, is stored in:
mrg5.lab.bos.redhat.com:/root/qpid_test_transaction_integrity_fails081027.tar.bz2

See qpid_test_transaction_integrity/qpidd_txtest.transcript.log for details.


PART B] ------------------------------------------------------------

There are two other kinds of problems:
B1] JERR_RRFC_OPENRD exception because of errno=24 - most probably a test error [(B) 14/500]
  qpid_test_transaction_integrity/qpid_test_transaction_integrity.log line 15349 and
  qpid_test_transaction_integrity/qpidd_txtest.transcript.log line 73144

error Connection 127.0.0.1:51615 closed by error: Queue 1b42e4d18edb3f6a9e20de1d049513b6743c7f26817bea564e447c18de099d309-3:
create() failed: jexception 0x0600 rrfc::open_fh() threw JERR_RRFC_OPENRD: Unable to open file for read. (file="/tmp/rhts_qpidd/20081024_125525/broker.417/rhm/jrnl/0008/1b42e4d18edb3f6a9e20de1d049513b6743c7f26817bea564e447c18de099d309-3//JournalData.0000.jdat" errno=24 (Too many open files))
 (MessageStoreImpl.cpp:438)(501)
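
To check whether B1] is a test-environment limit rather than a store defect, the open-descriptor count of a process can be compared with its limit. This is a rough sketch under stated assumptions: it is Linux-only (it reads /proc), the helper names count_open_fds and own_fd_limits are hypothetical, and the limits shown are those of the process running the script, not of the broker.

```python
import errno
import os
import resource


def count_open_fds(pid="self"):
    """Count the open file descriptors of a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def own_fd_limits():
    """Return the (soft, hard) RLIMIT_NOFILE values of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


# errno 24 in the log above is EMFILE on Linux: the per-process descriptor
# limit was reached.
print("errno 24 is", errno.errorcode[24])

soft, hard = own_fd_limits()
print(f"open fds: {count_open_fds()}  soft limit: {soft}  hard limit: {hard}")
```

Passing the broker's PID to count_open_fds (with sufficient permissions) shows how close it runs to the limit; the broker's own limits can be read from /proc/<pid>/limits.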


B2] Db::open: Cannot allocate memory [(B) 55/500]
  qpid_test_transaction_integrity/qpidd_txtest.transcript.log line 93837

  This is most probably a consequence of B1], but it is suspicious because this message appears when the bdb database is corrupted. (I am no longer running db_recover.)

Comment 1 Kim van der Riet 2008-12-05 15:41:25 UTC

*** This bug has been marked as a duplicate of bug 466533 ***