Bug 486418 - qpidd+store The extra xids encountered after qpidd recovery from journal
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1
Hardware: All Linux
Priority: high
Severity: high
Target Milestone: 1.1.1
Target Release: ---
Assigned To: Kim van der Riet
QA Contact: Frantisek Reznicek
Reported: 2009-02-19 12:28 EST by Frantisek Reznicek
Modified: 2015-11-15 19:06 EST

Doc Type: Bug Fix
Last Closed: 2011-06-27 16:53:59 EDT


Description Frantisek Reznicek 2009-02-19 12:28:58 EST
Description of problem:

During journal testing, txtest --check very rarely exits with exit code 1 and the message:
The following extra ids were encountered:
<list of the extra message ids found follows>

This happened in part B of the transaction integrity test, where the journal is not deleted before each run. It appears that xids are being reused.

Version-Release number of selected component (if applicable):
qpidd-0.4.743861-1.el4, rhm-0.4.3116-2.el4

How reproducible:
rarely (~3%)

Steps to Reproduce:
1. Run the RHTS test qpid_test_transaction_integrity (part B).
  
Actual results:
Very rarely txtest finds extra messages after qpidd recovery.

Expected results:
No extra messages should be seen after qpidd recovery.

Additional info:

The data are stored in the RHTS system. Search for 'The following extra ids were encountered:' in
https://rhts.redhat.com/testlogs/46435/157993/1320331/TESTOUT.log (coarse log)
and in
https://rhts.redhat.com/testlogs/46435/157993/1320331/qpidd_txtest.transcript.log (detailed log).
The corresponding journals are here:
https://rhts.redhat.com/testlogs/46435/157993/1320331/qpidd_journal_b0020-0023.tar.bz2
Comment 1 Kim van der Riet 2009-02-19 14:02:28 EST
Analysis of the journals shows that this bug occurs when a local transaction id (tid) is reused for a new transaction after a transaction with the same tid was left incomplete in a previous test. The records from the previous test are discarded if there is no matching entry in the transaction prepared list (TPL). However, as soon as a transaction using the same tid is committed, recovery also includes the records from the earlier test, because the journal has no way of knowing whether they were part of the same transaction.
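
The failure mode described above can be illustrated with a minimal sketch. It is not the store's actual recovery code: the Record struct, the recover() function and the use of a std::set as the TPL are all hypothetical simplifications, and the tid value is made up.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified journal record: a message id tagged with the
// transaction id (tid) under which it was enqueued.
struct Record {
    std::string tid;
    int messageId;
};

// Simplified recovery rule (an assumption, not the real store logic):
// keep a record only if its tid has a matching entry in the TPL.
std::vector<int> recover(const std::vector<Record>& journal,
                         const std::set<std::string>& tpl) {
    std::vector<int> recovered;
    for (const Record& r : journal) {
        if (tpl.count(r.tid) != 0) {
            recovered.push_back(r.messageId);
        }
    }
    return recovered;
}

// Usage: two test runs happen to use the same tid.
std::vector<int> demoTidReuse() {
    std::vector<Record> journal = {
        {"tid-0x1000", 1},  // left incomplete by an earlier test run
        {"tid-0x1000", 2},  // later run, same tid, subsequently committed
    };
    std::set<std::string> tpl;
    // No matching TPL entry yet: the stale record is discarded.
    assert(recover(journal, tpl).empty());
    // Once the later transaction is recorded under the same tid, the stale
    // record becomes indistinguishable from the new one.
    tpl.insert("tid-0x1000");
    return recover(journal, tpl);
}
```

In this sketch demoTidReuse() returns both message ids, and the first one corresponds to an "extra id" of the kind txtest --check reports.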

The class TxnCtxt was using the string "tid-" followed by its own memory address as a quick and cheap (i.e. not costly in performance) tid. However, memory addresses can be reused, so the same address, and hence the same tid, can be allocated in different tests.
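
A small sketch of why an address-derived tid is not unique over time. TxnCtxtSketch is a hypothetical stand-in, not the real TxnCtxt class. Because heap reuse is not guaranteed on any given run, the sketch models it deterministically by constructing two objects one after the other in the same storage with placement new.

```cpp
#include <cassert>
#include <new>
#include <sstream>
#include <string>

// Hypothetical stand-in for a transaction context whose tid is "tid-"
// followed by the object's own address.
struct TxnCtxtSketch {
    std::string tid;
    TxnCtxtSketch() {
        std::ostringstream oss;
        oss << "tid-" << static_cast<const void*>(this);
        tid = oss.str();
    }
};

// Returns true if two objects with non-overlapping lifetimes end up with the
// same tid. Reusing one buffer models an allocator handing out the same
// address again.
bool addressTidRepeats() {
    alignas(TxnCtxtSketch) unsigned char storage[sizeof(TxnCtxtSketch)];

    TxnCtxtSketch* first = new (storage) TxnCtxtSketch();
    const void* firstAddr = first;
    std::string firstTid = first->tid;
    first->~TxnCtxtSketch();  // first "transaction" goes away

    TxnCtxtSketch* second = new (storage) TxnCtxtSketch();
    assert(firstAddr == static_cast<const void*>(second));  // same address
    std::string secondTid = second->tid;
    second->~TxnCtxtSketch();

    return firstTid == secondTid;
}
```

Here addressTidRepeats() returns true: the two contexts never coexist, yet they carry identical tids.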

The problem was solved by using ::uuid_generate() to create a genuine xid once for each broker instance. A 64-bit counter is incremented for each transaction and its value prepended to the uuid to create a final tid that is guaranteed unique without the expense of generating a new uuid for each transaction.
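
A minimal sketch of the counter-plus-instance-id scheme, under stated assumptions. To stay dependency-free it substitutes 128 bits from std::random_device for the ::uuid_generate() call used in the actual fix. The class name TidGeneratorSketch and the hex "counter-id" string layout are hypothetical; the real tid format is not given in this report. The sketch is also not thread-safe.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <random>
#include <sstream>
#include <string>

// Hypothetical generator: one random 128-bit id per instance (standing in
// for a uuid created once per broker instance), plus a 64-bit counter that
// is prepended to it for every transaction.
class TidGeneratorSketch {
public:
    TidGeneratorSketch() : instanceId_(makeInstanceId()), counter_(0) {}

    // Returns "<16 hex digits of counter>-<32 hex digits of instance id>".
    std::string next() {
        assert(counter_ < UINT64_MAX);  // wrap-around would repeat tids
        std::ostringstream oss;
        oss << std::hex << std::setfill('0') << std::setw(16) << ++counter_
            << '-' << instanceId_;
        return oss.str();
    }

private:
    // Assumption: random bits replace ::uuid_generate() in this sketch.
    static std::string makeInstanceId() {
        std::random_device rd;
        std::ostringstream oss;
        oss << std::hex << std::setfill('0');
        for (int i = 0; i < 4; ++i) {
            oss << std::setw(8) << static_cast<std::uint32_t>(rd());
        }
        return oss.str();
    }

    std::string instanceId_;
    std::uint64_t counter_;
};
```

Each call to next() costs one increment and one string format. The instance id is generated only once, in the constructor, which mirrors the goal of avoiding a new uuid for every transaction.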

Fixed in r.3124.

QA: This bug cannot be reliably reproduced; thorough soak testing should verify that there is no recurrence.
Comment 2 Frantisek Reznicek 2009-03-09 09:11:51 EDT
The issue has been fixed; validated on RHEL 4.7 / 5.3 (i386 / x86_64) with packages:
qpidd-0.4.750054-1.el5, rhm-0.4.3138-2.el5.

->VERIFIED
Comment 3 Justin Ross 2011-06-27 16:53:59 EDT
Fixed and verified; closing.
