Bug 1039505 - Node counters are zeroed on broker restart
Summary: Node counters are zeroed on broker restart
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: messaging-bugs
QA Contact: Messaging QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-12-09 10:06 UTC by Petr Matousek
Modified: 2021-03-16 12:46 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:



Description Petr Matousek 2013-12-09 10:06:50 UTC
Description of problem:

After a broker restart the qmf counters (i.e. msgIn/msgOut/etc.) are cleared. I believe that the counters of durable nodes should not be cleared on broker restart.

Note: counters are zeroed only if msgIn == msgOut.
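A toy model of why the counters survive only when messages remain on the queue (this is an assumption about the recovery path, not the actual qpid-cpp code; all names are illustrative): on restart the broker replays durable messages from the store, and each replayed message bumps msgIn again, so a drained queue comes back with everything at zero:

```python
# Hypothetical sketch of broker restart recovery, not real qpid-cpp internals.

class Queue:
    def __init__(self):
        self.messages = []   # current queue contents
        self.msg_in = 0      # total messages enqueued (QMF msgIn)
        self.msg_out = 0     # total messages dequeued (QMF msgOut)

    def enqueue(self, msg, durable_store=None):
        self.messages.append(msg)
        self.msg_in += 1
        if durable_store is not None:
            durable_store.append(msg)

    def dequeue(self, durable_store=None):
        msg = self.messages.pop(0)
        self.msg_out += 1
        if durable_store is not None:
            durable_store.remove(msg)
        return msg

def restart(durable_store):
    """Recovery builds a fresh Queue and replays only stored messages."""
    q = Queue()
    for msg in durable_store:
        q.enqueue(msg)       # each replayed message counts as msgIn again
    return q

# Drained queue: spout then drain, so msgIn == msgOut and the store is empty.
store = []
q = Queue()
q.enqueue("m1", store)
q.dequeue(store)
q2 = restart(store)
print(q2.msg_in, q2.msg_out)   # 0 0 -- all history lost

# Non-drained queue: the remaining message is replayed, so msgIn appears kept.
store2 = []
q3 = Queue()
q3.enqueue("m1", store2)
q4 = restart(store2)
print(q4.msg_in)               # 1 -- replay masks the reset
```

Under this model the counters are never really persisted; replay of leftover durable messages merely makes them look partially preserved, which matches the observation that they zero out exactly when msgIn == msgOut.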

Version-Release number of selected component (if applicable):
qpid-cpp-*-0.22-29
qpid-qmf-*-0.22-24
python-qpid-qmf-0.22-24

How reproducible:
100%

Steps to Reproduce:
1. $cppapi/spout "q;{create:always, node: {durable:true}}"
2. $cppapi/drain q
3. qpid-stat -q
counter values are displayed as expected
4. restart the broker
5. qpid-stat -q
counters are zeroed

Actual results:
Reset of the qmf counters on broker restart

Expected results:
Counters should not be reset on broker restart

Additional info:

# ./qc2_spout --durable yes "q;{create:always, node: {durable:true}}"
# ./qc2_drain q
Message(properties={'spout-id': 'bd86a8a9-b5ca-4339-b5f3-7ec7684e045e:0', 'x-amqp-0-10.routing-key': 'q'}, content='')
# qpid-stat -q
Queues
  queue                                     dur  autoDel  excl  msg   msgIn  msgOut  bytes  bytesIn  bytesOut  cons  bind
  =========================================================================================================================
  2ab75b3d-3a39-4cbf-9dfe-3c76e65c139c:0.0       Y        Y        0     0      0       0      0        0         1     2
  q                                       	  Y                      0     1      1       0     91       91         0     1
# service qpidd restart
Stopping Qpid AMQP daemon:                                 [  OK  ]
Starting Qpid AMQP daemon:                                 [  OK  ]
# qpid-stat -q
Queues
  queue                                     dur  autoDel  excl  msg   msgIn  msgOut  bytes  bytesIn  bytesOut  cons  bind
  =========================================================================================================================
  751b57d0-92d7-4a1e-aed4-093ce0de528e:0.0       Y        Y        0     0      0       0      0        0         1     2
  q                                         Y                      0     0      0       0      0        0         0     1

Comment 1 Frantisek Reznicek 2013-12-09 14:44:43 UTC
I can see the same behavior in an HA environment, where I believe this is a real problem.


Let's assume following failover scenario:
 0] HA up and running on nodes A B C (A is primary)
 1] sender sends message to queue Q with certain rate (Q is durable, messages are not)
    qc2_spout  --log-msg-dict --broker <virt-ip>:5672 --connection-options "{  reconnect_limit : '50', reconnect_timeout : '120', reconnect_interval : '3', protocol : 'amqp0-10', reconnect : 'true' }" --count 100 --content "msg_%06d" --durable True --duration 36  'test_a1; {create: always, node:{durable: True}}'
 2] while the sender from 1] is still sending, qpidd-primary on node A is killed
    qc2_spout appears to have sent around 22 messages
 3] new primary is promoted (B)
 4] client continues to send and finishes sending N messages
 5] QMF keeps inaccurate data about queue depth and/or other statistics (on the primary)

# qpid-stat -q | grep test_
  test_a1                                              Y                     88    88      0    10.3k  10.3k       0         0     1


It looks like object statistics are not sent to backup brokers.
At a minimum, object statistics should be propagated to the HA backups.
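The divergence described above can be modeled with a small sketch (purely illustrative, not the actual qpid-cpp HA code): each broker accumulates its own counters from the traffic it observes, and a peer that joins later copies the queue contents but starts its statistics from zero:

```python
# Illustrative model of per-broker QMF counters in an HA cluster.
# All names are hypothetical; only the message replication step copies
# state between peers, never the accumulated statistics.

class Broker:
    def __init__(self):
        self.depth = 0    # current queue depth (replicated)
        self.msg_in = 0   # QMF msgIn, counted locally only
        self.msg_out = 0  # QMF msgOut, counted locally only

    def enqueue(self):
        self.depth += 1
        self.msg_in += 1

    def snapshot_from(self, peer):
        # A joining backup replicates the queue contents (depth) but,
        # per this bug, not the accumulated counters.
        self.depth = peer.depth

primary = Broker()
for _ in range(88):
    primary.enqueue()

late_backup = Broker()          # e.g. a restarted node rejoining the cluster
late_backup.snapshot_from(primary)

print(primary.msg_in, late_backup.msg_in)   # 88 0 -- statistics diverge
```

This matches the qpid-stat output in the comments: the broker that observed the traffic reports msgIn=88, while a peer that rejoined afterwards reports zeros for the same queue.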

Comment 2 Frantisek Reznicek 2013-12-09 14:48:32 UTC
Another case showing that QMF data are not in sync:

node-B
# qpid-stat -q | grep test_a1
  test_a1                                              Y                     88    88      0    10.3k  10.3k       0         0     1
  test_a1_c1_mcL_msL_md1_txrxkill1_sigkill_vip_00      Y                      0     0      0       0      0        0         0     1

node-A
# qpid-stat -q --ha-admin| grep test_a1
  test_a1                                   Y                      0     0      0       0      0        0         0     1

node-C
# qpid-stat -q --ha-admin| grep test_a1
  test_a1_c1_mcL_msL_md1_txrxkill1_sigkill_vip_00  Y                      0     0      0       0      0        0         0     1


This clearly shows that QMF queries to the primary (B) and to the backups do not even show the same queues.

Comment 3 Frantisek Reznicek 2013-12-09 14:50:37 UTC
The issue has to be considered from two different deployment conditions:
 a] standalone broker
 b] HA environment

Comment 4 Ted Ross 2013-12-10 21:02:06 UTC
This should be considered a feature request.  It is not a regression, since the statistics/counters are neither persisted nor synchronized to cluster peers.

Please note that queue-depth statistics are always correct and synchronized.
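As a sketch of what the requested enhancement might look like (hypothetical design, not anything qpid-cpp currently does; the file format and function names are made up for illustration), the broker could persist the counters of durable queues alongside the message store and reload them on startup:

```python
# Hypothetical counter persistence for durable queues; illustrative only.

import json
import os
import tempfile

def save_counters(path, counters):
    """Write the queue's QMF counters next to its durable store."""
    with open(path, "w") as f:
        json.dump(counters, f)

def load_counters(path):
    """Reload counters on broker restart; fall back to zeros for new queues."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"msgIn": 0, "msgOut": 0}

# Simulate the reproducer: one message spouted and drained, then a restart.
path = os.path.join(tempfile.mkdtemp(), "q.counters")
save_counters(path, {"msgIn": 1, "msgOut": 1})
# ... broker restarts ...
print(load_counters(path))   # {'msgIn': 1, 'msgOut': 1} -- history survives
```

In the HA case the same data would additionally need to be shipped to backups along with queue replication, so that a newly promoted primary reports continuous statistics.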

