Bugzilla will be upgraded to version 5.0 on a still to be determined date in the near future. The original upgrade date has been delayed.
Bug 614132 - qpid python client [high level API] failover not functional (reconnect from failover exchange)
qpid python client [high level API] failover not functional (reconnect from f...
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: python-qpid (Show other bugs)
6.0
All Linux
high Severity high
: rc
: ---
Assigned To: Rafael H. Schloming
MRG Quality Engineering
:
Depends On: 614054
Blocks:
  Show dependency treegraph
 
Reported: 2010-07-13 14:01 EDT by Rafael H. Schloming
Modified: 2010-11-10 16:23 EST (History)
5 users (show)

See Also:
Fixed In Version: python-qpid-0.7.946106-9.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 614054
Environment:
Last Closed: 2010-11-10 16:23:30 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Rafael H. Schloming 2010-07-13 14:01:42 EDT
+++ This bug was initially created as a clone of Bug #614054 +++

Description of problem:

There seems to be the issue in the high-level API failover/reconnect mechanism (from failover exchange)

The testing scenario:
1] three machines C and A and B
2] C issues commands
3] A and B are two physical machines in (qpidd) cluster

The python clients can be downloaded from bug 613912 attachment
( https://bugzilla.redhat.com/attachment.cgi?id=431389 )

The drain and spout clients are set to enable reconnect & define reconnect limit and interval together with reconnection from failover exchange:
  qpid.messaging.util.auto_fetch_reconnect_urls(conn);

It was seen that reconnection from remote machine/host to current machine suffers from above listed defect, but if machine C and B are the same (i.e. you are reconnecting from current machine to another remote one works as expected)


Version-Release number of selected component (if applicable):
python-qpid-0.7.946106-4.el5

How reproducible:
100%

Steps to Reproduce:
1. set-up openais and qpidd, create msg.in with some text data
2. start qpidd on A and B, verify whether they are in cluster
3. C issues: 
   cat ./msgs.in | ../spout.py --address "addr1; \
       {create: sender, delete: receiver}"  --broker B \
       --reconnect --reconnect-interval 3 --reconnect-limit 10 \
       --reconnect-urls-from-failover-exchange
4. C issues: 
   ../drain.py --address "addr1; {create: sender, delete: receiver}" \
   --broker B  --reconnect --reconnect-interval 2 --reconnect-limit 5 \
   --reconnect-urls-from-failover-exchange --print "%(C)s"> msgs.out

5. while running 4] SIGINT the broker on B
6. expect that client from C will reconnect to A, which is not the case
  
Actual results:
Qpid python high-level API reconnection from failover exchange does not work.

Expected results:
Qpid python high-level API reconnection from failover exchange should work

Additional info:

# in this case C (dhcp-26-251) is F8 machine 
# with packages which match the A and B ( dhcp-26-227 and dhcp-27-226)
[freznice@dhcp-26-251 test]$ cat ./msgs.in | ../spout.py --address "addr1; {create: sender, delete: receiver}"   --broker dhcp-27-226 --reconnect --reconnect-interval 3 --reconnect-limit 10   --reconnect-urls-from-failover-exchange
2010-07-13 16:55:42,349 WARNING set reconnect_urls for conn <qpid.messaging.endpoints.Connection instance at 0x27657a0>: [u'10.34.26.227:5672,tcp', u'10.34.27.226:5672,tcp']
Exception in thread auto-fetch-reconnect-urls:
Traceback (most recent call last):
  File "/usr/lib64/python2.5/threading.py", line 460, in __bootstrap
    self.run()
  File "/usr/lib64/python2.5/threading.py", line 440, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.5/site-packages/qpid/messaging/util.py", line 36, in main
    msg = rcv.fetch()
  File "<string>", line 6, in fetch
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 1002, in fetch
    self.check_closed()
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 963, in check_closed
    raise LinkClosed()
LinkClosed: None

[freznice@dhcp-26-251 test]$ echo $?
0
[freznice@dhcp-26-251 test]$ ../qc_list_objects.py --broker dhcp-26-227 -qv --queue-filter=addr                                addr1  durable msg|byteEnq/Deq:2000/0|104935/0 unacked:0 consumer_cnt:0 binding_cnt:1
[freznice@dhcp-26-251 test]$ ../drain.py --address "addr1; {create: sender, delete: receiver}" --broker dhcp-27-226   --reconnect --reconnect-interval 2 --reconnect-limit 5   --reconnect-urls-from-failover-exchange --print "%(C)s"> msgs.out
2010-07-13 16:56:19,382 WARNING set reconnect_urls for conn <qpid.messaging.endpoints.Connection instance at 0x11039e0>: [u'10.34.26.227:5672,tcp', u'10.34.27.226:5672,tcp']
2010-07-13 16:56:21,073 WARNING recoverable error[attempt 0]: connection aborted
2010-07-13 16:56:21,073 WARNING sleeping 2 seconds
2010-07-13 16:56:23,073 WARNING trying: dhcp-27-226:5672
2010-07-13 16:56:23,075 WARNING recoverable error[attempt 1]: (111, 'Connection refused')
2010-07-13 16:56:23,075 WARNING sleeping 2 seconds
2010-07-13 16:56:25,075 WARNING trying: dhcp-27-226:5672
2010-07-13 16:56:25,077 WARNING recoverable error[attempt 2]: (111, 'Connection refused')
2010-07-13 16:56:25,077 WARNING sleeping 2 seconds
2010-07-13 16:56:27,077 WARNING trying: dhcp-27-226:5672
2010-07-13 16:56:27,079 WARNING recoverable error[attempt 3]: (111, 'Connection refused')
2010-07-13 16:56:27,079 WARNING sleeping 2 seconds
2010-07-13 16:56:29,079 WARNING trying: dhcp-27-226:5672
2010-07-13 16:56:29,081 WARNING recoverable error[attempt 4]: (111, 'Connection refused')
2010-07-13 16:56:29,081 WARNING sleeping 2 seconds
2010-07-13 16:56:31,081 WARNING trying: dhcp-27-226:5672
2010-07-13 16:56:31,083 WARNING recoverable error[attempt 5]: (111, 'Connection refused')
2010-07-13 16:56:31,083 WARNING sleeping 2 seconds
2010-07-13 16:56:33,083 WARNING trying: dhcp-27-226:5672
2010-07-13 16:56:33,085 WARNING trying: dhcp-27-226:5672
2010-07-13 16:56:33,086 WARNING trying: dhcp-27-226:5672
Traceback (most recent call last):
  File "../drain.py", line 114, in <module>
    msg = rcv.fetch(timeout=timeout)
  File "<string>", line 6, in fetch
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 1005, in fetch
    self._ecwait(lambda: not self.draining)
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
    result = self._ewait(lambda: self.closed or predicate(), timeout)
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 957, in _ewait
    result = self.session._ewait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 550, in _ewait
    result = self.connection._ewait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 194, in _ewait
    self.check_error()
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 187, in check_error
    raise self.error
qpid.messaging.exceptions.ConnectError: (111, 'Connection refused')
2010-07-13 16:56:33,089 WARNING trying: dhcp-27-226:5672
Exception in thread auto-fetch-reconnect-urls:
Traceback (most recent call last):
  File "/usr/lib64/python2.5/threading.py", line 460, in __bootstrap
    self.run()
  File "/usr/lib64/python2.5/threading.py", line 440, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.5/site-packages/qpid/messaging/util.py", line 36, in main
    msg = rcv.fetch()
  File "<string>", line 6, in fetch
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 1000, in fetch
    msg = self.session._get(self, timeout=timeout)
  File "<string>", line 6, in _get
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 633, in _get
    timeout):
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 550, in _ewait
    result = self.connection._ewait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 194, in _ewait
    self.check_error()
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 187, in check_error
    raise self.error
ConnectError: (111, 'Connection refused')

[freznice@dhcp-26-251 test]$ ../qc_list_objects.py --broker dhcp-26-227 -qv --queue-filter=addr                                addr1  durable msg|byteEnq/Deq:2000/395|104935/20812 unacked:0 consumer_cnt:0 binding_cnt:1
[freznice@dhcp-26-251 test]$ ../drain.py --address "addr1; {create: sender, delete: receiver}" --broker dhcp-26-227   --reconnect --reconnect-interval 2 --reconnect-limit 5   --reconnect-urls-from-failover-exchange --print "%(C)s"> msgs.out
2010-07-13 16:57:08,881 WARNING set reconnect_urls for conn <qpid.messaging.endpoints.Connection instance at 0x128d9e0>: [u'10.34.26.227:5672,tcp']
Exception in thread auto-fetch-reconnect-urls:
Traceback (most recent call last):
  File "/usr/lib64/python2.5/threading.py", line 460, in __bootstrap
    self.run()
  File "/usr/lib64/python2.5/threading.py", line 440, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.5/site-packages/qpid/messaging/util.py", line 36, in main
    msg = rcv.fetch()
  File "<string>", line 6, in fetch
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 1002, in fetch
    self.check_closed()
  File "/usr/lib/python2.5/site-packages/qpid/messaging/endpoints.py", line 963, in check_closed
    raise LinkClosed()
LinkClosed: None

[freznice@dhcp-26-251 test]$ ../qc_list_objects.py --broker dhcp-26-227 -qv --queue-filter=addr                                [freznice@dhcp-26-251 test]$ rpm -q python-qpid
python-qpid-0.7.946106-4.fc8

# retested once more in pure rhel5 environmanet which proved above issue:
[root@dhcp-27-226 test]# cat ./msgs.in | ../spout --address "addr1; {create: sender, delete: receiver}"   --broker dhcp-26-227 --reconnect --reconnect-interval 3 --reconnect-limit 10   --reconnect-urls-from-failover-exchange
[root@dhcp-27-226 test]# ../drain.py --address "addr1; {create: sender, delete: receiver}" --broker dhcp-26-227   --reconnect --reconnect-interval 2 --reconnect-limit 5   --reconnect-urls-from-failover-exchange --print "%(C)s"> msgs.out
2010-07-13 17:21:06,180 WARNING set reconnect_urls for conn <qpid.messaging.endpoints.Connection instance at 0x2b98e960d6c8>: [u'10.34.26.227:5672,tcp', u'10.34.27.226:5672,tcp']
2010-07-13 17:21:07,819 WARNING recoverable error[attempt 0]: connection aborted
2010-07-13 17:21:07,819 WARNING sleeping 2 seconds
2010-07-13 17:21:09,819 WARNING trying: dhcp-26-227:5672
2010-07-13 17:21:09,821 WARNING recoverable error[attempt 1]: (111, 'Connection refused')
2010-07-13 17:21:09,821 WARNING sleeping 2 seconds
2010-07-13 17:21:11,821 WARNING trying: dhcp-26-227:5672
2010-07-13 17:21:11,823 WARNING recoverable error[attempt 2]: (111, 'Connection refused')
2010-07-13 17:21:11,823 WARNING sleeping 2 seconds
2010-07-13 17:21:13,823 WARNING trying: dhcp-26-227:5672
2010-07-13 17:21:13,825 WARNING recoverable error[attempt 3]: (111, 'Connection refused')
2010-07-13 17:21:13,825 WARNING sleeping 2 seconds
2010-07-13 17:21:15,825 WARNING trying: dhcp-26-227:5672
2010-07-13 17:21:15,827 WARNING recoverable error[attempt 4]: (111, 'Connection refused')
2010-07-13 17:21:15,827 WARNING sleeping 2 seconds
2010-07-13 17:21:17,827 WARNING trying: dhcp-26-227:5672
2010-07-13 17:21:17,829 WARNING recoverable error[attempt 5]: (111, 'Connection refused')
2010-07-13 17:21:17,829 WARNING sleeping 2 seconds
2010-07-13 17:21:19,829 WARNING trying: dhcp-26-227:5672
... // thousands of messages like this
2010-07-13 17:21:21,275 WARNING trying: dhcp-26-227:5672
Traceback (most recent call last):
  File "../drain.py", line 119, in ?
    ssn.acknowledge()
  File "<string>", line 6, in acknowledge
  File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 678, in acknowledge
    self._ecwait(lambda: not [m for m in messages if m in self.acked])
  File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
    result = self._ewait(lambda: self.closed or predicate(), timeout)
  File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 550, in _ewait
    result = self.connection._ewait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 194, in _ewait
    self.check_error()
  File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 187, in check_error
    raise self.error
qpid.messaging.exceptions.ConnectError: (111, 'Connection refused')
Exception in thread auto-fetch-reconnect-urls:
Traceback (most recent call last):
  File "/usr/lib64/python2.4/threading.py", line 442, in __bootstrap
    self.run()
  File "/usr/lib64/python2.4/threading.py", line 422, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.4/site-packages/qpid/messaging/util.py", line 36, in main
    msg = rcv.fetch()
  File "<string>", line 6, in fetch
  File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 1000, in fetch
    msg = self.session._get(self, timeout=timeout)
  File "<string>", line 6, in _get
  File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 633, in _get
    timeout):
  File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 550, in _ewait
    result = self.connection._ewait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 194, in _ewait
    self.check_error()
  File "/usr/lib/python2.4/site-packages/qpid/messaging/endpoints.py", line 187, in check_error
    raise self.error
ConnectError: (111, 'Connection refused')

2010-07-13 17:21:21,292 WARNING trying: dhcp-26-227:5672
[root@dhcp-27-226 test]# ls
ecode.ecode     msgs.in          test_api_tc.sh       test_reconnect.sh.log
e.exp           msgs.out         test_reconnect.sh    test_reconnect.sh.log_
functions.sh    run_connect.sh_  test_reconnect.sh~   test_reconnect.sh.log_fail
functions.sh.0  test_api.sh      test_reconnect.sh.0  test_reconnect.sh.log_fail2
[root@dhcp-27-226 test]# ls -lA ./msgs.* ; diff ./msgs.in msgs.out | head            -rw-rw-r-- 1 root root 106935 Jul 13 16:43 ./msgs.in                                 -rw-rw-r-- 1 root root   7614 Jul 13 17:21 ./msgs.out
138,2000d137
< 7vVEkJVUt0B70dOtGKb0F4RRwQ04gmRsKUwGTnm4aExIuBBy1Te402rSYjHlukd9YccDCRbjqtF2GNJ52
< jELZTjTpCXPkbiIwW1fo6WL4CobgrjVB3gxygesCEJsF7olh5rxpr60
< 3JDMOnoMUUoP18W9nECW
< wlOO9BvU7kFkNjAPxBAftHyeqEkSzoR5
< QZYMT3TXOmiX
< MhlQsoLHFAzi3EdIzglObwNFTUaEdsDoYjNMdQhRmfTaAXnN5nQDTInu6ESSEDLVNG4WVGuo6RD4WBINXsSn8qv3rLFcG5qA
< GOECFMlTM1kqfW1fDzgMnUiTrFAvHLOYUw373zLduHr115ipo0Llzqy0A6wySKyLLvRDIu8OjImh38VkxbHJxPoYKk
< XZKLihWjEQbuKs2NPQUet6eFqA89E6xY9QCIE8z6NfX7i9lJEtaorcReO0EbkdvrUIvP8kOwXo047C
< k0oLfefQ0Fk0mqBz
Comment 2 releng-rhel@redhat.com 2010-11-10 16:23:30 EST
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.

Note You need to log in before you can comment on or make changes to this bug.