| Summary: | sesame restart occasionally fails with qpid::types::InvalidConversion exception | | |
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Frantisek Reznicek <freznice> |
| Component: | sesame | Assignee: | grid-maint-list <grid-maint-list> |
| Status: | CLOSED WONTFIX | QA Contact: | MRG Quality Engineering <mrgqe-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | low | | |
| Version: | 2.0 | CC: | esammons, jross, matt, sgraf, tmckay, tross |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-05-26 20:23:21 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 834645 | | |
MRG-Grid is in maintenance and only customer escalations will be considered. This issue can be reopened if a customer escalation associated with it occurs.
Description of problem:

A cluster configuration test that loops through cluster configurations found that sesame occasionally fails to restart / start due to Variant's qpid::types::InvalidConversion exception:

[23:30:50] ERROR:dhcp-37-124...:'service sesame status && service sesame restart || service sesame start ; service sesame status', result:2, dur.:3.26
[23:30:50] DEBUG:stdout:
sesame (pid 811) is running...
Stopping Sesame daemon: [ OK ]
Starting Sesame daemon: [ OK ]
sesame dead but subsys locked
[23:30:50] DEBUG:stderr:
terminate called after throwing an instance of 'qpid::types::InvalidConversion'
what(): invalid conversion: Cannot convert 2fc1d1a2-6cc6-139e-4b72-43004da2f700 (qpid/types/Variant.cpp:130)

The above exception is sometimes shown, sometimes not.
The behavior was seen a few times on a 4-node RHEL 5.7 cluster running cluster_test.

Version-Release number of selected component (if applicable):

openais-0.80.6-30.el5_7.4
openais-devel-0.80.6-30.el5_7.4
python-qpid-0.14-1.el5
python-qpid-qmf-0.14-2.el5
python-saslwrapper-0.10-4.el5
qpid-cpp-client-0.14-3.el5
qpid-cpp-client-devel-0.14-3.el5
qpid-cpp-client-devel-docs-0.14-3.el5
qpid-cpp-client-rdma-0.14-3.el5
qpid-cpp-client-ssl-0.14-3.el5
qpid-cpp-server-0.14-3.el5
qpid-cpp-server-cluster-0.14-3.el5
qpid-cpp-server-devel-0.14-3.el5
qpid-cpp-server-rdma-0.14-3.el5
qpid-cpp-server-ssl-0.14-3.el5
qpid-cpp-server-store-0.14-3.el5
qpid-cpp-server-xml-0.14-3.el5
qpid-java-client-0.14-1.el5
qpid-java-common-0.14-1.el5
qpid-java-example-0.14-1.el5
qpid-qmf-0.14-2.el5
qpid-qmf-debuginfo-0.14-2.el5
qpid-qmf-devel-0.14-2.el5
qpid-tests-0.14-1.el5
qpid-tools-0.14-1.el5
rgmanager-2.0.52-21.el5
rh-qpid-cpp-tests-0.14-3.el5
ruby-qpid-qmf-0.14-2.el5
ruby-saslwrapper-0.10-4.el5
saslwrapper-0.10-4.el5
saslwrapper-debuginfo-0.10-4.el5
saslwrapper-devel-0.10-4.el5
sesame-1.0-2.el5
sesame-debuginfo-1.0-2.el5

How reproducible:

< 5 % (low reproducibility)

Steps to Reproduce:

With N cluster nodes [e.g. 4], loop over the steps below:

1. configure the cluster to state (2^N)-1 [15 == all machines up]
2. for nodes which are up, issue:
   service sesame status && service sesame restart || service sesame start ; service sesame status
   for nodes which are down, issue:
   service sesame status && service sesame stop || service sesame start
   Expect an exit code of 0 from the above command[s]
3. configure the cluster to state (2^N)-2 [14 == last machine down]
4. perform 2
...
5. configure the cluster to state 1 [just last machine up]
6. perform 2
7. go to 1

Actual results:

In rare cases, sesame fails to start.

Expected results:

Sesame should reliably start / restart.

Additional info:

...
[23:29:59] INFO:dhcp-37-125...:'qpid-cluster', result:0, dur.:1.28
[23:29:59] INFO:localhost:'qpid-cluster cluster size check (exp. 3)', result:True, dur.:-1.00
[23:30:01] INFO:dhcp-37-128...:'qpid-cluster', result:0, dur.:1.12
[23:30:01] INFO:localhost:'qpid-cluster cluster size check (exp. 3)', result:True, dur.:-1.00
[23:30:01] DEBUG:Go to cluster state: 12 [True, True, False, False]
[23:30:11] INFO:localhost:'cluster topology change (ecode: True exp. True)', result:True, dur.:-1.00
[23:30:12] INFO:localhost:'Cluster state: 2 (exp. 2) of 4 up and running', result:True, dur.:-1.00
[23:30:16] INFO:dhcp-37-124...:'service sesame status && service sesame restart || service sesame start ; service sesame status', result:0, dur.:3.24
[23:30:20] INFO:dhcp-37-125...:'service sesame status && service sesame restart || service sesame start ; service sesame status', result:0, dur.:3.24
[23:30:21] INFO:dhcp-37-127...:'service sesame status && service sesame stop || service sesame start', result:0, dur.:0.19
[23:30:22] INFO:dhcp-37-128...:'service sesame status && service sesame stop || service sesame start', result:0, dur.:0.13
[23:30:24] INFO:dhcp-37-124...:'qpid-cluster', result:0, dur.:1.29
[23:30:24] INFO:localhost:'qpid-cluster cluster size check (exp. 2)', result:True, dur.:-1.00
[23:30:26] INFO:dhcp-37-125...:'qpid-cluster', result:0, dur.:1.35
[23:30:26] INFO:localhost:'qpid-cluster cluster size check (exp. 2)', result:True, dur.:-1.00
[23:30:27] DEBUG:Go to cluster state: 11 [True, False, True, True]
[23:30:45] INFO:localhost:'cluster topology change (ecode: True exp. True)', result:True, dur.:-1.00
[23:30:46] INFO:localhost:'Cluster state: 3 (exp. 3) of 4 up and running', result:True, dur.:-1.00
[23:30:50] ERROR:dhcp-37-124...:'service sesame status && service sesame restart || service sesame start ; service sesame status', result:2, dur.:3.26
[23:30:50] DEBUG:stdout:
sesame (pid 811) is running...
Stopping Sesame daemon: [ OK ]
Starting Sesame daemon: [ OK ]
sesame dead but subsys locked
[23:30:50] DEBUG:stderr:
terminate called after throwing an instance of 'qpid::types::InvalidConversion'
what(): invalid conversion: Cannot convert 2fc1d1a2-6cc6-139e-4b72-43004da2f700 (qpid/types/Variant.cpp:130)
[23:30:51] INFO:dhcp-37-125...:'service sesame status && service sesame stop || service sesame start', result:0, dur.:0.13
[23:30:55] INFO:dhcp-37-127...:'service sesame status && service sesame restart || service sesame start ; service sesame status', result:0, dur.:3.25
[23:30:56] INFO:dhcp-37-128...:'service sesame status && service sesame restart || service sesame start ; service sesame status', result:0, dur.:0.19
[23:30:58] INFO:dhcp-37-124...:'qpid-cluster', result:0, dur.:1.06
[23:30:58] INFO:localhost:'qpid-cluster cluster size check (exp. 3)', result:True, dur.:-1.00
[23:31:00] INFO:dhcp-37-127...:'qpid-cluster', result:0, dur.:1.40
[23:31:00] INFO:localhost:'qpid-cluster cluster size check (exp. 3)', result:True, dur.:-1.00
[23:31:03] INFO:dhcp-37-128...:'qpid-cluster', result:0, dur.:1.18
[23:31:03] INFO:localhost:'qpid-cluster cluster size check (exp. 3)', result:True, dur.:-1.00
[23:31:03] DEBUG:Go to cluster state: 10 [True, False, True, False]
...
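
For readers who want to replay the reproduction loop, the steps to reproduce can be sketched roughly as below. This is only an illustration, not the cluster_test harness itself (which is not shown in this report). It assumes that the cluster state number is a bitmask with the first node as the most significant bit, which is inferred from log lines such as "Go to cluster state: 12 [True, True, False, False]". The function names and the node names `node1` to `node4` are hypothetical, and the sketch only builds the per-node command plan rather than running anything.

```python
from typing import List, Tuple

# Commands quoted from the steps to reproduce.
UP_CMD = ("service sesame status && service sesame restart "
          "|| service sesame start ; service sesame status")
DOWN_CMD = ("service sesame status && service sesame stop "
            "|| service sesame start")


def node_states(state: int, n_nodes: int) -> List[bool]:
    """Decode a state number into per-node up flags.

    Assumption: the first node is the most significant bit,
    matching "12 [True, True, False, False]" in the log.
    """
    return [bool((state >> (n_nodes - 1 - i)) & 1) for i in range(n_nodes)]


def state_sequence(n_nodes: int) -> range:
    """States visited in one pass: (2^N)-1, (2^N)-2, ..., 1."""
    return range(2 ** n_nodes - 1, 0, -1)


def plan(state: int, nodes: List[str]) -> List[Tuple[str, str]]:
    """Pair each node with the command the steps prescribe for it."""
    flags = node_states(state, len(nodes))
    return [(node, UP_CMD if up else DOWN_CMD)
            for node, up in zip(nodes, flags)]


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4"]  # hypothetical names
    for state in state_sequence(len(nodes)):
        print(state, node_states(state, len(nodes)))
        for node, cmd in plan(state, nodes):
            print("   ", node, "->", cmd)
```

A real harness would run each planned command on its node and check for exit code 0, then repeat the whole pass; given the reported reproducibility of under 5 %, many passes would likely be needed before the failure shows up.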